Commit

sulong: add toolchain
Some changes contributed by "Gilles Duboscq <[email protected]>"
zapster committed Jul 10, 2019
1 parent 5ebbbd6 commit d5eebf7
Showing 28 changed files with 1,758 additions and 33 deletions.
5 changes: 5 additions & 0 deletions sulong/CHANGELOG.md
@@ -1,5 +1,10 @@
# Version 19.2.0

New features:

* Preliminary support for compiling to bitcode using the LLVM toolchain.
See [docs/TOOLCHAIN.md](docs/TOOLCHAIN.md) for more details.

Improvements:

* Improved display of pointers to foreign objects in the LLVM debugger.
115 changes: 115 additions & 0 deletions sulong/docs/TOOLCHAIN.md
@@ -0,0 +1,115 @@
# Toolchain

*The toolchain* is a set of tools and APIs for compiling native projects, such as those written in C or C++,
to bitcode that can be executed with the GraalVM LLVM runtime.
Its aim is to simplify ahead-of-time compilation for both users
and language implementers who want to use the LLVM runtime.

## Use Cases

1. **Simplify compilation to bitcode**
If *GraalVM users* want to run a native project via the LLVM runtime in GraalVM,
they need to compile their project to LLVM bitcode.
Although it is possible to do this with standard LLVM tools (`clang`, `llvm-link`, etc.),
it requires extra care (for example, applying certain optimizations or linking manually).
The toolchain aims to provide drop-in replacements that compile many native projects for
the GraalVM LLVM runtime out of the box.

2. **Compilation of native extensions**
*GraalVM language implementers* use the LLVM runtime to execute native extensions.
Often, these extensions are installed by the user using some kind of package manager.
In Python, for example, packages are usually added via `pip install`.
To enable this, languages need a way of compiling native extensions on demand.
The toolchain provides a Java API for languages to access the tools that are
(optionally) bundled with GraalVM.

3. **Compiling to bitcode at build time**
*GraalVM languages* that integrate with the LLVM runtime usually need to build
bitcode libraries to integrate with native pieces of their implementation.
The toolchain can be used as a build-time dependency to achieve this in a
standardized and compatible way.

## File Format

To be compatible with existing build systems, the toolchain will by default
produce native executables with embedded bitcode (ELF files on Linux, Mach-O
files on macOS).
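
To check which container format a given result uses, one can look at the file's magic number. The following is a minimal sketch and not part of the toolchain: the function names are hypothetical, and the check only identifies the outer container. It does not verify that bitcode is actually embedded.

```python
import struct

ELF_MAGIC = b"\x7fELF"
# 32-bit and 64-bit Mach-O magic numbers, in both byte orders.
MACHO_MAGICS = {0xFEEDFACE, 0xFEEDFACF, 0xCEFAEDFE, 0xCFFAEDFE}


def container_format(header):
    """Classify the first bytes of a file as 'ELF', 'Mach-O', or 'unknown'."""
    if header[:4] == ELF_MAGIC:
        return "ELF"
    if len(header) >= 4:
        # Read the first four bytes as a big-endian integer; the set above
        # contains the byte-swapped variants as well.
        (magic,) = struct.unpack(">I", header[:4])
        if magic in MACHO_MAGICS:
            return "Mach-O"
    return "unknown"


def container_format_of_file(path):
    """Convenience wrapper (hypothetical) that reads the header from disk."""
    with open(path, "rb") as f:
        return container_format(f.read(4))
```

Universal (fat) Mach-O binaries use a different magic number and are deliberately left out of this sketch.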

## Toolchain Identifier

The GraalVM LLVM runtime can run in different configurations, which require bitcode to be compiled differently.
Users of the toolchain do not need to care about this.
The LLVM runtime knows in which mode it is running and will always provide the right toolchain.
However, since a language implementation might want to store the result of a
toolchain compilation for later use, it needs to be able to identify the toolchain that produced it.
To do so, the toolchain provides an *identifier*.
Conventionally, the identifier is used as a directory name and the results of a
compilation are placed in there.
The internal LLVM runtime library layout follows the same approach.
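
The directory convention can be sketched as follows. This is an illustration under assumptions, not code from the runtime: `result_dir` and `cached_result` are hypothetical helpers, and `cache_root` stands for whatever location a language implementation chooses for stored results.

```python
import os


def result_dir(cache_root, toolchain_id):
    """Return the directory for results compiled with the given toolchain."""
    # The identifier is used as a single path component, so reject anything
    # that would change the directory structure.
    if not toolchain_id or any(c in toolchain_id for c in ("/", "\\", " ")):
        raise ValueError("unexpected toolchain identifier: {!r}".format(toolchain_id))
    return os.path.join(cache_root, toolchain_id)


def cached_result(cache_root, toolchain_id, name):
    """Return the path of a stored result, or None if it has not been built yet."""
    path = os.path.join(result_dir(cache_root, toolchain_id), name)
    return path if os.path.exists(path) else None
```

Because each identifier maps to its own directory, results built for different runtime configurations never overwrite each other.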

## Java API

Language implementations can access the toolchain via the [`Toolchain`](../../sulong/projects/com.oracle.truffle.llvm.api/src/com/oracle/truffle/llvm/api/Toolchain.java) service.
The service provides two methods:

* `TruffleFile getToolPath(String tool)`
The method returns the path to the executable for a given tool.
Every implementation is free to choose its own set of supported tools.
The command line interface of the executable is specific to the tool.
If a tool is not supported or not known, `null` is returned.
Consult the Javadoc for a list of known tools.
* `String getIdentifier()`
Returns the identifier for the toolchain.
It can be used to distinguish results produced by different toolchains.
The identifier can be used as a path suffix to place results in distinct locations,
therefore it does not contain special characters like slashes or spaces.

The `Toolchain` lives in the `SULONG_API` distribution.
The LLVM runtime will always provide a toolchain that matches its current mode.
The service can be looked up via the `Env`:

```Java
LanguageInfo llvmInfo = env.getPublicLanguages().get("llvm");
Toolchain toolchain = env.lookup(llvmInfo, Toolchain.class);
TruffleFile toolPath = toolchain.getToolPath("CC");
String toolchainId = toolchain.getIdentifier();
```

## `mx` integration

On the `mx` side, the toolchain can be accessed via the *substitutions* `toolchainGetToolPath` and `toolchainGetIdentifier`.
Note that they expect a toolchain name as the first argument. See, for example, the following snippet from a `suite.py` file:

```python
"buildEnv" : {
  "CC": "<toolchainGetToolPath:native,CC>",
  "CXX": "<toolchainGetToolPath:native,CXX>",
  "PLATFORM": "<toolchainGetIdentifier:native>",
},
```
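
To illustrate how such `<name:argument>` strings could be resolved, here is a simplified, self-contained model. It is not the `mx_subst` implementation: the registry, the regular expression, and the paths under `/path/to/` are assumptions made up for this sketch. Only the substitution names and the `name,tool` argument format come from the text above.

```python
import re

# Hypothetical stand-in for the paths a registered toolchain would return.
_TOOLCHAINS = {
    "native": {
        "CC": "/path/to/native/bin/graalvm-native-clang",
        "CXX": "/path/to/native/bin/graalvm-native-clang++",
    },
}


def get_tool_path(arg):
    # The argument has the form "<toolchain name>,<tool>".
    name, tool = arg.split(",", 1)
    return _TOOLCHAINS[name][tool]


def get_identifier(name):
    if name not in _TOOLCHAINS:
        raise KeyError(name)
    return name


_SUBSTITUTIONS = {
    "toolchainGetToolPath": get_tool_path,
    "toolchainGetIdentifier": get_identifier,
}

_PATTERN = re.compile(r"<(\w+):([^>]*)>")


def substitute(value):
    """Replace every <name:arg> occurrence using the table above."""
    return _PATTERN.sub(lambda m: _SUBSTITUTIONS[m.group(1)](m.group(2)), value)
```

With this model, `substitute("<toolchainGetIdentifier:native>")` yields the toolchain name, which matches the use of `PLATFORM` as a directory-like identifier in the snippet.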

## GraalVM Deployment

On the implementation side, _the toolchain_ consists of multiple ingredients:

* The **LLVM.org component** is similar to a regular [LLVM release](https://llvm.org) (clang, lld, llvm-* tools)
but includes a few patches that are not yet [upstream](https://github.com/llvm/llvm-project).
Those patches are general feature improvements that are not specific to GraalVM.
In GraalVM, the LLVM.org component is located in `$GRAALVM/jre/lib/llvm/`.
This component is considered internal and should not be used directly.
The LLVM.org component might not be installed by default. If that is the case, it can be installed via `gu install llvm-toolchain`.
* The **toolchain wrappers** are GraalVM launchers that invoke the tools from the LLVM.org component with special flags
to produce results that can be executed by the LLVM runtime of GraalVM. The Java and `mx` APIs return paths to those wrappers.
In GraalVM, the wrappers live in `$GRAALVM/jre/languages/llvm/$TOOLCHAIN_ID/bin/`. The wrappers are shipped with the
GraalVM LLVM runtime and do not need to be installed separately.
They are meant to be drop-in replacements for the C/C++ compiler when compiling a native project.
The goal is to produce a result that the GraalVM LLVM runtime can execute by simply pointing any build system to those wrappers,
for example via `CC`/`CXX` environment variables or by setting `PATH`.
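
As a rough sketch of what pointing a build system to the wrappers can look like, the helper below prepares an environment for a build process. It is an assumption-laden illustration: `toolchain_env` is a hypothetical name, the directory layout is taken from the description above, and the executable names `clang` and `clang++` are assumed to be present in the wrapper `bin` directory.

```python
import os


def toolchain_env(graalvm_home, toolchain_id, base_env=None):
    """Return a copy of the environment that points CC/CXX at the wrappers."""
    env = dict(os.environ if base_env is None else base_env)
    bin_dir = os.path.join(graalvm_home, "jre", "languages", "llvm", toolchain_id, "bin")
    env["CC"] = os.path.join(bin_dir, "clang")
    env["CXX"] = os.path.join(bin_dir, "clang++")
    # Also put the wrappers first on PATH for build systems that ignore CC/CXX.
    env["PATH"] = os.pathsep.join(p for p in [bin_dir, env.get("PATH", "")] if p)
    return env
```

The returned dictionary can then be passed to a build invocation, for example `subprocess.check_call(["make"], env=toolchain_env(home, "native"))`, where `home` is the GraalVM installation directory.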

## Bootstrapping Toolchain

During building, the LLVM.org component is available in the `mxbuild/SULONG_LLVM_ORG` distribution.
Bootstrapping wrappers can be found in the `mxbuild/SULONG_BOOTSTRAP_TOOLCHAIN` distribution.
However, the APIs will take care of providing the right one.
These distributions are intended for manual use only. They are considered unstable and might change without notice.
Do not depend on them.
2 changes: 1 addition & 1 deletion sulong/mx.sulong/copyrights/oracle.copyright.regex.hash
@@ -1,4 +1,4 @@
-#
+#(?:!.*)?
# Copyright \(c\) (\d\d\d\d, )?(\d\d\d\d), Oracle and\/or its affiliates\.
#
# All rights reserved\.
176 changes: 173 additions & 3 deletions sulong/mx.sulong/mx_sulong.py
@@ -27,6 +27,7 @@
# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
# OF THE POSSIBILITY OF SUCH DAMAGE.
#
import sys
import tarfile
import os
from os.path import join
@@ -35,6 +36,7 @@
from argparse import ArgumentParser

import mx
import mx_gate
import mx_subst
import mx_sdk
import re
@@ -173,6 +175,10 @@ def _sulong_gate_runner(args, tasks):
    _sulong_gate_testsuite('Args', 'other', tasks, args, tags=['args', 'sulongMisc', 'sulongCoverage'], testClasses=['com.oracle.truffle.llvm.test.MainArgsTest'])
    _sulong_gate_testsuite('Callback', 'other', tasks, args, tags=['callback', 'sulongMisc', 'sulongCoverage'], testClasses=['com.oracle.truffle.llvm.test.CallbackTest'])
    _sulong_gate_testsuite('Varargs', 'other', tasks, args, tags=['vaargs', 'sulongMisc', 'sulongCoverage'], testClasses=['com.oracle.truffle.llvm.test.VAArgsTest'])
    with Task('TestToolchain', tasks, tags=['toolchain', 'sulongMisc', 'sulongCoverage']) as t:
        if t:
            mx.command_function('clean')(['--project', 'toolchain-launchers-tests'] + args.extra_build_args)
            mx.command_function('build')(['--project', 'toolchain-launchers-tests'] + args.extra_build_args)


add_gate_runner(_suite, _sulong_gate_runner)
@@ -537,6 +543,30 @@ def getClangImplicitArgs():

mx_subst.path_substitutions.register_no_arg('clangImplicitArgs', getClangImplicitArgs)


def get_mx_exe():
    mxpy = join(mx._mx_home, 'mx.py')
    commands = [sys.executable, '-u', mxpy, '--java-home=' + mx.get_jdk().home]
    return ' '.join(commands)


mx_subst.path_substitutions.register_no_arg('mx_exe', get_mx_exe)


def get_jacoco_setting():
    return mx_gate._jacoco


mx_subst.path_substitutions.register_no_arg('jacoco', get_jacoco_setting)


mx.add_argument('--jacoco-exec-file', help='the coverage result file of JaCoCo', default='jacoco.exec')


def mx_post_parse_cmd_line(opts):
    mx_gate.JACOCO_EXEC = opts.jacoco_exec_file


def getGCCVersion(gccProgram):
    """executes the program with --version and extracts the GCC version string"""
    versionString = getVersion(gccProgram)
@@ -684,14 +714,140 @@ def getResults(self):

mx_benchmark.add_bm_suite(mx_sulong_benchmarks.SulongBenchmarkSuite())


_toolchains = {}


def _get_toolchain(toolchain_name):
    if toolchain_name not in _toolchains:
        mx.abort("Toolchain '{}' does not exist! Known toolchains: {}".format(toolchain_name, ", ".join(_toolchains.keys())))
    return _toolchains[toolchain_name]


def _get_toolchain_tool(name_tool):
    # expects "<toolchain name>,<tool>"; split at the first comma only so that unpacking always yields two parts
    name, tool = name_tool.split(",", 1)
    return _get_toolchain(name).get_toolchain_tool(tool)


mx_subst.path_substitutions.register_with_arg('toolchainGetToolPath', _get_toolchain_tool)
mx_subst.path_substitutions.register_with_arg('toolchainGetIdentifier',
                                              lambda name: _get_toolchain(name).get_toolchain_subdir())


class ToolchainConfig(object):
    _tool_map = {
        "CC": ["graalvm-{name}-clang", "graalvm-clang", "clang", "cc", "gcc"],
        "CXX": ["graalvm-{name}-clang++", "graalvm-clang++", "clang++", "c++", "g++"],
    }

    def __init__(self, name, dist, bootstrap_dist, tools, suite):
        self.name = name
        self.dist = dist
        self.bootstrap_dist = bootstrap_dist
        self.tools = tools
        self.suite = suite
        self.mx_command = self.name + '-toolchain'
        self.tool_map = {tool: [alias.format(name=name) for alias in aliases] for tool, aliases in ToolchainConfig._tool_map.items()}
        self.exe_map = {exe: tool for tool, aliases in self.tool_map.items() for exe in aliases}
        # register mx command
        mx.update_commands(_suite, {
            self.mx_command: [self._toolchain_helper, 'launch {} toolchain commands'.format(self.name)],
        })
        if self.name in _toolchains:
            mx.abort("Toolchain '{}' registered twice".format(self.name))
        _toolchains[self.name] = self

    def _toolchain_helper(self, args=None, out=None):
        parser = ArgumentParser(prog='mx ' + self.mx_command, description='launch toolchain commands',
                                epilog='Additional arguments are forwarded to the LLVM image command.', add_help=False)
        parser.add_argument('command', help='toolchain command', metavar='<command>',
                            choices=self._supported_exes())
        parsed_args, tool_args = parser.parse_known_args(args)
        main = self._tool_to_main(self.exe_map[parsed_args.command])
        if "JACOCO" in os.environ:
            mx_gate._jacoco = os.environ["JACOCO"]
        return mx.run_java(mx.get_runtime_jvm_args([self.dist]) + [main] + tool_args, out=out)

    def _supported_exes(self):
        return [exe for tool in self._supported_tools() for exe in self._tool_to_aliases(tool)]

    def _supported_tools(self):
        return self.tools.keys()

    def _tool_to_exe(self, tool):
        return self._tool_to_aliases(tool)[0]

    def _tool_to_aliases(self, tool):
        self._check_tool(tool)
        return self.tool_map[tool]

    def _tool_to_main(self, tool):
        self._check_tool(tool)
        return self.tools[tool]

    def _check_tool(self, tool):
        if tool not in self._supported_tools():
            mx.abort("The {} toolchain (defined by {}) does not support tool '{}'".format(self.name, self.dist, tool))

    def get_toolchain_tool(self, tool):
        return os.path.join(mx.distribution(self.bootstrap_dist).get_output(), 'bin', self._tool_to_exe(tool))

    def get_toolchain_subdir(self):
        return self.name

    def get_launcher_configs(self):
        return [
            mx_sdk.LauncherConfig(
                destination=os.path.join(self.name, 'bin', self._tool_to_exe(tool)),
                jar_distributions=[self.suite.name + ":" + self.dist],
                main_class=self._tool_to_main(tool),
                build_args=[
                    '--macro:truffle',  # we need tool:truffle so that Engine.findHome works
                    '-H:-ParseRuntimeOptions',  # we do not want `-D` options parsed by SVM
                ],
                is_main_launcher=False,
                default_symlinks=False,
                links=[os.path.join(self.name, 'bin', e) for e in self._tool_to_aliases(tool)],
            ) for tool in self._supported_tools()
        ]
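
The alias handling in `ToolchainConfig` can be tried out in isolation. The snippet below is a standalone sketch that copies the two dictionary comprehensions from the class above; `TOOL_ALIASES` and `build_maps` are names introduced only for this illustration, and nothing here depends on `mx`.

```python
TOOL_ALIASES = {
    "CC": ["graalvm-{name}-clang", "graalvm-clang", "clang", "cc", "gcc"],
    "CXX": ["graalvm-{name}-clang++", "graalvm-clang++", "clang++", "c++", "g++"],
}


def build_maps(name):
    """Expand alias templates and build the reverse executable-to-tool map."""
    tool_map = {tool: [alias.format(name=name) for alias in aliases]
                for tool, aliases in TOOL_ALIASES.items()}
    exe_map = {exe: tool for tool, aliases in tool_map.items() for exe in aliases}
    return tool_map, exe_map


tool_map, exe_map = build_maps("native")
print(tool_map["CC"][0])  # primary executable name for the C compiler
print(exe_map["g++"])     # tool that the g++ alias resolves to
```

The first alias of each tool serves as the primary executable name, while the reverse map lets a launcher invoked under any alias find the tool it stands for.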


class ToolchainLauncherProject(mx.NativeProject):
    def __init__(self, suite, name, deps, workingSets, subDir, results=None, output=None, buildRef=True, **attrs):
        results = ["bin/" + e for e in suite.toolchain._supported_exes()]
        projectDir = attrs.pop('dir', None)
        if projectDir:
            d = join(suite.dir, projectDir)
        elif subDir is None:
            d = join(suite.dir, name)
        else:
            d = join(suite.dir, subDir, name)
        super(ToolchainLauncherProject, self).__init__(suite, name, subDir, [], deps, workingSets, results, output, d, **attrs)

    def getBuildEnv(self, replaceVar=mx_subst.path_substitutions):
        env = super(ToolchainLauncherProject, self).getBuildEnv(replaceVar=replaceVar)
        env['RESULTS'] = ' '.join(self.results)
        return env


_suite.toolchain = ToolchainConfig('native', 'SULONG_TOOLCHAIN_LAUNCHERS', 'SULONG_BOOTSTRAP_TOOLCHAIN',
                                   # unfortunately, we cannot define those in the suite.py because graalvm component
                                   # registration runs before the suite is properly initialized
                                   tools={
                                       "CC": "com.oracle.truffle.llvm.toolchain.launchers.Clang",
                                       "CXX": "com.oracle.truffle.llvm.toolchain.launchers.ClangXX",
                                   },
                                   suite=_suite)


mx_sdk.register_graalvm_component(mx_sdk.GraalVmLanguage(
    suite=_suite,
    name='Sulong',
    short_name='slg',
    dir_name='llvm',
    license_files=[],
    third_party_license_files=[],
-    truffle_jars=['sulong:SULONG'],
+    truffle_jars=['sulong:SULONG', 'sulong:SULONG_API'],
    support_distributions=[
        'sulong:SULONG_HOME',
        'sulong:SULONG_GRAALVM_DOCS',
@@ -703,10 +859,23 @@ def getResults(self):
            main_class='com.oracle.truffle.llvm.launcher.LLVMLauncher',
            build_args=[],
            language='llvm',
-        )
-    ],
+        ),
+    ] + _suite.toolchain.get_launcher_configs()
))

mx_sdk.register_graalvm_component(mx_sdk.GraalVmComponent(
    suite=_suite,
    name='LLVM.org toolchain',
    short_name='llp',
    installable=True,
    installable_id='llvm-toolchain',
    dir_name='jre/lib/llvm',
    license_files=[],
    third_party_license_files=[],
    support_distributions=['sulong:SULONG_LLVM_ORG']
))


COPYRIGHT_HEADER_BSD = """\
/*
* Copyright (c) 2016, 2019, Oracle and/or its affiliates.
@@ -746,6 +915,7 @@ def create_asm_parser(args=None, out=None):
    """create the inline assembly parser using antlr"""
    mx.suite("truffle").extensions.create_parser("com.oracle.truffle.llvm.asm.amd64", "com.oracle.truffle.llvm.asm.amd64", "InlineAssembly", COPYRIGHT_HEADER_BSD, args, out)


mx.update_commands(_suite, {
    'lli' : [runLLVM, ''],
    'test-llvm-image' : [_test_llvm_image, 'test a pre-built LLVM image'],