An LLVM backend for the Accelerate Array Language
This package compiles Accelerate code to LLVM IR, and executes that code on
multicore CPUs as well as NVIDIA GPUs. This avoids the need to go through
clang. For details on Accelerate, refer to the main repository.
We love all kinds of contributions, so feel free to open issues for missing features as well as report (or fix!) bugs on the issue tracker.
- Installing LLVM
  - Homebrew
  - Debian/Ubuntu
  - Building from source
- Installing Accelerate-LLVM
  - libNVVM
Haskell dependencies are available from Hackage, but there are several external library dependencies that you will need to install as well:
- libFFI (if using the accelerate-llvm-native backend for multicore CPUs)
- CUDA (if using the accelerate-llvm-ptx backend for NVIDIA GPUs)
A docker container is provided with this package preinstalled (via stack) at /opt/accelerate-llvm. Note that if you wish to use the accelerate-llvm-ptx GPU backend, you will need to install the NVIDIA docker plugin; see that page for more information.
$ docker run -it tmcdonell/accelerate-llvm
When installing LLVM, make sure that it includes the libLLVM shared library. If you want to use the GPU-targeting accelerate-llvm-ptx backend, make sure you install (or build) LLVM with the 'nvptx' target.
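As a rough way to check an existing installation, the following sketch queries the llvm-config tool that ships with LLVM. It assumes llvm-config is on your PATH; the LLVM_CONFIG variable and the has_nvptx helper are names introduced here for illustration, since some distributions install the tool under a versioned name. Note that --shared-mode reports how llvm-config itself would link against LLVM, so treat it as a hint rather than proof that the libLLVM shared library is present.

```shell
#!/bin/sh
# Sketch: inspect an LLVM installation via llvm-config.
# LLVM_CONFIG is a hypothetical override (e.g. LLVM_CONFIG=llvm-config-6.0).
LLVM_CONFIG="${LLVM_CONFIG:-llvm-config}"

# Succeeds if the space-separated target list in $1 contains NVPTX.
has_nvptx() {
  case " $1 " in
    *" NVPTX "*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v "$LLVM_CONFIG" >/dev/null 2>&1; then
  # Prints "shared" or "static".
  echo "link mode: $("$LLVM_CONFIG" --shared-mode)"
  if has_nvptx "$("$LLVM_CONFIG" --targets-built)"; then
    echo "NVPTX target: available"
  else
    echo "NVPTX target: missing"
  fi
else
  echo "$LLVM_CONFIG not found on PATH"
fi
```

If the NVPTX target is reported as missing, the accelerate-llvm-ptx backend will not be usable with that LLVM installation.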
Example using Homebrew on macOS:
$ brew install llvm-hs/homebrew-llvm/llvm-6.0
For Debian/Ubuntu based Linux distributions, the LLVM.org website provides binary distribution packages. Check apt.llvm.org for instructions for adding the correct package database for your OS version, and then:
$ apt-get install llvm-6.0-dev
Building from source
If your OS does not have an appropriate LLVM distribution available, you can also build from source. Detailed build instructions are available on the LLVM.org website. Note that you will require at least CMake 3.4.3 and a recent C++ compiler; at least Clang 3.1, GCC 4.8, or Visual Studio 2015 (update 3).
Download and unpack the LLVM-6.0 source code. We'll refer to the path that the source tree was unpacked to as
LLVM_SRC. Only the main LLVM source tree is required, but you can optionally add other components such as the Clang compiler or Polly loop optimiser. See the LLVM releases page for the complete list.
Create a temporary build directory and cd into it, for example:
$ mkdir /tmp/build
$ cd /tmp/build
Execute the following to configure the build. Here INSTALL_PREFIX is where LLVM is to be installed, for example:
$ cmake $LLVM_SRC -DCMAKE_INSTALL_PREFIX=$INSTALL_PREFIX -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_LINK_LLVM_DYLIB=ON
See options and variables for a list of additional build parameters you can specify.
Build and install:
$ cmake --build .
$ cmake --build . --target install
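After the install step it can be worth confirming that a libLLVM shared library actually landed in the install tree. The sketch below is one way to do that; find_libllvm is a helper introduced here for illustration, and the file name patterns are an assumption about how the library is typically named on Linux (.so) and macOS (.dylib).

```shell
#!/bin/sh
# Sketch: list libLLVM shared libraries under <prefix>/lib.
# Prints each match and fails (non-zero status) if none is found.
find_libllvm() {
  found=1
  for f in "$1"/lib/libLLVM*.so* "$1"/lib/libLLVM*.dylib; do
    # An unmatched glob stays literal, so test that the file really exists.
    if [ -e "$f" ]; then
      echo "$f"
      found=0
    fi
  done
  return "$found"
}

# Only run the check when INSTALL_PREFIX is set in the current shell.
if [ -n "${INSTALL_PREFIX:-}" ]; then
  find_libllvm "$INSTALL_PREFIX" ||
    echo "no libLLVM shared library under $INSTALL_PREFIX/lib"
fi
```

An empty result suggests the build was configured without -DLLVM_BUILD_LLVM_DYLIB=ON.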
For macOS only, some additional steps are useful to work around issues related to System Integrity Protection:
cd $INSTALL_PREFIX/lib
ln -s libLLVM.dylib libLLVM-6.0.dylib
install_name_tool -id $PWD/libLTO.dylib libLTO.dylib
install_name_tool -id $PWD/libLLVM.dylib libLLVM.dylib
install_name_tool -change '@rpath/libLLVM.dylib' $PWD/libLLVM.dylib libLTO.dylib
Once the dependencies are installed, we are ready to install accelerate-llvm.
For example, installation using stack just requires you to point it to the appropriate configuration file:
$ ln -s stack-8.2.yaml stack.yaml
$ stack setup
$ stack install
Note that the version of llvm-hs used must match the installed version of LLVM, which is currently 6.0.
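A quick pre-build check of the installed LLVM version might look like the sketch below. It assumes llvm-config is on your PATH; LLVM_CONFIG and is_llvm_6_0 are names introduced here for illustration and are not part of this package.

```shell
#!/bin/sh
# Sketch: confirm the installed LLVM is in the 6.0 series before building.
# LLVM_CONFIG is a hypothetical override for versioned tool names.
LLVM_CONFIG="${LLVM_CONFIG:-llvm-config}"

# Succeeds if version string $1 is 6.0 or 6.0.x.
is_llvm_6_0() {
  case "$1" in
    6.0|6.0.*) return 0 ;;
    *) return 1 ;;
  esac
}

if command -v "$LLVM_CONFIG" >/dev/null 2>&1; then
  version="$("$LLVM_CONFIG" --version)"
  if is_llvm_6_0 "$version"; then
    echo "LLVM $version: matches the 6.0 series"
  else
    echo "LLVM $version: expected a 6.0.x release"
  fi
else
  echo "$LLVM_CONFIG not found on PATH"
fi
```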
The accelerate-llvm-ptx backend can optionally be compiled to generate GPU code using the libNVVM library, rather than LLVM's inbuilt NVPTX code generator.
libNVVM is a closed-source library distributed as part of the
NVIDIA CUDA toolkit, and is what the
nvcc compiler itself uses internally when
compiling CUDA C code.
libNVVM may improve GPU performance compared to the code generator built in to LLVM. One difficulty with using it, however, is that since libNVVM is also based on LLVM, and typically lags LLVM by several releases, you must install accelerate-llvm with a "compatible" version of LLVM, which will depend on the version of the CUDA toolkit you have installed. The following table shows some combinations:
|              | LLVM-3.3 | LLVM-3.4 | LLVM-3.5 | LLVM-3.8 | LLVM-3.9 | LLVM-4.0 | LLVM-5.0 |
|:------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
| CUDA-7.0     | ⭕       | ❌       |          |          |          |          |          |
| CUDA-7.5     |          | ⭕       | ⭕       | ❌       |          |          |          |
| CUDA-8.0     |          |          | ⭕       | ⭕       | ❌       | ❌       |          |
| CUDA-9.0     |          |          |          |          |          | ❌       | ❌       |
Where ⭕ = Works, and ❌ = Does not work.
Note that the above restrictions on CUDA and LLVM version exist only if you want to use the NVVM component. Otherwise, you should be free to use any combination of CUDA and LLVM.
Also note that
accelerate-llvm-ptx itself currently requires at least LLVM-3.5.
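For scripted setups, the compatibility table above can be encoded as a small lookup. This is only a sketch that mirrors the table entries as listed; nvvm_compat is a hypothetical helper, and any combination the table leaves blank is reported as "not listed" rather than guessed.

```shell
#!/bin/sh
# Sketch: look up a CUDA/LLVM pair in the libNVVM compatibility table.
# Usage: nvvm_compat <cuda-version> <llvm-version>
nvvm_compat() {
  case "$1/$2" in
    7.0/3.3|7.5/3.4|7.5/3.5|8.0/3.5|8.0/3.8)
      echo "works" ;;
    7.0/3.4|7.5/3.8|8.0/3.9|8.0/4.0|9.0/4.0|9.0/5.0)
      echo "does not work" ;;
    *)
      # Blank cells in the table: no information either way.
      echo "not listed" ;;
  esac
}

nvvm_compat 8.0 3.8
nvvm_compat 9.0 5.0
```

These restrictions matter only when building with the nvvm flag enabled.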
If installing via stack, either edit the stack.yaml and add the following section:
flags:
  accelerate-llvm-ptx:
    nvvm: true
Or install using the following option on the command line:
$ stack install accelerate-llvm-ptx --flag accelerate-llvm-ptx:nvvm
If installing via cabal:
$ cabal install accelerate-llvm-ptx -fnvvm
Notable changes to the project will be documented in this file.
188.8.131.52 - 2018-04-03
run variants which do not take an explicit execution context now execute on the first available device in an exclusive fashion. Multi-GPU systems can specify the default set of GPUs to use with the environment variable ACCELERATE_LLVM_PTX_DEVICES as a list of device ordinals.
- support for half-precision floats
- support for struct-of-array-of-struct representations
- support 64-bit atomic-add instruction in forward permutations (#363)
- support for LLVM-6.0
- support for GHC-8.4
Special thanks to those who contributed patches as part of this release:
- Trevor L. McDonell (@tmcdonell)
- Moritz Kiefer (@cocreature)
184.108.40.206 - 2018-01-08
add support for building with CUDA-9.x
220.127.116.11 - 2017-09-21
- support for GHC-8.2
- caching of compilation results (accelerate-llvm#17)
- support for ahead-of-time compilation (
Fixed synchronisation bug in multidimensional reduction
18.104.22.168 - 2017-05-25
device kernel image is invalid (#386)
22.214.171.124 - 2017-03-31