An embedded language for accelerated array processing


BSD-3-Clause licensed by Manuel M T Chakravarty, Robert Clifton-Everest, Gabriele Keller, Sean Lee, Ben Lever, Trevor L. McDonell, Ryan Newton, Sean Seefried

Module documentation

  • Data
    • Data.Array
      • Data.Array.Accelerate
        • Data.Array.Accelerate.AST
        • Data.Array.Accelerate.Analysis
          • Data.Array.Accelerate.Analysis.Match
          • Data.Array.Accelerate.Analysis.Shape
          • Data.Array.Accelerate.Analysis.Stencil
          • Data.Array.Accelerate.Analysis.Type
        • Data.Array.Accelerate.Array
          • Data.Array.Accelerate.Array.Data
          • Data.Array.Accelerate.Array.Representation
          • Data.Array.Accelerate.Array.Sugar
        • Data.Array.Accelerate.Data
        • Data.Array.Accelerate.Debug
        • Data.Array.Accelerate.Error
        • Data.Array.Accelerate.Interpreter
        • Data.Array.Accelerate.Pretty
        • Data.Array.Accelerate.Smart
        • Data.Array.Accelerate.Trafo
        • Data.Array.Accelerate.Tuple
        • Data.Array.Accelerate.Type

Data.Array.Accelerate defines an embedded array language for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations, such as maps, reductions, and permutations. These computations may then be compiled online and executed on a range of architectures.

A simple example

As a simple example, consider the computation of a dot product of two vectors of floating-point numbers:

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)

Except for the types, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be online-compiled for performance; for example, using Data.Array.Accelerate.CUDA it may be offloaded to the GPU on the fly.
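For comparison, the list-based version alluded to above can be sketched as follows (dotpList is a hypothetical name; only the types differ from the Acc version):

```haskell
-- List-based analogue of dotp: the same structure, but operating on
-- ordinary Haskell lists instead of embedded Accelerate arrays.
dotpList :: [Float] -> [Float] -> Float
dotpList xs ys = foldr (+) 0 (zipWith (*) xs ys)
```

Switching the list type to Acc (Vector Float) is what lets Accelerate compile and execute the same computation on a parallel backend.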

Available backends

Currently, there are two backends:

  1. An interpreter that serves as a reference implementation of the intended semantics of the language, which is included in this package.

  2. A CUDA backend generating code for CUDA-capable NVIDIA GPUs.
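A computation is executed by handing it to a backend's run function. A minimal sketch using the bundled reference interpreter, assuming the accelerate package is installed (fromList, use, and run are part of the library's public API):

```haskell
import Data.Array.Accelerate             as A
import Data.Array.Accelerate.Interpreter (run)

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  let xs = fromList (Z :. 3) [1, 2, 3] :: Vector Float
      ys = fromList (Z :. 3) [4, 5, 6] :: Vector Float
  -- 'use' embeds the host arrays into the computation;
  -- 'run' evaluates it with the reference interpreter.
  print (run (dotp (use xs) (use ys)))
```

Swapping the import for a different backend's run (for example, the CUDA backend's) leaves the computation itself unchanged.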

Several experimental and/or incomplete backends also exist. If you are particularly interested in any of these, especially with helping to finish them, please contact us.

  1. Cilk/ICC and OpenCL.

  2. Another OpenCL backend.

  3. A backend to the Repa array library.

  4. An infrastructure for generating LLVM code, with backends targeting multicore CPUs and NVIDIA GPUs.

Additional components

The following support packages are available:

  1. accelerate-cuda: A high-performance parallel backend targeting CUDA-enabled NVIDIA GPUs. Requires the NVIDIA CUDA SDK and, for full functionality, hardware with compute capability 1.1 or greater. See the table on Wikipedia for supported GPUs.

  2. accelerate-examples: Computational kernels and applications showcasing Accelerate, as well as performance and regression tests.

  3. accelerate-io: Fast conversion between Accelerate arrays and other formats, including vector and repa.

  4. accelerate-fft: Computation of Discrete Fourier Transforms.

Install them from Hackage with cabal install PACKAGE.
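For example, assuming a working cabal-install toolchain, the backend and example suite can be installed with:

```shell
# Install the CUDA backend and the example programs from Hackage
cabal install accelerate-cuda accelerate-examples
```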

Examples and documentation

Haddock documentation is included in the package, and a tutorial is available on the GitHub wiki.

The accelerate-examples package demonstrates a range of computational kernels and several complete applications, including:

  • An implementation of the Canny edge detection algorithm

  • An interactive Mandelbrot set generator

  • A particle-based simulation of stable fluid flows

  • An n-body simulation of gravitational attraction between solid particles

  • A cellular automata simulation

  • A "password recovery" tool, for dictionary lookup of MD5 hashes

  • A simple interactive ray tracer

Mailing list and contacts
Hackage note

The module documentation list generated by Hackage is incorrect. The only exposed modules should be:

  • Data.Array.Accelerate

  • Data.Array.Accelerate.Interpreter

  • Data.Array.Accelerate.Data.Complex


Change log
  • Compiles with ghc-7.8 and ghc-7.10

  • Minor bug fixes

  • Bug fixes and performance improvements.

  • New iteration constructs.

  • Additional Prelude-like functions.

  • Improved code generation and fusion optimisation.

  • Concurrent kernel execution in the CUDA backend.

  • Bug fixes.

  • New array fusion optimisation.

  • New foreign function interface for array and scalar expressions.

  • Additional Prelude-like functions.

  • New example programs.

  • Bug fixes and performance improvements.

  • Full sharing recovery in scalar expressions and array computations.

  • Two new example applications in package accelerate-examples: Real-time Canny edge detection and an interactive fluid flow simulator (both including a graphical frontend).

  • Bug fixes.

  • New Prelude-like functions zip*, unzip*, fill, enumFrom*, tail, init, drop, take, slit, gather*, scatter*, and shapeSize.

  • New simplified AST (in package accelerate-backend-kit) for backend writers who want to avoid the complexities of the type-safe AST.

  • Complete sharing recovery for scalar expressions (but currently disabled by default).

  • Also bug fixes in array sharing recovery and a few new convenience functions.

  • Streaming computations

  • Precompilation

  • Repa-style array indices

  • Additional collective operations supported by the CUDA backend: stencils, more scans, rank-polymorphic fold, generate.

  • Conversions to other array formats

  • Bug fixes

  • Bug fixes and some performance tweaks.

  • More collective operations supported by the CUDA backend: replicate, slice and foldSeg. Frontend and interpreter support for stencil.

  • Bug fixes.

  • Initial release of the CUDA backend