An embedded language for accelerated array processing

BSD-3-Clause licensed by Manuel M T Chakravarty, Robert Clifton-Everest, Gabriele Keller, Ben Lever, Trevor L. McDonell, Ryan Newton, Sean Seefried
Maintained by Trevor L. McDonell

Module documentation for

  • Data
    • Data.Array
      • Data.Array.Accelerate
        • Data.Array.Accelerate.AST
        • Data.Array.Accelerate.Analysis
          • Data.Array.Accelerate.Analysis.Match
          • Data.Array.Accelerate.Analysis.Shape
          • Data.Array.Accelerate.Analysis.Stencil
          • Data.Array.Accelerate.Analysis.Type
        • Data.Array.Accelerate.Array
          • Data.Array.Accelerate.Array.Data
          • Data.Array.Accelerate.Array.Remote
            • Data.Array.Accelerate.Array.Remote.Class
            • Data.Array.Accelerate.Array.Remote.LRU
            • Data.Array.Accelerate.Array.Remote.Table
          • Data.Array.Accelerate.Array.Representation
          • Data.Array.Accelerate.Array.Sugar
          • Data.Array.Accelerate.Array.Unique
        • Data.Array.Accelerate.Async
        • Data.Array.Accelerate.Data
        • Data.Array.Accelerate.Debug
        • Data.Array.Accelerate.Error
        • Data.Array.Accelerate.FullList
        • Data.Array.Accelerate.Interpreter
        • Data.Array.Accelerate.Lifetime
        • Data.Array.Accelerate.Pretty
        • Data.Array.Accelerate.Product
        • Data.Array.Accelerate.Smart
        • Data.Array.Accelerate.Trafo
        • Data.Array.Accelerate.Type

Data.Array.Accelerate defines an embedded array language for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations, such as maps, reductions, and permutations. These computations may then be compiled online (at run time) and executed on a range of architectures.
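To make "collective operations on multi-dimensional arrays" concrete, the following is a minimal sketch of a map and a reduction over a two-dimensional array. It assumes the reference interpreter from Data.Array.Accelerate.Interpreter (listed in the module index above) as the execution backend; the names matrix, doubled, and rowSums and the sample data are hypothetical and chosen only for illustration.

```haskell
{-# LANGUAGE TypeOperators #-}
-- TypeOperators is enabled so the (:.) shape constructor can be imported by name.
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Array, DIM2, Vector, Z(..), (:.)(..))
import Data.Array.Accelerate.Interpreter (run)

-- Hypothetical sample data: a 2x3 matrix in row-major order, [[1,2,3],[4,5,6]].
matrix :: Array DIM2 Int
matrix = A.fromList (Z :. 2 :. 3) [1 .. 6]

-- A map: double every element; the shape is unchanged.
doubled :: Acc (Array DIM2 Int) -> Acc (Array DIM2 Int)
doubled = A.map (* 2)

-- A reduction: fold works along the innermost dimension,
-- so a matrix yields a vector with one sum per row.
rowSums :: Acc (Array DIM2 Int) -> Acc (Vector Int)
rowSums = A.fold (+) 0

main :: IO ()
main = do
  print (A.toList (run (doubled (A.use matrix))))  -- [2,4,6,8,10,12]
  print (A.toList (run (rowSums (A.use matrix))))  -- [6,15]
```

Here use embeds an ordinary Haskell-side array into the Acc language, and run evaluates the embedded program back to a plain array.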

A simple example

As a simple example, consider the computation of a dot product of two vectors of floating point numbers:

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)

Except for the type, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be online-compiled for performance; for example, using Data.Array.Accelerate.LLVM.PTX it may be offloaded on the fly to the GPU.
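The definition of dotp can be turned into a complete program roughly as follows. This is a sketch, assuming the reference interpreter (Data.Array.Accelerate.Interpreter) as the backend so that no GPU or LLVM installation is needed; the input vectors are hypothetical sample data. The library is imported qualified because several of its names (zipWith, map, and others) clash with the Prelude.

```haskell
{-# LANGUAGE TypeOperators #-}
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Scalar, Vector, Z(..), (:.)(..))
import Data.Array.Accelerate.Interpreter (run)

-- The dot product from the text, with the collective operations qualified.
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  -- Hypothetical inputs: two three-element vectors.
  let xs = A.fromList (Z :. 3) [1, 2, 3] :: Vector Float
      ys = A.fromList (Z :. 3) [4, 5, 6] :: Vector Float
  -- 'use' lifts the arrays into Acc; 'run' executes the embedded program.
  -- 1*4 + 2*5 + 3*6 = 32
  print (A.toList (run (dotp (A.use xs) (A.use ys))))  -- prints [32.0]
```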

Additional components

The following supported add-ons are available as separate packages. Install them from Hackage with cabal install <package>.

  • accelerate-llvm-native: Backend supporting parallel execution on multicore CPUs.

  • accelerate-llvm-ptx: Backend supporting parallel execution on CUDA-capable NVIDIA GPUs. Requires a GPU with compute capability 2.0 or greater.

  • accelerate-cuda: Backend targeting CUDA-enabled NVIDIA GPUs. Requires a GPU with compute capability 1.2 or greater. NOTE: This backend is being deprecated in favour of accelerate-llvm-ptx.

  • accelerate-examples: Computational kernels and applications showcasing the use of Accelerate as well as a regression test suite, supporting function and performance testing.

  • accelerate-io: Fast conversions between Accelerate arrays and other array formats (including vector and repa).

  • accelerate-fft: Discrete Fourier transforms, with FFI bindings to optimised implementations.

  • accelerate-bignum: Fixed-width large integer arithmetic.

  • colour-accelerate: Colour representations in Accelerate (RGB, sRGB, HSV, and HSL).

  • gloss-accelerate: Generate gloss pictures from Accelerate.

  • gloss-raster-accelerate: Parallel rendering of raster images and animations.

  • lens-accelerate: Lens operators for Accelerate types.

  • linear-accelerate: Linear vector spaces in Accelerate.

  • mwc-random-accelerate: Generate Accelerate arrays filled with high quality pseudorandom numbers.
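Because each backend exposes its own run function, switching backends is mostly a matter of which run is applied to the Acc program. The sketch below passes the backend in as an argument; dotpWith is a hypothetical helper, and the interpreter is used so the example does not depend on any add-on. It assumes that, with accelerate-llvm-native or accelerate-llvm-ptx installed, the run exported by Data.Array.Accelerate.LLVM.Native or Data.Array.Accelerate.LLVM.PTX can be passed in the same position.

```haskell
{-# LANGUAGE TypeOperators #-}
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Scalar, Vector, Z(..), (:.)(..))
import qualified Data.Array.Accelerate.Interpreter as Interp

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

-- Hypothetical helper: the first argument is the backend's 'run'.
dotpWith
  :: (Acc (Scalar Float) -> Scalar Float)
  -> Vector Float
  -> Vector Float
  -> Scalar Float
dotpWith runBackend xs ys = runBackend (dotp (A.use xs) (A.use ys))

main :: IO ()
main = do
  let xs = A.fromList (Z :. 4) [1, 2, 3, 4] :: Vector Float
      ys = A.fromList (Z :. 4) [1, 1, 1, 1] :: Vector Float
  -- Reference interpreter, shipped with the accelerate package itself.
  print (A.toList (dotpWith Interp.run xs ys))
  -- Assumption: with an LLVM backend installed, import it instead, e.g.
  --   import qualified Data.Array.Accelerate.LLVM.Native as CPU
  --   print (A.toList (dotpWith CPU.run xs ys))
```

Keeping the Acc program separate from the choice of run means the same dotp definition can be tested with the interpreter and later executed on a multicore CPU or GPU.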

Examples and documentation

Haddock documentation is included in the package.

The accelerate-examples package demonstrates a range of computational kernels and several complete applications, including:

  • An implementation of the Canny edge detection algorithm

  • An interactive Mandelbrot set generator

  • A particle-based simulation of stable fluid flows

  • An n-body simulation of gravitational attraction between solid particles

  • An implementation of the PageRank algorithm

  • A simple interactive ray tracer

  • A cellular automata simulation

  • A "password recovery" tool, for dictionary lookup of MD5 hashes

lulesh-accelerate is an implementation of the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) mini-app. LULESH represents a typical hydrodynamics code such as ALE3D, but is highly simplified and hard-coded to solve the Sedov blast problem on an unstructured hexahedron mesh.

Mailing list and contacts


  • Many API and internal changes

  • Bug fixes and other enhancements

  • Bug fixes and performance improvements.

  • New iteration constructs.

  • Additional Prelude-like functions.

  • Improved code generation and fusion optimisation.

  • Concurrent kernel execution in the CUDA backend.

  • Bug fixes.

  • New array fusion optimisation.

  • New foreign function interface for array and scalar expressions.

  • Additional Prelude-like functions.

  • New example programs.

  • Bug fixes and performance improvements.

  • Full sharing recovery in scalar expressions and array computations.

  • Two new example applications in package accelerate-examples: Real-time Canny edge detection and an interactive fluid flow simulator (both including a graphical frontend).

  • Bug fixes.

  • New Prelude-like functions zip*, unzip*, fill, enumFrom*, tail, init, drop, take, slit, gather*, scatter*, and shapeSize.

  • New simplified AST (in package accelerate-backend-kit) for backend writers who want to avoid the complexities of the type-safe AST.

  • Complete sharing recovery for scalar expressions (but currently disabled by default).

  • Also bug fixes in array sharing recovery and a few new convenience functions.

  • Streaming computations

  • Precompilation

  • Repa-style array indices

  • Additional collective operations supported by the CUDA backend: stencils, more scans, rank-polymorphic fold, generate.

  • Conversions to other array formats

  • Bug fixes

  • Bug fixes and some performance tweaks.

  • More collective operations supported by the CUDA backend: replicate, slice and foldSeg. Frontend and interpreter support for stencil.

  • Bug fixes.

  • Initial release of the CUDA backend