accelerate

An embedded language for accelerated array processing

https://github.com/AccelerateHS/accelerate/

Version on this page:	1.1.1.0
LTS Haskell 11.22:	1.1.1.0
Stackage Nightly 2018-03-12:	1.1.1.0
Latest on Hackage:	1.3.0.0

BSD-3-Clause licensed by Manuel M T Chakravarty, Robert Clifton-Everest, Gabriele Keller, Ben Lever, Trevor L. McDonell, Ryan Newton, Sean Seefried

Maintained by Trevor L. McDonell

This version can be pinned in stack with: accelerate-1.1.1.0@sha256:ddfa440f8ff7ff1962cbe5eea2dac6bcca81fccc1c6959fdf854eb238cfb8d9e,14736

Module documentation for 1.1.1.0

Data
- Data.Array
  - Data.Array.Accelerate
    - Data.Array.Accelerate.AST
    - Data.Array.Accelerate.Analysis
      - Data.Array.Accelerate.Analysis.Hash
      - Data.Array.Accelerate.Analysis.Match
      - Data.Array.Accelerate.Analysis.Shape
      - Data.Array.Accelerate.Analysis.Stencil
      - Data.Array.Accelerate.Analysis.Type
    - Data.Array.Accelerate.Array
      - Data.Array.Accelerate.Array.Data
      - Data.Array.Accelerate.Array.Remote
        - Data.Array.Accelerate.Array.Remote.Class
        - Data.Array.Accelerate.Array.Remote.LRU
        - Data.Array.Accelerate.Array.Remote.Table
      - Data.Array.Accelerate.Array.Representation
      - Data.Array.Accelerate.Array.Sugar
      - Data.Array.Accelerate.Array.Unique
    - Data.Array.Accelerate.Async
    - Data.Array.Accelerate.Data
    - Data.Array.Accelerate.Debug
    - Data.Array.Accelerate.Error
    - Data.Array.Accelerate.FullList
    - Data.Array.Accelerate.Interpreter
    - Data.Array.Accelerate.Lifetime
    - Data.Array.Accelerate.Pretty
    - Data.Array.Accelerate.Product
    - Data.Array.Accelerate.Smart
    - Data.Array.Accelerate.Trafo
    - Data.Array.Accelerate.Type

Depends on 19 packages (full list with versions):

ansi-wl-pprint, base, base-orphans, containers, deepseq, directory, exceptions, fclabels, filepath, ghc-prim, hashable, hashtables, mtl, template-haskell, time, transformers, unique, unix, unordered-containers

Used by 18 packages in lts-10.3 (full list with versions):

accelerate-arithmetic, accelerate-bignum, accelerate-blas, accelerate-examples, accelerate-fft, accelerate-fftw, accelerate-fourier, accelerate-io, accelerate-llvm, accelerate-llvm-native, accelerate-llvm-ptx, accelerate-utility, colour-accelerate, gloss-accelerate, gloss-raster-accelerate, lens-accelerate, linear-accelerate, mwc-random-accelerate

An Embedded Language for Accelerated Array Computations

Data.Array.Accelerate defines an embedded language of array computations for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations (such as maps, reductions, and permutations). These computations are online-compiled and executed on a range of architectures.

For more details, see our papers, listed under Citing Accelerate below.

There are also slides from some fairly recent presentations:

Chapter 6 of Simon Marlow’s book Parallel and Concurrent Programming in Haskell contains a tutorial introduction to Accelerate.

Trevor’s PhD thesis details the design and implementation of the frontend optimisations and the CUDA backend.

A simple example

As a simple example, consider the computation of a dot product of two vectors of single-precision floating-point numbers:

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)

Except for the type, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be online-compiled for performance; for example, using Data.Array.Accelerate.LLVM.PTX.run, it may be offloaded to a GPU on the fly.
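
To try the dot product without a GPU, it can be run with the reference interpreter that ships in this package (Data.Array.Accelerate.Interpreter). The following is a minimal sketch rather than an official example: the input values and the main wrapper are illustrative assumptions, and the interpreter is meant for testing, not performance.

```haskell
{-# LANGUAGE TypeOperators #-}
module Main where

import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Scalar, Vector, Z(..), (:.)(..))
import Data.Array.Accelerate.Interpreter (run)

-- The dot product from the text, with Accelerate's operations qualified
-- so that zipWith does not clash with the Prelude version.
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  -- Illustrative inputs: two four-element vectors built from Haskell lists.
  let xs = A.fromList (Z :. 4) [1, 2, 3, 4] :: Vector Float
      ys = A.fromList (Z :. 4) [5, 6, 7, 8] :: Vector Float
  -- 'use' embeds the arrays into the Acc language; 'run' evaluates the program.
  print (A.toList (run (dotp (A.use xs) (A.use ys))))  -- prints [70.0]
```

Swapping the import of run for the one exported by a backend package such as accelerate-llvm-native or accelerate-llvm-ptx should leave dotp itself unchanged, since the backend is selected only by the choice of run.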

Availability

Package accelerate is available from

Hackage: accelerate - install with cabal install accelerate
GitHub: AccelerateHS/accelerate - get the source with git clone https://github.com/AccelerateHS/accelerate.git. The easiest way to compile the source distributions is via the Haskell stack tool.

Additional components

The following supported add-ons are available as separate packages:

accelerate-llvm-native: Backend targeting multicore CPUs
accelerate-llvm-ptx: Backend targeting CUDA-enabled NVIDIA GPUs. Requires a GPU with compute capability 2.0 or greater (see the table on Wikipedia)
accelerate-examples: Computational kernels and applications showcasing the use of Accelerate as well as a regression test suite (supporting function and performance testing)
accelerate-io: Fast conversion between Accelerate arrays and other array formats (for example, Repa and Vector)
accelerate-fft: Fast Fourier transform implementation, with FFI bindings to optimised implementations
accelerate-blas: BLAS and LAPACK operations, with FFI bindings to optimised implementations
accelerate-bignum: Fixed-width large integer arithmetic
colour-accelerate: Colour representations in Accelerate (RGB, sRGB, HSV, and HSL)
gloss-accelerate: Generate gloss pictures from Accelerate
gloss-raster-accelerate: Parallel rendering of raster images and animations
lens-accelerate: Lens operators for Accelerate types
linear-accelerate: Linear vector spaces in Accelerate
mwc-random-accelerate: Generate Accelerate arrays filled with high quality pseudorandom numbers
numeric-prelude-accelerate: Lifting the numeric-prelude to Accelerate

Install them from Hackage with cabal install PACKAGENAME.

Documentation

Haddock documentation is included and linked with the individual package releases on Hackage.
Haddock documentation for in-development components can be found here.
The idea behind the HOAS (higher-order abstract syntax) to de-Bruijn conversion used in the library is described separately.

Examples

accelerate-examples

The accelerate-examples package provides a range of computational kernels and a few complete applications. To install these from Hackage, issue cabal install accelerate-examples. The examples include:

An implementation of Canny edge detection
An interactive Mandelbrot set generator
An N-body simulation of gravitational attraction between solid particles
An implementation of the PageRank algorithm
A simple ray-tracer
A particle based simulation of stable fluid flows
A cellular automata simulation
A “password recovery” tool, for dictionary lookup of MD5 hashes

LULESH

LULESH-accelerate is an implementation of the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) mini-app. LULESH represents a typical hydrodynamics code such as ALE3D, but is a highly simplified application, hard-coded to solve the Sedov blast problem on an unstructured hexahedron mesh.

Λ ○ λ (Lol)

Λ ○ λ (Lol) is a general-purpose library for ring-based lattice cryptography. Lol has applications in, for example, symmetric-key somewhat-homomorphic encryption schemes. The lol-accelerate package provides an Accelerate backend for Lol.

Additional examples

Accelerate users have also built some substantial applications of their own. Please feel free to add your own examples!

Henning Thielemann, patch-image: Combine a collage of overlapping images
apunktbau, bildpunkt: A ray-marching distance field renderer
klarh, hasdy: Molecular dynamics in Haskell using Accelerate
Alexandros Gremm used Accelerate as part of the 2014 CSCS summer school (code)

Mailing list and contacts

Mailing list: sign up at the Accelerate Google Groups page (discussions on both use and development are welcome).
Bug reports and issue tracking: GitHub project page.

The maintainers of Accelerate are Manuel M T Chakravarty and Trevor L McDonell.

Citing Accelerate

If you use Accelerate for academic research, you are encouraged (though not required) to cite the following papers (BibTeX):

Manuel M. T. Chakravarty, Gabriele Keller, Sean Lee, Trevor L. McDonell, and Vinod Grover. Accelerating Haskell Array Codes with Multicore GPUs. In DAMP ’11: Declarative Aspects of Multicore Programming, ACM, 2011.
Trevor L. McDonell, Manuel M. T. Chakravarty, Gabriele Keller, and Ben Lippmeier. Optimising Purely Functional GPU Programs. In ICFP ’13: The 18th ACM SIGPLAN International Conference on Functional Programming, ACM, 2013.
Robert Clifton-Everest, Trevor L. McDonell, Manuel M. T. Chakravarty, and Gabriele Keller. Embedding Foreign Code. In PADL ’14: The 16th International Symposium on Practical Aspects of Declarative Languages, Springer-Verlag, LNCS, 2014.
Trevor L. McDonell, Manuel M. T. Chakravarty, Vinod Grover, and Ryan R. Newton. Type-safe Runtime Code Generation: Accelerate to LLVM. In Haskell ’15: The 8th ACM SIGPLAN Symposium on Haskell, ACM, 2015.

Accelerate is primarily developed by academics, so citations matter a lot to us. As an added benefit, you increase Accelerate’s exposure and potential user (and developer!) base, which is a benefit to all users of Accelerate. Thanks in advance!

What’s missing?

Here is a list of features that are currently missing:

Preliminary API (parts of the API may still change in subsequent releases)

Changes

Change Log

Notable changes to the project will be documented in this file.

The format is based on Keep a Changelog and the project adheres to the Haskell Package Versioning Policy (PVP).

1.1.1.0 - 2017-09-26

Changed

Improve and colourise the pretty-printer

1.1.0.0 - 2017-09-21

Added

Additional EKG monitoring hooks (#340)
Operations from RealFloat

Changed

Changed the type of scanl' and scanr' to return an Acc tuple, rather than a tuple of Acc arrays.
Specialised folds sum, product, minimum, maximum, and, or, any, all now reduce along the innermost dimension only, rather than reducing all elements. You can recover the old behaviour by first flattening the input array with flatten.
Added a new stencil boundary condition, function, which applies the given function to out-of-bounds indices.
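
The change to the specialised folds can be sketched as follows, assuming accelerate 1.1.0.0 or later and the reference interpreter from Data.Array.Accelerate.Interpreter. The names rowSums and totalSum and the sample matrix are hypothetical, chosen only for illustration.

```haskell
{-# LANGUAGE TypeOperators #-}
module Main where

import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Array, DIM2, Scalar, Vector, Z(..), (:.)(..))
import Data.Array.Accelerate.Interpreter (run)

-- New behaviour: sum reduces along the innermost dimension only,
-- so a matrix yields one result per row.
rowSums :: Acc (Array DIM2 Int) -> Acc (Vector Int)
rowSums = A.sum

-- Old behaviour recovered: flatten to a vector first, then reduce everything.
totalSum :: Acc (Array DIM2 Int) -> Acc (Scalar Int)
totalSum = A.sum . A.flatten

main :: IO ()
main = do
  -- A 2x3 matrix in row-major order: rows [1,2,3] and [4,5,6].
  let m = A.fromList (Z :. 2 :. 3) [1 .. 6] :: Array DIM2 Int
  print (A.toList (run (rowSums (A.use m))))   -- prints [6,15]
  print (A.toList (run (totalSum (A.use m))))  -- prints [21]
```

The same pattern should apply to the other folds listed above (product, minimum, maximum, and, or, any, all): apply flatten first when a reduction over all elements is wanted.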

Fixed

#390: Wrong number of arguments in printf

1.0.0.0 - 2017-03-31

Many API and internal changes
Bug fixes and other enhancements

0.15.1.0

Fix type of allocateArray

0.15.0.0

Bug fixes and performance improvements.

0.14.0.0

New iteration constructs.
Additional Prelude-like functions.
Improved code generation and fusion optimisation.
Concurrent kernel execution in the CUDA backend.
Bug fixes.

0.13.0.0

New array fusion optimisation.
New foreign function interface for array and scalar expressions.
Additional Prelude-like functions.
New example programs.
Bug fixes and performance improvements.

0.12.0.0

Full sharing recovery in scalar expressions and array computations.
Two new example applications in package accelerate-examples (both including a graphical frontend):
- A real-time Canny edge detection
- An interactive fluid flow simulator
Bug fixes.

0.11.0.0

New Prelude-like functions zip*, unzip*, fill, enumFrom*, tail, init, drop, take, slit, gather*, scatter*, and shapeSize.
New simplified AST (in package accelerate-backend-kit) for backend writers who want to avoid the complexities of the type-safe AST.

0.10.0.0

Complete sharing recovery for scalar expressions (but currently disabled by default).
Also bug fixes in array sharing recovery and a few new convenience functions.

0.9.0.0

Streaming computations
Precompilation
Repa-style array indices
Additional collective operations supported by the CUDA backend: stencils, more scans, rank-polymorphic fold, generate.
Conversions to other array formats
Bug fixes

0.8.1.0

Bug fixes and some performance tweaks.

0.8.0.0

More collective operations supported by the CUDA backend: replicate, slice and foldSeg. Frontend and interpreter support for stencil.
Bug fixes.

0.7.1.0

Initial release of the CUDA backend