# accelerate

An embedded language for accelerated array processing https://github.com/AccelerateHS/accelerate/

- Version on this page: 0.15.1.0
- LTS Haskell 9.5: 1.0.0.0
- Stackage Nightly 2017-09-20: 1.0.0.0
- Latest on Hackage: 1.1.0.0

**Manuel M T Chakravarty, Robert Clifton-Everest, Gabriele Keller, Ben Lever, Trevor L. McDonell, Ryan Newton, Sean Seefried**

**Trevor L. McDonell**

#### Module documentation for 0.15.1.0

- Data
- Data.Array
- Data.Array.Accelerate
- Data.Array.Accelerate.Data
- Data.Array.Accelerate.Interpreter


# An Embedded Language for Accelerated Array Computations

`Data.Array.Accelerate` defines an embedded language of array computations for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations (such as maps, reductions, and permutations). These computations are online-compiled and executed on a range of architectures.

For more details, see our papers:

- Accelerating Haskell Array Codes with Multicore GPUs
- Optimising Purely Functional GPU Programs (slides)
- Embedding Foreign Code
- Type-safe Runtime Code Generation: Accelerate to LLVM (slides) (video)

There are also slides from some fairly recent presentations:

- Embedded Languages for High-Performance Computing in Haskell
- GPGPU Programming in Haskell with Accelerate (video) (workshop)

Chapter 6 of Simon Marlow's book Parallel and Concurrent Programming in Haskell contains a tutorial introduction to Accelerate.

Trevor's PhD thesis details the design and implementation of the frontend optimisations and the CUDA backend.

**Table of Contents**

- An Embedded Language for Accelerated Array Computations
  - A simple example
  - Availability
  - Additional components
  - Requirements
  - Documentation
  - Examples
  - Mailing list and contacts
  - Citing Accelerate
  - What's missing?

## A simple example

As a simple example, consider the computation of a dot product of two vectors of single-precision floating-point numbers:

```
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)
```

Except for the type, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be online-compiled for performance; for example, using `Data.Array.Accelerate.LLVM.PTX.run` it may be off-loaded on the fly to a GPU.
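To try the dot product without a GPU, the following sketch runs it with the reference interpreter backend (`Data.Array.Accelerate.Interpreter`, listed in the module documentation above) and compares it with a plain list version. The helper name `dotpList` and the sample vectors are made up for illustration, and the code assumes the 1.x API of the `accelerate` package.

```haskell
import Data.Array.Accelerate             as A
import Data.Array.Accelerate.Interpreter ( run )  -- reference backend, no GPU needed
import Prelude                           as P

-- Plain Haskell version on lists, for comparison (hypothetical helper).
dotpList :: [Float] -> [Float] -> Float
dotpList xs ys = P.foldl (+) 0 (P.zipWith (*) xs ys)

-- Accelerate version, as in the text above. The names are qualified
-- because Prelude and Accelerate both export `zipWith`.
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  -- Build two three-element vectors on the Haskell side.
  let xs = fromList (Z :. 3) [1, 2, 3] :: Vector Float
      ys = fromList (Z :. 3) [4, 5, 6] :: Vector Float
  print (dotpList [1, 2, 3] [4, 5, 6])           -- prints 32.0
  -- `use` embeds the arrays into the Acc computation; `run` executes it.
  print (toList (run (dotp (use xs) (use ys))))  -- prints [32.0]
```

Swapping the imported `run` for the one from another backend package (for example `Data.Array.Accelerate.LLVM.PTX`) should leave `dotp` itself unchanged, since the backend is chosen only at the call to `run`.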

## Availability

Package accelerate is available from:

- Hackage: accelerate - install with `cabal install accelerate`
- GitHub: AccelerateHS/accelerate - get the source with `git clone https://github.com/AccelerateHS/accelerate.git`

The easiest way to compile the source distributions is via the Haskell stack tool.

## Additional components

The following supported add-ons are available as separate packages:

- accelerate-llvm-native: Backend targeting multicore CPUs
- accelerate-llvm-ptx: Backend targeting CUDA-enabled NVIDIA GPUs. Requires a GPU with compute capability 2.0 or greater (see the table on Wikipedia)
- accelerate-examples: Computational kernels and applications showcasing the use of Accelerate as well as a regression test suite (supporting function and performance testing)
- accelerate-io: Fast conversion between Accelerate arrays and other array formats (for example, Repa and Vector)
- accelerate-fft: Fast Fourier transform implementation, with FFI bindings to optimised implementations
- accelerate-blas: BLAS and LAPACK operations, with FFI bindings to optimised implementations
- accelerate-bignum: Fixed-width large integer arithmetic
- colour-accelerate: Colour representations in Accelerate (RGB, sRGB, HSV, and HSL)
- gloss-accelerate: Generate gloss pictures from Accelerate
- gloss-raster-accelerate: Parallel rendering of raster images and animations
- lens-accelerate: Lens operators for Accelerate types
- linear-accelerate: Linear vector spaces in Accelerate
- mwc-random-accelerate: Generate Accelerate arrays filled with high quality pseudorandom numbers
- numeric-prelude-accelerate: Lifting the numeric-prelude to Accelerate

Install them from Hackage with `cabal install PACKAGENAME`.

## Documentation

- Haddock documentation is included and linked with the individual package releases on Hackage.
- Haddock documentation for in-development components can be found here.
- The idea behind the HOAS (higher-order abstract syntax) to de-Bruijn conversion used in the library is described separately.

## Examples

### accelerate-examples

The accelerate-examples package provides a range of computational kernels and a few complete applications. To install these from Hackage, issue `cabal install accelerate-examples`. The examples include:

- An implementation of Canny edge detection
- An interactive Mandelbrot set generator
- An N-body simulation of gravitational attraction between solid particles
- An implementation of the PageRank algorithm
- A simple ray-tracer
- A particle based simulation of stable fluid flows
- A cellular automata simulation
- A "password recovery" tool, for dictionary lookup of MD5 hashes

### LULESH

LULESH-accelerate is an implementation of the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) mini-app. LULESH represents a typical hydrodynamics code such as ALE3D, but is a highly simplified application, hard-coded to solve the Sedov blast problem on an unstructured hexahedron mesh.

### Λ ○ λ (Lol)

Λ ○ λ (Lol) is a general-purpose library for ring-based lattice cryptography. Lol has applications in, for example, symmetric-key somewhat-homomorphic encryption schemes. The lol-accelerate package provides an Accelerate backend for Lol.

### Additional examples

Accelerate users have also built some substantial applications of their own. Please feel free to add your own examples!

- Henning Thielemann, patch-image: Combine a collage of overlapping images
- apunktbau, bildpunkt: A ray-marching distance field renderer
- klarh, hasdy: Molecular dynamics in Haskell using Accelerate
- Alexandros Gremm used Accelerate as part of the 2014 CSCS summer school (code)

## Mailing list and contacts

- Mailing list: `accelerate-haskell@googlegroups.com` (discussions on both use and development are welcome)
- Sign up for the mailing list at the Accelerate Google Groups page.
- Bug reports and issue tracking: GitHub project page.

The maintainers of Accelerate are Manuel M T Chakravarty and Trevor L McDonell.

## Citing Accelerate

If you use Accelerate for academic research, you are encouraged (though not required) to cite the following papers (BibTeX):

- Manuel M. T. Chakravarty, Gabriele Keller, Sean Lee, Trevor L. McDonell, and Vinod Grover. Accelerating Haskell Array Codes with Multicore GPUs. In *DAMP '11: Declarative Aspects of Multicore Programming*, ACM, 2011.
- Trevor L. McDonell, Manuel M. T. Chakravarty, Gabriele Keller, and Ben Lippmeier. Optimising Purely Functional GPU Programs. In *ICFP '13: The 18th ACM SIGPLAN International Conference on Functional Programming*, ACM, 2013.
- Robert Clifton-Everest, Trevor L. McDonell, Manuel M. T. Chakravarty, and Gabriele Keller. Embedding Foreign Code. In *PADL '14: The 16th International Symposium on Practical Aspects of Declarative Languages*, Springer-Verlag, LNCS, 2014.
- Trevor L. McDonell, Manuel M. T. Chakravarty, Vinod Grover, and Ryan R. Newton. Type-safe Runtime Code Generation: Accelerate to LLVM. In *Haskell '15: The 8th ACM SIGPLAN Symposium on Haskell*, ACM, 2015.

Accelerate is primarily developed by academics, so citations matter a lot to us. As an added benefit, you increase Accelerate's exposure and potential user (and developer!) base, which is a benefit to all users of Accelerate. Thanks in advance!

## What's missing?

Here is a list of features that are currently missing:

- Preliminary API (parts of the API may still change in subsequent releases)

## Changes

# Change Log

Notable changes to the project will be documented in this file.

The format is based on Keep a Changelog and the project adheres to the Haskell Package Versioning Policy (PVP).

## 1.1.0.0 - 2017-09-21

### Added

- Additional EKG monitoring hooks (#340)
- Operations from `RealFloat`

### Changed

- Changed type of `scanl'`, `scanr'` to return an `Acc` tuple, rather than a tuple of `Acc` arrays.
- Specialised folds `sum`, `product`, `minimum`, `maximum`, `and`, `or`, `any`, `all` now reduce along the innermost dimension only, rather than reducing all elements. You can recover the old behaviour by first `flatten`-ing the input array.
- Add new stencil boundary condition `function`, to apply the given function to out-of-bounds indices.
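The change to the specialised folds can be illustrated with a minimal sketch. It assumes accelerate 1.1.0.0 or later and uses the reference interpreter backend; the 2x3 matrix is made up for the example. On earlier versions `sum` reduces all elements, so the first result would differ.

```haskell
import Data.Array.Accelerate             as A
import Data.Array.Accelerate.Interpreter ( run )

main :: IO ()
main = do
  -- A 2x3 matrix with rows [1,2,3] and [4,5,6] (illustrative data).
  let mat = fromList (Z :. 2 :. 3) [1 .. 6] :: Array DIM2 Int

  -- New behaviour: reduce along the innermost dimension, one sum per row.
  print (toList (run (A.sum (use mat))))            -- prints [6,15]

  -- Old behaviour: flatten to a vector first, then reduce all elements.
  print (toList (run (A.sum (flatten (use mat)))))  -- prints [21]
```

The same pattern should apply to the other folds named above, such as `product`, `minimum`, and `maximum`.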

### Fixed

## 1.0.0.0 - 2017-03-31

- Many API and internal changes
- Bug fixes and other enhancements

## 0.15.1.0

Fix type of `allocateArray`

## 0.15.0.0

Bug fixes and performance improvements.

## 0.14.0.0

- New iteration constructs.
- Additional Prelude-like functions.
- Improved code generation and fusion optimisation.
- Concurrent kernel execution in the CUDA backend.
- Bug fixes.

## 0.13.0.0

- New array fusion optimisation.
- New foreign function interface for array and scalar expressions.
- Additional Prelude-like functions.
- New example programs.
- Bug fixes and performance improvements.

## 0.12.0.0

- Full sharing recovery in scalar expressions and array computations.
- Two new example applications in package `accelerate-examples` (both including a graphical frontend):
  - A real-time Canny edge detection
  - An interactive fluid flow simulator
- Bug fixes.

## 0.11.0.0

- New Prelude-like functions `zip*`, `unzip*`, `fill`, `enumFrom*`, `tail`, `init`, `drop`, `take`, `slit`, `gather*`, `scatter*`, and `shapeSize`.
- New simplified AST (in package `accelerate-backend-kit`) for backend writers who want to avoid the complexities of the type-safe AST.

## 0.10.0.0

- Complete sharing recovery for scalar expressions (but currently disabled by default).
- Also bug fixes in array sharing recovery and a few new convenience functions.

## 0.9.0.0

- Streaming computations
- Precompilation
- Repa-style array indices
- Additional collective operations supported by the CUDA backend: `stencil`s, more `scan`s, rank-polymorphic `fold`, `generate`.
- Conversions to other array formats
- Bug fixes

## 0.8.1.0

Bug fixes and some performance tweaks.

## 0.8.0.0

- More collective operations supported by the CUDA backend: `replicate`, `slice` and `foldSeg`. Frontend and interpreter support for `stencil`.
- Bug fixes.

## 0.7.1.0

Initial release of the CUDA backend