cassava
A CSV parsing and encoding library
https://github.com/haskell-hvr/cassava
| LTS Haskell 24.16: | 0.5.4.1 |
| Stackage Nightly 2025-10-24: | 0.5.4.1 |
| Latest on Hackage: | 0.5.4.1 |
cassava: A CSV parsing and encoding library
Please refer to the package description for an overview of cassava.
Usage example
Here’s the two-second crash course in using the library. Given a CSV file with this content:
John Doe,50000
Jane Doe,60000
here’s how you’d process it record-by-record:
{-# LANGUAGE ScopedTypeVariables #-}

import qualified Data.ByteString.Lazy as BL
import Data.Csv
import qualified Data.Vector as V

main :: IO ()
main = do
    csvData <- BL.readFile "salaries.csv"
    case decode NoHeader csvData of
        Left err -> putStrLn err
        Right v -> V.forM_ v $ \ (name, salary :: Int) ->
            putStrLn $ name ++ " earns " ++ show salary ++ " dollars"
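Going in the other direction, the same rows can be written back out with encode, which works for any type with a ToRecord instance (plain tuples included). Here is a minimal sketch; the output file name is only illustrative:

import qualified Data.ByteString.Lazy as BL
import Data.Csv

main :: IO ()
main = do
    -- encode serialises a list of ToRecord values; tuples already have instances.
    BL.writeFile "salaries-out.csv" $ encode
        [ ("John Doe" :: String, 50000 :: Int)
        , ("Jane Doe", 60000)
        ]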
If you want to parse a file that includes a header, like this one
name,salary
John Doe,50000
Jane Doe,60000
use decodeByName:
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative
import qualified Data.ByteString.Lazy as BL
import Data.Csv
import qualified Data.Vector as V

data Person = Person
    { name   :: !String
    , salary :: !Int
    }

instance FromNamedRecord Person where
    parseNamedRecord r = Person <$> r .: "name" <*> r .: "salary"

main :: IO ()
main = do
    csvData <- BL.readFile "salaries.csv"
    case decodeByName csvData of
        Left err -> putStrLn err
        Right (_, v) -> V.forM_ v $ \ p ->
            putStrLn $ name p ++ " earns " ++ show (salary p) ++ " dollars"
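Person records can also be written back out with a header. Here is a minimal sketch (assuming the Person type from the example above) using encodeByName together with a hand-written ToNamedRecord instance; the output file name is only illustrative:

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy as BL
import Data.Csv

instance ToNamedRecord Person where
    toNamedRecord p = namedRecord ["name" .= name p, "salary" .= salary p]

main :: IO ()
main = do
    -- encodeByName takes the header (column order) followed by the records to write.
    BL.writeFile "salaries-out.csv" $
        encodeByName (header ["name", "salary"])
            [Person "John Doe" 50000, Person "Jane Doe" 60000]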
You can find more code examples in the examples/ folder as well as smaller usage examples in the Data.Csv module documentation.
Project Goals for cassava
There’s no end to what people consider CSV data. Most programs don’t
follow RFC 4180, so one has to
make a judgment call about which contributions to accept. Consequently, not
everything gets accepted, because then we’d end up with a (slow)
general-purpose parsing library. There are plenty of those. The goal
is to roughly accept what the Python
csv module accepts.
The Python csv module (which is implemented in C) is also considered
the baseline for performance. Adding options (e.g. the above-mentioned
parsing “flexibility”) has to be weighed against performance. There have
been complaints about performance in the past, so, if in doubt,
performance wins over features.
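For a concrete sense of what such an option looks like today, the field delimiter is already configurable through DecodeOptions. The following sketch (with a hypothetical tab-separated salaries.tsv) decodes TSV data by overriding decDelimiter:

{-# LANGUAGE ScopedTypeVariables #-}

import qualified Data.ByteString.Lazy as BL
import Data.Char (ord)
import Data.Csv
import qualified Data.Vector as V

main :: IO ()
main = do
    tsvData <- BL.readFile "salaries.tsv"  -- hypothetical tab-separated input
    -- Same data as before, but with a tab instead of a comma as the delimiter.
    let opts = defaultDecodeOptions { decDelimiter = fromIntegral (ord '\t') }
    case decodeWith opts NoHeader tsvData of
        Left err -> putStrLn err
        Right v  -> V.forM_ v $ \ (name, salary :: Int) ->
            putStrLn $ name ++ " earns " ++ show salary ++ " dollars"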
Last but not least, it’s important to keep the dependency footprint light, as each additional dependency incurs costs and risks in terms of additional maintenance overhead and loss of flexibility. So adding a new package dependency should only be done if that dependency is known to be a reliable package and there’s a clear benefit which outweighs the cost.
Further reading
The primary API documentation for cassava is its Haddock documentation, which can be found at http://hackage.haskell.org/package/cassava/docs/Data-Csv.html
Changes
Version 0.5.4.1
Andreas Abel, 2025-09-02
- Bump dependency lower bounds to at least GHC 8.0 and Stackage LTS 7.0.
- Build tested with GHC 8.0 - 9.14 alpha1.
- Functionality tested with GHC 8.4 - 9.14 alpha1.
Version 0.5.4.0
Andreas Abel, 2025-06-10
- Add `decodeWithP` and `decodeByNameWithP` to `Streaming` interface (PR #237).
- Build tested with GHC 8.0 - 9.12.2.
- Functionality tested with GHC 8.4 - 9.12.2.
Version 0.5.3.2
Andreas Abel, 2024-08-03
- Proper exception on hanging doublequote (PR #222).
- Allow latest `hashable`.
- Build tested with GHC 8.0 - 9.10.1.
- Functionality tested with GHC 8.4 - 9.10.1.
Version 0.5.3.1
Andreas Abel, 2024-04-23
- Remove support for GHC 7.
- Remove cabal flag `bytestring--LT-0_10_4` and support for `bytestring < 0.10.4`.
- Tested with GHC 8.0 - 9.10 alpha3
Version 0.5.3.0 revision 2
- Allow `bytestring-0.12`
- Tested with GHC 7.4 - 9.6.2
Version 0.5.3.0 revision 1
- Allow `base-4.18`
- Tested with GHC 7.4 - 9.6.1 alpha
Version 0.5.3.0
Andreas Abel, 2022-07-10
- Improve error messages for `lookup` and NamedRecord parsers (#197)
- Fix bug (infinite loop) in `FromField Const` instance (#185)
- Turn flag `bytestring--LT-0_10_4` off by default (#183)
- Doc: Add cassava usage example of reading/writing to file (#97)
- Update to latest version of dependencies (#190, #193, #199)
- Tested with GHC 7.4 - 9.4 (#184, #204)
Version 0.5.2.0
Herbert Valerio Riedel, 2019-09-01
- Add `FromField`/`ToField` instances for `Identity` and `Const` (#158)
- New typeclass-less decoding functions `decodeWithP` and `decodeByNameWithP` (#67, #167)
- Support for final phase of MFP / base-4.13
Version 0.5.1.0
Herbert Valerio Riedel, 2017-08-12
- Add `FromField`/`ToField` instance for `Natural` (#141, #142)
- Add `FromField`/`ToField` instances for `Scientific` (#143, #144)
- Add support for modifying Generics-based instances (adding `Options`, `defaultOptions`, `fieldLabelModifier`, `genericParseRecord`, `genericToRecord`, `genericToNamedRecord`, `genericHeaderOrder`) (#139, #140)
- Documentation improvements
Version 0.5.0.0
Herbert Valerio Riedel, 2017-06-19
Semantic changes
- Don’t unnecessarily quote spaces with `QuoteMinimal` (#118, #122, #86)
- Fix semantics of `foldl'` (#102)
- Fix field error diagnostics being mapped to `endOfInput` in `Parser` monad (#99)
- Honor `encIncludeHeader` in incremental API (#136)
Other changes
- Support GHC 8.2.1
- Use factored-out `Only` package
- Add `FromField`/`ToField` instance for `ShortText`
- Add `MonadFail` and `Semigroup` instance for `Parser`
- Add `Semigroup` instance for incremental CSV API `Builder` & `NamedBuilder`
- Port to `ByteString` builder & drop dependency on `blaze-builder`
Version 0.4.5.1
Herbert Valerio Riedel, 2016-11-10
- Restore GHC 7.4 support (#124)
Version 0.4.5.0
Herbert Valerio Riedel, 2016-01-19
- Support for GHC 8.0 added; support for GHC 7.4 dropped
- Fix defect in `Foldable(foldr)` implementation failing to skip unconvertable records (#102)
- Documentation fixes
- Maintainer changed
Version 0.4.4.0
Johan Tibell, 2015-08-30
- Added record instances for larger tuples.
- Support attoparsec 0.13.
- Add field instances for short bytestrings.
Version 0.4.3.0
Johan Tibell, 2015-06-02
- Documentation overhaul with more examples.
- Add Data.Csv.Builder, a low-level bytestring builder API.
- Add a high-level builder API to Data.Csv.Incremental.
- Generalize the default FromNamedRecord/ToNamedRecord instances.
- Improved support for deriving instances using GHC.Generics.
- Added some control over quoting.
Version 0.4.2.4
- Support attoparsec 0.13.
Version 0.4.2.3
- Support GHC 7.10.
Version 0.4.2.2
- Support blaze-builder 0.4.
- Make sure inlining doesn’t prevent rules from firing.
- Fix incorrect INLINE pragmas.
Version 0.4.2.1
- Support deepseq-1.4.
Version 0.4.2.0
- Minor performance improvements.
- Add 8 and 9 tuple instances for From/ToRecord.
- Support text-1.2.
Version 0.4.1.0
- Ignore whitespace when converting numeric fields.
- Accept \r as a line terminator.
- Support attoparsec-0.12.