DOM traversal by CSS selectors for xml-conduit package https://github.com/nebuta/
Latest on Hackage: 0.2.0.1
CSS selector support for xml-conduit/html-conduit. This package supports compile-time checking of CSS selectors using quasiquotes. All DOM traversals are purely functional.
    -- The following pragmas should be put first (Haddock does not accept a pragma notation.)
    -- LANGUAGE OverloadedStrings, QuasiQuotes
    module Main (main) where

    import Text.XML.Cursor (fromDocument)
    import Text.HTML.DOM (parseLBS)
    import qualified Data.Text.Lazy.IO as TI (putStrLn)
    import Control.Monad (mapM_)
    import Text.XML.Scraping (innerHtml)
    import Text.XML.Selector.TH
    import Network.HTTP.Conduit
    import Data.Conduit.Binary

    main :: IO ()
    main = do
      root <- fmap (fromDocument . parseLBS) $ simpleHttp "https://news.google.com/"
      let cs = queryT [jq| h2 span.titletext |] root
      mapM_ (TI.putStrLn . innerHtml) cs
You can use some elementary CSS selectors for traversing a DOM tree.
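To make the meaning of a selector such as `h2 span.titletext` concrete, the sketch below spells out the same traversal by hand using only the cursor axes from xml-conduit and the parser from html-conduit. It is an illustration of what the selector matches, not a description of how this package implements it. The names `sample`, `hasClass`, and `titleSpans` and the inline HTML are made up for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Control.Monad ((>=>))
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.HTML.DOM (parseLBS)
import Text.XML.Cursor
  ( Cursor, attribute, check, content, element, fromDocument, ($//), (&//) )

-- Hypothetical in-memory page, so the example needs no network access.
sample :: BL.ByteString
sample = mconcat
  [ "<html><body>"
  , "<h2><span class=\"titletext\">First headline</span></h2>"
  , "<h2><a><span class=\"titletext other\">Second headline</span></a></h2>"
  , "<p><span class=\"titletext\">Not in a heading</span></p>"
  , "<h2><span class=\"note\">No class match</span></h2>"
  , "</body></html>"
  ]

-- True when the element's class attribute contains the given token.
hasClass :: T.Text -> Cursor -> Bool
hasClass cls c = any (elem cls . T.words) (attribute "class" c)

-- Hand-written counterpart of the selector "h2 span.titletext":
-- every span carrying the class that is a descendant of some h2.
titleSpans :: Cursor -> [Cursor]
titleSpans root =
  root $// element "h2" &// element "span" >=> check (hasClass "titletext")

main :: IO ()
main = do
  let root = fromDocument (parseLBS sample)
  -- Print the concatenated text content of each matching span.
  mapM_ (\c -> TIO.putStrLn (T.concat (c $// content))) (titleSpans root)
```

The traversal is an ordinary pure function from a `Cursor` to a list of `Cursor`s, which is the same shape as the `queryT` call in the example above. The quasiquoted form lets the selector be written in CSS syntax and checked at compile time instead of being assembled from axes by hand.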
Ver 0.2.1: Removed an inappropriate Safe Haskell pragma.
Ver 0.2: All scraping functions in Text.XML.Scraping now return lazy Text. They are implemented with a type class.