scrapbook

Automatically derive Kotlin class to query servant webservices https://github.com/matsubara0507/scrapbook#readme

Latest on Hackage: 0.3.3

MIT licensed and maintained by MATSUBARA Nobutada

This is a CLI tool that collects posts from the sites listed in a YAML config file, using their feeds or scraping.

Usage

  1. Clone this repository, or add the scrapbook package to extra-deps in stack.yaml.
  2. Run stack install.

e.g.

$ stack exec -- scrapbook -o "example" example/sites.yaml

Docker

$ docker run --rm -v `pwd`:/work matsubara0507/scrapbook /bin/bash -c "cd /work && scrapbook -o 'example' example/sites.yaml"

Command

scrapbook [options] [input-file]
  -o DIR                --output=DIR                 Write output to DIR instead of stdout.
  -t FORMAT, -w FORMAT  --to=FORMAT, --write=FORMAT  Specify output format. default is `feed`.
                        --version                    Show version
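
The option table above has the shape that `System.Console.GetOpt` (from base) prints with `usageInfo`. As an illustration only, the following minimal sketch declares the same three options with GetOpt. The `Options` record, its field names, and `parseArgs` are hypothetical and are not taken from scrapbook's source; the real tool may parse its arguments differently.

```haskell
import System.Console.GetOpt

-- Hypothetical option record (an assumption, not scrapbook's actual type).
data Options = Options
  { optOutput  :: Maybe FilePath  -- Nothing means "write to stdout"
  , optFormat  :: String          -- output format, "feed" unless overridden
  , optVersion :: Bool            -- True when --version was given
  } deriving (Show, Eq)

defaultOptions :: Options
defaultOptions = Options
  { optOutput = Nothing, optFormat = "feed", optVersion = False }

-- Each descriptor carries a function that updates the option record.
optionSpec :: [OptDescr (Options -> Options)]
optionSpec =
  [ Option "o" ["output"]
      (ReqArg (\dir o -> o { optOutput = Just dir }) "DIR")
      "Write output to DIR instead of stdout."
  , Option "tw" ["to", "write"]
      (ReqArg (\fmt o -> o { optFormat = fmt }) "FORMAT")
      "Specify output format. default is `feed`."
  , Option "" ["version"]
      (NoArg (\o -> o { optVersion = True }))
      "Show version"
  ]

-- Parse an argument vector into the options plus the remaining input files.
-- On a parse error, return the messages followed by the usage text.
parseArgs :: [String] -> Either String (Options, [FilePath])
parseArgs argv =
  case getOpt Permute optionSpec argv of
    (updates, files, []) ->
      Right (foldl (flip id) defaultOptions updates, files)
    (_, _, errs) ->
      Left (concat errs ++ usageInfo "scrapbook [options] [input-file]" optionSpec)
```

Under these assumptions, `parseArgs ["-o", "example", "example/sites.yaml"]` yields the default options with the output directory set to `example` and `example/sites.yaml` as the only input file, while an unknown flag yields a `Left` containing the error and the usage text.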

GHCi

>> import Control.Lens ((^.))
>> import Data.Maybe
>> conf <- fromJust <$> readConfig "example/sites.yaml"
>> (Right posts) <- collect . fmap concat $ mapM (fetch . toSite) (conf ^. #sites)
>> collect $ writeFeed "example" (fromJust $ conf ^. #feed) posts
Right ()

Example

See matsubara0507/scrapbook-example.

Documentation

How to write the YAML config file:

# configuration for generating Atom feed (Optional)
feed:
  ## used as the site title in the Atom feed
  title: "Sample Site Posts"
  ## used as the site URL in the Atom feed
  baseUrl: "https://example.com"
  ## output file name (Optional)
  ### if omitted, the name of the input file is used
  name: atom.xml

# Haskeller's site configuration
sites:
    ## Title of site
  - title: "ひげメモ"
    ## Author of site
    author: matsubara0507
    ## URL of site
    url: https://matsubara0507.github.io
    ## Feed URL of site
    ### there are several fields for setting the feed URL
    ### `feed` is the basic field; it detects automatically whether the feed is Atom or RSS 2.0.
    feed: https://matsubara0507.github.io/feed
  - title: "Kuro's Blog"
    author: "Hiroyuki Kurokawa"
    url: http://kurokawh.blogspot.com/
    ### `atom` is for an Atom feed.
    atom:
      ### feed URL of the Atom feed
      url: http://kurokawh.blogspot.com/feeds/posts/default
      ### attributes that the link of each Atom entry must have (Optional)
      ### if omitted, the first link is used; if several attributes are set, all of them must match.
      linkAttrs:
        rel: alternate
  - title: "あどけない話"
    author: "kazu-yamamoto"
    url: http://d.hatena.ne.jp/kazu-yamamoto
    ### `rss` is for an RSS 2.0 feed.
    ### set it to the feed URL.
    rss: http://d.hatena.ne.jp/kazu-yamamoto/rss2
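
The `linkAttrs` rule in the config above (with no constraint, take the first link; with several attributes, require all of them) can be sketched as follows. This is an illustration under assumptions, not scrapbook's actual implementation: the names `Attrs`, `matches`, `selectLink`, and the sample `entryLinks` data are all hypothetical, and links are modelled as plain association lists.

```haskell
import Data.Maybe (listToMaybe)

-- Attributes of one Atom <link> element, e.g. [("rel", "alternate"), ("href", "...")].
-- (Hypothetical representation chosen for this sketch.)
type Attrs = [(String, String)]

-- A link satisfies the constraints when every required attribute is present
-- with exactly the required value (a conjunction over all constraints).
matches :: Attrs -> Attrs -> Bool
matches required link = all (\(k, v) -> lookup k link == Just v) required

-- Pick the first link that satisfies all constraints. With an empty
-- constraint list every link matches, so this is simply the first link.
selectLink :: Attrs -> [Attrs] -> Maybe Attrs
selectLink required = listToMaybe . filter (matches required)

-- Made-up sample: the links of a single feed entry.
entryLinks :: [Attrs]
entryLinks =
  [ [("rel", "replies"),   ("href", "http://example.com/post/comments")]
  , [("rel", "alternate"), ("href", "http://example.com/post")]
  ]
```

With this sample data, `selectLink [("rel", "alternate")] entryLinks >>= lookup "href"` picks the second link, whereas `selectLink [] entryLinks` falls back to the first one, which is why a constraint such as `rel: alternate` is useful for feeds whose entries list a comments link first.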

Changes

Changelog for scrapbook

Unreleased changes

0.3.3

  • Misc: update package.yaml info for Hackage

0.3.2

  • Misc: remove deps lib
    • extensible-instances
    • data-default-instances-text
  • Misc: update extensible to 0.5

0.3.1

  • Refactor: update resolver to lts-12.26
  • Misc: support docker image
  • Misc: add TravisCI

0.3.0

  • Refactor: update resolver to lts-12

0.2.0

  • Feat: version option
  • Feat: RSS 2.0
  • Fix: remove namespace in xml tag
  • Feat: summary
  • Feat: multiple input files
  • Fix: error when writing a file to a nonexistent directory
  • Feat: default output file name is input file name
  • Fix: help message
  • Feat: add json output format
  • Feat: add config to filter links with attr on Atom feed
  • Refactor: use rio library
  • Refactor: change several functions to polymorphic with extensible
  • Fix: don't exit the whole program when a fetch raises an exception

Alpha

  • alpha release