archiver

Archive supplied URLs in WebCite & Internet Archive

Latest on Hackage: 0.6.2.1

This package is not currently in any snapshots. If you're interested in using it, we recommend adding it to Stackage Nightly. Doing so will make builds more reliable, and allow stackage.org to host generated Haddocks.

BSD-3-Clause licensed by Gwern
Maintained by Gwern

archiver is a daemon which processes a specified text file, each line of which is a URL, and requests, one at a time and in random order, that each URL be archived or spidered by http://www.webcitation.org, http://www.archive.org, and http://www.wikiwix.com for future reference. (One may optionally specify an arbitrary sh command, such as wget, to download the URLs locally.)
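To make the design concrete, here is a minimal sketch of such a loop in Haskell. It is not the package's actual implementation: it assumes only the process and random libraries, and it shells out to wget against the Wayback Machine's save endpoint as a stand-in for the archiving step (the real daemon also submits to WebCite and Wikiwix):

    module Main (main) where

    import Control.Concurrent (threadDelay)
    import Control.Exception (evaluate)
    import Control.Monad (unless)
    import System.Environment (getArgs)
    import System.Process (callCommand)
    import System.Random (randomRIO)

    main :: IO ()
    main = do
      [file] <- getArgs
      daemon file

    -- Pick a random URL from the file, archive it, rewrite the file
    -- without that line, pause, and repeat until the file is empty.
    daemon :: FilePath -> IO ()
    daemon file = do
      contents <- readFile file
      _ <- evaluate (length contents)  -- force the lazy read so the file can be rewritten
      let urls = filter (not . null) (lines contents)
      unless (null urls) $ do
        i <- randomRIO (0, length urls - 1)
        archive (urls !! i)
        writeFile file (unlines (take i urls ++ drop (i + 1) urls))
        threadDelay (30 * 1000000)     -- rate-limit: pause 30s between requests
        daemon file

    -- Request archiving of one URL; the endpoint and wget flags here are
    -- illustrative assumptions, not archiver's actual invocation.
    archive :: String -> IO ()
    archive url =
      callCommand ("wget --quiet --delete-after 'https://web.archive.org/save/" ++ url ++ "'")

Forcing the lazily read contents before writeFile matters here: without it, rewriting the still-open queue file would fail at runtime.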

Because the interface is a simple text file, archiver can be combined with other scripts: for example, a script using SQLite to extract visited URLs from Firefox's history, or a program extracting URLs from Pandoc documents, as in the sketch below. (See http://www.gwern.net/Archiving%20URLs.)
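As an illustration of the latter, the following sketch (assuming the pandoc library's 2.x API; the reader choice and program structure are this example's, not archiver's) reads a Markdown document on standard input and prints each link target on its own line, ready to append to archiver's queue file:

    module Main (main) where

    import Text.Pandoc (def, handleError, readMarkdown, runIO)
    import Text.Pandoc.Definition (Inline (Link), Pandoc)
    import Text.Pandoc.Walk (query)
    import qualified Data.Text as T
    import qualified Data.Text.IO as TIO

    -- Collect the target of every Link inline in the document.
    extractURLs :: Pandoc -> [T.Text]
    extractURLs = query getURL
      where
        getURL (Link _ _ (url, _)) = [url]
        getURL _                   = []

    main :: IO ()
    main = do
      source <- TIO.getContents
      doc <- runIO (readMarkdown def source) >>= handleError
      mapM_ TIO.putStrLn (extractURLs doc)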

For an explanation of how the code in Network.URL.Archiver was derived, see http://www.gwern.net/haskell/Wikipedia%20Archive%20Bot.