Commits

Author       Commit Message
Thomas Figg  empty file handling
Thomas Figg  init.py changes
Thomas Figg  fixes
Thomas Figg  more headers/conversion record uses correct header to refer to original
Thomas Figg  adding conversion records, warcextract.py, some nicer error messages
Thomas Figg  merge
cathcart     Single r where there should be two.
Thomas Figg  readline, error handling, spelling
Thomas Figg  adding hash digests to warcs
Thomas Figg  metadata
Thomas Figg  cleaning readme
Thomas Figg  date util
Thomas Figg  typo
Thomas Figg  adding record makers
Thomas Figg  tidyup and hex output for non-printable chars
Thomas Figg  allowing broken warcs with wrong newlines, lack of terminators and wrong compression
Thomas Figg  missing file
Thomas Figg  +x flags
Thomas Figg  adding setup.py, cleaning up readme, removing todo
Thomas Figg  errors on faulty newlines
Thomas Figg  cleanup
Thomas Figg  adding arc2warc, primitive indexer, warc2warc can write record files
Thomas Figg  adding warcfilter.py
Thomas Figg  adding warcvalid.py skeleton
Thomas Figg  adding arc file parsing, autodetect of archive type on opening
Thomas Figg  adding some front end bits and some tidyup
Thomas Figg  next todo steps
Thomas Figg  token docstrings
Thomas Figg  reading a warc file: token example
Thomas Figg  reading a warc file