Commits

Author Commit Message Labels Comments Date
Frederic De Groef
detect links with no target, classify and tag them
Frederic De Groef
use format_exc() instead of format_stack() when logging errors (because, well, it's way more meaningful)
Frederic De Groef
Use unified html cleanup func, with optional stripping
Frederic De Groef
detect if an article has an introduction. Use the unified html cleanup func.
Frederic De Groef
Added the possibility to strip chars when cleaning up html fragments
Frederic De Groef
renamed added metainfo generator script to python's bin folder
Frederic De Groef
helper funcs to get string version of date and time
Frederic De Groef
deprecated decorator moved to some module
Frederic De Groef
new db-wide funcs for the viewer (summed metainfo, last update metainfo)
Frederic De Groef
new style statistics: cached metainfo, stored per day (+ additional scripts to delete and regenerate them)
Frederic De Groef
sort items when creating list of queued items
Frederic De Groef
fix: set.add() is only for one element
Frederic De Groef
queued items are not in a dict anymore
Frederic De Groef
sort batch articles based on the url
Frederic De Groef
compute queued items count
Frederic De Groef
queued items by day are in a list of tuples instead of a dict
Frederic De Groef
get articles and errorcounts per batch
Frederic De Groef
updated hgignore
Frederic De Groef
db-wide funcs
Frederic De Groef
new source summary: get article and error count
Frederic De Groef
DB-wide functions were moved to a separate file
Frederic De Groef
forgot to rename a func in call
Frederic De Groef
don't make an egg when installing the module in site-packages
Frederic De Groef
missing __init__.py
Frederic De Groef
check that the url exists before trying to access it
Frederic De Groef
[dhnet] detect embedded content
Frederic De Groef
remove formatting in links
Frederic De Groef
using the constants
Frederic De Groef
handle the 'article was removed' case
Frederic De Groef
don't create already existing directories
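
Two of the commits above refer to standard-library behaviour that is easy to get wrong: format_exc() versus format_stack() in the traceback module, and set.add() versus set.update(). The sketch below only illustrates those two points. The function names describe_failure and collect_urls are hypothetical and are not taken from the repository.

```python
import traceback


def describe_failure():
    """Return (exc_text, stack_text) captured inside an except block."""
    try:
        int("not a number")  # raises ValueError
    except ValueError:
        # format_exc() returns one string: the traceback of the exception
        # being handled, ending with its type and message.
        exc_text = traceback.format_exc()
        # format_stack() returns a list of strings describing the current
        # call stack; it says nothing about the exception itself.
        stack_text = "".join(traceback.format_stack())
    return exc_text, stack_text


def collect_urls(batches):
    """Merge several iterables of URLs into a single set."""
    seen = set()
    for batch in batches:
        # set.add(batch) would try to insert the whole batch as ONE element
        # (and raise TypeError for an unhashable list);
        # set.update(batch) adds each item of the batch.
        seen.update(batch)
    return seen
```

When logging an error, exc_text is usually the useful part, since it names the exception and the line that raised it, while stack_text only shows where the logging call was made.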