Commits

Author Commit Message Labels Comments Date
Frederic De Groef
added a test for classification of urls starting with '//'
Frederic De Groef
cosmetics
Frederic De Groef
Merge
Juliette De Maeyer
minor details
Juliette De Maeyer
[lesoir_new] ghost link detection enhanced
Juliette De Maeyer
[lesoir_new] fixed error + enhanced ghost link detection
Juliette De Maeyer
[lesoir_new] fixed error
Juliette De Maeyer
[lesoir_new] fixed error + test
Juliette De Maeyer
[lesoir_new] better ghost link detection
Juliette De Maeyer
[septsursept] plaintext link extraction in intro, minor adjustment
Juliette De Maeyer
[lesoir_new] fixed error + added link extraction in intro + test
Juliette De Maeyer
[lalibre] fixed some errors
Frederic De Groef
Filter queue items using {{SOURCE}}.filter_news_items(). Filter out paywalled articles which have been detected after call to {{SOURCE}}.extract_article_data()
Frederic De Groef
[parsers] added a top-level module function for filtering news items from other stuff on the frontpage (photoalbums, etc). Only really used for lesoir_new at the moment.
Frederic De Groef
Merge
Frederic De Groef
[lavenir] updated parser for the new sports page template (and also, tests)
Frederic De Groef
always indent the generated test function, to make it easier to paste into the test class.
Juliette De Maeyer
[lesoir_new] filter polls/photo albums + paywalled articles
Juliette De Maeyer
[sudinfo, tests] new test for embedded dailymotion video extraction
Juliette De Maeyer
[tagging, tests] updated tagging conventions
Juliette De Maeyer
[sudinfo] fixed errors
Juliette De Maeyer
[dhnet] fixed errors
Juliette De Maeyer
[lesoir_new] errors fixed
Juliette De Maeyer
[lesoir_new] embedded iframes in top box + test
Juliette De Maeyer
[lesoir_new] new embedded media types detection + tests
Frederic De Groef
[lavenir] support for embedded videos in the 'highlighted' section
Frederic De Groef
[lesoir_new] corrected crappy merge. yay for tests
Frederic De Groef
Merge
Frederic De Groef
[lavenir] removed useless code
Frederic De Groef
sources are filtered through command line option