Jakub Wilk committed c018c7d

Tidy rp.pl articles more.

  • Parent commits 2bcae88

Files changed (1)

         htmlstr = htmlbytes
     document = lxml.html.document_fromstring(htmlstr)
     subdocument = document.find(".//div[@id='gazeta_article']")
+    if subdocument is None:
+        subdocument = document.find(".//div[@id='story']")
     if subdocument is not None:
         document = subdocument
     document = lxml.etree.ElementTree(document)
         "//div[@id='gazeta_article_tools']",
         "//div[@id='recommendations']",
         "//div[@id='socialNewTools']",
+        "//h3[@id='tags']",
         "//ul[@id='articleToolbar']",
         '//img',
         '//like',
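
The two hunks above follow one pattern: narrow the document to the article container (now falling back to `div#story` when `div#gazeta_article` is absent), then strip unwanted elements such as `h3#tags`. A minimal sketch of that pattern follows. It is an illustration only, not the project's code: it uses the standard library's `xml.etree.ElementTree` instead of `lxml` so that it runs without extra dependencies, and the names `tidy`, `CONTAINER_IDS`, `REMOVE_PATHS` and `SAMPLE` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Container ids tried in order; both ids come from the diff above.
CONTAINER_IDS = ["gazeta_article", "story"]

# A subset of the removal paths listed in the diff (hypothetical selection).
REMOVE_PATHS = [".//h3[@id='tags']", ".//ul[@id='articleToolbar']", ".//img"]


def tidy(xhtml):
    """Return the article container with unwanted elements removed."""
    root = ET.fromstring(xhtml)

    # Narrow to the first container that exists; keep the whole
    # document if none of them is present.
    for container_id in CONTAINER_IDS:
        subdocument = root.find(".//div[@id='%s']" % container_id)
        if subdocument is not None:
            root = subdocument
            break

    # ElementTree has no getparent(), so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}

    for path in REMOVE_PATHS:
        for element in root.findall(path):
            # Note: remove() also drops any tail text after the element.
            parents[element].remove(element)
    return root


# Assumed sample input: well-formed XHTML with no 'gazeta_article' div,
# so the 'story' fallback is exercised.
SAMPLE = (
    "<html><body>"
    "<div id='header'>site navigation</div>"
    "<div id='story'>"
    "<p>Article text.</p>"
    "<h3 id='tags'>Tags</h3>"
    "<ul id='articleToolbar'><li>Print</li></ul>"
    "<img src='photo.png' />"
    "</div>"
    "</body></html>"
)

cleaned = tidy(SAMPLE)
result = ET.tostring(cleaned, encoding="unicode")
print(result)
```

With the sample input, only the paragraph inside `div#story` should survive. Trying the container ids in a loop keeps the fallback order explicit, so a further id could be appended without another nested `if`.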