David Larlet committed d76b04d

Deal with images' rendering, fixes #5

  • Parent commits 0e88acb

Files changed (2)

 
 ## TODO
 
-* deal with images, for now the link transformation alters `src` attributes too (maybe a feature though :p)
 * pre-fetching of links in background, needs to be asynchronous
 * add the location of the browsing using [doko](https://bitbucket.org/larsyencken/doko), can be useful for search (I often remember where I read articles)
 * deal with non-article pages (homepages, lists, etc)

File src/browser.py

         document = BrowserDocument(response.text)
 
         # Explicitly parse the HTML to be able to rewrite links
+        # with base URL and prepend the proxy to all URLs (except images)
         html = document._html()
-        html.rewrite_links(self.__prepend_proxy_url, base_href=base_url)
+        for element, attribute, link, position in html.iterlinks():
+            if attribute == "src":  # Do not modify images
+                continue
+            link = link.strip()
+            if link.startswith("/"):
+                link = base_url + link
+            element.attrib[attribute] = self.__prepend_proxy_url(link)
         document.html = html
 
         # The short title is more concise and readable
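
The loop in this hunk makes a per-link decision: skip `src` attributes, make root-relative links absolute, then prepend the proxy. A minimal sketch of that decision as a standalone function is shown below. It is not the code from this commit: `rewrite_link` and `PROXY_PREFIX` are hypothetical names, the proxy URL format is an assumption (the body of `__prepend_proxy_url` is not visible in this diff), and `urllib.parse.urljoin` is swapped in for the `base_url + link` concatenation so that relative paths such as `next` and protocol-relative links such as `//host/path` are resolved too. The sketch also returns early when `attribute` is `None`, which lxml's `iterlinks()` can yield for links found in text content such as `<style>` blocks.

```python
from urllib.parse import urljoin

# Hypothetical proxy endpoint; the real format used by
# __prepend_proxy_url is not shown in this diff.
PROXY_PREFIX = "http://localhost:8000/?url="


def rewrite_link(attribute, link, base_url, proxy_prefix=PROXY_PREFIX):
    """Return the proxied URL for a link, or None to leave it untouched."""
    # Leave images (and any other `src` attribute) alone, as the commit does.
    # Also skip links that do not live in an attribute at all.
    if attribute is None or attribute == "src":
        return None
    # Resolve the link against the page URL. Unlike plain concatenation,
    # urljoin also handles "next", "../up" and "//host/path".
    absolute = urljoin(base_url, link.strip())
    return proxy_prefix + absolute
```

In the loop above, a `None` result would correspond to the `continue` branch; any other result would be assigned to `element.attrib[attribute]`.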