Sounded like a good idea, so I did it. It is available on the trunk and will be in the next release, which I plan to make this week.
I think there are still things that could be done to make it better for screen scraping:
- make it possible to use the BeautifulSoup parser (I think it's compatible with lxml so it wouldn't be a problem)
- make it possible to use auth and headers
I have just tested this: it works for hyperlinks, but it doesn't resolve other links in the document (as lxml's make_links_absolute does). I see how you've implemented it. Could you expand your implementation or use lxml's native one?
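
For reference, here is a minimal sketch of the behaviour I mean, using lxml directly. The markup and the base URL (http://example.com/docs/) are made up for illustration; only lxml.html.fromstring and make_links_absolute are lxml's own calls.

```python
import lxml.html

# Hypothetical sample markup with three kinds of links:
# a hyperlink, an image source and a form action.
html = (
    '<html><body>'
    '<a href="/about">About</a>'
    '<img src="images/logo.png">'
    '<form action="submit"></form>'
    '</body></html>'
)

doc = lxml.html.fromstring(html)

# Rewrites every link-carrying attribute in place, relative to the base URL
# (assumed here; in practice it would be the URL the page was fetched from).
doc.make_links_absolute("http://example.com/docs/")

hrefs = doc.xpath("//a/@href")
srcs = doc.xpath("//img/@src")
actions = doc.xpath("//form/@action")

print(hrefs, srcs, actions)
```

Here the img src and the form action are rewritten along with the a href, which is the part that is currently missing.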