Commits

Author Commit Message Labels Comments Date
Chris Adams
Clients: better response error handling
* Won't run response processors at all on error pages (arguably there should be a separate error processor for people who care)
* Spider requests will include a referer header, which can be included in errors
Chris Adams
check_site: expand user paths in report files
This allows you to get the expected result if you use e.g. ~/Desktop/mysite.html
Chris Adams
Added a default timeout to Retriever, Spider
This makes it easy for things like check_site.py to have a command-line option to change the request timeout.
Chris Adams
check site report: tweaked CSS for footer
Chris Adams
Clients: spider now has minimal circular redirect handling
This should become nicer than an assertion error at some point
Chris Adams
clients: Spider.queue now accepts kwargs
This allows callers to pass in options for the underlying client
Chris Adams
Bugfix: now possible to save page/resource lists
Missed during the last refactor
Chris Adams
check_site: better HTML report errors
Now we give a friendlier error if the report filename contains ".htm" but the format wasn't set to HTML.
Chris Adams
Updated check_site to use Jinja2 for text reports
Chris Adams
Added an option to follow offsite redirects
Chris Adams
Fixed logging references to global logger vs. instance
Chris Adams
Better handling of external redirects
Now we won't blindly follow redirects but we still need an option to follow them for reporting broken off-site links
Chris Adams
Partial attempt to get spider working
Chris Adams
Initial experiment switching to a twisted backend
Chris Adams
Renamed tornado_bench to http_bench
Chris Adams
Minor cleanup
Chris Adams
Initial setup.py for use on PyPI
Chris Adams
Merge branch 'master' of ssh://github.com/acdha/webtoolbox
Chris Adams
Force Unicode docstring handling (closes #1)
Chris Adams
Module docstring
Chris Adams
Improved URL normalization to ignore anchors
Chris Adams
Track response time per-request
Chris Adams
Merge branch 'master' of ssh://github.com/acdha/webtoolbox
Chris Adams
check_site: added command-line control for number of simultaneous connections
Chris Adams
Updated requirements.pip
Chris Adams
Fallback for language when even chardet fails
Chris Adams
Made it easier to configure HTTP Request options
Now Fetcher/Spider.queue accepts kwargs which are passed directly to the HTTPRequest object, allowing you to configure things like request timeouts.
Chris Adams
Spider: better content length warning
Chris Adams
check_site: major reporting overhaul
* Switched to Jinja2 templating for HTML report, with substantial cleanup for everything related
* Enabled better reporting for features needed in the report
* Code & doc maintenance
Chris Adams
Updated check_site to use Jinja2 for text reports
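
One of the commits listed above improves URL normalization so that anchors are ignored. The log does not show how webtoolbox does this. What follows is only a minimal Python sketch of the general idea, using the standard library's urllib.parse.urldefrag. The names normalize_url and unique_pages are hypothetical and are not taken from the project.

```python
from urllib.parse import urldefrag


def normalize_url(url):
    """Return url with any #fragment removed.

    Hypothetical helper for illustration; not webtoolbox's actual code.
    """
    # urldefrag splits "page#anchor" into (url="page", fragment="anchor").
    return urldefrag(url).url


def unique_pages(links):
    """Collapse links that differ only by anchor into one entry each."""
    return sorted({normalize_url(link) for link in links})


if __name__ == "__main__":
    links = [
        "http://example.com/docs",
        "http://example.com/docs#install",
        "http://example.com/docs#usage",
    ]
    for page in unique_pages(links):
        print(page)
```

A spider that keys its visited set on the normalized URL would request such a page once, since the fragment is never sent to the server.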