Commits

Author Commit Message Labels Comments Date
Gregory Petukhov
Working on ng spider
Gregory Petukhov
Add keep_childre argument to tools.lxml_tools.drop_node. Add replace_node_with_text function
Gregory Petukhov
Working on ng
Gregory Petukhov
Add use case of spider with limits
Gregory Petukhov
Automated merge with ssh://bitbucket.org/lorien/grab
Gregory Petukhov
Add smart_copy_file function to grab.tools.files
Gregory Petukhov
Fix the "incorrect" processing of &#[128-160]; entities. Add new option: fix_special_entities, which is on by default
Gregory Petukhov
Start working on new generation of Spider
Gregory Petukhov
Fix verbose logging in spider
Gregory Petukhov
Add new option: body_storage_filename. Write tests for options: body_inmemory, body_storage_dir, body_storage_filename
Gregory Petukhov
Add another usecase
Gregory Petukhov
Update update_site.sh script
Gregory Petukhov
Add update_site.sh script for internal usage
Gregory Petukhov
Add pypi.sh script for internal usage
Gregory Petukhov
Version 0.4.8
Gregory Petukhov
Fix #74: fix fatal error in html parsing with lxml
Gregory Petukhov
Fix #73. Support of IDN domains
Gregory Petukhov
Now xml declaration is not stripped in unicode_body(). It is stripped only when body is passed to DOM builder
Gregory Petukhov
Drop strip_xml_declaration option. It is deprecated now
Gregory Petukhov
Remove obsolete tests
Gregory Petukhov
Remove test.py
Gregory Petukhov
Change method of processing of multicurl handlers
Gregory Petukhov
Very verbose logging feature in the spider
Gregory Petukhov
More friendly error message in Task constructor
Gregory Petukhov
Remove test\d+ files
Gregory Petukhov
Better logging messages in grab.spider.pattern
Gregory Petukhov
Cleanup code of extension system
Gregory Petukhov
Start working on py3k compatibility
Gregory Petukhov
Small fix to on-disk response body processing
Gregory Petukhov
Add ability to save network response to file (not to memory)