Pynav 0.6
Pynav 0.6 is no longer maintained. Use the new version instead: the Pynav-0.7 branch.
Introduction
Pynav is a Python programmatic web browser to fetch data and test web sites.
Bug reports and feature requests are welcome.
Pynav on PyPI: http://pypi.python.org/pypi/pynav/
Features
- Post authentication
- User agent support
- Automatic cookie handling
- HTTP Basic Authentication support
- HTTPS support
- Proxy support
- Timeout support
- Regular expression searching
- Link fetching with a regular expression filter
- History (pages, posts and responses)
- Save and load history from a file and replay navigation
- Random sleep time between pages
- Error handling
- Document type and server header information, real URL (in case of redirection)
TODO
- Better file handling: read the headers of the HTTP server response to get the file type and the real file name.
- File upload support in post values.
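The first TODO item could be approached roughly as follows. This is only a sketch using the standard library, not Pynav code: the function name `describe_download` and its `headers` and `url` parameters are hypothetical, and it assumes the response headers are already available as a plain dictionary.

```python
import posixpath
from email.message import Message

try:
    from urllib.parse import urlparse, unquote   # Python 3
except ImportError:
    from urlparse import urlparse                # Python 2
    from urllib import unquote


def describe_download(headers, url):
    """Return (file_name, content_type) for a response.

    headers: dict mapping response header names to values (hypothetical input).
    url: the real URL after redirection, used only as a fallback for the name.
    """
    msg = Message()
    for name, value in headers.items():
        msg[name] = value

    # A 'Content-Disposition: attachment; filename="..."' header
    # carries the real file name when the server sends one.
    file_name = msg.get_filename()
    if not file_name:
        # Otherwise fall back to the last path segment of the URL.
        file_name = posixpath.basename(unquote(urlparse(url).path)) or None

    # Only report a type that the server actually sent.
    content_type = None
    if msg['Content-Type'] is not None:
        content_type = msg.get_content_type()
    return file_name, content_type
```

A caller would pass the headers of the response it just received, then use the returned name when saving the file. The sketch does not sanitise the name, so a real implementation should still strip any directory components before writing to disk.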
Licence
GNU General Public License (GPL)
Installation
Requirements
Minimum Python version: 2.5
Works on Python 2.6
Works on Android with ASE
Latest stable version with pip
$ pip install pynav
Latest stable version with easy_install
$ easy_install pynav
or a specific version:
$ easy_install http://bitbucket.org/sloft/pynav/downloads/pynav-0.6-py2.6.egg
Latest stable version from tar.gz archive
Download pynav-0.6.5.tar.gz and extract it:
$ wget http://bitbucket.org/sloft/pynav/downloads/pynav-0.6.5.tar.gz
$ tar xzf pynav-0.6.5.tar.gz
Go into the extracted directory and run setup.py:
$ cd pynav-0.6.5/
$ python setup.py install
Dev version from hg source
$ hg clone https://bitbucket.org/sloft/pynav/
Examples
POST authentication, then downloading images and files with a simple filter or a regular expression
from pynav import Pynav

def test1():
    p = Pynav()
    # Log in by posting the form fields
    p.go('http://www.example.com/connexion', {'login' : 'toto', 'pass' : 'toto'})
    if p.find('My profile'):
        print 'connected into profile area'
        p.go('http://www.example.com/photos/')
        # Simple filter: images matching '.png'
        for image in p.get_all_images('.png'):
            p.download(image, '/tmp/images/')
        # Regular expression filter on links
        for link in p.get_all_links(r'download_part.*?\.zip'):
            p.download(link)
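To make the idea of fetching links with a regular expression filter concrete, here is a small standard-library sketch. It is not how Pynav implements `get_all_links`; the names `LinkCollector` and `filter_links` are hypothetical, and the sketch assumes the filter is applied with `re.search` to each `href` value.

```python
import re

try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)


def filter_links(html, pattern=None):
    """Return links found in html, keeping only those matching pattern."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    if pattern is None:
        return collector.links
    regex = re.compile(pattern)
    return [link for link in collector.links if regex.search(link)]
```

Called with the same pattern as the example, `filter_links(page, r'download_part.*?\.zip')` would keep only the archive links and drop everything else on the page.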
Using HTTP Basic authentication, post authentication and cookie check
def test2():
    p = Pynav(timeout=5)
    p.auto_referer = True
    # Credentials for HTTP Basic authentication
    p.set_http_auth('http://example.com', 'login', 'pass')
    p.go('http://example.com/private/')
    # Post the login form, then check for the session cookie
    p.go('http://www.example.com/private/connexion', {'login' : 'toto', 'pass' : 'toto'})
    if p.cookie_exists('id'):
        print 'Connected'
    # Random delay between pages, between 2 and 4
    p.set_page_delay(2, 4)
    for link in p.get_all_links('news'):
        print link
        p.go(link)
    # Show every visited URL with its post values
    for page in p.history:
        print page['url'], ':', page['post']
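The random delay between pages can be pictured with the following sketch. It is an illustration only, not Pynav's implementation: `pick_delay` and `polite_fetch` are hypothetical names, and the sketch assumes the two bounds are seconds and that no delay is needed before the first page.

```python
import random
import time


def pick_delay(min_seconds, max_seconds):
    """Pick a random delay between the two bounds (assumed to be seconds)."""
    return random.uniform(min_seconds, max_seconds)


def polite_fetch(urls, fetch, min_seconds, max_seconds, sleep=time.sleep):
    """Call fetch(url) for each URL, sleeping a random time between calls.

    fetch and sleep are passed in so the pacing logic can be tried
    without touching the network or actually waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every page except the first one.
            sleep(pick_delay(min_seconds, max_seconds))
        results.append(fetch(url))
    return results
```

Passing `sleep` as a parameter is a design choice that keeps the pacing testable; in normal use the default `time.sleep` applies.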
Using proxy
def test3():
    p = Pynav(timeout=6, proxy='http://www.example.com:3128/')
    p.verbose = True
    p.referer = 'http://www.example.com'
    page = p.go('http://www.example.com/tracks')
    print p.strip_tags(page)