
Tips
====

Making links absolute
---------------------

You can make links absolute, which can be useful for screen scraping::

    >>> d = pq(url='http://www.w3.org/', parser='html')
    >>> d('a[title="Available positions"]').attr('href')
    '/Consortium/Recruitment/'
    >>> d.make_links_absolute()
    [<html>]
    >>> d('a[title="Available positions"]').attr('href')
    'http://www.w3.org/Consortium/Recruitment/'
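
The effect on a single ``href`` can be sketched with the standard library alone. This is only an illustration of resolving a relative link against a base URL, not pyquery's own code; the helper name ``absolutize`` and the ``BASE_URL`` constant are hypothetical.

```python
from urllib.parse import urljoin

# Assumed base URL, matching the page fetched in the example above.
BASE_URL = 'http://www.w3.org/'


def absolutize(href, base_url=BASE_URL):
    """Resolve href against base_url.

    Already-absolute URLs are returned unchanged by urljoin.
    """
    return urljoin(base_url, href)


print(absolutize('/Consortium/Recruitment/'))
```

Resolving against the URL the document was loaded from is what turns ``/Consortium/Recruitment/`` into a link that still works outside the original page.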

Using different parsers
-----------------------

By default, pyquery first tries the lxml XML parser and, if that fails, falls
back to the HTML parser from lxml.html. The XML parser can be problematic when
parsing XHTML pages: instead of raising an error, it may return an unusable
tree (on w3c.org for example).
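
The "strict parser first, lenient parser second" pattern can be sketched as follows. This is a minimal illustration that uses standard-library stand-ins (``xml.etree.ElementTree`` and ``html.parser``) rather than lxml, so it does not reproduce pyquery's behaviour exactly; ``TagCollector`` and ``parse_with_fallback`` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient stand-in for an HTML parser: records start tags in order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_with_fallback(markup):
    """Try strict XML parsing first; fall back to lenient HTML parsing.

    Returns a (parser_name, tag_names) tuple.
    """
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # The markup is not well-formed XML, so use the lenient parser.
        collector = TagCollector()
        collector.feed(markup)
        collector.close()
        return 'html', collector.tags
    return 'xml', [element.tag for element in root.iter()]


print(parse_with_fallback('<html><body><p>toto</p></body></html>'))
print(parse_with_fallback('<p>toto<br></p>'))
```

The fallback only triggers when the strict parser raises an error. Input that parses as XML without error but yields a poor tree never reaches the HTML parser, which is why an explicit parser choice can be necessary.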

You can also choose which parser to use explicitly::

   >>> pq('<html><body><p>toto</p></body></html>', parser='xml')
   [<html>]
   >>> pq('<html><body><p>toto</p></body></html>', parser='html')
   [<html>]
   >>> pq('<html><body><p>toto</p></body></html>', parser='html_fragments')
   [<p>]

The html and html_fragments parsers are the ones from lxml.html.