Overview

Scrapr (Web Scraping Framework)

Scrapr makes it easy to set up a model for finding specific tags, links, or text on a web page.

Simple Example

Example Extractor:

    from scrapr import extractor

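    class SimpleSite(extractor.Extractor):
        # Text of the <title> tag, returned as a string
        title = extractor.DocAttribute('title', return_str=True)
        # Value of the content attribute of <meta name="description">
        description = extractor.DocAttribute('meta', value_attr='content',
                                             **{'name': 'description'})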

    >>> import requests
    >>> req = requests.get('http://www.dnsly.net')
    >>> ss = SimpleSite(req.content)
    >>> ss.title
    u'DNSly - Simple DNS Management for Amazon Route 53 and Rackspace Cloud DNS'
    >>> ss.description
    u'Simple DNS Management for Amazon Route 53 and Rackspace Cloud DNS.'

Extractors

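An extractor is the model: a class, such as SimpleSite above, whose attributes declare what to find on a page.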

Doc Attributes

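Doc attributes describe values to pull out of the HTML document, such as the text of a tag or the value of one of a tag's attributes.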
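
To make the idea of a tag lookup with attribute filters concrete, here is a minimal sketch written against Python 3's standard library only. It is not Scrapr's implementation and does not use Scrapr or BeautifulSoup. The names FirstMatch and find_first are hypothetical helpers invented for this illustration, and the assumption that a doc attribute behaves like "find the first matching tag, then return its text or one attribute value" is inferred from the Simple Example, not confirmed by the source.

```python
from html.parser import HTMLParser


class FirstMatch(HTMLParser):
    """Hypothetical helper: remember the first tag matching a name and attributes."""

    def __init__(self, tag, value_attr=None, **attrs):
        super().__init__()
        self.wanted_tag = tag
        self.wanted_attrs = attrs
        self.value_attr = value_attr
        self.result = None
        self._capturing = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Stop looking once a result exists or while text is being collected.
        if self.result is not None or self._capturing:
            return
        if tag != self.wanted_tag:
            return
        found = dict(attrs)
        # Every requested attribute filter must match exactly.
        if any(found.get(k) != v for k, v in self.wanted_attrs.items()):
            return
        if self.value_attr is not None:
            # Return one attribute value, e.g. content="..." on a <meta> tag.
            self.result = found.get(self.value_attr)
        else:
            # Otherwise collect the tag's text until its closing tag.
            self._capturing = True

    def handle_data(self, data):
        if self._capturing:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if self._capturing and tag == self.wanted_tag:
            self.result = "".join(self._chunks).strip()
            self._capturing = False


def find_first(html, tag, value_attr=None, **attrs):
    """Hypothetical helper: run FirstMatch over an HTML string."""
    parser = FirstMatch(tag, value_attr, **attrs)
    parser.feed(html)
    parser.close()
    return parser.result


HTML = """
<html><head>
<title>Example Site</title>
<meta name="keywords" content="a, b">
<meta name="description" content="An example page.">
</head><body></body></html>
"""

title = find_first(HTML, "title")
description = find_first(HTML, "meta", value_attr="content", name="description")
print(title)
print(description)
```

The two calls mirror the two declarations in SimpleSite: the first returns the text of the title tag, and the second skips the keywords meta tag because its name does not match, then returns the content value of the description meta tag.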

Installation

Dependencies

  • 'beautifulsoup>=3.2.1'
  • 'beautifulsoup4>=4.0.5'