=======
scraper
=======

scraper is a wrapper for Mechanize's Browser class meant specifically for scraping websites. Currently scraper only supports searching for class attributes within valid html tags (via BeautifulSoup).

------------
Requirements
------------

You need to have three modules installed: argparse_, mechanize_ and BeautifulSoup_. You can install these via any normal means (easy_install, pip, etc.).

.. _argparse: http://code.google.com/p/argparse/
.. _mechanize: http://wwwsearch.sourceforge.net/mechanize/
.. _BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/

---
Use
---

::

    scraper.py [-h] [-p page | -s search_term] [-r regex] [-o output]

    optional arguments:
      -h, --help      show this help message and exit
      -p page         the page you want to search.
      -s search_term  Search term for Google News - put a "+" in for spaces.
      -r regex        the regular expression you want to use to specify the
                      class attribute to search for.
      -o output       Specify the output file.

--------
Epilogue
--------

This is very much an alpha release. I only threw it up here because I thought it might be useful to someone. I use it to print a nice single page of stories related to things I usually search for every morning.
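
-------
Example
-------

To make the option list under Use more concrete, here is a minimal sketch of how that command-line interface could be declared with argparse. It is a reconstruction from the usage text only, not the actual source of scraper.py: the function name ``build_parser`` and the attribute names (``page``, ``search_term``, ``regex``, ``output``) are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the usage text of scraper.py.

    Hypothetical reconstruction: names below are assumptions, not
    taken from the real script.
    """
    parser = argparse.ArgumentParser(prog="scraper.py")

    # The usage line shows "-p page | -s search_term", i.e. the two
    # options are alternatives, so they share a mutually exclusive group.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("-p", metavar="page", dest="page",
                        help="the page you want to search.")
    source.add_argument("-s", metavar="search_term", dest="search_term",
                        help='Search term for Google News - put a "+" in for spaces.')

    parser.add_argument("-r", metavar="regex", dest="regex",
                        help="the regular expression you want to use to "
                             "specify the class attribute to search for.")
    parser.add_argument("-o", metavar="output", dest="output",
                        help="Specify the output file.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# can be run anywhere; the values are made up for the example.
example_args = build_parser().parse_args(
    ["-s", "open+source", "-r", "title.*", "-o", "stories.html"]
)
print(example_args)
```

With this layout, passing both ``-p`` and ``-s`` makes argparse exit with an error, which matches the ``|`` between the two options in the usage line.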