Files changed (1)
+ The out1.json file is the output I got from running it on www.google.com for 5 levels (i.e. down to the 5th level of the link hierarchy). Try it on a big site for 5 levels and the program will seem to never end ( :P )
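To illustrate what "5 levels" means, here is a minimal sketch of a depth-limited, breadth-first crawl. It is not the code from this change: `crawl`, `get_links`, and the toy link graph are hypothetical names made up for the example, and the toy graph stands in for real HTTP requests.

```python
from collections import deque

def crawl(start, get_links, max_depth=5):
    """Breadth-first crawl from start, expanding pages fewer than max_depth levels deep."""
    seen = {start}                 # avoid visiting the same URL twice
    tree = {}                      # url -> list of links found on that page
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue               # pages at the depth limit are not expanded
        links = get_links(url)     # hypothetical callback returning a page's links
        tree[url] = links
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return tree

# Toy link graph used instead of fetching real pages (an assumption for the demo).
toy_site = {"a": ["b", "c"], "b": ["a", "d"], "c": [], "d": ["e"], "e": []}
result = crawl("a", lambda u: toy_site.get(u, []), max_depth=2)
```

Each extra level multiplies the number of pages to fetch by roughly the number of links per page, which is why a large site at 5 levels can take a very long time.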
+get_page.py downloads the entire page and changes all the third-party URLs to relative URLs. It assumes that a local server is running, so just double-clicking the downloaded file won't work.
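As a rough idea of what such a rewrite could look like, the sketch below maps an absolute URL to a root-relative path. The `to_relative` helper and the `/host/path` layout are assumptions for illustration only; get_page.py may use a different scheme.

```python
from urllib.parse import urlsplit

def to_relative(url):
    """Turn an absolute URL into a root-relative path (hypothetical /host/path layout)."""
    parts = urlsplit(url)
    if not parts.netloc:
        return url                         # already relative, leave untouched
    path = parts.path or "/"
    rel = "/" + parts.netloc + path        # assumed layout: one folder per host
    if parts.query:
        rel += "?" + parts.query
    return rel

examples = [
    to_relative("http://cdn.example.com/js/app.js"),
    to_relative("//cdn.example.com/x.css"),
    to_relative("/img/logo.png"),
]
```

Root-relative paths like these only resolve correctly when the page is served over HTTP. One way to do that, assuming Python 3, is to run `python -m http.server` in the download folder and open the page through `http://localhost:8000/`.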
To see all the links in the page, open the JSON output in an online JSON viewer such as http://jsonviewer.stack.hu/
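If you would rather list the links locally, a small script can walk the JSON instead. This is only a sketch: the exact structure of out1.json is not described here, so `collect_links` (a hypothetical helper) simply gathers every key or string value that looks like an http(s) URL, and the sample data is made up.

```python
import json

def collect_links(node, found=None):
    """Recursively gather strings that look like http(s) URLs from parsed JSON."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            collect_links(key, found)      # URLs may be used as keys
            collect_links(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_links(item, found)
    elif isinstance(node, str) and node.startswith(("http://", "https://")):
        found.append(node)
    return found

# Made-up sample; the real out1.json layout may differ.
sample = json.loads(
    '{"http://a.example/": ["http://b.example/", {"http://c.example/": []}], "depth": 5}'
)
links = collect_links(sample)
```

To run it on the real file, replace the sample with `json.load(open("out1.json"))`.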