
Sai Krishna K committed 4fc6f5d

trying adding comments


Files changed (1)

+
+  link_parser.py collects the URLs found on a page.
+  out1.json is the output I got from running it against www.google.com for 5 levels (i.e. down to the 5th level of the link hierarchy). Try it on a big site for 5 levels and the program will never end ( :P )
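
The level-limited link collection described above can be sketched roughly as follows. This is a minimal illustration, not the code in link_parser.py: the names `LinkCollector`, `fetch`, and `crawl`, and the nested-dict result shape, are all assumptions made for the example. A `seen` set is included because, without one, pages that link back to each other get revisited, which is one plausible reason a deep crawl of a large site would not finish in reasonable time.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    # Hypothetical helper: download a page and decode it leniently.
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(url, depth, fetch_page=fetch, seen=None):
    """Return a nested dict {child_url: {...}} that is `depth` levels deep.

    The output shape is an assumption; out1.json may be laid out differently.
    """
    seen = set() if seen is None else seen
    seen.add(url)
    if depth == 0:
        return {}
    try:
        html = fetch_page(url)
    except Exception:
        # Skip pages that cannot be fetched instead of aborting the crawl.
        return {}
    parser = LinkCollector()
    parser.feed(html)
    tree = {}
    for href in parser.links:
        child = urljoin(url, href)  # resolve relative links against the page URL
        if child in seen:
            continue  # already visited: avoids loops between pages
        tree[child] = crawl(child, depth - 1, fetch_page, seen)
    return tree
```

Passing `fetch_page` as a parameter keeps the sketch testable without network access; calling `crawl("http://www.google.com/", 5)` would use the real `fetch`.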
+
+
+
+get_page.py downloads the entire page and rewrites all third-party URLs to relative URLs. It assumes that a local server is running, so just double-clicking the downloaded file won't work.
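
Since the downloaded page expects a local server, one possible way to provide one is Python's standard `http.server` module. The sketch below is an assumption about how the folder could be served, not part of get_page.py; the function name `serve_folder` and the default port are made up for the example, and it requires Python 3.7 or later.

```python
import functools
import http.server
import threading


def serve_folder(folder, port=8000):
    """Serve `folder` over HTTP on localhost in a background thread."""
    # Bind the handler to the folder so relative URLs resolve against it.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=folder
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server  # call server.shutdown() to stop serving
```

With Python 3, running `python -m http.server` inside the download folder does much the same from the command line.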
+
+
+
 Usage
 	 python get_page.py 
+
+
 Make sure to execute the script in a separate folder.
 
+
 Use
 	python link_parser.py
 to get all the links in the page. To view the links, open the output in an online JSON viewer such as http://jsonviewer.stack.hu/
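
As an offline alternative to an online viewer, the JSON output can also be pretty-printed with Python's standard `json` module. This is only a sketch: the function name `format_links` is hypothetical, and it assumes the output file (for example out1.json) is valid JSON encoded as UTF-8.

```python
import json


def format_links(path, indent=2):
    """Load a JSON file and return it as an indented, key-sorted string."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    return json.dumps(data, indent=indent, sort_keys=True, ensure_ascii=False)
```

For example, `print(format_links("out1.json"))` would print the link hierarchy with one nesting level per indent step.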