Sai Krishna k committed 4fc6f5d

trying adding comments

Files changed (1)

+
+  link_parser.py collects the URLs found on a page.
+  out1.json is the output I got from running it against www.google.com for 5 levels (i.e. down to the 5th level of the link hierarchy). Try it on a big site for 5 levels and the program will never end ( :P )
+
+
+
+get_page.py downloads the entire page and rewrites all third-party URLs as relative URLs. It assumes that a local server is running, so just double-clicking the downloaded file won't work.
+
+
+
 Usage
 	 python get_page.py 
+
+
 make sure to execute the script in a separate folder
 
+
 Use
 	python link_parser.py
 to get all the links in the page. Use an online JSON viewer to view the links, e.g. http://jsonviewer.stack.hu/
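
The README says a 5-level crawl of a big site never seems to finish. The sketch below illustrates why a depth limit and a set of already-visited URLs matter, since the number of pages to fetch can grow very quickly with each level. It is not the code in link_parser.py: the names `LinkCollector`, `crawl`, and `fetch` are hypothetical, and `fetch` is assumed to be any callable that returns the HTML text for a URL (here an in-memory dict, so nothing is downloaded).

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(url, fetch, max_depth, seen=None):
    """Return a nested dict of the links reachable from url.

    fetch is a hypothetical callable mapping a URL to its HTML text.
    max_depth caps how many levels of links are followed.
    seen holds URLs already visited, so no page is expanded twice.
    """
    seen = set() if seen is None else seen
    seen.add(url)
    if max_depth == 0:
        return {}
    parser = LinkCollector()
    parser.feed(fetch(url))
    tree = {}
    for href in parser.links:
        child = urljoin(url, href)  # resolve relative links against the page URL
        if child in seen:
            continue
        tree[child] = crawl(child, fetch, max_depth - 1, seen)
    return tree


# Stand-in for real downloads: a tiny made-up site held in memory.
PAGES = {
    "http://example.test/": '<a href="/a">A</a><a href="/b">B</a>',
    "http://example.test/a": '<a href="/">home</a><a href="/b">B</a>',
    "http://example.test/b": "",
}

if __name__ == "__main__":
    tree = crawl("http://example.test/", lambda u: PAGES.get(u, ""), max_depth=2)
    print(json.dumps(tree, indent=2))
```

The nested dict serialises directly to JSON, so the result can be pasted into a JSON viewer in the same way as out1.json. Passing a `fetch` that really downloads pages is left out on purpose, because that part depends on how link_parser.py is actually written.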