E-mails and Links crawler extractor
This script asks for the URL of a website, then crawls from page to page, building a list of all the links on the site and its sub-pages. External links are ignored (this can easily be turned off in the code). While crawling, it also collects e-mail addresses. At the end it saves 2 txt files: one with all the links, the other with all the e-mails.
I made this long ago because I needed all the e-mails of town councils and some other stuff (like restaurants and museums) in the north of Portugal. I found a webpage with all of that, but the e-mails were scattered across a ton of different links. This was the solution, and I got my mailing list.
I haven't tested it with other websites, but in theory it works with most of them. Anyone can use this in any way they like; I don't care about licenses or copyright, just have fun. (Note: it was made for small-scale crawls and only for myself. It only saves the lists after crawling everything, which is not ideal for websites with tons of sub-pages, but this can easily be improved.)
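For readers who want a picture of the crawl described above, here is a minimal sketch. It is not the code in "main.py": it assumes Python 3 and uses only the standard library instead of lxml, and every name in it (`crawl`, `extract_links`, `extract_emails`, `fetch_page`, `internal_only`, `max_pages`) is made up for illustration.

```python
import re
import urllib.request
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

# Deliberately simple pattern; it will miss some valid addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) URLs found in html, with #fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    links = set()
    for href in parser.hrefs:
        url, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(url).scheme in ("http", "https"):
            links.add(url)
    return links


def extract_emails(html):
    """Return every e-mail-looking string in the page source."""
    return set(EMAIL_RE.findall(html))


def fetch_page(url):
    """Download a page and return its HTML as text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, internal_only=True, max_pages=500):
    """Breadth-first crawl; returns (sorted links, sorted e-mails)."""
    root_host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    emails = set()
    pages = 0
    while queue and pages < max_pages:
        url = queue.popleft()
        pages += 1
        try:
            html = fetch(url)
        except Exception:
            continue  # unreachable or broken page: skip it
        emails |= extract_emails(html)
        for link in extract_links(html, url):
            # Skip links that point to another host unless told otherwise.
            if internal_only and urlparse(link).netloc != root_host:
                continue
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen), sorted(emails)
```

Calling `crawl("http://example.test/")` would return the two lists, which can then be written to the two txt files. Passing `internal_only=False` follows external links too, so `max_pages` is there as a safety cap.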
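One possible way to make that improvement is to append each new link or e-mail to its txt file as soon as it is found, so an interrupted crawl still leaves usable lists. The sketch below is an assumption about how that could look, not something the script does; it targets Python 3 and the class name `IncrementalList` is invented here.

```python
class IncrementalList:
    """Appends each new item to a text file the moment it is first seen."""

    def __init__(self, path):
        self.path = path
        self.seen = set()
        # Reload items written by a previous, interrupted run (if any).
        try:
            with open(path, encoding="utf-8") as fh:
                self.seen = {line.strip() for line in fh if line.strip()}
        except FileNotFoundError:
            pass

    def add(self, item):
        """Write item to disk if it is new; return True when it was written."""
        if item in self.seen:
            return False
        self.seen.add(item)
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(item + "\n")
        return True
```

A crawler could keep one `IncrementalList` for links and one for e-mails and call `add()` for every item it finds; duplicates are skipped and nothing is lost if the run stops halfway.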
If you don't have Python installed, run the file "main.exe" (I compiled it with PyInstaller, but never tested it).
To run the source code like a boss, the only requirements are:
- An installed version of Python (I used 2.7; not tested with other versions)
- lxml:
pip install lxml
Then just run the file "main.py".
My Homepage: www.paulojorgepm.net