WHAT IS DISALLOWZ ?
Disallowz is a bash script that queries a web server's robots.txt file.
If the robots.txt file exists, Disallowz will download the file and parse it,
creating a master list of the "Disallowed" links and directories specified by
the original robots.txt Disallow filters.
Well-behaved crawlers such as Google, Bing, etc. honor a site's Disallow
filters by not caching the content of those pages/directories, so they will
not appear in Google or Bing search results. Disallowz will attempt to connect
to those disallowed locations and report the response codes back to the user.
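The parsing step can be sketched roughly like this. This is a minimal
illustration, not the actual script: the file names (robots.txt,
masterlist.txt) and the grep/awk pipeline are assumptions about how the
Disallow lines might be extracted.

```shell
# Sample robots.txt for illustration (normally downloaded from a server).
printf 'User-agent: *\nDisallow: /admin/\nDisallow: /private/\n' > robots.txt

# Pull out every "Disallow:" line and keep only the path, building
# a master list of disallowed locations.
grep -i '^Disallow:' robots.txt | awk '{print $2}' > masterlist.txt
cat masterlist.txt
```

The script would then loop over masterlist.txt, requesting each path against
the target host and recording the response code.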
WHAT METHODS ARE USED FOR MAKING REQUESTS ?
Disallowz makes 2 types of requests, using curl and wget. The curl request
sends a HEAD request to verify the server's response code, and wget makes
a GET request to download the robots.txt file.
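The two request types look roughly like this. The exact flags Disallowz uses
may differ, and example.com is a placeholder host:

```shell
# HEAD request with curl: fetch headers only (-I), stay quiet (-s),
# discard the body, and print just the HTTP status code.
curl -s -o /dev/null -I -w '%{http_code}\n' https://example.com/robots.txt

# GET request with wget: quietly (-q) download robots.txt to a local file.
wget -q -O robots.txt https://example.com/robots.txt
```

A HEAD request is enough to learn the response code without transferring the
page body, which is why it suits the per-URL checks.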
HOW DO I RUN DISALLOWZ ?
Pretty simple, just run the command:
bash rob0tz
OR you can set the script as executable and run it like so:
chmod +x rob0tz
./rob0tz
FUTURE PLANS ?
proxy support using proxychains