
This utility extracts the follower or friend graph for a set of users through a wrapper around
the Twitter API. It can also extract the tweets of the followers along with the graph.
The utility obeys the rate limits of the Twitter REST API: whenever the hourly rate limit is hit,
the script sleeps before making further calls.

Andy Luong

I. Requirements

Python 2.7.1.
	- 3 modules are required (tested versions):
		1. httplib2 (0.7.1)
		2. oauth2 (build 170)
		3. python-twitter (0.8.2, modified; READ below)

Python 2.6.7
	- Needs the above and ...
		4. argparse (1.2.1)

Modified ''
	- This module has been modified to add two features:
		1. Status::NewFromJsonDict now parses the 'geo' JSON subtree to get coordinates
		2. API::GetFollowers now accepts a user parameter, rather than just the API's self (the authenticated user)

II. Installation & Setup

1. Install the required modules:
	- Run these two commands in each of the module directories:
		% python build
		% python install
	NOTE: We are using a modified version of python-twitter. Therefore, if you download the
		  python-twitter module from its source repository, make sure you copy over the modified copy
		  from this repository or make the modifications manually yourself. To install the modified version,
		  the copy must sit in 'build/lib/' of the python-twitter folder.

2. Acquire an Authorization Key:
	- Register an application at (Under 'My Applications')
	- Generate the access tokens
	- Create a key file (ex: autho-keys.txt ). The file will contain exactly 4 lines:
		<Consumer key>
		<Consumer secret>
		<Access token>
		<Access token secret>
		NOTE: Do not add any additional words or characters other than the keys.

3. Acquire a list of users:
	- Create a users file (ex: mytwitterusers.txt )
	- The file has one user (ID or screen name) on each line
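
The two input files described above could be read along these lines. This is a minimal sketch, not the code used by the script: the function names read_keys and read_users and the dictionary key names are hypothetical, and skipping blank lines is an assumption made for leniency.

```python
import io


def read_keys(path):
    """Return the four OAuth credentials from a key file, in file order."""
    # Hypothetical key names; the order follows the key file layout above.
    names = ("consumer_key", "consumer_secret",
             "access_token", "access_token_secret")
    with io.open(path, encoding="utf-8") as handle:
        # Strip surrounding whitespace and ignore blank lines (an assumption).
        lines = [line.strip() for line in handle if line.strip()]
    if len(lines) != len(names):
        raise ValueError("expected exactly 4 lines in %s, found %d"
                         % (path, len(lines)))
    return dict(zip(names, lines))


def read_users(path):
    """Return the user IDs or screen names listed one per line."""
    with io.open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

A key file with extra words or a missing line fails early with a ValueError instead of producing a confusing authentication error later.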

III. Run the script

% python -h
	- This will show you all the parameters you may tweak
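
For orientation, the flags used in the examples in this section (-k, -u, -f, -t) could be declared with argparse roughly as follows. This is only an illustrative sketch: the real script's option handling is not shown in this README, so the dest names, help texts, and which options are required are all assumptions. The -f option is given nargs="+" because one of the examples passes two file names to it.

```python
import argparse


def build_parser():
    """Build a parser for the four flags seen in the README examples."""
    parser = argparse.ArgumentParser(
        description="Illustrative sketch of the command-line options.")
    # Assumed to be required, since every example supplies both.
    parser.add_argument("-k", dest="key_file", required=True,
                        help="file holding the 4 authorization keys")
    parser.add_argument("-u", dest="users_file", required=True,
                        help="file with one user (ID or screen name) per line")
    # One or more output files: a graph file, optionally a tweets file.
    parser.add_argument("-f", dest="graph_files", nargs="+", default=[],
                        help="output file(s) for the graph")
    parser.add_argument("-t", dest="tweets_file", default=None,
                        help="output file for the tweets of the listed users")
    return parser
```

Calling build_parser().parse_args(["-h"]) prints the generated help text and exits, which is the behavior the -h flag relies on.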

Only Follower Graph:
% python -k autho-keys.txt -u mytwitterusers.txt -f followers.graph

Only Friends Graph:
% python -k autho-keys.txt -u mytwitterusers.txt -f friends.graph

Follower Graph and Follower Tweets:
% python -k autho-keys.txt -u mytwitterusers.txt -f followers.graph followers.tweets

Tweets of Users:
% python -k autho-keys.txt -u mytwitterusers.txt -t users.tweets

IV. Additional Helper Scripts

- This helper script is only needed if you are processing a large set of users and the main script
  fails prematurely.

% mytwitterusers.txt followers.graph.complete followers.graph

What is happening...?
	1. The script looks at the 'followers.graph' file and finds the "last user" that was processed.
	2. It then removes all users processed before the "last user", including the last user, from
	3. Then it appends 'followers.graph' to 'followers.graph.complete', creating 'followers.graph.complete'
	   if it does not already exist.
	4. Now you can rerun the original script with its old parameters to resume from where the script broke.

NOTE: You are supplying 3 parameters to It is a dumb script that does not do
	  any write protection or safety checks. BE CAREFUL or write your own.

V. Other Information

	- Logs of processed users and unprocessable users are stored in the same base folder as the user input file

Twitter Parameters:
	- The default parameters, which can be viewed using -h, are set to help obtain "important" followers.
	  Current research has found that users with over 1000 followers tend to be bots, news media, or celebrities.
	  Also, at most 200 tweets can be extracted at one time using the Twitter API.
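
The two numbers above can be illustrated with a small sketch. It is not the script's own code: the function names, and the choice of 1000 followers and 200 tweets as default argument values, are assumptions made for illustration.

```python
def is_important(follower_count, max_followers=1000):
    """Keep a follower unless it has more than max_followers followers."""
    return follower_count <= max_followers


def page_sizes(total_tweets, page_limit=200):
    """Split a requested tweet count into requests of at most page_limit."""
    sizes = []
    remaining = total_tweets
    while remaining > 0:
        size = min(remaining, page_limit)
        sizes.append(size)
        remaining -= size
    return sizes
```

For example, asking for 450 tweets of one user would take three requests under this scheme: two of 200 tweets and one of 50.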

Twitter API Rate Limits:
	- Essentially, if you have different IPs, you can make up to 350 API calls an hour.
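
One simple way to stay under an hourly limit is to count calls locally and sleep once the budget is spent. The sketch below shows that idea as a fixed-window counter. It is an assumption about how such throttling could work, not the script's actual logic: the class name HourlyBudget is hypothetical, and the Twitter API reports its own limit and reset time, which this sketch does not query.

```python
import time


class HourlyBudget(object):
    """Allow at most max_calls per window, sleeping when the budget is spent."""

    def __init__(self, max_calls=350, window=3600,
                 clock=time.time, sleep=time.sleep):
        self.max_calls = max_calls
        self.window = window      # window length in seconds
        self.clock = clock        # injectable for testing
        self.sleep = sleep        # injectable for testing
        self.window_start = clock()
        self.calls = 0

    def acquire(self):
        """Call once before every API request."""
        now = self.clock()
        if now - self.window_start >= self.window:
            # The previous window has expired: start a fresh one.
            self.window_start = now
            self.calls = 0
        if self.calls >= self.max_calls:
            # Budget spent: sleep until the window ends, then start a new one.
            self.sleep(self.window_start + self.window - now)
            self.window_start = self.clock()
            self.calls = 0
        self.calls += 1
```

Calling budget.acquire() before each request keeps a long-running extraction within the limit without any further bookkeeping in the calling code.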
