IMDBReview_NaiveBayes / readme

This is an experimental Python project that extracts IMDB reviews for a movie, classifies them, and generates a result HTML file.
The project is based on the programming assignments in **Udacity CSC 101 class** and **Stanford NLP class**.
The script is also used as a plugin in my term project for my Distributed System class this semester.

Usage:
To Train:
	python imdb.py -t LIST_FILE MAX_COMMENT_COUNT

To Classify:
	python imdb.py -c OUTPUT_HTML_PATH MOVIE_TITLE [MAX_COMMENT_COUNT]

The files in the lists directory are the movie lists I used to train the NaiveBayes classifier. The titles were picked at random from:
	* IMDB Top 250 (http://www.imdb.com/chart/top)
	* IMDB Bottom 100 (http://www.imdb.com/chart/bottom)
	* New York Times The Best 1,000 Movies Ever Made (http://www.nytimes.com/ref/movies/1000best.html)

Trained data is stored in trained.raw as a plain text file
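
For readers unfamiliar with the approach, here is a minimal sketch of a Naive Bayes text classifier with add-one (Laplace) smoothing. It is not the code in imdb.py: the class name, the "pos"/"neg" labels, and the in-memory counters are assumptions made for illustration, and the sketch says nothing about the actual layout of trained.raw.

```python
import math
from collections import Counter


class NaiveBayesSketch:
    """Hypothetical two-class bag-of-words classifier (illustration only)."""

    def __init__(self):
        # Per-label word frequencies and per-label document counts.
        self.word_counts = {"pos": Counter(), "neg": Counter()}
        self.doc_counts = Counter()

    def train(self, label, text):
        """Add one labeled review to the counts."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def classify(self, text):
        """Return the label with the highest log-probability, or None if untrained."""
        vocab = set(self.word_counts["pos"]) | set(self.word_counts["neg"])
        total_docs = sum(self.doc_counts.values())
        # Words never seen in training are ignored.
        words = [w for w in text.lower().split() if w in vocab]
        best_label, best_score = None, float("-inf")
        for label, counts in self.word_counts.items():
            if self.doc_counts[label] == 0:
                continue  # no training data for this label
            # Log prior: share of training documents with this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in words:
                # Add-one smoothing keeps unseen (label, word) pairs above zero.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = NaiveBayesSketch()
clf.train("pos", "a great and moving film")
clf.train("neg", "a boring and terrible film")
print(clf.classify("great film"))
```

Working in log space avoids numeric underflow when a review contains many words.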

Future Work:
	* Try other algorithms to improve the accuracy
	* Make an online version