NLTK Extras

Currently, this contains a module for building FreqDists on top of Redis. Also included is a copy of the Python interface to Redis. It is based on the 0.9.9 release of NLTK and the 0.096 release of Redis. Because it relies on the internal implementation of FreqDist, and because Redis is still very much in beta, I can't guarantee it will work with future versions of NLTK or Redis.


RedisFreqDist works just like NLTK's FreqDist, but stores samples and frequency counts in Redis as keys and values. That means samples must be strings. You'll also need a running redis-server for RedisFreqDist to work. Below is a simple example function that creates a RedisFreqDist and counts samples.

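def make_freq_dist(samples, host='localhost', port=6379, db=0):
	# RedisFreqDist comes from this package; a redis-server must be listening on host:port.
	freqs = RedisFreqDist(host=host, port=port, db=db)
	for sample in samples:
		freqs.inc(sample)  # each sample must be a string, since it becomes a Redis key
	return freqs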

All of the other FreqDist functions are supported, so you can get a list of all the samples and look up the frequency count of each sample. For more info, see my article Building a NLTK FreqDist on Redis.
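
To make the keys-and-values idea concrete without needing a redis-server, here is a minimal sketch. It is not the actual RedisFreqDist implementation: InMemoryStore, KeyValueFreqDist, and count_samples are hypothetical names invented for illustration, and the store is a plain dictionary standing in for a Redis client. It only shows how counting, looking up a count, and listing samples map onto a key/value store.

```python
class InMemoryStore:
    """Hypothetical stand-in for a Redis client: string keys, integer values."""

    def __init__(self):
        self._data = {}

    def incr(self, key, amount=1):
        # A missing key starts at 0 before being incremented.
        self._data[key] = self._data.get(key, 0) + amount
        return self._data[key]

    def get(self, key):
        return self._data.get(key)

    def keys(self):
        return list(self._data)


class KeyValueFreqDist:
    """Toy frequency distribution: each sample is a key, its count is the value."""

    def __init__(self, store):
        self._store = store

    def inc(self, sample, count=1):
        if not isinstance(sample, str):
            raise TypeError("samples must be strings, since they are used as keys")
        self._store.incr(sample, count)

    def __getitem__(self, sample):
        # Samples that were never counted report a frequency of 0.
        value = self._store.get(sample)
        return 0 if value is None else int(value)

    def samples(self):
        return sorted(self._store.keys())


def count_samples(samples):
    """Count an iterable of string samples into a KeyValueFreqDist."""
    freqs = KeyValueFreqDist(InMemoryStore())
    for sample in samples:
        freqs.inc(sample)
    return freqs
```

For example, count_samples(["the", "cat", "the"]) gives a distribution where the count for "the" is 2 and samples() lists "cat" and "the". Swapping the dictionary for a real Redis connection is what lets the counts persist and be shared between processes.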

NLTK Extras also includes ConditionalRedisFreqDist, which works just like NLTK's ConditionalFreqDist. There's also RedisConditionalFreqDist for when you have a large number of conditions, but it's not very well tested.
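
As a rough illustration of what a conditional distribution over a key/value store involves, the sketch below counts (condition, sample) pairs under composite keys. This is an assumption made for illustration only, not necessarily how ConditionalRedisFreqDist or RedisConditionalFreqDist lay out their keys. SEPARATOR, make_key, count_pairs, freq, and conditions are hypothetical names, and a plain dictionary stands in for Redis.

```python
SEPARATOR = ":"  # hypothetical; must be a string that never occurs in a condition


def make_key(condition, sample):
    """Build one flat key from a condition and a sample."""
    return condition + SEPARATOR + sample


def count_pairs(pairs, store=None):
    """Count (condition, sample) pairs into a flat mapping of composite keys."""
    store = {} if store is None else store
    for condition, sample in pairs:
        key = make_key(condition, sample)
        store[key] = store.get(key, 0) + 1
    return store


def freq(store, condition, sample):
    """Return how often the sample was seen under the condition (0 if never)."""
    return store.get(make_key(condition, sample), 0)


def conditions(store):
    """Recover the distinct conditions from the stored keys."""
    return sorted({key.split(SEPARATOR, 1)[0] for key in store})
```

With pairs such as ("noun", "cat") and ("verb", "run"), each condition gets its own slice of the key space, which is the property a conditional distribution needs. Recovering the list of conditions means scanning every key, which hints at why a large number of conditions may call for a different layout.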