Code to calculate word frequencies to create a word cloud

Created by Jordi Deu, last modified by Jordi Deu-Pons
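import csv

from nltk import FreqDist, word_tokenize
from nltk.corpus import stopwords

# Requires the NLTK tokenizer and stop word data (see nltk.download).
# Build the stop word set once; calling stopwords.words() per token is slow.
stop_words = set(stopwords.words('english'))

# Count words longer than 4 characters, lowercasing before the stop word check
# so that capitalized stop words such as "There" are filtered out too.
with open('text.txt', 'rt') as fd:
    words = (w.lower() for w in word_tokenize(fd.read()))
    freq = FreqDist(w for w in words if len(w) > 4 and w not in stop_words)

# Write the 150 most common words as "weight<TAB>word" rows.
with open('text_freq.txt', 'wt', newline='') as fd:
    writer = csv.writer(fd, delimiter='\t')
    for word, count in freq.most_common(150):
        # Integer-divide the count by 5 to get the weight.
        writer.writerow([count // 5, word])
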
# Create the word cloud by pasting the contents of text_freq.txt into this website:
# http://www.wordclouds.com/
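
To try the counting and scaling logic without installing NLTK or its data, the following is a minimal sketch using only the standard library. It is an approximation, not the snippet's method: a simple regular expression stands in for word_tokenize, and SAMPLE_STOPWORDS is a small hypothetical stand-in for the NLTK English stop word list. The function name word_weights and its parameters are also made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-in for NLTK's English stop word list (assumption).
SAMPLE_STOPWORDS = {"there", "their", "about", "which", "would"}


def word_weights(text, stop_words=SAMPLE_STOPWORDS, top_n=150, min_len=5, divisor=5):
    """Return (weight, word) rows for the most frequent words in text."""
    # Rough tokenizer: runs of letters and apostrophes, lowercased.
    words = (w.lower() for w in re.findall(r"[A-Za-z']+", text))
    # Keep words of at least min_len characters that are not stop words.
    counts = Counter(w for w in words if len(w) >= min_len and w not in stop_words)
    # Same scaling as the snippet: integer-divide each count by the divisor.
    return [(count // divisor, word) for word, count in counts.most_common(top_n)]


rows = word_weights("Cloud " * 12 + "There " * 20 + "words " * 5 + "tiny " * 30)
for weight, word in rows:
    print(f"{weight}\t{word}")
```

Because of the integer division, any word that occurs fewer than 5 times ends up with a weight of 0, so on short texts a smaller divisor may be more useful.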
