How to remove stop words using NLTK or Python


Question

I have a dataset that I would like to remove stop words from, using

stopwords.words('english')

I'm struggling to work out how to use this in my code to simply take out these words. I already have a list of the words from this dataset; the part I'm struggling with is comparing that list against the stop words and removing the matches. Any help is appreciated.

1
95
3/6/2013 11:53:28 AM

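from nltk.corpus import stopwords
# ...
# Build the stop word set once; calling stopwords.words() for every word is slow.
stop_words = set(stopwords.words('english'))
filtered_words = [word for word in word_list if word not in stop_words]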
181
11/12/2015 3:29:44 PM
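
One thing to watch with the approach above: NLTK's English stop word list is all lowercase, so a capitalized token such as "The" is not filtered unless the comparison is case-insensitive. Here is a minimal sketch of that idea. The helper name `remove_stop_words` and the small `SAMPLE_STOP_WORDS` set are made up for illustration so the snippet runs without downloading the corpus; in real use you would presumably pass `set(stopwords.words('english'))` instead.

```python
from typing import Iterable, List, Set

# Hypothetical stand-in for set(stopwords.words('english')).
# The real NLTK list is much longer and requires the stopwords corpus
# to be downloaded once, e.g. with nltk.download('stopwords').
SAMPLE_STOP_WORDS: Set[str] = {"the", "is", "a", "of", "and", "to", "in"}


def remove_stop_words(words: Iterable[str], stop_words: Set[str]) -> List[str]:
    """Return the words whose lowercase form is not in stop_words."""
    # Lowercase only for the comparison, so the original casing is kept
    # in the words that survive the filter.
    return [w for w in words if w.lower() not in stop_words]


word_list = ["The", "cat", "is", "in", "the", "garden"]
filtered_words = remove_stop_words(word_list, SAMPLE_STOP_WORDS)
print(filtered_words)  # ['cat', 'garden']
```

Using a set keeps each membership check cheap, which matters once the word list gets large.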

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow