Project - How to Build a Sentiment Classifier using Python and IMDB Reviews

Truncating the Vocabulary

The vocabulary contains more than 50,000 words, so let us truncate it to only the 10,000 most common words.

  • Set vocab_size to 10000.

    vocab_size = << your code comes here >>
  • Extract the 10,000 most frequently occurring words from vocabulary and store them in the truncated_vocabulary list (let us use a list comprehension to do so).

    << your code comes here >> = [ word for word, count in vocabulary.most_common()[:vocab_size]]
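
To see how this kind of truncation behaves, here is a minimal sketch on toy data. It assumes that vocabulary is a collections.Counter, which the call to most_common() suggests; the names toy_reviews, toy_vocabulary, toy_vocab_size and toy_truncated are hypothetical and are not part of the exercise.

```python
from collections import Counter

# Hypothetical toy data standing in for the tokenized IMDB reviews.
toy_reviews = [
    "the movie was great",
    "the movie was bad",
    "the plot",
]

# Assumption: the tutorial's `vocabulary` is a Counter of word frequencies.
toy_vocabulary = Counter()
for review in toy_reviews:
    toy_vocabulary.update(review.split())

# The exercise uses 10000; a tiny size keeps the toy output readable.
toy_vocab_size = 3

# most_common() returns (word, count) pairs sorted by descending count,
# so slicing the first toy_vocab_size pairs keeps the most frequent words.
toy_truncated = [
    word for word, count in toy_vocabulary.most_common()[:toy_vocab_size]
]

print(toy_truncated)
```

Counter.most_common also accepts the number of items directly, so most_common(toy_vocab_size) would be an alternative to slicing the full sorted list.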