Problem 1)

Run the following, then scroll through the list and find a corpus to download (do not use brown or inaugural):

    import nltk
    nltk.download()

Answer the following:
1. What directory does the corpus download to?
2. How many files are there for that corpus?
3. In 3-4 sentences, what is the purpose of that corpus and what genres does it cover?

Problem 2)

Below are examples of how to access your corpus:

Example                        Description
fileids()                      the files of the corpus
fileids([categories])          the files of the corpus corresponding to these categories
categories()                   the categories of the corpus
categories([fileids])          the categories of the corpus corresponding to these files
raw()                          the raw content of the corpus
raw(fileids=[f1,f2,f3])        the raw content of the specified files
raw(categories=[c1,c2])        the raw content of the specified categories
words()                        the words of the whole corpus
words(fileids=[f1,f2,f3])      the words of the specified fileids
words(categories=[c1,c2])      the words of the specified categories
sents()                        the sentences of the whole corpus
sents(fileids=[f1,f2,f3])      the sentences of the specified fileids
sents(categories=[c1,c2])      the sentences of the specified categories
abspath(fileid)                the location of the given file on disk
encoding(fileid)               the encoding of the file (if known)
open(fileid)                   open a stream for reading the given corpus file
root                           the path to the root of the locally installed corpus
readme()                       the contents of the README file of the corpus

Answer the following (you might have to try different corpora than in Problem 1; try a few until you find one with the required info):
1. How many categories are in your corpus?
2. How many sentences are in the corpus?
3. How many sentences are in each category?
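The three Problem 2 questions can be answered by combining categories() and sents() from the list above. The following is a minimal sketch, not a reference solution: the helper names sentences_per_category and summarize are hypothetical, and it assumes the corpus passed in is a categorized corpus reader (one where categories() returns a non-empty list).

```python
def sentences_per_category(corpus):
    """Return {category: number of sentences} for a categorized corpus reader."""
    return {
        category: len(corpus.sents(categories=[category]))
        for category in corpus.categories()
    }


def summarize(corpus):
    """Collect the three counts asked for in Problem 2 (hypothetical helper)."""
    return {
        "n_categories": len(corpus.categories()),
        "n_sentences": len(corpus.sents()),
        "per_category": sentences_per_category(corpus),
    }
```

Calling summarize() on a categorized reader imported from nltk.corpus returns the three counts in one dictionary. In corpora where a file can belong to more than one category, the per-category counts may add up to more than the corpus total.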
For instance, for brown you can import it and call any of the functions above:

    from nltk.corpus import brown
    brown.[function]
    brown.raw()

Problem 3)

First: pip install matplotlib

    import nltk
    from nltk.corpus import inaugural

    word1 = 'country'
    word2 = 'city'
    cfd = nltk.ConditionalFreqDist(
        (target, fileid[:4])
        for fileid in inaugural.fileids()
        for w in inaugural.words(fileid)
        for target in [word1, word2]
        if w.lower().startswith(target))
    cfd.plot()

Try finding two words to replace country and city. Find one word that has become more popular in recent years (2009) and one that was popular but no longer is.
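When trying out replacement words, it can help to look at the raw counts before plotting. The sketch below applies the same prefix-matching logic as the exercise but uses collections.Counter in place of nltk.ConditionalFreqDist, so the counting step runs without NLTK. The function names prefix_counts_by_year and inaugural_documents are hypothetical, and the second one assumes the inaugural corpus has already been downloaded.

```python
from collections import Counter, defaultdict


def prefix_counts_by_year(documents, targets):
    """Count, per target prefix and year, the words that start with that prefix.

    documents: iterable of (fileid, words) pairs. The first four characters
    of each fileid are treated as the year, mirroring fileid[:4] above.
    """
    counts = defaultdict(Counter)
    for fileid, words in documents:
        year = fileid[:4]
        for word in words:
            lowered = word.lower()
            for target in targets:
                if lowered.startswith(target):
                    counts[target][year] += 1
    return counts


def inaugural_documents():
    """Yield (fileid, words) pairs from NLTK's inaugural corpus.

    Assumes nltk is installed and the 'inaugural' corpus is downloaded.
    """
    from nltk.corpus import inaugural

    for fileid in inaugural.fileids():
        yield fileid, inaugural.words(fileid)
```

For example, prefix_counts_by_year(inaugural_documents(), ['country', 'city']) gives one Counter per word, keyed by year. Because the match is by prefix, 'country' does not match 'countries' and 'city' does not match 'cities'; a shorter prefix such as 'countr' or 'cit' would catch the plurals, but may also catch unrelated words.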
Jun 03, 2021