


Problem 1)


import nltk


nltk.download()


Scroll through the list and find a corpus to download (do not use brown or inaugural).


Answer the following:


1. What directory does the corpus download to?


2. How many files are there for that corpus?


3. In 3-4 sentences, what is the purpose of that corpus and what genres does it cover?
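
As a minimal sketch of one way to approach questions 1 and 2: the helper below takes a corpus reader and reports its root directory and file count, using the `root` attribute and `fileids()` method listed under Problem 2. The names `corpus_summary` and `TinyCorpus`, and the path inside the stand-in, are hypothetical and exist only so the sketch runs without downloading anything.

```python
def corpus_summary(corpus):
    """Return the root directory and file count of a corpus reader."""
    return {
        "root": str(corpus.root),            # where the corpus lives on disk
        "num_files": len(corpus.fileids()),  # how many files it contains
    }


class TinyCorpus:
    """Hypothetical stand-in that mimics the two reader members used above."""
    root = "/tmp/nltk_data/corpora/tiny"  # made-up path, for illustration only

    def fileids(self):
        return ["a.txt", "b.txt", "c.txt"]


summary = corpus_summary(TinyCorpus())
print(summary)
```

With NLTK installed and a corpus downloaded, passing the real reader should work the same way, for example `from nltk.corpus import gutenberg` followed by `corpus_summary(gutenberg)`. The directories NLTK searches for downloaded data are listed in `nltk.data.path`.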



Problem 2)


Below are examples of how to access your corpus:

Example                      Description
fileids()                    the files of the corpus
fileids([categories])        the files of the corpus corresponding to these categories
categories()                 the categories of the corpus
categories([fileids])        the categories of the corpus corresponding to these files
raw()                        the raw content of the corpus
raw(fileids=[f1,f2,f3])      the raw content of the specified files
raw(categories=[c1,c2])      the raw content of the specified categories
words()                      the words of the whole corpus
words(fileids=[f1,f2,f3])    the words of the specified fileids
words(categories=[c1,c2])    the words of the specified categories
sents()                      the sentences of the whole corpus
sents(fileids=[f1,f2,f3])    the sentences of the specified fileids
sents(categories=[c1,c2])    the sentences of the specified categories
abspath(fileid)              the location of the given file on disk
encoding(fileid)             the encoding of the file (if known)
open(fileid)                 open a stream for reading the given corpus file
root                         the path to the root of the locally installed corpus
readme()                     the contents of the README file of the corpus



Answer the following (you may need to try corpora other than the one from Problem 1; try a few until you find one with the required information):


1. How many categories are in your corpus?


2. How many sentences are in the corpus?


3. How many sentences are in each category?


For instance, you can import brown and call its functions like this:


from nltk.corpus import brown
# General form: brown.<function>(), for example:
brown.raw()
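
The three questions above map onto `categories()`, `sents()` and `sents(categories=[...])`. The sketch below shows that mapping under stated assumptions: `category_sentence_counts` and `TinyCategorizedCorpus` are hypothetical names, and the stand-in class holds made-up sentences so the example runs without any corpus installed.

```python
def category_sentence_counts(corpus):
    """Map each category to the number of sentences it contains."""
    return {cat: len(corpus.sents(categories=[cat])) for cat in corpus.categories()}


class TinyCategorizedCorpus:
    """Hypothetical stand-in for a categorized corpus reader such as brown."""
    _sents = {
        "news": [["Rates", "rose", "."], ["Markets", "fell", "."]],
        "fiction": [["She", "smiled", "."]],
    }

    def categories(self):
        return sorted(self._sents)

    def sents(self, categories=None):
        # With no argument, return the sentences of the whole corpus.
        cats = categories if categories is not None else self.categories()
        return [s for c in cats for s in self._sents[c]]


corpus = TinyCategorizedCorpus()
num_categories = len(corpus.categories())        # question 1
total_sentences = len(corpus.sents())            # question 2
per_category = category_sentence_counts(corpus)  # question 3
print(num_categories, total_sentences, per_category)
```

With NLTK, `category_sentence_counts(brown)` should give the per-category counts for brown once that corpus is downloaded. Corpora that are not categorized may not provide `categories()` at all, which is likely why the exercise suggests trying several.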









Problem 3)



First: pip install matplotlib


import nltk
from nltk.corpus import inaugural

word1 = 'country'
word2 = 'city'
# For each address, count words starting with each target; fileid[:4] is the year.
cfd = nltk.ConditionalFreqDist(
    (target, fileid[:4])
    for fileid in inaugural.fileids()
    for w in inaugural.words(fileid)
    for target in [word1, word2]
    if w.lower().startswith(target))
cfd.plot()


Try finding two words to replace country and city. Find one word that has become more popular in recent years (2009) and one that was once popular but no longer is.
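
When screening candidate words, raw numbers can be easier to compare than a plot. The sketch below counts prefix matches per year for a single target, using the same `fileid[:4]` year convention and `startswith` test as the code above. It swaps in a plain `collections.Counter` for `ConditionalFreqDist`, and `prefix_counts_by_year` and `TinySpeeches` (with its made-up texts) are hypothetical names used only for illustration.

```python
from collections import Counter


def prefix_counts_by_year(corpus, target):
    """Count words starting with `target` per year (first 4 chars of the file id)."""
    counts = Counter()
    for fileid in corpus.fileids():
        year = fileid[:4]
        counts[year] += sum(
            1 for w in corpus.words(fileid) if w.lower().startswith(target)
        )
    return counts


class TinySpeeches:
    """Hypothetical stand-in for the inaugural reader, with made-up texts."""
    _texts = {
        "1801-A.txt": "Our duty and our Duties to the union".split(),
        "2009-B.txt": "Jobs and health care and more jobs".split(),
    }

    def fileids(self):
        return sorted(self._texts)

    def words(self, fileid):
        return self._texts[fileid]


speeches = TinySpeeches()
duty = prefix_counts_by_year(speeches, "dut")
jobs = prefix_counts_by_year(speeches, "job")
print(dict(duty), dict(jobs))
```

With NLTK, passing `inaugural` in place of the stand-in should give counts for the real addresses. A word whose counts grow toward the latest years is a candidate for the first part of the exercise, and one whose counts fade is a candidate for the second. If the plot is hard to read, `cfd.tabulate()` prints the conditional counts as a table instead.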


May 31, 2021