- Question 2: This question is about word co-occurrences, collocations and distributional similarity. Throughout this question, reference will be made to the sample of English stored in text1 (Lewis...



Question 2

This question is about word co-occurrences, collocations and distributional similarity. Throughout this question, reference will be made to the sample of English stored in text1 (Lewis Carroll's Alice in Wonderland), a sample of which is output below.

    ### Run this cell.
    ### Do not change the code in this cell
    from nltk.tokenize import sent_tokenize, word_tokenize
    from nltk.corpus import gutenberg

    def get_rawtext(filename='carroll-alice.txt'):
        text = gutenberg.raw(filename)
        return text

    def get_text(filename='carroll-alice.txt'):
        text = gutenberg.raw(filename)
        sentences = sent_tokenize(text)
        tokenized = [word_tokenize(sent.lower()) for sent in sentences]
        normalised = [["Nth" if (token.endswith(("nd", "st", "th")) and token[:-2].isdigit()) else token for token in sent] for sent in tokenized]
        normalised = [["NUM" if token.isdigit() else token for token in sent] for sent in normalised]
        filtered = [[word for word in sent if word.isalpha()] for sent in normalised]
        return filtered

    text1 = get_text()
    text1[:10]

a) Explain what each step in the get_text() function does,
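The normalisation and filtering steps of get_text() can be tried in isolation. The following is a minimal sketch, not part of the original question: the helper name `normalise_and_filter` and the sample token list are hypothetical, and the tokens are written by hand rather than produced by `sent_tokenize`/`word_tokenize`, so no NLTK data needs to be downloaded.

```python
def normalise_and_filter(tokenized):
    """Apply the last three list comprehensions of get_text() to
    sentences that are already lower-cased and tokenized."""
    # Ordinals such as "1st", "2nd", "4th" become the placeholder "Nth".
    normalised = [["Nth" if (token.endswith(("nd", "st", "th")) and token[:-2].isdigit()) else token
                   for token in sent] for sent in tokenized]
    # Tokens made up only of digits become the placeholder "NUM".
    normalised = [["NUM" if token.isdigit() else token for token in sent]
                  for sent in normalised]
    # Only purely alphabetic tokens survive; punctuation and mixed tokens are dropped.
    filtered = [[word for word in sent if word.isalpha()] for sent in normalised]
    return filtered


# Hand-made sample (an assumption for illustration, not taken from the corpus).
sample = [["the", "1st", "and", "3rd", "of", "42", "cats", ",", "was", "n't", "it", "?"]]
result = normalise_and_filter(sample)
print(result)
# → [['the', 'Nth', 'and', 'of', 'NUM', 'cats', 'was', 'it']]
```

In this sample, "3rd" is not mapped to "Nth" because "rd" is not among the suffixes checked, so it is removed by the final `isalpha()` filter along with the punctuation and "n't".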

Jun 05, 2022