XXXXXXXXXXpoints)​ a) Define NER and what it is useful for: b) Give 3 examples of boundary issues in NER c) Run NLTK’s NER model on the following sentence: “Trump is going to Paris on France’s express...

Here you go.


1. (25 points)
a) Define NER and what it is useful for:
b) Give 3 examples of boundary issues in NER
c) Run NLTK’s NER model on the following sentence: “Trump is going to Paris on France’s express train with John Kelly”
i) Show your code here:
ii) Calculate the accuracy of the model on this sentence
2. (25 points)
a) What is the difference between polarity sentiment analysis and categorical sentiment analysis? Give examples
b) Give 3 sentences that would be hard to get correct sentiment for. Each one should show different language issue.
i)
ii)
iii)
c) Code in python and report the polarity sentiment for the following sentence: “This is the best exam in the world!”
i) Show your code here:
ii) Report the score:
3. (25 points)
a) What is the main use of TFIDF? Give two examples of use cases for it
b) Why doesn’t TFIDF return the most frequent word in a document as the most important in all cases?
c) What is the downside to using TFIDF? What does it not do well?
4. (25 points)
a) Explain a Zipf curve and what it detects.
b) Does a Zipf curve look the same in all languages? What would cause a Zipf curve to be abnormal looking?
c) Draw a Zipf Curve. Label the X and Y axis and show example values
Answered 3 days after Jul 29, 2021


Rajashekar answered on Jul 31 2021
a) Define NER and what it is useful for:
Named Entity Recognition (NER), also known as entity extraction, is the task of locating named entities in text and classifying them into predefined categories such as individuals, companies, places, organizations, and cities. NER is a subtask of information extraction. Extracting key entities such as person names, locations, dates, specialized terms, and product terminology from raw text not only improves keyword search but also paves the way for semantic search, targeted search, and document repurposing. NER adds semantic knowledge to a piece of content by making it clearer what the text is about.
The uses of NER:
1. Classifying content for newspapers
News publishers handle a huge amount of digital text, and managing it well is important for getting the most out of each article. NER can scan the articles and identify the relevant tags, which helps classify articles and makes them easier to discover.
2. Efficient search algorithms
NER can reduce search time by identifying the relevant tags in each article and storing them separately. A search term then only has to be matched against the small list of entities discussed in each article, which leads to faster search execution.
3. Powering content recommendations
Recommendation systems are now widespread. NER supports them by extracting the entities mentioned in the content a user has viewed, so that other content mentioning similar entities can be recommended.
4. Customer support
Categorizing complaints and feedback makes customer support more effective. NER helps streamline this process.
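The search use case in item 2 can be illustrated with a minimal sketch. It assumes an NER system has already tagged each article; the article ids, the entity lists, and the helper names `build_entity_index` and `search` are hypothetical and only serve the illustration.

```python
from collections import defaultdict

# Hypothetical articles with the entity tags an NER system is assumed to have produced.
ARTICLE_ENTITIES = {
    "article-1": ["Paris", "France", "John Kelly"],
    "article-2": ["Paris", "Eiffel Tower"],
    "article-3": ["John Kelly", "White House"],
}


def build_entity_index(article_entities):
    """Map each lowercased entity to the set of article ids that mention it."""
    index = defaultdict(set)
    for article_id, entities in article_entities.items():
        for entity in entities:
            index[entity.lower()].add(article_id)
    return index


def search(index, term):
    """Return the sorted ids of articles tagged with the term (case-insensitive)."""
    return sorted(index.get(term.lower(), set()))


index = build_entity_index(ARTICLE_ENTITIES)
print(search(index, "paris"))
print(search(index, "John Kelly"))
```

The lookup touches only the stored entity tags rather than the full article text, which is the reason the tags are kept separately.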
b) Give 3 examples of boundary issues in NER
a) Text preprocessing and feature extraction for NER require entities to be isolated. However, as in any natural language, many articles contain ambiguities stemming from the equivocal use of synonyms, homonyms, multi-word or nested named entities, and other ambiguities in naming [1] (Nayel et al., 2019). For instance, the same entity can be written differently in different articles, e.g., “Lymphocytic Leukemia” and “Lymphoblastic Leukemia” (synonyms; British and American spelling differences are another source of such variation).
b) Interchangeable words used in different places can change how relevant a word is to the words around it, which affects performance. In the medical field, where words like “genes” and “proteins” can be used interchangeably, NER falls short.
c) Spelling variations play an important role in determining the success of NER techniques. Vowels matter in particular: some variants of a word sound almost the same but are spelled quite differently [2] (Sanjana et al., 2017).
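One way to see why multi-word entities cause boundary problems is to compare predicted spans against reference spans. The sketch below is only an illustration: the tokenization, the `GOLD` annotations, and the `PREDICTED` spans are all assumptions made up for this example, not the output of NLTK or any other model.

```python
# Tokens of the example sentence (tokenization is an assumption).
TOKENS = ["Trump", "is", "going", "to", "Paris", "on", "France", "'s",
          "express", "train", "with", "John", "Kelly"]

# Spans are (start, end, label) over token positions; end is exclusive.
# GOLD is an assumed reference annotation; PREDICTED is a hypothetical
# system output that tags only "Kelly" instead of "John Kelly".
GOLD = [(0, 1, "PERSON"), (4, 5, "GPE"), (6, 7, "GPE"), (11, 13, "PERSON")]
PREDICTED = [(0, 1, "PERSON"), (4, 5, "GPE"), (6, 7, "GPE"), (12, 13, "PERSON")]


def match_type(gold, pred):
    """Classify a predicted span against a gold span as exact, partial, or miss."""
    g_start, g_end, g_label = gold
    p_start, p_end, p_label = pred
    if gold == pred:
        return "exact"
    overlaps = p_start < g_end and g_start < p_end
    if overlaps and g_label == p_label:
        return "partial"  # right label, wrong boundaries
    return "miss"


def best_match(gold, predictions):
    """Return the best outcome any prediction achieves for this gold span."""
    results = [match_type(gold, p) for p in predictions]
    for outcome in ("exact", "partial"):
        if outcome in results:
            return outcome
    return "miss"


outcomes = {" ".join(TOKENS[s:e]): best_match((s, e, label), PREDICTED)
            for s, e, label in GOLD}
exact_rate = sum(o == "exact" for o in outcomes.values()) / len(GOLD)
print(outcomes)
print(exact_rate)
```

Under strict exact-match scoring the truncated "Kelly" span counts as an error even though the label is right, so the choice between exact and partial matching changes the reported accuracy.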
c) Run NLTK’s NER model on the following sentence: “Trump is going to Paris on France’s express train with John Kelly”
i) Show your code...