You can submit the assignment in groups of 2, and I strongly suggest that you do.

Use the GUM treebank, available here:
https://github.com/UniversalDependencies/UD_English-GUM/blob/master/en_gum-ud-train.conllu
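As a starting point, here is a minimal sketch of a reader for the treebank file. It relies on the CoNLL-U layout (tab-separated columns, with the word form in column 2 and the universal POS tag in column 4). The function name `read_conllu` is hypothetical, and using the UPOS column rather than XPOS is an assumption, since the assignment does not say which tag set to use.

```python
def read_conllu(lines):
    """Yield each sentence as a list of (word, tag) pairs.

    `lines` can be an open file or any iterable of CoNLL-U lines.
    Assumption: the tag is taken from the UPOS column (index 3).
    """
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:                 # a blank line ends a sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):     # comment / metadata line
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("3.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        sentence.append((cols[1], cols[3]))
    if sentence:                     # file may not end with a blank line
        yield sentence


sample = [
    "# text = Dogs bark.\n",
    "1\tDogs\tdog\tNOUN\tNNS\t_\t2\tnsubj\t_\t_\n",
    "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\t_\n",
    "3\t.\t.\tPUNCT\t.\t_\t2\tpunct\t_\t_\n",
    "\n",
]
sentences = list(read_conllu(sample))
print(sentences)
```

To read the real data, the same function could be called on a file opened with `open("en_gum-ud-train.conllu", encoding="utf-8")`, assuming the file has been downloaded locally.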

HMMs are described well in Section 8.4 of this chapter: https://web.stanford.edu/~jurafsky/slp3/8.pdf

Components of an HMM tagger (40 points) [For everybody]

Undergrads and graduates: Use the equation in Section 8.4.3 to implement the emission and transition probabilities. If you want to add smoothing, check equation 3.23 in Chapter 3 of the book and apply it to both the transition and emission probabilities. Don't forget to add the token when computing the transition probabilities.
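The count-based estimates can be sketched as follows. This is one possible reading, not a reference solution: it assumes the token mentioned above is a start-of-sentence marker (written `<s>` here), and it assumes the smoothing in question is add-k (Laplace) smoothing. The names `train_counts`, `transition_prob`, and `emission_prob` are hypothetical. With `k=0` the functions reduce to plain relative frequencies, C(prev, tag)/C(prev) and C(tag, word)/C(tag).

```python
from collections import Counter

START = "<s>"  # assumed name for the start-of-sentence marker


def train_counts(sentences):
    """Collect the counts needed for transition and emission estimates."""
    trans = Counter()      # (previous tag, tag) -> count
    emit = Counter()       # (tag, word) -> count
    tag_count = Counter()  # tag -> count; START is counted once per sentence
    vocab = set()
    for sentence in sentences:
        prev = START
        tag_count[START] += 1
        for word, tag in sentence:
            trans[(prev, tag)] += 1
            emit[(tag, word)] += 1
            tag_count[tag] += 1
            vocab.add(word)
            prev = tag
    tags = sorted(t for t in tag_count if t != START)
    return trans, emit, tag_count, tags, vocab


def transition_prob(prev, tag, counts, k=1.0):
    """P(tag | prev) with add-k smoothing over the tag set."""
    trans, _, tag_count, tags, _ = counts
    return (trans[(prev, tag)] + k) / (tag_count[prev] + k * len(tags))


def emission_prob(tag, word, counts, k=1.0):
    """P(word | tag) with add-k smoothing over the vocabulary."""
    _, emit, tag_count, _, vocab = counts
    # One extra vocabulary slot so unseen words still get some mass.
    return (emit[(tag, word)] + k) / (tag_count[tag] + k * (len(vocab) + 1))


toy = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
counts = train_counts(toy)
print(transition_prob(START, "DET", counts, k=0))  # unsmoothed: 2/2 = 1.0
print(transition_prob(START, "DET", counts))       # add-one: 3/5 = 0.6
print(emission_prob("NOUN", "dog", counts))        # add-one: 2/8 = 0.25
```

Note that `k=0` is only safe for tags that were seen in training, since the denominator is otherwise zero.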

Greedy Tagger (60 points) [For everybody]

Implement a greedy tagger. You don't have to implement the Viterbi algorithm to find the best tag sequence. Instead, at each step, select the tag that maximizes the product of the transition probability (given the tag chosen at the previous step) and the emission probability. Think greedy!
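The greedy step can be sketched as below. The probability tables are made-up toy numbers for illustration, not estimates from GUM, and `greedy_tag`, `trans_p`, and `emit_p` are hypothetical names. The start marker `<s>` is an assumption as well.

```python
START = "<s>"  # assumed start-of-sentence marker
TAGS = ["DET", "NOUN", "VERB"]

# Toy probability tables (invented for illustration only).
TRANS = {
    (START, "DET"): 0.8, (START, "NOUN"): 0.1, (START, "VERB"): 0.1,
    ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.05, ("DET", "DET"): 0.05,
    ("NOUN", "VERB"): 0.6, ("NOUN", "NOUN"): 0.3, ("NOUN", "DET"): 0.1,
}
EMIT = {
    ("DET", "the"): 0.9,
    ("NOUN", "dog"): 0.5, ("NOUN", "barks"): 0.1,
    ("VERB", "barks"): 0.6,
}


def trans_p(prev, tag):
    # Small floor stands in for smoothing of unseen pairs.
    return TRANS.get((prev, tag), 1e-6)


def emit_p(tag, word):
    return EMIT.get((tag, word), 1e-6)


def greedy_tag(words, tags, trans_p, emit_p):
    """Pick, left to right, the tag with the highest
    transition * emission score given the previously chosen tag."""
    prev = START
    output = []
    for word in words:
        best = max(tags, key=lambda t: trans_p(prev, t) * emit_p(t, word))
        output.append(best)
        prev = best  # the choice is final; there is no backtracking
    return output


result = greedy_tag(["the", "dog", "barks"], TAGS, trans_p, emit_p)
print(result)  # ['DET', 'NOUN', 'VERB']
```

Because each choice is final, an early mistake can propagate to later positions, which is the trade-off against the Viterbi search.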

Viterbi Tagger (50 points) [For extra credit]

Implement the Viterbi tagger as given in Section 8.4.5. You need to implement the backpointers in order to output the best tag sequence.
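A minimal sketch of Viterbi decoding with backpointers follows. It works in log space to avoid underflow on long sentences, which is a design choice of this sketch rather than something the assignment requires. The probability tables are invented toy values, the names `viterbi_tag`, `trans_p`, and `emit_p` are hypothetical, and the `<s>` start marker is an assumption.

```python
import math

START = "<s>"  # assumed start-of-sentence marker
TAGS = ["DET", "NOUN", "VERB"]

# Toy probability tables (invented for illustration only).
TRANS = {
    (START, "DET"): 0.8, (START, "NOUN"): 0.1, (START, "VERB"): 0.1,
    ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.05, ("DET", "DET"): 0.05,
    ("NOUN", "VERB"): 0.6, ("NOUN", "NOUN"): 0.3, ("NOUN", "DET"): 0.1,
}
EMIT = {
    ("DET", "the"): 0.9,
    ("NOUN", "dog"): 0.5, ("NOUN", "barks"): 0.1,
    ("VERB", "barks"): 0.6,
}


def trans_p(prev, tag):
    # The floor keeps every probability positive, so log() is defined.
    return TRANS.get((prev, tag), 1e-6)


def emit_p(tag, word):
    return EMIT.get((tag, word), 1e-6)


def viterbi_tag(words, tags, trans_p, emit_p):
    """Return the highest-probability tag sequence for `words`."""
    if not words:
        return []
    # score[i][t]: best log-probability of a tag sequence ending in t at i.
    score = [{t: math.log(trans_p(START, t)) + math.log(emit_p(t, words[0]))
              for t in tags}]
    back = [{}]  # back[i][t]: best previous tag for t at position i
    for i in range(1, len(words)):
        score.append({})
        back.append({})
        for t in tags:
            best_prev = max(
                tags, key=lambda p: score[i - 1][p] + math.log(trans_p(p, t)))
            score[i][t] = (score[i - 1][best_prev]
                           + math.log(trans_p(best_prev, t))
                           + math.log(emit_p(t, words[i])))
            back[i][t] = best_prev
    # Follow the backpointers from the best final tag.
    path = [max(tags, key=lambda t: score[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


result = viterbi_tag(["the", "dog", "barks"], TAGS, trans_p, emit_p)
print(result)  # ['DET', 'NOUN', 'VERB']
```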

Reading: Read Section A.4 for worked-out examples of the Viterbi algorithm.

Don't hesitate to contact me with questions about your code. Best of luck.

Testing:

Test your tagger on the test dataset here:
https://github.com/UniversalDependencies/UD_English-GUM/blob/master/en_gum-ud-test.conllu

What are the accuracy and F-scores of your tagger? You can use sklearn's metrics module to compute them.
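To make clear what these numbers measure, here is a small dependency-free sketch that computes token-level accuracy and per-tag F1 over flattened lists of gold and predicted tags. The function names are hypothetical, the tag lists are toy data, and macro-averaging is an assumption, since the assignment does not say how the F-scores should be averaged. With scikit-learn installed, `sklearn.metrics.accuracy_score` and `sklearn.metrics.f1_score` (with its `average` parameter) cover the same ground.

```python
def accuracy(gold, pred):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def per_tag_f1(gold, pred):
    """F1 for every tag that occurs in the gold or predicted tags."""
    scores = {}
    for tag in sorted(set(gold) | set(pred)):
        tp = sum(g == tag and p == tag for g, p in zip(gold, pred))
        fp = sum(g != tag and p == tag for g, p in zip(gold, pred))
        fn = sum(g == tag and p != tag for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores[tag] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-tag F1 scores (assumed averaging)."""
    scores = per_tag_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Toy data: tags from all test sentences, flattened into one list each.
gold = ["DET", "NOUN", "VERB", "NOUN"]
pred = ["DET", "NOUN", "NOUN", "NOUN"]

print(accuracy(gold, pred))    # 3 of 4 tokens correct: 0.75
print(per_tag_f1(gold, pred))
print(macro_f1(gold, pred))
```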

Grading: You will get partial credit for any submitted work.

Answered 10 days after Apr 12, 2021

Sandeep Kumar answered on Apr 22 2021
