Step 1: lexicon and context-free grammar

Consider the following positive examples:

    Bart giggles
    Homer giggled
    Bart and Lisa drank milk
    Bart wears blue shoes
    Homer serves Lisa and Bart a healthy green salad
    Bart always drinks milk
    Lisa thinks Homer thinks Bart drinks milk
    Homer and Bart never drink milk in the kitchen after lunch
    when does Lisa drink milk in the kitchen
    when Homer drinks milk Bart giggles
    when do Lisa and Bart wear shoes
    Lisa puts the milk in the kitchen

Note that all punctuation has been removed in order to avoid complications, and we do not enforce capitalisation at the beginning of sentences.

Extend P2.fcfg with more rules so that all words from the above sentences are included. You will need to introduce more parts of speech such as Det (e.g. 'the'), Prep (e.g. 'in', 'after'), Adj (e.g. 'green'), and a few more. Note that the two occurrences of 'when' have different functions, so they need to be associated with different parts of speech.

At this point, you may not want to distinguish between singular and plural noun phrases, nor between different verb forms, nor between verbs with different subcategorisation frames.

Also add more context-free rules, so that the above sentences can be derived. Make sure the rules defining the start symbol (S) come first. (NLTK by default assumes that the first mentioned nonterminal is the start symbol.)

When designing your grammar, beware of the distinction between argument and adjunct. In the above example sentences, PP 'in the kitchen' is once an adjunct and once an argument.

Step 2: intermediate testing of the grammar

Add the above positive examples to P2.pos and add the below negative examples to P2.neg.

    when Bart giggles
    when does Lisa
    Homer puts
    Bart thinks the kitchen

(Some of the above might be grammatical in special contexts, e.g. assuming 'ellipsis', i.e. omitted phrases that are understood from the larger context of a dialogue. In this practical, we don't consider ellipsis, nor do we consider uncommon usage of words that would require a stretch of imagination to justify.)

Once more run /usr/local/python/bin/python3 P2.py.

Step 3: feature grammar

The grammar you wrote in Step 1 will likely accept some of the negative examples. This is because the following have not been modelled:

• number agreement,
• subcategorisation.

For number agreement, first remember that English has five verb forms for ordinary verbs (and a few more for to be):

• base form: to write, you write
• third person singular present: he writes
• preterite (a.k.a. simple past): wrote
• past participle: written
• present participle (a.k.a. gerund if used as noun): writing

For our simple examples, we only need the first form (used for third person plural present, and infinitive) and the second (third person singular present) and the third (preterite). Features can be added to the grammar to ensure that only the correct verb forms are allowed, and that there is number agreement for those verb forms where it is relevant.

Subcategorisation should be implemented as illustrated by the following example (which ignores the issue of number agreement):

    S -> NP VP[SUBCAT=nil]
    VP[SUBCAT=?rest] -> VP[SUBCAT=[HEAD=?arg, TAIL=?rest]] ARG[CAT=?arg]
    VP[SUBCAT=?args] -> V[SUBCAT=?args]
    ARG[CAT=np] -> NP
    ARG[CAT=pp] -> PP
    V[SUBCAT=nil] -> 'sneezes'
    V[SUBCAT=[HEAD=np, TAIL=[HEAD=pp, TAIL=nil]]] -> 'gave'
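To make the mechanics of the SUBCAT list concrete, the following is a minimal plain-Python simulation of how the VP rule consumes one argument at a time until nil remains. It is only an illustrative sketch and does not use NLTK: the names NIL, subcat, LEXICON and saturates are hypothetical helpers introduced here, and the lexicon contains just the two verbs from the example rules.

```python
# Plain-Python simulation (not NLTK) of how a SUBCAT list is consumed.
# A SUBCAT value is modelled as the string "nil" or a nested pair
# (HEAD, TAIL), mirroring [HEAD=..., TAIL=...] in the grammar fragment.

NIL = "nil"


def subcat(*cats):
    """Build a nested HEAD/TAIL structure, e.g. subcat('np', 'pp')."""
    result = NIL
    for cat in reversed(cats):
        result = (cat, result)
    return result


# Hypothetical mini-lexicon with the two verbs from the example rules.
LEXICON = {
    "sneezes": subcat(),          # V[SUBCAT=nil]
    "gave": subcat("np", "pp"),   # V[SUBCAT=[HEAD=np, TAIL=[HEAD=pp, TAIL=nil]]]
}


def saturates(verb, arg_cats):
    """Return True if the given arguments reduce the verb's SUBCAT to nil."""
    remaining = LEXICON[verb]     # VP[SUBCAT=?args] -> V[SUBCAT=?args]
    for cat in arg_cats:
        if remaining == NIL:
            return False          # the verb expects no further argument
        head, tail = remaining
        if head != cat:
            return False          # ARG[CAT=?arg] must match HEAD=?arg
        remaining = tail          # the parent VP carries SUBCAT=?rest
    return remaining == NIL       # S -> NP VP[SUBCAT=nil]


print(saturates("gave", ["np", "pp"]))  # all arguments supplied
print(saturates("gave", ["np"]))        # PP argument still missing
print(saturates("sneezes", []))         # intransitive verb, nothing to consume
```

Each loop iteration corresponds to one application of the recursive VP rule, which is why a single rule can serve verbs with any subcategorisation frame: only the lexical entries for V differ.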
    S
      NP
        he
      VP[SUBCAT=nil]
        VP[SUBCAT=[HEAD=pp, TAIL=nil]]
          VP[SUBCAT=[HEAD=np, TAIL=[HEAD=pp, TAIL=nil]]]
            V[SUBCAT=[HEAD=np, TAIL=[HEAD=pp, TAIL=nil]]]
              gave
          ARG[CAT=np]
            NP
              the bike
        ARG[CAT=pp]
          PP
            to his brother

Figure 1: Graphical representation of the parse of 'he gave the bike to his brother'. Note the two applications of VP[SUBCAT=?rest] -> VP[SUBCAT=[HEAD=?arg, TAIL=?rest]] ARG[CAT=?arg]. Also note the topmost VP has [SUBCAT=nil], which is needed to apply the rule with left-hand side S.

How the rules for subcategorisation are applied is illustrated in Figure 1. In order to handle the verbs in our example sentences, further rules for V and ARG are needed, but it should be possible to reuse the rule VP[SUBCAT=?rest] -> VP[SUBCAT=[HEAD=?arg, TAIL=?rest]] ARG[CAT=?arg] for several verbs, regardless of their subcategorisation frames.

Step 4: final testing

Again test the positive and negative examples, and verify that all positive examples are accepted, and none of the negative examples are accepted. You may add more positive and negative examples (with words in the lexicon) to convince yourself that your grammar is satisfactory.

Requirements

Submit a zipped file containing:

• P2.py (unmodified)
• P2.fcfg (extended by you)
...