Please use Ocaml programing to solve the problem. Please use Ocaml programing to solve the problem.

Please use Ocaml programing to solve the problem.
Please use Ocaml programing to solve the problem.


IIT CS440: Programming Languages and Translators Homework 2: DFAs, NFAs, Regexes, CFLs Prof. Stefan Muller TA: Xincheng Yang Out: Tuesday, Feb. 16 Due: Tuesday, Mar. 2 11:59pm CST This assignment contains 7 written tasks and 6 programming tasks, for a total of 85 points. 0 Logistics and Submission - Important The same rules as on HW0 and HW1 apply, except as specified below. 0.1 Compilation and Testing As noted on Piazza, the complicated skeleton code of this assignment will make testing in the ocaml toplevel or on TryOCaml very difficult. Instructions for testing your code are provided in the writeups of the programming problems. If you have any issues or questions, let us know or post on Piazza. 0.2 Submission Your answers to the programming problems will go in the files dfa.ml, nfa.ml and regex.ml. The other ML files are skeleton code that you will not need to modify and don’t need to submit. Unlike in previous assignments, Do not rename the .ml files. The skeleton code, and the compilation instructions in this writeup, assume the files still have the original names. As before, your written answers should be submitted as a pdf file named something like Doe_John_440_hw2.pdf. Put the four files (three .ml files and one .pdf file) into a folder named something like Doe_John_440_hw2 and zip the file to submit. Submit on Blackboard under the correct assignment. 0.3 Rules for Programming Problems Unless noted in specific problems, you do not need to program in any specific style (e.g., recursive, tail- recursive, using higher-order functions), though you are encouraged to use higher-order functions when possible as good practice and good coding style. You may (and are encouraged to) use functions from OCaml’s standard library when applicable. Docu- mentation of the standard library functions is available here: https://caml.inria.fr/pub/docs/manual-ocaml/libref/index.html 1 1 Regular Expressions and Finite Automata Task 1.1 (Written, 9 points). For each of the following language descriptions, give a regular expression that matches exactly that language. You don’t have to find the shortest possible regular expression. You may use expressions such as [a− z] to represent the letters a through z, and/or regular definitions: letter → [a− z] string → letter∗ To make things clearer, use to mean a single space (e.g. write "Hello world" instead of "Hello world".) (a) Strings over the alphabet {a, b, c, d} that are in alphabetical order. Examples: aabccc, abcd, bd, a, ad Not examples: dc, ca, cba (b) Natural numbers (0 and up), with no leading zero unless the whole thing is a single zero, and going right-to-left, groups of 3 digits are separated by commas. Examples: 0; 1; 12; 123; 1,234; 12,345; 1,000,000 Not examples: 01; 1000; 123,4; 12,34 (c) Strings consisting of letters a-z and spaces that do not begin or end with whitespace and all white space is one space long. Example: "so this is ok" Not examples: "this is bad", " this too", "and this " Task 1.2 (Written, 10 points). For each of the following NFAs, identify the language accepted by the NFA (describe it or give a regular expression). (a) 0start 1 3 2 4 � � a b a b (b) 0start 1 2 3 a, b a b b 2 1.1 NFA to DFA Consider the following NFA, which accepts strings that: • begin with b and end with b or aa (with 0 or more as and bs in between) or • begin and end with a (with 0 or more as and bs in between). (convince yourself this is the lanugage recognized by the NFA). 0start 1 2 3 4 5 6 7 8 9 10 11 � � b � � � b a a a a a, b a, b Task 1.3 (Written, 6 points). Recall that �−closure(s) for a state s is the set of states reachable from s using only �-transitions, and that this idea of “reachability” is transitive, that is, if there is an �-transition from s to s′ and one from s′ to s′′, then s′′ is in �−closure(s). Compute (a) �−closure(0) (b) �−closure(3) (c) �−closure(2) Task 1.4 (Written, 10 points). Use the subset construction to convert the NFA above into a DFA. You may present the resulting DFA using either a transition diagram or a transition table. Recall that, in the subset construction, states of the DFA correspond to sets of states of the NFA. Label your DFA states with the set of NFA states, e.g. {1, 2, 3}. The start state should be �−closure(0). Task 1.5 (Written, 5 points). Why does the DFA you constructed above not have an “error” state? (Think about when, during scanning an input, we would realize that a string cannot be in this language.) 2 DFA Simulator In the next section, we’re going to build a regular expression matcher, which requires simulating an NFA. As a warmup, in this section, you’re going to simulate a DFA, which is a little easier. Your code for this section will be in dfa.ml. At the top, we give a few type definitions for DFAs: states and symbols are just integers and characters, respectively, but we still define them using “type synonyms”: type state = int 3 This defines a type state that’s just the same as int, but we can refer to state to make it clearer when we mean a state in the code. (This is also good software engineering practice because if we decide we don’t want to just represent states using ints later, we can change it easily.) DFAs are implemented as records: type dfa = { states : int; (* Number of states *) (* States are numbered 0 through states * State 0 is the start state *) accept : state list; (* List of accept states *) alpha : symbol list; (* Alphabet of the DFA *) trans : (state * symbol * state) list (* List of transitions in the form * (from state, transition symbol, to state) *) } For example, we could represent the DFA 0start 1 2 3 b a a b b a a b with the record { states = 4; (* 4 states, 0-3 *) accept = [2; 3]; (* states 2 and 3 are accepting *) alpha = [’a’; ’b’]; (* alphabet is a, b *) trans = [(0, ’b’, 0); (0, ’a’, 1); (1, ’a’, 1); (1, ’b’, 2); (2, ’b’, 2); (2, ’a’, 3); (3, ’a’, 3); (3, ’b’, 2)] } Writing large DFAs like that is hard though, so we’ve provided functions that parse DFAs and inputs from input files. The format of the input file is as follows: An example is in examples/dfa1. You may want to write your own examples for testing. The function Parser.parse_file : string -> dfa * symbol list takes a filename and returns a pair of the DFA and the input. The parsing code is in a different file, so you won’t be able to use it by just copying and pasting code into the toplevel or TryOCaml. You can still test using asserts, though. For example: assert (transition (fst (Parser.parse_file "examples/dfa1")) ’a’ 0 = 1) You can then check if your assert passes by compiling the file and running it: ocamlc -o dfa_test parseDFA.ml dfa.ml ./dfa_test 4 If you don’t see any errors, then your assertion passed. You can also test that your whole DFA simulator works using the instructions at the end of this section. Task 2.1 (Programming, 5 points). Implement the function transition : (state * symbol * state) list) -> symbol -> state -> state. transition d.trans sy st should return the new state that we’ll be in if we see symbol sy while in state st, where d.trans is the list of transitions of the DFA (in the form above). For example, if d is the DFA above, • transition d.trans ’a’ 0 = 1 • transition d.trans ’b’ 1 = 2 You should raise the exception IllformedInput if the state or symbol isn’t in the transition list (e.g. if st is 5 or sy is ’q’). The exception IllformedInput takes a string, so you can output a helpful message that will be printed, e.g. raise (IllformedInput (Printf.sprintf "State, symbol pair (%d, %c) not in transitions" state symb)) Now you’ll write the function that actually simulates the DFA. Applying dfa_sim dfa st input sim- ulates execution of a DFA starting in state st on the input input. It should return true if we end up in an accepting state, and false otherwise. We should get an IllformedInput exception if something is wrong (e.g., a transition is missing from dfa.trans or we see a symbol not in dfa.alpha). The initial state of the DFA is always state 0, but you’ll probably want dfa_sim to call itself recursively when you’re in a new state, so dfa_sim takes the current state as an argument. Task 2.2 (Programming, 8 points). Implement the function dfa_sim: dfa -> state -> symbol list -> bool as described above. Re- member that a DFA only accepts or rejects once it has read all of the input: we can pass through an accepting state, then read more input, and end up in a rejecting state. 2.1 Testing Run make dfa to compile the whole DFA simulator. If you don’t have make installed, you can run ocamlc -o dfa parseDFA.ml dfa.ml dfaMain.ml instead. Then run ./dfa to parse the DFA and input from the given file and run your DFA simulator on it. The program will print either “ACCEPTED” or “REJECTED” depending on whether your simulator returns true or false. 3 Regular Expression Matcher Lots of software has to solve the problem of deciding whether or not a piece of text matches a given regular expression. In fact, the find/replace feature of the editor you’re using to write your code probably implements it. In this section, you’ll build part of a regular expression matcher. To do this, you’ll write code that simulates an NFA (recall that we can “compile” any regular expression into an NFA that determines whether input matches it). 3.1 NFA Simulator You’ll implement the NFA simulator in nfa.ml. This file already includes type definitions for NFAs. They’re similar to the types for DFAs above, with the difference that the type
May 19, 2022
SOLUTION.PDF

Get Answer To This Question

Submit New Assignment

Copy and Paste Your Assignment Here