Static dictionary techniques are characterized by using a predefined dictionary of patterns encoded with unique codewords. After a dictionary is established, the problem of using it most efficiently...




Static dictionary techniques are characterized by using a predefined dictionary of patterns encoded with unique codewords. After a dictionary is established, the problem of using it most efficiently still remains. For example, for a dictionary = {ability, ility, pec, re, res, spect, tab}, the word respectability can be broken down in two ways: res, pec, tab, ility and re, spect, ability; that is, the first division requires four codewords for this word, whereas the second requires only three. The algorithm parses the word or words and determines which of the two choices will be made. Of course, for a large dictionary, there may be more than two possible parsings of the same word or phrase. By far the most frequently used technique is a greedy algorithm that finds the longest match in the dictionary. For our example, the match res is longer than re; therefore, the word respectability is divided into four components with the greedy strategy. An optimal parsing can be found by adapting a shortest path algorithm. (Bell, Cleary, and Witten, 1990; Schuegraf and Heaps, 1974). Write a program that for a dictionary of patterns compresses a text file. For each string s, create a digraph with s.length nodes. Edges are labeled with the dictionary patterns, and their codeword lengths are edges’ costs. Two nodes i and j are connected with an edge if the dictionary contains a pattern s.charAt(i) . . . s.charAt(j–1). The shortest path represents the shortest sequences of codewords for patterns found in the path.




Dec 15, 2021
SOLUTION.PDF

Get Answer To This Question

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here