I just need help with the second part of stage 1, the part of the document here is attached the files of the project.
1 Specification Version: March 3, 2019 Data Structures Semester Project, Spring 2019 Over the course of the semester, students will build a very simple search engine. The project is split into multiple stages, each stage due at a different point in the semester. Each stage depends on the successful completion of the stages that came before it, so do not fall behind! In addition to the primary focus, i.e. understanding and using data structures, the project also will teach some basic software engineering skills. Learning Goals 1. Implement your first data structures, thus beginning down the path of more sophisticated design and software engineering. 2. Implement a general-purpose class and then using it for a specific application. This is a small step from a code perspective, but a giant leap in terms of how you design your code. This is you first foray into modular design. 3. Experience with, and competency in, professional software engineering skills and tools: a. Create your first Maven project for building your system and for dependency management. Real-world software is built using build systems, not javac at the command line. b. First exposure to independently learning about and reusing third party libraries – Apache Commons Compress 4. Build your first real piece of software (a search engine) 5. Teach yourself, and use, more advanced features of the Java language. Requirements for Your GitHub Repository Structure and Project Submissions You must layout the folders of your YU GitHub repository as described below. Failure to do so may result, at my discretion, in you getting a zero. 1. Under the root of your repository there MUST be a directory named “DataStructures”, NO SPACES. a. For example, if your repository is “UncleMoishe”, then the following directory must exist: https://github.com/Yeshiva-University-CS/UncleMoishe/DataStructures 2. Under DataStructures, you will create a subdirectory called “project”, like this: https://github.com/Yeshiva-University-CS/UncleMoishe/DataStructures/project 3. Under project, you will create separate folders for each stage of the project. So, assuming we have 5 stages in this project, you will create the following subdirectories: a. https://github.com/Yeshiva-University-CS/UncleMoishe/DataStructures/project/stage1 b. https://github.com/Yeshiva-University-CS/UncleMoishe/DataStructures/project/stage2 2 Specification Version: March 3, 2019 c. https://github.com/Yeshiva-University-CS/UncleMoishe/DataStructures/project/stage3 d. https://github.com/Yeshiva-University-CS/UncleMoishe/DataStructures/project/stage4 e. https://github.com/Yeshiva-University-CS/UncleMoishe/DataStructures/project/stage5 4. When you check your code in for each stage, you will check kin the pom.xml file and the java code in the standard maven project layout. That means that under each “stage” directory listed above, the following will exist: a. Your Maven build file - pom.xml b. Your project code - src/main/java/ c. Your project Junit test code - src/test/java 5. Even when/if I do not require any specific tests as part of the project, I urge you to crate unit tests to test your code and check them in. This will both help you do a better job, and also make it more likely that you get credit for parts that work even if you don’t get a given stage 100% correct. General Points About Your Code The following rules apply to all stages of the semester project: 1. You may not use any static methods in your code besides public static void main. Every other method must be an instance method. You are writing object oriented code, not old-fashioned procedural code. 2. Your code may not have any “monster” methods; no method in your code may be longer than 30 lines (not counting comments.) Get used to breaking logic down into smaller chunks, i.e. methods that you call from within another method. 3. You must use Maven for building your code and for dependency declaration and resolution. Do not manually download JAR files for Apache Commons Compress or Junit. Do not manually add them to your classpath. Only declare them as a Maven dependency, and Maven will download them for you. Stage 1: Build an In-Memory Document Store Due: Sunday, March 17, 11:59 PM EST Description In stage 1 you are building a very simple storage mechanism with no search capability, only “get” and “put.” Documents are stored in memory in a HashTable, and can only be retrieved using the key with which they are stored. Logic Requirements 1. Implement a HashTable from Scratch Implement a hash table, with separate chaining to deal with collisions. You may not use any Java collection classes, rather you must implement the hash table yourself. Your hash table class must be a general purpose class, i.e. not specific to any application or key- value types 3 Specification Version: March 3, 2019 You must implement the interfaces defined in the code posted on Piazza. Other than constructors, you may not have any public methods in your implementations of these interfaces that are not specified in the interfaces. Any additional methods you add must be private. 2. Implement a Document Store As specified in the API, your DocumentStore provides a public API to: o put documents in the store o get documents out of the store, either as a plain String or as a byte[] which is the compressed version of the String o set the default compression format/algorithm to use if one is not specified when putting a document o delete a document Your document store will use an instance of your HashTable implementation to store documents in memory Your code will receive documents as an InputStream, and the document’s key as an instance of java.net.URI. When a document is added to your document store, you must do the following: o Read in the entire contents of the document from the InputStream as a byte[], and then create a String from the byte[] which you read. o Create a hash code of the string. o If a doc already is present in the HashTable with this URI and the same hash code, return from your method (i.e. stop here), since this is the same key-value pair you already have stored and nothing more needs to be done. o Compress the document using the format specified by the caller Must support zip, 7z, gzip, jar, bzip2, all using the Apache Commons Compress library o If the caller does not specify a compression format, use the default compression format o Create a document object that holds the following: Compressed file as a byte[] Hash code URI Compression format (one of the values of DocumentStore.CompressionFormat) o Insert document object into the hash table with URI as the key and document object as the value The default compression format is ZIP unless setDefaultCompressionFormat is called to set it to something else For further information about the requirements for put, get, and delete, see the comments on the Java interface Stage 2: Add Undo Support to the Document Store Using a Stack Due: Sunday, March 31, 11:59 PM EST 4 Specification Version: March 3, 2019 Description In this stage you add support for two different types of undo 1) undo the last action, no matter what document it was done on 2) undo the last action on a specific document. You will also get your first experience with functional programming in Java, as well as Logic Requirements 1. Add Undo via a Command Stack 1. Every put and delete done on your DocumentStore must result in the adding of a new Command instance to a single Stack which serves as your command stack a. No other class besides your document store may have any direct references to the stack; it should be a private field within the DocumentStore b. You must use the supplied Command class to model commands. You may not alter or subclass Command. 2. If a user calls undo(), then your DocumentStore must undo the last command on the stack 3. If a user calls undo(URI),then your DocumentStore must undo the last command on the stack that was done on the Document whose key is the given URI, without having any permanent effects on any commands that are on top of it in the command stack. 4. In order to undo and/or redo, the DocumentStore must use Command.undo and Command.redo methods; it may not implement the actual undo or redo logic, although it must manage the command stack itself and determine which undo or redo on which Command to call. 2. Undo/Redo Logic There are two cases you must deal with to undo a put: o The put added a brand new Document to the DocumentStore o The put resulted in overwriting an existing Document with the same URI in the DocumentStore To undo a delete, you put whatever was deleted back into the DocumentStore DO NOT add any commands to the command stack in your undo/redo logic Redo IS NOT part of the public API – it is only there to assist in your undoing a command which is not at the top of the command stack. 3. Functional Implementations for Undo and Redo As stated above, every put and delete done on your DocumentStore must result in the adding of a new Command onto your command stack. Undo and redo must be defined as lambda functions that are passed in to the Command constructor. You must read Chapter 17 in “Java How to Program” (by Deitel) to learn about lambdas, closures, and functions. (Alternatively, you can read Modern Java Recipes.) Once you learn about lambdas and closures, you should understand how to implement your undo and redo as lambda functions. 5 Specification Version: March 3, 2019 Stage 3: Keyword Search Using a Trie Due: Sunday, April 14, 11:59 PM EST Stage 4: Memory Management, Part 1: Tracking Document Usage via a Heap Due: Sunday, May 5, 11:59 PM EST Stage 5: Memory Management, Part 2: Two Tier Storage (RAM and Disk) Using a Btree Due: Friday, May 31, 5:00 PM EST