Can anyone help me with this?
CS 7071: Advanced Database Management
Programming Assignment (PA) 3
Seokki Lee

Record Manager
§ Handles tables with a fixed schema.
§ Clients can insert, delete, and update records, and scan through the records in a table.
§ A scan is associated with a search condition and only returns records that match the search condition.
§ Each table should be stored in a separate page file, and your record manager should access the pages of the file through the buffer manager (implemented in PA2).

Record Manager
§ Record Representation
  • The size of a record is fixed for a given schema.
    § The data types we consider for this PA are all fixed length.
§ Page Layout
  • Define how to lay out records on pages.
  • You need to reserve some space on each page for managing the entries on the page.
§ Table information pages
  • Reserve one or more pages of a page file to store, e.g., the schema of the table.

Record Manager
§ Record IDs
  • A combination of page and slot number (as discussed in class).
§ Free Space Management
  • You need to track available free space on pages (e.g., for deletion).
  • Option 1: Link pages with free space by reserving space on each page for a pointer to the next free space. One of the table information pages can hold a pointer to the first page with free space.
  • Option 2: Use several pages to store a directory that records how much free space each page has.

Tables.h
§ Basic data structures for schemas, tables, records, record ids (RIDs), and values.
§ Functions for serializing these data structures as strings
  • The serialization functions are provided in rm_serializer.c.
§ Four data types can be used for records of a table:
  • integer (DT_INT), float (DT_FLOAT), strings of a fixed length (DT_STRING), and boolean (DT_BOOL)
§ A record is simply a record id (consisting of a page number and slot number) and the concatenation of the binary representations of its attributes according to the schema (data).

Schema
§ A schema consists of a number of attributes (numAttr).
§ For each attribute, it records the name (attrNames) and data type (dataTypes).
  • For attributes of type DT_STRING, we record the size of the strings in typeLength.
§ A schema can have a key defined.
  • The key is represented as an array of integers that are the positions of the attributes of the key (keyAttrs).
    § For example, for a relation R(a,b,c) with key 'a', keyAttrs would be [0].

Data Types and Binary Representation
§ Values of a data type are represented using the Value struct.
  • The Value struct represents the values of a data type using standard C data types.
    § For example, a string is a char * and an integer is a C int.
§ Values are only used for expressions and for returning data to the client of the record manager.
§ Attribute values in records are stored slightly differently if the data type is string.
  • In C, a string is an array of characters terminated by a 0 byte.
  • In a record, strings are stored without the additional 0 byte at the end, e.g., strings of length 4 should occupy 4 bytes in the data field of the record.

Expr.h
§ Defines data structures and functions for dealing with expressions for scans; these are implemented in expr.c.
§ Expressions can be constants (stored as a Value struct), references to attribute values (represented as the position of an attribute in the schema), and operator invocations.
§ Operators are either comparison operators (equals and smaller), which are defined for all data types, or boolean operators (AND, OR, and NOT). Operators have one or more expressions as input.
§ The expression framework allows for arbitrary nesting of operators as long as their input types are correct.
  • For example, you cannot use an integer constant as an input to a boolean AND operator.

Record Manager Functions (Record_mgr)
§ Table and Record Manager Functions
  • There are functions to initialize and shut down a record manager. Furthermore, there are functions to create, open, and close a table.
  • createTable should create the underlying page file and store information about the schema, free space, and so on in the table information pages.
  • All operations on a table, such as scanning or inserting records, require the table to be opened first. Clients can then use the RM_TableData struct to interact with the table.
  • closeTable should cause all outstanding changes to the table to be written to the page file.
  • The getNumTuples function returns the number of tuples in the table.

Record Manager Functions (Record_mgr)
§ Record Functions
  • These functions are used to retrieve a record with a certain RID, to delete a record with a certain RID, to insert a new record, and to update an existing record with new values.
  • When a new record is inserted, the record manager should assign an RID to this record and update the record parameter passed to insertRecord.

Record Manager Functions (Record_mgr)
§ Scan Functions
  • A client can initiate a scan to retrieve all tuples from a table that fulfill a certain condition (represented as an Expr).
  • Starting a scan initializes the RM_ScanHandle data structure passed as an argument to startScan. Afterwards, calls to the next method should return the next tuple that fulfills the scan condition.
    § If NULL is passed as a scan condition, then all tuples of the table should be returned.
    § next should return RC_RM_NO_MORE_TUPLES once the scan is completed and RC_OK otherwise (unless an error occurs).
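
The scan protocol described above can be illustrated with a minimal, in-memory sketch. It is not the record_mgr.h interface: every Toy* type and toy* function is hypothetical, the condition is a plain C predicate standing in for an Expr, the pages live in an array instead of going through the buffer manager, and the numeric values of the return codes are assumed. It only shows how a cursor over (page, slot) pairs can skip free slots, treat a NULL condition as "match everything", and report the end of the scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return-code names come from the assignment text; the numeric values
   are assumptions made for this sketch. */
typedef enum { RC_OK = 0, RC_RM_NO_MORE_TUPLES = 1 } RC;

/* Hypothetical fixed-size record: an int followed by a 4-byte string
   stored WITHOUT a terminating 0 byte. */
enum { STR_LEN = 4, REC_SIZE = (int)sizeof(int) + STR_LEN,
       SLOTS_PER_PAGE = 4, NUM_PAGES = 2 };

typedef struct { int page; int slot; } RID;

typedef struct {
    bool used[SLOTS_PER_PAGE];            /* per-page slot bookkeeping */
    char data[SLOTS_PER_PAGE][REC_SIZE];  /* fixed-size record slots   */
} ToyPage;

typedef struct { ToyPage pages[NUM_PAGES]; } ToyTable;

/* Stand-in for an Expr condition: a predicate over the record bytes. */
typedef bool (*ToyCond)(const char *recData);

typedef struct {
    const ToyTable *table;
    ToyCond cond;   /* NULL means "return every tuple" */
    RID pos;        /* next slot the scan will inspect */
} ToyScan;

/* Insert into the first free slot and report the assigned RID. */
static bool toyInsert(ToyTable *t, int a, const char *s, RID *rid) {
    for (int p = 0; p < NUM_PAGES; p++) {
        for (int sl = 0; sl < SLOTS_PER_PAGE; sl++) {
            if (t->pages[p].used[sl]) continue;
            char *d = t->pages[p].data[sl];
            size_t n = strlen(s) < STR_LEN ? strlen(s) : STR_LEN;
            memcpy(d, &a, sizeof(int));
            memset(d + sizeof(int), 0, STR_LEN);  /* pad short strings */
            memcpy(d + sizeof(int), s, n);        /* no 0 terminator   */
            t->pages[p].used[sl] = true;
            rid->page = p;
            rid->slot = sl;
            return true;
        }
    }
    return false;  /* no free slot left */
}

static void toyStartScan(const ToyTable *t, ToyScan *scan, ToyCond cond) {
    scan->table = t;
    scan->cond = cond;
    scan->pos.page = 0;
    scan->pos.slot = 0;
}

/* Copy the next matching record into out, or signal the end of the scan. */
static RC toyNext(ToyScan *scan, RID *rid, char *out) {
    assert(scan != NULL && rid != NULL && out != NULL);
    while (scan->pos.page < NUM_PAGES) {
        RID cur = scan->pos;
        /* Advance the cursor first so the next call resumes after cur. */
        if (++scan->pos.slot == SLOTS_PER_PAGE) {
            scan->pos.slot = 0;
            scan->pos.page++;
        }
        const ToyPage *pg = &scan->table->pages[cur.page];
        if (!pg->used[cur.slot]) continue;                       /* free slot */
        if (scan->cond != NULL && !scan->cond(pg->data[cur.slot])) continue;
        memcpy(out, pg->data[cur.slot], REC_SIZE);
        *rid = cur;
        return RC_OK;
    }
    return RC_RM_NO_MORE_TUPLES;
}

/* Example condition: the int attribute is smaller than 10. */
static bool lessThanTen(const char *recData) {
    int a;
    memcpy(&a, recData, sizeof(int));
    return a < 10;
}

/* Typical client loop: call toyNext until the scan is exhausted. */
static int toyCount(const ToyTable *t, ToyCond cond) {
    ToyScan scan;
    RID rid;
    char buf[REC_SIZE];
    int n = 0;
    toyStartScan(t, &scan, cond);
    while (toyNext(&scan, &rid, buf) == RC_OK) n++;
    return n;
}
```

With a zero-initialized ToyTable holding records whose int attributes are 3, 42, and 7, toyCount(&t, NULL) visits all three records, while toyCount(&t, lessThanTen) visits only the two with values 3 and 7. A real implementation would pin and unpin pages through the buffer manager inside the loop instead of reading from an array.
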
Record Manager Functions (Record_mgr)
§ Schema Functions
  • Helper functions that are used to return the size in bytes of records for a given schema and to create a new schema.

Record Manager Functions (Record_mgr)
§ Attribute Functions
  • Functions that are used to get or set the attribute values of a record and to create a new record for a given schema.
  • createRecord should allocate enough memory for the data field to hold the binary representations of all attributes of this record, as determined by the schema.

Optional Extensions
§ TIDs and tombstones
  • Implement the TID and tombstone concepts introduced in class.
§ Null values
  • Add support for SQL-style NULL values to the data types and expressions. This requires changes to the expression code, values, and binary record representation.
§ Check primary key constraints
  • On inserting and updating tuples, check that the primary key constraint for the table holds, i.e., check that no record with the same key attribute values as the new record already exists in the table.

Optional Extensions
§ Ordered scans
  • Add a parameter to the scan that determines the sort order of results, i.e., you should pass a list of attributes to sort on.
  • Try to implement this using external sorting, so you can sort arbitrarily large data.
§ Interactive interface
  • Implement a simple user interface.
  • A user should be able to define new tables, insert, update, and delete tuples, and execute scans.

Optional Extensions
§ Conditional updates using scans
  • Extend the scan code to support updates.
  • Add a new method updateScan that takes a condition (expression), which is used to determine which tuples to update, and a pointer to a function that takes a record as input and returns the updated version of the record.
    § That is, you should implement a function that updates the record values and then pass this function to updateScan.
  • Alternatively, extend the expression model with new expression types (e.g., adding two integers) and let updateScan take a list of expressions as a parameter.
    § In this case, the new values of an updated tuple are produced by applying the expressions to the old values of the tuple.

Source Code Structure
§ Your source code directories should be structured as follows.

Source Code Structure
§ You should reuse your storage and buffer manager implementation.
  • Please copy your code from PA1 and PA2 to the PA3 directory.
§ Submission
  • Put all source files in a folder assign3.
  • Push it to your git repository.
  • The folder should contain at least:
    § The provided headers and C files
    § A README.txt file that describes your solution
    § A set of *.c and *.h files implementing the storage manager
    § A Makefile for building your code

Test Cases
§ Test_helper.h
  • Defines several helper methods for implementing test cases, such as ASSERT_TRUE.
§ Test_expr.c
  • This file implements several test cases using the expr.h interface.
  • Please have your Makefile generate a test_expr binary for this code.
§ Test_assign3_1.c
  • This file implements several test cases using the record_mgr.h interface.
  • Please have your Makefile generate a test_assign3 binary for this code.

Resources
§ Make use of existing debugging and memory checking tools
  • Valgrind: https://valgrind.org/
  • GDB for Linux: https://www.gnu.org/software/gdb/
  • LLDB for Mac: https://lldb.llvm.org/

Testing Environment
§ Test your code on Linux (e.g., Ubuntu) or Mac.
  • Cygwin could be an option for Windows, but it is not fully tested…
§ Run your code on the command line.
  • Use the provided Makefile.

Grading Aspects
§ Functionality
  • Does the code function correctly?
  • Do all the tests pass?
§ Documentation
  • How well are the code and README file documented?
§ Code organization
  • Is the structure of the code clear? (Please comment!)
  • Does the organization suit the goal?
§ Do not forget the optional extensions for bonus points!

Due and Late Policy
§ Due date
  • Apr 17th by 11:59 pm EST
  • All your code MUST be pushed to the repository.
§ Late policy
  • -10% per day
  • No exceptions unless there is a health issue