Assignment must be done in the C language.
Laboratory Assignment 2 CS 340d This laboratory concerns developing and observing properties about memory allocation and a memory garbage collector (GC). Many computer algorithms use memory in a temporary manner, and then return it for later use by other algorithms. Likely, the most familiar form of memory reuse is a stack, such as used to implement function call and return of typical C-language programs. When a function returns, the memory storage associated with a called function is ``returned'' and may be reused by other algorithms. To manage persistent but dynamically storage structures, many languages provide automated memory allocation and collection mechanisms. Other than stack-based allocation, the C Language does not provide storage allocation primitives. However, some automated storage allocation/collection mechanisms are provided by C++ and Java. In this laboratory, we will explore such allocation and collection mechanism. Systems that provide automatic collection of unreachable storage usually provide a ``garbage collector'' that recycles storage for its repeated use. In this laboratory, we will implement a copying-style garbage collector. As a part of this laboratory, students are tasked to specify invariants that will help assure the correctness of their implementations. Laboratory Requirements This laboratory involves implementing a garbage collector. For the interested student, there is an opportunity to convert their garbage collector into a real-time garbage collector. Your garbage collector should be based on the algorithm that can be found in Henry Baker's 1978 CACM article titled: "List Processing in Real-Time on a Serial Computer", CACM, 21(4):280-283, doi: 10.1145/359460.359470. This laboratory requires you to implement the CONS, CAR, CDR, RPLACA, RPLACD, EQ, and ATOM primitives; in turn, these primitives can be used to implement a Lisp system, or a BDD package. You are encouraged to use Baker's article (mentioned above) as a guide to implementing your storage management system. What do these seven functions do? CONS( x, y ) -- Forms a single object from objects x and y CAR( p ) -- When p references a pair x, y, it returns x CDR( p ) -- When p references a pair x, y, it returns y RPLACA( x, a ) -- Replace x of pair x, y, with a RPLACD( x, b ) -- Replace x of pair x, y, with b EQ( r, s ) -- Do r and s reference the same object? ATOM( k ) -- Does k reference a non-pair? Your program should allow a command-line argument to specify the number maximum number of CONS cells that can be stored by your allocation and garbage-collection system; your system should allow for at least 100,000,000 CONS cells. The only atoms that your system needs to implement are 60-bit integers and the constants T and NIL. (Note: a Lisp system would implement additional types of atoms, such as unbounded integers, rationals, symbols, strings, etc.) We specify 60-bit integers as this leaves four bits for type information. The constants T and NIL can be implemented using type information alone. Some familiarity with Lisp may help provide context. Using the functions above, your implementation should also provide: EQUAL( x, y ) -- Returns T, when x and y are identical objects; otherwise, NIL You should be very careful when implementing your project. You should include an invariant that insures that the total amount of reachable space plus the total amount of free space equal the amount of space available for CONS cells. This invariant can be more subtle to define than you might think. On April 1st (uh oh), we will release some list-processing functions (e.g., list append, list reverse, ...) that you will implement as C functions that make reference to the subroutines you implement for the seven functions (loosely) defined above. On April 8th, we will release some tougher tests for your allocator and GC system. These tests will stress the integrity of your allocator and garbage collector. Laboratory Documentation Finally, for the writing component, you need to include in your solution program a 130-line to 150-line description of your implementation. This description should be included as a C-language comment that begins with a line containing only "/*" and ends with a line containing only " */", and written in the (approximate) format of a typical Linux manual entry. Remember, this class carries a writing flag, and this kind of summary will be required for all of the class laboratory assignments. Grading Your laboratory will be graded with the following weights: ?0% - A correctly functioning allocator and garbage collector; your implementation must implement all seven primitives. As noted above, we will release list processing functions that must be implemented. And, your program is expected to operate correctly on our stress tests. 30% - Written description of your solutions. Extra Credit 25% - A correctly functioning, real-time (RT) garbage collector. This incremental RT GC needs to work as seamlessly as the copying collector you have already implemented. Programming Techniques S. L. Graham, R. L. Rivest Editors List Processing in Real Time on a Serial Computer H e n r y G . B a k e r , J r . M a s s a c h u s e t t s I n s t i t u t e o f T e c h n o l o g y A real-time list processing system is one in which the time required by the elementary list operations (e.g. CONS, CAR, CDR, RPLACA, RPLACD, EQ, and ATOM in LISP) is bounded by a (small) constant. Classical implementations of list processing systems lack this property because allocating a list cell from the heap may cause a garbage collection, which process requires time proportional to the heap size to finish. A real-time list processing system is presented which continuously reclaims garbage, including directed cycles, while linearizing and compacting the accessible cells into contiguous locations to avoid fragmenting the free storage pool. The program is small and requires no time-sharing interrupts, making it suitable for microcode. Finally, the system requires the same average time, and not more than twice the space, of a classical implementation, and those space requirements can be reduced to approximately classical proportions by compact list representation. Arrays of different sizes, a program stack, and hash linking are simple extensions to our system, and reference counting is found to be inferior for many applications. Key Words and Phrases: real-time, compacting, garbage collection, list processing, virtual memory, file or database management, storage management, storage allocation, LISP, CDR-coding, reference counting. CR Categories: 3.50, 3.60, 3.73, 3.80, 4.13, 4.22, 4.32, 4.33, 4.35, 4.49 General permission to make fair use in teaching or research of all or part of this material is granted to individual readers and to nonprofit libraries acting for them provided that ACM's copyright notice is given and that reference is made to the publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Association for Computing Machinery. To otherwise reprint a figure, table, other substantial excerpt, or the entire work requires specific permission as does republication, or systematic or multiple reproduc- tion. This research was supported by the Advanced Research Projects Agency of the Department of Defense and was monitored by the Office of Naval Research under contract number N00014-75-C-0522. Au- thor's address: Computer Science Department, University of Rochester, Rochester, NY 14627. © 1978 ACM 0001-0782/78/0400-0280 $00.75 280 1. Introduction and Previous Work List processing systems such as LISP [25] have slowly gained popularity over the years in spite of some rather severe handicaps. First, they usually interpreted their programs instead of compiling them, thus increasing their running time by several orders of magnitude. Sec- ond, the storage structures used in such systems were inefficient in the use of storage; for example, compiling a program sometimes halved the amount of storage it occupied. Third, processing had to be halted periodically to reclaim storage by a long process known as garbage collection, which laboriously traced and marked every accessible cell so that those inaccessible cells could be recycled. That such inefficiencies were tolerated for so long is a tribute to the elegance and productivity gained by programming in these languages. These languages freed the programmer from a pr imary concern: storage man- agement. The programmer had only to call CONS (or its equivalent) to obtain a pointer to a fresh storage block; even better, the programmer had only to relinquish all copies of the pointer and the storage block would auto- matically be reclaimed by the tireless garbage collector. The programmer no longer had to worry about prema- turely freeing a block of storage which was still in use by another part o f the system. The first problem was solved with the advent of good compilers [27, 32] and new languages such as S IMULA especially designed for efficient compilation [1, 5, 14]. The second was also solved to some extent by those same compilers because the user programs could be removed from the list storage area and freed from its inefficient constraints on representation. 1 Other techniques such as compact list representation ("CDR-coding") [12, 19] have been proposed which also offer partial solutions to this problem. This paper presents a solution to the third problem of classical list processing techniques and removes