ASSIGNMENT INSTRUCTIONS
Information:
Hashing is a very common technique used in many areas including cryptography.
You will be implementing a parallel hash table of integers with chaining mechanism used as collusion resolution.
A hash table looks like this:
A hash table is a way to combine the advantages of random access (arrays) retrieval with a better insertion/deletion than arrays.
In a hash table, the key is being first processed through hashing then the program is attempting to store/find it in an array (the hash table). If the item can be stored, it is done. If it can’t, the situation is called a collusion. There are many collusion resolutions, and we will use chaining, i.e. a linked list of items is used to store the overflow of that particular place. Note that the elements are not sorted inside the linked list.
When deleting an element from the hashing table, a similar process is followed. The element is first hashed, then the program looks for it in the correct hashing cell. If the element is found, the first occurrence is removed from the list. If it is not found, nothing happens.
The hashing table will contain 13 spots.
The hashing technique will be the following: number modulo 13
The elements will be read from a file (hashInputDOS.txt or hashInputUnix.txt) and a thread will be created for each element placement in the hash table. Note that this will happen in parallel, i.e. the reading will be continuous, and the thread will be created as soon as the element is read. Your program CANNOT wait until a thread is terminated to start a new thread.
The file will contain the following structure: ( the// are explaining what it is to be done)
Blank lines // nothing
Comments starting with #
ADD number //adds the number to the hash table
DELETE number // will delete the first occurrence of the number in the hash table
PRINT number // will print all the content of the hash table at number modulo 13.
Assignment programming requirements:
You will *use one or more arrays, and pointer arithmetic.
You will *use one or more structure or enums
Your loops will be well structured, i.e. no infinite loops, no returns inside the loops
You will not have any breaks aside from within a switch statement, no gotos, no labels, no continue.
You will use the right type of loop for the job and you will use it appropriately, i.e. if it is a pretest loop, it is primed. If it is an indefinite loop, you will use an indefinite loop
You will use descriptive variable names and descriptive function names.
You will have more than 2 .h files, and more than 2 .c files, and they will have a header
You will create complex functions (i.e. they will have pointers parameters, return something…).
Your program will *use threads and run in parallel
* when I say use, your program will either not compile or not work if I remove the item.
Hints:
Debugging in parallel is very complex. Entire theses have been written on it. In this particular case, you will have a shared resource, which is (part of ) the hashing table.
I highly recommend you work and test as you go.
Here is what my personal approach would be:
Writing the hashing function, check it.
Write the reading from the file and check that you get what you expect
Write the chaining part so that you can store/delete the value in the right spot.
Create your threads but have them work one at a time first (so it will be serial)
Examine your shared resource. What do you want to protect and when?
Parallelize the whole thing, i.e. have no restriction on the creation of threads.
Penalties:
Lack of readability (indentation) 5 points
Goto, breaks in loops, continue 5 points.
Your loops will be well structured, i.e. no infinite loops, no returns inside the loops. 5 points.
You will use the right type of loop for the job and you will use it appropriately, i.e. if it is a pretest loop, it is primed. If it is an indefinite loop, you will use an indefinite loop
Your functions will be well structured, i.e. one entry, one exit, 5 points.
Your functions will have headers stating the purpose of the function and explaining each parameter 2 points.
All your files will have headers. 2 points
You will use descriptive variable names and descriptive function names. 5 points.
You will not use any global variables. 5 points.
Dangling pointers 5 points
Memory leaks 5 points
Unsafe use of pointers, i.e. you don’t check the pointer’s validity before you use it. – 5