CSCI 6620 Data Structures - Fall 2021
Program 1: Decoding a Compressed File
P1-RunlengthDecode.docx
compressed1.txt
We will use Microsoft’s free Visual Studio Community 2019: https://visualstudio.microsoft.com/vs/community/
1 Goals of this Assignment
- To identify a compiler and an IDE that will meet your needs for this course.
- To write a program in C++.
- To use file I/O in C++ (open, read a char, test for eof).
- To use char input that does NOT skip white space.
- To use an acceptable coding style (please see the style sheet).
- To understand and use a type cast.
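The difference between char input that skips white space and char input that does not can be seen in a small sketch. The function names below (readAllWithGet, readAllWithExtract, keepWhitespace, dropWhitespace) are made up for illustration and are not part of the assignment; the string wrappers exist only so the readers can be tried without a file.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads every character, including spaces and newlines, using the
// unformatted member function get(), which does not skip white space.
std::string readAllWithGet(std::istream& in) {
    std::string result;
    char ch;
    while (in.get(ch)) {
        result += ch;
    }
    return result;
}

// Reads with the formatted >> operator, which skips white space by default.
std::string readAllWithExtract(std::istream& in) {
    std::string result;
    char ch;
    while (in >> ch) {
        result += ch;
    }
    return result;
}

// Hypothetical wrappers that run each reader over an in-memory string.
std::string keepWhitespace(const std::string& text) {
    std::istringstream in(text);
    return readAllWithGet(in);
}

std::string dropWhitespace(const std::string& text) {
    std::istringstream in(text);
    return readAllWithExtract(in);
}
```

With the input "a b", keepWhitespace returns all three characters, while dropWhitespace returns only "ab". For this assignment the first behavior is the one that is needed.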
2 File Compression
Most people now use rar or zip to compress files. These applications use a combination of compression mechanisms, in sequence. This assignment and the next explore the simplest compression scheme we have. Later this term we will implement a more complex compression scheme.
2.1 Run-Length Compression
Run-length compression is effective on text or binary files that have the same byte repeated over and over. Think of a file containing a table of numbers: it has lots of consecutive space characters, and may have a repeated filler character, such as a ‘.’. You will implement this simple kind of compression in Program 2. However, the algorithm to decompress the file is easier, so I am asking you to do the decompression first.
Runs. In this scheme, any “run” of the same character (4 or more identical consecutive bytes) is replaced by a triplet of bytes, consisting of
- An escape character. We will use 0x7f and call it “esc”. It is a non-printing ASCII character (its standard ASCII name is DEL).
- The letter that has been repeated
- A 1-byte count of the number of repetitions
In addition, any esc character, or run of them, that occurs in the input must also be replaced by a triplet: esc esc count.
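To make the byte layout of a triplet concrete, here is a minimal sketch. The constant ESC and the helper makeTriplet are hypothetical names chosen for this illustration; the sketch shows only the format of one triplet, not the compression algorithm of Program 2.

```cpp
#include <string>

const char ESC = 0x7f;  // the escape byte named in the text

// Builds the three-byte triplet for one run:
// the escape byte, the repeated byte, and a one-byte count.
// Assumption: the run length fits in a single byte.
std::string makeTriplet(char repeated, unsigned char count) {
    std::string triplet;
    triplet += ESC;
    triplet += repeated;
    triplet += static_cast<char>(count);
    return triplet;
}
```

For example, makeTriplet('.', 6) yields the bytes 0x7f, '.', 0x06, which stand for a run of six periods. A single esc byte in the input would be stored as makeTriplet(ESC, 1), that is, 0x7f 0x7f 0x01.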
3 To do: Expand a Compressed File
I will give you a compressed file; your job is to restore it to the uncompressed representation. The algorithm is a loop that reads chars one at a time until eof is reached:
- Read an input character named my_character (do not skip whitespace) from compressed.txt, and quit at end of file.
- If my_character is NOT an escape character, output it to the console and to a file: console_output.ext. Continue reading the next character in the loop.
- If my_character IS an escape character, read two more chars: the first will be a letter, the second is the count.
- Cast the count from type char to type unsigned short int and use it to output that many copies of the letter.
- Continue reading the next character in the loop.
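The steps above can be sketched roughly as follows. This is one possible shape under stated assumptions, not the required solution: it works on generic streams rather than on the named files, it writes to a single output stream instead of both the console and a file, and the names ESC, expand, and expandString are made up for the illustration. It also assumes that a triplet cut off by end of file should simply end the loop.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

const char ESC = 0x7f;  // the escape byte named in the text

// Expands run-length triplets read from `in`, writing restored bytes to `out`.
void expand(std::istream& in, std::ostream& out) {
    char ch;
    while (in.get(ch)) {              // the read itself is the loop test
        if (ch != ESC) {
            out.put(ch);              // ordinary character: copy it through
            continue;
        }
        char letter;
        char rawCount;
        if (!in.get(letter) || !in.get(rawCount)) {
            break;                    // assumption: stop on a truncated triplet
        }
        // Convert through unsigned char first, so that a count byte above 127
        // does not turn negative where plain char is a signed type.
        unsigned short count =
            static_cast<unsigned short>(static_cast<unsigned char>(rawCount));
        for (unsigned short i = 0; i < count; ++i) {
            out.put(letter);
        }
    }
}

// Hypothetical wrapper that runs expand() over an in-memory string.
std::string expandString(const std::string& compressed) {
    std::istringstream in(compressed);
    std::ostringstream out;
    expand(in, out);
    return out.str();
}
```

A note on the cast: whether plain char is signed depends on the compiler. If it is signed, casting a count byte such as 200 directly to unsigned short would give a very large number, which is why the sketch goes through unsigned char on the way.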
Please note: if you handle the end of file incorrectly, the last character of the output will be wrong. You must check for eof immediately after reading a character, not before.
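The end-of-file mistake is easy to reproduce. In the sketch below, the two loops differ only in where the stream is tested; all four function names are hypothetical and exist only for this comparison.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Tests eof BEFORE reading. The eof flag is set only when a read fails,
// so the loop runs one extra time and appends one extra character.
std::string copyCheckingBefore(std::istream& in) {
    std::string result;
    char ch = '?';
    while (!in.eof()) {
        in.get(ch);       // the final call fails, but nobody checks it
        result += ch;     // so a stale character is appended
    }
    return result;
}

// Tests the stream immediately AFTER each read.
std::string copyCheckingAfter(std::istream& in) {
    std::string result;
    char ch;
    while (in.get(ch)) {  // a failed read ends the loop before ch is used
        result += ch;
    }
    return result;
}

// Hypothetical wrappers that run each loop over an in-memory string.
std::string beforeOnString(const std::string& text) {
    std::istringstream in(text);
    return copyCheckingBefore(in);
}

std::string afterOnString(const std::string& text) {
    std::istringstream in(text);
    return copyCheckingAfter(in);
}
```

For the two-character input "ab", afterOnString returns exactly two characters, while beforeOnString returns three.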
4 Due Next Week
All programming in this course will be done in C++. This assignment involves file I/O which must be done using the C++ I/O libraries. Standards for style, organization, and OO usage will increase gradually throughout the term. For this program, do it simply. You do not need any classes or structs. You can do the whole thing in main().
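As a rough illustration of the file I/O involved, the sketch below opens an input file and an output file with the C++ stream library and sends each character to both the console and the output file. It only copies; it does no decoding. The function name copyToConsoleAndFile is hypothetical, and opening the files in binary mode is an assumption made here so that count bytes are not changed by newline translation on Windows. In your program the same statements could sit directly in main().

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Copies the file inputName, character by character, to the console
// and to the file outputName. Returns false if either file fails to open.
bool copyToConsoleAndFile(const std::string& inputName,
                          const std::string& outputName) {
    std::ifstream in(inputName, std::ios::binary);
    if (!in.is_open()) {
        std::cerr << "Cannot open input file: " << inputName << '\n';
        return false;
    }
    std::ofstream out(outputName, std::ios::binary);
    if (!out.is_open()) {
        std::cerr << "Cannot open output file: " << outputName << '\n';
        return false;
    }
    char ch;
    while (in.get(ch)) {      // get() keeps white space
        std::cout.put(ch);    // console copy
        out.put(ch);          // file copy
    }
    return true;              // both files close when the streams go out of scope
}
```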
4.1 Submitting Your Work
Make sure your name is inside every file. Put your source files and your test output into a directory named P1-Smith (or whatever your name is). Do not include your entire project, just the source and output. Zip this directory. Hand in an electronic copy of your zipped directory.