Overview In the age of big data, there is a constant struggle with regard to storage. For example, a single Whole Genome Sequencing experiment can generate over 100 GB of raw data. For years, computer...

1 answer below »

Overview


In the age of big data, there is a constant struggle with regard to storage. For example, a single Whole Genome Sequencing experiment can generate over 100 GB of raw data. For years, computer scientists have attempted to reduce the storage size of files without losing any information. In this assignment, we will be implementing our own file compression tool using the Huffman algorithm.


Task: File Compression and Decompression


In this project, "all" you have to do is implement two programs:

compress

and

decompress
. The usage should be as follows:

./compress ./decompress


File Compression


The
compress
program will take as input an arbitrary file (original_file) and will use Huffman Compression to create a compressed version of the file (compressed_file):




  • Construct a Huffman code for the contents of
    original_file




  • Use the constructed Huffman code to encode
    original_file




  • Write the results (with a header to allow for decompression) to
    compressed_file





File Decompression


The
decompress
program will then take as input a compressed file created by your
compress
program (compressed_file) and will decompress the file (decompressed_file):




  • Construct a Huffman code by reading the header of
    compressed_file




  • Use the constructed Huffman code to decode
    compressed_file




  • Write the results to
    decompressed_file




Note that
decompressed_file
must be
completely identical
to
original_file! In other words, calling
diff
should return no differences:

diff decompressed_file original_file # this should not output anything


Reference Solution


You have been provided two programs (refcompress
and
refdecompress), which are a reference solution implementation of this Project. You can use them to help guide you. Note that the reference binaries were compiled to run on Ed, so if you attempt to run them on a different architecture, they will most likely not work.



Note:
Your
compress
is
not
expected to work with
refdecompress, and your
decompress
is
not
expected to work with
refcompress. The provided reference executable files are a matched pair.



Compiling and Running Your Code


We have provided a
Makefile, but
you are free to modify
anything
in the starter code

that you'd like (including the
Makefile
and any of the
.cpp
or
.hpp
files) however you wish (including adding
public
methods). You may also create your own source files and header files. Our only requirements are the following:




  • You should not import or use any libraries or methods that directly perform the compression or decompression for you.




  • You should not change any of the g++ compile flags (CXXFLAGS).




  • Your code needs to be compilable and produce two executables,
    compress
    and
    decompress, using the
    make
    command




  • Your
    compress
    and
    decompress
    programs can be run as described above after
    make.



Answered 123 days AfterJul 25, 2022

Answer To: Overview In the age of big data, there is a constant struggle with regard to storage. For example, a...

Aditi answered on Jul 26 2022
82 Votes
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here