This project implements file compression and decompression in C++ using a three-stage pipeline: the Burrows-Wheeler Transform (BWT), Move-To-Front (MTF) encoding, and Huffman Coding. The first two stages rearrange and re-encode the data to expose repeated character patterns, and the final stage exploits the resulting skewed symbol frequencies, which makes the approach well suited to large text files.
- Compresses text files using Burrows-Wheeler Transform (BWT), Move-To-Front (MTF), and Huffman Coding
- Decompresses files back to their original content
- Handles large files efficiently
- Modular C++ codebase with clear separation of logic
Project files:
- compressor.cpp, compressor.h: Compression logic (BWT, MTF, Huffman Coding)
- decompressor.cpp, decompressor.h: Decompression logic (inverse BWT, inverse MTF, Huffman Decoding)
- huffmanTree.cpp, huffmanTree.h: Huffman tree implementation
- main.cpp: Entry point for running compression/decompression
- bigfile.txt: Example input file
- bigfile.rsk: Example compressed file
- decompressed_bigfile.txt: Example decompressed output
Usage:
- Build the project
  - Use a C++ compiler (e.g., g++) to compile all .cpp files.
  - Example: g++ -o file_compressor main.cpp compressor.cpp decompressor.cpp huffmanTree.cpp
- Compress a file
  - Run the executable and follow the prompts to select compression.
- Decompress a file
  - Run the executable and follow the prompts to select decompression.
This project uses a three-stage compression pipeline:
- Burrows-Wheeler Transform (BWT): Reversibly permutes the input so that characters occurring in similar contexts end up next to each other, which tends to produce long runs of identical characters.
- Move-To-Front (MTF) Encoding: Replaces each character with its position in a list of recently used characters, so runs of repeated characters become runs of zeros and the output is dominated by small integers.
- Huffman Coding: Assigns shorter codes to more frequent symbols and longer codes to less frequent ones, reducing the overall file size without losing information.
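
As a rough illustration of the first two stages listed above, the following sketch applies a naive BWT and then MTF to a string. The function names `bwtEncode` and `mtfEncode` are hypothetical and are not taken from compressor.cpp. The rotation-sorting BWT is a simple, slow variant chosen for clarity, and returning the row index of the original string (instead of appending an end marker) is an assumption about how the inverse transform would be seeded.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: naive BWT that sorts all rotations of s.
// Returns the last column of the sorted rotation matrix and the row
// index at which the original string appears (needed for inversion).
std::pair<std::string, std::size_t> bwtEncode(const std::string& s) {
    const std::size_t n = s.size();
    std::vector<std::size_t> rot(n);
    std::iota(rot.begin(), rot.end(), 0);  // rot[i] = start offset of a rotation

    std::sort(rot.begin(), rot.end(), [&](std::size_t a, std::size_t b) {
        for (std::size_t k = 0; k < n; ++k) {
            unsigned char ca = s[(a + k) % n];
            unsigned char cb = s[(b + k) % n];
            if (ca != cb) return ca < cb;
        }
        return false;  // identical rotations compare equal
    });

    std::string last(n, '\0');
    std::size_t primary = 0;
    for (std::size_t i = 0; i < n; ++i) {
        last[i] = s[(rot[i] + n - 1) % n];  // character preceding the rotation start
        if (rot[i] == 0) primary = i;
    }
    return {last, primary};
}

// Hypothetical helper: MTF over the full byte alphabet (0..255).
std::vector<std::uint8_t> mtfEncode(const std::string& s) {
    std::vector<std::uint8_t> table(256);
    std::iota(table.begin(), table.end(), 0);  // initial order: 0, 1, ..., 255

    std::vector<std::uint8_t> out;
    out.reserve(s.size());
    for (unsigned char c : s) {
        auto it = std::find(table.begin(), table.end(), c);
        assert(it != table.end());  // the table always holds every byte value
        out.push_back(static_cast<std::uint8_t>(it - table.begin()));
        std::rotate(table.begin(), it, it + 1);  // move c to the front
    }
    return out;
}
```

Under these assumptions, `bwtEncode("banana")` returns the string `nnbaaa` with row index 3, and `mtfEncode("nnbaaa")` returns 110, 0, 99, 99, 0, 0. The repeated characters that BWT placed side by side show up as zeros after MTF.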
BWT and MTF do not shrink the data on their own; they reshape it so that the Huffman stage sees a heavily skewed symbol distribution. The combined effect is strongest on large text files with repeating patterns.
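
The Huffman stage can be sketched in a similar spirit. The function below computes only the code length of each symbol from a frequency table, which is enough to see how frequent symbols end up with shorter codes. The name `huffmanCodeLengths`, the index-based node layout, and the choice to give a lone symbol a 1-bit code are all assumptions made for illustration; huffmanTree.cpp may be organized quite differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper: returns the Huffman code length in bits for each
// symbol. Symbols with frequency 0 get length 0.
std::vector<int> huffmanCodeLengths(const std::vector<std::uint64_t>& freq) {
    struct Node { std::uint64_t weight; int left; int right; };
    using Item = std::pair<std::uint64_t, int>;  // (weight, node index)

    std::vector<Node> nodes;
    std::vector<int> leafOf(freq.size(), -1);
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;

    // One leaf per symbol that actually occurs.
    for (std::size_t s = 0; s < freq.size(); ++s) {
        if (freq[s] == 0) continue;
        leafOf[s] = static_cast<int>(nodes.size());
        nodes.push_back({freq[s], -1, -1});
        heap.push({freq[s], leafOf[s]});
    }

    std::vector<int> lengths(freq.size(), 0);
    if (heap.empty()) return lengths;

    // Repeatedly merge the two lightest subtrees.
    while (heap.size() > 1) {
        Item a = heap.top(); heap.pop();
        Item b = heap.top(); heap.pop();
        nodes.push_back({a.first + b.first, a.second, b.second});
        heap.push({a.first + b.first, static_cast<int>(nodes.size()) - 1});
    }
    const int root = heap.top().second;
    assert(root == static_cast<int>(nodes.size()) - 1);  // root is the last node created

    // Children always have smaller indices than their parent, so a single
    // pass from the root downwards assigns every depth.
    std::vector<int> depth(nodes.size(), 0);
    for (int i = root; i >= 0; --i) {
        if (nodes[i].left < 0) continue;  // leaf
        depth[nodes[i].left] = depth[i] + 1;
        depth[nodes[i].right] = depth[i] + 1;
    }

    for (std::size_t s = 0; s < freq.size(); ++s) {
        // Assumption: a lone symbol still gets a 1-bit code.
        if (leafOf[s] >= 0) lengths[s] = std::max(1, depth[leafOf[s]]);
    }
    return lengths;
}
```

For the frequencies 5, 2, 1, 1 this sketch yields code lengths 1, 2, 3, 3, so the nine symbols take 5·1 + 2·2 + 1·3 + 1·3 = 15 bits instead of the 18 bits a fixed 2-bit code would need.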
This project is for educational purposes.