A command-line tool for encoding binary data to DNA sequences and decoding DNA sequences back to binary data.
This project was inspired by the ScienceClic video "How to store data in DNA?".
dnabin converts binary files into DNA sequences using a simple encoding in which each DNA base carries 2 bits, so each byte is represented by 4 bases:
A=00, C=01, G=10, T=11
This encoding allows any binary data to be represented as a DNA sequence, which can be useful for data storage in synthetic DNA.
# Clone the repository
git clone https://github.com/chamorin/dnabin
cd dnabin
# Install dependencies
make deps
# Build the binary
make build
# The binary will be available at bin/dnabin

Usage:
dnabin [command]
Available Commands:
decode Decode DNA sequence to binary
encode Encode binary file to DNA sequence
help Help about any command
Flags:
-h, --help help for dnabin
# Encode a file and output to stdout
dnabin encode input.bin
# Encode a file and save to output file
dnabin encode input.bin -o output

You can also use the -c or --color flag to colorize the DNA bases in the output (A=green, C=blue, G=yellow, T=red):

dnabin encode input.bin -c

# Decode a DNA sequence and output to stdout
dnabin decode input.txt
# Decode a DNA sequence and save to output file
dnabin decode input.txt -o output.bin

Each byte is encoded as 4 DNA bases by splitting it into 4 pairs of bits:

| Byte Value | Binary | DNA Sequence |
|---|---|---|
| 0 | 00000000 | AAAA |
| 255 | 11111111 | TTTT |
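
The byte-splitting step can be sketched in a few lines of Python. This is an illustration only, not dnabin's actual source: the function name `encode` is hypothetical, and the sketch assumes the bit pairs are read from most significant to least significant, which the two table rows alone do not confirm.

```python
# Mapping from the README: A=00, C=01, G=10, T=11.
# The index into this string is the 2-bit value of the base.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Encode bytes as DNA bases, 4 bases per byte.

    Assumption: bit pairs are taken most significant first.
    """
    out = []
    for byte in data:
        # Shift each of the four 2-bit pairs down and mask it off.
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


print(encode(bytes([0, 255])))  # AAAATTTT
print(encode(b"\x1b"))          # 00 01 10 11 -> ACGT
```

Under that ordering assumption, a byte such as 0x1B (binary 00011011) maps to ACGT, and the output is always exactly four times as long as the input.
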
DNA sequences can contain whitespace (spaces, tabs, newlines), which is ignored during decoding.
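
Decoding with whitespace skipping might look like the following minimal Python sketch. It is not taken from dnabin's source: the name `decode` is hypothetical, the most-significant-pair-first order is an assumption, and the error handling for incomplete or unknown bases is one possible choice rather than documented behavior.

```python
# Mapping from the README: A=00, C=01, G=10, T=11.
VALUES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def decode(sequence: str) -> bytes:
    """Decode a DNA sequence back to bytes, ignoring whitespace."""
    # str.split() with no arguments splits on spaces, tabs, and newlines.
    bases = "".join(sequence.split())
    if len(bases) % 4 != 0:
        # Assumption: an incomplete trailing byte is treated as an error.
        raise ValueError("sequence length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(bases), 4):
        byte = 0
        for base in bases[i:i + 4]:
            # Assumed order: first base holds the most significant bit pair.
            # An unknown base raises KeyError here.
            byte = (byte << 2) | VALUES[base]
        out.append(byte)
    return bytes(out)


print(decode("AAAA TTTT\n"))  # b'\x00\xff'
```
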
