Cryfa is an ultrafast encryption tool specifically designed for genomic data. Besides providing robust security, it also compresses FASTA/FASTQ sequences by a factor of three, making it an efficient solution for managing genomic data.
Install with Conda:

    conda install -y bioconda::cryfa

Alternatively, use Docker. The image is available for linux/amd64 and linux/arm64 (Apple Silicon, AWS Graviton).

    # Pull the image
    docker pull smortezah/cryfa

    # Encrypt (mount the directory containing your key file and input)
    docker run --rm -v /path/to/data:/data smortezah/cryfa \
      -k /data/pass.txt /data/in.fq > out.crf

    # Decrypt
    docker run --rm -v /path/to/data:/data smortezah/cryfa \
      -k /data/pass.txt -d /data/out.crf > restored.fq

To build from source on Linux:

    # Install git and cmake (≥ 4.0)
    sudo apt update
    sudo apt install git python3-pip
    pip3 install cmake

    # Clone and install Cryfa
    git clone https://github.com/cobilab/cryfa.git
    cd cryfa
    sh install.sh

On macOS:

    # Install Homebrew, git and cmake
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    brew install git cmake

    # Clone and install Cryfa
    git clone https://github.com/cobilab/cryfa.git
    cd cryfa
    sh install.sh

On Windows (PowerShell):

    # Install CMake and Visual Studio Build Tools (requires winget)
    winget install --id Kitware.CMake --source winget
    winget install --id Microsoft.VisualStudio.2022.BuildTools --source winget

    # Clone and install Cryfa
    git clone https://github.com/cobilab/cryfa.git
    cd cryfa
    .\install.ps1

Note: Pre-compiled binaries for 64-bit Linux, macOS, and Windows are available as assets on the Releases page.
Run Cryfa with:

    ./cryfa [OPTION]... -k [KEY_FILE] [-d] [IN_FILE] > [OUT_FILE]

For example, to compact and encrypt:

    ./cryfa -k pass.txt in.fq > comp

To decrypt and unpack:

    ./cryfa -k pass.txt -d comp > orig.fq

A sample file, in.fq, is available in the example/ directory.

Note: Cryfa supports a maximum file size of 64 GB. For larger files, consider splitting them into smaller chunks, e.g. using the split command in Linux, and then encrypting each chunk separately. After decryption, you can reassemble the chunks using the cat command.
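The chunking workaround from the note can be sketched as a few shell functions. This is a minimal sketch, not part of Cryfa: the function names, the CRYFA variable, the chunk prefix, and the .crf suffix are illustrative assumptions. It also assumes an unwrapped FASTQ file with four lines per record, so that splitting on a multiple of four lines keeps every chunk a valid FASTQ file.

```shell
#!/bin/sh
# Hypothetical helpers; names and file suffixes are illustrative, not part of Cryfa.
CRYFA=${CRYFA:-./cryfa}   # path to the cryfa binary (assumed location)

# split_fastq IN_FILE PREFIX RECORDS_PER_CHUNK
# Writes PREFIXaa, PREFIXab, ... containing whole records only (4 lines each).
split_fastq() {
    split -l "$(( $3 * 4 ))" "$1" "$2"
}

# encrypt_chunks KEY_FILE PREFIX
# Encrypts every plain chunk PREFIX?? to PREFIX??.crf.
encrypt_chunks() {
    for chunk in "$2"??; do
        "$CRYFA" -k "$1" "$chunk" > "$chunk.crf" || return 1
    done
}

# decrypt_chunks KEY_FILE PREFIX
# Restores every PREFIX?? from its PREFIX??.crf counterpart.
decrypt_chunks() {
    for enc in "$2"??.crf; do
        "$CRYFA" -k "$1" -d "$enc" > "${enc%.crf}" || return 1
    done
}

# join_chunks PREFIX OUT_FILE
# The glob expands in lexical order (aa, ab, ...), which matches the split order.
join_chunks() {
    cat "$1"?? > "$2"
}
```

With these helpers, the workflow is split_fastq, then encrypt_chunks on the sender's side, and decrypt_chunks followed by join_chunks on the receiver's side. The default two-letter suffix of split allows at most 676 chunks, so the records-per-chunk value should be chosen accordingly.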
Cryfa identifies the format of a genomic data file by examining its content, not its extension. For instance, a FASTA file named test can be provided with any extension: test, test.fa, test.fasta, test.fas, test.fsa, etc. So, running

    ./cryfa -k pass.txt test > comp

is equivalent to running

    ./cryfa -k pass.txt test.fa > comp

Note: The password file can have any extension or none at all; pass, pass.txt, pass.dat, etc. are all valid and yield the same result.
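To illustrate what content-based format detection means, here is a minimal sketch that guesses the format from the first line of a file. It is not Cryfa's actual detection logic, which is not described here. The guess_format function is hypothetical and relies only on the convention that FASTA headers start with > and FASTQ headers start with @.

```shell
#!/bin/sh
# Hypothetical helper, not Cryfa's implementation.
# guess_format FILE: prints fasta, fastq, or other based on the first line only.
guess_format() {
    first_line=
    # read returns non-zero for an empty file or a final line without a newline;
    # first_line still holds whatever was read, so the status is ignored.
    IFS= read -r first_line < "$1" || :
    case $first_line in
        '>'*) echo fasta ;;
        '@'*) echo fastq ;;
        *)    echo other ;;
    esac
}
```

Because only the content is inspected, guess_format test and guess_format test.fa print the same result for copies of the same file.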
Cryfa supports the following options:
| Option | Long form | Argument | Required | Description |
|---|---|---|---|---|
| -k | --key | KEY_FILE | Yes | Key file containing the password. Use ./keygen to generate a strong one. |
| -d | --dec | | No | Decrypt and unpack the input file. |
| -f | --force | | No | Force non-FASTA/FASTQ mode: skip compaction, but still shuffle and encrypt. |
| -s | --stop_shuffle | | No | Disable shuffling of the input. |
| -t | --thread | NUMBER | No | Number of threads to use. |
| -v | --verbose | | No | Enable verbose mode for more detailed output. |
| -h | --help | | No | Display the usage guide. |
| | --version | | No | Display version information. |
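As an illustration of the -t option from the table, the following sketch runs Cryfa with one thread per online CPU core. The cpu_count and encrypt_all_cores helpers, the CRYFA variable, and the use of getconf _NPROCESSORS_ONLN are assumptions of this example, not something Cryfa requires. Whether using every core is the fastest setting for a given file has not been measured here.

```shell
#!/bin/sh
# Hypothetical helpers; not part of Cryfa.
CRYFA=${CRYFA:-./cryfa}   # path to the cryfa binary (assumed location)

# cpu_count: prints the number of online CPUs, or 1 if it cannot be determined.
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case $n in
        ''|*[!0-9]*) n=1 ;;   # fall back if the value is empty or not a number
    esac
    echo "$n"
}

# encrypt_all_cores KEY_FILE IN_FILE OUT_FILE
encrypt_all_cores() {
    "$CRYFA" -t "$(cpu_count)" -k "$1" "$2" > "$3"
}
```

For example, encrypt_all_cores pass.txt in.fq comp is equivalent to calling ./cryfa with -t set to the detected core count.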
Note: Cryfa can compact and encrypt FASTA/FASTQ files, or encrypt any other text-based genomic data (e.g., VCF, SAM, BAM) without compaction.
Cryfa writes its result to the standard output stream, so it can be chained with other tools in existing data processing pipelines.
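Because the result goes to standard output, Cryfa can sit in the middle of a pipe. The sketch below encrypts a file and then pipes the decrypted stream straight into cmp to confirm the round trip, without writing the restored file to disk. The verify_roundtrip function and the CRYFA variable are hypothetical names for this example.

```shell
#!/bin/sh
# Hypothetical helper; not part of Cryfa.
CRYFA=${CRYFA:-./cryfa}   # path to the cryfa binary (assumed location)

# verify_roundtrip KEY_FILE IN_FILE ENC_FILE
# Returns 0 only if decrypting ENC_FILE reproduces IN_FILE byte for byte.
verify_roundtrip() {
    "$CRYFA" -k "$1" "$2" > "$3" || return 1
    # cmp reads the decrypted stream from standard input ("-")
    "$CRYFA" -k "$1" -d "$3" | cmp -s - "$2"
}
```

A call such as verify_roundtrip pass.txt in.fq comp && echo OK leaves the encrypted file in comp and prints OK only when the decrypted stream matches the input.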
There are two ways to create a KEY_FILE for use with -k / --key: save a raw password in a file, or use the keygen program to generate a strong one. The latter is strongly recommended.
To use the first method, save a raw password to a file (e.g., pass.txt) and pass it to Cryfa. You can use any text editor or run:

    echo 'Such a strong password!' > pass.txt
    ./cryfa -k pass.txt IN_FILE > OUT_FILE

The password must contain at least 8 characters, but a stronger one is highly recommended for better security. A strong password:
- Is at least 12 characters long
- Includes a mix of lowercase (a-z) and uppercase (A-Z) letters, digits (0-9), and symbols (e.g., !, #, $, %, and })
- Is not a simple repetition of characters (e.g., zzzzzz), a keyboard pattern (e.g., qwerty), or a sequence of digits (e.g., 123456)
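The first two criteria above are easy to check mechanically. The sketch below is a hypothetical helper, not part of Cryfa or keygen. It does not attempt the third criterion (detecting repetitions, keyboard patterns, or digit sequences), which needs more than matching on character classes.

```shell
#!/bin/sh
# Hypothetical helper; not part of Cryfa or keygen.
# is_strong_password PASSWORD
# Returns 0 if PASSWORD has at least 12 characters and contains a lowercase
# letter, an uppercase letter, a digit, and a non-alphanumeric character.
is_strong_password() {
    pw=$1
    [ "${#pw}" -ge 12 ] || return 1
    case $pw in *[[:lower:]]*) ;; *) return 1 ;; esac
    case $pw in *[[:upper:]]*) ;; *) return 1 ;; esac
    case $pw in *[[:digit:]]*) ;; *) return 1 ;; esac
    # Any non-alphanumeric character counts as a symbol (this includes spaces)
    case $pw in *[![:alnum:]]*) ;; *) return 1 ;; esac
    return 0
}
```

One way to use it is to check a password file before encrypting, for example is_strong_password "$(cat pass.txt)" || echo "weak password" >&2.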
To use keygen instead, run:

    ./keygen

You'll be prompted with:

    Enter a password, then press 'Enter':

Enter a raw password (e.g., A keygen raw pass!) and press Enter. You'll then see:

    Enter a file name to save the generated key, then press 'Enter':

The generated key will be saved to the file you specify (e.g., key.txt). Note that keygen requires an initial raw password, but it doesn't need to be particularly strong. Use the resulting key file with Cryfa:

    ./cryfa -k key.txt IN_FILE > OUT_FILE

To learn more about key management (generation, exchange, storage, usage, and replacement of keys), see [1], [2], [3] and [4].
To benchmark Cryfa against other methods, configure the parameters in the bench_cryfa.sh bash script and execute it:

    ./bench_cryfa.sh

This script automates downloading the datasets, installing dependencies, setting up the compression and encryption tools, running them, and displaying the results.
If you use Cryfa in your research, please cite the following references:
- M. Hosseini, D. Pratas and A.J. Pinho, "Cryfa: a secure encryption tool for genomic data," Bioinformatics, vol. 35, no. 1, pp. 146–148, 2018. DOI: 10.1093/bioinformatics/bty645
- [OPTIONAL] D. Pratas, M. Hosseini and A.J. Pinho, "Cryfa: a tool to compact and encrypt FASTA files," 11th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB), Springer, June 2017. DOI: 10.1007/978-3-319-60816-7_37
Cryfa is licensed under the GPLv3.
