cryfa is a FASTA/FASTQ compression and encryption tool. It uses AES (Advanced Encryption Standard) for encryption and can be applied to any FASTA or FASTQ file (DNA sequences, headers and quality scores). It packs DNA sequences with a fixed block size, compacting 3 DNA bases into 1 char. Unlike general-purpose compression tools, this fixed packing decreases the file size by a factor of 3 without creating security problems such as those exploited by the CRIME or BREACH attacks.
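The idea of packing 3 DNA bases into 1 char can be illustrated with a minimal Python sketch. It is not taken from cryfa's source: the 5-symbol alphabet `ACGTN`, the base-5 code, the padding rule, and the function names `pack` and `unpack` are all assumptions made for illustration.

```python
# Illustrative only: a hypothetical 3-bases-per-byte packing, NOT cryfa's actual code table.
ALPHABET = "ACGTN"  # assumed alphabet; 5**3 = 125 triplets fit in one byte
BASE = len(ALPHABET)
INDEX = {sym: i for i, sym in enumerate(ALPHABET)}


def pack(seq: str) -> bytes:
    """Pack each block of 3 bases into a single byte."""
    # Pad the final block with 'A' so the length is a multiple of 3.
    padded = seq + ALPHABET[0] * (-len(seq) % 3)
    out = bytearray()
    for i in range(0, len(padded), 3):
        a, b, c = (INDEX[ch] for ch in padded[i:i + 3])
        out.append(a * BASE * BASE + b * BASE + c)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Invert pack(); `length` is the original sequence length, used to drop padding."""
    bases = []
    for code in data:
        a, rest = divmod(code, BASE * BASE)
        b, c = divmod(rest, BASE)
        bases.extend((ALPHABET[a], ALPHABET[b], ALPHABET[c]))
    return "".join(bases[:length])


if __name__ == "__main__":
    seq = "ACGTNACGTA"
    packed = pack(seq)
    print(len(seq), "bases ->", len(packed), "bytes")
    assert unpack(packed, len(seq)) == seq
```

In this sketch the packed size depends only on the number of bases, never on their content, which is the property that sets fixed-size packing apart from adaptive general-purpose compressors.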
Get cryfa and build the project with:
git clone https://github.com/pratas/cryfa.git
cd cryfa
cmake .
make
If you want to run cryfa in stand-alone mode, use the following command:
./cryfa [OPTION]... -k [KEY_FILE] [-d] [IN_FILE] > [OUT_FILE]
For example, to compress:
./cryfa -k pass.txt in.fq > comp
and, to decompress:
./cryfa -k pass.txt -d comp > orig.fq
Options are described in the following section.
If you want to compare cryfa with other methods, set the parameters in the run.sh bash script, then run it:
./run.sh
With this script, you can download the datasets, install the dependencies, install the compression and encryption tools, run these tools, and finally, print the results.
To see the possible options, type:
./cryfa -h
which provides the following:
SYNOPSIS
./cryfa [OPTION]... -k [KEY_FILE] [-d] [IN_FILE] > [OUT_FILE]
SAMPLE
Compress & Encrypt: ./cryfa -k pass.txt in.fq > comp
Decompress & Decrypt: ./cryfa -k pass.txt -d comp > orig.fq
DESCRIPTION
Compress and encrypt FASTA/FASTQ files.
The KEY_FILE specifies a file including the password.
-h, --help
usage guide
-k [KEY_FILE], --key [KEY_FILE]
key file name -- MANDATORY
-d, --dec
decompress & decrypt
-v, --verbose
verbose mode (more information)
-s, --disable_shuffle
disable input shuffling
-t [NUMBER], --thread [NUMBER]
number of threads
cryfa uses the standard input and output streams, so it can be integrated directly into pipelines.
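As a rough sketch of such an integration, the Python snippet below builds a cryfa command line from the options listed in the usage guide and streams its standard output into another program. The helper names `cryfa_command` and `pipe_to`, the `./cryfa` binary path, and the `wc -l` consumer are assumptions for illustration. Because the synopsis shows IN_FILE as an argument, the sketch passes the input file that way and only pipes the output.

```python
import subprocess
from typing import List, Optional


def cryfa_command(key_file: str, in_file: str, decrypt: bool = False,
                  threads: Optional[int] = None, binary: str = "./cryfa") -> List[str]:
    """Build an argv list following the synopsis: [OPTION]... -k KEY_FILE [-d] IN_FILE."""
    cmd = [binary]
    if threads is not None:
        cmd += ["-t", str(threads)]  # -t / --thread from the usage guide
    cmd += ["-k", key_file]
    if decrypt:
        cmd.append("-d")
    cmd.append(in_file)
    return cmd


def pipe_to(producer_cmd: List[str], consumer_cmd: List[str]) -> bytes:
    """Run producer_cmd, feed its stdout to consumer_cmd, and return the consumer's stdout."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    result = subprocess.run(consumer_cmd, stdin=producer.stdout,
                            stdout=subprocess.PIPE, check=True)
    producer.stdout.close()
    if producer.wait() != 0:
        raise subprocess.CalledProcessError(producer.returncode, producer_cmd)
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; running it requires a built ./cryfa and the files named here.
    decompress = cryfa_command("pass.txt", "comp", decrypt=True)
    print(" ".join(decompress))
    # Hypothetical pipeline: count the lines of the decrypted FASTQ without writing it to disk.
    # line_count = pipe_to(decompress, ["wc", "-l"])
```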
Please cite the following, if you use cryfa:
- D. Pratas, M. Hosseini and A.J. Pinho, "Cryfa: a tool to compact and encrypt FASTA files," 11th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB), Springer, June 2017.
Please let us know if there are any issues.
cryfa is licensed under the GPL v3.