- New DB summary info printed by the inspect script, plus a --skip-counts option
- Now stripping carriage returns and other trailing whitespace from sequence data
- Treating l-mers immediately following ambiguous characters as ambiguous until a full k-mer is processed
- Fixed bug in expansion of spaced seed masks that left spaces at the end
- New kraken2-inspect script to report minimizer counts per taxon
- Kraken 2X build now adds terminators to all reference sequences
- Improved portability to older g++ by removing initialization of variable-length string
- Reporting options to kraken2 script (like Kraken 1's kraken-report and kraken-mpa-report)
- Made loading to RAM the default option; added --memory-mapping option to kraken2
- Low base quality masking option
- Moved low-complexity masking to library download/addition, out of build process
- Made no masking the default for the human genome in standard installation
- Low-complexity sequence masking as a default
- UniVec/UniVec_Core databases to supported downloads
- UniVec_Core & human in standard Kraken 2 DB
- 16S DB support (Greengenes, Silva, RDP)
- --use-names flag for kraken2 script
- Priority queue to ensure classifier output order matches input order when multi-threading
- Changelog
- Reduced amino acid alphabet (requires rebuild of old protein DBs)
- Operating manual
- kraken2 now allows compression & paired processing at the same time