Diversity Kraken Tools

For news and updates, refer to the GitHub page: https://github.com/jenniferlu717/KrakenTools/

Diversity KrakenTools is a suite of scripts for post-analysis of Kraken/KrakenUniq/Kraken2/Bracken results. Please cite the relevant paper if using KrakenTools with any of the listed programs.

Links to Kraken github pages

  1. Kraken 1
  2. Kraken 2
  3. KrakenUniq
  4. Bracken

For issues with any of the above programs, please open a GitHub issue on their respective GitHub pages. This repository is dedicated only to the scripts provided here.


Scripts included in Diversity KrakenTools

  1. alpha_diversity.py
  2. beta_diversity.py

Running Scripts:

No installation required. All scripts are run on the command line as described.

Users can make scripts executable by running

chmod +x myscript.py
./myscript.py -h

alpha_diversity.py

This program calculates alpha diversity from a Bracken abundance estimation file. The user must specify the Bracken output file and may specify the type of alpha diversity to calculate (Shannon's by default). The options are described below.

1. alpha_diversity.py usage/options

python alpha_diversity.py

  • -f, --filename MYFILE.BRACKEN..........Bracken output file
  • -a, --alpha TYPE.......................Alpha diversity type (Sh, BP, Si, ISi, F)

2. alpha_diversity.py input file

The input file must be in the standard Bracken output format, produced by running Bracken for abundance estimation. Example format:

name                     tax_id      tax_lvl     kraken....  added...   new.... fraction...
Homo sapiens             9606        S           ...         ....       999000  0.999000
Streptococcus pyogenes   1314        S           ...         ....       10      0.000001
Streptococcus agalactiae 1311        S           ...         ....       5       0.000000
Streptococcus pneumoniae 1313        S           ...         ....       3       0.000000
Bordetella pertussis     520         S           ...         ....       20      0.000002

Link to Bracken github for reference: Bracken

Note: This example shows the file format from Bracken at the species level, but users can specify a different level for abundance estimation with Bracken.
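To make the file layout concrete, here is a minimal sketch of reading the estimated read counts out of a Bracken abundance table. It is not the code used by alpha_diversity.py. The function name `read_bracken_counts` is hypothetical, the header names follow Bracken's usual output and should be checked against your own file, and the two read-count columns elided in the example above are filled with placeholder zeros.

```python
import csv
import io


def read_bracken_counts(handle):
    """Return {taxon name: estimated read count} from a Bracken abundance table.

    Assumes a tab-separated file with one header line and the estimated
    read count in the sixth column (new_est_reads).
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header line
    counts = {}
    for row in reader:
        if len(row) < 7:
            continue  # ignore blank or truncated lines
        counts[row[0]] = int(row[5])
    return counts


# Two rows from the example above; the zeros are placeholders, not real data.
example = io.StringIO(
    "name\ttaxonomy_id\ttaxonomy_lvl\tkraken_assigned_reads\tadded_reads"
    "\tnew_est_reads\tfraction_total_reads\n"
    "Homo sapiens\t9606\tS\t0\t0\t999000\t0.999000\n"
    "Bordetella pertussis\t520\tS\t0\t0\t20\t0.000002\n"
)
counts = read_bracken_counts(example)
print(counts)
```

For a real file, pass an open file handle instead of the `io.StringIO` object, for example `open("myfile.bracken", newline="")`.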

3. alpha_diversity.py alpha type input

By default, the program will calculate Shannon's alpha:

python alpha_diversity.py -f myfile.bracken

Users can specify which type of alpha diversity from this set:

  • Sh......Shannon's alpha diversity
  • BP......Berger-Parker's alpha
  • Si......Simpson's diversity
  • ISi.....Inverse Simpson's diversity
  • F.......Fisher's index

To calculate Berger-Parker's alpha:

python alpha_diversity.py -f myfile.bracken -a BP
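For intuition about what these indices measure, the sketch below computes four of them from a list of read counts using common textbook definitions. It is an illustration under stated assumptions, not the script's implementation: alpha_diversity.py may define some indices differently (for example, a finite-sample form of Simpson's index), and Fisher's index is left out because it requires a numerical solver. All function names are hypothetical.

```python
import math


def proportions(counts):
    """Convert raw counts to relative abundances, dropping zero counts."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]


def shannon(counts):
    """Shannon's index: -sum(p * ln p)."""
    return -sum(p * math.log(p) for p in proportions(counts))


def berger_parker(counts):
    """Berger-Parker's index: relative abundance of the most common taxon."""
    return max(proportions(counts))


def simpson(counts):
    """Simpson's diversity, taken here as 1 - sum(p^2)."""
    return 1.0 - sum(p * p for p in proportions(counts))


def inverse_simpson(counts):
    """Inverse Simpson's diversity, taken here as 1 / sum(p^2)."""
    return 1.0 / sum(p * p for p in proportions(counts))


# Made-up counts for three taxa (relative abundances 0.5, 0.3, 0.2).
counts = [50, 30, 20]
for name, fn in [("Sh", shannon), ("BP", berger_parker),
                 ("Si", simpson), ("ISi", inverse_simpson)]:
    print(f"{name}\t{fn(counts):.4f}")
```

A sample dominated by a single taxon gives a Berger-Parker value near 1 and Shannon and Simpson values near 0, while an even community moves all three in the opposite direction.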

beta_diversity.py

This program calculates beta diversity (Bray-Curtis dissimilarity) from Kraken, Krona, and Bracken files.

To calculate pairwise dissimilarity scores, call the script with two or more files:

python beta_diversity.py -i file1.bracken file2.bracken [...] --type bracken

The script supports Bracken files, Kraken and Kraken2 report files, and Krona files. Additionally, you can pass any tab-separated file and specify the columns containing the taxonomic information and read counts (--cols).

To compute the dissimilarity score only at a certain taxonomic level (supported only for Kraken and Krona files), pass --level with S, G, F, or O (species, genus, family, or order), for example --level S.

For more information, please take a look at the help page.

python beta_diversity.py --help
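As a rough illustration of the Bray-Curtis dissimilarity mentioned above, the following sketch computes it for every pair of samples, where each sample is a mapping from taxon to read count. It uses the standard definition 1 - 2C / (S1 + S2), with C the sum of the smaller count for each taxon and S1, S2 the sample totals. The function names and sample data are hypothetical, and beta_diversity.py may preprocess counts differently before comparing samples.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two {taxon: count} mappings."""
    taxa = set(a) | set(b)
    # Sum of the smaller count for each taxon across both samples.
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 1.0 - 2.0 * shared / total


def pairwise(samples):
    """Return {(name1, name2): dissimilarity} for each unordered pair."""
    names = list(samples)
    return {
        (x, y): bray_curtis(samples[x], samples[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Made-up counts for three samples.
samples = {
    "s1": {"A": 6, "B": 4},
    "s2": {"A": 2, "C": 8},
    "s3": {"A": 6, "B": 4},
}
scores = pairwise(samples)
for pair, score in scores.items():
    print(pair, round(score, 3))
```

A score of 0 means the two samples have identical counts, and a score of 1 means they share no taxa.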