Compute Tajima's D, the π estimator, or the Watterson estimator for multiple sequences.
This module is now part of the bfx suite. See https://py-bfx.readthedocs.io for more information.
Tajima's D is a population genetic test statistic that compares two measures of genetic diversity: the mean number of pairwise differences and the number of segregating sites, scaled by sample size. It is commonly used to detect departures from neutral evolution, for example to assess whether a population is expanding or shrinking.
Tajima's D is defined as follows:
If
Whereas
A result is considered significant if
The π estimator is the average number of pairwise differences between any two sequences:
The Watterson estimator is based on the number of segregating sites: it divides that number by the (n-1)th harmonic number, where n is the number of sequences.
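For reference, the three quantities described above are usually written as follows. This is the standard textbook form (Tajima, 1989), given here as an assumption about what the package computes rather than taken from its source. The symbols are: $n$ sequences, $k_{ij}$ differences between sequences $i$ and $j$, and $S$ segregating sites.

```latex
\begin{align}
\hat{\theta}_{\pi} &= \frac{1}{\binom{n}{2}} \sum_{i<j} k_{ij} \\
a_1 &= \sum_{i=1}^{n-1} \frac{1}{i}, \qquad
a_2 = \sum_{i=1}^{n-1} \frac{1}{i^2} \\
\hat{\theta}_{W} &= \frac{S}{a_1} \\
D &= \frac{\hat{\theta}_{\pi} - \hat{\theta}_{W}}{\sqrt{e_1 S + e_2 S (S - 1)}} \\
e_1 &= \frac{c_1}{a_1}, \qquad e_2 = \frac{c_2}{a_1^2 + a_2} \\
c_1 &= b_1 - \frac{1}{a_1}, \qquad
c_2 = b_2 - \frac{n + 2}{a_1 n} + \frac{a_2}{a_1^2} \\
b_1 &= \frac{n + 1}{3 (n - 1)}, \qquad
b_2 = \frac{2 (n^2 + n + 3)}{9 n (n - 1)}
\end{align}
```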
Using pip / pip3:
```bash
pip install tajimas_d
```
Using conda:
```bash
conda install -c bioconda tajimas_d
```
Or by source:
```bash
git clone git@github.com:not-a-feature/tajimas_d.git
cd tajimas_d
pip install .
```
```python
from tajimas_d import tajimas_d, watterson_estimator, pi_estimator

# Sequences to compare
sequences = ["AAAA", "AAAT", "AAGT", "AAGT"]

theta_tajima = tajimas_d(sequences)       # Tajima's D
theta_pi = pi_estimator(sequences)        # π estimator
theta_w = watterson_estimator(sequences)  # Watterson estimator
```
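To make the definitions concrete, the following sketch recomputes the three statistics in plain Python for the example sequences. It is an independent illustration based on the standard formulas, not the package's implementation, so the package may differ in details such as the handling of gaps or ambiguous bases. The function names `pairwise_pi`, `segregating_sites`, `watterson_theta`, and `tajimas_d_reference` are hypothetical, and the sketch assumes at least three aligned sequences of equal length with at least one segregating site.

```python
from itertools import combinations
from math import sqrt


def pairwise_pi(seqs):
    """Average number of differing positions over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / len(pairs)


def segregating_sites(seqs):
    """Number of alignment columns with more than one distinct character."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def watterson_theta(seqs):
    """Segregating sites divided by the (n-1)th harmonic number."""
    a1 = sum(1 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a1


def tajimas_d_reference(seqs):
    """Tajima's D in its standard form (assumes n >= 3 and S > 0)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pairwise_pi(seqs) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


sequences = ["AAAA", "AAAT", "AAGT", "AAGT"]
print(pairwise_pi(sequences))        # 7 differences spread over 6 pairs
print(watterson_theta(sequences))    # 2 segregating sites divided by 11/6
print(tajimas_d_reference(sequences))
```

For this input the π estimate exceeds the Watterson estimate, so the resulting D is positive.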
The standalone version requires miniFasta>=2.2 to be installed.
```
usage: tajimas_d [-h] -f PATH [-p] [-t] [-w]

tajimas_d: Compute Tajima's D, the Pi- or Watterson-Estimator for multiple
sequences.

optional arguments:
  -h, --help            show this help message and exit
  -f PATH, --file PATH  Path to fasta file with all sequences.
  -p, --pi              Compute the Pi-Estimator score.
  -t, --tajima          Compute the Tajima's D score. (default)
  -w, --watterson       Compute the Watterson-Estimator score.
```
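The `-f` option expects a FASTA file, while the Python functions take a list of sequence strings. The sketch below shows one way to bridge the two with a small hand-written reader. It assumes a plain FASTA file with `>` header lines; `read_fasta_sequences` and `sequences.fasta` are hypothetical names, and the standalone tool itself relies on miniFasta rather than on this parser.

```python
def read_fasta_sequences(path):
    """Return the sequence bodies of a FASTA file, one string per record."""
    sequences = []
    current = []  # lines of the record currently being read
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if current:
                    sequences.append("".join(current))
                    current = []
            else:
                current.append(line)
    if current:
        sequences.append("".join(current))
    return sequences


# Hypothetical usage with the package's Python API (requires tajimas_d):
# from tajimas_d import tajimas_d
# print(tajimas_d(read_fasta_sequences("sequences.fasta")))
```

Header lines are discarded because the estimators only need the sequence bodies.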
Copyright (C) 2024 by Jules Kreuer - @not_a_feature
This piece of software is published under the GNU General Public License v3.0. TLDR:
| Permissions | Conditions | Limitations |
|---|---|---|
| ✓ Commercial use | Disclose source | ✕ Liability |
| ✓ Distribution | License and copyright notice | ✕ Warranty |
| ✓ Modification | Same license | |
| ✓ Patent use | State changes | |
| ✓ Private use | | |
Go to LICENSE.md to see the full version.