Tools for fast and accurate maximum likelihood estimation of Birth-Death Exposed-Infectious (BDEI) epidemiological model parameters from phylogenetic trees.
The birth-death exposed-infectious (BDEI) model [Stadler et al. 2014] describes the transmission of pathogens that feature an incubation period (when the host is already infected but not yet infectious), for example Ebola or SARS-CoV-2. In a phylodynamics framework, it allows one to infer epidemiological parameters such as the basic reproduction number R0, the incubation period and the infectious time from a phylogenetic tree (a genealogy of pathogen sequences).
This implementation of the BDEI model removes the computational bottlenecks of earlier ones. Because of the high complexity of the differential equations used in phylodynamic models, previous implementations [Stadler and Bonhoeffer 2013; Barido-Sottani et al. 2018] sometimes suffered from numerical instability and were only applicable to medium-sized datasets of <500 samples. Our fast and accurate estimator is applicable to very large datasets (10,000 samples), allowing phylodynamics to catch up with pathogen sequencing efforts.
A Zhukova, F Hecht, Y Maday, and O Gascuel. Fast and Accurate Maximum-Likelihood Estimation of Multi-Type Birth-Death Epidemiological Models from Phylogenetic Trees. medRxiv 2022. doi:10.1101/2022.08.02.22278328
As input, one needs to provide a rooted phylogenetic tree in Newick format and the value of one of the model parameters (for identifiability):
- µ – becoming-infectious rate, corresponding to the state transition from E (exposed) to I (infectious) (can be fixed via the --mu argument),
- λ – transmission rate, from a transmitter in the state I to a newly infected recipient, whose state is E (can be fixed via the --la argument),
- ψ – removal rate, corresponding to individuals in the state I exiting the study (e.g. due to healing, death or starting a treatment) (can be fixed via the --psi argument),
- ρ – sampling probability (upon removal) (can be fixed via the --p argument).
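The rates listed above map onto the epidemiological quantities mentioned earlier (R0, incubation period, infectious time). The short Python sketch below assumes the usual relations for this model, R0 = λ/ψ, incubation period = 1/µ and infectious time = 1/ψ; the function name and the example rate values are hypothetical and only serve as an illustration, they are not part of PyBDEI.

```python
def derived_quantities(mu, la, psi):
    """Convert BDEI rates into epidemiological quantities.

    Assumed relations: R0 = la / psi, incubation period = 1 / mu,
    infectious time = 1 / psi.
    """
    if min(mu, la, psi) <= 0:
        raise ValueError("all rates must be positive")
    return {
        "R_naught": la / psi,
        "incubation_period": 1 / mu,
        "infectious_time": 1 / psi,
    }


# Hypothetical rate values, chosen only for illustration
example = derived_quantities(mu=0.5, la=1.0, psi=0.25)
print(example)
```

Times come out in the same units as the branch lengths of the input tree (for instance years, if the tree is dated in years).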
There are 4 alternative ways to run PyBDEI on your computer: with Docker, with Singularity, in Python 3 (Linux only), or via the command line (Linux only, requires installation with Python 3).
Once docker is installed, run the following command (here we assume that the sampling probability value is known and fixed to 0.3):
docker run -v <path_to_the_folder_containing_the_tree>:/data:rw -t evolbioinfo/bdei --nwk /data/<tree_file.nwk> --p 0.3 --CI_repetitions 100 --log <file_to_store_the_estimated_parameters.tab>
This will produce a file <file_to_store_the_estimated_parameters.tab> in the <path_to_the_folder_containing_the_tree> folder, containing a tab-separated table with the estimated parameter values and their CIs (it can be viewed with a text editor, Excel or LibreOffice Calc).
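Besides a spreadsheet program, the tab-separated output can also be loaded from a script. The sketch below uses only Python's standard csv module and makes no assumption about the column names PyBDEI writes; the helper name read_estimates and the placeholder table are hypothetical and exist only so that the example runs on its own.

```python
import csv
import os
import tempfile


def read_estimates(path):
    """Read a tab-separated table into a list of dicts, one per row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


# Placeholder table written only to make the example self-contained;
# the column names are hypothetical, not the ones PyBDEI writes.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "estimates.tab")
    with open(path, "w", newline="") as handle:
        handle.write("column_a\tcolumn_b\n1.0\t2.0\n")
    rows = read_estimates(path)

for row in rows:
    print(row)
```

With a real run, pass the path given to --log instead of the placeholder file. Values are returned as strings, so convert them with float() where needed.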
To see advanced options, run
docker run -t evolbioinfo/bdei -h
Once singularity is installed, run the following command (here we assume that the sampling probability value is known and fixed to 0.3):
singularity run docker://evolbioinfo/bdei --nwk <path/to/tree_file.nwk> --p 0.3 --CI_repetitions 100 --log <path/to/file_to_store_the_estimated_parameters.tab>
This will produce a file <path/to/file_to_store_the_estimated_parameters.tab>, containing a tab-separated table with the estimated parameter values and their CIs (it can be viewed with a text editor, Excel or LibreOffice Calc).
To see advanced options, run
singularity run docker://evolbioinfo/bdei -h
You need to install g++-10 and the NLopt C++ libraries:
sudo apt update --fix-missing
sudo apt install -y g++-10 libnlopt-cxx-dev
You can either install Python 3 system-wide:
sudo apt install -y python3 python3-pip python3-setuptools python3-distutils
or, alternatively, install Python 3 via conda (make sure that conda is installed first). Here we create a conda environment called pybdeienv:
conda create --name pybdeienv python=3
conda activate pybdeienv
pip3 install setuptools
pip3 install numpy
pip3 install pybdei
If you installed PyBDEI via conda, do not forget to first activate the dedicated environment (here named pybdeienv), e.g.
conda activate pybdeienv
To run PyBDEI (here we assume that the sampling probability value is known and fixed to 0.3):
bdei_infer --nwk <path/to/tree_file.nwk> --p 0.3 --CI_repetitions 100 --log <path/to/file_to_store_the_estimated_parameters.tab>
This will produce a file <path/to/file_to_store_the_estimated_parameters.tab>, containing a tab-separated table with the estimated parameter values and their CIs (it can be viewed with a text editor, Excel or LibreOffice Calc).
To see advanced options, run:
bdei_infer -h
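When the command-line tool is called from a script (for example over many trees), it can help to assemble the arguments in one place and to check that at least one parameter is fixed, as identifiability requires. The sketch below only builds the argument list from the flags shown above (--nwk, --mu, --la, --psi, --p, --CI_repetitions, --log); the helper name build_bdei_command and the file names are hypothetical.

```python
import shlex


def build_bdei_command(nwk, log, mu=None, la=None, psi=None, p=None,
                       ci_repetitions=None):
    """Assemble a bdei_infer call as an argument list."""
    fixed = {"--mu": mu, "--la": la, "--psi": psi, "--p": p}
    given = {flag: value for flag, value in fixed.items() if value is not None}
    if not given:
        raise ValueError("fix at least one of mu, la, psi or p (identifiability)")
    command = ["bdei_infer", "--nwk", str(nwk)]
    for flag, value in given.items():
        command += [flag, str(value)]
    if ci_repetitions is not None:
        command += ["--CI_repetitions", str(ci_repetitions)]
    command += ["--log", str(log)]
    return command


# Hypothetical file names; the sampling probability is fixed to 0.3
command = build_bdei_command("tree.nwk", "estimates.tab", p=0.3,
                             ci_repetitions=100)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where PyBDEI is installed; the sketch itself does not run the tool.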
from pybdei import infer
# Path to the tree in newick format
tree = "tree.nwk"
result, time = infer(nwk=tree, p=0.3, CI_repetitions=100)
print('Inferred transition rate is', result.mu, result.mu_CI)
print('Inferred transmission rate is', result.la, result.la_CI)
print('Inferred removal rate is', result.psi, result.psi_CI)
print('Inferred reproductive number is', result.R_naught)
print('Inferred incubation period is', result.incubation_period)
print('Inferred infectious time is', result.infectious_time)
print('Converged in', time.CPU_time, 's and', time.iterations, 'iterations')