Calculating polygenic scores

PGS Catalog is an open database of polygenic scores and the relevant metadata required for accurate application and evaluation.

A polygenic score (PGS) aggregates the effects of many genetic variants into a single number which predicts genetic predisposition for a phenotype.
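
As a minimal sketch of that aggregation, assuming the common additive model in which each variant contributes its effect weight multiplied by the number of effect alleles carried (0, 1, or 2). The function name, variant IDs, and weights below are made up for illustration and do not come from this repo or from any real score.

```python
def polygenic_score(weights, dosages):
    """Sum effect_weight * effect-allele dosage over the variants in a score.

    weights: dict mapping variant ID -> effect weight (hypothetical values)
    dosages: dict mapping variant ID -> copies of the effect allele (0, 1 or 2)
    Variants missing from `dosages` are treated as dosage 0, which is an
    assumption; real pipelines may impute or report them instead.
    """
    return sum(w * dosages.get(variant, 0) for variant, w in weights.items())


# Illustrative, made-up numbers
weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 1.0}
dosages = {"rs1": 2, "rs2": 1}  # rs3 was not genotyped
score = polygenic_score(weights, dosages)
print(score)  # 0.75
```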

This repo contains a set of notebooks and utility functions for calculating polygenic scores for a genome, starting from raw reads.

It includes notebooks on:

how to run sequence alignment
how to run variant calling
how to pick alternate contigs
how to annotate variants
how to filter variants [TODO]
how to calculate an individual's polygenic scores quickly
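
The scoring step depends on knowing how many copies of each effect allele the individual carries. A minimal sketch of reading that from one VCF data line, assuming a single-sample VCF whose FORMAT column contains a GT field. `effect_allele_dosage` is a hypothetical helper written for this illustration, not a function from this repo's utils, and the example line is made up.

```python
def effect_allele_dosage(vcf_line, effect_allele):
    """Count copies of `effect_allele` in a single-sample VCF data line.

    Returns None when the genotype is missing or when the effect allele is
    neither the REF nor one of the ALT alleles at this site.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    all_alleles = [ref] + alt.split(",")  # index 0 is REF, 1.. are ALTs

    format_keys = fields[8].split(":")
    sample_values = fields[9].split(":")
    genotype = sample_values[format_keys.index("GT")]

    # Phased (|) and unphased (/) genotypes are treated the same here
    allele_indices = genotype.replace("|", "/").split("/")
    if "." in allele_indices or effect_allele not in all_alleles:
        return None

    called = [all_alleles[int(i)] for i in allele_indices]
    return called.count(effect_allele)


# Made-up example line: heterozygous A/G
line = "1\t12345\trs1\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:30"
print(effect_allele_dosage(line, "G"))  # 1
```

Returning None rather than 0 for unusable sites keeps "not observed" distinct from "observed with zero copies", which matters when deciding how to treat missing variants in a score.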

PGS Catalog doesn't require an API key; anyone can query it. The only thing you need before starting is a raw human genome file (FASTQ). If you already have a VCF file, you can skip the first 2-4 notebooks.
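
A minimal sketch of such an unauthenticated query, using only the Python standard library. The base URL and the `/score/{id}` path reflect the PGS Catalog REST API as I understand it and should be checked against the official API documentation; the helper names are hypothetical and are not part of this repo.

```python
import json
import re
import urllib.request

# Assumed base URL of the PGS Catalog REST API; confirm in the official docs
PGS_REST_BASE = "https://www.pgscatalog.org/rest"


def score_url(pgs_id):
    """Build the metadata URL for one score, e.g. PGS000001."""
    if not re.fullmatch(r"PGS\d{6}", pgs_id):
        raise ValueError(f"unexpected PGS ID format: {pgs_id!r}")
    return f"{PGS_REST_BASE}/score/{pgs_id}"


def fetch_score_metadata(pgs_id, timeout=30):
    """Fetch score metadata as a dict. No API key or auth header is sent."""
    with urllib.request.urlopen(score_url(pgs_id), timeout=timeout) as response:
        return json.load(response)


# Usage (performs a network request, so it is left commented out):
# metadata = fetch_score_metadata("PGS000001")
# print(sorted(metadata))  # inspect which fields the API returns
```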

Disclaimer: Genome analysis is computationally heavy. Depending on the hardware, some steps here might take a whole day to run (sequence alignment is especially heavy).

TODO:

how to search the DNA for a specific variant
polygenic score interpretation. There are many different methods for calculating PGS, and the resulting scores vary a lot.

Installation

# create a virtual environment with your favorite venv tool
sudo apt-get install build-essential python3-dev libsnappy-dev # for pandas to_parquet and read_parquet
pip install -r requirements.txt
pre-commit install

Repository contents:

bin
utils
.gitignore
.pre-commit-config.yaml
1_how_to_run_genome_alignment.ipynb
2_how_to_run_variant_calling.ipynb
3_how_to_select_alt_contigs_to_use.ipynb
4_how_to_annotate_a_variant_calling_file.ipynb
5_how_to_filter_out_low_quality_variant.ipynb
6_how_to_calculate_individuals_all_pgs_catalog_polygenic_scores.ipynb
README.md
pyproject.toml
requirements.txt

siims/calculating_polygenic_scores
