NIPT-PG: Empowering Non-Invasive Prenatal Testing to learn from population genomics through an incremental pan-genomic approach
First, create the NIPT-PG conda environment and activate it:
conda create -n NIPT-PG python
conda activate NIPT-PG
Then, inside the NIPT-PG environment, install the following packages:
pip install pandas numpy tqdm argparse
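
To confirm that the environment is ready before running the scripts, a small check along these lines can help. It is only a sketch: `missing_packages` is a hypothetical helper and not part of NIPT-PG, and the package list simply mirrors the `pip install` line above.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Package names taken from the pip install line; adjust if yours differ.
    required = ["pandas", "numpy", "tqdm", "argparse"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it with the NIPT-PG environment activated, so that it inspects that environment and not the system Python.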
python3 gen_pgg.py [-r REF.FA_FILE] [-s SAM_PATH] [-n NIPT_FILE]
• -r path to the reference genome file (such as GRCh38.fa)
• -s path to the folder containing the files to be tested
• -n path to the nipt_files.csv
python3 gen_pgg.py -r data/ref.fa -s data/sam/ -n data/nipt_files_ART-Random.csv
Table 1 shows the layout of the nipt_files.csv file, which maps the name of each file to be tested to a standardized sample name. Keeping this mapping in a single file means every step of the pipeline refers to the samples by the same names.
id | nipt_files | mapping |
---|---|---|
0 | CL100050702_L02_91 | sample_0 |
1 | CL100025607_L02_22 | sample_1 |
2 | CL100035831_L01_15 | sample_2 |
... | ... | ... |
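
The mapping in Table 1 can be read into a dictionary as sketched below. This is an illustration and not code from NIPT-PG: it assumes the file is comma-separated with the header names `id`, `nipt_files`, and `mapping` shown in the table, and `load_mapping` is a hypothetical helper name. An in-memory string stands in for the real file.

```python
import csv
import io


def load_mapping(handle):
    """Read nipt_files.csv rows and return {original file name: mapped name}."""
    reader = csv.DictReader(handle)
    return {row["nipt_files"].strip(): row["mapping"].strip() for row in reader}


# Inline stand-in for data/nipt_files_ART-Random.csv (assumed comma-separated).
example_csv = io.StringIO(
    "id,nipt_files,mapping\n"
    "0,CL100050702_L02_91,sample_0\n"
    "1,CL100025607_L02_22,sample_1\n"
    "2,CL100035831_L01_15,sample_2\n"
)

mapping = load_mapping(example_csv)
print(mapping["CL100025607_L02_22"])  # sample_1
```

To use a real file, pass `open(path, newline="")` to `load_mapping` in place of the `io.StringIO` object.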
python3 map2pgg.py [-p PGG_FILE] [-s SAM_PATH] [-n NIPT_FILE] [-k K_MER]
• -p path to the pan-genome file
• -s path to the folder containing the files to be tested
• -n path to the nipt_files.csv
• -k k-mer length, default=5
python3 map2pgg.py -p data/pgg.json -s data/sam/ -n data/nipt_files_ART-Random.csv -k 5
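
The `-k` option sets the k-mer length. As a reminder of what a k-mer decomposition looks like, the sketch below lists the overlapping k-mers of a short read with the default `k = 5`. It is a generic illustration and makes no claim about how `map2pgg.py` uses k-mers internally; the function name `kmers` and the example read are made up.

```python
def kmers(sequence, k=5):
    """Return all overlapping substrings of length k, in order of position."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


read = "ACGTACGTA"  # made-up 9-base read
print(kmers(read))
# ['ACGTA', 'CGTAC', 'GTACG', 'TACGT', 'ACGTA']
```

A read of length n yields n - k + 1 k-mers, so a larger `k` gives fewer but more specific k-mers.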
python3 aneup_det.py [-s SAM_PATH] [-g ALIGNED_SAM_PATH] [-n NIPT_FILE] [-l LEFT_THRESHOLD] [-r RIGHT_THRESHOLD] [-c CONTROL_SAMPLE]
• -s path to the folder containing the files to be tested
• -g path to the folder containing realigned samples
• -n path to the nipt_files.csv
• -l left threshold of z-score (default = -3)
• -r right threshold of z-score (default = 3)
python3 aneup_det.py -s data/sam/ -g data/aligned_sam/ -n data/nipt_files_ART-Random.csv -l -3 -r 3
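
The `-l` and `-r` options bound the z-score range treated as normal. The sketch below shows the general idea of a z-score test against control values with the default thresholds of -3 and 3. It is not the implementation in `aneup_det.py`: the function names, the control values, the test value, and the choice to treat a z-score exactly equal to a threshold as normal are all assumptions made for illustration.

```python
from statistics import mean, stdev


def z_score(value, control_values):
    """Standardize value against the mean and sample std of the controls."""
    mu = mean(control_values)
    sigma = stdev(control_values)
    if sigma == 0:
        raise ValueError("control values have zero variance")
    return (value - mu) / sigma


def classify(z, left=-3.0, right=3.0):
    """Compare a z-score with the left and right thresholds."""
    if z < left:
        return "below left threshold"
    if z > right:
        return "above right threshold"
    return "within thresholds"


# Made-up read fractions for one chromosome in six control samples.
control = [0.0130, 0.0131, 0.0129, 0.0130, 0.0132, 0.0128]
test_value = 0.0140  # made-up value for the sample under test

z = z_score(test_value, control)
print(round(z, 2), classify(z))
```

With these made-up numbers the test value lies well above the control mean, so the sample is reported as above the right threshold.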