- Supercomputing with Polygenic Risk Score Evaluation
- A workflow in Bash, R and Python designed to make it easy to process and produce polygenic risk scores (both whole genome and set scores) on ARCCA's Raven supercomputer
- For a tutorial, debugging information and other documentation, please refer to the wiki
- If used for the creation of results to be published (one can dream...), please acknowledge John Hubert [email protected] and Valentina Escott-Price [email protected]
- I'm always looking for people to help make this better. If you are familiar with git, please fork this repository.
- The code is released under a General Public License, so you are free to make changes of your own as long as you document them!
- If you are unfamiliar with git but still want to help out, please email John Hubert at [email protected] and we can sort something out!
- If you notice a bug or a fix that is required, please add it to the issues page or email [email protected]
**COMING SOON**
- SurPRSe! Analysis Viewing Environment - Polygenic Risk Scores
- Comparing whole genome polygenic risk scores to polygenic risk set scores is difficult because the 'ideal' significance threshold for a whole genome polygenic risk score is unlikely to be the same as that for a polygenic risk set score, and will likely change depending on which set you test.
- SAVE-PRS allows "real-time" comparison of set scores with whole genome polygenic risk scores in a Shiny app, so that results can be compared across all significance thresholds and interpreted more easily.
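
To illustrate why the comparison has to be made across thresholds, here is a minimal Python sketch that scores one individual at several p-value thresholds, once using every SNP and once using only the SNPs in a set. It is not SurPRSe's or SAVE-PRS's own code: the SNP IDs, effect sizes, dosages, gene set and function names are all hypothetical, and the score is assumed to be a simple sum of effect size times allele dosage.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical toy summary statistics: effect size (beta) and p-value per SNP.
SUMSTATS: Dict[str, Dict[str, float]] = {
    "rs1": {"beta": 0.20, "p": 0.001},
    "rs2": {"beta": -0.10, "p": 0.04},
    "rs3": {"beta": 0.05, "p": 0.30},
    "rs4": {"beta": 0.15, "p": 0.0005},
}

# Hypothetical effect-allele dosages (0, 1 or 2) for a single individual.
DOSAGES: Dict[str, int] = {"rs1": 2, "rs2": 1, "rs3": 0, "rs4": 1}

# Hypothetical set of SNPs (for example, SNPs mapped to one gene set).
GENE_SET: Set[str] = {"rs2", "rs4"}


def polygenic_score(
    sumstats: Dict[str, Dict[str, float]],
    dosages: Dict[str, int],
    threshold: float,
    snp_set: Optional[Set[str]] = None,
) -> float:
    """Sum beta * dosage over SNPs with p <= threshold.

    If snp_set is given, only SNPs in that set contribute (a set score);
    otherwise every SNP passing the threshold contributes (a whole genome score).
    """
    score = 0.0
    for snp, stats in sumstats.items():
        if stats["p"] > threshold:
            continue
        if snp_set is not None and snp not in snp_set:
            continue
        score += stats["beta"] * dosages.get(snp, 0)
    return score


def compare_across_thresholds(
    thresholds: Iterable[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, whole genome score, set score) for each threshold."""
    rows = []
    for t in thresholds:
        whole = polygenic_score(SUMSTATS, DOSAGES, t)
        subset = polygenic_score(SUMSTATS, DOSAGES, t, GENE_SET)
        rows.append((t, whole, subset))
    return rows


for t, whole, subset in compare_across_thresholds([0.001, 0.05, 0.5]):
    print(f"p <= {t}: whole genome = {whole:.3f}, set = {subset:.3f}")
```

Because the two scores draw on different SNPs at each threshold, the threshold that works best for one need not work best for the other, which is why viewing all thresholds side by side is useful.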
Location on rocks: /home/SHARED/PGC/daner_PGC_SCZ52_0513a.resultfiles_PGC_SCZ52_0513.sh2_noclo.txt
Summary stats
Location: https://www.med.unc.edu/pgc/results-and-downloads
Summary stats
Location: https://www.med.unc.edu/pgc/results-and-downloads
Summary stats
WARNING: the build is UCSC hg18, so the positions will likely need to be converted to hg19.
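
One common way to do such a conversion is the UCSC liftOver tool, which takes BED input. The sketch below only prepares that input; it is not part of SurPRSe, the function name and example SNPs are hypothetical, and it assumes chromosome codes are either UCSC-style (`chr1`) or PLINK-style numbers (23 for X, 24 for Y, 26 for MT).

```python
from typing import Union

# PLINK numeric codes for non-autosomal chromosomes, mapped to UCSC names.
PLINK_CODES = {"23": "X", "24": "Y", "26": "M"}


def to_bed_line(chrom: Union[str, int], pos: int, snp_id: str) -> str:
    """Format a 1-based SNP position as a BED line (0-based start, exclusive end)."""
    if pos < 1:
        raise ValueError("positions are expected to be 1-based")
    name = str(chrom)
    if name.startswith("chr"):
        name = name[3:]
    name = PLINK_CODES.get(name, name)
    # BED intervals are half-open, so a single base at `pos` is [pos - 1, pos).
    return f"chr{name}\t{pos - 1}\t{pos}\t{snp_id}"


# Hypothetical SNPs: (chromosome, hg18 position, identifier).
example_snps = [("1", 1000, "rsA"), (23, 2500, "rsB")]
for chrom, pos, snp_id in example_snps:
    print(to_bed_line(chrom, pos, snp_id))
```

The resulting file could then be passed to liftOver together with the hg18-to-hg19 chain file (`hg18ToHg19.over.chain.gz`), and the end coordinate of each lifted interval read back as the hg19 position.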
Location on Raven: /neurocluster/databank/CLOZUK/GWAS/BGE/*CLOZUK_GWAS_BGE*.tar.gz
Best guess genotype data
Location on Raven: /neurocluster/databank/CLOZUK/GWAS/SUMSTATS/CLOZUK_PGC2noclo.wCHRX.w1000Gfrq.METAL.assoc.dosage.gz
Summary stats
Best guess genotype data
Best guess genotype data
Edit the parallel commands to reduce the number of cores that SurPRSe uses for processing; this will reduce the errors produced by this command.