Hail is a framework for scalable genetic data analysis. It is pre-alpha software under active development. Hail is written mostly in Scala and uses Apache Spark and other projects from the Apache Hadoop ecosystem. If you are interested in getting involved in Hail development, email [email protected].
- Building
- Representation
- Hail's expression language
- Importing
- Splitting Multiallelic Variants
- Renaming Samples
- Annotating Samples or Variants
- Quality Control
- PCA
- Annotating with the Variant Effect Predictor
- Filtering
- Linear regression
- Mendel errors
- Exporting to TSV
- Exporting to VCF
- Exporting to PLINK
- Persist
Here is a rough list of features currently planned or under development:
- generalized query language
- better interoperability with other Hadoop projects
- kinship estimation from GRM
- LMM
- burden tests, SKAT
- logistic regression
- dosage
- posterior probabilities (PP)
- LD pruning
- sex check
- TDT
- BGEN
- Kaitlin Samocha's de novo caller