% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/check_no_allele.R
\name{check_no_allele}
\alias{check_no_allele}
\title{Ensure that A1 & A2 are present; if one is missing, it can be found using the SNP and the other allele}
\usage{
check_no_allele(
sumstats_dt,
path,
ref_genome,
rsids,
imputation_ind,
allele_flip_check,
log_folder_ind,
check_save_out,
tabix_index,
nThread,
log_files,
bi_allelic_filter,
dbSNP
)
}
\arguments{
\item{path}{Filepath for the summary statistics file to be formatted. A
dataframe or datatable of the summary statistics file can also be passed
directly to MungeSumstats using the path parameter.}
\item{ref_genome}{name of the reference genome used for the GWAS ("GRCh37" or
"GRCh38"). Argument is case-insensitive. Default is NULL which infers the
reference genome from the data.}
\item{imputation_ind}{Binary. Should a column be added for each imputation
step to show which SNPs have imputed values for differing fields? This
includes a field denoting SNP allele flipping (flipped). The flipped
value denotes whether the alleles were switched based on
MungeSumstats' initial choice of A1, A2 from the input column headers, and thus
may not align with what the creator intended. \strong{Note} these columns will be
in the formatted summary statistics returned. Default is FALSE.}
\item{allele_flip_check}{Binary. Should the allele columns be checked against
the reference genome to infer whether flipping is necessary? Default is TRUE.}
\item{log_folder_ind}{Binary. Should log files be stored containing all
filtered-out SNPs (separate file per filter)? The data is output in the
same format specified for the resulting sumstats file. The only exception to
this rule is if the output is vcf, in which case the log file is saved as
.tsv.gz. Default is FALSE.}
\item{tabix_index}{Index the formatted summary statistics with
\href{http://www.htslib.org/doc/tabix.html}{tabix} for fast querying.}
\item{nThread}{Number of threads to use for parallel processes.}
\item{log_files}{list of log file locations}
\item{bi_allelic_filter}{Binary. Should non-biallelic SNPs be removed? Default
is TRUE.}
\item{dbSNP}{version of dbSNP to be used for imputation (144 or 155).}
}
\value{
A list containing the following elements:
\itemize{
\item \code{sumstats_dt}: the modified summary statistics data table object
\item \code{rsids}: snpsById, filtered to SNPs of interest
if loaded already, otherwise NULL.
\item \code{allele_flip_check}: does the dataset require allele flip check
\item \code{log_files}: log file list
\item \code{bi_allelic_filter}: should multi-allelic SNPs be filtered out
}
}
\description{
More care needs to be taken if only one of A1/A2 is present: before imputing
the other allele, allele flipping needs to be checked.
}
\keyword{internal}