man/calcPhenotype.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/pRRophetic.R
\name{calcPhenotype}
\alias{calcPhenotype}
\title{Calculates phenotype from microarray data.}
\usage{
calcPhenotype(
  trainingExprData,
  trainingPtype,
  testExprData,
  batchCorrect = "eb",
  powerTransformPhenotype = TRUE,
  removeLowVaryingGenes = 0.2,
  minNumSamples = 10,
  selection = -1,
  printOutput = TRUE,
  removeLowVaringGenesFrom = "homogenizeData"
)
}
\arguments{
\item{trainingExprData}{The training data. A matrix of expression levels in which rows contain genes and columns contain samples. "rownames()" must be specified and must contain the same type of gene ids as "testExprData".}

\item{trainingPtype}{The known phenotype for "trainingExprData". A numeric vector which MUST be the same length as the number of columns of "trainingExprData".}

\item{testExprData}{The test data for which the phenotype will be estimated. A matrix of expression levels in which rows contain genes and columns contain samples. "rownames()" must be specified and must contain the same type of gene ids as "trainingExprData".}

\item{batchCorrect}{How should the training and test data matrices be homogenized? Choices are "eb" (default) for ComBat, "qn" for quantile normalization, or "none" for no homogenization.}

\item{powerTransformPhenotype}{Should the phenotype be power transformed before the regression model is fitted? Defaults to TRUE; set to FALSE if the phenotype is already known to be approximately normally distributed.}

\item{removeLowVaryingGenes}{What proportion of low-varying genes should be removed? Defaults to 0.2 (20 percent).}

\item{minNumSamples}{The minimum number of training and test samples required. An error is raised if the number of samples is below this threshold.}

\item{selection}{How should duplicate gene ids be handled? The default is -1, which asks the user. Use 1 to summarize duplicates or 2 to discard all duplicates.}

\item{printOutput}{Set to FALSE to suppress output.}

\item{removeLowVaringGenesFrom}{Which data the low-varying genes should be removed from. The default is "homogenizeData".}
}
\value{
A vector of the estimated phenotype, in the same order as the columns of "testExprData".
}
\description{
This function uses ridge regression to calculate a phenotype from a gene expression matrix,
given a training gene expression matrix for which the phenotype is already known. The function
also integrates the two datasets using a user-selected procedure, power transforms
the known phenotype, and provides several other options for flexible and powerful prediction
from a gene expression matrix.
}
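\examples{
# A minimal usage sketch on simulated data. It is not taken from the package
# sources: the gene ids, sample names, and simulated values are illustrative
# assumptions, and the call below has not been checked against a particular
# package version, so it is not run automatically.
set.seed(1)
nGenes <- 200
geneIds <- paste0("gene", seq_len(nGenes))

# Rows are genes and columns are samples, with matching gene ids as rownames.
trainingExprData <- matrix(rnorm(nGenes * 30), nrow = nGenes,
                           dimnames = list(geneIds, paste0("train", 1:30)))
testExprData <- matrix(rnorm(nGenes * 15), nrow = nGenes,
                       dimnames = list(geneIds, paste0("test", 1:15)))

# One known phenotype value per training sample. The values are kept positive
# as a precaution, on the assumption that the power transform expects them.
trainingPtype <- exp(rnorm(ncol(trainingExprData)))

\dontrun{
predictedPtype <- calcPhenotype(
  trainingExprData = trainingExprData,
  trainingPtype = trainingPtype,
  testExprData = testExprData,
  selection = 1,        # avoid the interactive prompt for duplicate gene ids
  printOutput = FALSE
)

# The result should hold one estimate per column of testExprData.
length(predictedPtype) == ncol(testExprData)
}
}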
\author{
Paul Geeleher, Nancy Cox, R. Stephanie Huang
}
\keyword{internal}