% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/getElites.R
\name{getElites}
\alias{getElites}
\title{Get elites for clustering}
\usage{
getElites(
dat = NULL,
surv.info = NULL,
method = "mad",
na.action = "rm",
doLog2 = FALSE,
p.cutoff = 0.05,
elite.pct = NULL,
elite.num = NULL,
scaleFlag = FALSE,
centerFlag = FALSE
)
}
\arguments{
\item{dat}{A data.frame of one omics data type; can be continuous or binary.}
\item{surv.info}{A data.frame with rownames of observations and with at least two columns of `futime` for survival time and `fustat` for survival status (0: censoring; 1: event).}
\item{method}{A string value to indicate the filtering method for selecting elites. Allowed values are c('mad', 'sd', 'cox', 'freq'). 'mad' means median absolute deviation, 'sd' means standard deviation, 'cox' means univariate Cox proportional hazards regression, which also requires surv.info, and 'freq' only works for binary data.}
\item{na.action}{A string value to indicate the action for handling NA missing values. Allowed values are c('rm', 'impute'). 'rm' means removal of all features containing any missing values, 'impute' means imputation of missing values by k-nearest neighbors.}
\item{doLog2}{A logical value to indicate whether to perform log2 transformation of the data before calculating statistics (e.g., sd, mad and cox). FALSE by default.}
\item{p.cutoff}{A numeric cutoff for nominal p value derived from univariate Cox proportional hazards regression; 0.05 by default.}
\item{elite.pct}{A numeric cutoff of percentage for selecting elites. NOTE: elite.pct works for all methods except for 'cox', but two scenarios exist. 1) when using method of 'mad' or 'sd', features will be sorted in descending order by mad or sd, and the top elite.pct * feature size of elites (features) will be selected; 2) when using method of 'freq' for binary data, the frequency of value 1 will be calculated for each feature, and features that have a value of 1 in more than elite.pct * sample size samples will be considered elites. This argument will be discarded if elite.num is provided simultaneously. Setting this argument to 1 and leaving elite.num NULL will return all the features as elites after dealing with NA values.}
\item{elite.num}{An integer cutoff of exact number for selecting elites. NOTE: elite.num works for all methods except for 'cox', but two scenarios exist. 1) when using method of 'mad' or 'sd', features will be sorted in descending order by mad or sd, and the top elite.num elites (features) will be selected; 2) when using method of 'freq' for binary data, the frequency of value 1 will be calculated for each feature, and features that have a value of 1 in more than elite.num samples will be considered elites.}
\item{scaleFlag}{A logical value to indicate whether to scale the data after filtering. FALSE by default.}
\item{centerFlag}{A logical value to indicate whether to center the data after filtering. FALSE by default.}
}
\value{
A list containing the following components:
\code{elite.dat} a data.frame containing data for selected elites (features).
\code{unicox.res} a data.frame containing results for univariate Cox proportional hazards regression if \code{method == 'cox'}.
}
\description{
This function provides several methods to help select elites from input features, with the aim of reducing data dimensionality for multi-omics integrative clustering analysis.
}
\examples{
# Please refer to the package vignette for complete worked examples.
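# A minimal sketch on simulated data, not taken from the package itself.
# Assumptions: the toy data frame and its gene/sample names are hypothetical,
# and features are assumed to be in rows with samples in columns.
\dontrun{
set.seed(1)
toy <- as.data.frame(matrix(rnorm(200), nrow = 20,
                            dimnames = list(paste0("gene", 1:20),
                                            paste0("sample", 1:10))))
# Keep the 5 features with the largest median absolute deviation.
elite <- getElites(dat = toy, method = "mad", elite.num = 5)
# The filtered data are returned in the elite.dat component.
dim(elite$elite.dat)
}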
}