Commit

Edits to tutorial
ykoga07 committed Aug 16, 2024
1 parent b68cce2 commit 4c1d843
Showing 1 changed file with 53 additions and 19 deletions.
@@ -5,31 +5,32 @@ date: "2024-07-19"
output: html_document
---

```{r setup, include=TRUE}
suppressPackageStartupMessages({
#Load library
library(dplyr)
library(Seurat)
library(patchwork)
library(celda)
library(singleCellTK)
library(knitr)
library(kableExtra)
library(ggplot2)
library(dendextend)
})
```{r setup, include=TRUE, message=FALSE}
#Load library
library(dplyr)
library(Seurat)
library(patchwork)
library(celda)
library(singleCellTK)
library(knitr)
library(kableExtra)
library(ggplot2)
library(dendextend)
knitr::opts_chunk$set(echo = TRUE)
source("/restricted/projectnb/camplab/projects/20240719_Codathon/Tutorial/findMarkersTree.R")
```

## Convert Seurat Object to a SingleCellExperiment (SCE) object (Optional)
## Applying the Seurat clustering algorithm (Optional)

While Celda can generate cell clusters itself, some users may prefer to import cluster labels generated by the popular Seurat pipeline (https://pubmed.ncbi.nlm.nih.gov/29608179/). The `convertSeuratToSCE` function from the singleCellTK package performs the object conversion.

```{r, warning=FALSE}
#Seurat pipeline (Taken from https://satijalab.org/seurat/articles/pbmc3k_tutorial):
##Read in 10X output, PBMC 3K
pbmc.data <- Read10X(data.dir = "./filtered_gene_bc_matrices/hg19/")
## Check seurat fxn if exists:
pbmc.data <- Read10X(data.dir = "/restricted/projectnb/camplab/projects/20240719_Codathon/Tutorial/filtered_gene_bc_matrices/hg19/")
##Create Seurat object
pbmc <- CreateSeuratObject(counts = pbmc.data, project = "pbmc3k", min.cells = 3, min.features = 200)
##Normalize data
@@ -45,6 +46,8 @@ pbmc <- FindNeighbors(pbmc, dims = 1:10, verbose = FALSE)
pbmc <- FindClusters(pbmc, resolution = 0.5, verbose = FALSE)
```

## Convert Seurat Object to a SingleCellExperiment (SCE) object (Optional)

```{r}
#Convert the Seurat object to a SingleCellExperiment object
pbmcSce <- convertSeuratToSCE(pbmc, normAssayName = "logcounts")
@@ -54,6 +57,12 @@ pbmcSce <- convertSeuratToSCE(pbmc, normAssayName = "logcounts")

Use the `recursiveSplitModule` function to generate Celda feature modules based on the cell clusters you have provided:

```{r}
useAssay <- "counts"
altExpName <- "featureSubset"
```


```{r}
#Convert Seurat cluster IDs, as Celda requires clusters be numeric vectors starting from 1 (Seurat cluster 0 = Celda cluster 1)
pbmcSce$seurat_clusters <- as.numeric(as.character(pbmcSce$seurat_clusters)) + 1
@@ -76,7 +85,7 @@ sce <- subsetCeldaList(rsmRes, list(L = 15))
```
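
The cluster ID conversion above goes through `as.character()` before `as.numeric()`. The base R sketch below shows why that matters. It uses a made-up factor in which cluster 1 is absent; the object names and values are illustrative only and are not taken from the PBMC data.

```r
# Toy stand-in for a zero-based Seurat cluster factor (illustrative values;
# label "1" is deliberately missing from the levels).
seuratClusters <- factor(c("0", "3", "2", "0", "3"), levels = c("0", "2", "3"))

# Converting through as.character() recovers the labels, then shifts them by 1.
viaLabels <- as.numeric(as.character(seuratClusters)) + 1

# as.numeric() applied directly to a factor returns the internal level codes,
# which no longer correspond to the original labels.
viaCodes <- as.numeric(seuratClusters)

print(viaLabels)  # 1 4 3 1 4
print(viaCodes)   # 1 3 2 1 3
```

When every cluster from 0 upward is present, the two approaches happen to agree, which is why the shortcut can go unnoticed until a cluster is dropped or the levels are reordered.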

```{r}
featureTable <- featureModuleTable(sce, useAssay = "counts", altExpName = "featureSubset")
featureTable <- featureModuleTable(sce, useAssay = useAssay, altExpName = altExpName)
kable(featureTable, style = "html", row.names = FALSE) %>%
kable_styling(bootstrap_options = "striped") %>%
@@ -90,22 +99,47 @@ For instance, we can see that each module represents a set of co-expressing feat
Differential expression tools can also be used on the gene modules in order to identify the modules that define each cell cluster. To do this, first run the `factorizeMatrix` function, which will generate a matrix that measures the contribution of each gene module to each cell population.

```{r DE modules, message=FALSE}
factorize <- factorizeMatrix(sce, useAssay = "counts", altExpName = "featureSubset")
factorize <- factorizeMatrix(sce, useAssay = useAssay, altExpName = altExpName)
### Take the module counts, log-normalize, then use for DE, violin plots, etc
factorizedCounts <- factorize$counts$cell
factorizedLogcounts <- normalizeCounts(factorizedCounts)
reducedDim(sce, "factorizeMatrix") <- t(factorizedLogcounts)
sce <- runFindMarker(sce, useReducedDim = "factorizeMatrix", cluster = "seurat_clusters", useAssay = NULL)
```
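
To make the shape handling in the chunk above concrete, here is a minimal base R sketch on a toy matrix. It assumes that `normalizeCounts` is run with its defaults (library-size factors centred to a mean of 1, log2 transform, pseudocount of 1), so it approximates that step rather than reproducing the function itself. All values and object names are illustrative.

```r
# Toy module-by-cell count matrix (illustrative values, not PBMC 3K output).
moduleCounts <- matrix(
  c(10, 0, 5,
     2, 8, 5),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("L1", "L2"), paste0("cell", 1:3))
)

# Library-size factors: per-cell totals scaled so that their mean is 1.
sizeFactors <- colSums(moduleCounts) / mean(colSums(moduleCounts))

# Divide each cell (column) by its size factor, then take log2(x + 1).
logCounts <- log2(sweep(moduleCounts, 2, sizeFactors, "/") + 1)

# reducedDim() expects cells in rows, hence the transpose.
cellsByModules <- t(logCounts)
print(dim(cellsByModules))
```

The transpose is the step that is easiest to miss: the factorized counts have modules in rows and cells in columns, while a reduced dimension stored in an SCE object has one row per cell.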

The differential expression results are contained within `sce@metadata$findMarker`. The `Gene` column denotes the differentially expressed module, whilst the `Log2_FC` column denotes the level of upregulation of the module in the cluster.
The differential expression results are contained within `sce@metadata$findMarker`. The `Gene` column denotes the differentially expressed module, whilst the `Log2_FC` column denotes the level of upregulation of the module in the cluster. These results can be accessed through the `getFindMarkerTopTable` function.

```{r, eval = FALSE}
getFindMarkerTopTable(sce)
```
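
One possible way to summarise such a table is to keep the module with the largest `Log2_FC` in each cluster. The sketch below does this in base R on a hand-made data frame. The `Gene` and `Log2_FC` columns follow the description above; the `seurat_clusters` column name and all values are assumptions made for illustration.

```r
# Hand-made stand-in for a marker table (illustrative values only).
markerTable <- data.frame(
  Gene = c("L3", "L7", "L8", "L2", "L7"),
  Log2_FC = c(1.2, 2.5, 0.8, 1.9, 0.4),
  seurat_clusters = c(1, 1, 2, 2, 3),  # assumed name of the cluster column
  stringsAsFactors = FALSE
)

# Sort by cluster, then by decreasing fold change within each cluster.
ordered <- markerTable[order(markerTable$seurat_clusters, -markerTable$Log2_FC), ]

# Keep the first (highest Log2_FC) row for each cluster.
topPerCluster <- ordered[!duplicated(ordered$seurat_clusters), ]
print(topPerCluster)
```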


```{r DE module table, message=FALSE}
```{r DE module table, message=FALSE, echo = FALSE}
kable(sce@metadata$findMarker, style = "html", row.names = FALSE) %>%
kable_styling(bootstrap_options = "striped") %>%
scroll_box(width = "100%", height = "500px")
```

### Module exploration

Celda provides several methods for visualizing module expression within the dataset.

#### UMAP

The function `plotDimReduceModule` can be used to visualize the probabilities of a particular module or set of modules on a reduced dimensional plot such as a UMAP. This is a quick way to see how modules are expressed across cells in 2-D space. As an example, we can look at modules L7 and L8:

```{r, fig.width = 5.5, fig.height = 4.5}
sce <- celdaUmap(sce, useAssay = useAssay, altExpName = altExpName)
plotDimReduceModule(sce, modules = 7:8, useAssay = useAssay, altExpName = altExpName, reducedDimName = "celda_UMAP")
```

#### Heatmap

The function `moduleHeatmap` can be used to view the expression of features across cells for a specific module.

```{r, fig.width = 5, fig.height = 8}
moduleHeatmap(sce, featureModule = 8, useAssay = useAssay, topFeatures = 25)
```

## Generate a Decision Tree
