Np add cell bender to multiome as optional task (broadinstitute#1125)
* add cellbender as optional task

* add cellbender to Multiome.wdl

* add cellbender to Multiome.wdl

* add some optional inputs

* add some optional inputs

* hard code h5ad for khalid comparison

* change sample name

* remove hard coded h5ad file

* make cellbender false

* changelog

* changelogs

* Updated WARP docs for CellBender task, inputs, and outputs on multiome overview

* Update README.md

* Update website/docs/Pipelines/Multiome_Pipeline/README.md

Co-authored-by: Kaylee Mathews <[email protected]>

* Update pipelines/skylab/multiome/Multiome.wdl

---------

Co-authored-by: ekiernan <[email protected]>
Co-authored-by: kayleemathews <[email protected]>
Co-authored-by: Kaylee Mathews <[email protected]>
4 people authored Nov 21, 2023
1 parent 4169831 commit f660755
Showing 5 changed files with 58 additions and 5 deletions.
4 changes: 4 additions & 0 deletions pipelines/skylab/multiome/Multiome.changelog.md
@@ -1,3 +1,7 @@
# 2.3.2
2023-11-20 (Date of Last Commit)
* Added an optional task to the Multiome.wdl that will run CellBender on the Optimus output h5ad file

# 2.3.1
2023-11-20 (Date of Last Commit)

36 changes: 35 additions & 1 deletion pipelines/skylab/multiome/Multiome.wdl
@@ -3,8 +3,10 @@ version 1.0
import "../../../pipelines/skylab/multiome/atac.wdl" as atac
import "../../../pipelines/skylab/optimus/Optimus.wdl" as optimus
import "../../../tasks/skylab/H5adUtils.wdl" as H5adUtils
import "https://raw.githubusercontent.com/broadinstitute/CellBender/v0.3.1/wdl/cellbender_remove_background.wdl" as CellBender

workflow Multiome {
String pipeline_version = "2.3.1"
String pipeline_version = "2.3.2"

input {
String input_id
@@ -40,6 +42,9 @@ workflow Multiome {
# Whitelist
File atac_whitelist = "gs://gcp-public-data--broad-references/RNA/resources/arc-v1/737K-arc-v1_atac.txt"

# CellBender
Boolean run_cellbender = false

}

# Call the Optimus workflow
@@ -85,6 +90,25 @@ workflow Multiome {
gex_whitelist = gex_whitelist,
atac_whitelist = atac_whitelist
}

# Call CellBender
if (run_cellbender) {
call CellBender.run_cellbender_remove_background_gpu as CellBender {
input:
sample_name = input_id,
input_file_unfiltered = Optimus.h5ad_output_file,
hardware_boot_disk_size_GB = 20,
hardware_cpu_count = 4,
hardware_disk_size_GB = 50,
hardware_gpu_type = "nvidia-tesla-t4",
hardware_memory_GB = 32,
hardware_preemptible_tries = 2,
hardware_zones = "us-central1-a us-central1-c",
nvidia_driver_version = "470.82.01"

}
}

meta {
allowNestedInputs: true
}
@@ -108,5 +132,15 @@ workflow Multiome {
File gene_metrics_gex = Optimus.gene_metrics
File? cell_calls_gex = Optimus.cell_calls
File h5ad_output_file_gex = JoinBarcodes.gex_h5ad_file

# cellbender outputs
File? cell_barcodes_csv = CellBender.cell_csv
File? checkpoint_file = CellBender.ckpt_file
Array[File]? h5_array = CellBender.h5_array
Array[File]? html_report_array = CellBender.report_array
File? log = CellBender.log
Array[File]? metrics_csv_array = CellBender.metrics_array
String? output_directory = CellBender.output_dir
File? summary_pdf = CellBender.pdf
}
}
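
The WDL change above follows a common pattern: a call placed inside an `if` block produces outputs that the workflow can only expose as optional types (`File?`, `Array[File]?`). The sketch below shows that pattern in isolation. The task `Denoise`, the workflow `OptionalStep`, the input `run_denoise`, and the `ubuntu:22.04` image are hypothetical stand-ins chosen for illustration; none of them are part of WARP or CellBender.

```wdl
version 1.0

# Hypothetical task standing in for an optional processing step.
task Denoise {
  input {
    File counts
  }
  command <<<
    # Placeholder work: copy the input to a new file name.
    cp ~{counts} denoised.h5
  >>>
  output {
    File denoised = "denoised.h5"
  }
  runtime {
    docker: "ubuntu:22.04"  # assumed image, for illustration only
  }
}

workflow OptionalStep {
  input {
    File counts
    # Off by default, like run_cellbender in the diff above.
    Boolean run_denoise = false
  }

  # The call only runs when the flag is true.
  if (run_denoise) {
    call Denoise { input: counts = counts }
  }

  output {
    # Outputs of a call inside an if block are optional outside of it.
    File? denoised = Denoise.denoised
    # Fall back to the original input when the step was skipped.
    File final_counts = select_first([Denoise.denoised, counts])
  }
}
```

With `run_denoise` left at `false`, `denoised` is undefined and `final_counts` falls back to the original input. This mirrors how the CellBender outputs of Multiome are only produced when `run_cellbender` is `true`.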
@@ -19,5 +19,6 @@
"Multiome.ref_genome_fasta":"gs://gcp-public-data--broad-references/hg38/v0/GRCh38.primary_assembly.genome.fa",
"Multiome.tar_bwa_reference":"gs://gcp-public-data--broad-references/hg38/v0/bwa/v2_2_1/bwa-mem2-2.2.1-Human-GENCODE-build-GRCh38.tar",
"Multiome.tar_star_reference":"gs://gcp-public-data--broad-references/hg38/v0/star/v2_7_10a/modified_star2.7.10a-Human-GENCODE-build-GRCh38-43.tar",
"Multiome.chrom_sizes":"gs://broad-gotc-test-storage/Multiome/input/hg38.chrom.sizes"
"Multiome.chrom_sizes":"gs://broad-gotc-test-storage/Multiome/input/hg38.chrom.sizes",
"Multiome.run_cellbender":"false"
}
4 changes: 3 additions & 1 deletion verification/test-wdls/TestMultiome.wdl
@@ -50,6 +50,7 @@ workflow TestMultiome {
Boolean update_truth
String vault_token_path
String google_account_vault_path
Boolean run_cellbender

}

@@ -82,7 +83,8 @@ workflow TestMultiome {
adapter_seq_read1 = adapter_seq_read1,
adapter_seq_read3 = adapter_seq_read3,
chrom_sizes = chrom_sizes,
atac_whitelist = atac_whitelist
atac_whitelist = atac_whitelist,
run_cellbender = run_cellbender

}

16 changes: 14 additions & 2 deletions website/docs/Pipelines/Multiome_Pipeline/README.md
@@ -7,7 +7,7 @@ slug: /Pipelines/Multiome_Pipeline/README

| Pipeline Version | Date Updated | Documentation Author | Questions or Feedback |
| :----: | :---: | :----: | :--------------: |
| [Multiome v2.3.0](https://github.com/broadinstitute/warp/releases) | November, 2023 | Kaylee Mathews | Please file GitHub issues in warp or contact the [WARP Pipeline Development team](mailto:[email protected]) |
| [Multiome v2.3.1](https://github.com/broadinstitute/warp/releases) | November, 2023 | Kaylee Mathews | Please file GitHub issues in warp or contact the [WARP Pipeline Development team](mailto:[email protected]) |

![Multiome_diagram](./multiome_diagram.png)

@@ -78,6 +78,7 @@ Multiome can be deployed using [Cromwell](https://cromwell.readthedocs.io/en/sta
| adapter_seq_read1 | Optional string describing the adapter sequence for ATAC read 1 paired-end reads to be used during adapter trimming with Cutadapt; default is "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG". | String |
| adapter_seq_read3 | Optional string describing the adapter sequence for ATAC read 2 paired-end reads to be used during adapter trimming with Cutadapt; default is "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG". | String |
| atac_whitelist | Optional file containing the list of valid barcodes for 10x multiome ATAC adata; default is "gs://gcp-public-data--broad-references/RNA/resources/arc-v1/737K-arc-v1_atac.txt". | File |
| run_cellbender | Optional boolean used to determine if the Optimus (GEX) pipeline should run CellBender on the output gene expression h5ad file, `h5ad_output_file_gex`; default is "false". | Boolean |

#### Sample inputs for analyses in a Terra Workspace

@@ -86,13 +87,14 @@ The Multiome pipeline is currently available on the cloud-based platform Terra.

## Tasks

The Multiome workflow calls two subworkflows, which are described briefly in the table below. For more details on each subworkflow, including the tasks that they call, see the documentation linked in the table.
The Multiome workflow calls two WARP subworkflows, one external subworkflow (optional), and an additional task, which are described briefly in the table below. For more details on each subworkflow and task, see the documentation and WDL scripts linked in the table.

| Subworkflow | Software | Description |
| ----------- | -------- | ----------- |
| ATAC ([WDL](https://github.com/broadinstitute/warp/blob/develop/pipelines/skylab/multiome/atac.wdl) and [documentation](../ATAC/README)) | fastqprocess, bwa-mem, SnapATAC2 | Workflow used to analyze 10x single-cell ATAC data. |
| Optimus ([WDL](https://github.com/broadinstitute/warp/blob/develop/pipelines/skylab/optimus/Optimus.wdl) and [documentation](../Optimus_Pipeline/README)) | fastqprocess, STARsolo, Emptydrops | Workflow used to analyze 10x single-cell GEX data. |
| JoinMultiomeBarcodes as JoinBarcodes ([WDL](https://github.com/broadinstitute/warp/blob/develop/tasks/skylab/H5adUtils.wdl)) | Python3 | Task that adds an extra column to the Optimus metrics `h5ad.obs` property that lists the respective ATAC barcodes for each gene expression barcode. It also adds an extra column to the ATAC metrics `h5ad.obs` property to link ATAC barcodes to gene expression barcodes. |
| CellBender.run_cellbender_remove_background_gpu as CellBender ([WDL](https://raw.githubusercontent.com/broadinstitute/CellBender/v0.3.1/wdl/cellbender_remove_background.wdl))| CellBender | Optional task that runs the `cellbender_remove_background.wdl` WDL script directly from the [CellBender GitHub repository](https://github.com/broadinstitute/CellBender/tree/master), depending on whether the input `run_cellbender` is "true" or "false". |

## Outputs

@@ -111,6 +113,16 @@ The Multiome workflow calls two subworkflows, which are described briefly in the
| gene_metrics_gex | `<input_id>_gex.gene_metrics.csv.gz` | CSV file containing the per-gene metrics. |
| cell_calls_gex | `<input_id>_gex.emptyDrops` | TSV file containing the EmptyDrops results when the Optimus workflow is run in sc_rna mode. |
| h5ad_output_file_gex | `<input_id>_gex.h5ad` | h5ad (Anndata) file containing the raw cell-by-gene count matrix, gene metrics, cell metrics, and global attributes. Also contains equivalent ATAC barcode for each gene expression barcode in the `atac_barcodes` column of the `h5ad.obs` property. See the [Optimus Count Matrix Overview](../Optimus_Pipeline/Loom_schema.md) for more details. |
| cell_barcodes_csv | `<cell_csv>` | Optional output produced when `run_cellbender` is "true"; see CellBender [documentation](https://cellbender.readthedocs.io/en/latest/usage/index.html) and [WDL](https://raw.githubusercontent.com/broadinstitute/CellBender/v0.3.1/wdl/cellbender_remove_background.wdl) for more information. |
| checkpoint_file | `<ckpt_file>` | Optional output produced when `run_cellbender` is "true"; see CellBender [documentation](https://cellbender.readthedocs.io/en/latest/usage/index.html) and [WDL](https://raw.githubusercontent.com/broadinstitute/CellBender/v0.3.1/wdl/cellbender_remove_background.wdl) for more information. |
| h5_array | `<h5_array>` | Optional output produced when `run_cellbender` is "true"; see CellBender [documentation](https://cellbender.readthedocs.io/en/latest/usage/index.html) and [WDL](https://raw.githubusercontent.com/broadinstitute/CellBender/v0.3.1/wdl/cellbender_remove_background.wdl) for more information. |
| html_report_array | `<report_array>` | Optional output produced when `run_cellbender` is "true"; see CellBender [documentation](https://cellbender.readthedocs.io/en/latest/usage/index.html) and [WDL](https://raw.githubusercontent.com/broadinstitute/CellBender/v0.3.1/wdl/cellbender_remove_background.wdl) for more information. |
| log | `<log>` | Optional output produced when `run_cellbender` is "true"; see CellBender [documentation](https://cellbender.readthedocs.io/en/latest/usage/index.html) and [WDL](https://raw.githubusercontent.com/broadinstitute/CellBender/v0.3.1/wdl/cellbender_remove_background.wdl) for more information. |
| metrics_csv_array | `<metrics_array>` | Optional output produced when `run_cellbender` is "true"; see CellBender [documentation](https://cellbender.readthedocs.io/en/latest/usage/index.html) and [WDL](https://raw.githubusercontent.com/broadinstitute/CellBender/v0.3.1/wdl/cellbender_remove_background.wdl) for more information. |
| output_directory | `<output_dir>` | Optional output produced when `run_cellbender` is "true"; see CellBender [documentation](https://cellbender.readthedocs.io/en/latest/usage/index.html) and [WDL](https://raw.githubusercontent.com/broadinstitute/CellBender/v0.3.1/wdl/cellbender_remove_background.wdl) for more information. |
| summary_pdf | `<pdf>` | Optional output produced when `run_cellbender` is "true"; see CellBender [documentation](https://cellbender.readthedocs.io/en/latest/usage/index.html) and [WDL](https://raw.githubusercontent.com/broadinstitute/CellBender/v0.3.1/wdl/cellbender_remove_background.wdl) for more information. |
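
Because the CellBender outputs listed above are optional, anything that consumes them has to handle the case where `run_cellbender` was "false". The following is a minimal sketch of one way to do that in WDL 1.0. It assumes a hypothetical downstream workflow named `ConsumeCellBenderOutputs` that receives two of the outputs as its own optional inputs; the workflow name and its outputs are illustrative and do not exist in WARP.

```wdl
version 1.0

# Hypothetical downstream workflow; not part of WARP.
workflow ConsumeCellBenderOutputs {
  input {
    # Stand-ins for the Multiome outputs h5_array and summary_pdf.
    Array[File]? h5_array
    File? summary_pdf
  }

  # Treat a missing array as empty so later steps can iterate over it safely.
  Array[File] h5_files = select_first([h5_array, []])

  # defined() reports whether the optional value was actually produced.
  Boolean cellbender_ran = defined(summary_pdf)

  output {
    Int n_h5_files = length(h5_files)
    Boolean ran_cellbender = cellbender_ran
  }
}
```

When neither input is supplied, `n_h5_files` is 0 and `ran_cellbender` is false, so the same downstream logic works whether or not CellBender was run.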



## Versioning and testing
