Introduction

Author	Date	Purpose
Melanie Weilert and Kaelan Brennan	June 2022	Give setup directions for Zelda paper

Here, we give setup instructions for anyone interested in reproducing the analysis from the Zelda paper. The steps are:

Environment setup (Anaconda3)
Process sequencing data (Snakemake)
Analysis (Python and R)

Conda (Anaconda3) environment setup

Because BPNet and ChromBPNet use TensorFlow 1 and TensorFlow 2, respectively, we need separate conda environments and switch between them as needed. We HIGHLY recommend using conda=4.7.12 when reproducing this code, since BPNet depends on older software and newer versions of conda are not compatible with the setup instructions below.
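
Before creating either environment, it can help to confirm which conda version is active. The following is a minimal sketch and is not part of the original repository: the helper name parse_conda_version is hypothetical, and it assumes that `conda --version` prints a line of the form `conda 4.7.12`.

```shell
# Recommended conda version, taken from the paragraph above.
expected_conda="4.7.12"

# Extract the version number from `conda --version` output (e.g. "conda 4.7.12").
parse_conda_version() {
  printf '%s\n' "$1" | awk '{print $2}'
}

if command -v conda >/dev/null 2>&1; then
  found_conda="$(parse_conda_version "$(conda --version 2>&1)")"
  if [ "$found_conda" = "$expected_conda" ]; then
    echo "conda $found_conda matches the recommended version"
  else
    echo "warning: found conda $found_conda, expected $expected_conda" >&2
  fi
else
  echo "conda not found on PATH" >&2
fi
```

If the versions differ, `conda install -n base conda=4.7.12` is one way to pin the base installation, although whether the downgrade resolves cleanly depends on the existing setup.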

Setup for BPNet environment

The BPNet conda environment can be installed using the instructions found here: [https://github.com/kundajelab/bpnet]. There are two environments: one with and one without GPU support. If you choose to install the GPU-compatible BPNet environment on an NVIDIA GPU (we trained on an NVIDIA® TITAN RTX GPU), then you will need the matching versions of CUDA and cuDNN:

CUDA v9.0
cuDNN v7.0.5

Setup for ChromBPNet environment

The ChromBPNet conda environment can be installed using the instructions found here: [https://github.com/kundajelab/chrombpnet/tree/pre-release]. If you choose to install the GPU-compatible ChromBPNet environment on an NVIDIA GPU (we trained on an NVIDIA® TITAN RTX GPU), then you will need the matching versions of CUDA and cuDNN:

CUDA v11.0
cuDNN v8.3.0
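
The CUDA releases listed for the two environments can be compared against what is installed. The sketch below is an illustration rather than part of the repository: the function names are hypothetical, and it assumes the CUDA toolkit's `nvcc` is on PATH, which may not be the case if CUDA was installed only through conda. It does not check cuDNN.

```shell
# Pull the release number out of `nvcc --version` output, which contains a
# line such as "Cuda compilation tools, release 11.0, V11.0.194".
parse_cuda_release() {
  printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Compare the installed CUDA toolkit release with an expected one.
check_cuda_release() {
  expected_cuda="$1"
  if ! command -v nvcc >/dev/null 2>&1; then
    echo "nvcc not found on PATH; cannot check CUDA release" >&2
    return 1
  fi
  found_cuda="$(parse_cuda_release "$(nvcc --version)")"
  if [ "$found_cuda" = "$expected_cuda" ]; then
    echo "CUDA release $found_cuda matches"
  else
    echo "CUDA release is $found_cuda, expected $expected_cuda" >&2
    return 1
  fi
}

# Usage (uncomment the line for the environment being set up):
# check_cuda_release 9.0    # BPNet
# check_cuda_release 11.0   # ChromBPNet
```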

Process sequencing data

All data are located in data/*, and the processing pipeline is defined in a Snakefile and run with Snakemake. The Snakefile reads all of its starting inputs from the starting_file column of the setup/samples.csv file.
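
Since every pipeline input comes from the starting_file column, listing that column and checking that each file exists is a quick sanity check before running anything. This is a sketch under stated assumptions: the helper names are hypothetical, samples.csv is assumed to be a plain comma-separated file with a header row and no quoted commas, and the paths in it are assumed to be relative to the directory the check is run from.

```shell
# Print the values of one named column from a simple comma-separated file.
# Assumes the first line is a header and no field contains a quoted comma.
csv_column() {
  awk -F',' -v col="$2" '
    { sub(/\r$/, "") }                      # tolerate Windows line endings
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
    idx && NF { print $idx }
  ' "$1"
}

# Report any starting file that does not exist on disk.
check_starting_files() {
  csv_column "$1" starting_file | while IFS= read -r path; do
    [ -e "$path" ] || echo "missing: $path"
  done
}

# Usage, run from the directory that the paths in samples.csv are relative to:
# check_starting_files setup/samples.csv
```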

To assign the nexus barcodes, we first parse each FASTQ file to identify its fixed barcodes:

parallel -j 10 bash scripts/nexus_identify_fixed_barcodes.sh -i {} -o txt/nexus_barcodes/\`basename {} .fastq.gz \`\.freqs.txt ::: fastq/dm6/nexus/*.fastq.gz
tail -n +1 txt/nexus_barcodes/*.str.txt
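
The escaped backticks in the `parallel` command defer `basename` so that it runs once per input file. To illustrate what the resulting output path looks like, here is a small sketch; the function name and the sample file name are hypothetical.

```shell
# Derive the frequency-table path that the command above would write for one
# FASTQ file: strip the directory and the .fastq.gz suffix, then add .freqs.txt.
barcode_output_path() {
  printf 'txt/nexus_barcodes/%s.freqs.txt\n' "$(basename "$1" .fastq.gz)"
}

# Hypothetical input file, used only to show the mapping.
barcode_output_path fastq/dm6/nexus/example_sample.fastq.gz
```

GNU parallel also accepts `--dry-run`, which prints the commands it would execute without running them. If the script does not create the output directory itself (an assumption worth checking), running `mkdir -p txt/nexus_barcodes` beforehand avoids failed writes.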

To process the data, navigate to the data/ folder and run snakemake -j 6, which allows up to 6 tasks to run simultaneously.
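
A dry run before the real run shows which jobs Snakemake plans to execute. The wrapper below is a hedged sketch, not part of the repository: run_pipeline is a hypothetical name, and it assumes the workflow file is named Snakefile and sits directly in the given directory.

```shell
# Run the pipeline from a given directory, first as a dry run and then for real.
# Arguments: directory containing the Snakefile, number of simultaneous jobs.
# The body runs in a subshell so the caller's working directory is unchanged.
run_pipeline() (
  dir="$1"
  jobs="${2:-6}"
  if [ ! -f "$dir/Snakefile" ]; then
    echo "no Snakefile found in $dir" >&2
    exit 1
  fi
  if ! command -v snakemake >/dev/null 2>&1; then
    echo "snakemake not found on PATH" >&2
    exit 1
  fi
  cd "$dir" || exit 1
  # -n lists the planned jobs without executing them; the real run follows
  # only if the dry run succeeds.
  snakemake -n -j "$jobs" && snakemake -j "$jobs"
)

# Usage:
# run_pipeline data 6
```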

Software versions associated with data processing

R==4.2.0
Python==3.7.3
bowtie2==2.3.5.1
cutadapt==2.5
samtools==1.14
Java OpenJDK==1.8.0_191
PICARD==2.23.8
bamCompare==3.5.1
macs2==2.2.7.1
idr==2.0.3
snakemake==5.10.0
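
To compare an existing installation against the versions above, each tool can be asked for its version string. This is a minimal sketch with a hypothetical helper name; it prints only the first line of each tool's output and leaves the comparison to the reader. PICARD is omitted because its invocation depends on how it was installed.

```shell
# Print a tool name and the first line of its version output, or "not found".
report_version() {
  tool="$1"
  shift
  if command -v "$tool" >/dev/null 2>&1; then
    printf '%s\t%s\n' "$tool" "$("$tool" "$@" 2>&1 | head -n 1)"
  else
    printf '%s\tnot found\n' "$tool"
  fi
}

report_version R --version
report_version python --version
report_version bowtie2 --version
report_version cutadapt --version
report_version samtools --version
report_version java -version      # java prints its version to stderr
report_version bamCompare --version
report_version macs2 --version
report_version idr --version
report_version snakemake --version
```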

Analysis

The rendered .ipynb and .Rmd files are under the analysis/ folder. Files are numbered in the order in which they were run. Raw figures can also be found there, along with the code and associated scripts used to run the analysis.
