### first, activate the conda environment
$ conda activate snakemake # assumes a conda snakemake environment is already installed on your machine
### enter the directory of the repository
$ cd dorado_ont_wf
## 2 step process
Step 1 (optional). convert .fast5 files to .pod5 files
a. edit `samples_fast5_2_pod5.yaml` with the paths to your fast5 files
b. launch the program with `$ sh run_Snakefile_fast5_pod5.sh`
Notes on pod5 conversion
i. fast5 to pod5 is a lossless conversion, so you can delete your original fast5 files after conversion without losing sleep.
ii. `pod5 convert fast5` allows multi-threading with `--threads #` option
iii. pod5 can be installed via `pip install pod5`
iv. pod5 conversion does not need much memory; 3GB is enough. It does need lots of CPUs for threading, though.
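The notes above can be sketched as a small wrapper around `pod5 convert fast5`. This is a minimal sketch, not the command the workflow itself runs: the directory `fast5_dir`, the output name `converted.pod5`, the thread count, and the helper `build_pod5_cmd` are all hypothetical. By default it only prints the command; set `RUN=1` to actually execute it (requires `pod5` to be installed).

```shell
#!/bin/sh
# Hypothetical inputs; replace with your own paths and thread count.
FAST5_DIR="${FAST5_DIR:-fast5_dir}"
OUT_POD5="${OUT_POD5:-converted.pod5}"
THREADS="${THREADS:-8}"

# Build the conversion command as a string (hypothetical helper, for illustration).
build_pod5_cmd() {
    printf 'pod5 convert fast5 %s/*.fast5 --output %s --threads %s\n' "$1" "$2" "$3"
}

# Dry run: show the command that would be executed.
build_pod5_cmd "$FAST5_DIR" "$OUT_POD5" "$THREADS"

# Only run the conversion when explicitly requested.
if [ "${RUN:-0}" = "1" ]; then
    pod5 convert fast5 "$FAST5_DIR"/*.fast5 --output "$OUT_POD5" --threads "$THREADS"
fi
```

Since memory use stays low, the thread count is the setting worth raising on a node with many CPUs.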
Step 2. modified base calling with dorado on .pod5 files and subsequent methylation extraction with modkit
## notes on run time
1. pod5 conversion
you don't need lots of memory for pod5 conversion; the maximum memory used was 4GB. Increase the thread count instead (nthreads=64, nGPUs=4).
#merge run scripts
#run snakemake
sh run_Snakefile_fast5_pod5.sh Snakemake_basecalling_modkit.smk config/cluster_pod5.json
#other snakemake pipelines
`methylation_calling_modkit.smk` - methylation calling with modkit only
`split_reads.smk` - split pod5 files by sample_rate or channels
`Snakemake_basecalling_modkit.smk` - end-to-end basecalling and methylation calling with modkit
####################################################################################
# note to self
### test run with `-np` (dry run)
$ snakemake -s Snakefile.smk -np #target filename may change
$ snakemake -s Snakefile.smk --cores 12 --forcerun -np #dry run with cores
#run actual pipeline on the cluster
$ nohup snakemake -s Snakefile.smk --latency-wait 60 --restart-times 2 --keep-going --forceall --cluster "bsub -J {rule} -R \"rusage[mem=32]\" -W 1:00 -n 12 -o logs/cluster/{rule}.%J.out -e logs/cluster/{rule}.%J.err" -j 3 &
#alternatively, make a script that runs snakemake
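The `--cluster` string carries a quoted `-R` argument inside an outer double-quoted string. A minimal sketch of how the shell treats the inner quotes, using a shortened, hypothetical `bsub` string rather than the full command:

```shell
#!/bin/sh
# Inner quotes left unescaped: the shell closes and reopens the outer string,
# so the quotes around rusage[...] do not survive in the value.
unescaped="bsub -R "rusage[mem=32]" -W 1:00"

# Inner quotes escaped: they are kept as literal characters in the value.
escaped="bsub -R \"rusage[mem=32]\" -W 1:00"

printf '%s\n' "$unescaped"
printf '%s\n' "$escaped"
```

Escaping the inner quotes keeps the resource string quoted when the submit command is later run, which matters as soon as that string contains spaces.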
$ sh run_snakefile.sh
$ cat run_snakefile.sh
#!/bin/bash
# Run snakemake
snakemake --jobname 's.{jobid}.{rulename}' \
--snakefile Snakefile_agg_stats_ONT.smk \
--use-conda \
--keep-going \
--reason \
--printshellcmds \
--latency-wait 10 \
--rerun-incomplete \
--stats snakemake_$(date +"%Y%m%d_%H%M%S").stats \
-j 500 \
--cluster-config config/cluster.json \
--cluster "bsub -q {cluster.queue} -n {cluster.threads} -W {cluster.time} -M{cluster.mem} -R\"span[hosts=1] select[mem>{cluster.mem}] rusage[mem={cluster.mem}]\" {cluster.extra} -o out.txt -e err.txt"
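The `{cluster.*}` placeholders in the script above are filled from the file passed to `--cluster-config`. A minimal sketch of what such a file could look like, written by a shell heredoc: the keys (`queue`, `threads`, `time`, `mem`, `extra`) are taken from the placeholders, `__default__` is Snakemake's cluster-config key for defaults applied to every rule, and all values (including the queue name `general`) are hypothetical. It writes to `config_example/` so that an existing `config/cluster.json` is not overwritten.

```shell
#!/bin/sh
# Write a hypothetical cluster config; adjust the values to your LSF setup.
mkdir -p config_example
cat > config_example/cluster.json <<'EOF'
{
    "__default__": {
        "queue": "general",
        "threads": 4,
        "time": "1:00",
        "mem": 8,
        "extra": ""
    }
}
EOF
```

Per-rule overrides can be added as further top-level keys named after the rule, each with the same fields.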