forked from fmalmeida/MpGAP
Commit
Merge dev branch into master (fmalmeida#34)
* Update .gitignore
* first step: adding nf-core framework
* updating help/parameter files
* Update nextflow_schema.json
* Update defaults.config
* Update WorkflowMain.groovy: fixing DOI
* Update WorkflowMain.groovy: fixing example command
* fixing parameter check
* changing workflow case name
* changed threads parameter and added process labels in sr and qual modules
* Update quast.nf: fixed additional parameter in quast
* adjusting how assemblers interpret resources across attempts
* fixed the string of additional parameters
* Create illumina_test.yml: added example samplesheet for illumina only tests
* Update nextflow_schema.json: fixing boolean param
* Create lreads_test.yml: adding example samplesheet for long reads only tests
* Create hybrid_test.yml: adding example samplesheet for hybrid tests
* Update .gitattributes
* seems ready for new patch version
Showing 51 changed files with 1,608 additions and 519 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
* linguist-vendored | ||
*.nf linguist-vendored=false | ||
*.config linguist-vendored=false | ||
*.py linguist-vendored=false | ||
*.R linguist-vendored=false |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,3 @@ | ||
.nextflow* | ||
docs/_build | ||
docs/_build | ||
testing |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
samplesheet: | ||
|
||
- id: ont_hybrid | ||
nanopore: https://github.com/fmalmeida/test_datasets/raw/main/ecoli_ont_15X.fastq.gz | ||
genome_size: 0.5m | ||
illumina: | ||
- https://github.com/fmalmeida/test_datasets/raw/main/ecoli_illumina_15X_1.fastq.gz | ||
- https://github.com/fmalmeida/test_datasets/raw/main/ecoli_illumina_15X_2.fastq.gz | ||
hybrid_strategy: both | ||
|
||
- id: pacbio_hybrid | ||
pacbio: https://github.com/fmalmeida/test_datasets/raw/main/ecoli_pacbio_15X.fastq.gz | ||
genome_size: 0.5m | ||
illumina: | ||
- https://github.com/fmalmeida/test_datasets/raw/main/ecoli_illumina_15X_1.fastq.gz | ||
- https://github.com/fmalmeida/test_datasets/raw/main/ecoli_illumina_15X_2.fastq.gz | ||
hybrid_strategy: both |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
samplesheet: | ||
- id: illumina_only | ||
illumina: | ||
- https://github.com/fmalmeida/test_datasets/raw/main/ecoli_illumina_15X_1.fastq.gz | ||
- https://github.com/fmalmeida/test_datasets/raw/main/ecoli_illumina_15X_2.fastq.gz |
File renamed without changes
File renamed without changes
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
samplesheet: | ||
- id: ont_only | ||
nanopore: https://github.com/fmalmeida/test_datasets/raw/main/ecoli_ont_15X.fastq.gz | ||
genome_size: 0.5m | ||
- id: pacbio_only | ||
pacbio: https://github.com/fmalmeida/test_datasets/raw/main/ecoli_pacbio_15X.fastq.gz | ||
genome_size: 0.5m |
File renamed without changes
File renamed without changes
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
process { | ||
|
||
// The defaults for all processes | ||
cpus = { params.max_cpus } | ||
memory = { params.max_memory } | ||
time = { params.max_time } | ||
|
||
errorStrategy = { task.exitStatus in [143,137,104,134,139] ? 'retry' : 'finish' } | ||
maxRetries = 1 | ||
maxErrors = '-1' | ||
|
||
// labels | ||
withLabel:process_ultralow { | ||
cpus = { check_max( 1 * task.attempt, 'cpus' ) } | ||
memory = { check_max( 2.GB * task.attempt, 'memory' ) } | ||
time = { check_max( 1.h * task.attempt, 'time' ) } | ||
} | ||
withLabel:process_low { | ||
cpus = { check_max( 2 * task.attempt, 'cpus' ) } | ||
memory = { check_max( 4.GB * task.attempt, 'memory' ) } | ||
time = { check_max( 1.h * task.attempt, 'time' ) } | ||
} | ||
withLabel:error_ignore { | ||
errorStrategy = 'ignore' | ||
} | ||
withLabel:error_retry { | ||
errorStrategy = 'retry' | ||
maxRetries = 2 | ||
} | ||
|
||
// Assemblies will first try to adjust themselves to a parallel execution | ||
// If it is not possible, then it waits to use all the resources allowed | ||
withLabel:process_assembly { | ||
cpus = { if (task.attempt == 1) { check_max( 6 * task.attempt, 'cpus' ) } else { params.max_cpus } } | ||
memory = { if (task.attempt == 1) { check_max( 14.GB * task.attempt, 'memory' ) } else { params.max_memory } } | ||
time = { if (task.attempt == 1) { check_max( 16.h * task.attempt, 'time' ) } else { params.max_time } } | ||
} | ||
|
||
} | ||
|
||
// Function to ensure that resource requirements don't go beyond | ||
// a maximum limit | ||
def check_max(obj, type) { | ||
if(type == 'memory'){ | ||
try { | ||
if(obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1) | ||
return params.max_memory as nextflow.util.MemoryUnit | ||
else | ||
return obj | ||
} catch (all) { | ||
println " ### ERROR ### Max memory '${params.max_memory}' is not valid! Using default value: $obj" | ||
return obj | ||
} | ||
} else if(type == 'time'){ | ||
try { | ||
if(obj.compareTo(params.max_time as nextflow.util.Duration) == 1) | ||
return params.max_time as nextflow.util.Duration | ||
else | ||
return obj | ||
} catch (all) { | ||
println " ### ERROR ### Max time '${params.max_time}' is not valid! Using default value: $obj" | ||
return obj | ||
} | ||
} else if(type == 'cpus'){ | ||
try { | ||
return Math.min( obj, params.max_cpus as int ) | ||
} catch (all) { | ||
println " ### ERROR ### Max cpus '${params.max_cpus}' is not valid! Using default value: $obj" | ||
return obj | ||
} | ||
} | ||
} |
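The capping and retry behaviour configured above can be sketched outside Nextflow. The following is a minimal Python analogue, not the pipeline's code: it assumes resources are plain numbers (GB, hours, CPU count) rather than Nextflow `MemoryUnit`/`Duration` objects, and the `MAX_LIMITS` table, `low_resources` and `assembly_resources` are hypothetical names that mirror the `max_memory`, `max_cpus` and `max_time` defaults and the `process_low` and `process_assembly` labels from this commit.

```python
# Illustrative analogue of check_max() and the label rules; units are simplified.
# MAX_LIMITS mirrors the default caps (14 GB, 6 CPUs, 40 h) set in the params file.
MAX_LIMITS = {"memory": 14, "cpus": 6, "time": 40}


def check_max(requested, kind, limits=MAX_LIMITS):
    """Return the requested amount, capped at the configured maximum."""
    return min(requested, limits[kind])


def low_resources(attempt, limits=MAX_LIMITS):
    """Mimic the process_low label: requests scale with the attempt number."""
    return {
        "cpus": check_max(2 * attempt, "cpus", limits),
        "memory": check_max(4 * attempt, "memory", limits),
        "time": check_max(1 * attempt, "time", limits),
    }


def assembly_resources(attempt, limits=MAX_LIMITS):
    """Mimic the process_assembly label: a modest request on the first
    attempt, then every allowed resource on any retry."""
    if attempt == 1:
        return {
            "cpus": check_max(6 * attempt, "cpus", limits),
            "memory": check_max(14 * attempt, "memory", limits),
            "time": check_max(16 * attempt, "time", limits),
        }
    return dict(limits)


if __name__ == "__main__":
    print(low_resources(1))
    print(assembly_resources(1))
    print(assembly_resources(2))
```

With the default caps, a first assembly attempt asks for 16 h of wall time, which leaves room for other assemblers to run in parallel, while a retry asks for the full 40 h.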
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
// conda profile | ||
params.selected_profile = "conda" | ||
singularity.enabled = false | ||
docker.enabled = false | ||
process.conda = "$CONDA_PREFIX/envs/mpgap-3.1" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,130 @@ | ||
/* | ||
* Configuration File to run fmalmeida/mpgap pipeline. | ||
*/ | ||
|
||
params { | ||
|
||
/* | ||
* Input parameter | ||
*/ | ||
|
||
|
||
// Path to YAML samplesheet file. | ||
// Please read the documentation https://mpgap.readthedocs.io/en/latest/samplesheet.html to know how to create a samplesheet file. | ||
input = null | ||
|
||
/* | ||
* Output parameters | ||
*/ | ||
|
||
|
||
// Output folder name | ||
output = "output" | ||
|
||
|
||
/* | ||
* Resources parameters | ||
*/ | ||
|
||
// Memory allocation for Pilon polishing. | ||
// Value in GB. Default: 50. This has proven to be enough in most cases. | ||
// This setting is crucial because, without enough memory, Pilon will crash and your assembly will not be corrected. | ||
pilon_memory_limit = 50 | ||
|
||
/* | ||
* General parameters | ||
* | ||
* These parameters will set the default for all samples. | ||
* However, they can also be set inside the YAML, if this happens | ||
* the pipeline will use the value inside the YAML to overwrite | ||
* the parameter for that specific sample. | ||
* | ||
* Please read the documentation https://mpgap.readthedocs.io/en/latest/samplesheet.html to know more about the samplesheet file. | ||
*/ | ||
|
||
|
||
// This parameter only needs to be set if the software chosen is Canu, wtdbg2 or Haslr. It is optional for Flye. | ||
// It is an estimate of the size of the genome. Common suffixes are allowed, for example, 3.7m or 2.8g | ||
genome_size = null | ||
|
||
// Select the appropriate value to pass to wtdbg2 to assemble input. | ||
// Options are: "ont" for Nanopore reads, "rs" for PacBio RSII, "sq" for PacBio Sequel, "ccs" for PacBio CCS reads. | ||
// By default, if not given, the pipeline will use the value "ont" if nanopore reads are used and "sq" if pacbio reads are used | ||
wtdbg2_technology = null | ||
|
||
// Select the appropriate shasta config to use for assembly | ||
// Since shasta v0.8 (Oct/2021) this parameter is now mandatory. | ||
shasta_config = "Nanopore-Oct2021" | ||
|
||
// Tells the pipeline to interpret the long reads as "corrected" long reads. | ||
// This will activate (if available) the options for corrected reads in the | ||
// assemblers: -corrected (in canu), --pacbio-corr|--nano-corr (in flye), etc. | ||
// Be cautious when using this parameter. If your reads are not corrected and | ||
// you use this parameter, you will probably not generate any contigs. | ||
corrected_long_reads = false | ||
|
||
// This parameter below (hybrid_strategy) is to select the hybrid strategies adopted by the pipeline. | ||
// Read the documentation https://mpgap.readthedocs.io/en/latest/manual.html to know more about the hybrid strategies. | ||
// | ||
// Whenever using this parameter, it is also possible to polish the long-reads-only assemblies with Nanopolish, | ||
// Medaka or VariantCaller (Arrow) before polishing with short reads (using Pilon). For that, it is necessary to set | ||
// the right parameters: pacbio_bam and nanopolish_fast5 (files given only inside the YAML) or medaka_model. | ||
hybrid_strategy = 1 | ||
|
||
// Default medaka model used for polishing nanopore long reads assemblies. | ||
// Please read their manual https://github.com/nanoporetech/medaka to know more about the available models. | ||
medaka_model = "r941_min_high_g360" | ||
|
||
// This parameter sets the maximum number of haplotypes that nanopolish will consider. | ||
// Sometimes the pipeline may crash because too much variation was found, exceeding this limit. | ||
nanopolish_max_haplotypes = 1000 | ||
|
||
|
||
/* | ||
* Advanced parameters | ||
* | ||
* These parameters control which assemblers are executed. | ||
* Set a skip parameter to true to skip that software and to false to use it. | ||
* Additional parameters can also be passed to each tool. | ||
* Additional parameters must be in quotes and separated by spaces. | ||
*/ | ||
|
||
|
||
quast_additional_parameters = null // Give additional parameters to Quast while assessing assembly metrics. | ||
// Must be given as shown in Quast manual. E.g. " --large --eukaryote ". | ||
|
||
skip_spades = false // Hybrid and shortreads only assemblies | ||
spades_additional_parameters = null // Must be given as shown in Spades manual. E.g. " --meta --plasmids " | ||
|
||
skip_shovill = false // Paired shortreads only assemblies | ||
shovill_additional_parameters = null // Must be given as shown in Shovill manual. E.g. " --depth 15 " | ||
// The pipeline already executes shovill with spades, skesa and megahit, so please, do not use it with shovill's ``--assembler`` parameter. | ||
|
||
skip_unicycler = false // Hybrid and shortreads only assemblies | ||
unicycler_additional_parameters = null // Must be given as shown in Unicycler manual. E.g. " --mode conservative --no_correct " | ||
|
||
skip_haslr = false // Hybrid assemblies | ||
haslr_additional_parameters = null // Must be given as shown in Haslr manual. E.g. " --cov-lr 30 " | ||
|
||
skip_canu = false // Longreads only assemblies | ||
canu_additional_parameters = null // Must be given as shown in Canu manual. E.g. " correctedErrorRate=0.075 corOutCoverage=200 " | ||
|
||
skip_flye = false // Longreads only assemblies | ||
flye_additional_parameters = null // Must be given as shown in Flye manual. E.g. " --meta --iterations 4 " | ||
|
||
skip_raven = false // Longreads only assemblies | ||
raven_additional_parameters = null // Must be given as shown in Raven manual. E.g. " --polishing-rounds 4 " | ||
|
||
skip_wtdbg2 = false // Longreads only assemblies | ||
wtdbg2_additional_parameters = null // Must be given as shown in wtdbg2 manual. E.g. " --tidy-reads 5000 " | ||
|
||
skip_shasta = false // Nanopore longreads only assemblies | ||
shasta_additional_parameters = null // Must be given as shown in shasta manual. E.g. " --Reads.minReadLength 5000 " | ||
|
||
// Max resource options | ||
// Defaults only, expecting to be overwritten | ||
max_memory = '14.GB' | ||
max_cpus = 6 | ||
max_time = '40.h' | ||
|
||
} |
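The "General parameters" comment above says that a value set inside the YAML samplesheet overwrites the global parameter for that sample only. A minimal Python sketch of that precedence rule follows. It is an illustration, not the pipeline's implementation: `GLOBAL_DEFAULTS` copies a few defaults from the params block, `effective_params` is a hypothetical helper, and treating a missing or null YAML value as "keep the default" is an assumption.

```python
# Hypothetical illustration of "samplesheet value overrides the global default".
# The defaults below are copied from the params block of this commit.
GLOBAL_DEFAULTS = {
    "genome_size": None,
    "hybrid_strategy": 1,
    "medaka_model": "r941_min_high_g360",
    "corrected_long_reads": False,
}


def effective_params(sample, defaults=GLOBAL_DEFAULTS):
    """Merge one samplesheet entry over the global defaults.

    Only keys that exist as global parameters are merged; keys such as
    'id' or read paths are left out. A None value keeps the default
    (an assumption made for this sketch).
    """
    merged = dict(defaults)
    for key, value in sample.items():
        if key in defaults and value is not None:
            merged[key] = value
    return merged


if __name__ == "__main__":
    # Entry modelled on the 'ont_hybrid' sample of the hybrid test samplesheet.
    sample = {"id": "ont_hybrid", "genome_size": "0.5m", "hybrid_strategy": "both"}
    print(effective_params(sample))
    # A sample that sets nothing falls back to every global default.
    print(effective_params({"id": "plain_sample"}))
```

Copying the defaults before merging keeps one sample's overrides from leaking into the next sample.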
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
// docker profile | ||
params.selected_profile = "docker" | ||
singularity.enabled = false | ||
docker.enabled = true | ||
docker.runOptions = '-u \$(id -u):\$(id -g)' | ||
docker.fixOwnership = true | ||
process.container = "fmalmeida/mpgap:v3.1" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
// singularity profile | ||
params.selected_profile = "singularity" | ||
docker.enabled = false | ||
singularity.enabled = true | ||
singularity.autoMounts = true | ||
process.container = "docker://fmalmeida/mpgap:v3.1" | ||
singularity.autoMounts = true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
// standard local profile -- default | ||
// does not use any pre-configuration from profiles | ||
// no container engine is enabled in this profile | ||
params.selected_profile = "none" | ||
singularity.enabled = false | ||
docker.enabled = false |