Update README.md
proudquartz authored Aug 19, 2019
1 parent c0506c7 commit 3961074
Showing 1 changed file with 26 additions and 0 deletions.
26 changes: 26 additions & 0 deletions README.md
@@ -19,6 +19,32 @@ The pipeline requires a local copy of the 16SMicrobial database from NCBI.

1. Simulation results file
- A CSV file containing the parameters for every design, along with summary statistics for each design
* `DESIGN_ID`: identifier for each design
* `SAMPLE`: name of the input FASTA file without file extension
* `TARGET_RANK`: desired taxonomic rank for the probe design. Available options: phylum, class, order, family, genus, and species
* `SIMILARITY`: similarity cutoff for grouping 16S sequences. A low cutoff (e.g. 0.1) essentially means 16S sequences will be grouped by their lineage information in the NCBI 16SMicrobial database. A higher cutoff can be used to subdivide sequences within a given taxon. Higher cutoff values generally lead to longer run times.
* `MAX_CONTINUOUS_HOMOLOGY`: maximum continuous homology (measured in bp) for a probe-target hit to be considered significant. Lower values lead to more stringent designs. Default is 14 bp.
* `MIN_TM`: minimum melting temperature threshold
* `MAX_TM`: maximum melting temperature threshold
* `GC`: minimum probe GC content threshold
* `INCLUDE_START`: number of nucleotides to exclude at the beginning of the 16S sequences
* `INCLUDE_END`: number of nucleotides to exclude at the end of the 16S sequences
* `PROBE_SELECTION_METHOD`: method for selecting probes. Available options are
1. `SingleBestProbe`: select the top probe for each taxon, if available
2. `AllSpecific`: select all probes that are specific, and only specific, to their target taxon
3. `AllSpecificPStartGroup`: select all probes that are specific, and only specific, to their target taxon within each segment of the 16S sequences. By default, the 16S sequences are divided into 100 bp blocks. If fewer than 15 probes are available (average one probe per block), the block size is reduced in 20 bp decrements until there are 15 probes or the block size reaches zero, whichever happens first.
4. `MinOverlap`: select all probes that are specific, and only specific, to their target taxon with minimum overlap in their target coverage
5. `TopN`: select the top *n* probes for each taxon
* `PRIMERSET`: primer sets to include in the final probes. There are three sets (A, B, and C) available in the current version. User-specific primer sets can also be added if necessary.
* `OTU`: boolean to indicate whether to group 16S sequences only by their similarity. Generally set to `F` for ease of taxonomic interpretation of the probe designs, but could be useful if very high taxonomic resolution is desired.
* `TPN`: number of top probes to select for each taxon, if the probe selection method is set to `TopN`
* `FREQLL`: minimum abundance threshold. Default is zero, and it is generally left at zero. It can be increased in situations where the in silico taxonomic coverage is not as good as desired. A higher value increases the probe design space for the more abundant sequences, at the risk of those probes mishybridizing to the lower abundance taxa in the experiment.
* `BOT`: minimum BLAST on-target rate threshold. Probes with BLAST on-target values lower than this threshold are considered *promiscuous* and are not included in the final probe pool.
* `BARCODESELECTION`: method for barcode assignment to taxa. Available options are:
1. MostSimple: assign barcodes by barcode complexity, starting with the simplest ones. Barcodes with more bits are considered more complex.
2. Random: randomly assign barcodes to taxa
3. MostComplex: assign barcodes by barcode complexity, starting with the most complex ones. Barcodes with more bits are considered more complex.
* `BPLC`: minimum blocking probe length threshold. Blocking probes shorter than this threshold are considered likely to be washed off and do not need to be included in the final probe pool. Default is 15 bp.
2. Probe folder
- A folder containing summary files of the selected probes for each taxon, a concatenated file containing all selected probes, a file containing information for all the blocking probes, and text files that can be sent as-is to array synthesis vendors for complex oligo pool synthesis.
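
The block-size adjustment described for `AllSpecificPStartGroup` can be sketched roughly as follows. This is an illustration only, not the pipeline's actual code: the function name `select_by_block`, the `(start, score)` probe representation, and the rule of keeping the single highest-scoring probe per block are all assumptions made for the example. Only the 100 bp default, the 20 bp decrement, and the 15-probe target come from the description above.

```python
from typing import List, Tuple

# Hypothetical probe representation: (start position in bp, quality score).
Probe = Tuple[int, float]


def select_by_block(
    probes: List[Probe],
    min_probes: int = 15,   # target number of probes (from the README)
    block_size: int = 100,  # default block size in bp (from the README)
    step: int = 20,         # decrement in bp (from the README)
) -> Tuple[List[Probe], int]:
    """Keep one probe per block, shrinking blocks until enough probes are kept.

    Assumption: the best-scoring probe is kept in each block. Returns the
    selected probes and the block size at which the loop stopped.
    """
    selected: List[Probe] = []
    while block_size > 0:
        best = {}
        for start, score in probes:
            block = start // block_size
            # Keep the highest-scoring probe seen so far in this block.
            if block not in best or score > best[block][1]:
                best[block] = (start, score)
        selected = sorted(best.values())
        if len(selected) >= min_probes:
            break
        # Not enough probes yet: use smaller blocks and try again.
        block_size -= step
    return selected, block_size
```

Calling `select_by_block(probes)` returns both the selection and the final block size; a returned block size of zero means the 15-probe target was never reached and the selection from the smallest non-zero block size is returned.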

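The three `BARCODESELECTION` options can be illustrated with a small sketch. It assumes barcodes are binary strings and reads "more bits" as "more bits set to 1"; both are assumptions for this example, and the function name `assign_barcodes` and its `seed` parameter are hypothetical rather than part of the pipeline.

```python
import random
from typing import Dict, List, Optional


def assign_barcodes(
    taxa: List[str],
    barcodes: List[str],
    method: str = "MostSimple",
    seed: Optional[int] = None,
) -> Dict[str, str]:
    """Assign one barcode per taxon according to the chosen method.

    Assumption: complexity is the number of '1' characters in the barcode.
    """
    if len(barcodes) < len(taxa):
        raise ValueError("not enough barcodes for the given taxa")

    def complexity(barcode: str) -> int:
        return barcode.count("1")

    if method == "MostSimple":
        ordered = sorted(barcodes, key=complexity)
    elif method == "MostComplex":
        ordered = sorted(barcodes, key=complexity, reverse=True)
    elif method == "Random":
        # Shuffle a copy so the caller's list is left untouched.
        ordered = random.Random(seed).sample(barcodes, len(barcodes))
    else:
        raise ValueError(f"unknown barcode selection method: {method}")

    # zip stops at the shorter sequence, so surplus barcodes are unused.
    return dict(zip(taxa, ordered))
```

For example, `assign_barcodes(["Taxon1", "Taxon2"], ["0111", "0001", "0011"], "MostSimple")` hands out the barcodes with the fewest set bits first.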