GitHub - lh3/miniasm at 5b0f5c428a99dd0572266941704243845b8685c3

Getting Started

# Install minimap and miniasm
git clone https://github.com/lh3/minimap && (cd minimap && make)
git clone https://github.com/lh3/miniasm && (cd miniasm && make)
# Overlapping
minimap/minimap -Sw5 -L100 -m0 -t8 reads.fa reads.fa | gzip -1 > reads.paf.gz
# Assembly
miniasm/miniasm -f reads.fa reads.paf.gz > reads.gfa

Introduction

Miniasm is a very fast OLC-based de novo assembler for noisy long reads. It takes all-vs-all read self-mappings (typically produced by minimap) as input and outputs an assembly graph in the GFA format. Unlike mainstream assemblers, miniasm does not have a consensus step. It simply concatenates pieces of read sequences to generate the final unitig sequences, so the per-base error rate is similar to that of the raw input reads.
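
Because the output is a GFA graph rather than FASTA, the unitig sequences have to be pulled out of the segment records before other tools can read them. The following is a minimal sketch, not part of miniasm: `gfa_to_fasta` is a hypothetical helper name, and it assumes the standard GFA layout in which each `S` line holds a segment name in column 2 and its sequence in column 3.

```sh
# Sketch: print every GFA segment (S-line) as a FASTA record.
# gfa_to_fasta is a hypothetical helper name, not a miniasm command.
gfa_to_fasta() {
    # S-lines are tab-separated: "S", segment name, sequence, optional tags.
    # Segments whose sequence is "*" (sequence not stored) are skipped.
    awk -F'\t' '$1 == "S" && $3 != "*" { print ">" $2; print $3 }' "$1"
}
```

With the files from Getting Started, this could be run as `gfa_to_fasta reads.gfa > unitigs.fa`, where `unitigs.fa` is an arbitrary output name.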

Miniasm is still at a very early stage of development. It has only been tested on twelve bacterial genomes sequenced with PacBio. Including the mapping step, it takes about 3 minutes to assemble a bacterial genome. Under the default settings, miniasm assembles 5 of the 12 data sets into a single contig. The 12 data sets are the PacBio E. coli sample, ERS473430, ERS544009, ERS554120, ERS605484, ERS617393, ERS646601, ERS659581, ERS670327, ERS685285, ERS743109 and a deprecated PacBio E. coli data set.
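
One rough way to check whether a run produced a single contig is to count the segment records in the output graph. The sketch below is an illustration only: `gfa_unitig_stats` is a hypothetical helper name, and it assumes that each `S` line with a stored sequence corresponds to one unitig.

```sh
# Sketch: report the number of unitigs and their total length in bases.
# gfa_unitig_stats is a hypothetical helper name, not a miniasm command.
gfa_unitig_stats() {
    # Count S-lines that carry a sequence and sum the sequence lengths.
    # Output is two tab-separated numbers: unitig count, total bases.
    awk -F'\t' '$1 == "S" && $3 != "*" { n++; total += length($3) }
                END { printf "%d\t%d\n", n, total }' "$1"
}
```

Running `gfa_unitig_stats reads.gfa` and seeing a count of 1 would suggest a single-contig assembly under that assumption.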

Miniasm proves that, at least for high-coverage bacterial genomes, it is possible to generate long contigs from raw PacBio reads without error correction. It also shows that minimap can be used as a read overlapper, even though it is probably not as sensitive as more sophisticated overlappers such as MHAP and DALIGNER. Coupled with long-read error correctors and consensus tools, miniasm may also be useful for producing high-quality assemblies.
