Remove scaffold and other unplaced sequences before mapping? #57

Open
CTLife opened this issue Mar 28, 2016 · 4 comments

Comments

@CTLife

CTLife commented Mar 28, 2016

Hi,
I downloaded reference genomes from Ensembl (FASTA format), but they contain many sequences labelled "dna:scaffold": https://github.com/CTLife/TEMP/tree/master/RefGenomes

For example, in Mouse_GRCm38 (mm10), should everything except chromosomes 1-19, Mt, X and Y be removed before mapping? https://github.com/CTLife/TEMP/blob/master/RefGenomes/Mouse_GRCm38.p4.txt

Likewise, Human_GRCh38.p5 (hg38), https://github.com/CTLife/TEMP/blob/master/RefGenomes/Human_GRCh38.p5.txt, contains 516 sequences. Should everything other than chromosomes 1-22, Mt, X and Y (such as CHR_HG2241_PATCH and KI270728.1) be removed before mapping?
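
For illustration only, the filtering being asked about could be sketched roughly as below. This is not a recommendation to remove anything; it only shows what "keep chromosomes 1-22, X, Y and Mt" would mean in practice. It assumes Ensembl-style headers in which the first token after `>` is the sequence name. The function name `filter_fasta`, the `PRIMARY` keep-list, and the toy records are all hypothetical.

```python
import io

# Hypothetical keep-list for human: autosomes 1-22, X, Y and the
# mitochondrial genome. Names are compared in upper case so that
# "Mt" and "MT" are treated the same.
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y", "MT"}


def filter_fasta(src, dst, keep=PRIMARY):
    """Copy only records whose name (first token after '>') is in `keep`.

    `src` and `dst` are text file objects. Returns the names of the
    records that were skipped, in the order they were seen.
    """
    dropped = []
    writing = False
    for line in src:
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            writing = name.upper() in keep
            if not writing:
                dropped.append(name)
        if writing:
            dst.write(line)
    return dropped


# Toy input with made-up, very short sequences (not real genome data).
demo = io.StringIO(
    ">1 dna:chromosome toy record\nACGTACGT\n"
    ">KI270728.1 dna:scaffold toy record\nGGCC\n"
    ">MT dna:chromosome toy record\nTTAA\n"
)
out = io.StringIO()
dropped = filter_fasta(demo, out)
```

For the mouse assembly, the keep-list would presumably cover 1-19 rather than 1-22. Returning the skipped names makes it easy to check exactly which scaffolds or patches a given keep-list would exclude before deciding whether to drop them.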

@billzt

billzt commented Mar 28, 2016

I have even encountered some draft assemblies with thousands of scaffolds besides the main chromosomes.

@CTLife
Author

CTLife commented Mar 28, 2016

@billzt Do these thousands of scaffolds besides the main chromosomes need to be removed before mapping ChIP-seq and RNA-seq reads?

@billzt

billzt commented Mar 28, 2016

Well, I usually keep them.

@CTLife
Author

CTLife commented Mar 28, 2016

Thank you.


2 participants