GitHub - biologyzhangbo/SnpExt_Genebased: Python 3 scripts for extracting the SNPs that fall within candidate genes.

The first script is Gene_locat.py.
It requires two input files: a processed GFF file and a gene ID file.
To obtain the processed GFF file, download the annotation file of the target genome from Phytozome, then run the following command in Linux:
grep "gene" refgenome.gff > gene.refgenome.gff
A usable GFF file should have the following format:
Chr01 phytozomev10 gene 1951 2616 . + . ID=Sobic.001G000100.v3.1;Name=Sobic.001G000100;ancestorIdentifier=Sobic.001G000100.v2.1
Chr01 phytozomev10 gene 11180 14899 . - . ID=Sobic.001G000200.v3.1;Name=Sobic.001G000200;ancestorIdentifier=Sobic.001G000200.v2.1
Chr01 phytozomev10 gene 23399 24152 . - . ID=Sobic.001G000300.v3.1;Name=Sobic.001G000300;ancestorIdentifier=Sobic.001G000300.v2.1
Chr01 phytozomev10 gene 22391 42443 . - . ID=Sobic.001G000400.v3.1;Name=Sobic.001G000400;ancestorIdentifier=Sobic.001G000400.v2.1
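
To illustrate how a line in this format can be read, here is a minimal Python sketch. It is not taken from Gene_locat.py: the function name parse_gff_gene_line is hypothetical, and the sketch assumes the standard tab-separated, nine-column GFF layout with the gene name stored in the Name attribute.

```python
def parse_gff_gene_line(line):
    """Return (name, chrom, start, end) for a GFF 'gene' record, else None.

    Hypothetical helper for illustration; assumes tab-separated GFF columns.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9 or fields[2] != "gene":
        return None
    # Column 9 holds semicolon-separated key=value attributes.
    attrs = dict(
        item.split("=", 1) for item in fields[8].split(";") if "=" in item
    )
    return attrs.get("Name"), fields[0], int(fields[3]), int(fields[4])


example = "\t".join([
    "Chr01", "phytozomev10", "gene", "1951", "2616", ".", "+", ".",
    "ID=Sobic.001G000100.v3.1;Name=Sobic.001G000100;"
    "ancestorIdentifier=Sobic.001G000100.v2.1",
])
print(parse_gff_gene_line(example))
# ('Sobic.001G000100', 'Chr01', 1951, 2616)
```

Checking that column 3 equals "gene" is stricter than a plain text search for the word, so the sketch ignores any other record that merely mentions "gene" somewhere in the line.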

The gene ID file should list one gene name per line, as follows:
Sobic.001G355700
Sobic.002G484000
Sobic.005G821200
...

Then run the script as follows; the gene locations are written to genelocationfile:
python Gene_locat.py geneidfile gene.refgenome.gff genelocationfile
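
The lookup this step performs can be sketched roughly as below. This is an assumption about the general approach, not the actual code of Gene_locat.py: the function locate_genes is hypothetical, it matches gene IDs against the Name attribute, and the real layout of the gene location file may differ from the tuples returned here.

```python
def locate_genes(gene_ids, gff_lines):
    """Return (name, chrom, start, end) for each requested gene found in the GFF.

    Hypothetical sketch; assumes tab-separated GFF lines and that the gene ID
    file uses the value of the Name attribute.
    """
    wanted = set(gene_ids)
    found = []
    for line in gff_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "gene":
            continue
        attrs = dict(kv.split("=", 1) for kv in fields[8].split(";") if "=" in kv)
        name = attrs.get("Name")
        if name in wanted:
            found.append((name, fields[0], int(fields[3]), int(fields[4])))
    return found


gff = [
    "Chr01\tphytozomev10\tgene\t1951\t2616\t.\t+\t.\t"
    "ID=Sobic.001G000100.v3.1;Name=Sobic.001G000100",
    "Chr01\tphytozomev10\tgene\t11180\t14899\t.\t-\t.\t"
    "ID=Sobic.001G000200.v3.1;Name=Sobic.001G000200",
]
# The second ID is deliberately absent from the GFF lines above.
for row in locate_genes(["Sobic.001G000200", "Sobic.009G999900"], gff):
    print("\t".join(map(str, row)))
# Sobic.001G000200	Chr01	11180	14899
```

Gene IDs that do not appear in the GFF file are skipped silently in this sketch; a real run may want to report them.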

The second script is SnpExt_Genebased.py.

It requires two input files: a processed VCF file and the gene location file.
To obtain the processed VCF file, separate the header lines from the variant records:
grep "^#" VCFfile > title
grep -v "^#" VCFfile > processed.VCFfile
The gene location file can be produced by Gene_locat.py.
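
For readers without grep, the header and record split can be sketched in Python as follows. The function name split_vcf and the sample lines are made up for illustration; the only assumption is that VCF header lines begin with "#".

```python
def split_vcf(lines):
    """Split VCF lines into (header_lines, record_lines).

    Hypothetical helper: header lines are the ones that start with '#'.
    """
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines if not line.startswith("#")]
    return header, records


# Made-up sample data, not taken from any real VCF file.
sample = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "Chr01\t2000\t.\tA\tG\t.\t.\t.",
]
title, processed = split_vcf(sample)
print(len(title), len(processed))
# 2 1
```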
Then run the following commands. The script's output, extract.vcf, is rejoined with the saved header lines to give extracted.vcf:
python SnpExt_Genebased.py genelocationfile processed.VCFfile
cat title extract.vcf > extracted.vcf
rm extract.vcf
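
The extraction step can be pictured with the minimal sketch below. It is an assumption about the general idea, not the code of SnpExt_Genebased.py: the function extract_snps is hypothetical, gene locations are given as (name, chrom, start, end) tuples, and a record is kept when its POS lies inside a gene interval on the same chromosome (both ends inclusive, 1-based as in GFF and VCF).

```python
def extract_snps(gene_locations, vcf_records):
    """Keep VCF records whose position falls inside any listed gene.

    Hypothetical sketch; assumes header-free, tab-separated VCF records.
    """
    # Group gene intervals by chromosome for quick lookup.
    by_chrom = {}
    for _name, chrom, start, end in gene_locations:
        by_chrom.setdefault(chrom, []).append((start, end))

    kept = []
    for record in vcf_records:
        chrom, pos = record.split("\t", 2)[:2]
        pos = int(pos)
        if any(start <= pos <= end for start, end in by_chrom.get(chrom, [])):
            kept.append(record)
    return kept


genes = [("Sobic.001G000100", "Chr01", 1951, 2616)]
# Made-up records for illustration.
records = [
    "Chr01\t2000\t.\tA\tG\t.\t.\t.",   # inside the gene
    "Chr01\t3000\t.\tC\tT\t.\t.\t.",   # same chromosome, outside the gene
    "Chr02\t2000\t.\tG\tA\t.\t.\t.",   # different chromosome
]
print(extract_snps(genes, records))
# ['Chr01\t2000\t.\tA\tG\t.\t.\t.']
```

The linear scan over intervals is adequate for a short candidate-gene list; for thousands of genes, sorting the intervals and using binary search would scale better.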

Contact: [email protected]
