PreparingOMA
ebersber edited this page Nov 4, 2018
When working with orthologs from the OMA database, two files have to be downloaded from OMA: the OMA groups (http://omabrowser.org/All/oma-groups.txt.gz) and the OMA protein sequences (http://omabrowser.org/All/oma-seqs.fa.gz). The create_conf.pl script can download the data for you; if you don't want to use it, prepare the files manually as described here.
Always check for the latest versions at http://omabrowser.org/oma/current/. Once both files are downloaded, do the following:
- Unzip the OMA groups file
gunzip oma-groups.txt.gz
- Unzip the fasta file
gunzip oma-seqs.fa.gz
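- Convert multi-line fasta to single-line fasta (write to a temporary file first; redirecting the output straight back to oma-seqs.fa would empty the file before awk reads it)
awk '/^>/ {printf("\n%s\n",$0);next; } { printf("%s",$0);} END {printf("\n");}' < oma-seqs.fa > oma-seqs.tmp.fa && mv oma-seqs.tmp.fa oma-seqs.fa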
- Remove the white space after '>' in the fasta headers
sed -i -e 's/> />/' oma-seqs.fa
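The two fasta steps above can also be combined into one small shell function. This is only a sketch: the function name linearize_fasta and the scratch file name are made up for illustration and are not part of protTrace. It writes to a temporary file and only replaces the input once both commands have succeeded.

```shell
# Hypothetical helper, not part of protTrace: put every sequence on one line
# and drop the blank after ">" in the headers of the given FASTA file.
linearize_fasta() {
    fasta="$1"
    tmp="$fasta.tmp"   # assumed scratch name next to the input file

    # awk joins wrapped sequence lines (same logic as the one-liner above);
    # sed is anchored with ^ so only the leading ">" of a header is touched.
    awk '/^>/ { printf("\n%s\n", $0); next }
         { printf("%s", $0) }
         END { printf("\n") }' "$fasta" |
        sed -e 's/^> />/' > "$tmp" &&
        mv "$tmp" "$fasta"
}
```

After unzipping, it would be called as: linearize_fasta oma-seqs.fa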
The OMA files are now ready to be used with protTrace. Provide the paths to these files in the program configuration file under the tabs path_oma_group and path_oma_seqs, respectively.
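Before entering the paths in the configuration file, it can be worth confirming that the fasta file really has the expected layout. The following is a minimal sketch of such a check; the function name check_oma_fasta is hypothetical, and the rules it enforces (one sequence line per header, no blank after '>') are simply the ones implied by the steps on this page.

```shell
# Hypothetical sanity check, not part of protTrace.
# Exit status 0: every header is followed by exactly one sequence line
# and no header starts with "> ". Exit status 1 otherwise.
check_oma_fasta() {
    awk '
        BEGIN { bad = 0; n = 0; prev = "" }
        /^$/  { next }                      # empty lines are ignored
        /^> / { bad = 1 }                   # blank after ">" is still present
        /^>/  { if (prev == "h") bad = 1    # header without a sequence
                prev = "h"; n++; next }
              { if (prev != "h") bad = 1    # wrapped or orphan sequence line
                prev = "s" }
        END   { if (n == 0 || prev == "h") bad = 1
                exit bad }
    ' "$1"
}
```

A possible call is: check_oma_fasta oma-seqs.fa && echo "fasta layout looks fine"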