functional annotation updates #230
It looks like we will have to use the new version of InterProScan in order to use the updated databases. The JSON and XML outputs for the newer versions are substantially bigger (60 M vs 1.8 G), but if we gzip those files we can keep the size down pretty well (1.8 G -> 207 M).
interproscan 5.45-80_3 is what we currently use. So, long story short, the newer versions have much larger outputs, but they compress well.
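For reference, a minimal sketch of how the gzip step could be done from Python if we don't want to shell out to `gzip`. The function name `gzip_file` and the example file name in the comment are hypothetical, not part of our pipeline; the sketch only uses the standard library and streams the file so a 1.8 G output is never held in memory.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(src, keep_original=True):
    """Compress src to '<src>.gz' and return (original_bytes, compressed_bytes)."""
    src = Path(src)
    dst = src.with_name(src.name + ".gz")
    # Stream in chunks so a multi-gigabyte file is never loaded into memory.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    sizes = (src.stat().st_size, dst.stat().st_size)
    if not keep_original:
        src.unlink()
    return sizes


# Hypothetical usage (file name is an assumption):
# original, compressed = gzip_file("proteins.fa.json")
```

The returned sizes make it easy to log the before/after numbers for each file if we want to keep track of how well the new outputs compress.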
I'm inclined to remove the json and xml from the output that we provide. That said, do the gff3, tsv and gaf files that we produce convey the same information that the xml and json files do?
We can definitely remove the json. I think we can remove the xml and the others will cover the same information. They will just be more difficult for someone to parse, but I'm not sure anyone is doing that.
InterProScan support replied (our comments are in brackets): [We pull the GO from the XML into the GAF file so I think we can avoid this problem.] Another example is the version of resources used in InterProScan. The [Our readme file specifies which version of InterProScan we used, and that is associated with a specific set of analysis versions.] Finally, if you want to keep the score or e-value of matches reported by [Not sure how attached we are to the scores or e-values. Does anyone look at them?] In a nutshell, we recommend using the XML and JSON formats, especially
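If we want to sanity-check that GO terms survive without the XML, something like the sketch below could compare what the TSV carries against the GAF. It is only a sketch under assumptions: it assumes the TSV was produced with GO lookup switched on, that the protein accession is in column 1 and the GO annotations in column 14, and that GO ids look like `GO:` followed by seven digits (any suffix after the id is ignored). The function name `go_terms_from_tsv` is hypothetical.

```python
import re
from collections import defaultdict

# Matches a GO identifier; anything trailing the id (separators, suffixes) is ignored.
GO_ID = re.compile(r"GO:\d{7}")


def go_terms_from_tsv(lines):
    """Map protein accession -> set of GO ids found in InterProScan TSV rows.

    Assumes column 1 is the protein accession and column 14 holds GO terms.
    Rows without a 14th column, or with no GO ids in it, are skipped.
    """
    terms = defaultdict(set)
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 14:
            continue
        ids = GO_ID.findall(cols[13])
        if ids:
            terms[cols[0]].update(ids)
    return dict(terms)


# Hypothetical usage (file name is an assumption):
# with open("proteins.fa.tsv") as fh:
#     go_by_protein = go_terms_from_tsv(fh)
```

The score or e-value sits in the same rows of the TSV, so the same loop could be extended to keep it if anyone turns out to need those values.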
first thoughts on how to do updates:
-re-run functional annotation pipeline
-copy new functional annotation directory to analysis folder (preferably on apollo-stage, otherwise CERES)
-re-run final-workflow.cwl to generate genomic_annotated.gff (or possibly just the gff annotation portion; preferably on apollo-stage)
-remove NCBI ref track from apollo (there may be more steps necessary here)
-add new NCBI ref track (there may be more steps necessary here)
-push changes to apollo-prod, i5k-stage, i5k-prod
-re-run createsymlinks
-update tripal functional annotation page