Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,161 @@ | ||
# GO Enrichment Workshop | ||
|
||
GO enrichment: see the lecture [slides](PaulThomas_cshl2018.pdf), around slide 45, for the general idea. | ||
|
||
Use [http://pantherdb.org/](http://pantherdb.org/) | ||
1. Enter IDs: upload your list of genes, Piwi_2fold_down_id | ||
2. Select organism: Drosophila | ||
3. Select Analysis | ||
- select the statistical overrepresentation test and deselect the checkbox "use default settings" | ||
4. Submit | ||
|
||
On the next page: | ||
- upload Piwi_ref as your reference list | ||
- select GO biological Process complete | ||
- launch analysis | ||
|
||
## Input files | ||
- download the input files from the [files](files) directory of our GitHub repository | ||
|
||
|
||
# Gene Function Annotation and Gene Set Analysis: Workshop | ||
|
||
Oct. 25th, 2019 | ||
|
||
The goal of this exercise is to learn how to use a Python script to retrieve PANTHER annotation data or to perform a statistical overrepresentation test through an Application Programming Interface (API). | ||
|
||
|
||
## Download | ||
|
||
The script is developed in the GitHub repository in the following location: | ||
|
||
[https://github.com/pantherdb/pantherapi-pyclient](https://github.com/pantherdb/pantherapi-pyclient) | ||
|
||
If you have a GitHub account, you can clone the repo to your desktop app. If not, you can simply download the repo to your desktop. | ||
|
||
_<span style="text-decoration:underline;">PANTHER API Service</span>_ | ||
|
||
The PANTHER API is an interface that allows clients to access PANTHER data and tools. Users can call it directly from the command line or embed the calls in scripts and programs (Perl, Python, R, etc.). | ||
|
||
Example client code for calling the API can be found on the [PANTHER API services](http://panthertest3.med.usc.edu:8083/services/tryItOut.jsp?url=%2Fservices%2Fapi%2Fpanther) page. | ||
|
||
|
||
## Installation | ||
|
||
$ git clone https://github.com/pantherdb/pantherapi-pyclient.git | ||
|
||
$ cd pantherapi-pyclient | ||
|
||
$ python3 -m venv env | ||
|
||
$ . env/bin/activate (bash) or source env/bin/activate.csh (C-shell or tcsh) | ||
|
||
$ pip install -r requirements.txt | ||
|
||
|
||
## Running | ||
|
||
$ python3 pthr_go_annots.py --service <service type> --params_file <parameter file> --seq_id_file <gene list file> | ||
|
||
|
||
### Service Types | ||
|
||
Currently, there are three options for service types (--service or -s). | ||
|
||
|
||
|
||
* _enrich_ -- This call runs the statistical overrepresentation test on a list of genes. | ||
* _geneinfo_ -- This call returns the GO and pathway annotations for the uploaded genes. | ||
* _ortholog_ -- This call returns the orthologs of the uploaded genes. A maximum of 10 genes can be uploaded. | ||
|
||
|
||
### Parameter File | ||
|
||
These files (in JSON format) are in the params/ folder. Edit them to match the uploaded data and the type of call. \ | ||
\ | ||
**_enrich.json_** \ | ||
Use this file when _enrich_ is specified as the service type. It has four items to specify: \ | ||
1. "organism": "**9606**", _-- specify an organism by its taxon ID (see Appendix, "How to find a Taxon ID?")._ \ | ||
2. "annotDataSet": "**GO:0008150**", _-- specify an annotation data set (see Appendix, "How to find the ID for supported annotation dataset?")._ \ | ||
3. "enrichmentTestType": "**FISHER**", _-- enter either FISHER (for Fisher's exact test) or BINOMIAL (for the binomial distribution test)._ \ | ||
4. "correction": "**FDR**" _-- specify the multiple-testing correction method (FDR, BONFERRONI, or NONE)._ \ | ||
\ | ||
**_geneinfo.json_** \ | ||
Use this file when _geneinfo_ is specified as the service type. The organism taxon ID must match the uploaded data. \ | ||
\ | ||
**_ortholog.json_** \ | ||
Use this file when _ortholog_ is specified as the service type. It has two items to specify: \ | ||
1. "organism": "**9606**", _-- specify the organism of the uploaded genes._ \ | ||
2. "orthologType": "**LDO**" _-- specify the type of ortholog, e.g., LDO (for least divergent ortholog) or all._ | ||
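As a rough illustration of the four enrich items described above, the following Python sketch checks a parameter dictionary before it is saved as JSON. The helper name `check_enrich_params` is hypothetical and is not part of pantherapi-pyclient; the allowed values are only those listed in this section, and the API itself may accept others.

```python
import json

# Allowed values as listed in the parameter description above.
ALLOWED_TESTS = {"FISHER", "BINOMIAL"}
ALLOWED_CORRECTIONS = {"FDR", "BONFERRONI", "NONE"}
REQUIRED_KEYS = ("organism", "annotDataSet", "enrichmentTestType", "correction")


def check_enrich_params(params):
    """Return a list of problems found in an enrich parameter dict (hypothetical helper)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in params]
    if "organism" in params and not str(params["organism"]).isdigit():
        problems.append("organism should be a numeric taxon ID")
    if "enrichmentTestType" in params and params["enrichmentTestType"] not in ALLOWED_TESTS:
        problems.append("enrichmentTestType should be FISHER or BINOMIAL")
    if "correction" in params and params["correction"] not in ALLOWED_CORRECTIONS:
        problems.append("correction should be FDR, BONFERRONI, or NONE")
    return problems


# Values taken from the example above (human, GO biological process).
params = {
    "organism": "9606",
    "annotDataSet": "GO:0008150",
    "enrichmentTestType": "FISHER",
    "correction": "FDR",
}
problems = check_enrich_params(params)
enrich_json_text = json.dumps(params, indent=2)
print(problems or enrich_json_text)
```

If the list of problems is empty, the printed JSON text can be pasted into params/enrich.json.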
|
||
|
||
### User Gene List | ||
|
||
This should be a simple text file (.txt) with one gene identifier per line. See the following page for the supported ID types. | ||
|
||
[http://www.pantherdb.org/tips/tips_batchIdSearch_supportedId.jsp](http://www.pantherdb.org/tips/tips_batchIdSearch_supportedId.jsp) | ||
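A gene list in this format can be produced from any collection of identifiers. The sketch below is one possible way to do it: it strips whitespace, drops blank entries and duplicates, and writes one identifier per line. The function names and the placeholder IDs `GENE_A` and `GENE_B` are made up for illustration; real lists should use one of the supported ID types.

```python
import os
import tempfile


def clean_gene_ids(raw_ids):
    """Strip whitespace, drop blanks, and remove duplicates while keeping order."""
    seen = set()
    cleaned = []
    for raw in raw_ids:
        gene_id = raw.strip()
        if gene_id and gene_id not in seen:
            seen.add(gene_id)
            cleaned.append(gene_id)
    return cleaned


def write_gene_list(gene_ids, path):
    """Write one identifier per line to a plain text file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(gene_ids) + "\n")


# Placeholder identifiers, written to a temporary directory for the demonstration.
example_ids = clean_gene_ids([" GENE_A", "GENE_B\n", "", "GENE_A"])
list_path = os.path.join(tempfile.mkdtemp(), "my_genes.txt")
write_gene_list(example_ids, list_path)
print(list_path)
```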
|
||
|
||
## Usage | ||
|
||
$ python3 pthr_go_annots.py -h | ||
|
||
usage: pthr_go_annots.py [-h] [-s SERVICE] [-p PARAMS_FILE] [-f SEQ_ID_FILE] | ||
|
||
optional arguments: | ||
|
||
-h, --help show this help message and exit | ||
|
||
-s SERVICE, --service SERVICE | ||
|
||
Panther API service to call (e.g. 'enrich', | ||
|
||
'geneinfo', 'ortholog') | ||
|
||
-p PARAMS_FILE, --params_file PARAMS_FILE | ||
|
||
File path to request parameters JSON file | ||
|
||
-f SEQ_ID_FILE, --seq_id_file SEQ_ID_FILE | ||
|
||
File path to list of sequence identifiers | ||
|
||
_<span style="text-decoration:underline;">Examples:</span>_ | ||
|
||
% python3 pthr_go_annots.py -s geneinfo -p params/geneinfo.json -f resources/test_ids.txt | ||
|
||
% python3 pthr_go_annots.py -s enrich -p params/enrich.json -f resources/test_ids.txt | ||
|
||
% python3 pthr_go_annots.py -s ortholog -p params/ortholog.json -f resources/test_ids_ortholog.txt | ||
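The same calls can be assembled from Python, for example when looping over several gene lists. This is only a sketch: `build_command` is a hypothetical helper, it uses just the `-s`, `-p`, and `-f` options shown in the usage text, and the 10-gene check reflects the limit stated for the ortholog service. The `subprocess.run` call is left commented out because it must be run from the repository root with the virtual environment active.

```python
import shlex
import sys

SERVICES = ("enrich", "geneinfo", "ortholog")
ORTHOLOG_MAX_GENES = 10  # limit stated under Service Types


def build_command(service, params_file, seq_id_file, n_genes=None):
    """Build the argument list for pthr_go_annots.py (hypothetical helper)."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service}")
    if service == "ortholog" and n_genes is not None and n_genes > ORTHOLOG_MAX_GENES:
        raise ValueError("the ortholog service accepts at most 10 genes")
    return [sys.executable, "pthr_go_annots.py",
            "-s", service, "-p", params_file, "-f", seq_id_file]


# The three example calls above, built programmatically.
examples = [
    ("geneinfo", "resources/test_ids.txt"),
    ("enrich", "resources/test_ids.txt"),
    ("ortholog", "resources/test_ids_ortholog.txt"),
]
for service, id_file in examples:
    cmd = build_command(service, f"params/{service}.json", id_file)
    print(shlex.join(cmd))
    # To execute the call: import subprocess; subprocess.run(cmd, check=True)
```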
|
||
|
||
## Appendix | ||
|
||
|
||
### _How to find a Taxon ID?_ | ||
|
||
There are three ways to find the exact taxon IDs for genomes supported by PANTHER. | ||
|
||
|
||
|
||
1. Go to the PANTHER Open API site ([http://panthertest3.med.usc.edu:8083/services/tryItOut.jsp?url=%2Fservices%2Fapi%2Fpanther](http://panthertest3.med.usc.edu:8083/services/tryItOut.jsp?url=%2Fservices%2Fapi%2Fpanther)), and use the /supportedgenomes service. | ||
2. Go directly to the API link page ([http://panthertest3.med.usc.edu:8083/services/oai/pantherdb/supportedgenomes](http://panthertest3.med.usc.edu:8083/services/oai/pantherdb/supportedgenomes)). | ||
3. Run the following command: curl -X POST "http://panthertest3.med.usc.edu:8083/services/oai/pantherdb/supportedgenomes" -H "accept: application/json" | ||
|
||
Use the taxon ID that corresponds to the genomes in the ‘name’ field. | ||
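The lookup can also be scripted. The sketch below sends the same POST request as the curl command and then searches the reply for entries whose ‘name’ field matches. The exact layout of the JSON reply is not described here, so the search walks the whole structure instead of assuming fixed keys; the `sample` reply and its `taxon_id` key are invented for the demonstration and may not match the real response.

```python
import json
import urllib.request

SUPPORTED_GENOMES_URL = (
    "http://panthertest3.med.usc.edu:8083/services/oai/pantherdb/supportedgenomes"
)


def fetch_json(url):
    """POST to the service, as the curl command does, and parse the JSON reply."""
    request = urllib.request.Request(
        url, data=b"", method="POST", headers={"accept": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def find_entries_by_name(node, name):
    """Recursively collect every dict whose 'name' field equals `name` (case-insensitive)."""
    matches = []
    if isinstance(node, dict):
        if str(node.get("name", "")).lower() == name.lower():
            matches.append(node)
        for value in node.values():
            matches.extend(find_entries_by_name(value, name))
    elif isinstance(node, list):
        for item in node:
            matches.extend(find_entries_by_name(item, name))
    return matches


# Invented reply layout, used only to show the search; the real reply may differ.
sample = {"genomes": [{"name": "human", "taxon_id": 9606},
                      {"name": "mouse", "taxon_id": 10090}]}
print(find_entries_by_name(sample, "Human"))

# Against the live service (requires network access):
# reply = fetch_json(SUPPORTED_GENOMES_URL)
# print(find_entries_by_name(reply, "human"))
```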
|
||
|
||
### _How to find the ID for supported annotation dataset?_ | ||
|
||
There are three similar ways to find the IDs or text needed for the supported annotation dataset. | ||
|
||
|
||
|
||
1. Go to the PANTHER Open API site ([http://panthertest3.med.usc.edu:8083/services/tryItOut.jsp?url=%2Fservices%2Fapi%2Fpanther](http://panthertest3.med.usc.edu:8083/services/tryItOut.jsp?url=%2Fservices%2Fapi%2Fpanther)), and use the /supportedannotdatasets service. | ||
2. Go directly to the API link page ([http://panthertest3.med.usc.edu:8083/services/oai/pantherdb/supportedannotdatasets](http://panthertest3.med.usc.edu:8083/services/oai/pantherdb/supportedannotdatasets)). | ||
3. Run the following command: curl -X POST "http://panthertest3.med.usc.edu:8083/services/oai/pantherdb/supportedannotdatasets" -H "accept: application/json" | ||
|
||
Use the text in the ‘id’ field for the parameter files. | ||
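Once an ‘id’ value has been chosen, it goes into the "annotDataSet" item of the parameter file. The sketch below shows one way to update that item from Python. `set_annot_dataset` is a hypothetical helper, and it runs on a temporary file so that the real params/enrich.json is left untouched. GO:0003674 (the GO molecular function root term) is used only as an example value; confirm that it appears in the list returned by the service before relying on it.

```python
import json
import os
import tempfile


def set_annot_dataset(params_path, dataset_id):
    """Load a parameter JSON file, set "annotDataSet", and write it back."""
    with open(params_path, encoding="utf-8") as handle:
        params = json.load(handle)
    params["annotDataSet"] = dataset_id
    with open(params_path, "w", encoding="utf-8") as handle:
        json.dump(params, handle, indent=2)
        handle.write("\n")
    return params


# Demonstration on a temporary file rather than the real params/enrich.json.
demo_path = os.path.join(tempfile.mkdtemp(), "enrich.json")
with open(demo_path, "w", encoding="utf-8") as handle:
    json.dump({"organism": "9606", "annotDataSet": "GO:0008150"}, handle)

updated = set_annot_dataset(demo_path, "GO:0003674")
print(updated)
```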
|
||
|
||