annotation

Annotation-related questions (e.g. evidence codes, ID mapping...).

How are binary interactions curated by the IntAct group selected for export to GO as protein binding (GO:0005515) annotations?

All binary interaction evidence in the IntAct database, including evidence generated by spoke expansion of co-complex data, is clustered to produce a non-redundant set of protein pairs. Each binary pair is then scored by summing, over every interaction evidence associated with that pair, the weighted scores for its interaction detection method and its interaction type.
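The clustering and additive scoring described above can be sketched as follows. This is only an illustration under stated assumptions, not IntAct's implementation: the weight values, the method and type labels, the protein accessions, and the `score_pairs` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical weights, chosen only for illustration; these are NOT IntAct's values.
METHOD_WEIGHTS = {"two hybrid": 1.0, "pull down": 2.0}
TYPE_WEIGHTS = {"physical association": 1.0, "direct interaction": 3.0}


def score_pairs(evidences):
    """Cluster evidences by unordered protein pair and sum their weighted scores.

    Each evidence is a tuple:
    (protein_a, protein_b, detection_method, interaction_type).
    """
    scores = defaultdict(float)
    for prot_a, prot_b, method, itype in evidences:
        # A-B and B-A describe the same binary pair, so sort the two IDs.
        pair = tuple(sorted((prot_a, prot_b)))
        scores[pair] += METHOD_WEIGHTS[method] + TYPE_WEIGHTS[itype]
    return dict(scores)


# Made-up evidences: the first two support the same pair in opposite order.
evidences = [
    ("P12345", "Q67890", "two hybrid", "physical association"),
    ("Q67890", "P12345", "pull down", "direct interaction"),
    ("P12345", "O11111", "two hybrid", "physical association"),
]
pair_scores = score_pairs(evidences)
```

With these placeholder weights, a pair supported by several independent evidences accumulates a higher score than a pair seen only once.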

Can a single gene product be annotated with more than one GO term?

Yes!

It is possible, and usually expected, for a single gene or gene product to be associated with more than one GO term. Finding two or more different GO terms associated with a single gene or gene product in your results is not a cause for concern.

The Gene Ontology allows users to describe a gene or gene product in detail, considering three main aspects: its molecular function, the biological process in which it participates, and its cellular location.

How can I programmatically get a list of GO terms associated with a gene identifier?

The BioStars thread "How Do I Do Simple GO Term Lookup Given A Gene (Or mRNA) Identifier?" addresses this question. The responses are useful when you simply want to create a list of GO terms associated with a given ID using a programmatic approach, and are not looking to conduct enrichment analyses. https://www.biostars.org/p/1226/
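One simple approach, independent of the thread above, is to read a gene association (GAF) file directly. The sketch below assumes the standard tab-delimited GAF layout, in which lines starting with "!" are comments, column 2 is the DB Object ID, column 5 is the GO ID, and column 9 is the aspect (F, P or C). The sample records and the `go_terms_by_id` function are made up for illustration, and the records are cut to the first nine columns for brevity.

```python
import io
from collections import defaultdict


def go_terms_by_id(handle):
    """Map each DB Object ID (GAF column 2) to a set of (GO ID, aspect) tuples."""
    terms = defaultdict(set)
    for line in handle:
        # Skip header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        terms[cols[1]].add((cols[4], cols[8]))
    return terms


def row(*fields):
    """Build one tab-delimited sample record."""
    return "\t".join(fields) + "\n"


# Made-up records for illustration; real GAF files carry further columns.
SAMPLE_GAF = (
    "!gaf-version: 2.1\n"
    + row("UniProtKB", "P00001", "GENE1", "", "GO:0005515", "PMID:1", "IPI", "", "F")
    + row("UniProtKB", "P00001", "GENE1", "", "GO:0005634", "PMID:2", "IDA", "", "C")
    + row("UniProtKB", "P00002", "GENE2", "", "GO:0006355", "PMID:3", "IMP", "", "P")
)

terms = go_terms_by_id(io.StringIO(SAMPLE_GAF))
gene1_terms = sorted(terms["P00001"])
```

To run this on a downloaded file, pass an open file handle instead of the `io.StringIO` object; for a gzip-compressed file, `gzip.open(path, "rt")` from the standard library returns a suitable text handle.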

What are the differences between the data available in AmiGO and those on QuickGO?

These are some of the differences between EBI-GOA (QuickGO) and GO Central (AmiGO) when it comes to entities.

GO Central recommends that GAF annotations are made to genes, that is, 1:1 equivalents. In GOA (and consequently in QuickGO) annotations are made to proteins, and there may be multiple proteins per gene, sometimes representing different isoforms. You will see this reflected in different annotation counts, for example for mouse.

What is the best way to obtain the GO annotations for a list of Ensembl IDs in batch?

You can do this using QuickGO (www.ebi.ac.uk/QuickGO).
  1. Click on the 'Search and filter GO annotation sets' link, which will take you to a table of all annotations in the GOA database. You now have to filter this set on your gene IDs.
  2. Click on 'Filter' in the top right toolbar, select the 'Gene Product ID' tab and then paste your Ensembl gene IDs into the text box, line separated.

How do I access older versions of gene association files?

Here are several options:
  1. Old database dumps (these require knowledge of the schema and SQL to retrieve information, and you need to be able to restore the whole database): http://archive.geneontology.org/full/
  2. The CVS attic for individual gene_association files: http://cvsweb.geneontology.org/cgi-bin/cvsweb.cgi/go/gene-associations/A...
  3. The CVS repository for individual gene_association files. For example, the SGD file history goes back to 2004.

How can I do term enrichment analysis for a species that is not present in the list from AmiGO?

The Term Enrichment tool on the GO and AmiGO websites uses only data from the genomes available in the PANTHER Classification System database (http://go.pantherdb.org). Details about how to use the tools available on AmiGO and how to interpret the results are available on our website at http://geneontology.org/page/go-enrichment-analysis.

How do I find all annotations for species X that I can't find in AmiGO?

  1. Open the QuickGO web page: http://www.ebi.ac.uk/QuickGO/
  2. Click on the 'Search and Filter GO annotation sets' link located beneath the search box.
  3. This will lead you to an annotation download page where you can click the filter icon (located on the right-hand side of the page).

How do I annotate a de novo assembled transcriptome against the GO database?

You can annotate the coding sequences in your transcripts using InterProScan. You can do this using WebServices or by downloading the tool and running it locally. Details can be found at: http://www.ebi.ac.uk/interpro/search/sequence-search/

This will predict GO terms based on the domains detected, using the mapping file available here: http://geneontology.org/page/download-mappings
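The domain-to-GO step can be sketched as below. The sketch assumes the interpro2go mapping file uses lines of the form `InterPro:IPR000003 Name > GO:term name ; GO:0003677`, with "!" marking comment lines. The sample mapping lines, the transcript IDs, the domain hits and both function names are illustrative assumptions, not output from InterProScan.

```python
from collections import defaultdict


def parse_interpro2go(lines):
    """Map each InterPro accession to the set of GO IDs listed for it."""
    mapping = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and header comments
        left, _, right = line.partition(" > ")
        # Left side: "InterPro:IPR000003 <entry name>"; keep the accession only.
        interpro_id = left.split()[0].split(":", 1)[1]
        # Right side: "GO:<term name> ; GO:0003677"; keep the trailing GO ID.
        go_id = right.rsplit(" ; ", 1)[-1]
        mapping[interpro_id].add(go_id)
    return mapping


def predict_go(domain_hits, mapping):
    """Collect the GO IDs implied by the InterPro domains found per sequence."""
    return {
        seq_id: sorted(set().union(*(mapping.get(ipr, set()) for ipr in iprs)))
        for seq_id, iprs in domain_hits.items()
    }


# Sample lines that follow the mapping format, for illustration only.
SAMPLE_MAPPING = [
    "!Example header comment",
    "InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:DNA binding ; GO:0003677",
    "InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:nucleus ; GO:0005634",
]

# Hypothetical domain hits per transcript; IPR999999 has no mapping entry.
domain_hits = {"transcript_1": ["IPR000003"], "transcript_2": ["IPR999999"]}
predicted = predict_go(domain_hits, parse_interpro2go(SAMPLE_MAPPING))
```

A transcript whose domains have no entry in the mapping simply receives an empty list of predicted GO terms.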

What is the difference between the filtered and unfiltered versions of the GOA UniProt gene associations files?

The filtered version available on the GO Downloads site (gene_association.goa_uniprot_noiea) does not contain annotations for those species where a different Consortium group is primarily responsible