Annotation Tools, Downloads, and Beyond

Annotation Tools

Annotation is the practice of capturing the activities and localization of a gene product with GO terms, providing references and indicating what kind of evidence is available to support the annotations. More information on how this is done can be found in the Guide to GO Annotation Policies. Members of the GO Consortium make their annotation data freely available to the public as part of the data accessed by AmiGO 2, the GO browser and search engine. Annotation data sets from individual databases can be found on the GO annotations page.

In addition, the GO Consortium has prepared GO slims, 'slimmed down' versions of the ontologies that allow you to annotate genomes or sets of gene products to gain a high-level view of gene functions. Using GO slims, you can, for example, work out what proportion of a genome is involved in signal transduction, biosynthesis or reproduction. See the GO Slim Guide for more information.
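As an illustration of the kind of summary a GO slim makes possible, the following minimal Python sketch computes the proportion of a genome annotated to each slim term. It assumes the annotations have already been mapped up to slim terms; the function name slim_proportions, the gene names, and the toy data are hypothetical and are not part of any GO tool.

```python
from collections import defaultdict


def slim_proportions(gene_to_slim_terms, genome_size):
    """Return the fraction of the genome annotated to each slim term.

    gene_to_slim_terms maps a gene name to the set of slim terms it has
    been mapped to (assumed to be computed beforehand).
    """
    genes_per_term = defaultdict(set)
    for gene, terms in gene_to_slim_terms.items():
        for term in terms:
            genes_per_term[term].add(gene)
    return {term: len(genes) / genome_size
            for term, genes in genes_per_term.items()}


# Hypothetical toy genome of four genes; geneD has no slim annotation.
toy_annotations = {
    "geneA": {"signal transduction"},
    "geneB": {"biosynthesis", "reproduction"},
    "geneC": {"biosynthesis"},
    "geneD": set(),
}

proportions = slim_proportions(toy_annotations, genome_size=len(toy_annotations))
for term, fraction in sorted(proportions.items()):
    print(f"{term}: {fraction:.0%}")
```

Because a gene can be mapped to more than one slim term, the proportions are computed per term and need not sum to one.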

Downloads

All data from the GO project is freely available. Visit the 'Downloads' page to obtain the ontology data in a number of different formats, including XML and MySQL. The GO file format guide has more information on these formats.

If you need lists of the genes or gene products that have been associated with a particular GO term, see the current Annotations table, which tracks the number of annotations and provides links to the gene association files for each of the collaborating databases.
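Once a gene association file has been downloaded, the gene products for a given term can be pulled out with a few lines of code. The sketch below is a minimal example, not an official parser. It assumes the tab-delimited layout in which comment lines begin with '!', the object ID and symbol are in columns 2 and 3, the qualifier is in column 4, and the GO ID is in column 5; check the GO file format guide for the layout of the file you actually have. The function name and the sample rows are made up for illustration.

```python
def gene_products_for_term(lines, go_id):
    """Collect (object ID, symbol) pairs annotated to go_id.

    Assumed layout (see the lead-in): tab-delimited rows, '!' comment
    lines, ID in column 2, symbol in column 3, qualifier in column 4,
    GO ID in column 5. Rows qualified with NOT are skipped.
    """
    products = set()
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # skip malformed rows
        qualifiers = cols[3].split("|")
        if cols[4] == go_id and "NOT" not in qualifiers:
            products.add((cols[1], cols[2]))
    return products


# Made-up sample rows, truncated to the five columns used here.
sample_lines = [
    "!comment line\n",
    "DB1\tID0001\tabc1\t\tGO:0007165\n",
    "DB1\tID0002\tdef2\tNOT\tGO:0007165\n",
    "DB1\tID0003\tghi3\t\tGO:0007165\n",
    "DB1\tID0004\tjkl4\t\tGO:0009058\n",
]

found = gene_products_for_term(sample_lines, "GO:0007165")
print(sorted(found))
```

With a real file, pass an open file handle in place of sample_lines. Note that this matches the term exactly and does not include annotations to its more specific child terms.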

Contributing to GO

The GO project is constantly evolving, and we welcome feedback from all users. Learn more about how you can contribute to the GO by visiting our instructions page.

Beyond GO

GO allows us to annotate genes and their products with a limited set of attributes. For example, the GO does not allow for the description of genes in terms of which cells or tissues they're expressed in, which developmental stages they're expressed at, or their involvement in disease. It is not necessary for the GO to do these things because other ontologies are being developed for these purposes. The GO Consortium supports the development of other ontologies, and all the tools for editing and curating ontologies are freely available to the public. A list of freely available ontologies that are relevant to genomics and proteomics and are structured similarly to GO can be found at the Open Biomedical Ontologies (OBO) website. A larger list, which includes the ontologies listed at OBO as well as other controlled vocabularies that do not fulfill the OBO criteria, is available at the Ontology Working Group section of the Microarray Gene Expression Data (MGED) Network site.

Cross-products:

The existence of several ontologies will also allow us to create 'cross-products' that maximize the utility of each ontology while avoiding redundancy. For example, by combining the developmental terms in the GO process ontology with a second ontology that describes Drosophila anatomical structures, we could create an ontology of fly development. We could repeat this process for other organisms without having to clutter up GO with large numbers of species-specific terms. Similarly, we could create an ontology of biosynthetic pathways by combining the biosynthesis terms in the GO process ontology with a chemical ontology.
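The idea of a cross-product can be sketched in a few lines of Python. This is only an illustration of the combinatorial principle, not a description of how such ontologies are actually built. The term lists below are hypothetical stand-ins for a set of developmental process terms and a set of Drosophila anatomical terms, and the function name cross_product is an assumption.

```python
from itertools import product


def cross_product(process_terms, entity_terms):
    """Pair every process term with every entity term.

    Each result records the two parent terms alongside a composed label,
    so neither source ontology has to store the combined terms itself.
    """
    return [
        {"process": process, "entity": entity, "label": f"{entity} {process}"}
        for process, entity in product(process_terms, entity_terms)
    ]


# Hypothetical inputs: two process terms and two anatomical terms.
process_terms = ["development", "morphogenesis"]
fly_anatomy_terms = ["wing disc", "eye"]

fly_development = cross_product(process_terms, fly_anatomy_terms)
for term in fly_development:
    print(term["label"])
```

Two process terms and two anatomical terms yield four combined terms. Swapping in the anatomy of another organism would produce a different combined ontology without adding anything to the process list.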

Mappings to other classification systems:

GO is not the only attempt to build structured controlled vocabularies for genome annotation, nor is it the only such series of catalogs in current use. The GO project provides mappings between GO and these other systems, although we caution that these mappings are neither complete nor exact and should only be used as a guide.