guide

Introduction to the GO resource

The Gene Ontology (GO) is a comprehensive resource of computable knowledge regarding the functions of genes and gene products. As such, it is extensively used by the biomedical research community for the analysis of -omics and related data. The structured knowledge in the ontology is a crucial part of the global biomedical informatics infrastructure.

GO Annotation File Format 2.0

Annotation data is submitted to the GO Consortium in the form of gene association files, or GAFs. This guide lays out the format specifications for GAF 2.0; for the older GAF 1.0 file syntax, please see the GAF 1.0 file format guide.

Please see the information on the changes in GAF 2.0.

General information about annotation can be found in the GO annotation guide.

The Reference Genome Annotation Project

The GO Consortium coordinated an effort to maximize and optimize GO annotations for a large and representative set of key genomes, known as 'reference genomes'. The Reference Genome Annotation Project aimed to completely annotate twelve reference genomes, producing a resource that may effectively seed automatic annotation efforts of other genomes.

Guide to GO Evidence Codes

A GO annotation consists of a GO term associated with a specific reference that describes the work or analysis upon which the association between a specific GO term and gene product is based. Each annotation must also include an evidence code to indicate how the annotation to a particular term is supported. Although evidence codes do reflect the type of work or analysis described in the cited reference which supports the GO term to gene product association, they are not necessarily a classification of types of experiments/analyses.
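The pieces of an annotation described above can be sketched as a small record type. This is only an illustration: the class name, field names, and example identifiers are hypothetical placeholders rather than anything defined by the GO Consortium, and the set of codes shown is just the experimental subset of the evidence codes.

```python
from dataclasses import dataclass

# Experimental evidence codes only; the full list of GO evidence codes is longer.
EXPERIMENTAL_CODES = frozenset({"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"})


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record: one GO term tied to one gene product."""
    gene_product: str   # placeholder identifier, not a real database entry
    go_term: str        # GO term ID, e.g. "GO:0005515"
    reference: str      # the work or analysis supporting the association
    evidence_code: str  # how the annotation is supported

    def __post_init__(self):
        # Each annotation must include an evidence code.
        if not self.evidence_code:
            raise ValueError("an annotation must include an evidence code")

    def is_experimental(self) -> bool:
        return self.evidence_code in EXPERIMENTAL_CODES


# Placeholder values for illustration only.
example = Annotation("DB:GENE0001", "GO:0005515", "PMID:0000000", "IDA")
```

Keeping the evidence code as a separate field from the reference mirrors the point made above: the code says how the annotation is supported, while the reference says where the supporting work is described.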

ID Mapping Files

This page documents the file formats used to store the mapping from database object IDs to the corresponding sequence IDs in UniProtKB or NCBI.
  • gp2protein file
  • gp2rna file
  • gp_unlocalized

gp2protein file

A gp2protein file is a tab-delimited file that provides a mapping between database object IDs and protein sequence IDs. gp2protein files contributed by annotation groups are available for download.
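As a rough sketch, a file of this kind could be read into a dictionary as shown below. The text above only states that the file is tab-delimited, so several details here are assumptions: that the first column holds the database object ID and the second the sequence ID, that lines beginning with "!" are comments, and that several sequence IDs may be joined with ";". The function name and sample rows are hypothetical.

```python
def parse_gp2protein(lines):
    """Map each database object ID to a list of sequence IDs.

    Assumes two tab-separated columns, "!" comment lines, and ";" between
    multiple sequence IDs; check these against the actual file you have.
    """
    mapping = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("!"):
            continue  # skip blank lines and assumed comment lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip rows without a sequence ID column
        object_id, sequence_ids = fields[0], fields[1]
        mapping[object_id] = [s for s in sequence_ids.split(";") if s]
    return mapping


# Hypothetical rows for illustration; these are not real identifiers.
sample = [
    "! example gp2protein content\n",
    "DB:GENE0001\tUniProtKB:X00001\n",
    "DB:GENE0002\tUniProtKB:X00002;NCBI_NP:NP_000000\n",
]
mapping = parse_gp2protein(sample)
```

The function accepts any iterable of lines, so an open file handle can be passed in place of the sample list.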

Need for gp2protein file

    Current Annotations
    • Annotation Details and Downloads
    • Filtered files
    • Unfiltered files
    • gp2protein files

    Annotation Details and Downloads

    The gene association files submitted by GO Consortium members are shown in the tables below. Files are in the GO annotation file format and are compressed using the UNIX gzip utility. Please see the appropriate README file for further details on the annotation set. Any errors or omissions in annotations should be reported by writing to the GO helpdesk.
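Because the files are gzip-compressed, they can be read directly without unpacking them first. The sketch below assumes a GAF 2.0 file, in which lines starting with "!" are header or comment lines and each annotation row has 17 tab-separated columns (GO ID in column 5, evidence code in column 7); verify these positions against the format guide. The function name, file name, and row values are hypothetical placeholders.

```python
import gzip
import os
import tempfile


def iter_gaf_rows(path):
    """Yield each annotation row of a gzip-compressed GAF as a list of columns."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.startswith("!"):
                continue  # header and comment lines
            line = line.rstrip("\r\n")  # keep trailing tabs so empty columns survive
            if line:
                yield line.split("\t")


# Build a tiny demo file with one placeholder row (17 columns, values invented).
demo_row = [
    "DB", "GENE0001", "sym1", "", "GO:0005515", "PMID:0000000", "IDA", "",
    "F", "", "", "protein", "taxon:0000", "20000101", "DB", "", "",
]
demo_path = os.path.join(tempfile.mkdtemp(), "demo.gaf.gz")
with gzip.open(demo_path, "wt", encoding="utf-8") as out:
    out.write("!gaf-version: 2.0\n")
    out.write("\t".join(demo_row) + "\n")

rows = list(iter_gaf_rows(demo_path))
go_id, evidence_code = rows[0][4], rows[0][6]
```

Reading row by row through a generator avoids loading a whole annotation set into memory, which matters for the larger files.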

    Tools

    The GO Tools Registry is no longer supported.

    LEAD Database Downloads

    The GO database, accessed by tools such as AmiGO, contains ontology, gene product, sequence, and manual annotation data. It is available in a number of formats and configurations. This page details the database downloads available; for more information on the database itself, please see the GO database guide.