GO Archive

Comprehensive GO archive of the ontology and annotations from 2004 onward. Note that this replaces the former GO CVS, SVN and old archive.

[ Quick access to the GO Archive ]

GO release folder hierarchy

  • annotations : contains the GO annotations as GAF files. GPAD and GPI files are also available from March 2018 onward, with the newer GO DOI releases
  • annotations/gp2protein (up to Feb, 2018) : contains the files mapping gene products (usually model organism database (MOD) IDs) to proteins (UniProtKB accession numbers)
  • annotations/gp2rna (up to Feb, 2018) : equivalent of the gp2protein files but for non-coding RNAs (mapping to RNAcentral IDs)
  • ontology : contains the GO ontology (obo and owl files). We recommend ontology/go.obo (obo format 1.2) for users who don’t need to go back further than March 2009, and ontology/gene_ontology.obo (obo format 1.0) for users who need to go back to the beginning of the archive
  • ontology/extensions (from May, 2015) : contains the various ontologies imported or produced by GO
  • ontology/external2go : contains the files mapping GO to other resources (e.g. InterPro, KEGG, Reactome)
  • ontology/subsets (from Oct, 2004) : contains the GO slims used to simplify the ontology for specific purposes (e.g. goslim_synapse) or organisms (e.g. goslim_pombe). We recommend using the .obo files (2004-now) rather than the old, deprecated .go files (2004-2009)
  • mysql_dumps (up to Jan, 2017) : contains the MySQL dumps of GO (e.g. -assocdb , -termdb)
  • products/annotations : contains the GO annotations provided by the MODs to the GO consortium. Those files are kept for transparency, but we recommend that users use the GO annotations in annotations/, as the two can differ because of the filtering and quality checks performed by the GO consortium
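
The date ranges in the list above can be written down as a small lookup. The following is only a sketch: the function names (folders_for, ontology_file_for) are hypothetical, the day-of-month values are placeholders because the list gives months only, and folders with no stated range are assumed to be present throughout.

```python
from datetime import date

# Availability windows transcribed from the folder list above.
# None means "no bound stated". Days are placeholders: the list gives months only.
FOLDER_WINDOWS = {
    "annotations": (None, None),
    "annotations/gp2protein": (None, date(2018, 2, 28)),
    "annotations/gp2rna": (None, date(2018, 2, 28)),
    "ontology": (None, None),
    "ontology/extensions": (date(2015, 5, 1), None),
    "ontology/external2go": (None, None),
    "ontology/subsets": (date(2004, 10, 1), None),
    "mysql_dumps": (None, date(2017, 1, 31)),
    "products/annotations": (None, None),
}


def folders_for(release: date) -> list[str]:
    """Return the folders expected in a release of the given date."""
    found = []
    for folder, (start, end) in FOLDER_WINDOWS.items():
        if (start is None or release >= start) and (end is None or release <= end):
            found.append(folder)
    return found


def ontology_file_for(earliest_needed: date) -> str:
    """Pick the OBO file according to how far back an analysis has to reach."""
    if earliest_needed >= date(2009, 3, 1):
        return "ontology/go.obo"  # obo format 1.2
    return "ontology/gene_ontology.obo"  # obo format 1.0, covers the whole archive
```

For example, ontology_file_for(date(2006, 1, 1)) selects ontology/gene_ontology.obo, and folders_for(date(2018, 3, 1)) no longer lists mysql_dumps or the gp2protein and gp2rna folders.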

Additional note on GO subsets

The GO subsets from 2004 to 2018 were deposited to give easy access to the GO slim used in a particular publication or analysis, and for reuse by the GO community at the time. Some of these GO slims are no longer maintained by their authors and can therefore contain obsolete GO terms. Although we recommend using the .obo files (consistent with our current releases), the old, deprecated .go files were kept in the archive. In .go files, parentage and relationships are indicated by indentation and punctuation characters (e.g. ‘%’ to indicate an is_a relationship).
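
To illustrate how indentation and punctuation encode parentage in a .go file, here is a minimal parsing sketch. It rests on several assumptions: the sample text is hand-written rather than copied from the archive, each line is assumed to look like "<spaces><symbol><name> ; <GO ID>", and the meanings given to ‘<’ (part_of) and ‘$’ (root) are not stated on this page and should be checked against real files. Anything after the first GO ID on a line, as well as any line that does not match the pattern, is ignored.

```python
import re

# '%' is the is_a marker named in the text. '<' as part_of and '$' as root are
# assumptions about the legacy format, not facts taken from this page.
SYMBOLS = {"%": "is_a", "<": "part_of", "$": "root"}

# Leading spaces, one relationship symbol, the term name, then " ; GO:nnnnnnn".
LINE = re.compile(r"^( *)([%<$])(.+?) ; (GO:\d{7})")

# Hand-written illustration only; not copied from an archived file.
SAMPLE = """\
$Gene_Ontology ; GO:0003673
 <biological_process ; GO:0008150
  %behavior ; GO:0007610
 <molecular_function ; GO:0003674
"""


def parse_go_flat(text):
    """Yield (go_id, name, relation, parent_id) for each recognised line."""
    stack = []  # (depth, go_id) pairs for the ancestors of the current line
    for raw in text.splitlines():
        match = LINE.match(raw)
        if match is None:
            continue  # lines that do not fit the assumed pattern are skipped
        depth = len(match.group(1))
        symbol, name, go_id = match.group(2), match.group(3).strip(), match.group(4)
        # Discard terms at the same or a deeper level: they cannot be the parent.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parent = stack[-1][1] if stack else None
        stack.append((depth, go_id))
        yield go_id, name, SYMBOLS[symbol], parent


for row in parse_go_flat(SAMPLE):
    print(row)
```

The stack keeps the most recent term seen at each shallower indentation level, so the parent of a line is the nearest preceding line with smaller indentation. The .obo files state relationships explicitly and remain the recommended input.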

If you are looking for current, actively maintained GO slims, please see the guide to GO subsets.

Topic / Usage | Information | Download
Generic GO slim | Suparna Mundodi and Amelia Ireland | Aug 2002 old GO format
Honey bee ESTs | C.W. Whitfield, M.R. Band, M.F. Bonaldo, C.G. Kumar, L. Liu, J.R. Pardinas, H.M. Robertson, M.B. Soares, G.E. Robinson, PMID:11923340 | Apr 2002 old GO format
Drosophila | M. Adams, M. Ashburner, G.M. Rubin, S.E. Lewis et al.; Adams et al., PMID:10731132 | Mar 2000 old GO format
Glossina ESTs | M. Berriman | Sep 2002 old GO format
UniProtKB-GOA | N. Mulder, M. Pruess, PMID:12230037 | Nov 2002 old GO format
Mouse | The RIKEN Genome Exploration Group Phase II Team and the FANTOM Consortium, PMID:11217851 | Feb 2001 old GO format
P. falciparum | M. Berriman | July 2002 old GO format
Plant | Suparna Mundodi | Dec 2002 old GO format
Rice (Beijing) | J. Yu et al., PMID:11935017 | Apr 2002 old GO format
Rice (Syngenta) | J. Yu et al., PMID:11935018 | Apr 2002 old GO format
Yeast | SGD curators | Aug 2003 old GO format
Prokaryotic subset | GO curators. Replaced by taxon constraints. | old GO format

GO DOI releases (March 2018+)

In addition to the folder hierarchy described above, the GO DOI releases produced from March 2018 onward contain additional folders. These folders are mainly of interest to users who want or need to reproduce a GO release, for instance by using the set of programs (bin/) and libraries (lib/) available at the time of the release. Since Oct 2019, GO also provides various statistics files in release_stats/.

How the GO Archive was built

The archive was generated from data scattered across three legacy systems: the GO CVS, the GO SVN and the old product archive. These systems were created at different times to serve different purposes, and they were partially redundant, both in the types of data they contained and in the time frames they covered (e.g. SVN was maintained from 2011 to 2018 while CVS was maintained from 2002 to 2018). The project is hosted on GitHub.

Please contact the GO Helpdesk if you have any questions.