Cross-references of external classification systems to GO

Many Gene Ontology terms are cross-referenced to corresponding concepts from a number of external vocabularies, including Enzyme Commission numbers, KEGG, Reactome Pathways, and Wikipedia. The cross-references (mappings) are typically made manually. Please report any errors or suggest alternatives to the GO helpdesk.

Using and citing cross-references and mappings

If you have used a mapping in a publication or presentation, please ensure that you cite both the GO project and the source of the mapping (detailed below). See the GO citation guide for citing the GO project.

Cross-references format

  • Cross-reference files are plain text files that begin with a comment line (prefixed with "!") giving the date the file was generated and the GO release used, for example:

    ! Generated on 2018-10-05T08:40Z from the ontology 'go' with data version: 'releases/2017-03-31'

  • Each cross-reference is on its own line, in the format:

    database:term identifier (id/name) > GO:GO term name ; GO:id

    For example:

    EC:1.1.1.1 > GO:alcohol dehydrogenase (NAD) activity ; GO:0004022

  • Mappings can be many-to-many: one external identifier may map to several GO terms, and one GO term may be mapped from several external identifiers. Each pairing appears on its own line of the file.
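The format described above can be read with a few string operations. The following is a minimal sketch, not an official GO parser: the names CrossReference, parse_line and group_by_external are hypothetical, and the code assumes that the external identifier contains no spaces and that any text between the identifier and " > " is the optional external term name.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Optional, Set


class CrossReference(NamedTuple):
    """One mapping line (hypothetical container, not part of any GO library)."""
    database: str       # e.g. "EC"
    external_id: str    # e.g. "1.1.1.1"
    external_name: str  # optional name after the identifier; "" if absent
    go_name: str        # e.g. "alcohol dehydrogenase (NAD) activity"
    go_id: str          # e.g. "GO:0004022"


def parse_line(line: str) -> Optional[CrossReference]:
    """Parse one line; return None for blank lines and '!' comment lines."""
    line = line.strip()
    if not line or line.startswith("!"):
        return None
    left, arrow, right = line.partition(" > ")
    go_label, semicolon, go_id = right.rpartition(" ; ")
    if not arrow or not semicolon or not go_label.startswith("GO:"):
        raise ValueError(f"unrecognised cross-reference line: {line!r}")
    # Assumption: the identifier has no spaces; anything after it is a name.
    identifier, _, external_name = left.partition(" ")
    database, _, external_id = identifier.partition(":")
    return CrossReference(database, external_id, external_name.strip(),
                          go_label[len("GO:"):], go_id.strip())


def group_by_external(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Collect all GO ids per external identifier (mappings are many-to-many)."""
    grouped: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        xref = parse_line(line)
        if xref is not None:
            grouped[f"{xref.database}:{xref.external_id}"].add(xref.go_id)
    return dict(grouped)
```

Because each pairing is on its own line, a set per external identifier is enough to represent the many-to-many relationship in one direction; swapping the key and value gives the reverse lookup.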

Mappings file directory

Direct access to the mappings file directory is available here: http://current.geneontology.org/ontology/external2go/
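As an illustration, a mapping file could be retrieved from this directory with Python's standard library. This is only a sketch under stated assumptions: the helper names mapping_url and read_mapping are hypothetical, the file name passed in (for example "ec2go") is an assumption that should be checked against the directory listing, and the files are assumed to be UTF-8 text.

```python
from typing import List
from urllib.request import urlopen

# Directory URL as given in the text above.
BASE_URL = "http://current.geneontology.org/ontology/external2go/"


def mapping_url(file_name: str) -> str:
    """Build the URL of one mapping file inside the directory."""
    return BASE_URL + file_name


def read_mapping(file_name: str, timeout: float = 30.0) -> List[str]:
    """Download a mapping file and return its non-comment, non-blank lines.

    Assumption: file_name (e.g. "ec2go") exists in the directory listing.
    """
    with urlopen(mapping_url(file_name), timeout=timeout) as response:
        text = response.read().decode("utf-8", errors="replace")
    return [line for line in text.splitlines()
            if line.strip() and not line.startswith("!")]
```

Calling read_mapping with a file name taken from the directory listing returns the cross-reference lines with the leading "!" comment lines removed.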

Mapping Last update Download
Enzyme Commission (EC) enzyme numbers
Enzyme Commission; contact: GO Ontology editors
Constructed and maintained in the GO ontology file by GO editorial staff
Citation: Hill DP, Davis AP, Richardson JE, Corradi JP, Ringwald M, Eppig JT, Blake JA. Program description: Strategies for biological annotation of mammalian systems: implementing gene ontologies in mouse genome informatics. Genomics. May 2001;74(1):121–8.
[ PMID:11374909 | doi:10.1006/geno.2001.6513 ]
Monthly txt
KEGG pathways and reactions
Kyoto Encyclopedia of Genes and Genomes
Constructed and maintained by Amelia Ireland and a script
Daily txt
MetaCyc pathways and reactions
MetaCyc; contact: GO Ontology editors
Constructed and maintained in the GO ontology file by GO editorial staff
Daily txt
Reactome events and catalyst activities
Reactome
Constructed by Reactome curators and maintained in the GO ontology file by GO editorial staff
Daily txt
Rhea Annotated Reactions Database
Rhea
Constructed and maintained by Amelia Ireland and a script
Daily txt
EAWAG-BBD enzyme IDs
Swiss Federal Institute of Aquatic Science and Technology Biocatalysis/Biodegradation Database (EAWAG-BBD); contact: GO Ontology editors
Maintained in the GO ontology file by GO editorial staff
Daily txt
EAWAG-BBD pathway IDs
Swiss Federal Institute of Aquatic Science and Technology Biocatalysis/Biodegradation Database (EAWAG-BBD); contact: GO Ontology editors
Maintained in the GO ontology file by GO editorial staff
Daily txt
EAWAG-BBD reaction IDs
Swiss Federal Institute of Aquatic Science and Technology Biocatalysis/Biodegradation Database (EAWAG-BBD); contact: GO Ontology editors
Maintained in the GO ontology file by GO editorial staff
Daily txt

Mappings files provided by external groups

Mapping Last update Download
COG functional categories
Clusters of Orthologous Groups (COG)
Constructed by Michael Ashburner and Jane Lomax
June 2004 txt
Expressed Gene Anatomy Database (EGAD)
Constructed by Michael Ashburner
Oct 2000 txt
E. coli Genome and Proteome functional categories (GenProtEC)
Constructed by Heather Butler and Michael Ashburner
Dec 2000 txt
HAMAP families
High-Quality Automated and Manual Annotation of Microbial Proteomes (HAMAP); contact: UniProtKB-GOA team
Mapping generated by the HAMAP and UniProtKB-GOA teams
Citation: Lima T, Auchincloss AH, Coudert E, Keller G, Michoud K, Rivoire C, Bulliard V, de Castro E, Lachaize C, Baratin D, Phan I, Bougueleret L, Bairoch A. HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot. Nucleic Acids Res. Jan 2009;37(Database issue):D471–8.
[ PMID:18849571 | doi:10.1093/nar/gkn661 ]
Monthly txt
InterPro protein families, domains and functional sites
InterPro; contact: InterPro team
Citation: Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. InterPro: the integrative protein signature database. Nucleic Acids Res. Jan 2009;37(Database issue):D211–5.
[ PMID:18940856 | doi:10.1093/nar/gkn785 ]
Monthly txt
MIPS FunCat
MIPS Functional Catalogue (FunCat)
Constructed by Michael Ashburner and Midori Harris
Jan 2006 txt
MultiFun cell function assignment schema classifications
Constructed by Michael Ashburner, Jane Lomax and Margrethe Hauge Serres
Dec 2005 txt
Pfam domains
Pfam; contact: InterPro team
This mapping is generated from data supplied by InterPro for the InterPro2GO mapping
Citation: Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. InterPro: the integrative protein signature database. Nucleic Acids Res. Jan 2009;37(Database issue):D211–5.
[ PMID:18940856 | doi:10.1093/nar/gkn785 ]
Monthly txt
PIRSF protein superfamilies
PIRSF; contact: InterPro team
This mapping is generated from data supplied by InterPro for the InterPro2GO mapping
Citation: Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. InterPro: the integrative protein signature database. Nucleic Acids Res. Jan 2009;37(Database issue):D211–5.
[ PMID:18940856 | doi:10.1093/nar/gkn785 ]
Monthly txt
PRINTS domains
PRINTS; contact: InterPro team
This mapping is generated from data supplied by InterPro for the InterPro2GO mapping
Citation: Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. InterPro: the integrative protein signature database. Nucleic Acids Res. Jan 2009;37(Database issue):D211–5.
[ PMID:18940856 | doi:10.1093/nar/gkn785 ]
Monthly txt
ProDom domains
ProDom; contact: InterPro team
This mapping is generated from data supplied by InterPro for the InterPro2GO mapping
Citation: Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. InterPro: the integrative protein signature database. Nucleic Acids Res. Jan 2009;37(Database issue):D211–5.
[ PMID:18940856 | doi:10.1093/nar/gkn785 ]
Monthly txt
ProSite domains
ProSite; contact: InterPro team
This mapping is generated from data supplied by InterPro for the InterPro2GO mapping
Citation: Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C. InterPro: the integrative protein signature database. Nucleic Acids Res. Jan 2009;37(Database issue):D211–5.
[ PMID:18940856 | doi:10.1093/nar/gkn785 ]
Monthly txt
Rfam RNA families
Rfam
Constructed and maintained by Sam Griffiths-Jones and Jennifer Daub
May 2006 txt
SMART domains
SMART; contact: InterPro team
This mapping is generated from data supplied by InterPro for the InterPro2GO mapping
Citation: Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. Jan 2018;46(D1):D493–D496.
[ PMID:29040681 | doi:10.1093/nar/gkx922 ]
Monthly txt
JCVI (TIGRFAM) protein families
JCVI (TIGRFAM) protein families
Constructed by Michelle Gwinn and other TIGR staff
Monthly txt
JCVI roles
JCVI roles
Constructed by Michael Ashburner
Jan 2004 txt
UniProt Knowledgebase
Mapping of GO terms to UniProt Knowledgebase keywords.
This mapping is generated by the UniProtKB and UniProtKB-GOA teams [was SwissProt keyword]
UniProt; contact: UniProtKB-GOA team
Citation: Huntley RP, Sawford T, Mutowo-Meullenet P, Shypitsyna A, Bonilla C, Martin MJ, O’Donovan C. The GOA database: Gene Ontology annotation updates for 2015. Nucleic Acids Res. Jan 2015;43(Database issue):D1057–1063.
[ PMID:25378336 | doi:10.1093/nar/gku1113 ]
Monthly txt
UniProt subcellular location
UniProt; contact: UniProtKB-GOA team
Citation: Huntley RP, Sawford T, Mutowo-Meullenet P, Shypitsyna A, Bonilla C, Martin MJ, O’Donovan C. The GOA database: Gene Ontology annotation updates for 2015. Nucleic Acids Res. Jan 2015;43(Database issue):D1057–1063.
[ PMID:25378336 | doi:10.1093/nar/gku1113 ]
Monthly txt

Nota Bene. Files listed as being updated monthly are completely regenerated during the monthly UniProtKB-GOA release. Minor corrections to files are not recorded here; only comprehensive updates are listed.