the Gene Ontology


The Beginner's Guide to Modifying the Ontologies

This guide is intended for new GO curators. It assumes that you have a basic understanding of GO's structure and scope (see the GO introductory documentation) but no technical knowledge whatsoever.

  • The Basics
  • Accounts
  • OBO-Edit
  • Mailing lists
  • Setting Up CVS Access
  • The CVS repository
  • The .tcshrc File
  • Accessing the CVS repository
  • Setting your GO ID Range
  • The GO Numbers File
  • Setting Your Number Range in OBO-Edit
  • The 'go' directory

The Basics

Accounts

Before you can assign SourceForge requests to yourself and edit the ontologies, you will need the following accounts set up:

  • a UNIX account or an xterm or terminal emulator application at your work site
  • a GO CVS account (e-mail the GO helpdesk to request one)
  • a user account on the SourceForge web site

Jane, Amelia and Midori are SourceForge administrators so they can grant the privileges (tech and admin) that you need.

OBO-Edit

Editing the flat files using a text editor is inefficient and error-prone, so most of our editing is done using the ontology editing program OBO-Edit, which can be found on the GO SourceForge downloads page. For more information on how to use OBO-Edit, see the Gene Ontology OBO-Edit User Guide. There are two associated SourceForge trackers which you may find useful: the OBO-Edit feature request tracker and the OBO-Edit bug report tracker.

Mailing lists

There are a number of email lists available, depending on your field of interest and the work you intend to get involved in. See the mailing lists page for details of the lists available and how to subscribe.

Setting Up CVS Access

The CVS repository

The GO Consortium have set up a CVS repository to house the GO data. CVS (Concurrent Versions System) is a tool that lets multiple users work on the same files at the same time while keeping a history of every change. The majority of the GO project data, including the ontology file and the annotation data, can be downloaded directly from the GO CVS repository at Stanford. To learn more about CVS, have a look at the CVS manual.

The .tcshrc File

Open a terminal window (or, if using Windows, access your UNIX account). When you log in to UNIX, you are placed in a program called the shell. Shells come in different flavors, but we use either the C shell (csh) or a variant of it called the TC shell (tcsh). The shell interprets the commands you type and passes them to the kernel, which carries them out. Each time tcsh starts, it reads the .tcshrc file in your home directory. In this file you can control your working environment by setting environment variables (for example, a default text editor or a default printer) and aliases, which are short text strings that stand in for longer commands.

If you use this method, you can change the alias to anything you like. Adding some more aliases to your .tcshrc file for accessing CVS will also save you a lot of typing in the future. If you're quite new to this kind of thing and you're planning to follow this series of guides right through, you can add the aliases in the example .tcshrc file linked below; the commands you need to type will then correspond exactly to those in the instructions that follow. If you wish to make your own aliases, please remember to set the environment variable CVS_RSH (setenv CVS_RSH ssh), and refer to the example .tcshrc file to see the commands that I discuss later written out in full.

The example .tcshrc includes commands under '# Basic GO cvs setup:' and '# To check and commit OBO flatfiles to cvs' that are used in the subsequent guides. You may also wish later to add and experiment with the commands listed under '# Other useful GO cvs setup commands'. These are things that I have found useful but not crucial to the subsequent pages of this guide.
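
As a rough sketch of what such a setup might look like, here is a minimal .tcshrc fragment. Only setenv CVS_RSH ssh and the alias names cvsco and cvsup come from this guide. The expansions of the aliases, the module name go, and the CVSROOT value (USERNAME, cvs.example.org and /path/to/cvsroot are placeholders) are assumptions for illustration, so take the real definitions from the example .tcshrc file.

```tcsh
# Sketch only: the real definitions are in the example .tcshrc file.

# From this guide: tell CVS to connect over ssh.
setenv CVS_RSH ssh

# Placeholder repository location; USERNAME, the host name and the path
# are NOT real values and must be replaced with the ones you are given.
setenv CVSROOT :ext:USERNAME@cvs.example.org:/path/to/cvsroot

# Assumed expansions for the two aliases used later in this guide.
alias cvsco 'cvs checkout go'       # first download of the repository
alias cvsup 'cvs update -d -P go'   # refresh an existing copy
```

In cvs update, -d picks up directories that have been added to the repository since your last update and -P prunes directories that have become empty.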

To use the aliases, you will need to use a text editor to edit your .tcshrc configuration file. There are several UNIX text editors, but one of the easiest to use is emacs. To open a text file in emacs, type

emacs filename

The online emacs tutorial can be accessed from within the program by typing

ctrl-h t

To edit your .tcshrc file, go to your home directory and then open the file using the following commands:

cd ~
emacs .tcshrc

Cut and paste in the aliases, and then save and close the file by typing

ctrl-x ctrl-s
ctrl-x ctrl-c

To put your changes into effect, you will now need to either restart your terminal application and open a new window, or tell the terminal to reread the .tcshrc file using the command

source ~/.tcshrc

You are now ready to GO! The function of these aliases will be explained below and later on in the Guide to Addressing a SourceForge Request.

Accessing the CVS repository

The first thing to do is to download the current version of the files in the GO CVS repository. To fit in with the aliases and directory structures described in this guide you should make sure you are in the Documents directory.

To change your ssh password (first time only), open a terminal window and type

sshpassword

Change the password to something more secure.

You will be accessing CVS via ssh, which would normally require you to type your password for every command. You can avoid this by adding a key to your computer and to the remote host. To do this, follow the instructions at the Apple developer site under the heading 'Secure CVS via ssh', using Version 2. Once you have made the key, send the public key file id_dsa.pub to the Stanford Genetics system administrators (sysadmin@genome.stanford.edu). Make sure the files have the following permissions:

-rw------- 1 <user> <group> 672 Jul 17 11:19 id_dsa
-rw-r--r-- 1 <user> <group> 616 Jul 17 11:19 id_dsa.pub
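
If your key files do not show these permissions, chmod can set them. The sketch below works on empty stand-in files in a scratch directory so that it is safe to try out. The location of your real keys is an assumption (ssh usually keeps them in the .ssh directory under your home directory), so run the two chmod commands there once you are happy with what they do.

```shell
# Scratch directory with empty stand-in files, so no real key is touched.
cd `mktemp -d`
touch id_dsa id_dsa.pub

chmod 600 id_dsa       # private key: read/write for the owner only
chmod 644 id_dsa.pub   # public key: owner read/write, everyone else read

# The first column of the listing should match the permissions shown above.
ls -l id_dsa id_dsa.pub
```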

To check out the contents of the GO repository, type

cvsco

The directories and files will then start to download; depending on the speed of your internet connection this may take an hour or more. This download puts the current GO files into a new directory called go; the contents of this directory are listed at the bottom of the page.

To update your existing copy of the GO repository, use the command

cvsup

You will also need to create a directory called old within the ontology directory in which you will store a copy of the unedited ontology file to check your edits against. The ontology directory is located under the main go directory, so create the old directory using the command

mkdir go/ontology/old
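
The idea behind the old directory can be sketched as follows. The example runs in a scratch directory with a one-line stand-in file, and the file name gene_ontology.obo is an assumption; in practice you would copy the real ontology file inside your checked-out go directory.

```shell
# Work in a scratch directory so that nothing in a real checkout is touched.
cd `mktemp -d`
mkdir -p go/ontology/old

# One-line stand-in for the ontology file (the file name is an assumption).
echo 'format-version: 1.2' > go/ontology/gene_ontology.obo

# Before editing: keep an unedited copy in the old directory.
cp go/ontology/gene_ontology.obo go/ontology/old/

# Simulate an edit, then compare the edited file with the saved copy.
# diff exits with a non-zero status when the files differ, hence the echo.
echo 'remark: example edit' >> go/ontology/gene_ontology.obo
diff go/ontology/old/gene_ontology.obo go/ontology/gene_ontology.obo || echo 'files differ: see the lines above'
```

Lines that diff marks with > exist only in the edited file, and lines marked with < exist only in the saved copy.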

Setting your GO ID Range

The GO Numbers File

The GO_numbers file in the CVS repository contains a list of who 'owns' each range of GO IDs. To assign yourself some numbers on this list, you need to edit your local copy and then commit it back to the CVS repository. The numbers file is in plain text format and can be opened in any text editor; to open the file in the text editor emacs, type the command

emacs go/numbers/go_numbers

Scroll down to the text that reads "Allocated number ranges for additions" and add your chosen range of numbers in the format:

XXX: GO:0048001 to GO:0050000

where XXX stands for your initials. Then, locate the number range in the file and add a header for your section, in the format

[your name]'s GO numbers

You can then start your list of GO numbers and corresponding GO terms, using the following format:

GO:0048001 XXX erythrose-4-phosphate dehydrogenase
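
Before you settle on a range, it is worth checking that it does not overlap one that somebody else has already claimed. The sketch below shows one way this might be done with awk. It runs against a small stand-in list rather than the real numbers file: the ABC line is made up, the XXX line is the example above, and the names ranges.txt and check_range.awk are only illustrative.

```shell
cd `mktemp -d`

# Stand-in for the "Allocated number ranges for additions" list.
# The ABC line is made up; the XXX line is the example from this guide.
cat > ranges.txt <<'EOF'
ABC: GO:0046001 to GO:0048000
XXX: GO:0048001 to GO:0050000
EOF

# awk program: report any allocated range that overlaps the range s..e.
cat > check_range.awk <<'EOF'
$3 == "to" {
    lo = substr($2, 4) + 0    # numeric part after "GO:"
    hi = substr($4, 4) + 0
    if (s <= hi && e >= lo) { print "overlaps " $0; clash = 1 }
}
END { if (clash == 0) print "range is free" }
EOF

# Candidate range GO:0050001 to GO:0052000, given without leading zeros.
awk -v s=50001 -v e=52000 -f check_range.awk ranges.txt
```

With the two sample lines this prints "range is free". A candidate range of 49000 to 51000 would instead be reported as overlapping the XXX line.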

Once you've finished editing the file, save and close it, then commit it back to the CVS repository:

cvs ci go/numbers/go_numbers

Setting Your Number Range in OBO-Edit

Once you have claimed a set of numbers in the GO numbers file, you must also set these numbers in the OBO-Edit configuration. To do this, open OBO-Edit and choose 'OBO-Edit Configuration Manager' from the 'plugins' menu. In this window, fill in your range of numbers: the first number goes in the 'start of id range' line and the last in the 'End of id range' line. Press 'Save Configuration' to save your changes.

The 'go' directory

doc
documents, including abbreviations for database cross-references and their definitions, and other files with miscellaneous types of information
external2go
files that contain a mapping from an external system (e.g. EC classifications, InterPro, MetaCyc) to GO terms
gene-associations
files created through curation by a group or project (e.g. SGD, FlyBase, GOA) that associate gene products with GO terms, including evidence codes, references, and other supporting information for the annotation
gene-associations/readme
Readme files for the above-mentioned gene-association files
GO_slims
files that contain a subset of GO terms, a selected set of high level terms from one, two, or three of the Gene Ontologies, particularly useful for reporting results of genome-level analyses; different GO_slims have been created for various purposes
gp2protein
index files between database object ID and sequence IDs.
meeting
Announcements, Agendas, and Meeting Programs for GO Users Meetings
meeting/minutes
Minutes from GO Consortium Meetings
numbers
Defines number ranges for particular projects and lists numbers which have been used
ontology
Contains the OBO format ontology file, as well as the ontology and definitions files in the old GO format
software
Contributed software from the GO Consortium
software/Python/mgibrowser
A GO term browser provided by the MGI group
software/SGD
Software from SGD group
software/SGD/geneAssociation
Perl scripts used by SGD to create their gene-association file.
software/SGD/goAnnotation
Perl scripts for Oracle storage of GO associations
software/SGD/goPath
Perl scripts to parse ontology files and output child/parent flat file
software/utilities
Various utilities, including the script to create the GO Monthly Reports
software/utilities/goview
Java used to create the go-diff daily messages that show changes in the ontologies
synonyms
Files to define synonym types and list synonyms for GO terms
teaching_resources
collection of posters, presentations and tutorials contributed by members of the Consortium
www
document root for the GO web site (www.geneontology.org); contains HTML files used for the GO web site
xml
archive for XML files created by the GOC group at LBNL (Berkeley, California).

Last modified Tuesday, 20-Mar-2007 10:40:07 PDT
Copyright © 1999-2008 the Gene Ontology