Biorithm  1.1
GeneOntology Class Reference

#include <GO.h>


Public Types

typedef boost::tuple< GOFunction *, unsigned int, unsigned int > GOLCATuple
 Compute all the least common ancestors of function1 and function2.

Public Member Functions

 GeneOntology ()
 GeneOntology (istream *in)
void read (string file)
void read (istream *in)
void readTerms (string file, vector< GOFunction * > &functions)
 Read a list of line-delimited GO terms and return a vector of GOFunction pointers corresponding to those terms.
void computeEnrichmentsOntologizer (const set< string > &geneSet, MyAnnotations &annotations, vector< EnrichmentRecord< string, string > > &rejalts) const
 Compute functions enriched in geneSet as described in the Ontologizer algorithm.
void computeFunctionsWithNoChildren (set< string > &functions, set< string > &childless)
 Computes the subset of 'functions' that have no children as members of 'functions'.
void computeFunctionsWithNoChildren (set< GOFunction * > &functions, set< GOFunction * > &childless)
 Computes the subset of 'functions' that have no children as members of 'functions'.
int computeLeastCommonAncestors (GOFunction *function1, GOFunction *function2, vector< GOLCATuple > &lcas)
void computeMostSpecificFunctions (set< GOFunction * > &functions, set< GOFunction * > &specificFunctions)
 Computes the subset of GOFunctions in functions that are not ancestors of any other GOFunctions in functions.
template<typename InputIterator , typename OutputIterator , typename Extractor >
void computeMostSpecificFunctionsTemplated (InputIterator functionsBegin, InputIterator functionsEnd, OutputIterator specificFunctionsBegin, Extractor _extractor)
void computeMostSpecificFunctions (set< string > &functions, set< string > &specificFunctions)
 Computes the subset of functions that are not ancestors of any other members of functions.
set< GOFunction * > getAncestors (GOFunction *obj, int type=0)
void setBoolRelationship (bool r)
void setStrRelationshipFile (string file)
map< GOFunction *, unsigned int > getAncestorsWithDistance (GOFunction *obj, int type=0)
set< GOFunction * > getParents (GOFunction *obj, int type=0)
set< GOFunction * > getDescendants (GOFunction *obj, int type=0)
map< GOFunction *, unsigned int > getDescendantsWithDistance (GOFunction *obj, int type=0)
set< GOFunction * > getChildren (GOFunction *obj, int type=0)
GOFunction * getFunctionById (int id) const
GOFunction * getFunctionById (string id) const
GOFunction * getFunctionByName (string name) const
bool isKnownFunctionId (string id) const
void computeFunctionDepths (unsigned int type=0)
void computeSiblings (unsigned int type=0)
 Compute the siblings of each function in the Gene Ontology.
void computeTopologicalSort (vector< GOFunction * > &sortedFunctions, unsigned int type=0)
 Performs a topological sort of the functions in the Gene Ontology from the root downward.
void printFunctionDepths (ostream &fstr, unsigned int type=0)
void printFunctionDescendants (ostream &fstr, unsigned int type=0)
void printRelationships ()
 Prints out a file representing the relationships between GO functions.
void groupFunctionsByDepth (map< unsigned int, set< GOFunction * > > &functionsByDepth, string functionCategory="biological_process", GOGroupByDepthType groupingType=GO_GROUP_BY_MINIMUM_DEPTH)
void groupFunctionsByCutoff (map< string, map< unsigned int, set< GOFunction * > > > &functionsByCutoff, const vector< unsigned int > &cutoffs, const map< BioFunction, unsigned int > &geneCounts, bool allowDescendants=false)
void groupFunctionsByDepth2 (map< string, map< unsigned int, set< GOFunction * > > > &functionsByDepth, bool allowDescendants=false)
void groupFunctionsByDepth2 (map< string, map< unsigned int, set< BioFunction > > > &functionsByDepth, bool allowDescendants=false)
 ~GeneOntology ()

Detailed Description

A class that encapsulates a gene ontology, read in by parsing a Gene Ontology file in OBO format.


Member Typedef Documentation

typedef boost::tuple< GOFunction *, unsigned int, unsigned int > GeneOntology::GOLCATuple

Compute all the least common ancestors of function1 and function2.

The method defines the "height" of a common ancestor of two functions as the sum of the distances of the common ancestor to the two functions. The least common ancestors are all the common ancestors of minimum height.

Parameters:
[in] function1  A pointer to an instance of GOFunction.
[in] function2  A pointer to an instance of GOFunction.
[out] lcas  A reference to a vector of GOLCATuple, which will store all the least common ancestors.
Returns:
The height of the least common ancestors. The return value is -1 if function1 and function2 do not have any common ancestor.
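
The height definition above can be sketched with two ancestor-to-distance maps, such as those returned by getAncestorsWithDistance(). This is a minimal illustration under stated assumptions, not the library's implementation: plain strings stand in for GOFunction pointers, and DistanceMap and leastCommonAncestors are hypothetical names introduced here.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in: maps an ancestor's id to its shortest distance
// from the query function.
typedef std::map<std::string, unsigned int> DistanceMap;

// Collects every common ancestor of minimum height, where
// height = distance to function1 + distance to function2.
// Returns that minimum height, or -1 if there is no common ancestor.
int leastCommonAncestors(const DistanceMap& ancestors1,
                         const DistanceMap& ancestors2,
                         std::vector<std::string>& lcas) {
    lcas.clear();
    int best = -1;
    for (DistanceMap::const_iterator it = ancestors1.begin();
         it != ancestors1.end(); ++it) {
        DistanceMap::const_iterator other = ancestors2.find(it->first);
        if (other == ancestors2.end()) continue;  // not a common ancestor
        int height = static_cast<int>(it->second + other->second);
        if (best == -1 || height < best) {
            best = height;   // found a lower common ancestor
            lcas.clear();
        }
        if (height == best) lcas.push_back(it->first);
    }
    return best;
}
```

Several ancestors can tie for the minimum height, which is why the result is a vector rather than a single value.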

Constructor & Destructor Documentation

GeneOntology::GeneOntology ( )

Constructor

GeneOntology::GeneOntology ( istream *  in)

Constructor

Parameters:
in  A stream pointer (e.g., &cin, &infile, &myStringStream)

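GeneOntology::~GeneOntology ( )

Destructor warning: be careful when passing an instance of GeneOntology between functions, because the destructor explicitly deletes functions from memory. If the destructor is called more than once on the same functions (for example, through a copy of the instance), the program is likely to crash at run time.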


Member Function Documentation

void GeneOntology::computeEnrichmentsOntologizer ( const set< string > &  geneSet,
MyAnnotations &  annotations,
vector< EnrichmentRecord< string, string > > &  rejalts 
) const

Compute functions enriched in geneSet as described in the Ontologizer algorithm.

The Ontologizer algorithm attempts to take parent-child relationships in the Gene Ontology into account. This method implements that approach. Specifically, the universe is not the set of all annotated genes but the set of genes annotating the parents of the current function. This method traverses the Gene Ontology DAG recursively from top to bottom.
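
The change of universe described above can be sketched as follows. This is only one possible reading, not the method's actual code: it takes the union of the genes annotated to all parents as the universe and produces the four counts that a hypergeometric-style test would need. The names ParentChildCounts and parentChildCounts, and the use of plain string sets in place of MyAnnotations, are assumptions made for illustration. The sketch also assumes that every gene annotated to a function is annotated to its parents as well.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

typedef std::set<std::string> GeneSet;

// Hypothetical record of the counts for one function.
struct ParentChildCounts {
    std::size_t universe;           // genes annotated to any parent
    std::size_t annotated;          // universe genes annotated to the function
    std::size_t selected;           // universe genes that are in the query set
    std::size_t selectedAnnotated;  // universe genes in both
};

ParentChildCounts parentChildCounts(
        const std::string& function,
        const std::vector<std::string>& parents,
        const std::map<std::string, GeneSet>& annotations,
        const GeneSet& geneSet) {
    // Universe: union of the genes annotated to the parents,
    // rather than the set of all annotated genes.
    GeneSet universe;
    for (const std::string& parent : parents) {
        auto it = annotations.find(parent);
        if (it != annotations.end()) {
            universe.insert(it->second.begin(), it->second.end());
        }
    }
    ParentChildCounts counts = {universe.size(), 0, 0, 0};
    auto own = annotations.find(function);
    for (const std::string& gene : universe) {
        bool inFunction = own != annotations.end() && own->second.count(gene) > 0;
        bool inQuery = geneSet.count(gene) > 0;
        if (inFunction) ++counts.annotated;
        if (inQuery) ++counts.selected;
        if (inFunction && inQuery) ++counts.selectedAnnotated;
    }
    return counts;
}
```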

void GeneOntology::computeFunctionDepths ( unsigned int  type = 0)

Compute the depth of each node in the Gene Ontology.

Performs a topological sort of the functions in the Gene Ontology from the root downward and stores the depths with the nodes. The root of the Gene Ontology has depth -1, the root of each category has depth 0, and so on.

Parameters:
type  An integer defining what type of relations to use when computing depths.
void GeneOntology::computeFunctionsWithNoChildren ( set< string > &  functions,
set< string > &  childless 
)

Computes the subset of 'functions' that have no children as members of 'functions'.

Parameters:
[in] functions  The set of functions in which to search for functions with no children.
[out] childless  The subset of 'functions' which have no children in 'functions'.
Note:
This method is a thin wrapper around computeFunctionsWithNoChildren(set< GOFunction* >, set< GOFunction* >)
void GeneOntology::computeFunctionsWithNoChildren ( set< GOFunction * > &  functions,
set< GOFunction * > &  childless 
)

Computes the subset of 'functions' that have no children as members of 'functions'.

Parameters:
[in] functions  The set of functions in which to search for functions with no children.
[out] childless  The subset of 'functions' which have no children in 'functions'.
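
A minimal sketch of this filter, assuming a hypothetical ChildMap (term id to the ids of its direct children) in place of the ontology, and string ids in place of GOFunction pointers. Only direct children are checked: a term whose grandchild is in the set, but whose children are not, still counts as childless.

```cpp
#include <map>
#include <set>
#include <string>

typedef std::set<std::string> TermSet;
// Hypothetical stand-in for the ontology: term id -> ids of its direct children.
typedef std::map<std::string, TermSet> ChildMap;

// Inserts into 'childless' every member of 'functions' that has no
// direct child which is itself a member of 'functions'.
void functionsWithNoChildren(const TermSet& functions,
                             const ChildMap& children,
                             TermSet& childless) {
    for (const std::string& f : functions) {
        bool hasChildInSet = false;
        auto it = children.find(f);
        if (it != children.end()) {
            for (const std::string& child : it->second) {
                if (functions.count(child) > 0) {
                    hasChildInSet = true;
                    break;
                }
            }
        }
        if (!hasChildInSet) childless.insert(f);
    }
}
```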
void GeneOntology::computeMostSpecificFunctions ( set< GOFunction * > &  functions,
set< GOFunction * > &  specificFunctions 
)

Computes the subset of GOFunctions in functions that are not ancestors of any other GOFunctions in functions.

Note:
functions is not a const reference since this method invokes non-const methods on the GOFunctions stored in functions.
void GeneOntology::computeMostSpecificFunctions ( set< string > &  functions,
set< string > &  specificFunctions 
)

Computes the subset of functions that are not ancestors of any other members of functions.

Note:
This method is a thin wrapper around GeneOntology::computeMostSpecificFunctions(set< GOFunction * >, set< GOFunction * >).
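
The selection rule can be sketched on a toy parent map. This is an illustration under assumptions, not the class's code: ParentMap, ancestorsOf and mostSpecificFunctions are hypothetical names, string ids replace GOFunction pointers, and the relation graph is assumed to be acyclic.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

typedef std::set<std::string> TermSet;
// Hypothetical stand-in for the ontology: term id -> ids of its direct parents.
typedef std::map<std::string, TermSet> ParentMap;

// Collects all ancestors of 'term' by following parent links upward.
TermSet ancestorsOf(const std::string& term, const ParentMap& parents) {
    TermSet seen;
    std::vector<std::string> stack(1, term);
    while (!stack.empty()) {
        std::string current = stack.back();
        stack.pop_back();
        auto it = parents.find(current);
        if (it == parents.end()) continue;
        for (const std::string& parent : it->second) {
            if (seen.insert(parent).second) stack.push_back(parent);
        }
    }
    return seen;
}

// Keeps the members of 'functions' that are not an ancestor of any
// other member of 'functions'.
void mostSpecificFunctions(const TermSet& functions,
                           const ParentMap& parents,
                           TermSet& specific) {
    TermSet covered;  // every term that is an ancestor of some member
    for (const std::string& f : functions) {
        TermSet above = ancestorsOf(f, parents);
        covered.insert(above.begin(), above.end());
    }
    for (const std::string& f : functions) {
        if (covered.count(f) == 0) specific.insert(f);
    }
}
```

Unlike the childless filter, this check uses all ancestors, so a term is dropped even when only a distant descendant of it is in the set.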
void GeneOntology::computeTopologicalSort ( vector< GOFunction * > &  sortedFunctions,
unsigned int  type = 0 
)

Performs a topological sort of the functions in the Gene Ontology from the root downward.

The nodes in each connected component of the DAG appear consecutively in the sorted order.

Parameters:
sortedFunctions  A reference to a vector of pointers to instances of GOFunction. The method fills this vector with the sorted list of functions.
type  An integer defining what type of relations to use when performing the topological sort.
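
For readers unfamiliar with the technique, a root-downward topological sort can be sketched with Kahn's algorithm. This is a generic illustration, not the method's implementation: ChildMap and topologicalSort are hypothetical names, string ids replace GOFunction pointers, and the sketch does not attempt to keep the nodes of each connected component consecutive.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the ontology: term id -> ids of its direct children.
typedef std::map<std::string, std::set<std::string> > ChildMap;

// Returns the nodes ordered so that every parent precedes its children.
std::vector<std::string> topologicalSort(const ChildMap& children) {
    // Count, for each node, the parents that have not been emitted yet.
    std::map<std::string, int> pending;
    for (const auto& entry : children) {
        pending.insert(std::make_pair(entry.first, 0));
        for (const auto& child : entry.second) ++pending[child];
    }
    // Roots have no parents, so they are ready immediately.
    std::queue<std::string> ready;
    for (const auto& entry : pending) {
        if (entry.second == 0) ready.push(entry.first);
    }
    std::vector<std::string> sorted;
    while (!ready.empty()) {
        std::string node = ready.front();
        ready.pop();
        sorted.push_back(node);
        auto it = children.find(node);
        if (it == children.end()) continue;
        for (const auto& child : it->second) {
            // A child becomes ready once all of its parents are emitted.
            if (--pending[child] == 0) ready.push(child);
        }
    }
    return sorted;
}
```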
set< GOFunction * > GeneOntology::getAncestors ( GOFunction *  obj,
int  type = 0 
)

Performs a transitive closure traversal of the gene ontology to determine the ancestors of a particular function. Partial traversals are cached, which makes this a constant-time operation when repeated, and fast if some of the ancestors have already been computed.

Parameters:
obj  A gene ontology function object pointer.
type  An integer defining what type of traversal will be done:
  • 0 means any parent relation (currently is_a and part_of)
  • 1 means follow only is_a relations
  • 2 means follow only part_of relations
Returns:
A set of ancestors of the type specified
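
The caching behaviour described above can be sketched with a small memoizing class. This is an illustration under assumptions rather than the class's code: AncestorCache is a hypothetical name, string ids replace GOFunction pointers, the relation type filter is omitted, and the graph is assumed to be acyclic. The sketch uses std::map, whose lookups are logarithmic; a hash-based container would be needed for constant-time repeated queries.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

typedef std::set<std::string> TermSet;

// Hypothetical memoizing wrapper around a term -> direct parents map.
class AncestorCache {
public:
    explicit AncestorCache(const std::map<std::string, TermSet>& parents)
        : parents_(parents) {}

    // Returns all ancestors of 'term', computing them at most once.
    const TermSet& ancestors(const std::string& term) {
        auto cached = cache_.find(term);
        if (cached != cache_.end()) return cached->second;  // repeated query
        TermSet result;
        auto it = parents_.find(term);
        if (it != parents_.end()) {
            for (const std::string& parent : it->second) {
                result.insert(parent);
                // Reuses any partial traversal that is already cached.
                const TermSet& above = ancestors(parent);
                result.insert(above.begin(), above.end());
            }
        }
        return cache_[term] = result;
    }

    // Number of terms whose ancestor sets are currently cached.
    std::size_t cachedTerms() const { return cache_.size(); }

private:
    std::map<std::string, TermSet> parents_;
    std::map<std::string, TermSet> cache_;
};
```

A query for one term also fills the cache for every term above it, which is what makes later queries on related terms fast.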
map< GOFunction *, unsigned int > GeneOntology::getAncestorsWithDistance ( GOFunction *  obj,
int  type = 0 
)

This method is similar to GeneOntology::getAncestors(), except that along with each ancestral function, the method also returns the shortest distance from the ancestor to the argument obj.

Parameters:
obj  A gene ontology function object pointer.
type  An integer defining what type of traversal will be done:
  • 0 means any parent relation (currently is_a and part_of)
  • 1 means follow only is_a relations
  • 2 means follow only part_of relations
Returns:
A map whose keys are the ancestors of the type specified and whose values are the shortest distances to those ancestors.
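
One way to obtain shortest distances is a breadth-first traversal over parent links, since it reaches each ancestor first along a shortest path. The sketch below is an assumption about how this could be done, not the method's code: ancestorsWithDistance is a hypothetical free function, string ids replace GOFunction pointers, and the relation type filter is omitted.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>

// Returns each ancestor of 'term' with the length of the shortest
// chain of parent links leading to it.
std::map<std::string, unsigned int> ancestorsWithDistance(
        const std::string& term,
        const std::map<std::string, std::set<std::string> >& parents) {
    std::map<std::string, unsigned int> distance;
    std::queue<std::pair<std::string, unsigned int> > frontier;
    frontier.push(std::make_pair(term, 0u));
    while (!frontier.empty()) {
        std::pair<std::string, unsigned int> current = frontier.front();
        frontier.pop();
        auto it = parents.find(current.first);
        if (it == parents.end()) continue;
        for (const auto& parent : it->second) {
            // The first visit is along a shortest path, so later visits
            // to the same ancestor are ignored.
            if (distance.count(parent) == 0) {
                distance[parent] = current.second + 1;
                frontier.push(std::make_pair(parent, current.second + 1));
            }
        }
    }
    return distance;
}
```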
set< GOFunction * > GeneOntology::getChildren ( GOFunction *  obj,
int  type = 0 
)

Returns a function's direct children.

Parameters:
obj  A gene ontology function object pointer.
type  An integer defining what type of children to return:
  • 0 means any child relation (currently is_a and part_of)
  • 1 means only is_a relations
  • 2 means only part_of relations
Returns:
A set of children of the type specified
set< GOFunction * > GeneOntology::getDescendants ( GOFunction *  obj,
int  type = 0 
)

Performs a transitive closure traversal of the gene ontology to determine the descendants of a particular function. Partial traversals are cached, which makes this a constant-time operation when repeated, and fast if some of the descendants have already been computed.

Parameters:
obj  A gene ontology function object pointer.
type  An integer defining what type of traversal will be done:
  • 0 means any descendant relation (currently is_a and part_of)
  • 1 means follow only is_a relations
  • 2 means follow only part_of relations
Returns:
A set of descendants of the type specified
map< GOFunction *, unsigned int > GeneOntology::getDescendantsWithDistance ( GOFunction *  obj,
int  type = 0 
)

This method is similar to GeneOntology::getDescendants(), except that along with each descendant function, the method also returns the shortest distance from the descendant to the argument obj.

Parameters:
obj  A gene ontology function object pointer.
type  An integer defining what type of traversal will be done:
  • 0 means any descendant relation (currently is_a and part_of)
  • 1 means follow only is_a relations
  • 2 means follow only part_of relations
Returns:
A map whose keys are the descendants of the type specified and whose values are the shortest distances to those descendants.

GOFunction * GeneOntology::getFunctionById ( int  id) const

Constant time lookup of function by id (or alternative id)

Parameters:
id  The id to look up.
Returns:
A GOFunction pointer that has that id.
GOFunction* GeneOntology::getFunctionById ( string  id) const [inline]

Constant time lookup of function by id (or alternative id)

Parameters:
id  The id to look up. This id is a string.
Returns:
A GOFunction pointer that has that id.

The method converts id to an integer after removing any prefix of the form "GO:".
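
The conversion can be sketched as follows. goIdToInt is a hypothetical helper written for illustration, not a member of GeneOntology, and how the actual method handles malformed ids is not stated in this documentation.

```cpp
#include <string>

// Strips an optional "GO:" prefix and parses the remaining digits.
// std::stoi throws std::invalid_argument if no digits follow.
int goIdToInt(const std::string& id) {
    const std::string prefix = "GO:";
    std::string digits = id;
    if (digits.compare(0, prefix.size(), prefix) == 0) {
        digits = digits.substr(prefix.size());
    }
    return std::stoi(digits);  // leading zeros are ignored in base 10
}
```

With this helper, both "GO:0008150" and "0008150" map to the integer 8150.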

GOFunction * GeneOntology::getFunctionByName ( string  name) const

Constant time lookup of function by its name (or synonym)

Parameters:
name  The name to look up.
Returns:
A pointer to the GOFunction that has that name.
set< GOFunction * > GeneOntology::getParents ( GOFunction *  obj,
int  type = 0 
)

Returns a function's direct parents.

Parameters:
obj  A gene ontology function object pointer.
type  An integer defining what type of parents to return:
  • 0 means any parent relation (currently is_a and part_of)
  • 1 means only is_a relations
  • 2 means only part_of relations
Returns:
A set of parents of the type specified
void GeneOntology::groupFunctionsByDepth ( map< unsigned int, set< GOFunction * > > &  functionsByDepth,
string  functionCategory = "biological_process",
GOGroupByDepthType  groupingType = GO_GROUP_BY_MINIMUM_DEPTH 
)

Group functions that have the same depth and return a map whose keys are depths and whose values are sets of functions.

Parameters:
functionCategory  The name of the GO category (also known as namespace) to restrict functions to. The default is "biological_process". You can specify "molecular_function" or "cellular_component" too. You can also specify the abbreviations "p", "f", and "c", respectively.
groupingType  Specifies how to group the functions. This variable can take three values: GO_GROUP_BY_MINIMUM_DEPTH, use the minimum depth of a function; GO_GROUP_BY_AVERAGE_DEPTH, use the average depth of a function; and GO_GROUP_BY_MAXIMUM_DEPTH, use the maximum depth of a function.
void GeneOntology::printFunctionDepths ( ostream &  fstr,
unsigned int  type = 0 
)

Print the minimum depth of each function to fstr.

void GeneOntology::printFunctionDescendants ( ostream &  fstr,
unsigned int  type = 0 
)

Print the descendants of each function to fstr.

Parameters:
[out] fstr  The output stream to print the results to.

The output stream contains three columns.

  1. The identifier of the function.
  2. The identifier of the descendant.
  3. The smallest distance between the function and the descendant.

void GeneOntology::printRelationships ( )

Prints out a file representing the relationships between GO functions.

The method outputs a file with the following structure: FunctionA FunctionB RelationType

Here, FunctionA is related to FunctionB through the relation RelationType. RelationType can take one of the following values: is_a, part_of, regulates, positively_regulates, or negatively_regulates.

void GeneOntology::read ( string  file)

Reads information about the Gene Ontology.

Parameters:
file  A string containing the location of a file in OBO format.
void GeneOntology::read ( istream *  in)

Reads information about the Gene Ontology.

Parameters:
in  A stream pointer (e.g., &cin, &infile, &myStringStream)
void GeneOntology::readTerms ( string  file,
vector< GOFunction * > &  functions 
)

Read a list of line-delimited GO terms and return a vector of GOFunction pointers corresponding to those terms.

Parameters:
file  The location of a file that lists one GO term per line. Each line should be tab-delimited with at least one column. The first column must be the GO id (e.g., 'GO:0008150' for biological_process).
functions  As the file is read, a GOFunction pointer is created for each function and added to the end of the functions vector.
