Biorithm
1.1
#include <GO.h>
Public Types | |
typedef boost::tuple < GOFunction *, unsigned int, unsigned int > | GOLCATuple |
Compute all the least common ancestors of function1 and function2. | |
Public Member Functions | |
GeneOntology () | |
GeneOntology (istream *in) | |
void | read (string file) |
void | read (istream *in) |
void | readTerms (string file, vector< GOFunction * > &functions) |
Read a list of line-delimited GO terms and return a vector of GOFunction pointers corresponding to those terms. | |
void | computeEnrichmentsOntologizer (const set< string > &geneSet, MyAnnotations &annotations, vector< EnrichmentRecord< string, string > > &rejalts) const |
Compute functions enriched in geneSet as described in the Ontologizer algorithm. | |
void | computeFunctionsWithNoChildren (set< string > &functions, set< string > &childless) |
Computes the subset of 'functions' that have no children as members of 'functions'. | |
void | computeFunctionsWithNoChildren (set< GOFunction * > &functions, set< GOFunction * > &childless) |
Computes the subset of 'functions' that have no children as members of 'functions'. | |
int | computeLeastCommonAncestors (GOFunction *function1, GOFunction *function2, vector< GOLCATuple > &lcas) |
void | computeMostSpecificFunctions (set< GOFunction * > &functions, set< GOFunction * > &specificFunctions) |
Computes the subset of GOFunctions in functions that are not ancestors of any other GOFunctions in functions. | |
template<typename InputIterator , typename OutputIterator , typename Extractor > | |
void | computeMostSpecificFunctionsTemplated (InputIterator functionsBegin, InputIterator functionsEnd, OutputIterator specificFunctionsBegin, Extractor _extractor) |
void | computeMostSpecificFunctions (set< string > &functions, set< string > &specificFunctions) |
Computes the subset of functions that are not ancestors of any other members of functions. | |
set< GOFunction * > | getAncestors (GOFunction *obj, int type=0) |
void | setBoolRelationship (bool r) |
void | setStrRelationshipFile (string file) |
map< GOFunction *, unsigned int > | getAncestorsWithDistance (GOFunction *obj, int type=0) |
set< GOFunction * > | getParents (GOFunction *obj, int type=0) |
set< GOFunction * > | getDescendants (GOFunction *obj, int type=0) |
map< GOFunction *, unsigned int > | getDescendantsWithDistance (GOFunction *obj, int type=0) |
set< GOFunction * > | getChildren (GOFunction *obj, int type=0) |
GOFunction * | getFunctionById (int id) const |
GOFunction * | getFunctionById (string id) const |
GOFunction * | getFunctionByName (string name) const |
bool | isKnownFunctionId (string id) const |
void | computeFunctionDepths (unsigned int type=0) |
void | computeSiblings (unsigned int type=0) |
Compute the siblings of each function in the Gene Ontology. | |
void | computeTopologicalSort (vector< GOFunction * > &sortedFunctions, unsigned int type=0) |
Performs a topological sort of the functions in the Gene Ontology from the root downward. | |
void | printFunctionDepths (ostream &fstr, unsigned int type=0) |
void | printFunctionDescendants (ostream &fstr, unsigned int type=0) |
void | printRelationships () |
Prints out a file representing the relationships between GO functions. | |
void | groupFunctionsByDepth (map< unsigned int, set< GOFunction * > > &functionsByDepth, string functionCategory="biological_process", GOGroupByDepthType groupingType=GO_GROUP_BY_MINIMUM_DEPTH) |
void | groupFunctionsByCutoff (map< string, map< unsigned int, set< GOFunction * > > > &functionsByCutoff, const vector< unsigned int > &cutoffs, const map< BioFunction, unsigned int > &geneCounts, bool allowDescendants=false) |
void | groupFunctionsByDepth2 (map< string, map< unsigned int, set< GOFunction * > > > &functionsByDepth, bool allowDescendants=false) |
void | groupFunctionsByDepth2 (map< string, map< unsigned int, set< BioFunction > > > &functionsByDepth, bool allowDescendants=false) |
~GeneOntology () |
A class that encapsulates a gene ontology, read in by parsing a Gene Ontology OBO file.
typedef boost::tuple< GOFunction *, unsigned int, unsigned int > GeneOntology::GOLCATuple |
Compute all the least common ancestors of function1 and function2.
The method defines the "height" of a common ancestor of two functions as the sum of the distances of the common ancestor to the two functions. The least common ancestors are all the common ancestors of minimum height.
[in] | function1 | a pointer to an instance of GOFunction. |
[in] | function2 | a pointer to an instance of GOFunction. |
[out] | lcas | a reference to a vector of GOLCATuple, which will store all the least common ancestors. |
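The height definition above can be illustrated with a minimal sketch. It does not use the GeneOntology class: `ParentMap`, `LcaTuple`, `ancestorDistances`, and `leastCommonAncestors` are hypothetical stand-ins, identifiers are plain strings, a function is assumed to count as its own ancestor at distance 0, and the meaning of the integer return value (the minimum height, or -1 when there is no common ancestor) is an assumption, not documented behavior.

```cpp
#include <climits>
#include <map>
#include <queue>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the ontology: child id -> direct parent ids.
using ParentMap = std::map<std::string, std::vector<std::string>>;
// (ancestor, distance to function1, distance to function2).
using LcaTuple = std::tuple<std::string, unsigned int, unsigned int>;

// Breadth-first search upward: shortest distance from `start` to each ancestor.
// `start` itself is included with distance 0 (an assumption of this sketch).
std::map<std::string, unsigned int> ancestorDistances(const ParentMap &parents,
                                                      const std::string &start) {
    std::map<std::string, unsigned int> dist{{start, 0}};
    std::queue<std::string> pending;
    pending.push(start);
    while (!pending.empty()) {
        std::string current = pending.front();
        pending.pop();
        unsigned int next = dist.at(current) + 1;
        auto it = parents.find(current);
        if (it == parents.end()) continue;
        for (const std::string &parent : it->second) {
            // emplace only succeeds the first (and therefore shortest) time.
            if (dist.emplace(parent, next).second) pending.push(parent);
        }
    }
    return dist;
}

// Fills `lcas` with every common ancestor of minimum height, where height is
// the sum of the two distances. Returns that height, or -1 if none exists.
int leastCommonAncestors(const ParentMap &parents, const std::string &f1,
                         const std::string &f2, std::vector<LcaTuple> &lcas) {
    const auto d1 = ancestorDistances(parents, f1);
    const auto d2 = ancestorDistances(parents, f2);
    unsigned int best = UINT_MAX;
    lcas.clear();
    for (const auto &entry : d1) {
        auto other = d2.find(entry.first);
        if (other == d2.end()) continue;  // not a common ancestor
        unsigned int height = entry.second + other->second;
        if (height < best) {
            best = height;
            lcas.clear();
        }
        if (height == best) lcas.emplace_back(entry.first, entry.second, other->second);
    }
    return lcas.empty() ? -1 : static_cast<int>(best);
}
```

When two functions share several parents, every one of those parents ties at the minimum height and all of them are returned, which is why the output is a vector rather than a single value.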
GeneOntology::GeneOntology | ( | ) | [inline] |
Constructor
GeneOntology::GeneOntology | ( | istream * | in | ) |
Constructor
*in | A pointer to an input stream (e.g., &cin, &infile, &myStringStream) |
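Destructor warning: the destructor explicitly deletes the functions from memory. Be careful when copying an instance of GeneOntology, for example by passing it by value: if the destructor runs more than once over the same functions, the result is a double delete, which is undefined behavior at run time and will typically crash the program. Passing the instance by reference or by pointer avoids the copy.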
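The hazard in the warning above is a destructor running more than once over the same raw pointers. A minimal sketch of one common way to rule that out at compile time follows. `Term`, `TermOwner`, and `countTerms` are hypothetical names used only for illustration; nothing here is taken from GO.h, and whether GeneOntology itself forbids copying is not stated in this reference.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

struct Term {  // hypothetical stand-in for GOFunction
    std::string name;
};

// Hypothetical owner that deletes its terms in the destructor.
class TermOwner {
public:
    TermOwner() = default;
    // A copy would hold the same pointers and delete them a second time,
    // so copying is forbidden outright.
    TermOwner(const TermOwner &) = delete;
    TermOwner &operator=(const TermOwner &) = delete;

    ~TermOwner() {
        for (Term *term : terms_) delete term;
    }

    Term *add(const std::string &name) {
        terms_.push_back(new Term{name});
        return terms_.back();
    }

    std::size_t size() const { return terms_.size(); }

private:
    std::vector<Term *> terms_;
};

// Taking the owner by const reference never copies it, so the destructor
// runs exactly once, when the original object goes out of scope.
std::size_t countTerms(const TermOwner &owner) { return owner.size(); }
```

With the copy operations deleted, a call such as `void f(TermOwner o)` fails to compile instead of crashing later.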
void GeneOntology::computeEnrichmentsOntologizer | ( | const set< string > & | geneSet, |
MyAnnotations & | annotations, | ||
vector< EnrichmentRecord< string, string > > & | rejalts | ||
) | const |
Compute functions enriched in geneSet as described in the Ontologizer algorithm.
The Ontologizer algorithm attempts to take parent-child relationships in the Gene Ontology into account. This method implements that approach. Specifically, the universe is not the set of all annotated genes but the set of genes annotating the parents of the current function. This method traverses the Gene Ontology DAG recursively from top to bottom.
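To make the change of universe concrete, here is a minimal sketch of a one-term test in which the population is the set of genes annotated to the term's parents. It is an illustration under stated assumptions, not the method's implementation: the use of an upper-tail hypergeometric probability, the helper names (`logChoose`, `hypergeometricTail`, `overlap`, `parentChildPValue`), and the requirement that `termGenes` be a subset of `parentGenes` are all assumptions, and the recursive traversal of the DAG is not shown.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <set>
#include <string>

using GeneSet = std::set<std::string>;

// Logarithm of the binomial coefficient C(n, k); lgamma avoids overflow.
double logChoose(unsigned int n, unsigned int k) {
    return std::lgamma(n + 1.0) - std::lgamma(k + 1.0) - std::lgamma(n - k + 1.0);
}

// P(X >= k) when drawing n items without replacement from a population of
// N items that contains K successes. Requires K <= N and n <= N.
double hypergeometricTail(unsigned int N, unsigned int K, unsigned int n, unsigned int k) {
    double p = 0.0;
    const unsigned int top = std::min(K, n);
    for (unsigned int i = k; i <= top; ++i) {
        if (n - i > N - K) continue;  // not enough non-successes to fill the draw
        p += std::exp(logChoose(K, i) + logChoose(N - K, n - i) - logChoose(N, n));
    }
    return std::min(p, 1.0);
}

// Number of genes of `a` that also occur in `b`.
std::size_t overlap(const GeneSet &a, const GeneSet &b) {
    std::size_t count = 0;
    for (const auto &gene : a) count += b.count(gene);
    return count;
}

// parentGenes: genes annotated to the parents of the term (the universe).
// termGenes:   genes annotated to the term; assumed to be a subset of parentGenes.
double parentChildPValue(const GeneSet &study, const GeneSet &parentGenes,
                         const GeneSet &termGenes) {
    const auto N = static_cast<unsigned int>(parentGenes.size());
    const auto K = static_cast<unsigned int>(termGenes.size());
    const auto n = static_cast<unsigned int>(overlap(study, parentGenes));
    const auto k = static_cast<unsigned int>(overlap(study, termGenes));
    return hypergeometricTail(N, K, n, k);
}
```

Study genes that are not annotated to any parent do not enter the test at all, which is the practical effect of shrinking the universe. How the genes of several parents are combined is left to the caller in this sketch.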
void GeneOntology::computeFunctionDepths | ( | unsigned int | type = 0 | ) |
Compute the depths of each node in the Gene Ontology.
Performs a topological sort of the functions in the Gene Ontology from the root downward and stores the depths with the nodes. The root of the Gene Ontology has depth -1, the root of each category has depth 0, and so on.
type | an integer defining what type of relations to use when computing depths. |
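The depth convention above (ontology root at -1, category roots at 0) can be sketched as follows. This sketch swaps in a plain breadth-first search instead of the topological sort the method uses, and it records only the minimum depth of each node; `ChildMap` and `minimumDepths` are hypothetical names, and the relation `type` argument is ignored.

```cpp
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for the ontology: parent id -> direct child ids.
using ChildMap = std::map<std::string, std::vector<std::string>>;

// Minimum depth of every node reachable from `root`. The root gets -1 so
// that its direct children (the category roots) land at depth 0.
std::map<std::string, int> minimumDepths(const ChildMap &children, const std::string &root) {
    std::map<std::string, int> depth{{root, -1}};
    std::queue<std::string> pending;
    pending.push(root);
    while (!pending.empty()) {
        std::string current = pending.front();
        pending.pop();
        const int next = depth.at(current) + 1;
        auto it = children.find(current);
        if (it == children.end()) continue;
        for (const auto &child : it->second) {
            // Breadth-first order reaches each node first along a shortest path.
            if (depth.emplace(child, next).second) pending.push(child);
        }
    }
    return depth;
}
```

A node with several parents can sit at different depths along different paths; this sketch keeps only the smallest one.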
void GeneOntology::computeFunctionsWithNoChildren | ( | set< string > & | functions, |
set< string > & | childless | ||
) |
Computes the subset of 'functions' that have no children as members of 'functions'.
[in] | functions | the set of functions in which to search for functions with no children. |
[out] | childless | the subset of 'functions' which have no children in 'functions'. |
void GeneOntology::computeFunctionsWithNoChildren | ( | set< GOFunction * > & | functions, |
set< GOFunction * > & | childless | ||
) |
Computes the subset of 'functions' that have no children as members of 'functions'.
[in] | functions | the set of functions in which to search for functions with no children. |
[out] | childless | the subset of 'functions' which have no children in 'functions'. |
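A minimal sketch of the selection rule described for both overloads, using string identifiers and a hypothetical `ChildMap` in place of the ontology. The name `functionsWithNoChildren` and the choice to clear the output set first are assumptions of the sketch.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the ontology: parent id -> direct child ids.
using ChildMap = std::map<std::string, std::vector<std::string>>;

// Keeps every member of `functions` none of whose direct children is also
// a member of `functions`.
void functionsWithNoChildren(const ChildMap &children,
                             const std::set<std::string> &functions,
                             std::set<std::string> &childless) {
    childless.clear();
    for (const auto &function : functions) {
        bool hasChildInSet = false;
        auto it = children.find(function);
        if (it != children.end()) {
            for (const auto &child : it->second) {
                if (functions.count(child) != 0) {
                    hasChildInSet = true;
                    break;
                }
            }
        }
        if (!hasChildInSet) childless.insert(function);
    }
}
```

Only direct children are inspected here, so a function whose grandchild is in the set, but whose child is not, is still reported as childless.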
void GeneOntology::computeMostSpecificFunctions | ( | set< GOFunction * > & | functions, |
set< GOFunction * > & | specificFunctions | ||
) |
Computes the subset of GOFunctions in functions that are not ancestors of any other GOFunctions in functions.
void GeneOntology::computeMostSpecificFunctions | ( | set< string > & | functions, |
set< string > & | specificFunctions | ||
) |
Computes the subset of functions that are not ancestors of any other members of functions.
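The rule "not an ancestor of any other member" can be sketched with string identifiers as follows. `ParentMap`, `ancestorsOf`, and `mostSpecificFunctions` are hypothetical names for illustration, and the sketch assumes that ancestry follows every listed parent link regardless of relation type.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the ontology: child id -> direct parent ids.
using ParentMap = std::map<std::string, std::vector<std::string>>;

// All proper ancestors of `start`, found with an explicit stack.
std::set<std::string> ancestorsOf(const ParentMap &parents, const std::string &start) {
    std::set<std::string> seen;
    std::vector<std::string> stack{start};
    while (!stack.empty()) {
        std::string current = stack.back();
        stack.pop_back();
        auto it = parents.find(current);
        if (it == parents.end()) continue;
        for (const auto &parent : it->second) {
            if (seen.insert(parent).second) stack.push_back(parent);
        }
    }
    return seen;
}

// Keeps the members of `functions` that are not an ancestor of another member.
void mostSpecificFunctions(const ParentMap &parents,
                           const std::set<std::string> &functions,
                           std::set<std::string> &specificFunctions) {
    // Collect everything that is an ancestor of some member.
    std::set<std::string> covered;
    for (const auto &function : functions) {
        const auto above = ancestorsOf(parents, function);
        covered.insert(above.begin(), above.end());
    }
    specificFunctions.clear();
    for (const auto &function : functions) {
        if (covered.count(function) == 0) specificFunctions.insert(function);
    }
}
```

Because the check uses all ancestors rather than direct children, a term is dropped even when the intermediate terms linking it to a more specific member are absent from the input set.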
void GeneOntology::computeTopologicalSort | ( | vector< GOFunction * > & | sortedFunctions, |
unsigned int | type = 0 |
) |
Performs a topological sort of the functions in the Gene Ontology from the root downward.
The nodes in each connected component of the DAG appear consecutively in the sorted order.
sortedFunctions | a reference to a vector of pointers to instances of GOFunction. The method fills this vector with the sorted list of functions. |
type | an integer defining what type of relations to use when performing the topological sort. |
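For readers unfamiliar with the technique, a root-downward topological sort can be sketched with Kahn's algorithm. This is a generic illustration with hypothetical names (`ChildMap`, `topologicalSort`); it does not reproduce the property that nodes of each connected component appear consecutively, and it ignores the relation `type`.

```cpp
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for the ontology: parent id -> direct child ids.
using ChildMap = std::map<std::string, std::vector<std::string>>;

// Kahn's algorithm: repeatedly emit a node whose parents were all emitted.
std::vector<std::string> topologicalSort(const ChildMap &children) {
    // Count incoming (parent) edges for every node that appears anywhere.
    std::map<std::string, unsigned int> incoming;
    for (const auto &entry : children) {
        incoming.emplace(entry.first, 0);  // no effect if already counted
        for (const auto &child : entry.second) ++incoming[child];
    }

    std::queue<std::string> ready;
    for (const auto &entry : incoming) {
        if (entry.second == 0) ready.push(entry.first);  // roots
    }

    std::vector<std::string> order;
    while (!ready.empty()) {
        std::string current = ready.front();
        ready.pop();
        order.push_back(current);
        auto it = children.find(current);
        if (it == children.end()) continue;
        for (const auto &child : it->second) {
            if (--incoming[child] == 0) ready.push(child);
        }
    }
    return order;  // shorter than the node count if the graph has a cycle
}
```

Every parent appears before each of its children in the result, which is the property a depth computation from the root downward relies on.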
set< GOFunction * > GeneOntology::getAncestors | ( | GOFunction * | obj, |
int | type = 0 |
) |
Performs a transitive closure traversal of the gene ontology to determine the ancestors of a particular function. Partial traversals are cached, making this a constant-time operation when repeated and fast if some of the ancestors have already been computed.
obj | A gene ontology function object pointer |
type | an integer defining what type of traversal will be done:
|
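The caching idea described for getAncestors() can be sketched as a memoized recursion. `AncestorCache` is a hypothetical class, not part of GO.h; it uses string identifiers, ignores the traversal `type`, assumes the graph is acyclic, and stores results in a `std::map`, so a repeated query costs one map lookup rather than strictly constant time.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the ontology: child id -> direct parent ids.
using ParentMap = std::map<std::string, std::vector<std::string>>;

class AncestorCache {
public:
    explicit AncestorCache(ParentMap parents) : parents_(std::move(parents)) {}

    // Returns all ancestors of `id`, computing each node's set at most once.
    const std::set<std::string> &ancestors(const std::string &id) {
        auto cached = cache_.find(id);
        if (cached != cache_.end()) return cached->second;

        std::set<std::string> result;
        auto it = parents_.find(id);
        if (it != parents_.end()) {
            for (const auto &parent : it->second) {
                result.insert(parent);
                // Reuses the parent's cached set if it was computed earlier.
                const auto &above = ancestors(parent);
                result.insert(above.begin(), above.end());
            }
        }
        ++computed_;
        return cache_.emplace(id, std::move(result)).first->second;
    }

    // Number of nodes whose ancestor set has been computed so far.
    unsigned int computed() const { return computed_; }

private:
    ParentMap parents_;
    std::map<std::string, std::set<std::string>> cache_;
    unsigned int computed_ = 0;
};
```

Querying a second term that shares ancestors with the first only computes the new term itself, which matches the "fast if some of the ancestors have already been computed" behavior described above.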
map< GOFunction *, unsigned int > GeneOntology::getAncestorsWithDistance | ( | GOFunction * | obj, |
int | type = 0 |
) |
This method is similar to GeneOntology::getAncestors(), except that along with each ancestral function, the method also returns the shortest distance from the ancestor to the argument obj.
obj | A gene ontology function object pointer |
type | an integer defining what type of traversal will be done:
|
set< GOFunction * > GeneOntology::getChildren | ( | GOFunction * | obj, |
int | type = 0 |
) |
Returns a function's direct children
obj | A gene ontology function object pointer |
type | an integer defining what type of children to return:
|
set< GOFunction * > GeneOntology::getDescendants | ( | GOFunction * | obj, |
int | type = 0 |
) |
Performs a transitive closure traversal of the gene ontology to determine the descendants of a particular function. Partial traversals are cached, making this a constant-time operation when repeated and fast if some of the descendants have already been computed.
obj | A gene ontology function object pointer |
type | an integer defining what type of traversal will be done:
|
map< GOFunction *, unsigned int > GeneOntology::getDescendantsWithDistance | ( | GOFunction * | obj, |
int | type = 0 |
) |
This method is similar to GeneOntology::getDescendants(), except that along with each descendant function, the method also returns the shortest distance from the descendant to the argument obj.
obj | A gene ontology function object pointer |
type | an integer defining what type of traversal will be done:
|
GOFunction * GeneOntology::getFunctionById | ( | int | id | ) | const |
Constant time lookup of function by id (or alternative id)
id | The id to look up |
GOFunction* GeneOntology::getFunctionById | ( | string | id | ) | const [inline] |
Constant time lookup of function by id (or alternative id)
id | The id to look up. This id is a string. |
The method converts id to an integer by removing any prefix of the form "GO:".
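A minimal sketch of the conversion just described, assuming the remainder of the string is a decimal number. `parseGoId` is a hypothetical helper, not a function from GO.h.

```cpp
#include <string>

// Strips an optional "GO:" prefix and parses the rest as a decimal integer.
// std::stoi throws std::invalid_argument if no digits follow the prefix.
int parseGoId(const std::string &id) {
    const std::string prefix = "GO:";
    const bool hasPrefix = id.compare(0, prefix.size(), prefix) == 0;
    return std::stoi(hasPrefix ? id.substr(prefix.size()) : id);
}
```

Leading zeros are dropped by the conversion, so 'GO:0008150' and '8150' map to the same integer.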
GOFunction * GeneOntology::getFunctionByName | ( | string | name | ) | const |
Constant time lookup of function by its name (or synonym)
name | The name to look up |
set< GOFunction * > GeneOntology::getParents | ( | GOFunction * | obj, |
int | type = 0 |
) |
Returns a function's direct parents
obj | A gene ontology function object pointer |
type | an integer defining what type of parents to return:
|
void GeneOntology::groupFunctionsByDepth | ( | map< unsigned int, set< GOFunction * > > & | functionsByDepth, |
string | functionCategory = "biological_process" , |
GOGroupByDepthType | groupingType = GO_GROUP_BY_MINIMUM_DEPTH |
) |
Group functions having the same depth. The method fills functionsByDepth, a map whose keys are depths and whose values are the sets of functions at each depth.
functionCategory | the name of the GO category (aka namespace) to restrict functions to. The default is "biological_process". You can specify "molecular_function" or "cellular_component" too. You can also specify the abbreviations "p", "f", and "c", respectively. |
groupingType | specifies how to group the functions. This variable can take three values: GO_GROUP_BY_MINIMUM_DEPTH, use the minimum depth of a function; GO_GROUP_BY_AVERAGE_DEPTH, use the average depth of a function; and GO_GROUP_BY_MAXIMUM_DEPTH, use the maximum depth of a function. |
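The three grouping modes can be sketched as follows. The sketch assumes each function comes with the list of depths at which it is reached along different paths; the `Grouping` enum, the `groupByDepth` function, and rounding the average to the nearest integer are all assumptions made for illustration and are not taken from GO.h.

```cpp
#include <algorithm>
#include <cmath>
#include <map>
#include <numeric>
#include <set>
#include <string>
#include <vector>

// Hypothetical counterpart of the three grouping modes listed above.
enum class Grouping { Minimum, Average, Maximum };

// pathDepths: function id -> depth of that function along each path from the root.
std::map<unsigned int, std::set<std::string>> groupByDepth(
    const std::map<std::string, std::vector<unsigned int>> &pathDepths, Grouping grouping) {
    std::map<unsigned int, std::set<std::string>> groups;
    for (const auto &entry : pathDepths) {
        const auto &depths = entry.second;
        if (depths.empty()) continue;  // function not reachable; skip it
        unsigned int key = 0;
        switch (grouping) {
        case Grouping::Minimum:
            key = *std::min_element(depths.begin(), depths.end());
            break;
        case Grouping::Maximum:
            key = *std::max_element(depths.begin(), depths.end());
            break;
        case Grouping::Average: {
            const double sum = std::accumulate(depths.begin(), depths.end(), 0.0);
            // Rounding to the nearest integer is an assumption of this sketch.
            key = static_cast<unsigned int>(std::lround(sum / depths.size()));
            break;
        }
        }
        groups[key].insert(entry.first);
    }
    return groups;
}
```

The modes only differ for functions reachable along paths of different lengths; a function with a single path lands in the same group under all three.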
void GeneOntology::printFunctionDepths | ( | ostream & | fstr, |
unsigned int | type = 0 |
) |
Print the minimum depth of each function to fstr.
void GeneOntology::printFunctionDescendants | ( | ostream & | fstr, |
unsigned int | type = 0 |
) |
Print the descendants of each function to fstr.
[out] | fstr | the output stream to print the results to. |
The output stream contains three columns.
void GeneOntology::printRelationships | ( | ) |
Prints out a file representing the relationships between GO functions.
The method outputs a file with the following structure: FunctionA FunctionB RelationType
where FunctionA is related to FunctionB through the relation RelationType. RelationType can take one of the following values: is_a, part_of, regulates, positively_regulates, or negatively_regulates.
void GeneOntology::read | ( | string | file | ) |
Reads information about the Gene Ontology.
file | A string containing the location of a file in OBO format. |
void GeneOntology::read | ( | istream * | in | ) |
Reads information about the Gene Ontology.
*in | A pointer to an input stream (e.g., &cin, &infile, &myStringStream) |
void GeneOntology::readTerms | ( | string | file, |
vector< GOFunction * > & | functions | ||
) |
Read a list of line-delimited GO terms and return a vector of GOFunction pointers corresponding to those terms.
file | The location of a file that lists one GO term per line. Each line should be tab-delimited with at least one column. The first column must be the GO id (e.g., 'GO:0008150' for biological_process). |
functions | As the file is read, a GOFunction pointer is created for each function and added to the end of the functions vector. |
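The input format described above (one term per line, tab-delimited, GO id in the first column) can be parsed as in the following minimal sketch. It only extracts the id strings; `readTermIds` is a hypothetical helper, and skipping blank lines and stripping a trailing carriage return are assumptions rather than documented behavior of readTerms().

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns the first tab-delimited column of every non-empty line.
std::vector<std::string> readTermIds(std::istream &in) {
    std::vector<std::string> ids;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        // find() returns npos when there is no tab, so substr keeps the whole line.
        const std::string id = line.substr(0, line.find('\t'));
        if (!id.empty()) ids.push_back(id);
    }
    return ids;
}
```

Each returned string could then be passed to a lookup such as getFunctionById(string) to obtain the corresponding GOFunction pointer.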