Biorithm  1.1
Enrichment< S, T > Class Template Reference

Enrichment Class.

#include <enrichment.h>

Public Member Functions

 Enrichment ()
void clear ()
double computeHyperGeometricProbability (int globTotal, int globTrue, int locTotal, int locTrue)
void addUnannotated (int numS, int numT)
bool check (double value, LIBENRICHMENT_TEST_TYPE test, double alpha, unsigned int numTests, unsigned int rank)
double correct (double value, LIBENRICHMENT_TEST_TYPE test, double alpha, unsigned int numTests, unsigned int rank)
void check (vector< EnrichmentRecord< S, T > > &in, vector< EnrichmentRecord< S, T > > &out, LIBENRICHMENT_TEST_TYPE test, double alpha)
void getEnrichments (set< S > &in, vector< EnrichmentRecord< S, T > > &out)
void getEnrichmentsGSEA (set< S > &in, vector< EnrichmentRecord< S, T > > &out)
bool isPair (const S &s, const T &t) const
void loadPairs (const vector< pair< S, T > > &in)
void loadPairs (const map< S, set< T > > &in)
void FormSettings (string Db, string Host, string User, string Passwd)
 The default setting forms a connection to oncogroup.
vector< long > LocusIndex (vector< long > LL)
 Retrieve database ids from locus link ids.
void SetUniverse (vector< long > Uni)
 Set the full universe of functions and associated probe counts.
void SetQuery (vector< long > Qry)
 Set the probes associated with a single bicluster.
int SetBicluster (long clusId)
 Set the universe and query sets through a single database call.
void Enrich (double threshold)
 Calculate Enrichment.
map< long, vector< long > > getEnrich_p2f (void)
 Get an enrichment map from probes to functions.
map< long, vector< long > > getEnrich_f2p (void)
 Get an enrichment map from functions to probes.
map< long, double > getEnrich_f2val (void)
 Get a map from functions to associated p-values.
map< long, vector< long > > getUni_f2p (void)
 Get a map of the universal functions to associated probes.
map< long, vector< long > > getUni_p2f (void)
 Get a map of the universal probes to associated functions.
int InsertEnrich (void)
 Insert the enrichment results into the database.
void ClearResults (void)
 Clear the enrichment results from the last query.
void ClearUniverse (void)
 Clear all data.
void functionFile (string fileName)
 Produce function reference file.

Detailed Description

template<class S, class T>
class Enrichment< S, T >

Enrichment Class.

The Enrichment class is designed to read data from a MySQL database and calculate p-values for biclusters. Each experiment has a range of associated probes and functions. A bicluster associated with the experiment is defined with its own range of associated probes and functions. From this data, a p-value is calculated for each function associated with the bicluster. The results, which are sets of probe-to-function references with p-values, can either be returned as maps or inserted directly into the MySQL database.

NOTE: The MySQL database referenced in this program is oncogroup at whipple.cs.vt.edu. This database was generated and maintained by Greg Grothaus.


Constructor & Destructor Documentation

template<class S , class T >
Enrichment< S, T >::Enrichment ( )

Constructor


Member Function Documentation

template<class S , class T >
void Enrichment< S, T >::addUnannotated ( int  numS,
int  numT 
)

Adds unpaired elements of each type

Parameters:
numS  number of unpaired elements of type S
numT  number of unpaired elements of type T
template<class S , class T >
bool Enrichment< S, T >::check ( double  value,
LIBENRICHMENT_TEST_TYPE  test,
double  alpha,
unsigned int  numTests,
unsigned int  rank 
)

Given a value of statistical significance, checks if the value is significant by performing a correction for testing multiple hypotheses.

Parameters:
value  The uncorrected statistical significance.
test  The type of test to be used. Possible values are LIBENRICHMENT_NONE, LIBENRICHMENT_BONFERRONI, LIBENRICHMENT_HOLMS, and LIBENRICHMENT_FALSE_DISCOVERY_RATE.
alpha  The alpha cutoff specifying the probability of a Type I statistical error.
numTests  The number of hypotheses being tested.
rank  The index of this particular test in the list of all tested hypotheses sorted by uncorrected statistical significance.
Returns:
true if and only if the corrected value is at most alpha.
template<class S , class T >
void Enrichment< S, T >::check ( vector< EnrichmentRecord< S, T > > &  in,
vector< EnrichmentRecord< S, T > > &  out,
LIBENRICHMENT_TEST_TYPE  test,
double  alpha 
)

Given an EnrichmentRecord list of all hypotheses, performs a multiple hypothesis correction.

Parameters:
in  A vector of EnrichmentRecords sorted by increasing p-value (generally the output of getEnrichments).
test  The type of test to be used. Possible values are LIBENRICHMENT_NONE, LIBENRICHMENT_BONFERRONI, LIBENRICHMENT_HOLMS, and LIBENRICHMENT_FALSE_DISCOVERY_RATE.
[in] alpha  The alpha cutoff specifying the probability of a Type I statistical error.
[out] out  A subset of the input vector that passes the multiple hypothesis test.
Precondition:
Assumes that 'in' is already sorted. This is true if 'in' is the output of getEnrichments.
template<class S , class T >
void Enrichment< S, T >::clear ( )

Removes all stored data.

Warning:
If you want to use the same instance of Enrichment for different categories of type T (for instance, different Gene Ontology categories), you must invoke the clear() method when switching to a different category. Alternatively, use a separate instance of Enrichment for each category.
template<class S , class T >
void Enrichment< S, T >::ClearResults ( void  )

Clear the enrichment results from the last query.

This function removes all of the probes, functions, and enrichment values associated with the last query. This should be done before every query.

template<class S , class T >
void Enrichment< S, T >::ClearUniverse ( void  )

Clear all data.

This function will remove all of the data from this data structure. This should be done before a new set of experiment probes is defined.

template<class S , class T >
double Enrichment< S, T >::computeHyperGeometricProbability ( int  globTotal,
int  globTrue,
int  locTotal,
int  locTrue 
)

Calculates the hypergeometric statistic from summary values.

Parameters:
globTotal  The total number of elements in your global set of objects.
globTrue  The total number of elements in your global set of objects with property P.
locTotal  The total number of elements in your subset of objects.
locTrue  The total number of elements in your subset of objects with property P.
Returns:
The probability of seeing locTrue objects with property P out of locTotal if the locTotal objects were a random subset of the global set.
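
The hypergeometric calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names logChoose, hypergeomPoint and hypergeomUpperTail are hypothetical, and the reference text does not say whether computeHyperGeometricProbability returns the point probability P(X = locTrue) or the upper tail P(X >= locTrue), so both are shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: log of the binomial coefficient C(n, k),
// computed with lgamma so large counts do not overflow.
double logChoose(int n, int k) {
    return std::lgamma(n + 1.0) - std::lgamma(k + 1.0) - std::lgamma(n - k + 1.0);
}

// Point probability P(X = k): exactly k of the locTotal sampled
// elements have property P.
double hypergeomPoint(int globTotal, int globTrue, int locTotal, int k) {
    // Outcomes outside the support of the distribution have probability 0.
    if (k < 0 || k > globTrue || k > locTotal ||
        locTotal - k > globTotal - globTrue) {
        return 0.0;
    }
    return std::exp(logChoose(globTrue, k) +
                    logChoose(globTotal - globTrue, locTotal - k) -
                    logChoose(globTotal, locTotal));
}

// Upper tail P(X >= locTrue): the usual enrichment p-value.
double hypergeomUpperTail(int globTotal, int globTrue, int locTotal, int locTrue) {
    assert(globTotal >= 0 && globTrue >= 0 && globTrue <= globTotal);
    assert(locTotal >= 0 && locTotal <= globTotal);
    const int kMax = std::min(globTrue, locTotal);
    double p = 0.0;
    for (int k = std::max(locTrue, 0); k <= kMax; ++k) {
        p += hypergeomPoint(globTotal, globTrue, locTotal, k);
    }
    return std::min(p, 1.0);  // guard against rounding slightly above 1
}
```

For example, with 10 objects of which 4 have property P, and a subset of 3, the chance of exactly 2 hits is C(4,2)·C(6,1)/C(10,3) = 36/120 = 0.3, and the chance of 2 or more hits is 40/120.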
template<class S , class T >
double Enrichment< S, T >::correct ( double  value,
LIBENRICHMENT_TEST_TYPE  test,
double  alpha,
unsigned int  numTests,
unsigned int  rank 
)

Given a value of statistical significance, corrects the value to account for testing multiple hypotheses.

Parameters:
value  The uncorrected statistical significance.
test  The type of test to be used. Possible values are LIBENRICHMENT_NONE, LIBENRICHMENT_BONFERRONI, LIBENRICHMENT_HOLMS, and LIBENRICHMENT_FALSE_DISCOVERY_RATE.
alpha  The alpha cutoff specifying the probability of a Type I statistical error.
numTests  The number of hypotheses being tested.
rank  The index of this particular test in the list of all tested hypotheses sorted by uncorrected statistical significance.
Returns:
The corrected value.
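
To make the role of numTests and rank concrete, here is a hedged sketch of the textbook per-rank forms of the three named corrections. It is not taken from the library source: the enum Correction and the functions correctPValue and isSignificant are hypothetical stand-ins for LIBENRICHMENT_TEST_TYPE, correct and check, rank is assumed to be 1-based, and no monotonicity adjustment across ranks is applied.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for LIBENRICHMENT_TEST_TYPE.
enum class Correction { None, Bonferroni, Holm, FalseDiscoveryRate };

// Scales an uncorrected p-value according to the chosen correction.
// rank is assumed to be 1-based: rank 1 is the smallest p-value.
double correctPValue(double p, Correction test, unsigned numTests, unsigned rank) {
    assert(numTests > 0 && rank >= 1 && rank <= numTests);
    const double m = static_cast<double>(numTests);
    double factor = 1.0;
    switch (test) {
        case Correction::None:
            factor = 1.0;
            break;
        case Correction::Bonferroni:
            factor = m;                      // p * m
            break;
        case Correction::Holm:
            factor = m - rank + 1.0;         // p * (m - rank + 1)
            break;
        case Correction::FalseDiscoveryRate:
            factor = m / rank;               // Benjamini-Hochberg: p * m / rank
            break;
    }
    return std::min(1.0, p * factor);
}

// Mirrors the documented contract of check(): true if and only if
// the corrected value is at most alpha.
bool isSignificant(double p, Correction test, double alpha,
                   unsigned numTests, unsigned rank) {
    return correctPValue(p, test, numTests, rank) <= alpha;
}
```

With p = 0.01, 10 tests and rank 5, the false discovery rate form gives 0.02 (significant at alpha = 0.05) while Bonferroni gives 0.1 (not significant), which shows why the choice of test matters.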
template<class S , class T >
void Enrichment< S, T >::Enrich ( double  threshold)

Calculate Enrichment.

This function takes the data defined by SetUniverse and SetQuery and calculates the hypergeometric p-value for each function found to be associated with the query. All functions with p-values less than the threshold are stored.

Parameters:
threshold  The maximum p-value of interest. All p-values less than the threshold are stored.
template<class S , class T >
void Enrichment< S, T >::FormSettings ( string  Db,
string  Host,
string  User,
string  Passwd 
)

The default setting forms a connection to oncogroup.

This function allows the user to define a different database from the default setting for oncogroup at whipple.cs.vt.edu

Parameters:
Db  is the database name
Host  is the host server
User  is the account used to access the database
Passwd  is the associated password
template<class S , class T >
void Enrichment< S, T >::functionFile ( string  fileName)

Produce function reference file.

This function will produce a file which contains all of the functions upon which an enrichment calculation is made. The file contains four columns:

[database function ID] tab [Num of associated probes in the universe] tab [Num of associated probes in the bicluster] tab [Enrichment Score]

The file will also contain the total number of probes in the entire universe and the entire bicluster.

template<class S , class T >
map< long, vector< long > > Enrichment< S, T >::getEnrich_f2p ( void  )

Get an enrichment map from functions to probes.

This function will produce a map that associates functions with a vector of probes such that the function has p-value less than the threshold.

Returns:
A map from function database ids to a vector of probe database ids.
template<class S , class T >
map< long, double > Enrichment< S, T >::getEnrich_f2val ( void  )

Get a map from functions to associated p-values.

This function produces a map to associate each function with its associated p-value.

Returns:
A map from function database ids to the associated p-values.
template<class S , class T >
map< long, vector< long > > Enrichment< S, T >::getEnrich_p2f ( void  )

Get an enrichment map from probes to functions.

This function will give back a map that associates each probe with a vector of functions such that all the functions had p-value less than the threshold.

Returns:
The function returns a map from probes to function vectors. All the values are database ids.
template<class S , class T >
void Enrichment< S, T >::getEnrichments ( set< S > &  in,
vector< EnrichmentRecord< S, T > > &  out 
)

Calculates enrichments for all T objects given a set of S objects

Precondition:
Assumes that S/T object relationships have already been loaded through loadPairs and addUnannotated.
Parameters:
[in] in  A set of S objects for which to find enrichments of T objects.
[out] out  A vector of EnrichmentRecords sorted in increasing order of p-value.
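
As an illustration of how loaded pairs and a query set relate to the four summary values taken by computeHyperGeometricProbability, the sketch below tallies those counts for every T object. It is an assumption about the bookkeeping, not the library's code: the Counts struct and tallyCounts function are hypothetical, the universe is assumed to be the annotated S objects plus the unannotated count, and every member of the query set is assumed to count toward locTotal.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical record of the four summary values for one T object.
struct Counts {
    int globTotal;  // size of the universe of S objects
    int globTrue;   // S objects in the universe paired with this T
    int locTotal;   // size of the query set
    int locTrue;    // S objects in the query set paired with this T
};

// Tallies summary counts for each T from S -> {T} relationships.
template <class S, class T>
std::map<T, Counts> tallyCounts(const std::map<S, std::set<T>>& pairs,
                                int numUnannotatedS,
                                const std::set<S>& query) {
    assert(numUnannotatedS >= 0);
    const int globTotal = static_cast<int>(pairs.size()) + numUnannotatedS;
    const int locTotal = static_cast<int>(query.size());

    std::map<T, Counts> out;
    for (const auto& entry : pairs) {
        const bool inQuery = query.count(entry.first) > 0;
        for (const T& t : entry.second) {
            Counts& c = out[t];  // zero-initialized on first access
            c.globTotal = globTotal;
            c.locTotal = locTotal;
            ++c.globTrue;
            if (inQuery) {
                ++c.locTrue;
            }
        }
    }
    return out;
}
```

Each resulting Counts entry could then be passed to a hypergeometric routine and the records sorted by increasing p-value, which matches the documented ordering of the output vector.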
template<class S , class T >
map< long, vector< long > > Enrichment< S, T >::getUni_f2p ( void  )

Get a map of the universal functions to associated probes.

This function will return a map that associates every function in the universe to all of its associated probes

Returns:
A map from function database ids to a vector of associated database probe ids.
template<class S , class T >
map< long, vector< long > > Enrichment< S, T >::getUni_p2f ( void  )

Get a map of the universal probes to associated functions.

This function will return a map that associates every probe in the universe to all of its associated functions.

Returns:
A map from probe database ids to a vector of function database ids.
template<class S , class T >
int Enrichment< S, T >::InsertEnrich ( void  )

Insert the enrichment results into the database.

This function will take all of the enrichments defined by the Enrich function and insert the results into the table biclusterXfunction. The enrichment values will be inserted with the database ids for the function and the bicluster.

Returns:
The function will return 1 if the insertion was successful, 0 otherwise.
template<class S , class T >
bool Enrichment< S, T >::isPair ( const S &  s,
const T &  t 
) const

Checks whether a pair of objects is related in the set of stored pairs.

Parameters:
[in] s  An instance of S.
[in] t  An instance of T.
template<class S , class T >
void Enrichment< S, T >::loadPairs ( const vector< pair< S, T > > &  in)

Loads relationship data as a set of pairs

Parameters:
[in] in  A vector of pairs of objects S and T such that there is a relationship between S and T. For example, S is a gene or protein and T is a function.
template<class S , class T >
void Enrichment< S, T >::loadPairs ( const map< S, set< T > > &  in)

Loads relationship data as a map of sets.

Parameters:
[in] in  A map keyed by elements of type S, where each value is a set of elements of type T. For example, S is a gene or protein and T is a function.
template<class S , class T >
vector< long > Enrichment< S, T >::LocusIndex ( vector< long >  LL)

Retrieve database ids from locus link ids.

SetUniverse and SetQuery expect probe database ids as arguments, which may not always be convenient for the user. This function therefore produces a vector of database ids from a different kind of index, the locus link id.

Parameters:
LL  is a vector of long values which are the locus link index values for probes
Returns:
The resulting vector contains the corresponding probe database id for each locus link id. If any probe is not found, a -1 value will be placed in the corresponding vector position.
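
The positional -1 convention described above can be sketched as follows. This is only an illustration: the in-memory map stands in for the database lookup that LocusIndex performs, and the names lookupIds and dropMissing are hypothetical. Whether SetUniverse and SetQuery tolerate -1 entries is not stated in this reference, so filtering them out first is shown as a cautious assumption.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical lookup: maps each locus link id to a probe database id,
// writing -1 at the same position when no probe is found.
std::vector<long> lookupIds(const std::vector<long>& locusIds,
                            const std::map<long, long>& locusToDbId) {
    std::vector<long> out;
    out.reserve(locusIds.size());
    for (long ll : locusIds) {
        auto it = locusToDbId.find(ll);
        out.push_back(it == locusToDbId.end() ? -1L : it->second);
    }
    // The result stays aligned with the input, one entry per locus id.
    assert(out.size() == locusIds.size());
    return out;
}

// Hypothetical helper: removes the -1 placeholders, keeping valid ids
// in their original order.
std::vector<long> dropMissing(const std::vector<long>& ids) {
    std::vector<long> out;
    for (long id : ids) {
        if (id != -1) {
            out.push_back(id);
        }
    }
    return out;
}
```

Keeping the placeholders in the first step lets a caller report exactly which locus link ids failed to resolve before discarding them.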
template<class S , class T >
int Enrichment< S, T >::SetBicluster ( long  clusId)

Set the universe and query sets through a single database call.

This function will generate the full set of probes for the universe and the query by traversing the mysql database oncogroup based on a single reference to a defined Bicluster.

Parameters:
clusId  is the database index for a bicluster that has already been defined in the database
Returns:
An integer indicating the success of the function: 1 is success, 0 is error.
template<class S , class T >
void Enrichment< S, T >::SetQuery ( vector< long >  Qry)

Set the probes associated with a single bicluster.

This function will traverse all of the functions associated with the probes defined in the vector Qry. The associations will be used to calculate p-values.

Parameters:
Qry  is a vector of longs. The longs are database ids for the probes.
template<class S , class T >
void Enrichment< S, T >::SetUniverse ( vector< long >  Uni)

Set the full universe of functions and associated probe counts.

The function traverses all of the functional annotations associated with the probes in the vector Uni. In the process, a count of associated probes is kept for each function found. These values are required for further enrichment analysis.

Parameters:
Uni  is a vector of longs. The longs are database ids for probes.

The documentation for this class was generated from the following files: