Biorithm  1.1
Reporter Class Reference

Public Member Functions

 Reporter (string outDir, string cmdLine)
void addCV (string function, MyGainCVResult result, MyNT threshold, string algorithm="Hopfield")
void addCV (string gene, string function, MyGainAnnotationType correctState, MyGainAnnotationType predictedState, MyNT correctStateConfidence, MyNT predictionConfidence, string algorithm="Hopfield")
 Add results of cross validation of a single gene/protein for a function. The method also updates the overall cross validation results.
void addPrediction (string gene, string function, MyNT prob, MyNT input, MyNT threshold, string algorithm="Hopfield")
void addPredictionCutBasedConfidence (string gene, string function, MyNT confidence, MyNT threshold, string algorithm="Hopfield")
void checkTruePathRuleForPredictions (string algorithm, MyAnnotations &annotations, const GeneOntology &go, ostream &ostr, map< string, set< string > > &tprViolations)
 Check whether the predictions follow the true path rule.
void comparePredictions (string algo1, string algo2, ostream &ostr)
 Compare the predictions for algo1 and algo2.
void computePredictionRanks (string algo, MyAnnotations &differentAnnotations, GeneOntology &go)
 For each prediction made by an algorithm, use the confidence of the prediction to compute its rank in the list of all predictions and in the list of predictions for that function.
void evaluatePredictions (string algo, MyAnnotations &currentAnnotations, MyAnnotations &newAnnotations, GeneOntology &go)
 Evaluate quality of predictions based on new functional annotations in newAnnotations that are not in currentAnnotations.
void evaluatePredictionsForROCCurvesUsingRanks (string algo, MyAnnotations &differentAnnotations, GeneOntology &go)
 Evaluate quality of predictions based on new functional annotations in newAnnotations that are not in currentAnnotations and generate ROC curves.
void getAlgorithms (set< string > &algorithms)
 Return the set of algorithms with stored predictions.
void getGenesWithPredictions (string algo, set< string > &genes)
void printComparisonPredictionEvaluationROCCurves (const set< string > &algorithms, string outputDir, MyAnnotations &annotations, const GeneOntology &go)
 Print ROC curves comparing the predictions for multiple algorithms.
void printComparisonPredictionEvaluationAUCScatterPlots (const set< string > &algorithms, string outputDir, MyAnnotations &annotations)
void printDetailedCVResults (ostream &dcvfstr, bool flush=1)
void printCVResults (ostream &cvfstr, bool flush=1, const BioFunction *functionToPrint=NULL)
 Print results of cross-validation, one function per line.
void printPredictions (ostream &predfstr, int numPredictionsToPrintPerFunction, bool flush=1, const BioFunction *functionToPrint=NULL)
 Print prediction results, one (gene, function, algorithm) triple per line.
void printPredictionEvaluationROCCurves (ostream &ostr)
 Print ROC curves that summarise the evaluation.
void printPredictionEvaluations (ostream &ostr, GeneOntology *go=NULL)
 Print results of evaluating predictions, one (gene, function, algorithm) triple per line.
void printPredictionEvaluationSummary (ostream &ostr)
 Print various statistics that summarise the evaluation.
void readGeneUniverse (string guFile, set< string > &universe)
 Read the universe of genes over which GAIN operated in a previous invocation.
void readPredictions (string predFile, set< string > *onlyFunctions=NULL, string convertFunction="")
void readDetailedCVResults (string predFile, const set< string > *onlyFunctions=NULL, string convertFunction="")
void setExperimentName (string dataset)
void printCurveDataFromCV (ostream &out, GeneOntology &go, set< BioFunction > functions=set< BioFunction >(), string extra="")
 Print data for ROC and precision-recall curves from cross validation results.
void printECWeightedCurveDataFromCV (ostream &out, GeneOntology &go, MyAnnotations &annotations, set< BioFunction > functions=set< BioFunction >(), string extra="")
 Print data for ROC and precision-recall curves from cross validation results, taking evidence code weights into account.
void clear ()

Member Function Documentation

void Reporter::addCV ( string  function,
MyGainCVResult  result,
MyNT  threshold,
string  algorithm = "Hopfield" 
)

Add cumulative results of cross validation for a function.

void Reporter::addCV ( string  gene,
string  function,
MyGainAnnotationType  correctState,
MyGainAnnotationType  predictedState,
MyNT  correctStateConfidence,
MyNT  predictionConfidence,
string  algorithm = "Hopfield" 
)

Add results of cross validation of a single gene/protein for a function. The method also updates the overall cross validation results.

Parameters:
[in] gene  The gene involved in the cross validation.
[in] function  The function involved in the cross validation.
[in] correctState  The correct annotation of gene with respect to function.
[in] predictedState  The predicted annotation of gene with respect to function.
[in] correctStateConfidence  A number (should be between 0 and 1) indicating the (a priori) confidence in correctState. Normally, this value is 1. For some algorithms, such as SinkSource with evidence codes, this value can depend on the weights assigned to the evidence codes.
[in] predictionConfidence  The confidence in the prediction.
[in] algorithm  The name of the algorithm being cross validated.
Warning:
If you invoke this method, you should invoke it for every cross validated gene/function pair. You should not invoke the other addCV() method.
void Reporter::addPredictionCutBasedConfidence ( string  gene,
string  function,
MyNT  confidence,
MyNT  threshold,
string  algorithm = "Hopfield" 
)

Add the confidence for a prediction where the confidence is computed using distance to a cut in the FLN.

void Reporter::checkTruePathRuleForPredictions ( string  algorithm,
MyAnnotations &  annotations,
const GeneOntology &  go,
ostream &  ostr,
map< string, set< string > > &  tprViolations 
)

Check whether the predictions follow the true path rule.

Parameters:
[in] algorithm, the algorithm whose predictions need checking.
[in] annotations, a reference to an instance of MyAnnotations.
[in] go, a reference to an instance of GeneOntology.
[in] ostr, an output stream to print results to.
[out] tprViolations, a map from strings to sets of strings that will store, for each function, the set of genes predicted not to have that function but predicted to have some child of that function.

For each (gene, function) prediction made by algorithm, and for each parent of the function, the method checks whether the gene is either predicted to have the parent or annotated with it.
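The check described above can be sketched as follows. This is a minimal illustration, not Reporter's implementation: the type aliases `ParentMap` and `GeneFunctions` and the function `findTprViolations` are hypothetical stand-ins for GeneOntology, MyAnnotations, and the stored predictions.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins (assumptions, not Biorithm types):
// ParentMap maps a function to its direct parents;
// GeneFunctions maps a gene to a set of functions.
using ParentMap = std::map<std::string, std::set<std::string>>;
using GeneFunctions = std::map<std::string, std::set<std::string>>;

// For every predicted (gene, function) pair, record the gene under each parent
// of the function that the gene is neither predicted to have nor annotated with.
std::map<std::string, std::set<std::string>> findTprViolations(
    const GeneFunctions &predicted,
    const GeneFunctions &annotated,
    const ParentMap &parents)
{
    std::map<std::string, std::set<std::string>> violations;
    for (const auto &entry : predicted) {
        const std::string &gene = entry.first;
        const std::set<std::string> &functions = entry.second;
        const auto annotatedIt = annotated.find(gene);
        for (const std::string &function : functions) {
            const auto parentsIt = parents.find(function);
            if (parentsIt == parents.end()) continue;  // no known parents
            for (const std::string &parent : parentsIt->second) {
                const bool isPredicted = functions.count(parent) > 0;
                const bool isAnnotated = annotatedIt != annotated.end() &&
                                         annotatedIt->second.count(parent) > 0;
                if (!isPredicted && !isAnnotated) {
                    violations[parent].insert(gene);
                }
            }
        }
    }
    return violations;
}
```

The returned map has the same shape as tprViolations: each key is a function, and each value is the set of genes predicted to have a child of that function but not the function itself.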

void Reporter::comparePredictions ( string  algo1,
string  algo2,
ostream &  ostr 
)

Compare the predictions for algo1 and algo2.

Parameters:
[in] algo1, the name of the first algorithm to compare.
[in] algo2, the name of the second algorithm to compare.

For each prediction made by algo1, the method checks if algo2 also made that prediction. If only algo1 made the prediction or if the confidence values are different, the method prints out the details of the predictions.
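A minimal sketch of this comparison, assuming each algorithm's predictions are held in a map from a (gene, function) pair to a confidence value. The names `PredictionKey`, `Predictions`, and `differingPredictions` are hypothetical, and the sketch collects the differing pairs rather than printing them.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical containers (assumptions, not Reporter's internal storage).
using PredictionKey = std::pair<std::string, std::string>;  // (gene, function)
using Predictions = std::map<PredictionKey, double>;        // key -> confidence

// Collect the predictions of algo1 that algo2 either did not make
// or made with a different confidence value (exact comparison).
std::vector<PredictionKey> differingPredictions(const Predictions &algo1,
                                                const Predictions &algo2)
{
    std::vector<PredictionKey> differences;
    for (const auto &entry : algo1) {
        const auto it = algo2.find(entry.first);
        if (it == algo2.end() || it->second != entry.second) {
            differences.push_back(entry.first);
        }
    }
    return differences;
}
```

The comparison is one-directional: calling `differingPredictions(b, a)` instead of `differingPredictions(a, b)` lists the predictions made by the second algorithm but not by the first.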

Note:
To print a list of predictions made by algo2 but not by algo1, invoke the method with the first two arguments swapped.
void Reporter::computePredictionRanks ( string  algo,
MyAnnotations &  differentAnnotations,
GeneOntology &  go 
)

For each prediction made by an algorithm, use the confidence of the prediction to compute its rank in the list of all predictions and in the list of predictions for that function.

Parameters:
[in] algo, the name of the algorithm to evaluate.
[in] differentAnnotations, the difference between the new annotations and the current/old annotations.
[in] go, a reference to an instance of GeneOntology. The method uses this variable to decide if a predicted function is not valid (e.g., it is now obsolete).

The method assumes that the calling context computed differentAnnotations after computing the transitive closure of the new and the current/old annotations.
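The two ranks can be illustrated with a short sketch. The `Prediction` record and `assignRanks` function are hypothetical, and the tie-breaking rule (input order is kept among equal confidences) is an assumption; the documentation does not say how Reporter treats ties or invalid functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record (assumption, not a Biorithm type).
struct Prediction {
    std::string gene;
    std::string function;
    double confidence;
    std::size_t overallRank;   // 1-based rank among all predictions
    std::size_t functionRank;  // 1-based rank among predictions for `function`
};

void assignRanks(std::vector<Prediction> &predictions)
{
    // Sort by decreasing confidence; stable_sort keeps input order among ties.
    std::stable_sort(predictions.begin(), predictions.end(),
                     [](const Prediction &a, const Prediction &b) {
                         return a.confidence > b.confidence;
                     });
    // Count how many predictions have been seen so far for each function.
    std::map<std::string, std::size_t> seenPerFunction;
    for (std::size_t i = 0; i < predictions.size(); ++i) {
        predictions[i].overallRank = i + 1;
        predictions[i].functionRank = ++seenPerFunction[predictions[i].function];
    }
}
```

After the call, the vector is ordered from most to least confident, and each element carries both its overall rank and its rank within its own function.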

void Reporter::evaluatePredictions ( string  algo,
MyAnnotations &  currentAnnotations,
MyAnnotations &  newAnnotations,
GeneOntology &  go 
)

Evaluate quality of predictions based on new functional annotations in newAnnotations that are not in currentAnnotations.

Parameters:
[in] algo, the name of the algorithm to evaluate.
[in] currentAnnotations, the "old" annotations that were the basis of the predictions.
[in] newAnnotations, the "new" annotations that are the basis of the evaluations.
[in] go, a reference to an instance of GeneOntology. The method uses this variable to decide if a predicted function is not valid (e.g., it is now obsolete).

The method performs the following computations:

(i) Find those annotations in newAnnotations that are not in currentAnnotations.

(ii) Restrict this difference to verifiable genes, i.e., genes for which the algorithm has predicted at least one function.

(iii) For each verifiable gene, find the most specific predictions (MSPs). For each such prediction, compute the closest function annotating the gene in the difference. Consider a prediction (and the gene) to be verified if the closest function is a descendant of the function in the prediction; to assist this calculation, when comparing functions, the method considers a descendant to be closer than an ancestor or a relative.

(iv) For verified MSPs, compute the distribution (histogram) of distances from the predicted function to the verifying function.

(v) For verified genes, compute the distribution (histogram) of the smallest distance from a predicted function to the verifying function; the minimum is over all verified MSPs for a gene.

(vi) For verified genes, compute the distribution of confidence values for verified MSPs and for unverified MSPs.

(vii) For unverified genes (i.e., genes without any verified predictions), compute the distribution of confidence values for all MSPs.
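Steps (i) and (ii) amount to a set difference restricted to genes with predictions. The sketch below illustrates only those two steps, using a hypothetical `GeneFunctions` alias in place of MyAnnotations; the distance computations in the later steps depend on the GeneOntology structure and are not shown.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for MyAnnotations and for the stored predictions:
// a map from a gene to a set of functions.
using GeneFunctions = std::map<std::string, std::set<std::string>>;

// Keep annotations that appear in newAnnotations but not in currentAnnotations,
// and only for verifiable genes (genes with at least one predicted function).
GeneFunctions verifiableDifference(const GeneFunctions &currentAnnotations,
                                   const GeneFunctions &newAnnotations,
                                   const GeneFunctions &predictions)
{
    GeneFunctions difference;
    for (const auto &entry : newAnnotations) {
        const std::string &gene = entry.first;
        const auto predIt = predictions.find(gene);
        if (predIt == predictions.end() || predIt->second.empty()) {
            continue;  // not a verifiable gene
        }
        const auto currentIt = currentAnnotations.find(gene);
        for (const std::string &function : entry.second) {
            const bool alreadyKnown = currentIt != currentAnnotations.end() &&
                                      currentIt->second.count(function) > 0;
            if (!alreadyKnown) {
                difference[gene].insert(function);
            }
        }
    }
    return difference;
}
```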

void Reporter::evaluatePredictionsForROCCurvesUsingRanks ( string  algo,
MyAnnotations &  differentAnnotations,
GeneOntology &  go 
)

Evaluate quality of predictions based on new functional annotations in newAnnotations that are not in currentAnnotations and generate ROC curves.

Parameters:
[in] algo, the name of the algorithm to evaluate.
[in] differentAnnotations, the difference between the new annotations and the current/old annotations.
[in] go, a reference to an instance of GeneOntology. The method uses this variable to decide if a predicted function is not valid (e.g., it is now obsolete).

The method assumes that the calling context computed differentAnnotations after computing the transitive closure of the new and the current/old annotations.
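As a rough illustration of how a ROC curve can be derived from ranked predictions, the sketch below walks down the ranking and emits one point per prediction. It is a generic construction under stated assumptions, not a description of Reporter's code: `rocFromRanks` and `areaUnderCurve` are hypothetical names, ties in confidence are ignored, and a prediction counts as a positive when it is verified by the new annotations.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// verifiedByRank[i] is true if the prediction at rank i + 1 is verified.
// Returns (false positive rate, true positive rate) points, starting at (0, 0).
std::vector<std::pair<double, double>> rocFromRanks(
    const std::vector<bool> &verifiedByRank)
{
    std::size_t positives = 0;
    for (bool verified : verifiedByRank) {
        if (verified) ++positives;
    }
    const std::size_t negatives = verifiedByRank.size() - positives;

    std::vector<std::pair<double, double>> curve;
    curve.push_back(std::make_pair(0.0, 0.0));
    if (positives == 0 || negatives == 0) {
        return curve;  // the rates are undefined without both classes
    }
    std::size_t truePositives = 0;
    std::size_t falsePositives = 0;
    for (bool verified : verifiedByRank) {
        if (verified) ++truePositives; else ++falsePositives;
        curve.push_back(std::make_pair(
            static_cast<double>(falsePositives) / negatives,
            static_cast<double>(truePositives) / positives));
    }
    return curve;
}

// Area under the curve by the trapezoidal rule.
double areaUnderCurve(const std::vector<std::pair<double, double>> &curve)
{
    double area = 0.0;
    for (std::size_t i = 1; i < curve.size(); ++i) {
        area += (curve[i].first - curve[i - 1].first) *
                (curve[i].second + curve[i - 1].second) / 2.0;
    }
    return area;
}
```

For the ranking verified, unverified, verified, unverified, the curve passes through (0, 0.5), (0.5, 0.5), (0.5, 1), and (1, 1), which gives an area of 0.75.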

void Reporter::getGenesWithPredictions ( string  algo,
set< string > &  genes 
)

Return all the genes that have a predicted function, where the prediction is made by the algorithm called algo.

void Reporter::printComparisonPredictionEvaluationROCCurves ( const set< string > &  algorithms,
string  outputDir,
MyAnnotations &  annotations,
const GeneOntology &  go 
)

Print ROC curves comparing the predictions for multiple algorithms.

Parameters:
[in] algorithms, a set of names of algorithms to compare.
[in] outputDir, the name of the directory to print plots to.

The method prints a single plot containing the ROC curves for all algorithms. For each function with at least one verifiable prediction, the method also prints a plot containing the ROC curves for the algorithms for that function.

void Reporter::printCurveDataFromCV ( ostream &  out,
GeneOntology &  go,
set< BioFunction >  functions = set< BioFunction >(),
string  extra = "" 
)

Print data for ROC and precision-recall curves from cross validation results.

Parameters:
[out] out  the stream to output to.
[in] go  an instance of GeneOntology.
[in] functions  a set of BioFunctions. If this set is not empty, then the method combines the results of all the functions in this set.
[in] extra  an extra string to print as part of the information output. This string goes into a comment printed before the cross-validation curve data. Scripts such as eval-gain.pl can use the contents of the string in their output files.

The method prints the following pieces of information for every function, or for the entire group in the parameter functions: confidence cutoff, desired recall (the method tries to print recall values at regular intervals), actual recall (a desired recall may not be achievable because many results have identical confidence values), precision, false positive rate, the number of true positives, the number of false positives, the number of true negatives, and the number of false negatives.
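The quantities listed above follow from the four counts at a given confidence cutoff. The sketch below shows the standard definitions; `CVResult`, `CurvePoint`, and `pointAtCutoff` are hypothetical names, and treating a result as a positive call when its confidence is at or above the cutoff is an assumption.

```cpp
#include <vector>

// Hypothetical records (assumptions, not Biorithm types).
struct CVResult {
    double confidence;  // confidence that the gene has the function
    bool isPositive;    // true if the gene is annotated with the function
};

struct CurvePoint {
    unsigned truePositives, falsePositives, trueNegatives, falseNegatives;
    double recall, precision, falsePositiveRate;
};

CurvePoint pointAtCutoff(const std::vector<CVResult> &results, double cutoff)
{
    CurvePoint p = {0, 0, 0, 0, 0.0, 0.0, 0.0};
    for (const CVResult &r : results) {
        const bool calledPositive = r.confidence >= cutoff;
        if (calledPositive && r.isPositive) ++p.truePositives;
        else if (calledPositive) ++p.falsePositives;
        else if (r.isPositive) ++p.falseNegatives;
        else ++p.trueNegatives;
    }
    const unsigned positives = p.truePositives + p.falseNegatives;
    const unsigned called = p.truePositives + p.falsePositives;
    const unsigned negatives = p.falsePositives + p.trueNegatives;
    // A rate stays at 0 when its denominator is 0 (a convention chosen here).
    if (positives > 0) p.recall = static_cast<double>(p.truePositives) / positives;
    if (called > 0) p.precision = static_cast<double>(p.truePositives) / called;
    if (negatives > 0) {
        p.falsePositiveRate = static_cast<double>(p.falsePositives) / negatives;
    }
    return p;
}
```

Because several results can share a confidence value, moving the cutoff across that value changes the counts in one jump, which is why a desired recall may not be reachable exactly.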

void Reporter::printECWeightedCurveDataFromCV ( ostream &  out,
GeneOntology &  go,
MyAnnotations &  annotations,
set< BioFunction >  functions = set< BioFunction >(),
string  extra = "" 
)

Print data for ROC and precision-recall curves from cross validation results, taking evidence code weights into account.

The parameters for this method are identical to those for Reporter::printCurveDataFromCV(). This method modifies the counts for the number of true positives, false positives, true negatives, and false negatives, based on weights associated with evidence codes. The method relies on the values of correctStateConfidence input to Reporter::addCV().

void Reporter::printPredictionEvaluations ( ostream &  ostr,
GeneOntology *  go = NULL 
)

Print results of evaluating predictions, one (gene, function, algorithm) triple per line.

On each line, the method prints a predicted function, the name of the prediction algorithm, and the closest function in the new annotations. The method prints this information only for genes that have a new annotation in the new set of annotations. If there are multiple closest functions, the method prints just one.

void Reporter::printPredictions ( ostream &  predfstr,
int  numPredictionsToPrintPerFunction,
bool  flush = 1,
const BioFunction *  functionToPrint = NULL 
)

Print prediction results, one (gene, function, algorithm) triple per line.

void Reporter::readDetailedCVResults ( string  predFile,
const set< string > *  onlyFunctions = NULL,
string  convertFunction = "" 
)

Read detailed cross-validation results from detailedCVFile.

Parameters:
[in] detailedCVFile, the name of the file to read results from.
[in] onlyFunctions, a pointer to a set of function ids.
[in] convertFunction, the name of a mathematical function to use to convert confidence values. Currently supported values are "oneminusexpminus" for $1 - \exp(-x)$.

If onlyFunctions is not NULL, the method will ignore detailed CV results for all functions not present in onlyFunctions.
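The conversion named by convertFunction can be sketched as below. The helper `convertConfidence` is hypothetical, and two behaviours are assumptions rather than documented facts: an empty string leaves the value unchanged, and any other unrecognised name is rejected.

```cpp
#include <cmath>
#include <stdexcept>
#include <string>

// Hypothetical helper; only the name "oneminusexpminus" and the default ""
// appear in the documentation.
double convertConfidence(double x, const std::string &convertFunction)
{
    if (convertFunction.empty()) {
        return x;  // assumed: the default "" means no conversion
    }
    if (convertFunction == "oneminusexpminus") {
        // 1 - exp(-x), written with expm1 for accuracy when x is near 0.
        return -std::expm1(-x);
    }
    throw std::invalid_argument("unsupported convertFunction: " + convertFunction);
}
```

For non-negative inputs the conversion maps confidence values into the interval [0, 1) and preserves their order, so rankings derived from the converted values are unchanged.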

void Reporter::readGeneUniverse ( string  guFile,
set< string > &  universe 
)

Read the universe of genes over which GAIN operated in a previous invocation.

Parameters:
[in]guFile,afile containing identifiers of the genes in the FLN used by GAIN in a previous invocation.

Using this method is useful for ensuring (for example) that eval-gain operates on precisely the same set of genes as gain.

void Reporter::readPredictions ( string  predFile,
set< string > *  onlyFunctions = NULL,
string  convertFunction = "" 
)

Read predictions from predFile.

Parameters:
[in] predFile, the name of the file to read predictions from.
[in] onlyFunctions, a pointer to a set of function ids.
[in] convertFunction, the name of a mathematical function to use to convert confidence values. Currently supported values are "oneminusexpminus" for $1 - \exp(-x)$.

If onlyFunctions is not NULL, the method will ignore predictions for all functions not present in onlyFunctions.

void Reporter::setExperimentName ( string  dataset) [inline]

Set the name of the dataset used to obtain these results.

Use this method to set the name of the dataset corresponding to these results. The name could correspond to a gene expression experiment, a method of constructing the FLN, an algorithm comparison, etc.

