Biorithm  1.1
Apriori Class Reference

A class that implements the Apriori algorithm for computing itemsets in a binary matrix. This class computes closed itemsets and the lattice connecting them.

#include <apriori.h>

Inherited by AprioriWithComplement.


Public Member Functions

 Apriori (string filename)
 Constructor that reads a binary matrix from a file. The file must be a tab-delimited matrix whose first row contains the column names. Every subsequent row must begin with its row name. The first token of the first row must be present but may have any value.
 Apriori (vector< vector< unsigned int > > &matrix, map< unsigned int, string > &majorNames, map< unsigned int, string > &minorNames)
virtual ~Apriori ()
 Destructor.
virtual bool checkIfClosed (Itemset &itemset) const
virtual void computeItemsetFromRows (vector< unsigned int > &rowIndices, Itemset &itemset)
 Compute an itemset from a set of rows.
virtual void computeItemsetFromRows (Itemset &itemset)
 Compute the columns in the itemset given the rows already stored in the itemset.
virtual void computeItemsets (const set< unsigned int > *rowsToAvoidInSeed=NULL, const set< unsigned int > *rowsToAvoidCompletely=NULL)
virtual void computeItemsetsLowRAM (const set< unsigned int > *rowsToAvoidInSeed=NULL, const set< unsigned int > *rowsToAvoidCompletely=NULL)
 This method is identical to Apriori::computeItemsets() except that it uses less memory by not storing column names with each computed itemset.
virtual AprioriLatticeEdgeType computeLatticeEdgeType (const Itemset &itm1, const Itemset &itm2) const
virtual void computeLattice ()
virtual void computeRandomDistribution (ostream &ostrm, unsigned int numTries, map< ItemsetRandomRowDistKeyType, MyHistogram > &rowDists, map< unsigned int, MyHistogram > &columnDists, MyHistogram &sizeDist)
bool containsColumn (Itemset &itemset, unsigned int index) const
bool containsRow (Itemset &itemset, unsigned int index) const
virtual void getItemsets (vector< Itemset > &itemsets, bool deleteItemsets=false)
virtual void getLattice (ItemsetLattice &lattice, bool deleteLattice=false)
virtual void getClosedLattice (ItemsetLattice &closedLattice, bool deleteLattice=false)
 Returns the transitive closure of the lattice connecting the closed itemsets in the data.
virtual unsigned int getNumItemsets () const
 Return the number of computed itemsets.
virtual unsigned int getNumLatticeEdges () const
 Return the number of edges in the lattice.
virtual void printItemsets (ostream &ostr) const
 Print itemsets to the output stream in "itemset" format.
virtual void printItemsets (string outputFile) const
virtual void printItemsetsGraph (ostream &ostr) const
 Print the bipartite graph induced by each itemset to the output stream.
virtual void printItemsetsGraph (string outputFile) const
virtual void printItemsetStatistics (ostream &ostr) const
 For each k, print the number of itemsets with k rows and with k columns.
void readFile (string filename)
 Reads a binary matrix into the invocant from a file.
virtual void setItemsets (const vector< Itemset > &isets)
 Store itemsets (computed by another method) internally.
virtual unsigned int setMinimums (unsigned int rows, unsigned int columns)

Protected Member Functions

virtual void _computeAllRows (vector< unsigned int > &allRows) const
virtual unsigned int _computeRandomRows (const vector< unsigned int > &allRows, vector< unsigned int > &randomRows) const
virtual bool _areRowsInSameDataset (unsigned int i, unsigned int j) const

Protected Attributes

unsigned int minrows
unsigned int mincols
bool transposed
vector< vector< unsigned int > > data
map< unsigned int, string > columnNames
map< unsigned int, string > rowNames
unsigned int _numDatasets
vector< unsigned int > _rowIndexToDatasetIndex
vector< Itemset > itemsets
bool _itemsetsComputed
ItemsetLattice _closedLattice
ItemsetLattice _reducedLattice
bool _latticeComputed

Detailed Description

A class that implements the Apriori algorithm for computing itemsets in a binary matrix. This class computes closed itemsets and the lattice connecting them.


Constructor & Destructor Documentation

Apriori::Apriori ( string  filename)

Constructor that reads a binary matrix from a file. The file must be a tab-delimited matrix whose first row contains the column names. Every subsequent row must begin with its row name. The first token of the first row must be present but may have any value.
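The file layout described above can be illustrated with a small standalone parser. This is only a sketch and not the Biorithm implementation: BinaryMatrix, splitTabs, and parseBinaryMatrix are hypothetical names, and the sketch assumes that every cell is the text "0" or "1" (any cell other than "1" is read as 0).

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the parsed file; not part of Biorithm.
struct BinaryMatrix {
    std::vector<std::string> columnNames;
    std::vector<std::string> rowNames;
    std::vector<std::vector<unsigned int>> data;  // data[row][column] is 0 or 1
};

// Split one line on tab characters.
static std::vector<std::string> splitTabs(const std::string &line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, '\t')) fields.push_back(field);
    return fields;
}

// Parse a header row followed by one named row per line.
BinaryMatrix parseBinaryMatrix(std::istream &in) {
    BinaryMatrix m;
    std::string line;
    if (!std::getline(in, line)) return m;  // empty input
    std::vector<std::string> header = splitTabs(line);
    if (header.empty()) return m;
    // header[0] is the placeholder token; the rest are column names.
    m.columnNames.assign(header.begin() + 1, header.end());
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // skip blank lines
        std::vector<std::string> fields = splitTabs(line);
        m.rowNames.push_back(fields[0]);  // first token is the row name
        std::vector<unsigned int> row;
        for (std::size_t c = 1; c < fields.size(); ++c)
            row.push_back(fields[c] == "1" ? 1u : 0u);
        m.data.push_back(row);
    }
    return m;
}
```

Reading from a std::istream rather than a file name keeps the sketch easy to exercise with an in-memory string.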


Member Function Documentation

bool Apriori::checkIfClosed ( Itemset &  itemset) const [virtual]

Return true if and only if the itemset is closed.
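The reference does not define closedness. Under the usual definition, an itemset is closed when its columns are exactly the columns shared by all of its rows and its rows are exactly the rows that contain all of its columns. The sketch below checks that condition on a plain 0/1 matrix; SimpleItemset, commonColumns, commonRows, and isClosed are hypothetical stand-ins, not Biorithm types or methods.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Matrix = std::vector<std::vector<unsigned int>>;  // matrix[row][column] is 0 or 1
using IndexSet = std::set<unsigned int>;

// Hypothetical stand-in for Itemset: row indices plus column indices.
struct SimpleItemset {
    IndexSet rows;
    IndexSet columns;
};

// Columns that hold a 1 in every row of `rows`.
IndexSet commonColumns(const Matrix &m, const IndexSet &rows) {
    IndexSet result;
    if (m.empty()) return result;
    for (std::size_t c = 0; c < m[0].size(); ++c) {
        bool all = true;
        for (unsigned int r : rows)
            if (m[r][c] == 0) { all = false; break; }
        if (all) result.insert(static_cast<unsigned int>(c));
    }
    return result;
}

// Rows that hold a 1 in every column of `columns`.
IndexSet commonRows(const Matrix &m, const IndexSet &columns) {
    IndexSet result;
    for (std::size_t r = 0; r < m.size(); ++r) {
        bool all = true;
        for (unsigned int c : columns)
            if (m[r][c] == 0) { all = false; break; }
        if (all) result.insert(static_cast<unsigned int>(r));
    }
    return result;
}

// Closed: neither the row set nor the column set can be extended.
bool isClosed(const Matrix &m, const SimpleItemset &s) {
    return commonColumns(m, s.rows) == s.columns &&
           commonRows(m, s.columns) == s.rows;
}
```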

void Apriori::computeItemsets ( const set< unsigned int > *  rowsToAvoidInSeed = NULL,
const set< unsigned int > *  rowsToAvoidCompletely = NULL 
) [virtual]

Compute closed itemsets in the data using the Apriori algorithm.

Parameters:
[in] rowsToAvoidInSeed, a pointer to a set of rows to avoid when computing single-row itemsets. The method starts by computing itemsets with single rows. If a row is a member of this set, the method will not use that row to compute a single-row itemset. The row may be used later in computing itemsets with more than one row.
[in] rowsToAvoidCompletely, a pointer to a set of rows to avoid when computing any itemset, including single-row itemsets.
Note:
Use the Apriori::getItemsets() method to retrieve the itemsets.

Reimplemented in AprioriWithComplement.
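The difference between the two parameters can be seen in a small standalone enumeration. The sketch below is not the Apriori procedure used by this class: it is a brute-force closure under column-set intersection, it ignores the minimum row and column thresholds, and every name in it (closedItemsets, columnsOfRow, supportingRows) is hypothetical. Rows in avoidInSeed never start an itemset but may still appear in one; rows in avoidCompletely are never used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <set>
#include <vector>

using Matrix = std::vector<std::vector<unsigned int>>;  // matrix[row][column] is 0 or 1
using IndexSet = std::set<unsigned int>;

// Columns holding a 1 in row r.
IndexSet columnsOfRow(const Matrix &m, unsigned int r) {
    IndexSet cols;
    for (std::size_t c = 0; c < m[r].size(); ++c)
        if (m[r][c] != 0) cols.insert(static_cast<unsigned int>(c));
    return cols;
}

// Rows outside `avoid` that hold a 1 in every column of `cols`.
IndexSet supportingRows(const Matrix &m, const IndexSet &cols, const IndexSet &avoid) {
    IndexSet rows;
    for (std::size_t i = 0; i < m.size(); ++i) {
        const unsigned int r = static_cast<unsigned int>(i);
        if (avoid.count(r)) continue;
        bool all = true;
        for (unsigned int c : cols)
            if (m[r][c] == 0) { all = false; break; }
        if (all) rows.insert(r);
    }
    return rows;
}

// Map from each closed column set to the rows that support it.
std::map<IndexSet, IndexSet> closedItemsets(const Matrix &m,
                                            const IndexSet &avoidInSeed,
                                            const IndexSet &avoidCompletely) {
    std::map<IndexSet, IndexSet> found;
    std::vector<IndexSet> frontier;
    // Seed level: one candidate per permitted single row.
    for (std::size_t i = 0; i < m.size(); ++i) {
        const unsigned int r = static_cast<unsigned int>(i);
        if (avoidInSeed.count(r) || avoidCompletely.count(r)) continue;
        IndexSet cols = columnsOfRow(m, r);
        if (cols.empty()) continue;
        if (found.emplace(cols, supportingRows(m, cols, avoidCompletely)).second)
            frontier.push_back(cols);
    }
    // Grow: intersect each new column set with every known one until nothing new appears.
    while (!frontier.empty()) {
        std::vector<IndexSet> next;
        for (const IndexSet &a : frontier) {
            std::vector<IndexSet> known;  // snapshot, because `found` grows in the loop
            for (const auto &entry : found) known.push_back(entry.first);
            for (const IndexSet &b : known) {
                IndexSet common;
                std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                                      std::inserter(common, common.begin()));
                if (common.empty()) continue;
                if (found.emplace(common, supportingRows(m, common, avoidCompletely)).second)
                    next.push_back(common);
            }
        }
        frontier.swap(next);
    }
    return found;
}
```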

void Apriori::computeItemsetsLowRAM ( const set< unsigned int > *  rowsToAvoidInSeed = NULL,
const set< unsigned int > *  rowsToAvoidCompletely = NULL 
) [virtual]

This method is identical to Apriori::computeItemsets() except that it uses less memory by not storing column names with each computed itemset.

Note:
You must invoke Apriori::finalizeItemsets() before processing the columns of the computed itemsets.

void Apriori::computeLattice ( ) [virtual]

Compute the lattice induced by subset relationships between the itemsets.

Note:
The method invokes Apriori::computeItemsets() if you have not already invoked it.

AprioriLatticeEdgeType Apriori::computeLatticeEdgeType ( const Itemset &  itm1,
const Itemset &  itm2 
) const [virtual]

Compute the type of the edge between two itemsets.

Parameters:
[in] itm1, an instance of Itemset.
[in] itm2, an instance of Itemset.
Returns:
LATTICE_EDGE if itm1's rows contain itm2's rows; LATTICE_NO_EDGE if there is no relationship between the two itemsets.

Reimplemented in AprioriWithComplement.
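The row-containment test behind the return value can be sketched with std::includes. EdgeType and edgeType are hypothetical stand-ins for AprioriLatticeEdgeType and the method itself. The sketch reports an edge when the two row sets are identical; the reference does not say how the class treats that case.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

// Hypothetical mirror of the two documented edge values.
enum class EdgeType { NoEdge, Edge };

// Edge if every row in rows2 also appears in rows1.
// std::includes needs sorted ranges, which std::set guarantees.
EdgeType edgeType(const std::set<unsigned int> &rows1,
                  const std::set<unsigned int> &rows2) {
    const bool contains = std::includes(rows1.begin(), rows1.end(),
                                        rows2.begin(), rows2.end());
    return contains ? EdgeType::Edge : EdgeType::NoEdge;
}
```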

void Apriori::computeRandomDistribution ( ostream &  ostrm,
unsigned int  numTries,
map< ItemsetRandomRowDistKeyType, MyHistogram > &  rowDists,
map< unsigned int, MyHistogram > &  columnDists,
MyHistogram &  sizeDist 
) [virtual]

Compute a distribution of itemset sizes by picking subsets of rows uniformly at random.
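The idea of sampling row subsets and recording the size of the resulting itemset can be sketched as follows. This is an illustration under assumptions, not the method's actual procedure: the reference does not say how subset sizes are chosen or what the histograms are keyed on, so the sketch fixes the sample size and counts only the number of shared columns. countCommonColumns and sampleColumnCounts are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <random>
#include <vector>

using Matrix = std::vector<std::vector<unsigned int>>;  // matrix[row][column] is 0 or 1

// Number of columns that hold a 1 in every sampled row.
std::size_t countCommonColumns(const Matrix &m, const std::vector<unsigned int> &rows) {
    if (m.empty()) return 0;
    std::size_t count = 0;
    for (std::size_t c = 0; c < m[0].size(); ++c) {
        bool all = true;
        for (unsigned int r : rows)
            if (m[r][c] == 0) { all = false; break; }
        if (all) ++count;
    }
    return count;
}

// Histogram: number of shared columns -> how many random samples produced it.
std::map<std::size_t, unsigned int> sampleColumnCounts(const Matrix &m,
                                                       std::size_t sampleSize,
                                                       unsigned int numTries,
                                                       std::mt19937 &rng) {
    assert(sampleSize <= m.size());
    std::vector<unsigned int> allRows(m.size());
    std::iota(allRows.begin(), allRows.end(), 0u);
    std::map<std::size_t, unsigned int> histogram;
    for (unsigned int t = 0; t < numTries; ++t) {
        // A shuffled prefix is a uniformly random subset of the requested size.
        std::shuffle(allRows.begin(), allRows.end(), rng);
        std::vector<unsigned int> sample(
            allRows.begin(), allRows.begin() + static_cast<std::ptrdiff_t>(sampleSize));
        ++histogram[countCommonColumns(m, sample)];
    }
    return histogram;
}
```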

bool Apriori::containsColumn ( Itemset &  itemset,
unsigned int  index 
) const

Return true if and only if the column index should be part of itemset.

The method checks if the column index has a 1 in all the rows currently in itemset.

bool Apriori::containsRow ( Itemset &  itemset,
unsigned int  index 
) const

Return true if and only if the row index should be part of itemset.

The method checks if the row index has a 1 in all the columns currently in itemset.

void Apriori::getClosedLattice ( ItemsetLattice &  closedLattice,
bool  deleteLattice = false 
) [virtual]

Returns the transitive closure of the lattice connecting the closed itemsets in the data.

Parameters:
[in] deleteLattice, a boolean; if true, delete the closed lattice stored internally.
Note:
The method invokes Apriori::computeLattice() if you have not already invoked it.
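
For readers unfamiliar with the term, the transitive closure of a lattice adds an edge between two itemsets whenever one can be reached from the other through a chain of edges. A minimal sketch using Warshall's algorithm on a boolean adjacency matrix follows; the Adjacency type and transitiveClosure function are hypothetical, and the internal representation of ItemsetLattice may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// reach[i][j] is true when there is an edge from node i to node j.
using Adjacency = std::vector<std::vector<bool>>;

// Warshall's algorithm: allow paths through node k, for each k in turn.
Adjacency transitiveClosure(Adjacency reach) {
    const std::size_t n = reach.size();
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < n; ++i)
            if (reach[i][k])
                for (std::size_t j = 0; j < n; ++j)
                    if (reach[k][j]) reach[i][j] = true;
    return reach;
}
```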

void Apriori::getItemsets ( vector< Itemset > &  itemsets,
bool  deleteItemsets = false 
) [virtual]

Returns a vector of closed itemsets in the data.

Parameters:
[out] itemsets, a vector of Itemset to return the closed itemsets in.
[in] deleteItemsets, a boolean; if true, delete the itemsets stored internally.
Note:
The method invokes Apriori::computeItemsets() if you have not already invoked it.

void Apriori::getLattice ( ItemsetLattice &  lattice,
bool  deleteLattice = false 
) [virtual]

Returns the lattice connecting the closed itemsets in the data.

Parameters:
[out] lattice, an instance of ItemsetLattice to return the lattice in.
[in] deleteLattice, a boolean; if true, delete the lattice stored internally.
Note:
The method invokes Apriori::computeLattice() if you have not already invoked it.

void Apriori::printItemsets ( string  outputFile) const [virtual]

Print itemsets to the output file in "itemset" format.

Each itemset appears on one line containing the name of the itemset, the names of the rows in the itemset, and the names of the columns in the itemset, all separated by tabs. The name of the itemset is the string "itemset_<index>_<number of rows>_<number of columns>".
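The line layout described above can be sketched with a small formatter. formatItemsetLine is a hypothetical helper, not a Biorithm function, and the reference does not state whether the index starts at 0 or 1, so the caller supplies it.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Build one "itemset" format line: name, row names, column names, tab-separated.
std::string formatItemsetLine(unsigned int index,
                              const std::vector<std::string> &rowNames,
                              const std::vector<std::string> &columnNames) {
    std::ostringstream out;
    // Name pattern: itemset_<index>_<number of rows>_<number of columns>
    out << "itemset_" << index << '_' << rowNames.size() << '_' << columnNames.size();
    for (const std::string &name : rowNames) out << '\t' << name;
    for (const std::string &name : columnNames) out << '\t' << name;
    return out.str();
}
```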

void Apriori::printItemsetsGraph ( string  outputFile) const [virtual]

Print the bipartite graph induced by each itemset to the output file.

Each line of the file contains the itemset name, a row name, and a column name, all separated by tabs.
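A sketch of the graph output follows, assuming one line per row-column pair of the itemset, since every such pair is an edge of the induced bipartite graph. formatItemsetGraph is a hypothetical helper, and the order in which the class emits the pairs is not documented.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One line per (row, column) edge: itemset name, row name, column name.
std::vector<std::string> formatItemsetGraph(const std::string &itemsetName,
                                            const std::vector<std::string> &rowNames,
                                            const std::vector<std::string> &columnNames) {
    std::vector<std::string> lines;
    for (const std::string &row : rowNames)
        for (const std::string &column : columnNames)
            lines.push_back(itemsetName + '\t' + row + '\t' + column);
    return lines;
}
```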

void Apriori::setItemsets ( const vector< Itemset > &  isets) [virtual]

Store itemsets (computed by another method) internally.

Parameters:
[in] isets, a vector of Itemset instances to be stored.

Use this method when you have a set of itemsets that you would like to process further using methods in the Apriori class, e.g., Apriori::computeLattice(). A typical use is to compute itemsets using this class, prune them with an external method, and then compute the lattice over the pruned itemsets.

Warning:
Using this method will discard any itemsets previously stored in the class.
