We propose a representation for gene expression data called conserved gene expression motifs, or xMOTIFS. A gene's expression level is conserved across a set of samples if the gene is expressed with the same abundance in all the samples. A conserved gene expression motif is a subset of genes that is simultaneously conserved across a subset of samples. We present a computational technique to discover large conserved gene expression motifs that cover all the samples and classes in the data. When applied to published data sets representing different cancers or disease outcomes, our algorithm constructs xMOTIFS that distinguish between the various classes.
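The definition of a conserved gene expression motif given above can be illustrated with a minimal Python sketch. It only checks the definition; it is not the discovery technique described in the paper. It assumes that expression levels have already been discretized into a small number of states, so that "the same abundance" means "the same state". The function names `conserved_genes` and `is_motif`, the gene labels, and the toy data are all hypothetical.

```python
from typing import Dict, List, Sequence


def conserved_genes(states: Dict[str, List[str]],
                    samples: Sequence[int]) -> Dict[str, str]:
    """Return the genes that have the same state in every given sample.

    `states` maps a gene name to its per-sample discretized state
    (an assumption: the abstract does not say how abundance is compared).
    The result maps each conserved gene to its shared state.
    """
    conserved = {}
    for gene, row in states.items():
        values = {row[s] for s in samples}
        # Exactly one distinct state means the gene is conserved.
        if len(values) == 1:
            conserved[gene] = values.pop()
    return conserved


def is_motif(states: Dict[str, List[str]],
             genes: Sequence[str],
             samples: Sequence[int]) -> bool:
    """Check that every gene in `genes` is conserved across `samples`."""
    conserved = conserved_genes(states, samples)
    return all(g in conserved for g in genes)


# Hypothetical toy data: 4 genes measured in 5 samples.
toy_states = {
    "g1": ["up", "up", "up", "down", "down"],
    "g2": ["down", "down", "down", "down", "up"],
    "g3": ["up", "down", "up", "up", "down"],
    "g4": ["up", "up", "up", "up", "up"],
}

subset = [0, 1, 2]
print(conserved_genes(toy_states, subset))
print(is_motif(toy_states, ["g1", "g2"], subset))
print(is_motif(toy_states, ["g1", "g3"], subset))
```

In this toy example the genes g1, g2, and g4 are each conserved across samples 0 to 2, so any subset of them forms a motif on those samples, while g3 changes state and does not belong to one.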
You can download the version of the paper that appeared in the Pacific Symposium on Biocomputing 2003.