This page contains supplementary information to accompany the paper:

Connecting the Dots between PubMed Abstracts

M. Shahriar Hossain¹, Joseph Gresock¹, Yvette Edmonds², Richard Helm², Malcolm Potts³, Naren Ramakrishnan¹

¹ Dept. of Computer Science, Virginia Tech, Blacksburg, VA 24061, USA
² Dept. of Biochemistry, Virginia Tech, Blacksburg, VA 24061, USA
³ Dept. of Biological and Environmental Sciences, Qatar University, Qatar

| # of Stories | Case Study 1 | Case Study 2 | Case Study 3 |
|---|---|---|---|
| After A* search | 332,104 | 74,893 | 119,964 |
| After p- and q-value filtering | 17,266 | 6,486 | 11,739 |
| After context filtering | 611 | 424 | 202 |
| After sentence cohesion filtering | 159 | 135 | 48 |

The table above reports the number of stories remaining after each major stage of the storytelling pipeline. The A* search initially produced 332,104 stories for case study 1, 74,893 for case study 2, and 119,964 for case study 3. Filtering on statistical significance (p-value and q-value thresholds) reduced these to 17,266, 6,486, and 11,739 stories, respectively. For case study 1, 611 stories remained after context overlap filtering and 159 final stories after sentence cohesion filtering. For case study 2, 424 stories remained after context filtering and 135 final stories after sentence cohesion filtering. For case study 3, 202 stories remained after context filtering, and the final set contained 48 stories. The generated stories can be found at the following three links:

Case study 1        Case study 2       Case study 3 
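
As a rough illustration of how strongly each stage prunes the candidate set, the counts reported above can be converted into retention rates. The sketch below is not part of the storytelling pipeline itself; the names `STAGES`, `COUNTS`, `retention`, and `overall_retention` are hypothetical, and the numbers are simply copied from the statistics table.

```python
# Story counts per pipeline stage, copied from the statistics table.
# All names here are illustrative; this is not the authors' code.
STAGES = [
    "After A* search",
    "After p- and q-value filtering",
    "After context filtering",
    "After sentence cohesion filtering",
]
COUNTS = {
    "Case study 1": [332104, 17266, 611, 159],
    "Case study 2": [74893, 6486, 424, 135],
    "Case study 3": [119964, 11739, 202, 48],
}


def retention(counts):
    """Fraction of stories kept at each stage, relative to the previous stage."""
    return [after / before for before, after in zip(counts, counts[1:])]


def overall_retention(counts):
    """Fraction of the initial stories that survive every filter."""
    return counts[-1] / counts[0]


if __name__ == "__main__":
    for case, counts in COUNTS.items():
        print(case)
        for stage, kept in zip(STAGES[1:], retention(counts)):
            print(f"  {stage}: kept {kept:.2%} of the previous stage")
        print(f"  Overall: kept {overall_retention(counts):.4%} of the A* output")
```

Running the script prints, for each case study, the share of stories that survives each filter and the share that survives all of them.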


List of Expanded Molecules:

We expanded some of the molecule descriptions to include alternate usages and thereby improve coverage. The following table lists the molecules and the search terms we used to query the PubMed database:

| Case study | Molecule | Query |
|---|---|---|
| Case study 1 | IL-1 alpha | "IL-1+alpha+OR+IL-1alpha+OR+Interleukin-1+alpha+OR+Interleukin-1alpha" |
| | IL-1 beta | "IL-1+beta+OR+IL-1beta+OR+Interleukin-1+beta+OR+Interleukin-1beta" |
| | IL-6 | "IL-6+OR+Interleukin-6" |
| | IL-8 | "IL-8+OR+Interleukin-8" |
| | IL-13 | "IL-13+OR+Interleukin-13" |
| | IL-24 | "IL-24+OR+Interleukin-24" |
| | MMP | "matrix+metalloproteinase" |
| | CD38 | "CD38" |
| | MCP-2 | "MCP-2+OR+chemokine+ccl8+OR+chemokine" |
| | IFN-gamma | "IFN-gamma+OR+interferon+gamma+OR+IFNG+OR+IFN-G" |
| | Stanniocalcin | "stanniocalcin+OR+teleocalcin" |
| | serpinB2 | "serpinB2+OR+plasminogen+activator+inhibitor-2+OR+PAI-2" |
| | IGF-1 | "IGF-1+OR+IGF1+OR+insulin-like+growth+factor+1" |
| | MCP1 | "MCP1+OR+MCP-1+OR+monocyte+chemotactic+protein+1" |
| | Nicotinamide | "nicotinamide+OR+niacinamide+OR+niacin" |
| | MMP-3 | "MMP-3+OR+MMP3+OR+matrix+metallopeptidase+3" |
| | SFRP1 | "SFRP1+OR+SFRP-1+OR+secreted+frizzled-related+protein+1" |
| | MMP12 | "MMP12+OR+MMP-12+OR+matrix+metallopeptidase+12" |
| | CXCL1 | "CXCL1+OR+CXCL-1" |
| | Poly(ADP-ribose) | "poly-ADP-ribose+OR+pADP+ribose+OR+poly+adenosine+diphosphate+ribose" |
| Case study 2 | IL-1 | "IL-1+beta+OR+IL-1beta+OR+Interleukin-1+beta+OR+Interleukin-1beta" |
| | IL-8 | "IL-8+OR+Interleukin-8" |
| | TNF-alpha | "TNF-alpha+OR+tumour+necrosis+factor+alpha+OR+TNFalpha" |
| | IFN-gamma | "IFN-gamma+OR+interferon+gamma+OR+IFNG+OR+IFN-G" |
| | NF-KB | "NF-KB+OR+NF-kappa-B" |
| | CREB | "CREB+OR+Cyclic-AMP+response+element+binding+protein" |
| | JunD | "JunD+OR+jun+D+proto-oncogene" |
| | ATF2 | "ATF2+OR+activating+transcription+factor+2" |
| | ATF3 | "ATF3+OR+activating+transcription+factor+3" |
| | STAT1 | "STAT1+OR+signal+transducer+and+activator+of+transcription+1" |
| | STAT2 | "STAT2+OR+signal+transducer+and+activator+of+transcription+2+OR+stat2+transcription+factor" |
| | c-jun | "c-jun+OR+c-jun+transcription+factor+OR+jun+genes" |
| | c-fos | "c-fos+OR+proto-oncogene+proteins+c-fos+OR+fos+genes" |
| | BRCA1 | "BRCA1+OR+breast+cancer+1+OR+brca1+genes+OR+brca1+protein" |
| Case study 3 | Glutamine | "glutamine" |
| | PKM2 | "PK-M2+OR+PKM2+OR+pyruvate+kinase+muscle+isozyme+OR+PKM+OR+pyruvate+kinase+type+K" |
| | OIP3 | "OIP3" |
| | CTHBP | "cytosolic+thyroid+hormone-binding+protein+OR+CTHBP" |
| | Pyruvate kinase | "pyruvate+kinase" |
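
The query strings listed above follow a regular pattern: each synonym has its spaces replaced by `+`, and the synonyms are joined with `+OR+`. The sketch below shows one way such strings could be assembled and placed in a PubMed search URL. It is an illustration under stated assumptions, not the retrieval code used for the paper: the helper names `build_query` and `esearch_url` are hypothetical, the surrounding quotation marks in the table are assumed to be string delimiters rather than part of the query, and the use of the NCBI E-utilities ESearch endpoint (with its `db`, `term`, and `retmax` parameters) is an assumption about how the queries might be submitted.

```python
from urllib.parse import quote

# NCBI E-utilities ESearch endpoint. Assumption: the page does not say
# which interface was actually used to query PubMed.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(synonyms):
    """Join synonyms with OR, writing spaces as '+', as in the table."""
    return "+OR+".join(term.strip().replace(" ", "+") for term in synonyms)


def esearch_url(query, retmax=20):
    """Build (but do not send) an ESearch URL for a '+'-encoded query."""
    # Keep '+' as the space separator; percent-encode anything unsafe.
    term = quote(query, safe="+-")
    return f"{ESEARCH_URL}?db=pubmed&term={term}&retmax={retmax}"


if __name__ == "__main__":
    query = build_query(["IL-6", "Interleukin-6"])
    print(query)
    print(esearch_url(query))
```

The functions only construct strings and perform no network access, so they can be checked against the table entries before any request is sent.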