The Human Genome Project (HGP) was a 13-year effort with the goal of generating the first sequence of the human genome. In 2003, the HGP produced a genome sequence that accounted for over 90% of the human genome and was considered as close to complete as the DNA sequencing technologies of the time allowed. The HGP unlocked the door to a vast but unannotated collection of genes.
In the following decades, researchers painstakingly curated reannotations from experimental studies in the form of biochemical reaction graphs. Though gene set enrichment analysis considers groups within these annotation graphs, it disregards the dependencies between groups. John A. Burns School of Medicine Postdoctoral Researcher Joshua G. Burkhart, along with co-authors in JABSOM’s Bioinformatics Core, including supervisor and director Dr. Youping Deng, is making advancements in biochemical reaction network analysis. Their work is now on display in Patterns, a top data science journal from Cell Press that focuses on groundbreaking, original research.
Dr. Burkhart and the Bioinformatics Core generated a Graph Neural Network based on the Reactome reaction network. The Graph Neural Network can find associations, supported by prior literature, that traditional differential gene expression and hypergeometric enrichment analyses cannot.
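For readers unfamiliar with the baseline mentioned above, a hypergeometric enrichment analysis asks how surprising it is that a list of differentially expressed genes overlaps a curated gene set as much as it does. The following is a minimal sketch of that standard test using only the Python standard library. It is not taken from the published work, and the function name, parameter names, and example numbers are illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           drawn: int, overlap: int) -> float:
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    population: total number of genes considered (the background)
    annotated:  genes in the background that belong to the gene set
    drawn:      genes in the list under test (e.g. differentially expressed)
    overlap:    genes in the list that also belong to the gene set
    """
    if not (0 <= annotated <= population and 0 <= drawn <= population):
        raise ValueError("annotated and drawn must lie in [0, population]")

    start = max(overlap, 0)              # negative overlaps are impossible
    max_overlap = min(annotated, drawn)  # largest achievable overlap
    total = comb(population, drawn)      # all ways to draw the gene list

    # Count the draws that share at least `overlap` genes with the set.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(start, max_overlap + 1)
    )
    return tail / total


# Illustrative numbers only: 10 background genes, 5 in the set,
# 5 in the test list, all 5 overlapping.
p_value = hypergeom_enrichment_p(population=10, annotated=5, drawn=5, overlap=5)
```

Note that this test treats each gene set in isolation. It has no notion of how one reaction feeds into another, which is the dependency information a graph-based model can use.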
The JABSOM researchers show that integrating the curated relationships from this graph with gene expression values from other studies makes it possible to identify biochemical reactions associated with tissue-specific diseases.
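To give a rough sense of what combining graph relationships with expression values can look like, here is a toy sketch of one round of mean-aggregation message passing, a basic building block of many graph neural networks. It is not the authors' architecture: it has no learned weights, and the reaction identifiers, edge list, and expression values are hypothetical.

```python
def propagate(features: dict[str, float],
              edges: list[tuple[str, str]]) -> dict[str, float]:
    """One round of mean-aggregation message passing.

    Each node's new value is the mean of its own value and the values
    of the nodes with an edge pointing into it.
    """
    incoming: dict[str, list[float]] = {node: [] for node in features}
    for src, dst in edges:
        incoming[dst].append(features[src])

    updated = {}
    for node, value in features.items():
        messages = incoming[node] + [value]  # include the node itself
        updated[node] = sum(messages) / len(messages)
    return updated


# Hypothetical three-reaction chain: R1 precedes R2, which precedes R3.
edges = [("R1", "R2"), ("R2", "R3")]
# Hypothetical per-reaction expression summaries.
expression = {"R1": 4.0, "R2": 1.0, "R3": 1.0}

smoothed = propagate(expression, edges)
```

In this toy example, the value at R2 is pulled toward that of its upstream neighbor R1, so a signal at one reaction influences the representation of the reactions that depend on it.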
Dr. Burkhart found that the model performs comparably to conventional deep learning and better than random-network and other negative controls. The qualitative benefit of the approach is that additional gene expression datasets may be used to re-tune the final classification layer, and extracted features may be located in the Reactome reaction network, revealing specific biochemical reactions and subnetworks.
In the future, similar approaches could enable fruitful reanalyses of prior work, highlighting influential relationships and pinpointing biochemical reactions. As more experimental research databases become available, Dr. Burkhart and the Bioinformatics Core envision that extensions of their work may predict the effects of rare or indistinct genetic variations and guide precision medicine.
To read the published research in Patterns, click here.