Pingzhao Hu

Department of Biochemistry and Medical Genetics, University of Manitoba

“Machine Learning Approaches for Predicting Protein Functions and Disease Outcomes Using Omics Data”

Date: Thursday, November 20, 2014

It is well established that early detection of diseases such as cancer is key to successful treatment. Traditionally, classifiers based on clinical information alone have had low prediction accuracy, and molecular classifiers are expected to be among the most promising tools for improving it. To build them, it is vital to identify protein and gene biomarkers that are both clinically significant and relevant to biological function. However, although many genomes have been sequenced, almost half of the genes in these genomes have no functional annotation.

To address these issues, we developed network-based machine learning frameworks. For predicting the disease status of a sample, we proposed identifying subnetwork-based biomarkers from a co-expression network and developed a modular linear discriminant analysis approach that integrates the ‘essential’ correlation structure among genes into the predictor, rather than either considering all types of correlations (e.g., strong, weak and noise correlations) or ignoring correlations altogether. As a result, the correlated gene clusters that are related to the diagnostic classes of interest can be given a potential functional interpretation. For predicting protein functions, we devised an iterative relaxation labeling procedure that finds the maximally likely labeling of a protein network. In contrast to traditional methods, which treat Gene Ontology (GO) terms as a flat structure, we addressed the multi-label, multi-class classification of protein functions by taking into account the inter-correlation of GO terms within a hierarchical structure.
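As a rough illustration of what iterative relaxation labeling on a protein network can look like, the Python sketch below uses the classic multiplicative update: each unannotated protein's label probabilities are reweighted by the support they receive from neighbouring proteins. It is not the speaker's implementation. The function names, the toy network, the edge weights and the `compat` table (a hypothetical stand-in for how correlation between GO terms might enter) are all assumptions, and the sketch assigns one label per protein rather than handling the multi-label case.

```python
def build_adjacency(weighted_edges):
    """Turn (u, v, weight) triples into a symmetric adjacency dict."""
    adj = {}
    for u, v, w in weighted_edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def relaxation_labeling(adj, known, labels, compat=None, n_iter=50):
    """Propagate labels from annotated to unannotated nodes.

    adj    : dict node -> {neighbour: non-negative weight}
    known  : dict node -> label (annotated proteins, kept fixed)
    compat : dict (label_a, label_b) -> compatibility in [0, 1];
             defaults to 1 for identical labels and 0 otherwise.
    """
    if compat is None:
        compat = {(a, b): 1.0 if a == b else 0.0
                  for a in labels for b in labels}

    # Annotated nodes start (and stay) certain; others start uniform.
    probs = {}
    for node in adj:
        if node in known:
            probs[node] = {l: 1.0 if l == known[node] else 0.0 for l in labels}
        else:
            probs[node] = {l: 1.0 / len(labels) for l in labels}

    for _ in range(n_iter):
        updated = {}
        for node, nbrs in adj.items():
            total_w = sum(nbrs.values())
            if node in known or total_w == 0:
                updated[node] = probs[node]
                continue
            # Support for each label: weighted average of how compatible
            # the neighbours' current label beliefs are with it.
            support = {
                l: sum(w * sum(compat[(l, m)] * probs[nb][m] for m in labels)
                       for nb, w in nbrs.items()) / total_w
                for l in labels
            }
            # Multiplicative update followed by renormalisation.
            unnorm = {l: probs[node][l] * (1.0 + support[l]) for l in labels}
            z = sum(unnorm.values())
            updated[node] = {l: v / z for l, v in unnorm.items()}
        probs = updated
    return probs


# Toy chain A - B - C - D with annotated end points (made-up data).
labels = ["kinase", "transporter"]
adj = build_adjacency([("A", "B", 1.0), ("B", "C", 0.2), ("C", "D", 1.0)])
probs = relaxation_labeling(adj, {"A": "kinase", "D": "transporter"}, labels)
predicted = {node: max(p, key=p.get) for node, p in probs.items()}
```

In this toy chain, protein B is strongly linked to the annotated kinase A and only weakly to C, so its belief drifts toward "kinase", while C drifts toward "transporter". Supplying a `compat` table with non-zero off-diagonal entries is one simple way that related terms could reinforce each other instead of being treated as a flat, independent label set.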
