Talks

2022

(09-30) Iowa State University, Department of Statistics, Student Poster

Title: Constructing Large Scale Gene Networks by Partial Correlation Graphs with Information Incorporation

Authors: Hao Wang, Yumou Qiu*, Hongqing Guo, Yanhai Yin, Peng Liu*

Abstract: Current research on gene networks focuses on developing rigorous statistical inference for high-dimensional data and on incorporating biological knowledge to improve network accuracy. Among approaches to network analysis, the Gaussian graphical model (GGM) is advantageous because it measures direct gene-gene interactions through partial correlations, which quantify conditional dependency. Unlike Pearson’s correlation, a partial correlation shows how two genes interact after controlling for potential confounding factors. We aim to make statistical inference for high-dimensional partial correlation graphs more informative: biological information can be included when available, while the error rate is still rigorously controlled.
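
As a small illustration of the difference between Pearson’s correlation and partial correlation, the sketch below computes partial correlations from the inverse covariance (precision) matrix using the standard identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj). The three-gene covariance matrix and the function names are hypothetical and chosen only for illustration; this is not the estimation procedure of the poster.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] is reduced to [I | A^-1].
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Partial correlation matrix obtained from the precision matrix of `cov`."""
    omega = invert(cov)  # precision matrix
    n = len(cov)
    return [[1.0 if i == j
             else -omega[i][j] / math.sqrt(omega[i][i] * omega[j][j])
             for j in range(n)] for i in range(n)]


# Hypothetical covariance of three genes in a chain A - B - C:
# A and C are related only through B.
cov = [[1.00, 0.50, 0.25],
       [0.50, 1.00, 0.50],
       [0.25, 0.50, 1.00]]

pearson_ac = cov[0][2] / math.sqrt(cov[0][0] * cov[2][2])
pcor = partial_correlations(cov)

print("Pearson correlation (A, C):", pearson_ac)
print("Partial correlation (A, C | B):", pcor[0][2])
```

In this toy example, genes A and C have a nonzero Pearson correlation, yet their partial correlation given B is zero up to rounding, so a partial correlation graph would not draw an edge between them.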

To measure the conditional dependency between two genes, we apply regularized node-wise regression to remove the effects of the other genes on those two genes. We propose a novel procedure, partial correlation graph with information incorporation (PCGII), that estimates a large partial correlation graph among genes by incorporating biological knowledge into the node-wise regression. The proposed procedure handles the ill-conditioned setting in which the number of genes is much greater than the sample size, and it infers the gene network with false discovery rate control. We show that biological knowledge improves the accuracy of the inferred graph of gene conditional dependencies. In simulation studies, we compare the proposed method with several existing approaches and demonstrate the advantages of PCGII in terms of better error rate control and higher power, even when the prior set includes false positives. Furthermore, we demonstrate the utility of the proposed procedure in a real data analysis: PCGII recovers a confirmed regulatory relationship and uniquely detects a gene that acts as a hub node. PCGII additionally identifies several direct interactions, which shed light on potential functional relationships in the system.
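
The idea of removing the effects of other genes by regression can be sketched in a low-dimensional setting. The example below is an assumption-laden simplification: it uses ordinary least squares (via Gram-Schmidt orthogonalization) in place of the regularized regression, it incorporates no prior biological information, and the simulated data and all function names are hypothetical. It is not the PCGII procedure itself, only the residual-correlation idea behind it.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(v):
    mean = sum(v) / len(v)
    return [x - mean for x in v]


def residualize(target, controls):
    """Residuals of `target` after least-squares regression on `controls` plus an intercept."""
    basis = []
    for column in controls:
        q = center(column)
        for b in basis:  # orthogonalize against the columns seen so far
            coef = dot(q, b) / dot(b, b)
            q = [x - coef * y for x, y in zip(q, b)]
        if dot(q, q) > 1e-12:  # skip (near-)collinear columns
            basis.append(q)
    resid = center(target)
    for b in basis:  # project out each orthogonal direction
        coef = dot(resid, b) / dot(b, b)
        resid = [x - coef * y for x, y in zip(resid, b)]
    return resid


def correlation(u, v):
    cu, cv = center(u), center(v)
    return dot(cu, cv) / math.sqrt(dot(cu, cu) * dot(cv, cv))


def residual_partial_correlation(data, i, j):
    """Correlate genes i and j after regressing both on all remaining genes."""
    others = [col for k, col in enumerate(data) if k not in (i, j)]
    return correlation(residualize(data[i], others), residualize(data[j], others))


# Simulated toy data: a shared regulator drives two genes that do not
# interact directly with each other.
rng = random.Random(0)
n = 2000
regulator = [rng.gauss(0, 1) for _ in range(n)]
gene_a = [r + rng.gauss(0, 1) for r in regulator]
gene_b = [r + rng.gauss(0, 1) for r in regulator]
data = [gene_a, gene_b, regulator]

marginal = correlation(gene_a, gene_b)
partial = residual_partial_correlation(data, 0, 1)
print("Pearson correlation:", round(marginal, 3))
print("Partial correlation given the regulator:", round(partial, 3))
```

With more genes than samples, the least-squares step is no longer well defined, which is why a regularized regression is needed in the high-dimensional setting described above.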

2019

(10-18) Genetics Group: Reconstruction of Gene Network by c-level Partial Correlation Graphs

Abstract: A key aim in systems biology is to understand the structural and functional processes of molecules in a living cell. With the development of high-throughput technologies, quantitative methods can be applied to large-scale ‘omics’ datasets. Because the relationships among molecules in a cell are intricate, network-based methods have become a popular approach to reconstructing gene-gene, gene-protein, and protein-protein interactions. Among network approaches, the Gaussian graphical model has advantages in reconstructing gene co-expression networks because it captures the direct association between genes through partial correlations. However, estimating partial correlations and making inference on them in the high-dimensional setting is very challenging. A method based on penalized partial correlations, called exact hypothesis testing for shrinkage based Gaussian graphical models (Shrunk MLE), is able to overcome the high-dimension problem; however, its statistical inference for such penalized partial correlations is not satisfactory. In this project, a novel network inference method, named c-level Partial Correlation Graph (c-level PCG), is applied to a gene expression dataset to model direct gene-gene associations. It handles the ill-conditioned setting in which the number of genes p exceeds the sample size n, and it infers the partial correlations with the false discovery rate controlled. According to our simulation studies, c-level PCG achieves much higher statistical power than Shrunk MLE while still controlling the false discovery rate.
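
To make the notion of false discovery rate control concrete, the sketch below implements the generic Benjamini-Hochberg step-up procedure on a list of edge-wise p-values. This is a standard textbook procedure used here only as an illustration; the abstract does not state which multiple testing procedure c-level PCG uses, and the p-values and function name are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking hypotheses rejected at FDR level `alpha`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= alpha * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    # Reject every hypothesis whose p-value rank is at most that cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, one per candidate edge in the network.
edge_p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                 0.060, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(edge_p_values, alpha=0.05)
print("Edges kept:", [i for i, keep in enumerate(selected) if keep])
```

Each rejected hypothesis corresponds to an edge retained in the inferred graph, so the procedure limits the expected proportion of falsely included edges among all edges reported.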