Integrative Statistical Learning with Applications to Predicting Features of Diseases and Health.
Degree: PhD, Bioinformatics, 2011, University of Michigan
This dissertation develops methods of integrative statistical learning and applies them to studies of two human
diseases: respiratory infectious diseases and leukemia. It concerns integrating statistically
principled approaches to connect data with knowledge for improved understanding
of diseases. A wide spectrum of temporal and high-dimensional biological and medical
datasets was considered.
The first question studied in this thesis examined host responses to viral insult. In
a human challenge study, eight transcriptional response patterns were identified in hosts
experimentally challenged with influenza H3N2/Wisconsin viruses. These patterns are
highly correlated with and predictive of symptoms. A non-passive asymptomatic state was
revealed and associated with subclinical infections. The findings were validated and extended to three additional viral pathogens (influenza H1N1, rhinovirus, and RSV). Their
differences and similarities were compared. Statistical models were developed
for exposure detection and risk stratification. Experimental validations have been performed by collaborators at Duke University.
The second question studied in this thesis investigated the regulatory roles of Hoxa9
and Meis1 in hematopoiesis and leukemia. Methods were developed to characterize their
global in vivo binding patterns and to identify their functional cofactors and collaborators.
The combinatorial effects of these factors were modeled and related to specific epigenetic
signatures. A new biological model was proposed to explain their synergistic functions in
leukemic transformation. Experimental validations have been performed by members of the Hess laboratory.
Motivated by problems encountered in these studies, two algorithms were developed
to identify spatial and temporal patterns from high-throughput data. The first method determines
temporal relationships between gene pathways during disease progression. It performs spectral analysis on graph Laplacian-embedded significance measures of pathway activity. The second algorithm proposes probabilistic modeling of protein binding events. Based on information geometry theory, it applies hypothesis testing coupled with jackknife-bias correction to characterize protein-protein interactions. Experimental validations were shown for both algorithms.
In conclusion, this dissertation addressed issues in the design of statistical methods
to identify characteristic and predictive features of human diseases. It demonstrated the
effectiveness of integrating simple techniques in bioinformatics analysis. Several bioinformatics
tools were developed to facilitate the analysis of high-dimensional time-series datasets.
Advisors/Committee Members: Hero III, Alfred O. (committee member), Hess, Jay L. (committee member), Burns Jr., Daniel M. (committee member), Omenn, Gilbert S. (committee member), Shedden, Kerby A. (committee member).
Subjects/Keywords: Integrative Statistical Learning in High-dimensional Time-series Data; Host Transcriptional Responses to Respiratory Viral Pathogens; Role of Hoxa9 in Leukemic Transformation; Spectral Analysis of Temporal Pathway Activity Using Graph Laplacian; Information Geometric Analysis of Motif Profiles in ChIP-sequencing; Predictive Modeling and Classification in High-dimensional and Temporal Data; Biomedical Engineering; Genetics; Microbiology and Immunology; Pathology; Science (General); Statistics and Numeric Data; Health Sciences; Science
APA (6th Edition):
Huang, Y. (2011). Integrative Statistical Learning with Applications to Predicting Features of Diseases and Health. (Doctoral Dissertation). University of Michigan. Retrieved from http://hdl.handle.net/2027.42/84435
Chicago Manual of Style (16th Edition):
Huang, Yongsheng. “Integrative Statistical Learning with Applications to Predicting Features of Diseases and Health.” 2011. Doctoral Dissertation, University of Michigan. Accessed October 20, 2020.
MLA Handbook (7th Edition):
Huang, Yongsheng. “Integrative Statistical Learning with Applications to Predicting Features of Diseases and Health.” 2011. Web. 20 Oct 2020.
Vancouver:
Huang Y. Integrative Statistical Learning with Applications to Predicting Features of Diseases and Health. [Internet] [Doctoral dissertation]. University of Michigan; 2011. [cited 2020 Oct 20]. Available from: http://hdl.handle.net/2027.42/84435.
Council of Science Editors:
Huang Y. Integrative Statistical Learning with Applications to Predicting Features of Diseases and Health. [Doctoral Dissertation]. University of Michigan; 2011. Available from: http://hdl.handle.net/2027.42/84435