
You searched for +publisher:"University of Miami" +contributor:("J. Sunil Rao"). Showing records 1 – 2 of 2 total matches.

1. Liu, Hongmei. Some New Theories and Applications on L1 Shrinkage Estimation.

Degree: PhD, Biostatistics (Medicine), 2017, University of Miami

In this thesis, I develop some new variable selection and statistical modeling techniques in the framework of L1 shrinkage estimation, with applications to high dimensional genomic and pharmacogenomic datasets. In the first part of the thesis, I revisit the problem of variable selection in linear regression models. While numerous variable selection procedures have been developed, their finite sample performance can often be less than satisfactory. I develop a new strategy for variable selection in the adaptive least absolute shrinkage and selection operator (Lasso) and adaptive elastic-net estimations with p_n diverging. The basic idea first involves using the trace paths of their LARS solutions to bootstrap estimates of maximum frequency (MF) models conditioned on model dimension. Conditioning on dimension effectively mitigates overfitting; to deal with underfitting, these MF models are then prediction-weighted. I show that the new method is not only model selection consistent, but also has an attractive convergence rate, which leads to outstanding finite sample performance. In the second part, I propose a new statistical model to re-explore the Genomics of Drug Sensitivity in Cancer (GDSC) study (Garnett et al., 2012). To link drug sensitivity with genomic profiles, the study screened 639 human tumor cell lines with 130 cancer drugs ranging from known chemotherapeutic agents to experimental compounds. However, statistical challenges remain in analyzing this dataset: i) biomarkers cluster among the cell lines; ii) clusters can overlap (e.g. a cell line may belong to multiple clusters); iii) drugs should be modeled jointly. I introduce a new multivariate regression model with a latent overlapping cluster indicator variable to address these issues. I then propose the generalized mixture of multivariate regression (GMMR) models and connect them to the new model. I develop a new EM algorithm for numerical computations in the GMMR model. The proposed new model can answer specific questions in the GDSC data: i) can cancer-specific therapeutic biomarkers be detected, and ii) can drug resistance patterns be identified along with predictive strategies to circumvent resistance using alternate drugs? In the third part of the thesis, I set out to tackle another challenging problem related to GDSC data: that of validating models built on one dataset but tested on similar datasets generated in other laboratories. The Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE) are two major resources that can be used to mine for therapeutic biomarkers for a large variety of cancers. Recent studies found that while the genomic profiling seems consistent, the drug response data is not. As a result, both predictions and signatures do not validate well for models built on one dataset and tested on the other. I present a partitioning strategy based on a data sharing concept, which directly estimates the amount of discordance between datasets and in doing so, also allows for extraction of… Advisors/Committee Members: J. Sunil Rao, Hemant Ishwaran, Lily Wang, Nagi Ayad.

Subjects/Keywords: L1 shrinkage estimation; variable selection; overlapping clustering; model validation

APA (6th Edition):

Liu, H. (2017). Some New Theories and Applications on L1 Shrinkage Estimation. (Doctoral Dissertation). University of Miami. Retrieved from https://scholarlyrepository.miami.edu/oa_dissertations/1954

Chicago Manual of Style (16th Edition):

Liu, Hongmei. “Some New Theories and Applications on L1 Shrinkage Estimation.” 2017. Doctoral Dissertation, University of Miami. Accessed January 28, 2020. https://scholarlyrepository.miami.edu/oa_dissertations/1954.

MLA Handbook (7th Edition):

Liu, Hongmei. “Some New Theories and Applications on L1 Shrinkage Estimation.” 2017. Web. 28 Jan 2020.

Vancouver:

Liu H. Some New Theories and Applications on L1 Shrinkage Estimation. [Internet] [Doctoral dissertation]. University of Miami; 2017. [cited 2020 Jan 28]. Available from: https://scholarlyrepository.miami.edu/oa_dissertations/1954.

Council of Science Editors:

Liu H. Some New Theories and Applications on L1 Shrinkage Estimation. [Doctoral Dissertation]. University of Miami; 2017. Available from: https://scholarlyrepository.miami.edu/oa_dissertations/1954

2. Tang, Fei. Random Forest Missing Data Approaches.

Degree: PhD, Biostatistics (Medicine), 2017, University of Miami

Random forest (RF) missing data algorithms are an attractive approach for imputing missing data. They have the desirable properties of handling mixed types of missing data, adapting to interactions and nonlinearity, and having the potential to scale to big data settings. Currently there are many different RF imputation algorithms, but relatively little guidance about their efficacy. Using a large, diverse collection of data sets, imputation performance of various RF algorithms was assessed under different missing data mechanisms. Algorithms included proximity imputation, on-the-fly imputation, and imputation utilizing multivariate unsupervised and supervised splitting, the latter class representing a generalization of a new promising imputation algorithm called missForest. Our findings reveal RF imputation to be generally robust, with performance improving with increasing correlation. Performance was good under moderate to high missingness, and even (in certain cases) when data was missing not at random. Real data analysis using the RF imputation methods was conducted on the MESA data. Advisors/Committee Members: Hemant Ishwaran, J. Sunil Rao, Lily Wang, Panagiota V. Caralis.

Subjects/Keywords: Random Forest; Imputation; MESA data; Missing data

APA (6th Edition):

Tang, F. (2017). Random Forest Missing Data Approaches. (Doctoral Dissertation). University of Miami. Retrieved from https://scholarlyrepository.miami.edu/oa_dissertations/1852

Chicago Manual of Style (16th Edition):

Tang, Fei. “Random Forest Missing Data Approaches.” 2017. Doctoral Dissertation, University of Miami. Accessed January 28, 2020. https://scholarlyrepository.miami.edu/oa_dissertations/1852.

MLA Handbook (7th Edition):

Tang, Fei. “Random Forest Missing Data Approaches.” 2017. Web. 28 Jan 2020.

Vancouver:

Tang F. Random Forest Missing Data Approaches. [Internet] [Doctoral dissertation]. University of Miami; 2017. [cited 2020 Jan 28]. Available from: https://scholarlyrepository.miami.edu/oa_dissertations/1852.

Council of Science Editors:

Tang F. Random Forest Missing Data Approaches. [Doctoral Dissertation]. University of Miami; 2017. Available from: https://scholarlyrepository.miami.edu/oa_dissertations/1852
