Full Record


Author
Title Dirty statistical models
URL
Publication Date
Date Accessioned
Discipline/Department Electrical and Computer Engineering
University/Publisher University of Texas at Austin
Abstract In fields across science and engineering, we are increasingly faced with problems where the number of variables or features to be estimated is much larger than the number of observations. Under such high-dimensional scaling, for any hope of statistically consistent estimation, it becomes vital to leverage potential structure in the problem, such as sparsity, low-rank structure, or block sparsity. However, data may deviate significantly from any single such statistical model. The motivating question of this thesis is: can we simultaneously leverage more than one structural model, to obtain consistency in a larger class of problems, and with fewer samples, than is possible with any single model? Our approach combines structures via simple linear superposition, a technique we term dirty models. The idea is simple: while no single structure may capture the data, a superposition of structural classes might. A dirty model thus searches for a parameter that decomposes into a sum of simpler structures, such as (a) sparse plus block-sparse, (b) sparse plus low-rank, and (c) low-rank plus block-sparse. In this thesis, we propose dirty-model-based algorithms for problems including multi-task learning, graph clustering, and time-series analysis with latent factors, and we analyze each algorithm in terms of the number of observations needed to estimate the variables. These algorithms are based on convex optimization and can be relatively slow; we therefore provide a class of low-complexity greedy algorithms that not only solve these optimization problems faster, but also come with guarantees on their solutions. Alongside the theoretical results, in each case we provide experimental results illustrating the power of dirty models.
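The superposition idea in the abstract (case (b), sparse plus low-rank) can be sketched with a toy alternating-projection scheme. This is a minimal, hypothetical stand-in for the convex programs and greedy algorithms analyzed in the thesis, not the author's method; the function name, thresholds, and test data are illustrative assumptions:

```python
import numpy as np

def decompose_sparse_plus_lowrank(M, rank, sparse_thresh, n_iter=50):
    """Toy alternating projections decomposing M ~ L (low-rank) + S (sparse).

    Illustrative only: alternates a hard-thresholding step for the sparse
    part with a truncated-SVD projection for the low-rank part.
    """
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        # Sparse step: keep only residual entries exceeding the threshold.
        R = M - L
        S = np.where(np.abs(R) > sparse_thresh, R, 0.0)
        # Low-rank step: project the remaining residual onto rank-`rank` matrices.
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return L, S

# Toy data: a rank-1 signal corrupted by a few large sparse entries.
rng = np.random.default_rng(0)
u, v = rng.standard_normal(20), rng.standard_normal(15)
L_true = np.outer(u, v)
S_true = np.zeros((20, 15))
S_true[3, 4] = 10.0
S_true[7, 2] = -8.0
M = L_true + S_true

L_hat, S_hat = decompose_sparse_plus_lowrank(M, rank=1, sparse_thresh=5.0)
```

Neither structure alone fits `M`: it is full-rank because of the sparse corruptions, and dense because of the low-rank signal. The superposition of the two classes, however, captures it, which is the point the abstract makes.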
Subjects/Keywords Structure learning; Statistical inference; Dirty models; High-dimensional statistics; Machine learning; Sparse and low-rank decomposition; Graph clustering; Time series analysis; Greedy dirty algorithms
Contributors Sanghavi, Sujay Rajendra, 1979- (advisor); Caramanis, Constantine (committee member); Ghosh, Joydeep (committee member); Dhillon, Inderjit (committee member); Ravikumar, Pradeep (committee member)
Language en
Country of Publication us
Record ID handle:2152/ETD-UT-2012-05-5088
Repository texas
Date Retrieved
Date Indexed 2018-10-22