You searched for subject:(High dimensional).
Showing records 1 – 30 of 572 total matches.
1.
Freyaldenhoven, Simon.
Essays on Factor Models and Latent Variables in Economics.
Degree: Department of Economics, 2018, Brown University
URL: https://repository.library.brown.edu/studio/item/bdr:792643/
This dissertation examines the modeling of latent variables in economics in a variety of settings. The first two chapters contribute to the growing body of work on how best to find meaning in high-dimensional datasets.
In Chapter 1, I extend the theory of factor models by incorporating "local" factors into the model. Local factors affect a decreasing fraction of the observed variables. This implies a continuum of eigenvalues of the covariance matrix, as is commonly observed in applications. I find that the factor strength at which the principal component estimator gives consistent factor estimates coincides with the factor strength at which factors are economically important in many economic models. I further propose a novel class of estimators for the number of those factors. Unlike previously proposed estimators, mine use information in the eigenvectors as well as in the eigenvalues. Monte Carlo evidence suggests significant finite-sample gains over existing estimators. In an empirical application, I find evidence of local factors in a large panel of US macroeconomic indicators.
In Chapter 2, I establish that sparse principal components can consistently recover local factors. I further develop a unifying framework that encompasses both factor-augmented regressions and high-dimensional sparse linear regression models. I argue that factor-augmented regressions with local factors partially fill the gap between those approaches.
Chapter 3 considers a linear panel event-study design in which a latent factor may affect both the outcome of interest and the timing of the event. This scenario would invalidate a traditional difference-in-differences approach. However, this chapter presents a novel method that nevertheless allows a practitioner to identify the causal effect of the event on the outcome of interest. A covariate related to the latent factor, but unaffected by the event, is used to achieve identification via a GMM representation. This approach permits causal inference in the presence of pre-event trends in the outcome.
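As a point of reference for this abstract, here is a minimal numpy sketch of the standard principal-component factor estimator together with the classical eigenvalue-ratio rule for the number of factors. Both are textbook baselines, not the eigenvector-based estimators the dissertation proposes; all names and the simulated panel are invented for the example.

```python
import numpy as np

def pc_factors(X, r):
    """Principal-component estimates of r factors from an n x p panel X."""
    X = X - X.mean(axis=0)                  # demean each series
    # eigen-decomposition of the sample covariance of the observed series
    vals, vecs = np.linalg.eigh(X.T @ X / X.shape[0])
    order = np.argsort(vals)[::-1]          # sort eigenvalues descending
    loadings = vecs[:, order[:r]]           # loadings = top-r eigenvectors
    factors = X @ loadings                  # factor estimates by projection
    return factors, loadings, vals[order]

def eigenvalue_ratio(eigvals, r_max=10):
    """Classical eigenvalue-ratio rule for the number of factors (an
    eigenvalue-only baseline; the dissertation's estimators also use
    the eigenvectors)."""
    ratios = eigvals[:r_max] / eigvals[1:r_max + 1]
    return int(np.argmax(ratios)) + 1

rng = np.random.default_rng(0)
F = rng.standard_normal((200, 2))           # two latent factors
L = rng.standard_normal((50, 2))            # loadings on 50 observed series
X = F @ L.T + rng.standard_normal((200, 50))
factors, loadings, eigvals = pc_factors(X, r=2)
print(eigenvalue_ratio(eigvals))            # typically recovers 2
```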
Advisors/Committee Members: McCloskey, Adam (Advisor), Shapiro, Jesse (Reader), Renault, Eric (Reader), Kleibergen, Frank (Reader).
Subjects/Keywords: high dimensional data

University of Illinois – Urbana-Champaign
2.
Wang, Runmin.
Statistical inference for high-dimensional data via U-statistics.
Degree: PhD, Statistics, 2020, University of Illinois – Urbana-Champaign
URL: http://hdl.handle.net/2142/108476
Owing to advances in science and technology, there has been a surge of interest in high-dimensional data. Many methods developed in the low- or fixed-dimensional setting may not be theoretically valid in this new setting, and some are not even applicable when the dimensionality is larger than the sample size. To circumvent the difficulties brought by high dimensionality, we consider U-statistics-based methods. In this thesis, we investigate the theoretical properties of U-statistics in the high-dimensional setting and develop novel U-statistics-based methods for three problems.
In the first chapter, we propose a new formulation of self-normalization for inference about the mean of high-dimensional stationary processes, using a U-statistic-based approach. Self-normalization has attracted considerable attention in the recent literature on time series analysis, but its scope of applicability has been limited to low-/fixed-dimensional parameters for low-dimensional time series. Our original test statistic is a U-statistic with a trimming parameter to remove the bias caused by weak dependence. Under the framework of nonlinear causal processes, we show the asymptotic normality of our U-statistic, with the convergence rate depending on the order of the Frobenius norm of the long-run covariance matrix. The self-normalized test statistic is then constructed on the basis of recursively subsampled U-statistics, and its limiting null distribution is shown to be a functional of time-changed Brownian motion, which differs from the pivotal limit used in the low-dimensional setting. An interesting phenomenon associated with self-normalization is that it works in the high-dimensional context even if the convergence rate of the original test statistic is unknown. We also present applications to testing for bandedness of the covariance matrix and testing for white noise in high-dimensional stationary time series, and compare the finite-sample performance with existing methods in simulation studies. At the root of our theoretical arguments, we extend the martingale approximation to the high-dimensional setting, which may be of independent theoretical interest.
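To make the U-statistic approach concrete, here is a small sketch of the classical cross-product U-statistic for testing a zero mean with independent high-dimensional observations, the building block that this chapter's trimmed, self-normalized statistic extends to time series; the simulated data are illustrative only.

```python
import numpy as np

def mean_u_statistic(X):
    """U-statistic sum_{i != j} X_i' X_j / (n(n-1)): an unbiased estimate
    of ||mu||^2 that drops the diagonal bias terms of ||X_bar||^2."""
    n = X.shape[0]
    G = X @ X.T                        # Gram matrix of inner products
    off_diag = G.sum() - np.trace(G)   # remove the i == j terms
    return off_diag / (n * (n - 1))

rng = np.random.default_rng(1)
X0 = rng.standard_normal((100, 500))   # null: mean zero
X1 = X0 + 0.2                          # alternative: dense mean shift
print(mean_u_statistic(X0), mean_u_statistic(X1))  # near 0 vs. clearly positive
```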
In the second chapter, we consider change-point testing and estimation for high-dimensional data. In the case of testing for a mean shift, we propose a new test which is based on U-statistics and utilizes the self-normalization principle. Our test targets dense alternatives in the high-dimensional setting and involves no tuning parameters. The weak convergence of a sequential U-statistic-based process is shown as an important theoretical contribution. Extensions to testing for multiple unknown change points in the mean, and to testing for changes in the covariance matrix, are also presented with rigorous asymptotic theory and encouraging simulation results. Additionally, we illustrate how our approach can be used in combination with wild binary segmentation to estimate the number and locations of multiple unknown change points.
In the third chapter, we…
Advisors/Committee Members: Shao, Xiaofeng (advisor), Shao, Xiaofeng (Committee Chair), Chen, Xiaohui (committee member), Fellouris, Georgios (committee member), Simpson, Douglas G (committee member).
Subjects/Keywords: High-dimensional data; U-statistics

University of New South Wales
3.
Gilbert, Alexander.
Algorithms for numerical integration in high and infinite dimensions: Analysis, applications and implementation.
Degree: Mathematics & Statistics, 2018, University of New South Wales
URL: http://handle.unsw.edu.au/1959.4/60171 ; https://unsworks.unsw.edu.au/fapi/datastream/unsworks:51009/SOURCE2?view=true
Approximating high- and infinite-dimensional integrals numerically is in general a very difficult problem. However, it is also one that arises in several applications in statistics, finance, and uncertainty quantification, motivating a real need for the development and analysis of efficient algorithms. The difficulty lies in the fact that high-dimensional problems in general suffer from the curse of dimensionality, where the cost of an approximation rises exponentially with the dimension. However, knowing certain properties of the integrands allows one to identify problems that do not suffer from the curse and for which efficient algorithms can be developed. In this thesis we study numerical integration algorithms, specifically Quasi-Monte Carlo (QMC) quadrature rules and the Multivariate Decomposition Method (MDM), when bounds on the first mixed derivatives are known. The focus of this thesis is on the analysis and development of algorithms, a new application of QMC methods in the field of uncertainty quantification, and efficient strategies for implementing numerical integration algorithms. The main results of this thesis are as follows. First, we present a full error analysis for the application of QMC methods to approximate the expectation of the smallest eigenvalue of an elliptic differential operator with coefficients that are parametrised by infinitely many stochastic variables. Eigenvalue problems are used to model many physical situations in engineering and the natural sciences, and this problem is motivated by uncertainty quantification of such problems. It also represents a new application for QMC methods. Second, we provide explicit details and numerical results on how to efficiently implement the Multivariate Decomposition Method for approximating infinite-dimensional integrals. The third contribution of this thesis is a new method of constructing optimal active sets for use in the MDM. Finally, we present two user-friendly component-by-component algorithms for constructing QMC lattice rules, which automatically choose good function-space weight parameters. In all cases we present numerical results that display the advantages of the algorithms and, where appropriate, substantiate our corresponding theoretical results.
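For context on the QMC lattice rules mentioned above, here is a minimal sketch of a rank-1 lattice rule. The generating vector below is purely illustrative; the component-by-component constructions studied in this thesis choose good generating vectors automatically.

```python
import numpy as np

def lattice_rule(f, z, n):
    """Approximate the integral of f over [0,1]^d with an n-point
    rank-1 lattice rule: points x_i = frac(i * z / n)."""
    i = np.arange(n)[:, None]
    points = (i * z[None, :] / n) % 1.0        # all n lattice points at once
    return f(points).mean()

d, n = 4, 2**13
z = np.array([1, 433461, 315689, 441789]) % n  # illustrative generating vector
f = lambda x: np.prod(1 + (x - 0.5) / 3, axis=1)  # smooth test integrand, exact value 1
print(lattice_rule(f, z, n))                   # close to 1
```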
Advisors/Committee Members: Kuo, Frances, Mathematics & Statistics, Faculty of Science, UNSW, Sloan, Ian, Mathematics & Statistics, Faculty of Science, UNSW.
Subjects/Keywords: High-dimensional integration; Numerical integration; Quasi-Monte Carlo; Infinite-dimensional integration

University of Alberta
4.
Fedoruk, John P.
Dimensionality Reduction via the Johnson and Lindenstrauss Lemma: Mathematical and Computational Improvements.
Degree: MS, Department of Mathematical and Statistical Sciences, 2016, University of Alberta
URL: https://era.library.ualberta.ca/files/cm039k5065
In an increasingly data-driven society, there is a growing need to simplify high-dimensional data sets. Over the course of the past three decades, the Johnson and Lindenstrauss (JL) lemma has evolved from a highly abstract mathematical result into a useful tool for dealing with data sets of immense dimensionality. The lemma asserts that a set of high-dimensional points can be projected into lower dimensions while approximately preserving the pairwise distance structure. The JL lemma has been revisited many times, with improvements to both its sharpness (i.e., the bound on the reduced dimensionality) and its simplicity (i.e., the mathematical derivation). In 2008 Matousek provided generalizations of the JL lemma that lacked the sharpness of earlier approaches. The current investigation seeks to strengthen Matousek's results by maintaining generality while improving sharpness. First, Matousek's results are reproved with more detailed mathematics and, second, computational solutions are obtained on simulated data in Matlab. The reproofs result in a more specific bound than suggested by Matousek while maintaining his level of generality. However, the reproofs lack the sharpness suggested by earlier, less general approaches to the JL lemma. The computational solutions suggest the existence of a result that maintains Matousek's generality while attaining the sharpness suggested by his predecessors. The collective results of the current investigation support the notion that computational solutions play a critical role in the development of mathematical theory.
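To make the lemma's claim tangible, here is a small numpy sketch that projects points through a scaled Gaussian random matrix and checks the pairwise distance distortion. The target-dimension formula uses an illustrative constant, not a bound from this thesis.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(2)
n, d, eps = 200, 10_000, 0.25
k = int(np.ceil(8 * np.log(n) / eps**2))      # illustrative JL target dimension

X = rng.standard_normal((n, d))               # high-dimensional points
P = rng.standard_normal((d, k)) / np.sqrt(k)  # Gaussian projection, scaled
Y = X @ P                                     # k-dimensional embedding

ratios = pdist(Y) / pdist(X)                  # pairwise distance distortions
print(k, ratios.min(), ratios.max())          # ratios concentrate in [1-eps, 1+eps]
```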
Subjects/Keywords: Dimensionality Reduction; High Dimensional Data; Johnson Lindenstrauss

University of Michigan
5.
Qian, Cheng.
Some Advances on Modeling High-Dimensional Data with Complex Structures.
Degree: PhD, Statistics, 2017, University of Michigan
URL: http://hdl.handle.net/2027.42/140828
Recent advances in technology have created an abundance of high-dimensional data and made its analysis possible. These data require new, computationally efficient methodology and new kinds of asymptotic analysis. This thesis consists of four projects that deal with high-dimensional data with complex structures.
The first project focuses on the graph estimation problem for Gaussian graphical models. Graphical models are commonly used to represent conditional independence between random variables, and learning the conditional independence structure from data has attracted much attention in recent years. However, almost all commonly used graph learning methods rely on the assumption that the observations share the same mean vector. In the first project, we extend the Gaussian graphical model to the setting where the observations are connected by a network and the mean vector can differ across observations. We propose an efficient estimation method for the model, and under the assumption of network cohesion, we show that our method can accurately estimate the inverse covariance matrix as well as the corresponding graph structure, both from the theoretical perspective and in numerical studies. To further demonstrate the effectiveness of the proposed method, we also analyze a statisticians' coauthorship network to learn term dependency from statistics publications.
The second project addresses the directed acyclic graph (DAG) estimation problem. Estimating the DAG structure is often challenging, as the computational complexity scales exponentially in the graph size when the total ordering of the DAG is unknown. To reduce the computational cost, and also with the aim of improving estimation accuracy via the bias-variance trade-off, we propose a two-step approach for estimating the DAG when data are generated from a linear structural equation model. In the first step, we infer the moral graph of the DAG via estimation of the inverse covariance matrix, which reduces the parameter space one would search for the DAG. In the second step, we apply a penalized likelihood method for estimating the DAG restricted to the reduced space. Numerical studies indicate that the proposed method compares favorably with the one-step method in terms of both computational cost and estimation accuracy.
The third and fourth projects investigate supervised learning problems. Specifically, in the third project, we study the cointegration problem for multivariate time series data and propose a method for identifying cointegrating vectors with simultaneously groupwise and elementwise sparse structures. Such a sparsity structure enables the elimination of certain coordinates of the original multivariate series from all cointegrated series, leading to parsimonious and potentially more interpretable cointegrating vectors. Specifically, we formulate an optimization problem based on the profile likelihood and propose an iterative algorithm for solving it. The proposed…
Advisors/Committee Members: Zhu, Ji (committee member), Jin, Judy (committee member), Levina, Elizaveta (committee member), Shedden, Kerby A (committee member).
Subjects/Keywords: High-Dimensional; Statistics and Numeric Data; Science

Cornell University
6.
Gaynanova, Irina.
Estimation Of Sparse Low-Dimensional Linear Projections.
Degree: PhD, Statistics, 2015, Cornell University
URL: http://hdl.handle.net/1813/40643
Many multivariate analysis problems are unified under the framework of linear projections. These projections can be tailored towards the analysis of variance (principal components), classification (discriminant analysis), or network recovery (canonical correlation analysis). Traditional techniques form these projections using all of the original variables; however, in recent years there has been a lot of interest in performing variable selection. The main goal of this dissertation is to elucidate some of the fundamental issues that arise in high-dimensional multivariate analysis and to provide computationally efficient and theoretically sound alternatives to existing heuristic techniques.
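As one concrete instance of a sparse low-dimensional linear projection, the sketch below runs scikit-learn's SparsePCA, which zeroes out loadings so each component uses only a few variables. It illustrates the problem class, not this dissertation's estimators.

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 20))
X[:, :3] += 2.0 * rng.standard_normal((100, 1))   # a few driving variables

# Sparse PCA: variable selection inside the projection itself.
spca = SparsePCA(n_components=2, alpha=1.0, random_state=0).fit(X)
print(np.round(spca.components_, 2))              # many exact zeros in the loadings
```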
Advisors/Committee Members: Booth, James (chair), Wells, Martin Timothy (co-chair), Mezey, Jason G. (committee member), Wegkamp, Marten H. (committee member).
Subjects/Keywords: multivariate analysis; high-dimensional statistics; classification

Penn State University
7.
Yu, Ye.
A New Variable Screening Procedure for Cox's Model.
Degree: 2014, Penn State University
URL: https://submit-etda.libraries.psu.edu/catalog/23542
Survival data with ultrahigh-dimensional covariates, such as genetic markers, have been collected in medical studies and other fields. In this thesis, we propose a feature screening procedure for the Cox model with ultrahigh-dimensional covariates. The proposed procedure is distinguished from the existing sure independence screening (SIS) procedures (Fan, Feng and Wu, 2010; Zhao and Li, 2012) in that it is based on the joint likelihood of potential active predictors, and therefore is not a marginal screening procedure. The proposed procedure can effectively identify active predictors that are jointly dependent on, but marginally independent of, the response, without performing an iterative procedure. We develop a computationally effective algorithm to carry out the proposed procedure and establish its ascent property. We also conduct Monte Carlo simulations to evaluate the finite-sample performance of the proposed procedure and to compare it with existing SIS procedures. The methodology is also demonstrated through an empirical analysis of a real data example.
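For contrast with the joint-likelihood approach above, here is a sketch of the marginal SIS-style baseline: fit a univariate Cox model per covariate and rank by the absolute coefficient. It assumes the lifelines package, and the simulated survival data are illustrative only.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(5)
n, p = 200, 50
X = rng.standard_normal((n, p))
hazard = np.exp(0.8 * X[:, 0] - 0.8 * X[:, 1])   # two active predictors
t_event = rng.exponential(1.0 / hazard)          # latent event times
t_cens = rng.exponential(2.0, size=n)            # independent censoring times
obs_time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(int)

# Marginal (one-covariate-at-a-time) Cox screening: the SIS-style baseline.
scores = []
for j in range(p):
    df = pd.DataFrame({"x": X[:, j], "T": obs_time, "E": event})
    cph = CoxPHFitter().fit(df, duration_col="T", event_col="E")
    scores.append(abs(cph.params_["x"]))
print(np.argsort(scores)[::-1][:5])              # features 0 and 1 rank on top
```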
Advisors/Committee Members: Runze Li, Thesis Advisor/Co-Advisor.
Subjects/Keywords: screening; COX's model; high dimensional; iterative procedure
8.
Löffler, Matthias.
Statistical inference in high-dimensional matrix models.
Degree: PhD, 2020, University of Cambridge
URL: https://www.repository.cam.ac.uk/handle/1810/298064
Matrix models are ubiquitous in modern statistics. For instance, they are used in finance to assess the interdependence of assets, in genomics to impute missing data, and in movie recommender systems to model the relationship between users and movie ratings.
Typically such models are either high-dimensional, meaning that the number of parameters may exceed the number of data points by many orders of magnitude, or nonparametric in the sense that the quantity of interest is an infinite-dimensional operator. This leads to new algorithms and also to new theoretical phenomena that may occur when estimating a parameter of interest or functionals of it, or when constructing confidence sets. In this thesis, we consider three such matrix models as examples and develop statistical theory for them: matrix completion, Principal Component Analysis (PCA) with Gaussian data, and transition operators of Markov chains.
We start with matrix completion and investigate the existence of adaptive confidence sets in the 'Bernoulli' and 'trace-regression' models. In the 'Bernoulli' model we show that adaptive confidence sets do not exist when the variance of the errors is unknown, whereas we give an explicit construction in the 'trace-regression' model. Finally, in the known-variance case, we show that adaptive confidence sets do also exist in the 'Bernoulli' model, based on a testing argument.
Next, we consider PCA in a Gaussian observation model with complexity measured by the effective rank, the reciprocal of the percentage of variance explained by the first principal component. We investigate estimation of linear functionals of eigenvectors and prove Berry-Esseen type bounds. Due to the high-dimensionality of the problem we discover a new phenomenon: the plug-in estimator based on the sample eigenvector can have non-negligible bias and hence may no longer be √n-consistent. We show how to de-bias this estimator, achieving √n convergence rates, and prove exactly matching minimax lower bounds.
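The effective rank used here has a compact closed form; written in LaTeX under the definition given in the abstract, with the eigenvalues of the covariance Σ sorted as λ₁ ≥ λ₂ ≥ …:

```latex
% Effective rank: total variance divided by the variance captured by the
% first principal component, i.e. the reciprocal of its explained share.
r(\Sigma) \;=\; \frac{\operatorname{tr}(\Sigma)}{\lambda_1(\Sigma)}
\;=\; \left( \frac{\lambda_1(\Sigma)}{\sum_{j} \lambda_j(\Sigma)} \right)^{-1}
```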
Finally, we consider nonparametric estimation of the transition operator of a Markov chain and its transition density. We assume that the singular values of the transition operator decay exponentially. For example, this assumption is fulfilled by discrete, low-frequency observations of periodised, reversible stochastic differential equations. Using penalization techniques from low-rank matrix estimation, we develop a new algorithm and show improved convergence rates.
Subjects/Keywords: High-dimensional Statistics; Low-rank inference; PCA

Delft University of Technology
9.
Grisel, Bastiaan.
The analysis of three-dimensional embeddings in Virtual Reality.
Degree: 2018, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:afad36f5-64c7-4969-9615-93d89b43e65f
Dimensionality reduction algorithms transform high-dimensional datasets with many attributes per observation into lower-dimensional representations (called embeddings) such that the structure of the dataset is maintained as well as possible. In this research, the use of Virtual Reality (VR) to analyse these embeddings was evaluated and compared to analysis on a desktop computer. The rationale for using VR is twofold: three-dimensional embeddings generally preserve the structure of a high-dimensional dataset better than two-dimensional embeddings, and the analysis of three-dimensional embeddings is difficult on desktop monitors. A user study (n=29) was conducted in which participants performed the common analysis task of cluster identification. The task was performed using a two-dimensional embedding on a desktop computer, a three-dimensional embedding on a desktop computer, and a three-dimensional embedding in Virtual Reality. On average, participants who had used VR at least once before identified clusters better and more consistently in the VR experiments than with the other methods. Participants found it easier to analyse a three-dimensional embedding in VR than on a desktop computer.
Computer Science | Data Science and Technology
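A three-dimensional embedding of the kind analysed in this study can be produced in a few lines; the sketch below uses scikit-learn's t-SNE on a standard digits dataset, with the VR rendering itself out of scope.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)             # 64-dimensional observations
# Three-dimensional embedding: t-SNE tries to preserve local neighbourhoods.
emb = TSNE(n_components=3, init="pca", random_state=0).fit_transform(X)
print(emb.shape)                                # (1797, 3): ready for 3D/VR display
```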
Advisors/Committee Members: Eisemann, Elmar (mentor), Vilanova Bartroli, Anna (graduation committee), Brinkman, Willem-Paul (graduation committee), Delft University of Technology (degree granting institution).
Subjects/Keywords: virtual; reality; embedding; visualisation; data; high-dimensional

University of Minnesota
10.
Zhu, Yunzhang.
Grouping penalties and its applications to high-dimensional models.
Degree: PhD, Statistics, 2014, University of Minnesota
URL: http://hdl.handle.net/11299/165147
Part I: In high-dimensional regression, grouping pursuit and feature selection have their own merits while complementing each other in battling the curse of dimensionality. To seek a parsimonious model, we perform simultaneous grouping pursuit and feature selection over an arbitrary undirected graph with each node corresponding to one predictor. When the corresponding nodes are reachable from each other over the graph, regression coefficients can be grouped, with absolute values that are the same or close. This is motivated by gene network analysis, where genes tend to work in groups according to their biological functions. Through a nonconvex penalty, we develop a computational strategy and analyze the proposed method. Theoretical analysis indicates that the proposed method reconstructs the oracle estimator, that is, the unbiased least squares estimator given the true grouping, leading to consistent reconstruction of grouping structures and informative features, as well as to optimal parameter estimation. Simulation studies suggest that the method combines the benefit of grouping pursuit with that of feature selection, and compares favorably against its competitors in selection accuracy and predictive performance. An application to eQTL data is used to illustrate the methodology, where a network is incorporated into the analysis through an undirected graph.
Part II: Gaussian graphical models are useful for analyzing and visualizing conditional dependence relationships between interacting units. Motivated by network analysis under different experimental conditions, such as gene networks for disparate cancer subtypes, we model structural changes over multiple networks with possible heterogeneities. In particular, we estimate multiple precision matrices describing dependencies among interacting units through maximum penalized likelihood. Of particular interest are homogeneous groups of similar entries across these matrices and their zero entries, referred to as clustering and sparseness structures, respectively. A nonconvex method is proposed to seek a sparse representation for each matrix and to identify clusters of the entries across the matrices. Computationally, we develop an efficient method on the basis of difference-convex programming, the augmented Lagrangian method, and blockwise coordinate descent, which is scalable to hundreds of graphs with thousands of nodes through a simple necessary and sufficient partition rule, which divides nodes into smaller disjoint subproblems, excluding zero-coefficient nodes, for arbitrary graphs with convex relaxation. Theoretically, a finite-sample error bound is derived for the proposed method to reconstruct the clustering and sparseness structures. This leads to consistent reconstruction of these two structures simultaneously, permitting the number of unknown parameters to be exponential in the sample size, and yielding the optimal performance of the oracle estimator as if the true structures were given a priori. Simulation studies suggest that the method enjoys the benefit of…
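A convex caricature of Part I's idea, least squares plus an L1 fusion penalty over graph edges plus an L1 sparsity penalty, can be written directly with cvxpy (an assumed dependency). The thesis's actual penalty is nonconvex, so this surrogate only approximates the setup.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(6)
n, p = 100, 12
X = rng.standard_normal((n, p))
beta_true = np.array([2.0] * 4 + [-1.0] * 4 + [0.0] * 4)  # grouped coefficients
y = X @ beta_true + 0.5 * rng.standard_normal(n)

edges = [(j, j + 1) for j in range(p - 1)]     # chain graph over predictors
beta = cp.Variable(p)
# Fusion term groups coefficients of nodes joined by an edge.
fusion = sum(cp.abs(beta[i] - beta[j]) for i, j in edges)
loss = cp.sum_squares(y - X @ beta) / (2 * n)
problem = cp.Problem(cp.Minimize(loss + 0.05 * cp.norm1(beta) + 0.05 * fusion))
problem.solve()
print(np.round(beta.value, 2))                 # near-equal values within groups
```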
Subjects/Keywords: Graphical models; Grouping penalty; High-dimensional statistics

University of Minnesota
11.
Ye, Changqing.
Network selection, information filtering and scalable computation.
Degree: PhD, Statistics, 2014, University of Minnesota
URL: http://hdl.handle.net/11299/172631
This dissertation explores two application scenarios of sparsity pursuit methods on large-scale data sets. The first scenario is classification and regression in analyzing high-dimensional structured data, where predictors correspond to nodes of a given directed graph. This arises, for instance, in the identification of disease genes for Parkinson's disease from a network of candidate genes. In such a situation, the directed graph describes dependencies among the genes, where the directions of edges represent certain causal effects. Key to high-dimensional structured classification and regression is how to utilize the dependencies among predictors specified by the directions of the graph. In this dissertation, we develop a novel method that fully takes such dependencies into account, formulated through certain nonlinear constraints. We apply the proposed method to two applications: feature selection in large-margin binary classification and in linear regression. We implement the proposed method through difference-convex programming for the cost function and constraints. Finally, theoretical and numerical analyses suggest that the proposed method achieves the desired objectives. An application to disease gene identification is presented.
The second application scenario is personalized information filtering, which extracts the information specifically relevant to a user, predicting his or her preferences over a large number of items based on the opinions of users who think alike or on item content. This problem is cast into the framework of regression and classification, where we introduce novel partial latent models to integrate additional user-specific and content-specific predictors for higher predictive accuracy. In particular, we factorize a user-over-item preference matrix into a product of two matrices, each representing a user's preference and an item preference by users. Then we propose a likelihood method to seek the sparsest latent factorization, from a class of over-complete factorizations, possibly with a high percentage of missing values. This promotes additional sparsity beyond rank reduction. Computationally, we design methods based on a "decomposition and combination" strategy to break large-scale optimization into many small subproblems that are solved in a recursive and parallel manner. On this basis, we implement the proposed methods through multi-platform shared-memory parallel programming, and through Mahout, a library for scalable machine learning and data mining, for MapReduce computation. For example, our methods scale to a dataset consisting of three billion observations on a single machine with sufficient memory, with good timings. Both theoretical and numerical investigations show that the proposed methods exhibit significant improvement in accuracy over state-of-the-art scalable methods.
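The factorization at the heart of the second scenario can be sketched with plain numpy: a partially observed user-by-item matrix is factored into two low-rank matrices by alternating ridge regressions over the observed entries. This is the generic alternating-least-squares recipe, without the sparsity pursuit or parallel machinery of the dissertation; all data are simulated.

```python
import numpy as np

rng = np.random.default_rng(7)
n_users, n_items, r = 50, 40, 3
U_true = rng.standard_normal((n_users, r))
V_true = rng.standard_normal((n_items, r))
R = U_true @ V_true.T
mask = rng.uniform(size=R.shape) < 0.3        # only 30% of ratings observed

U = rng.standard_normal((n_users, r))
V = rng.standard_normal((n_items, r))
lam = 0.1
for _ in range(30):                           # alternating ridge regressions
    for i in range(n_users):                  # update each user's factors
        obs = mask[i]
        A = V[obs].T @ V[obs] + lam * np.eye(r)
        U[i] = np.linalg.solve(A, V[obs].T @ R[i, obs])
    for j in range(n_items):                  # update each item's factors
        obs = mask[:, j]
        A = U[obs].T @ U[obs] + lam * np.eye(r)
        V[j] = np.linalg.solve(A, U[obs].T @ R[obs, j])

err = np.abs((U @ V.T - R)[~mask]).mean()     # error on held-out entries
print(round(err, 3))                          # small: the factors generalize
```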
Subjects/Keywords: High dimensional data; Machine learning; Recommendation; Statistics

University of Minnesota
12.
Chen, Sheng.
Computational and Statistical Aspects of High-Dimensional Structured Estimation.
Degree: PhD, Computer Science, 2018, University of Minnesota
URL: http://hdl.handle.net/11299/198991
Modern statistical learning often faces high-dimensional data, for which the number of features that should be considered is very large. Owing to various constraints encountered in data collection, such as cost and time, however, the available samples in certain domains are small compared with the feature sets. In this scenario, statistical estimation becomes much more challenging than in the large-sample regime. Since the information revealed by small samples is inadequate for finding the optimal model parameters, the estimator may end up with incorrect models that appear to fit the observed data but fail to generalize to unseen data. Owing to prior knowledge about the underlying parameters, additional structure can be imposed to effectively reduce the parameter space, in which it is easier to identify the true parameter with limited data. This simple idea has inspired the study of high-dimensional statistics since its inception. Over the last two decades, sparsity has been one of the most popular structures to exploit when estimating a high-dimensional parameter; it assumes that the number of nonzero elements in the parameter vector or matrix is much smaller than its ambient dimension. For simple scenarios such as linear models, L1-norm-based convex estimators like the Lasso and the Dantzig selector have been widely used to find the true parameter with a reasonable amount of computation and provably small error. Recent years have also seen a variety of structures proposed beyond sparsity, e.g., group sparsity and low-rankness of a matrix, which have been demonstrated to be useful in many applications. On the other hand, the aforementioned estimators can be extended to leverage new types of structures by finding appropriate convex surrogates, like the L1 norm for sparsity. Despite their success on individual structures, current developments towards a unified understanding of various structures are still incomplete in both computational and statistical aspects. Moreover, due to the nature of the model or the parameter structure, the associated estimator can be inherently nonconvex, which may need additional care when we consider such unification of different structures. In this thesis, we aim to make progress towards a unified framework for estimation with general structures by studying the high-dimensional structured linear model and other semi-parametric and nonconvex extensions. In particular, we introduce the generalized Dantzig selector (GDS), which extends the original Dantzig selector for sparse linear models. On the computational side, we develop an efficient optimization algorithm to compute the GDS. On the statistical side, we establish recovery guarantees for the GDS using certain geometric measures. Then we demonstrate that those geometric measures can be bounded using simple information about the structures. These results on the GDS have been extended to the matrix setting as well. Apart from the linear model, we also investigate one of its semi-parametric extensions – the…
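For reference, the original Dantzig selector that the GDS generalizes is a linear program: minimize ‖β‖₁ subject to ‖Xᵀ(y − Xβ)‖∞ ≤ λ. A minimal sketch with scipy's linprog, using the standard split β = u − v and an illustrative λ, follows; it is not the thesis's GDS algorithm.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(8)
n, p = 80, 200
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 3.0                              # sparse true parameter
y = X @ beta_true + rng.standard_normal(n)
lam = 1.5 * np.sqrt(2 * n * np.log(p))           # illustrative tuning value

# Dantzig selector: min ||b||_1  s.t.  ||X'(y - Xb)||_inf <= lam.
# Split b = u - v with u, v >= 0, so ||b||_1 = sum(u) + sum(v).
A = X.T @ X
g = X.T @ y
A_ub = np.block([[A, -A], [-A, A]])              # |A(u - v) - g| <= lam, as two blocks
b_ub = np.concatenate([lam + g, lam - g])
res = linprog(np.ones(2 * p), A_ub=A_ub, b_ub=b_ub)
beta_hat = res.x[:p] - res.x[p:]
print(np.flatnonzero(np.abs(beta_hat) > 0.1))    # indices of the recovered support
```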
Subjects/Keywords: High-Dimensional Statistics; Machine Learning; Structured Estimation

Massey University
13.
Ullah, Insha.
Contributions to high-dimensional data analysis: some applications of the regularized covariance matrices: a thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University, Albany, New Zealand.
Degree: 2015, Massey University
URL: http://hdl.handle.net/10179/6608
High-dimensional data sets, particularly those where the number of variables exceeds the number of observations, are now common in many subject areas including genetics, ecology, and statistical pattern recognition, to name but a few. The sample covariance matrix becomes rank-deficient and is not invertible when the number of variables is greater than the number of observations. This poses a serious problem for many classical multivariate techniques that rely on the inverse of a covariance matrix. Recently, regularized alternatives to the sample covariance have been proposed, which are not only guaranteed to be positive definite but also provide reliable estimates. In this thesis, we bring together some of the important recent regularized estimators of the covariance matrix and explore their performance in high-dimensional scenarios via numerical simulations. We make use of these regularized estimators and attempt to improve the performance of three classical multivariate techniques in high-dimensional settings.
In multivariate random effects models, estimating the between-group covariance is a well-known problem. Its classical estimator involves the difference of two mean square matrices and often results in negative elements on the main diagonal. We use a lasso-regularized estimate of the between-group mean square and propose a new approach to estimating the between-group covariance based on the EM algorithm. Using simulation, the procedure is shown to be quite effective, and the estimate obtained is always positive definite.
Multivariate analysis of variance (MANOVA) faces serious challenges due to the undesirable properties of the sample covariance in high-dimensional problems. First, it suffers from low power and does not maintain an accurate type-I error rate when the dimension is large compared to the sample size. Second, MANOVA relies on the inverse of a covariance matrix and fails to work when the number of variables exceeds the number of observations. We use an approach based on lasso regularization and present a comparative study of the existing approaches, including our proposal. The lasso approach is shown to be an improvement in some cases, in terms of the power of the test, over existing high-dimensional methods.
Another problem addressed in the thesis is how to detect unusual future observations when the dimension is large. The Hotelling T2 control chart has traditionally been used for this purpose. The charting statistic in the control chart relies on the inverse of a covariance matrix and is not reliable in high-dimensional problems. To get a reliable estimate of the covariance matrix we use a distribution-free shrinkage estimator. We make use of an available baseline set of data and propose a procedure to estimate the control limits for monitoring individual future observations. The procedure does not assume multivariate normality and appears robust to violations of multivariate normality. The simulation study shows that the new method performs better than…
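A distribution-free shrinkage covariance estimate of the kind relied on above is available off the shelf; the sketch below contrasts the rank-deficient sample covariance with scikit-learn's Ledoit-Wolf estimator when p > n. It illustrates the ingredient, not the thesis's control-chart procedure.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

rng = np.random.default_rng(9)
n, p = 50, 200                                  # more variables than observations
X = rng.standard_normal((n, p))

S = np.cov(X, rowvar=False)                     # sample covariance: rank-deficient
lw = LedoitWolf().fit(X)                        # shrinkage toward scaled identity
print(np.linalg.matrix_rank(S), lw.shrinkage_)  # rank < p; estimated shrinkage weight
print(np.all(np.linalg.eigvalsh(lw.covariance_) > 0))  # positive definite, unlike S
```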
Subjects/Keywords: Multivariate analysis; High-dimensional data; Covariance

University of Cambridge
14.
Löffler, Matthias.
Statistical inference in high-dimensional matrix models.
Degree: PhD, 2020, University of Cambridge
URL: https://doi.org/10.17863/CAM.45122 ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.793044
► Matrix models are ubiquitous in modern statistics. For instance, they are used in finance to assess interdependence of assets, in genomics to impute missing data…
(more)
▼ Matrix models are ubiquitous in modern statistics. For instance, they are used in finance to assess the interdependence of assets, in genomics to impute missing data and in movie recommender systems to model the relationship between users and movie ratings. Typically such models are either high-dimensional, meaning that the number of parameters may exceed the number of data points by many orders of magnitude, or nonparametric in the sense that the quantity of interest is an infinite-dimensional operator. This leads to new algorithms and also to new theoretical phenomena that may occur when estimating a parameter of interest or functionals of it, or when constructing confidence sets. In this thesis, we consider three such matrix models as examples and develop statistical theory for them: matrix completion, Principal Component Analysis (PCA) with Gaussian data and transition operators of Markov chains. We start with matrix completion and investigate the existence of adaptive confidence sets in the 'Bernoulli' and 'trace-regression' models. In the 'Bernoulli' model we show that adaptive confidence sets do not exist when the variance of the errors is unknown, whereas we give an explicit construction in the 'trace-regression' model. Finally, in the known-variance case, we show that adaptive confidence sets also exist in the 'Bernoulli' model based on a testing argument. Next, we consider PCA in a Gaussian observation model with complexity measured by the effective rank, the reciprocal of the percentage of variance explained by the first principal component. We investigate estimation of linear functionals of eigenvectors and prove Berry-Esseen type bounds. Due to the high-dimensionality of the problem we discover a new phenomenon: the plug-in estimator based on the sample eigenvector can have non-negligible bias and hence may no longer be √n-consistent. We show how to de-bias this estimator, achieving √n-convergence rates, and prove matching minimax lower bounds. Finally, we consider nonparametric estimation of the transition operator of a Markov chain and its transition density. We assume that the singular values of the transition operator decay exponentially. For example, this assumption is fulfilled by discrete, low-frequency observations of periodised, reversible stochastic differential equations. Using penalization techniques from low-rank matrix estimation we develop a new algorithm and show improved convergence rates.
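For orientation, the effective rank used above has a one-line formula: tr(Σ)/λ₁, the reciprocal of the share of variance explained by the first principal component. A minimal numpy sketch on an assumed spiked covariance:

import numpy as np

def effective_rank(sigma):
    # tr(Sigma) / lambda_1: reciprocal of the fraction of variance
    # explained by the first principal component.
    eigvals = np.linalg.eigvalsh(sigma)
    return eigvals.sum() / eigvals.max()

# A spiked covariance: one strong direction plus isotropic noise.
p = 200
v = np.ones((p, 1)) / np.sqrt(p)
sigma = 10.0 * (v @ v.T) + np.eye(p)
print(effective_rank(sigma))   # (10 + 200) / 11, roughly 19.1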
Subjects/Keywords: High-dimensional Statistics; Low-rank inference; PCA
APA (6th Edition):
Löffler, M. (2020). Statistical inference in high-dimensional matrix models. (Doctoral Dissertation). University of Cambridge. Retrieved from https://doi.org/10.17863/CAM.45122 ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.793044
Chicago Manual of Style (16th Edition):
Löffler, Matthias. “Statistical inference in high-dimensional matrix models.” 2020. Doctoral Dissertation, University of Cambridge. Accessed February 28, 2021.
https://doi.org/10.17863/CAM.45122 ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.793044.
MLA Handbook (7th Edition):
Löffler, Matthias. “Statistical inference in high-dimensional matrix models.” 2020. Web. 28 Feb 2021.
Vancouver:
Löffler M. Statistical inference in high-dimensional matrix models. [Internet] [Doctoral dissertation]. University of Cambridge; 2020. [cited 2021 Feb 28].
Available from: https://doi.org/10.17863/CAM.45122 ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.793044.
Council of Science Editors:
Löffler M. Statistical inference in high-dimensional matrix models. [Doctoral Dissertation]. University of Cambridge; 2020. Available from: https://doi.org/10.17863/CAM.45122 ; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.793044

Vanderbilt University
15.
-3545-4710.
Three Essays in Cluster Robust Machine Learning and High-Dimensional Econometrics.
Degree: PhD, Economics, 2020, Vanderbilt University
URL: http://hdl.handle.net/1803/15939
► The new generation of machine learning and high-dimensional techniques has become a powerful tool for economists. In economics, researchers often face cross-sectional dependence. However, the existing…
(more)
▼ The new generation of machine learning and high-dimensional techniques has become a powerful tool for economists. In economics, researchers often face cross-sectional dependence. However, the existing methods are often established under an independent sampling assumption. Failure to account for such dependence can potentially lead to false positive research results. This dissertation attempts to provide a first look at some new machine learning and high-dimensional methods under various cross-sectional dependence assumptions.
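The abstract does not spell out the dissertation's estimators; as background for the cross-sectional dependence it targets, here is a minimal numpy sketch of the classical Liang-Zeger cluster-robust (sandwich) variance for OLS, the kind of correction such methods generalize. The data-generating process and cluster structure are illustrative assumptions.

import numpy as np

def cluster_robust_vcov(X, y, clusters):
    # Sandwich variance: bread = (X'X)^-1, meat = sum of within-cluster
    # score outer products, so arbitrary dependence within a cluster is allowed.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        s = X[clusters == g].T @ u[clusters == g]
        meat += np.outer(s, s)
    return bread @ meat @ bread

rng = np.random.default_rng(1)
n, p = 300, 3
clusters = np.repeat(np.arange(30), 10)          # 30 clusters of 10 observations
X = rng.standard_normal((n, p))
shocks = rng.standard_normal(30)[clusters]       # common shock shared within each cluster
y = X @ np.array([1.0, 0.0, -0.5]) + shocks + 0.5 * rng.standard_normal(n)
print(np.sqrt(np.diag(cluster_robust_vcov(X, y, clusters))))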
Advisors/Committee Members: Sasaki, Yuya (advisor).
Subjects/Keywords: cluster robust inference; high-dimensional; machine learning
APA (6th Edition):
-3545-4710. (2020). Three Essays in Cluster Robust Machine Learning and High-Dimensional Econometrics. (Doctoral Dissertation). Vanderbilt University. Retrieved from http://hdl.handle.net/1803/15939
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Chicago Manual of Style (16th Edition):
-3545-4710. “Three Essays in Cluster Robust Machine Learning and High-Dimensional Econometrics.” 2020. Doctoral Dissertation, Vanderbilt University. Accessed February 28, 2021.
http://hdl.handle.net/1803/15939.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
MLA Handbook (7th Edition):
-3545-4710. “Three Essays in Cluster Robust Machine Learning and High-Dimensional Econometrics.” 2020. Web. 28 Feb 2021.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Vancouver:
-3545-4710. Three Essays in Cluster Robust Machine Learning and High-Dimensional Econometrics. [Internet] [Doctoral dissertation]. Vanderbilt University; 2020. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/1803/15939.
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Council of Science Editors:
-3545-4710. Three Essays in Cluster Robust Machine Learning and High-Dimensional Econometrics. [Doctoral Dissertation]. Vanderbilt University; 2020. Available from: http://hdl.handle.net/1803/15939
Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete

Virginia Tech
16.
Blake, Patrick Michael.
Biclustering and Visualization of High Dimensional Data using VIsual Statistical Data Analyzer.
Degree: MS, Electrical Engineering, 2019, Virginia Tech
URL: http://hdl.handle.net/10919/87392
► Many data sets have too many features for conventional pattern recognition techniques to work properly. This thesis investigates techniques that alleviate these difficulties. One such…
(more)
▼ Many data sets have too many features for conventional pattern recognition techniques to work properly. This thesis investigates techniques that alleviate these difficulties. One such technique, biclustering, clusters data in both dimensions and is inherently resistant to the challenges posed by having too many features. However, the algorithms that implement biclustering have limitations in that the user must know at least the structure of the data and how many biclusters to expect. This is where the VIsual Statistical Data Analyzer, or VISDA, can help. It is a visualization tool that successively and progressively explores the structure of the data, identifying clusters along the way. This thesis proposes coupling VISDA with biclustering to overcome some of the challenges of data sets with too many features. Further, to increase the performance, usability, and maintainability as well as reduce costs, VISDA was translated from Matlab to a Python version called VISDApy. Both VISDApy and the overall process were demonstrated with real and synthetic data sets. The results of this work have the potential to improve analysts’ understanding of the relationships within complex data sets and their ability to make informed decisions from such data.
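The excerpt does not name the specific biclustering algorithms coupled with VISDA; as a generic stand-in, a minimal sketch of spectral co-clustering on synthetic data with scikit-learn, where the algorithm choice, sizes, and parameters are all illustrative assumptions:

import numpy as np
from sklearn.cluster import SpectralCoclustering
from sklearn.datasets import make_biclusters

# Synthetic matrix with 4 planted biclusters, shuffled to hide them.
data, rows, cols = make_biclusters(shape=(200, 150), n_clusters=4,
                                   noise=5, shuffle=True, random_state=0)
model = SpectralCoclustering(n_clusters=4, random_state=0)
model.fit(data)

# Reorder rows and columns by cluster label to expose the block structure.
reordered = data[np.argsort(model.row_labels_)]
reordered = reordered[:, np.argsort(model.column_labels_)]
print(model.row_labels_[:10], model.column_labels_[:10])

Note that n_clusters must be supplied up front, which is exactly the limitation the thesis addresses by using VISDA's progressive exploration to discover the structure first.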
Advisors/Committee Members: Wang, Yue J. (committeechair), Xuan, Jianhua (committee member), Yu, Guoqiang (committee member).
Subjects/Keywords: high-dimensional data; biclustering; VISDA; VISDApy
APA (6th Edition):
Blake, P. M. (2019). Biclustering and Visualization of High Dimensional Data using VIsual Statistical Data Analyzer. (Masters Thesis). Virginia Tech. Retrieved from http://hdl.handle.net/10919/87392
Chicago Manual of Style (16th Edition):
Blake, Patrick Michael. “Biclustering and Visualization of High Dimensional Data using VIsual Statistical Data Analyzer.” 2019. Masters Thesis, Virginia Tech. Accessed February 28, 2021.
http://hdl.handle.net/10919/87392.
MLA Handbook (7th Edition):
Blake, Patrick Michael. “Biclustering and Visualization of High Dimensional Data using VIsual Statistical Data Analyzer.” 2019. Web. 28 Feb 2021.
Vancouver:
Blake PM. Biclustering and Visualization of High Dimensional Data using VIsual Statistical Data Analyzer. [Internet] [Masters thesis]. Virginia Tech; 2019. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/10919/87392.
Council of Science Editors:
Blake PM. Biclustering and Visualization of High Dimensional Data using VIsual Statistical Data Analyzer. [Masters Thesis]. Virginia Tech; 2019. Available from: http://hdl.handle.net/10919/87392

Princeton University
17.
Li, Yan.
Optimal Learning in High Dimensions.
Degree: PhD, 2016, Princeton University
URL: http://arks.princeton.edu/ark:/88435/dsp014m90dx99b
► Collecting information in the course of sequential decision-making can be extremely challenging in high-dimensional settings, where the measurement budget is much smaller than…
(more)
▼ Collecting information in the course of sequential decision-making can be extremely challenging in high-dimensional settings, where the measurement budget is much smaller than both the number of alternatives and the number of parameters in the model. In the parametric setting, we derive a knowledge gradient policy with high-dimensional sparse additive belief models, where there are hundreds or even thousands of features, but only a small portion of these features contain explanatory power. This policy is a unique and novel hybrid of Bayesian ranking and selection with a frequentist learning approach called Lasso. In particular, our method uses a B-spline basis of finite order to approximate the nonparametric additive model and the functional ANOVA model. Theoretically, we provide estimation error bounds for the posterior mean estimate and the functional estimate. We also demonstrate how this method is applied to learn the structure of large RNA molecules. In the nonparametric setting, we explore high-dimensional sparse belief functions without putting any assumptions on the model structure. A knowledge gradient policy in the framework of regularized regression trees is developed. This policy provides an effective and efficient method for sequential information collection as well as feature selection for nonparametric belief models. We also show how this method can be used in two clinical settings: identifying optimal clinical pathways for patients, and reducing medical expenses in finding the best doctors for a sequence of patients.
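The Bayesian ranking-and-selection half of the hybrid is beyond a short excerpt, but its frequentist half is plain Lasso; a minimal scikit-learn sketch on assumed synthetic data, where only a few of many features carry explanatory power:

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, k = 100, 1000, 5                 # few observations, many features
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 3.0                         # only k features are informative
y = X @ beta + rng.standard_normal(n)

model = Lasso(alpha=0.5).fit(X, y)
print("selected features:", np.flatnonzero(model.coef_))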
Advisors/Committee Members: Powell, Warren B (advisor).
Subjects/Keywords: Bayesian Optimization; High-dimensional Statistics; Optimal Learning
APA (6th Edition):
Li, Y. (2016). Optimal Learning in High Dimensions. (Doctoral Dissertation). Princeton University. Retrieved from http://arks.princeton.edu/ark:/88435/dsp014m90dx99b
Chicago Manual of Style (16th Edition):
Li, Yan. “Optimal Learning in High Dimensions.” 2016. Doctoral Dissertation, Princeton University. Accessed February 28, 2021.
http://arks.princeton.edu/ark:/88435/dsp014m90dx99b.
MLA Handbook (7th Edition):
Li, Yan. “Optimal Learning in High Dimensions.” 2016. Web. 28 Feb 2021.
Vancouver:
Li Y. Optimal Learning in High Dimensions. [Internet] [Doctoral dissertation]. Princeton University; 2016. [cited 2021 Feb 28].
Available from: http://arks.princeton.edu/ark:/88435/dsp014m90dx99b.
Council of Science Editors:
Li Y. Optimal Learning in High Dimensions. [Doctoral Dissertation]. Princeton University; 2016. Available from: http://arks.princeton.edu/ark:/88435/dsp014m90dx99b

University of Minnesota
18.
Sivakumar, Vidyashankar.
Beyond Sub-Gaussian and Independent Data in High Dimensional Regression.
Degree: PhD, Computer Science, 2020, University of Minnesota
URL: http://hdl.handle.net/11299/217800
► The past three decades have seen major developments in high-dimensional regression models, leading to their successful use in applications from multiple domains including climate science,…
(more)
▼ The past three decades have seen major developments in high-dimensional regression models, leading to their successful use in applications from multiple domains including climate science, finance, recommendation systems, computational biology and signal processing, to name a few. The underlying assumption in high-dimensional regression models is that the phenomenon under study can be explained by a simple model with few variables. In high-dimensional parametric regression models, with parameters existing in high-dimensional space, the simplicity assumption is encoded by a sparsity constraint to be satisfied by the parameter vector. Statistical analysis of high-dimensional regression models delves into the properties of the models, including how faithfully the models recover the assumed true sparse parameter and how sensitive the models are to different data assumptions. While major progress has been made over the past several years, non-asymptotic statistical analysis of high-dimensional regression models still makes standard data assumptions of (sub-)Gaussianity and independence which do not hold in some practical applications. For example, datasets in climate and finance are known to have variables with heavier tails than the Gaussian, and bandit algorithms choose data sequentially, violating the independence assumption. The topic of this thesis is the non-asymptotic statistical analysis of high-dimensional regression estimators under non-standard data assumptions, including analysis of traditional estimators like regularized least squares as well as the design of new algorithms to improve estimation performance. Our technical results highlight geometric properties of high-dimensional models, and hence all results are expressed in terms of geometric quantities associated with the sparsity structure assumed for the parameter. Much of the analysis borrows tools and techniques from random matrix analysis, probability tools like generic chaining and, in general, probability results for the behavior of random variables and vectors in high-dimensional space. We analyze four problems:
1. Regularized least squares with sub-exponential data: Data in multiple domains like finance and climate science are known to be sub-exponential, having probability distributions with tails heavier than Gaussians but dominated by a suitably scaled centered exponential distribution. We study the non-asymptotic estimation performance of the regularized least squares estimator with i.i.d. sub-exponential data, showing that the estimation performance is slightly worse than in the i.i.d. sub-Gaussian setting.
2. High-dimensional quantile regression: We study the quantile regression problem in high dimensions, which models the conditional quantile of a response given covariates. While least squares regression is ideal for modeling the conditional mean of a response variable that is symmetric (sub-)Gaussian, there are multiple applications where it is imperative or of interest to model conditional quantiles of the response given covariates to…
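For the second problem above, a minimal sketch of L1-penalized quantile regression with scikit-learn's QuantileRegressor, whose pinball-loss objective carries an L1 penalty; the heavy-tailed data and all parameters are illustrative assumptions rather than the thesis's setup:

import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(0)
n, p = 200, 500
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = 2.0
# Heavy-tailed noise, where conditional quantiles are more natural than the mean.
y = X @ beta + rng.standard_t(df=2, size=n)

# Model the conditional 0.9-quantile; alpha sets the L1 penalty strength.
model = QuantileRegressor(quantile=0.9, alpha=0.1).fit(X, y)
print("nonzero coefficients:", np.flatnonzero(model.coef_).size)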
Subjects/Keywords: Bandits and online learning; High-dimensional regression
APA (6th Edition):
Sivakumar, V. (2020). Beyond Sub-Gaussian and Independent Data in High Dimensional Regression. (Doctoral Dissertation). University of Minnesota. Retrieved from http://hdl.handle.net/11299/217800
Chicago Manual of Style (16th Edition):
Sivakumar, Vidyashankar. “Beyond Sub-Gaussian and Independent Data in High Dimensional Regression.” 2020. Doctoral Dissertation, University of Minnesota. Accessed February 28, 2021.
http://hdl.handle.net/11299/217800.
MLA Handbook (7th Edition):
Sivakumar, Vidyashankar. “Beyond Sub-Gaussian and Independent Data in High Dimensional Regression.” 2020. Web. 28 Feb 2021.
Vancouver:
Sivakumar V. Beyond Sub-Gaussian and Independent Data in High Dimensional Regression. [Internet] [Doctoral dissertation]. University of Minnesota; 2020. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/11299/217800.
Council of Science Editors:
Sivakumar V. Beyond Sub-Gaussian and Independent Data in High Dimensional Regression. [Doctoral Dissertation]. University of Minnesota; 2020. Available from: http://hdl.handle.net/11299/217800

University of Waterloo
19.
Alev, Vedat Levi.
Higher Order Random Walks, Local Spectral Expansion, and Applications.
Degree: 2020, University of Waterloo
URL: http://hdl.handle.net/10012/16310
► The study of spectral expansion of graphs and expander graphs has been an extremely fruitful line of research in Mathematics and Computer Science, with applications…
(more)
▼ The study of spectral expansion of graphs and expander graphs has been an extremely fruitful line of research in Mathematics and Computer Science, with applications ranging from random walks and fast sampling to optimization. In this dissertation, we study high dimensional local spectral expansion, a generalization of the theory of spectral expansion from graphs to simplicial complexes.
We study two random walks on simplicial complexes: the down-up walk, which captures a wide array of natural random walks that can be used to sample random combinatorial objects via the so-called heat-bath dynamics, and the swap walk, which can be thought of as a random walk on a sparse version of the Kneser graph.
First, we give a sharp bound for the spectral gap of the down-up walks in terms of the local spectral expansion. Using this bound, we argue that (i) the natural Markov chain for sampling a random independent set of fixed size s in a graph G = (V,E) is rapidly mixing, so long as s ≤ |V|/(∆+η), where ∆ is the maximum degree of any vertex in G and η is the magnitude of the least eigenvalue of the adjacency matrix of G; and (ii) the natural Markov chain for sampling a common independent set of fixed size s from two partition matroids is rapidly mixing, so long as s ≤ r/3, where r is the maximum size of any common independent set contained in both partition matroids.
Next, we study the spectrum of the swap walks and show that, using local spectral expansion, we can relate the spectrum of the swap walk on any simplicial complex to the spectrum of the Kneser graph. We mention applications of this result in (i) approximating constraint satisfaction problems (CSPs) on instances where the constraint hypergraph is a high dimensional local spectral expander; and in (ii) the construction of new families of list decodable codes based on (sparse) Ramanujan complexes of Lubotzky, Samuels, and Vishne.
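A minimal sketch of the down-up walk specialized to independent sets of fixed size s, the heat-bath move behind result (i): drop a uniformly random vertex, then add a uniformly random vertex that keeps the set independent. The 8-cycle and the chain length are illustrative; note the example does not satisfy the s ≤ |V|/(∆+η) mixing condition, it only demonstrates the move.

import random

def down_up_step(graph, ind_set):
    # One down-up move on size-s independent sets of `graph`
    # (adjacency given as a dict mapping vertex -> set of neighbors).
    current = set(ind_set)
    current.remove(random.choice(sorted(current)))       # "down": drop a vertex
    blocked = current | {u for v in current for u in graph[v]}
    candidates = [v for v in graph if v not in blocked]  # legal additions
    current.add(random.choice(candidates))               # "up": add one back
    return current

# 8-cycle; {0, 2, 4} is an independent set of size 3.
cycle = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
state = {0, 2, 4}
for _ in range(20):
    state = down_up_step(cycle, state)
print(state)

The dropped vertex always remains a legal addition, so the candidate list is never empty and the chain is well defined.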
Subjects/Keywords: spectral gap; Markov chains; high dimensional expanders; high dimensional expansion; random sampling
APA (6th Edition):
Alev, V. L. (2020). Higher Order Random Walks, Local Spectral Expansion, and Applications. (Thesis). University of Waterloo. Retrieved from http://hdl.handle.net/10012/16310
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Alev, Vedat Levi. “Higher Order Random Walks, Local Spectral Expansion, and Applications.” 2020. Thesis, University of Waterloo. Accessed February 28, 2021.
http://hdl.handle.net/10012/16310.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Alev, Vedat Levi. “Higher Order Random Walks, Local Spectral Expansion, and Applications.” 2020. Web. 28 Feb 2021.
Vancouver:
Alev VL. Higher Order Random Walks, Local Spectral Expansion, and Applications. [Internet] [Thesis]. University of Waterloo; 2020. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/10012/16310.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Alev VL. Higher Order Random Walks, Local Spectral Expansion, and Applications. [Thesis]. University of Waterloo; 2020. Available from: http://hdl.handle.net/10012/16310
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

NSYSU
20.
Tai, Chiech-an.
An Automatic Data Clustering Algorithm based on Differential Evolution.
Degree: Master, Computer Science and Engineering, 2013, NSYSU
URL: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0730113-152814
► As one of the traditional optimization problems, clustering still plays a vital role in research, both theoretically and practically. Although many successful clustering…
(more)
▼ As one of the traditional optimization problems, clustering still plays a vital role in research, both theoretically and practically. Although many successful clustering algorithms have been presented, most (if not all) need to be given the number of clusters before the clustering procedure is invoked. A novel differential evolution based clustering algorithm is presented in this paper to solve the problem of automatically determining the number of clusters. The proposed algorithm, called enhanced differential evolution for automatic clustering (EDEAC), leverages the strengths of two technologies: a novel histogram-based analysis technique for finding the approximate number of clusters and a heuristic search algorithm for fine-tuning the automatic clustering results. The experimental results show that the proposed algorithm can not only determine the approximate number of clusters automatically, but can also provide an accurate number of clusters rapidly, even for high dimensional datasets, compared to other existing automatic clustering algorithms.
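EDEAC's histogram analysis and heuristic search are not detailed in this excerpt; a minimal sketch of the underlying idea, optimizing cluster centers for a fixed k with SciPy's differential evolution, where k, the bounds, and the objective are illustrative assumptions:

import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in (-2, 0, 2)])
k, d = 3, data.shape[1]

def within_cluster_ss(flat_centers):
    # Sum of squared distances from each point to its nearest candidate center.
    centers = flat_centers.reshape(k, d)
    dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return (dists.min(axis=1) ** 2).sum()

bounds = [(-4, 4)] * (k * d)
result = differential_evolution(within_cluster_ss, bounds, seed=0, maxiter=200)
print(result.x.reshape(k, d))   # should land near (-2,-2), (0,0), (2,2)

Determining k itself, which EDEAC does with a histogram-based estimate before fine-tuning, is precisely what this fixed-k sketch leaves out.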
Advisors/Committee Members: Chun-Wei Tsai (chair), Ming-Chao Chiang (committee member), Chu-Sing Yang (chair), Tzung-Pei Hong (chair).
Subjects/Keywords: automatic clustering; data clustering; high-dimensional dataset; histogram analysis; differential evolution
APA (6th Edition):
Tai, C. (2013). An Automatic Data Clustering Algorithm based on Differential Evolution. (Thesis). NSYSU. Retrieved from http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0730113-152814
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Tai, Chiech-an. “An Automatic Data Clustering Algorithm based on Differential Evolution.” 2013. Thesis, NSYSU. Accessed February 28, 2021.
http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0730113-152814.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Tai, Chiech-an. “An Automatic Data Clustering Algorithm based on Differential Evolution.” 2013. Web. 28 Feb 2021.
Vancouver:
Tai C. An Automatic Data Clustering Algorithm based on Differential Evolution. [Internet] [Thesis]. NSYSU; 2013. [cited 2021 Feb 28].
Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0730113-152814.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Tai C. An Automatic Data Clustering Algorithm based on Differential Evolution. [Thesis]. NSYSU; 2013. Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0730113-152814
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

NSYSU
21.
Wang, Kai-hsuan.
Optical study of monolayer MoS2 film in high magnetic field.
Degree: Master, Physics, 2015, NSYSU
URL: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0628115-091340
► Recently, motivated by the discovery of graphene, two-dimensional materials have attracted more attention. MoS2 is one of the most studied two-dimensional materials [1,2], owing to its gapped…
(more)
▼ Recently, motivated by the discovery of graphene, two-dimensional materials have attracted more attention. MoS2 is one of the most studied two-dimensional materials [1,2], owing to its gapped energy structure and a crystal structure similar to that of graphene. The energy structure of MoS2 changes from an indirect band gap of 1.29 eV to a direct band gap of 1.9 eV when it is thinned from bulk to monolayer [3-7]. Recently, Yen-Hung Ho et al. studied the Landau level splitting in monolayer MoS2 in high magnetic fields, suggesting that the energy gap depends linearly on the magnetic field [8]. However, high-field experimental evidence is still lacking.
To study the magnetic-field dependence of the energy structure, we have constructed an optical spectrum system. Similar experiments were performed in a middle-pulsed high-magnetic-field system located in the International MegaGauss Science Laboratory, The Institute for Solid State Physics, University of Tokyo, Japan. In Taiwan, we can measure the spectrum from 90 K to 300 K. In Japan, we measured only at 4 K, 77 K and room temperature.
In the literature, the monolayer MoS2 film has two absorption peaks, 1.95 and 2.08 eV, in the visible region [9]. The absorption peaks show a blue shift as temperature decreases. This result agrees with earlier work, where the shift was attributed to thermal expansion [10]. We observed that the temperature dependence of the absorption peak changes dramatically at T ~ 200 K, which cannot be explained by the thermal expansion effect; a thermally induced lattice anomaly could cause it. To determine the origin of this behavior, more detailed experiments are needed.
We performed the magnetic-field-dependent optical measurements using pulsed magnetic fields. With our homemade coil design we achieved peak field values of 8 T, 29 T and 52 T. At T = 77 K, the peaks show a red shift as the magnetic field increases. At 4 K, the peaks show no clear trend with magnetic field. Given the limited magnetic fields for the measured optical spectrum, it is hard to draw a convincing conclusion about the magnetic-field effect in MoS2. After we synthesize new samples, we will perform further experiments at more magnetic field strengths with the pulsed-magnetic-field system in Taiwan.
Advisors/Committee Members: Jim-Long Her (committee member), Hung-Duen Yang (committee member), Jiunn-Yuan Lin (chair), Hsiung Chou (chair).
Subjects/Keywords: band-gap; MoS2; absorption spectrum; high magnetic field; Two-dimensional materials
APA (6th Edition):
Wang, K. (2015). Optical study of monolayer MoS2 film in high magnetic field. (Thesis). NSYSU. Retrieved from http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0628115-091340
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Wang, Kai-hsuan. “Optical study of monolayer MoS2 film in high magnetic field.” 2015. Thesis, NSYSU. Accessed February 28, 2021.
http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0628115-091340.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Wang, Kai-hsuan. “Optical study of monolayer MoS2 film in high magnetic field.” 2015. Web. 28 Feb 2021.
Vancouver:
Wang K. Optical study of monolayer MoS2 film in high magnetic field. [Internet] [Thesis]. NSYSU; 2015. [cited 2021 Feb 28].
Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0628115-091340.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Wang K. Optical study of monolayer MoS2 film in high magnetic field. [Thesis]. NSYSU; 2015. Available from: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0628115-091340
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of California – Riverside
22.
Zakaria, Jesin.
Developing Efficient Algorithms for Data Mining Large Scale High Dimensional Data.
Degree: Computer Science, 2013, University of California – Riverside
URL: http://www.escholarship.org/uc/item/660316zp
► Data mining and knowledge discovery has attracted a great deal of attention in information technology in recent years. The rapid progress of computer hardware technology…
(more)
▼ Data mining and knowledge discovery have attracted a great deal of attention in information technology in recent years. The rapid progress of computer hardware technology in the past three decades has greatly enhanced the database and information industry. The size and complexity of real-world data are increasing dramatically with the growth of hardware technology. Although new efficient algorithms to deal with such data are constantly being proposed, the mining of large scale high dimensional data still presents a lot of challenges. In this dissertation, several novel algorithms are proposed to handle such challenges. These algorithms are applied to domains as diverse as electrocardiography (ECG), stock market data, geospatial data, power supply data, audio data, image data, etc. This dissertation contributes to the data mining community in the following three ways:
Firstly, we propose a novel algorithm for clustering time series data efficiently in the presence of noise or extraneous data. Most existing methods for time series clustering rely on distances calculated from the entire raw data. As a consequence, most work on time series clustering only considers the clustering of individual time series "behaviors," e.g., individual heart beats, and contrives the time series in some way to make them all equal in length. However, for any real-world problem, formatting the data in such a way is often a harder task than the clustering itself. In order to remove these unrealistic assumptions, we have developed a new primitive called the unsupervised shapelet, or u-shapelet, and shown its utility for clustering time series.
Secondly, in order to speed up the discovery of u-shapelets and make it scalable, we propose two optimization techniques which can each speed up the unsupervised shapelet discovery independently; combined, the two optimization procedures yield a super-linear speedup. In addition, we can also cast our u-shapelet discovery algorithm as an anytime algorithm.
In the final contribution, we develop a novel and robust algorithm for mining mice vocalizations with a symbolized representation. Our algorithm processes large scale, high dimensional, noisy mice vocalizations by dimensionality reduction and cardinality reduction, making them suitable for knowledge discovery tasks such as classification, clustering, similarity search, motif discovery, contrast set mining, etc.
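A minimal sketch of the u-shapelet idea from the first contribution: score a candidate subsequence by how cleanly the distances from each series to it split the dataset into two groups. The gap criterion below follows the spirit of that scoring, but the z-normalization, candidate enumeration, and speedups are all omitted, and the data are synthetic.

import numpy as np

def min_dist(series, shapelet):
    # Smallest Euclidean distance between the shapelet and any window of the series.
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return np.linalg.norm(windows - shapelet, axis=1).min()

def gap_score(shapelet, dataset):
    # Sort series by distance to the shapelet; the best split point is where
    # the two groups' distance distributions separate the most.
    d = np.sort([min_dist(s, shapelet) for s in dataset])
    best = 0.0
    for split in range(1, len(d)):
        a, b = d[:split], d[split:]
        best = max(best, (b.mean() - b.std()) - (a.mean() + a.std()))
    return best

rng = np.random.default_rng(0)
bump = np.sin(np.linspace(0, np.pi, 20))
data = []
for i in range(20):
    if i < 10:      # half the series contain the bump at a random offset
        pre = rng.normal(0, 0.2, rng.integers(5, 30))
        data.append(np.concatenate([pre, bump + rng.normal(0, 0.2, 20),
                                    rng.normal(0, 0.2, 30)]))
    else:           # the rest are pure noise
        data.append(rng.normal(0, 0.2, 70))
print(gap_score(bump, data))

Because distances use the best-matching window, the series need not be trimmed to equal length, which is the unrealistic preprocessing step the thesis removes.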
Subjects/Keywords: Computer science; Clustering; Data Mining; High Dimensional Data; Scalable; Time Series
APA (6th Edition):
Zakaria, J. (2013). Developing Efficient Algorithms for Data Mining Large Scale High Dimensional Data. (Thesis). University of California – Riverside. Retrieved from http://www.escholarship.org/uc/item/660316zp
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Zakaria, Jesin. “Developing Efficient Algorithms for Data Mining Large Scale High Dimensional Data.” 2013. Thesis, University of California – Riverside. Accessed February 28, 2021.
http://www.escholarship.org/uc/item/660316zp.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Zakaria, Jesin. “Developing Efficient Algorithms for Data Mining Large Scale High Dimensional Data.” 2013. Web. 28 Feb 2021.
Vancouver:
Zakaria J. Developing Efficient Algorithms for Data Mining Large Scale High Dimensional Data. [Internet] [Thesis]. University of California – Riverside; 2013. [cited 2021 Feb 28].
Available from: http://www.escholarship.org/uc/item/660316zp.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Zakaria J. Developing Efficient Algorithms for Data Mining Large Scale High Dimensional Data. [Thesis]. University of California – Riverside; 2013. Available from: http://www.escholarship.org/uc/item/660316zp
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

University of California – Berkeley
23.
Bhattacharyya, Sharmodeep.
A Study of High-dimensional Clustering and Statistical Inference on Networks.
Degree: Statistics, 2013, University of California – Berkeley
URL: http://www.escholarship.org/uc/item/9sx0k48k
► Clustering is an important unsupervised classification technique. In supervised classification, we are provided with a collection of labeled (pre-classified) patterns and the problem is to…
(more)
▼ Clustering is an important unsupervised classification technique. In supervised classification, we are provided with a collection of labeled (pre-classified) patterns, and the problem is to label a newly encountered, yet unlabeled, pattern. At first, we consider clustering in Euclidean space in large dimensions. Then, we delve into the discrete setting of networks. We go into the issues related to network modeling and then into a specific method of clustering in networks.
In the first chapter, we consider the problem of estimation, and derive theoretical properties of the estimators, for elliptical distributions. The class of elliptical distributions contains distributions with varied tail behavior, so estimation under this class leads to automatically robust estimators. The goal of the chapter is to propose efficient and adaptive regularized estimators for the nonparametric component, mean and covariance matrix of elliptical distributions in both high and fixed dimensional situations. An algorithm for regularized estimation of mixtures of elliptical distributions also leads to an algorithm for finding elliptical clusters in high dimensional space, and such an approach is also given in the chapter.
In clustering, one of the main challenges is the detection of the number of clusters; most clustering algorithms need the number of clusters to be specified beforehand. In chapter two, we propose a new method of selecting the number of clusters, based on hypothesis testing.
The study of networks has received increased attention recently, not only from the social sciences and statistics but also from physicists, computer scientists and mathematicians. But a proper statistical analysis of the features of different stochastic models of networks is still underway. In chapter three, we give an account of different network models and then analyze a specific nonparametric model for networks. We consider the nonparametric estimation of link probabilities in dense social graphs in the context of network modeling and exploratory statistics.
In chapter four, we propose bootstrap methods for finding the empirical distribution of count features or 'moments', and smooth functions of these, for networks. Using these methods, we can not only estimate the variance of count features but also get good estimates of the feature counts themselves, which are usually expensive to compute numerically in large networks. We prove theoretical properties of the bootstrap variance estimates of the count features and show their efficacy through simulation. We also use the method on publicly available Facebook network data to estimate the variance and expectation of some count features.
In chapter five, we propose a clustering or community detection scheme for networks. One of the principal problems in networks is community detection. Many algorithms have been proposed for community finding, but most of them lack theoretical guarantees for sparse networks and networks close to the phase transition boundary proposed by…
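The chapter-four bootstrap can be sketched minimally: resample vertices and recompute a count feature on the induced subgraph. Vertex resampling on an Erdős–Rényi graph is one simple variant chosen for illustration, not necessarily the thesis's scheme; networkx and all parameters are assumptions.

import numpy as np
import networkx as nx

def triangle_density(graph):
    # Triangle count (each triangle is seen from 3 vertices) per vertex triple.
    n = graph.number_of_nodes()
    tri = sum(nx.triangles(graph).values()) / 3
    return tri / (n * (n - 1) * (n - 2) / 6)

rng = np.random.default_rng(0)
g = nx.erdos_renyi_graph(100, 0.1, seed=0)
nodes = list(g.nodes)

# Vertex-resampling bootstrap: draw nodes with replacement, keep induced edges.
stats = []
for _ in range(200):
    sample = rng.choice(nodes, size=len(nodes), replace=True)
    stats.append(triangle_density(g.subgraph(set(sample))))
print(triangle_density(g), np.std(stats))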
Subjects/Keywords: Statistics; Bootstrap; Clustering; Community detection; Elliptical distributions; High-dimensional inference; Networks
APA (6th Edition):
Bhattacharyya, S. (2013). A Study of High-dimensional Clustering and Statistical Inference on Networks. (Thesis). University of California – Berkeley. Retrieved from http://www.escholarship.org/uc/item/9sx0k48k
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Bhattacharyya, Sharmodeep. “A Study of High-dimensional Clustering and Statistical Inference on Networks.” 2013. Thesis, University of California – Berkeley. Accessed February 28, 2021.
http://www.escholarship.org/uc/item/9sx0k48k.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Bhattacharyya, Sharmodeep. “A Study of High-dimensional Clustering and Statistical Inference on Networks.” 2013. Web. 28 Feb 2021.
Vancouver:
Bhattacharyya S. A Study of High-dimensional Clustering and Statistical Inference on Networks. [Internet] [Thesis]. University of California – Berkeley; 2013. [cited 2021 Feb 28].
Available from: http://www.escholarship.org/uc/item/9sx0k48k.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Bhattacharyya S. A Study of High-dimensional Clustering and Statistical Inference on Networks. [Thesis]. University of California – Berkeley; 2013. Available from: http://www.escholarship.org/uc/item/9sx0k48k
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Tulane University
24.
Qu, Zhe.
High-dimensional statistical data integration.
Degree: 2019, Tulane University
URL: https://digitallibrary.tulane.edu/islandora/object/tulane:106916
► Modern biomedical studies often collect multiple types of high-dimensional data on a common set of objects. A representative model for the integrative analysis of…
(more)
▼ Modern biomedical studies often collect multiple types of high-dimensional data on a common set of objects. A representative model for the integrative analysis of multiple data types is to decompose each data matrix into a low-rank common-source matrix generated by latent factors shared across all data types, a low-rank distinctive-source matrix corresponding to each data type, and an additive noise matrix. We propose a novel decomposition method, called decomposition-based generalized canonical correlation analysis, which appropriately defines those matrices by imposing a desirable orthogonality constraint on distinctive latent factors that aims to sufficiently capture the common latent factors. To further delineate the common and distinctive patterns between two data types, we propose another new decomposition method, called common and distinctive pattern analysis. This method takes into account the common and distinctive information between the coefficient matrices of the common latent factors. We develop consistent estimation approaches for both proposed decompositions under high-dimensional settings, and demonstrate their finite-sample performance via extensive simulations. We illustrate the superiority of the proposed methods over the state of the art with real-world data examples obtained from The Cancer Genome Atlas and the Human Connectome Project.
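The decomposition-based generalized CCA above extends classical canonical correlation analysis; for orientation, a minimal sketch of plain two-block CCA with scikit-learn, recovering latent factors shared by two synthetic data types. The generative model and all sizes are illustrative assumptions.

import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n, k = 500, 2
z = rng.standard_normal((n, k))                   # shared latent factors
X = z @ rng.standard_normal((k, 50)) + 0.5 * rng.standard_normal((n, 50))
Y = z @ rng.standard_normal((k, 40)) + 0.5 * rng.standard_normal((n, 40))

cca = CCA(n_components=k).fit(X, Y)
x_scores, y_scores = cca.transform(X, Y)
# High canonical correlations indicate recovered common structure.
print([round(float(np.corrcoef(x_scores[:, i], y_scores[:, i])[0, 1]), 3)
       for i in range(k)])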
Advisors/Committee Members: Hyman, James (Thesis advisor), School of Science & Engineering Mathematics (Degree granting institution).
Subjects/Keywords: High-dimensional data analysis; Data integration; Canonical correlation analysis
APA (6th Edition):
Qu, Z. (2019). High-dimensional statistical data integration. (Thesis). Tulane University. Retrieved from https://digitallibrary.tulane.edu/islandora/object/tulane:106916
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Qu, Zhe. “High-dimensional statistical data integration.” 2019. Thesis, Tulane University. Accessed February 28, 2021.
https://digitallibrary.tulane.edu/islandora/object/tulane:106916.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Qu, Zhe. “High-dimensional statistical data integration.” 2019. Web. 28 Feb 2021.
Vancouver:
Qu Z. High-dimensional statistical data integration. [Internet] [Thesis]. Tulane University; 2019. [cited 2021 Feb 28].
Available from: https://digitallibrary.tulane.edu/islandora/object/tulane:106916.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Qu Z. High-dimensional statistical data integration. [Thesis]. Tulane University; 2019. Available from: https://digitallibrary.tulane.edu/islandora/object/tulane:106916
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Tulane University
25.
Xu, Chao.
Hypothesis Testing for High-Dimensional Regression Under Extreme Phenotype Sampling of Continuous Traits.
Degree: 2018, Tulane University
URL: https://digitallibrary.tulane.edu/islandora/object/tulane:78817
► Extreme phenotype sampling (EPS) is a broadly-used design to identify candidate genetic factors contributing to the variation of quantitative traits. By enriching the signals in…
(more)
▼ Extreme phenotype sampling (EPS) is a broadly-used design to identify candidate genetic factors contributing to the variation of quantitative traits. By enriching the signals in the extreme phenotypic samples within the top and bottom percentiles, EPS can boost the study power compared with random sampling at the same sample size. The existing statistical methods for EPS data test the variants/regions individually. However, many disorders are caused by multiple genetic factors. Therefore, it is critical to simultaneously model the effects of genetic factors, which may increase the power of current genetic studies and identify novel disease-associated genetic factors in EPS. The challenge of the simultaneous analysis of genetic data is that the number (p ~ 10,000) of genetic factors is typically greater than the sample size (n ~ 1,000) in a single study. The standard linear model would be inappropriate for this p > n problem due to the rank deficiency of the design matrix. An alternative solution is to apply a penalized regression method, the least absolute shrinkage and selection operator (LASSO).
LASSO can deal with this high-dimensional (p > n) problem by forcing certain regression coefficients to be zero. Although the application of LASSO in genetic studies under random sampling has been widely studied, its statistical inference and testing under EPS remain unknown. We propose a novel sparse model (EPS-LASSO) with a hypothesis test for high-dimensional regression under EPS, based on a decorrelated score function, to investigate genetic associations, including gene expression and rare variant analyses. Comprehensive simulation shows that EPS-LASSO outperforms existing methods, with superior power when the effects are large together with stable type I error and FDR control. Together with a real data analysis of a genetic study of obesity, our results indicate that EPS-LASSO is an effective method for EPS data analysis, which can account for correlated predictors.
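EPS-LASSO's decorrelated-score inference is beyond a short sketch, but the sampling-plus-selection core can be illustrated: keep only the extreme tails of a simulated trait, then fit a Lasso on the retained samples. The 10% tail cutoffs, effect sizes, and penalty are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 2000, 5000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:4] = 0.5
y = X @ beta + rng.standard_normal(n)

# Extreme phenotype sampling: keep the top and bottom 10% of the trait.
lo, hi = np.quantile(y, [0.10, 0.90])
keep = (y <= lo) | (y >= hi)
model = Lasso(alpha=0.1).fit(X[keep], y[keep])
print("kept n =", int(keep.sum()), "selected =", np.flatnonzero(model.coef_).size)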
Advisors/Committee Members: Deng, Hong-Wen (Thesis advisor), School of Public Health & Tropical Medicine Biostatistics and Bioinformatics (Degree granting institution).
Subjects/Keywords: extreme sampling; high-dimensional regression; genetic data analysis
APA (6th Edition):
Xu, C. (2018). Hypothesis Testing for High-Dimensional Regression Under Extreme Phenotype Sampling of Continuous Traits. (Thesis). Tulane University. Retrieved from https://digitallibrary.tulane.edu/islandora/object/tulane:78817
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Xu, Chao. “Hypothesis Testing for High-Dimensional Regression Under Extreme Phenotype Sampling of Continuous Traits.” 2018. Thesis, Tulane University. Accessed February 28, 2021.
https://digitallibrary.tulane.edu/islandora/object/tulane:78817.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Xu, Chao. “Hypothesis Testing for High-Dimensional Regression Under Extreme Phenotype Sampling of Continuous Traits.” 2018. Web. 28 Feb 2021.
Vancouver:
Xu C. Hypothesis Testing for High-Dimensional Regression Under Extreme Phenotype Sampling of Continuous Traits. [Internet] [Thesis]. Tulane University; 2018. [cited 2021 Feb 28].
Available from: https://digitallibrary.tulane.edu/islandora/object/tulane:78817.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Xu C. Hypothesis Testing for High-Dimensional Regression Under Extreme Phenotype Sampling of Continuous Traits. [Thesis]. Tulane University; 2018. Available from: https://digitallibrary.tulane.edu/islandora/object/tulane:78817
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
26.
Hwang, Sung Jin.
Geometric Representations of High Dimensional Random Data.
Degree: PhD, Electrical Engineering-Systems, 2012, University of Michigan
URL: http://hdl.handle.net/2027.42/96097
► This thesis introduces geometric representations relevant to the analysis of datasets of random vectors in high dimension. These representations are used to study the behavior…
(more)
▼ This thesis introduces geometric representations relevant to the analysis of datasets of random vectors in high dimension. These representations are used to study the behavior of near-neighbor clusters in the dataset, shortest paths through the dataset, and the evolution of multivariate probability distributions over the dataset. The results in this thesis have wide applicability to machine learning problems and are illustrated for problems including: spectral clustering; dimensionality reduction; activity recognition; and video indexing and retrieval.
This thesis makes several contributions. The first contribution is the shortest path over random points in a Riemannian manifold. More precisely, we establish complete convergence results of power-weighted shortest path lengths in compact Riemannian manifolds to conformal deformation distances. These shortest path results are used to interpret and extend Coifman's anisotropic diffusion maps for clustering and dimensionality reduction. The second contribution is statistical manifolds that describe differences between curves evolving over a space of probability measures. A statistical manifold is a space of probability measures induced by the Fisher-Riemann metric. We propose to compare smoothly evolving probability distributions in a statistical manifold by the surface area of the region between a pair of curves. The surface area measure is applied to activity classification for human movements. The third contribution proposes a dimensionality reduction and cluster analysis framework that uses a quantum mechanical model. This model leads to a generalization of geometric clustering methods such as k-means and the Laplacian eigenmap, in which the logical equivalence relation "two points are in the same cluster" is relaxed to a probabilistic equivalence relation.
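A minimal sketch of the power-weighted shortest paths from the first contribution: build a k-nearest-neighbor graph over a point cloud, raise the Euclidean edge lengths to a power p, and run Dijkstra. The uniform sample, k, and the exponent are illustrative assumptions.

import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import dijkstra

rng = np.random.default_rng(0)
points = rng.uniform(size=(500, 3))        # random sample from a simple manifold

# k-NN graph with Euclidean edge lengths, then power-weight the edges.
p_power = 2.0
graph = kneighbors_graph(points, n_neighbors=10, mode='distance')
graph.data = graph.data ** p_power         # edge length d becomes d**p

# Power-weighted shortest path lengths from the first point to all others.
lengths = dijkstra(graph, directed=False, indices=0)
print(lengths[:5])

Raising edge lengths to a power greater than one penalizes long hops, so shortest paths prefer many short steps through dense regions; this is the mechanism that links them to the density-dependent (conformal) distances in the limit.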
Advisors/Committee Members: Damelin, Steven B. (committee member), Hero Iii, Alfred O. (committee member), Gilbert, Anna Catherine (committee member), Nadakuditi, Rajesh Rao (committee member), Scott, Clayton D. (committee member).
Subjects/Keywords: High Dimensional Data; Engineering
APA (6th Edition):
Hwang, S. J. (2012). Geometric Representations of High Dimensional Random Data. (Doctoral Dissertation). University of Michigan. Retrieved from http://hdl.handle.net/2027.42/96097
Chicago Manual of Style (16th Edition):
Hwang, Sung Jin. “Geometric Representations of High Dimensional Random Data.” 2012. Doctoral Dissertation, University of Michigan. Accessed February 28, 2021.
http://hdl.handle.net/2027.42/96097.
MLA Handbook (7th Edition):
Hwang, Sung Jin. “Geometric Representations of High Dimensional Random Data.” 2012. Web. 28 Feb 2021.
Vancouver:
Hwang SJ. Geometric Representations of High Dimensional Random Data. [Internet] [Doctoral dissertation]. University of Michigan; 2012. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/2027.42/96097.
Council of Science Editors:
Hwang SJ. Geometric Representations of High Dimensional Random Data. [Doctoral Dissertation]. University of Michigan; 2012. Available from: http://hdl.handle.net/2027.42/96097

Georgia Tech
27.
Ahlin, Konrad Jeffrey.
The secant and traveling artificial potential field approaches to high dimensional robotic path planning.
Degree: PhD, Mechanical Engineering, 2018, Georgia Tech
URL: http://hdl.handle.net/1853/62196
► The field of robotic path planning is rich and diverse. As more complicated systems have become automated, the need for simple methods that can navigate…
(more)
▼ The field of robotic path planning is rich and diverse. As more complicated systems have become automated, the need for simple methods that can navigate high dimensional spaces has increased. However, most path planning methods, such as Road Map methods and Search methods, scale exponentially with dimension, making them undesirable for complex robotics. Thus, the Secant and Traveling Artificial Potential Field (TAPF) approaches were developed. The Secant and TAPF approaches are modifications of the general Artificial Potential Field (APF) path planning algorithm with desirable properties that make them ideal for path planning in high dimensional space. All APF methods grow linearly with dimension; however, general APF methods are not guaranteed to converge given an arbitrary field of obstacles, significantly hindering the applicability of the APF algorithm. By specially tuning the artificial forces generated by the Secant and TAPF approaches, these methods can be shown to be globally asymptotically stable at the target location for a point robot in a field of point obstacles. To extend this theory to more practical applications, the concept of a boundary layer was introduced into the path planning algorithm. The boundary layer is a finite radius that encompasses an obstacle, such that the field is transformed within the boundary layer to account for the solid shape. By warping the landscape within the boundary layer, the system becomes mathematically equivalent to avoiding a point in space. From these advancements, the Secant and TAPF approaches were then demonstrated on planar robots and manipulators. These real-world systems were handled by selecting individual points on the robot that need to converge and treating them as separate systems coupled together by the defined constraints. For example, a planar robot is dynamically equivalent to two points constrained by a link. Similarly, a manipulator can be considered to be n points jointed together. With the Secant and TAPF approaches to the APF algorithm, robotic control and path planning can be drastically simplified, even for complex systems.
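The Secant and TAPF force laws are the thesis's own; as the generic baseline they modify, a minimal sketch of standard APF gradient descent with a quadratic attractive potential and an inverse-distance repulsive potential active inside a boundary layer. All constants and positions are illustrative assumptions.

import numpy as np

def apf_step(pos, goal, obstacles, k_att=1.0, k_rep=0.5, influence=1.0, lr=0.05):
    # One gradient-descent step on attractive + repulsive potentials.
    force = -k_att * (pos - goal)                 # quadratic attraction to goal
    for ob in obstacles:
        d = np.linalg.norm(pos - ob)
        if d < influence:                         # repulsion only inside the layer
            force += k_rep * (1.0 / d - 1.0 / influence) / d**3 * (pos - ob)
    return pos + lr * force

pos, goal = np.array([0.0, 0.0]), np.array([5.0, 5.0])
obstacles = [np.array([2.0, 2.1]), np.array([3.5, 3.0])]
for _ in range(500):
    pos = apf_step(pos, goal, obstacles)
print(pos)   # should approach the goal while skirting the obstacles

The per-step cost is linear in the number of dimensions and obstacles, which is the scaling property that motivates APF variants for high dimensional planning; the convergence guarantee, however, is exactly what this generic force law lacks and the Secant and TAPF tunings provide.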
Advisors/Committee Members: Sadegh, Nader (advisor), Hu, Ai-Ping (advisor), Zhou, Hao-Min (committee member), Ueda, Jun (committee member), Isler, Volkan (committee member).
Subjects/Keywords: Artificial; Potential; Field; Secant; Robotic; Path planning; Trajectory; Dynamic; High dimensional
APA (6th Edition):
Ahlin, K. J. (2018). The secant and traveling artificial potential field approaches to high dimensional robotic path planning. (Doctoral Dissertation). Georgia Tech. Retrieved from http://hdl.handle.net/1853/62196
Chicago Manual of Style (16th Edition):
Ahlin, Konrad Jeffrey. “The secant and traveling artificial potential field approaches to high dimensional robotic path planning.” 2018. Doctoral Dissertation, Georgia Tech. Accessed February 28, 2021.
http://hdl.handle.net/1853/62196.
MLA Handbook (7th Edition):
Ahlin, Konrad Jeffrey. “The secant and traveling artificial potential field approaches to high dimensional robotic path planning.” 2018. Web. 28 Feb 2021.
Vancouver:
Ahlin KJ. The secant and traveling artificial potential field approaches to high dimensional robotic path planning. [Internet] [Doctoral dissertation]. Georgia Tech; 2018. [cited 2021 Feb 28].
Available from: http://hdl.handle.net/1853/62196.
Council of Science Editors:
Ahlin KJ. The secant and traveling artificial potential field approaches to high dimensional robotic path planning. [Doctoral Dissertation]. Georgia Tech; 2018. Available from: http://hdl.handle.net/1853/62196

University of Illinois – Urbana-Champaign
28.
Ouyang, Yunbo.
Scalable sparsity structure learning using Bayesian methods.
Degree: PhD, Statistics, 2018, University of Illinois – Urbana-Champaign
URL: http://hdl.handle.net/2142/101264
Learning sparsity patterns in high dimensions is a great challenge in both implementation and theory. In this thesis we develop scalable Bayesian algorithms, based on the EM algorithm and variational inference, to learn sparsity structure in various models; estimation consistency and selection consistency of our methods are established. First, a nonparametric Bayes estimator is proposed for the problem of estimating a sparse sequence based on Gaussian random variables. We adopt the popular two-group prior with one component being a point mass at zero and the other a mixture of Gaussian distributions. Although the Gaussian family has been shown to be suboptimal for this problem, we find that Gaussian mixtures, with a proper choice of means and mixing weights, have the desired asymptotic behavior, e.g., the corresponding posterior concentrates on balls with the desired minimax rate. Second, the above estimator can be applied directly to high dimensional linear classification. In theory, we not only build a bridge connecting the estimation error of the mean difference to the classification error in different scenarios, but also provide sufficient conditions for sub-optimal and optimal classifiers. Third, we study adaptive ridge regression for linear models. Adaptive ridge regression is closely related to the Bayesian variable selection problem with a Gaussian mixture spike-and-slab prior, because it resembles the EM algorithm developed in Wang et al. (2016) for that problem. The output of adaptive ridge regression can be used to construct a distribution estimator that approximates the posterior. We show that the approximate posterior has the desired concentration property and that the adaptive ridge regression estimator has the desired predictive error. Last, we propose a Bayesian approach to sparse principal components analysis (PCA). We show that our algorithm, which is based on variational approximation, achieves Bayesian selection consistency. Empirical studies demonstrate the competitive performance of the proposed algorithm.
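As a concrete illustration of the two-group prior described above, the sketch below computes posterior inclusion probabilities and posterior means for a sparse normal-means sequence. It uses a single Gaussian slab, a special case of the Gaussian-mixture slab studied in the thesis, and the hyperparameters w, tau2, and sigma2 are fixed illustrative assumptions rather than values learned by EM or variational inference.

import numpy as np
from scipy.stats import norm

def two_group_posterior(y, w=0.1, tau2=4.0, sigma2=1.0):
    """Posterior quantities under theta_i ~ (1 - w) * delta_0 + w * N(0, tau2),
    with observations y_i = theta_i + N(0, sigma2) noise.
    Returns (posterior means, posterior inclusion probabilities)."""
    y = np.asarray(y, dtype=float)
    # Marginal densities of y_i under the null (theta_i = 0) and the slab.
    f0 = norm.pdf(y, loc=0.0, scale=np.sqrt(sigma2))
    f1 = norm.pdf(y, loc=0.0, scale=np.sqrt(sigma2 + tau2))
    # Posterior probability that theta_i is nonzero.
    p = w * f1 / (w * f1 + (1.0 - w) * f0)
    # Given theta_i != 0, the posterior mean shrinks y_i by tau2/(sigma2+tau2).
    return p * (tau2 / (sigma2 + tau2)) * y, p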
Advisors/Committee Members: Liang, Feng (advisor), Liang, Feng (Committee Chair), Qu, Annie (committee member), Narisetty, Naveen N (committee member), Zhu, Ruoqing (committee member).
Subjects/Keywords: Bayesian statistics; high-dimensional data analysis; variable selection

Texas A&M University
29.
Song, Qifan.
Variable Selection for Ultra High Dimensional Data.
Degree: PhD, Statistics, 2014, Texas A&M University
URL: http://hdl.handle.net/1969.1/153224
Variable selection plays an important role in high dimensional data analysis. In this work, we first propose a Bayesian variable selection approach for ultra-high dimensional linear regression based on a split-and-merge strategy. The proposed approach consists of two stages: (i) split the ultra-high dimensional data set into a number of lower dimensional subsets and select relevant variables from each of the subsets, and (ii) aggregate the variables selected from each subset and then select relevant variables from the aggregated data set. Since the proposed approach has an embarrassingly parallel structure, it can easily be implemented in a parallel architecture and applied to big data problems with millions of explanatory variables or more. Under mild conditions, we show that the proposed approach is consistent: asymptotically, the true explanatory variables will be correctly identified as the sample size becomes large. Extensive comparisons have been made with penalized likelihood approaches such as Lasso, elastic net, SIS and ISIS. The numerical results show that the proposed approach generally outperforms the penalized likelihood approaches; the models it selects tend to be sparser and closer to the true model.
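The two-stage split-and-merge strategy can be sketched as follows, with cross-validated lasso standing in for the thesis's Bayesian selection procedure in each stage; the subset count and the nonzero threshold are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LassoCV

def split_and_merge_select(X, y, n_subsets=10, seed=0):
    """Two-stage selection: (i) partition the columns into subsets and screen
    each one separately (embarrassingly parallel), then (ii) pool the
    survivors and select again on the aggregated columns."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X.shape[1])
    survivors = []
    for block in np.array_split(perm, n_subsets):
        fit = LassoCV(cv=5).fit(X[:, block], y)        # stage (i) screening
        survivors.extend(block[np.abs(fit.coef_) > 1e-8])
    survivors = np.array(sorted(survivors))
    if survivors.size == 0:
        return survivors
    fit = LassoCV(cv=5).fit(X[:, survivors], y)        # stage (ii) re-selection
    return survivors[np.abs(fit.coef_) > 1e-8]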
In the frequentist realm, penalized likelihood methods have been widely used in variable selection problems, where the penalty functions are typically symmetric about 0, continuous, and nondecreasing in (0,∞). The second contribution of this work is a new penalized likelihood method, the reciprocal Lasso (rLasso for short), based on a new class of penalty functions which are decreasing in (0,∞), discontinuous at 0, and converge to infinity as the coefficients approach zero. The new penalty functions assign infinite penalties to nearly-zero coefficients; in contrast, conventional penalty functions assign nearly-zero coefficients either nearly-zero penalties (e.g., Lasso and SCAD) or constant penalties (e.g., the L0 penalty). This distinguishing feature makes rLasso very attractive for variable selection: it can effectively avoid selecting overly dense models. We establish the consistency of rLasso for variable selection and coefficient estimation in both the low and high dimensional settings. Since the rLasso penalty functions induce an objective function with multiple local minima, we also propose an efficient Monte Carlo optimization algorithm to solve the minimization problem. Our simulation results show that rLasso outperforms other popular penalized likelihood methods, such as Lasso, SCAD, MCP, SIS, ISIS and EBIC: it produces sparser and more accurate coefficient estimates and has a higher probability of recovering the true model.
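A sketch of an rLasso-type objective, with a naive stochastic search over supports, follows. The penalty lambda/|beta_j| for beta_j != 0 (and zero at beta_j = 0) reproduces the shape described above: decreasing on (0,∞), discontinuous at 0, and diverging as coefficients approach zero. Refitting least squares on each trial support and accepting improving single-coordinate flips is a crude stand-in for the thesis's Monte Carlo optimization algorithm.

import numpy as np

def rlasso_objective(y, X, beta, lam):
    """0.5 * ||y - X beta||^2 + lam * sum over nonzero j of 1/|beta_j|.
    An exact zero contributes no penalty; a near-zero coefficient is
    penalized heavily, the reverse of Lasso's behavior near zero."""
    resid = y - X @ beta
    nz = beta != 0
    return 0.5 * resid @ resid + lam * np.sum(1.0 / np.abs(beta[nz]))

def rlasso_search(y, X, lam, n_iter=2000, seed=0):
    """Randomly flip one coordinate in or out of the support, refit least
    squares on the trial support, and keep the flip if the penalized
    objective improves."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    support = np.zeros(p, dtype=bool)
    beta = np.zeros(p)
    best = rlasso_objective(y, X, beta, lam)
    for _ in range(n_iter):
        trial = support.copy()
        j = rng.integers(p)
        trial[j] = ~trial[j]
        b = np.zeros(p)
        if trial.any():
            b[trial], *_ = np.linalg.lstsq(X[:, trial], y, rcond=None)
        obj = rlasso_objective(y, X, b, lam)
        if obj < best:
            support, beta, best = trial, b, obj
    return beta

Note how the diverging penalty near zero makes the search self-pruning: a flip that adds a variable whose refitted coefficient lands near zero incurs an enormous penalty and is rejected, which is exactly the mechanism by which rLasso avoids overly dense models.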
Advisors/Committee Members: Liang, Faming (advisor), Carroll, Raymond (committee member), Johnson, Valen (committee member), Lahiri, Soumendra (committee member), Zhou, Jianxin (committee member).
Subjects/Keywords: High Dimensional Variable Selection; Big Data; Penalized Likelihood Approach; Posterior Consistency

McMaster University
30.
Pichika, Sathish chandra.
Sparse Canonical Correlation Analysis (SCCA): A Comparative Study.
Degree: MSc, 2011, McMaster University
URL: http://hdl.handle.net/11375/11779
Canonical Correlation Analysis (CCA) is one of the multivariate statistical methods that can be used to find relationships between two sets of variables. I highlighted the challenges in analyzing high-dimensional data with CCA. Recently, Sparse CCA (SCCA) methods have been proposed to identify sparse linear combinations of two sets of variables with maximal correlation in the context of high-dimensional data. In my thesis, I compared three different SCCA approaches; I evaluated them, as well as classical CCA, on simulated datasets and illustrated the methods with publicly available genomic and proteomic datasets.
Master of Science (MSc)
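One common SCCA formulation, in the spirit of the penalized matrix decomposition of Witten et al. (2009), alternates soft-thresholding and renormalization of the two canonical weight vectors while treating the within-set covariances as identity; a minimal sketch follows. It is not necessarily one of the three approaches compared in the thesis, and the thresholds t_u and t_v are illustrative tuning constants.

import numpy as np

def soft_threshold(a, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def scca(X, Y, t_u=0.1, t_v=0.1, n_iter=100, seed=0):
    """Alternately update sparse canonical weight vectors u and v by
    soft-thresholding C v and C^T u, where C is the cross-covariance of
    the centered data (within-set covariances treated as identity)."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (X.shape[0] - 1)      # cross-covariance matrix
    v = rng.standard_normal(C.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        u = soft_threshold(C @ v, t_u)
        u /= max(np.linalg.norm(u), 1e-12)
        v = soft_threshold(C.T @ u, t_v)
        v /= max(np.linalg.norm(v), 1e-12)
    return u, v                           # sparse canonical weight vectors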
Advisors/Committee Members: Beyene, Joseph, Narayanaswamy Balakrishnan and Aaron Childs, Mathematics and Statistics.
Subjects/Keywords: CCA; SCCA; High-Dimensional; Multivariate Analysis