Group Sparsity in Regression and PCA
dc.contributor.author | Deng, Yanzhen | |
dc.date.accessioned | 2019-10-01T18:22:36Z | |
dc.date.available | NO_RESTRICTION | |
dc.date.available | 2019-10-01T18:22:36Z | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/151380 | |
dc.description.abstract | In high-dimensional statistics, it is commonly assumed that only a small subset of the variables is relevant, and sparse estimators are pursued to exploit this assumption. Sparse estimation methodologies are often straightforward to construct, and there is now a full spectrum of sparse algorithms covering most statistical learning problems. Theoretical development, in contrast, is more limited and often focuses on asymptotic theory; in applications, non-asymptotic results may be more relevant. The goal of this work is to show how non-asymptotic statistical theory can be developed for sparse estimation problems that assume group sparsity. We discuss three problems: principal component analysis (PCA), sliced inverse regression (SIR), and multivariate regression. For PCA, we study a two-stage thresholding algorithm and provide theory that goes beyond the common spiked-covariance model. SIR is then related to PCA in certain special settings, and we show that the theory of sparse PCA can be modified to apply to SIR. Regression represents another important research direction in high-dimensional analysis: we study a linear regression model in which both the response and the predictors are grouped, as an extension of the group Lasso. Despite the distinctions among these problems, the proofs of consistency and support recovery share common elements: concentration inequalities and union probability bounds, which are also the foundation of most existing sparse estimation theory. The proofs are presented in modules in order to reveal clearly how most sparse estimators can be theoretically justified. Moreover, we identify the modules that are possibly not optimized, showing the limitations of existing proof techniques and how they could be extended. | |
dc.language.iso | en_US | |
dc.subject | Sparse estimation | |
dc.subject | Principal component analysis | |
dc.subject | Sliced inverse regression | |
dc.subject | Group Lasso | |
dc.subject | High dimensional data | |
dc.subject | Non-asymptotic theory | |
dc.title | Group Sparsity in Regression and PCA | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Statistics | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Hsing, Tailen | |
dc.contributor.committeemember | Hero III, Alfred O | |
dc.contributor.committeemember | Tewari, Ambuj | |
dc.contributor.committeemember | Zhu, Ji | |
dc.subject.hlbsecondlevel | Statistics and Numeric Data | |
dc.subject.hlbtoplevel | Science | |
dc.description.bitstreamurl | https://deepblue.lib.umich.edu/bitstream/2027.42/151380/1/dengyz_1.pdf | |
dc.identifier.orcid | 0000-0003-0213-7994 | |
dc.identifier.name-orcid | Deng, Yanzhen; 0000-0003-0213-7994 | en_US |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |
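The abstract describes group-sparse estimation in the spirit of the group Lasso, where whole groups of coefficients are kept or discarded together. As a minimal illustration (not the thesis's own algorithm), the following sketch implements groupwise soft-thresholding, the proximal operator of the group-lasso penalty; the function name, toy data, and grouping are illustrative assumptions:

```python
import numpy as np

def group_soft_threshold(beta, groups, lam):
    """Block soft-thresholding: the proximal operator of the penalty
    lam * sum_g ||beta_g||_2. Each group's coefficients are shrunk
    toward zero jointly, so a group is either kept or dropped whole."""
    out = np.zeros_like(beta, dtype=float)
    for g in groups:
        norm = np.linalg.norm(beta[g])
        if norm > lam:
            # Shrink the whole group's norm by lam, preserving direction.
            out[g] = (1.0 - lam / norm) * beta[g]
        # Groups with norm <= lam are set exactly to zero (group selection).
    return out

# Toy example with two groups: the small-norm group is zeroed out entirely.
beta = np.array([3.0, 4.0, 0.1, -0.1])
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(beta, groups, lam=1.0)
print(shrunk)  # first group shrunk but nonzero; second group exactly zero
```

This groupwise shrinkage is what distinguishes group sparsity from the ordinary Lasso, which thresholds each coefficient independently.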