High-Dimensional Statistical Inference: Phase Transition, Power Enhancement, and Sampling

dc.contributor.author: He, Yinqiu
dc.date.accessioned: 2021-09-24T19:26:26Z
dc.date.available: 2021-09-24T19:26:26Z
dc.date.issued: 2021
dc.identifier.uri: https://hdl.handle.net/2027.42/169995
dc.description.abstract: The "Big Data" era features large amounts of high-dimensional data, in which the number of characteristics per subject is large. The high dimensionality of such data poses many new challenges for statistical inference, including (I) the invalidity of classical approximation theory, (II) the loss of statistical power, and (III) the increase in computational burden. This dissertation studies three important problems that arise in this context. (I) The first part introduces a newly discovered phase transition phenomenon of the widely used likelihood ratio tests. It is broadly recognized that classical large-sample approximation theory that is valid under finite dimensions may fail under high dimensions, but there is usually a lack of understanding of when such a transition happens as the data dimension increases. This issue can hinder the validation of statistical inference in practice. Focusing on the popular likelihood ratio tests, we derive necessary and sufficient conditions characterizing the phase transition boundaries where Wilks' theorem becomes invalid. Based on this, we further obtain a sharp characterization of the approximation bias of Wilks' theorem. (II) The second part proposes a novel adaptive testing framework that can maintain high statistical power against a variety of alternative hypotheses. In particular, many scientific questions in high-dimensional data analyses can be formulated as testing high-dimensional parameters globally, e.g., testing whether there exists any association between a large number of SNPs and a certain heritable disease in genome-wide association studies. In these problems, many existing methods are designed to capture certain directional information in a high-dimensional space and are thus powerful only for specific alternatives. To enhance the statistical power, we construct an innovative family of test statistics that can capture the information in different directions of a high-dimensional space. For a broad class of problems, we establish high-dimensional asymptotic theory for the constructed statistics and develop testing procedures that are adaptively powerful across a wide range of scenarios. (III) The third part concerns the computational challenge of quantifying rare-event probabilities in statistical inference. In particular, analyzing high-dimensional data frequently involves a large number of hypotheses and results in stringent significance thresholds. It is therefore often necessary to accurately estimate an extreme tail probability of each test statistic. However, analytical formulae are usually unavailable for nontrivial statistics, and naive Monte Carlo methods usually require a huge number of simulations and are computationally costly. Driven by rare-event issues arising from testing covariance structures, we develop an asymptotically efficient importance sampling algorithm to compute the extreme tail probabilities of the popular ratio statistic of the largest eigenvalue to the trace of a Wishart matrix.
dc.language.iso: en_US
dc.subject: High-dimensional statistics
dc.title: High-Dimensional Statistical Inference: Phase Transition, Power Enhancement, and Sampling
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Statistics
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: He, Xuming
dc.contributor.committeemember: Xu, Gongjun
dc.contributor.committeemember: Song, Peter Xuekun
dc.contributor.committeemember: Sun, Yuekai
dc.subject.hlbtoplevel: Science
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/169995/1/yqhe_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/3040
dc.identifier.orcid: 0000-0002-4829-033X
dc.identifier.name-orcid: He, Yinqiu; 0000-0002-4829-033X (en_US)
dc.working.doi: 10.7302/3040 (en)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)


