Statistical Learning for Latent Attribute Models

dc.contributor.author: Ma, Chenchen
dc.date.accessioned: 2022-09-06T16:27:08Z
dc.date.available: 2022-09-06T16:27:08Z
dc.date.issued: 2022
dc.date.submitted: 2022
dc.identifier.uri: https://hdl.handle.net/2027.42/174627
dc.description.abstract: Latent variable models are widely used in unsupervised learning to uncover the latent structures underlying observed data and have seen great success in representation learning across many applications and scientific disciplines. Latent attribute models, also known as cognitive diagnosis models or diagnostic classification models, are a special family of discrete latent variable models that have been widely applied for diagnostic purposes in modern psychological and biomedical research. Despite their wide use in various fields, the models' discrete nature and complex restricted structures pose many new challenges for efficient learning and statistical inference. Moreover, with the large-scale item and subject pools emerging in modern educational and psychological measurement, efficient algorithms for uncovering the latent structures of both items and subjects are needed. This dissertation studies four important problems that arise in this context. (I) The first part develops novel methodologies and efficient algorithms to learn the latent and hierarchical structures in latent attribute models. Researchers in many applications are interested in hierarchical structures among the latent attributes, such as prerequisite relationships among target skills in educational settings. However, in most cognitive diagnosis applications, the number of latent attributes, the attribute-attribute hierarchical structure, the item-attribute dependence structure, and the item-level diagnostic models must be fully or partially pre-specified, which can be subjective and prone to misspecification, as noted in many recent studies. This part considers the problem of jointly learning these latent quantities and hierarchical structures from observed data with minimal model assumptions. A penalized likelihood approach is proposed for joint learning, an Expectation-Maximization (EM) algorithm is developed for efficient computation, and statistical consistency theory is established under mild conditions. (II) The second part generalizes the methodology of part I to simultaneously infer the subgroup structures of both subjects and items. We consider model-based co-clustering algorithms that automatically select the numbers of clusters and uncover latent block structures. Specifically, building on latent block models, we propose a penalized co-clustering approach that learns the numbers of clusters and the inner block structures simultaneously. Efficient EM algorithms are developed, and comprehensive simulation studies demonstrate their superiority. (III) The third part concerns the important yet unaddressed problem of testing the latent hierarchical structures in latent attribute models. Testing the hierarchical structure is shown to be equivalent to testing the sparsity structure of the proportion parameter vector. However, due to the irregularity of the problem, the asymptotic distribution of the popular likelihood ratio test becomes nonstandard and tends to give unsatisfactory finite-sample performance under practical conditions. To tackle these challenges, we discuss conditions for testability, provide a statistical understanding of the failures, and propose a practical resampling-based procedure. (IV) The fourth part introduces a unified estimation framework that bridges the gap between parametric and nonparametric methods in cognitive diagnosis and clarifies their relationship. In particular, a number of parametric and nonparametric methods for estimating latent attribute models have been developed and applied in a wide range of contexts. However, a wide chasm exists in the literature between these two families of methods, and their relationship to each other is not well understood. Motivated by this divide, we propose a unified framework and provide both theoretical analysis and practical recommendations under various cognitive diagnosis settings.
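
To make the penalized-likelihood idea in part (I) concrete, the following Python sketch is offered purely as an illustration and is not the dissertation's algorithm. It fits a plain Bernoulli latent class model with an over-specified number of classes K and soft-thresholds the estimated mixing proportions in the M-step so that spurious classes shrink toward zero; the model choice, the penalty form, and all names (penalized_em, lam, K) are assumptions made only for this example.

# Illustrative sketch only (not the dissertation's method): a penalized EM for a
# Bernoulli latent class model. An over-specified number of classes K is fitted,
# and a simple shrinkage penalty on the mixing proportions drives spurious
# classes toward zero, mimicking penalized-likelihood selection of the latent
# structure. The penalty form and all names here are illustrative assumptions.
import numpy as np

def penalized_em(Y, K, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Y: (n, J) binary response matrix; K: upper bound on the number of classes."""
    rng = np.random.default_rng(seed)
    n, J = Y.shape
    pi = np.full(K, 1.0 / K)                     # mixing proportions
    theta = rng.uniform(0.2, 0.8, size=(K, J))   # item-response probabilities
    for _ in range(n_iter):
        # E-step: posterior class-membership probabilities
        log_lik = Y @ np.log(theta.T + eps) + (1 - Y) @ np.log(1 - theta.T + eps)
        log_post = log_lik + np.log(pi + eps)
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
        # M-step with soft-thresholded class counts (shrinks small classes to zero)
        counts = post.sum(axis=0)
        pi = np.maximum(counts - lam, 0.0)
        if pi.sum() == 0:                        # degenerate case: unpenalized update
            pi = counts
        pi = pi / pi.sum()
        theta = (post.T @ Y + eps) / (counts[:, None] + 2 * eps)
    return pi, theta, post

# Example: simulate data from 3 true classes, fit with an over-specified K = 6;
# classes whose estimated proportion is (near) zero are effectively removed.
rng = np.random.default_rng(1)
true_theta = rng.uniform(0.1, 0.9, size=(3, 15))
z = rng.integers(0, 3, size=500)
Y = (rng.random((500, 15)) < true_theta[z]).astype(int)
pi_hat, theta_hat, _ = penalized_em(Y, K=6, lam=5.0)
print(np.round(pi_hat, 3))

If the shrinkage works as intended, the surplus classes receive (near) zero estimated proportions, which is the behaviour that penalty-based selection of latent structure relies on; the dissertation's actual approach additionally handles attribute hierarchies and item-attribute structures, which this toy example does not attempt.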
dc.language.iso: en_US
dc.subject: Latent Variable Models
dc.subject: Cognitive Diagnosis
dc.subject: Hierarchical Structures
dc.subject: Co-clustering
dc.subject: Hypothesis Testing
dc.subject: Nonparametric
dc.title: Statistical Learning for Latent Attribute Models
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Statistics
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Xu, Gongjun
dc.contributor.committeemember: Wu, Zhenke
dc.contributor.committeemember: Tan, Kean Ming
dc.contributor.committeemember: Zhu, Ji
dc.subject.hlbsecondlevel: Statistics and Numeric Data
dc.subject.hlbsecondlevel: Education
dc.subject.hlbsecondlevel: Psychology
dc.subject.hlbtoplevel: Science
dc.subject.hlbtoplevel: Social Sciences
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/174627/1/chenchma_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/6358
dc.identifier.orcid: 0000-0003-2784-9920
dc.identifier.name-orcid: Ma, Chenchen; 0000-0003-2784-9920 (en_US)
dc.working.doi: 10.7302/6358 (en)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

