Causal inference for data subject to non-compliance and missing values.

Peng, Yahong

2001

Abstract

Non-compliance is very common in randomized experiments involving human participants. The intent-to-treat method and the as-treated method are two common methods for estimating treatment effects in randomized experiments, but both have limitations. The intent-to-treat method estimates the effect of treatment allocation rather than of the treatment itself, and the as-treated method is subject to selection bias. The Rubin Causal Model (RCM) provides a useful alternative way to analyze data subject to non-compliance. The parameter of interest in the RCM approach is the treatment effect for the sub-population of compliers, called the Complier-Average Causal Effect (CACE). In addition to non-compliance, missing values in the outcome and the baseline covariates further complicate the data analysis. Under the Rubin Causal Model framework, this dissertation develops new models for causal inference from experiments subject to non-compliance and missing values. Two problems are considered: (1) inference for the CACE for discrete outcomes with non-compliance and missing values only in the outcomes; (2) inference for the CACE for discrete or continuous outcomes with non-compliance and missing values in both the outcomes and the covariates. A non-hierarchical form of loglinear models (Agresti, 1990; McCullagh and Nelder, 1989) is proposed for the first problem, and an extension of the general location model (Olkin and Tate, 1961; Little and Schluchter, 1985) is proposed for the second problem. Models are developed for both ignorable and latent ignorable missing data mechanisms as described in Frangakis and Rubin (1999). Inferences under these models are based on EM algorithms and Bayesian Markov chain Monte Carlo methods. In addition, simulation studies are carried out comparing the likelihood-based approaches in this dissertation with existing methods, specifically the instrumental variable (IV) methods in the social science literature. Sensitivity of the inference to model assumptions and the influence of missing data mechanisms are also investigated by simulation. Based on the simulation studies, likelihood-based methods appear to be more efficient than IV methods, and inferences for the CACE appear to be quite robust to lack of normality in the distribution of continuous outcomes for both likelihood-based and IV approaches. Results of the IV methods and the likelihood-based methods for binary outcomes are more sensitive to misspecification of the probit model. The proposed methods are applied to data from a Job Search prevention intervention for unemployed workers, which motivated this research; the results indicate that the intervention helps to increase the re-employment rate among compliers.
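
As a rough illustration of the estimands contrasted in the abstract, the following sketch simulates a randomized experiment with one-sided non-compliance and compares the intent-to-treat estimate, the as-treated estimate, and the simple IV (Wald) estimate of the CACE. It is not the dissertation's likelihood-based method and involves no missing data. The function names, the complier proportion, the effect size, and the data-generating process are all hypothetical, and the IV estimate relies on the usual exclusion restriction and monotonicity assumptions.

```python
import random


def simulate(n=20000, p_complier=0.6, effect=2.0, seed=0):
    """Simulate (assignment, treatment received, outcome) triples.

    Hypothetical setup: one-sided non-compliance, so only compliers
    assigned to treatment actually receive it.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5              # randomized assignment
        complier = rng.random() < p_complier
        d = z and complier                  # treatment actually received
        # Compliers have a higher baseline outcome; this is what biases
        # the as-treated comparison.
        baseline = 1.0 if complier else 0.0
        y = baseline + effect * d + rng.gauss(0.0, 1.0)
        rows.append((int(z), int(d), y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def estimates(rows):
    """Return intent-to-treat, as-treated, and IV (Wald) CACE estimates."""
    # Intent-to-treat: compare outcomes by assignment.
    itt = (mean(y for z, d, y in rows if z == 1)
           - mean(y for z, d, y in rows if z == 0))
    # As-treated: compare outcomes by treatment received.
    as_treated = (mean(y for z, d, y in rows if d == 1)
                  - mean(y for z, d, y in rows if d == 0))
    # Effect of assignment on treatment receipt (estimated complier share).
    uptake = (mean(d for z, d, y in rows if z == 1)
              - mean(d for z, d, y in rows if z == 0))
    # Wald estimator: ITT effect on the outcome scaled by the uptake.
    cace_iv = itt / uptake
    return {"itt": itt, "as_treated": as_treated, "cace_iv": cace_iv}


if __name__ == "__main__":
    for name, value in estimates(simulate()).items():
        print(f"{name}: {value:.3f}")
```

In this hypothetical setup the intent-to-treat estimate is diluted toward zero by the never-takers, the as-treated estimate is inflated because compliers differ from never-takers at baseline, and the IV ratio should land near the simulated complier effect.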

Subjects

Causal Inference

Compliance

Data

Missing Values

Non

Noncompliance

Subject

Types

Thesis

Handle

https://hdl.handle.net/2027.42/126233

URI

http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:3016933

Collections

Dissertations and Theses (Ph.D. and Master's)
