Accounting for not‐at‐random missingness through imputation stacking

Beesley, Lauren J.; Taylor, Jeremy M. G.

dc.contributor.author	Beesley, Lauren J.
dc.contributor.author	Taylor, Jeremy M. G.
dc.date.accessioned	2021-12-02T02:31:00Z
dc.date.available	2022-12-01 21:30:58	en
dc.date.available	2021-12-02T02:31:00Z
dc.date.issued	2021-11-30
dc.identifier.citation	Beesley, Lauren J.; Taylor, Jeremy M. G. (2021). "Accounting for not‐at‐random missingness through imputation stacking." Statistics in Medicine 40(27): 6118-6132.
dc.identifier.issn	0277-6715
dc.identifier.issn	1097-0258
dc.identifier.uri	https://hdl.handle.net/2027.42/171019
dc.description.abstract	Not‐at‐random missingness presents a challenge in addressing missing data in many health research applications. In this article, we propose a new approach to account for not‐at‐random missingness after multiple imputation through weighted analysis of stacked multiple imputations. The weights are easily calculated as a function of the imputed data and assumptions about the not‐at‐random missingness. We demonstrate through simulation that the proposed method has excellent performance when the missingness model is correctly specified. In practice, the missingness mechanism will not be known. We show how we can use our approach in a sensitivity analysis framework to evaluate the robustness of model inference to different assumptions about the missingness mechanism, and we provide R package StackImpute to facilitate implementation as part of routine sensitivity analyses. We apply the proposed method to account for not‐at‐random missingness in human papillomavirus test results in a study of survival for patients diagnosed with oropharyngeal cancer.
dc.publisher	John Wiley and Sons, Inc
dc.subject.other	stacked imputation
dc.subject.other	chained equations multiple imputation
dc.subject.other	fully conditional specification
dc.subject.other	not‐at‐random missingness
dc.subject.other	sensitivity analysis
dc.title	Accounting for not‐at‐random missingness through imputation stacking
dc.type	Article
dc.rights.robots	IndexNoFollow
dc.subject.hlbsecondlevel	Statistics and Numeric Data
dc.subject.hlbsecondlevel	Public Health
dc.subject.hlbsecondlevel	Medicine (General)
dc.subject.hlbtoplevel	Social Sciences
dc.subject.hlbtoplevel	Health Sciences
dc.subject.hlbtoplevel	Science
dc.description.peerreviewed	Peer Reviewed
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/171019/1/sim9174-sup-0001-supinfo.pdf
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/171019/2/sim9174_am.pdf
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/171019/3/sim9174.pdf
dc.identifier.doi	10.1002/sim.9174
dc.identifier.source	Statistics in Medicine
dc.working.doi	NO	en
dc.owningcollname	Interdisciplinary and Peer-Reviewed

Files in this item

Name: sim9174-sup-0001-supinfo.pdf
Size: 446.9 KB
Format: PDF

Name: sim9174_am.pdf
Size: 946.7 KB
Format: PDF

Name: sim9174.pdf
Size: 867.5 KB
Format: PDF
