Model development including interactions with multiple imputed data

Hendry, Gillian M; Naidoo, Rajen N; Zewotir, Temesgen; North, Delia; Mentz, Graciela

dc.contributor.author	Hendry, Gillian M
dc.contributor.author	Naidoo, Rajen N
dc.contributor.author	Zewotir, Temesgen
dc.contributor.author	North, Delia
dc.contributor.author	Mentz, Graciela
dc.date.accessioned	2015-01-11T19:02:13Z
dc.date.available	2015-01-11T19:02:13Z
dc.date.issued	2014-12-19
dc.identifier.citation	BMC Medical Research Methodology. 2014 Dec 19;14(1):136
dc.identifier.uri	https://hdl.handle.net/2027.42/110125	en_US
dc.description.abstract	Background: Multiple imputation is a reliable tool to deal with missing data and is becoming increasingly popular in biostatistics. However, building a model with interactions that are not specified a priori, in the presence of missing data, presents a challenge. On the one hand, the interactions are needed to impute the data, while on the other hand, the data is needed to identify the interactions. The objective of this study was to present a way in which this challenge can be addressed. Methods: This paper investigates two strategies in which model development with interactions is achieved using a single data set generated from the Expectation Maximization (EM) algorithm. Imputation using both the fully conditional specification approach and the multivariate normal approach is carried out and results are compared. The strategies are illustrated with data from a study of ambient pollution and childhood asthma in Durban, South Africa. Results: The different approaches to model building and imputation yielded similar results despite the data being mainly categorical. Both strategies investigated for building the model using the multivariate normal imputed data resulted in the identical set of variables and interactions being identified, while models built using data imputed by fully conditional specification were marginally different for the two strategies. It was found that, for both imputation approaches, model building with backward elimination applied to the initial EM data set was easier to implement, and produced good results, compared to those from a complete case analysis. Conclusions: Developing a predictive model including interactions with data that suffers from missingness is easily done by identifying significant interactions and then applying backward elimination to a single data set imputed from the EM algorithm. It is hoped that this idea can be further developed and, by addressing this practical dilemma, there will be increased adoption of multiple imputation in medical research when data suffers from missingness.
dc.title	Model development including interactions with multiple imputed data
dc.type	Article	en_US
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/110125/1/12874_2014_Article_1145.pdf
dc.identifier.doi	10.1186/1471-2288-14-136	en_US
dc.language.rfc3066	en
dc.rights.holder	Hendry et al.; licensee BioMed Central.
dc.date.updated	2015-01-11T19:02:17Z
dc.owningcollname	Interdisciplinary and Peer-Reviewed

Files in this item

Name: 12874_2014_Article_1145.pdf
Size: 530.9KB
Format: PDF
