Model development including interactions with multiple imputed data
dc.contributor.author | Hendry, Gillian M | |
dc.contributor.author | Naidoo, Rajen N | |
dc.contributor.author | Zewotir, Temesgen | |
dc.contributor.author | North, Delia | |
dc.contributor.author | Mentz, Graciela | |
dc.date.accessioned | 2015-01-11T19:02:13Z | |
dc.date.available | 2015-01-11T19:02:13Z | |
dc.date.issued | 2014-12-19 | |
dc.identifier.citation | BMC Medical Research Methodology. 2014 Dec 19;14(1):136 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/110125 | en_US |
dc.description.abstract | Background: Multiple imputation is a reliable tool to deal with missing data and is becoming increasingly popular in biostatistics. However, building a model with interactions that are not specified a priori, in the presence of missing data, presents a challenge. On the one hand, the interactions are needed to impute the data, while on the other hand, the data is needed to identify the interactions. The objective of this study was to present a way in which this challenge can be addressed. Methods: This paper investigates two strategies in which model development with interactions is achieved using a single data set generated from the Expectation Maximization (EM) algorithm. Imputation using both the fully conditional specification approach and the multivariate normal approach is carried out and results are compared. The strategies are illustrated with data from a study of ambient pollution and childhood asthma in Durban, South Africa. Results: The different approaches to model building and imputation yielded similar results despite the data being mainly categorical. Both strategies investigated for building the model using the multivariate normal imputed data resulted in the identical set of variables and interactions being identified; while models built using data imputed by fully conditional specification were marginally different for the two strategies. It was found that, for both imputation approaches, model building with backward elimination applied to the initial EM data set was easier to implement, and produced good results, compared to those from a complete case analysis. Conclusions: Developing a predictive model including interactions with data that suffers from missingness is easily done by identifying significant interactions and then applying backward elimination to a single data set imputed from the EM algorithm. It is hoped that this idea can be further developed and, by addressing this practical dilemma, there will be increased adoption of multiple imputation in medical research when data suffers from missingness. | |
dc.title | Model development including interactions with multiple imputed data | |
dc.type | Article | en_US |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/110125/1/12874_2014_Article_1145.pdf | |
dc.identifier.doi | 10.1186/1471-2288-14-136 | en_US |
dc.language.rfc3066 | en | |
dc.rights.holder | Hendry et al.; licensee BioMed Central. | |
dc.date.updated | 2015-01-11T19:02:17Z | |
dc.owningcollname | Interdisciplinary and Peer-Reviewed |