Poststratification fusion learning in longitudinal data analysis

dc.contributor.authorTang, Lu
dc.contributor.authorSong, Peter X.-K.
dc.date.accessioned2021-10-05T15:05:02Z
dc.date.available2022-10-05 11:04:59
dc.date.available2021-10-05T15:05:02Z
dc.date.issued2021-09
dc.identifier.citationTang, Lu; Song, Peter X.-K. (2021). "Poststratification fusion learning in longitudinal data analysis." Biometrics 77(3): 914-928.
dc.identifier.issn0006-341X
dc.identifier.issn1541-0420
dc.identifier.urihttps://hdl.handle.net/2027.42/170201
dc.description.abstractStratification is a very commonly used approach in biomedical studies to handle sample heterogeneity arising from, for example, clinical units, patient subgroups, or missing data. A key rationale behind such an approach is to overcome potential sampling biases in statistical inference. Two issues with such a stratification-based strategy are (i) whether individual strata are sufficiently distinctive to warrant stratification, and (ii) whether sample size attrition resulting from the stratification may lead to loss of statistical power. To address these issues, we propose a penalized generalized estimating equations approach to reducing the complexity of parametric model structures due to excessive stratification. Specifically, we develop a data-driven fusion learning approach for longitudinal data that improves estimation efficiency by integrating information across similar strata, yet still allows necessary separation for stratum-specific conclusions. The proposed method is evaluated by simulation studies and applied to a motivating example from a psychiatric study to demonstrate its usefulness in real-world settings.
dc.publisherCRC Press
dc.publisherWiley Periodicals, Inc.
dc.subject.otherGEE
dc.subject.otherpattern-mixture model
dc.subject.otherregularization
dc.subject.otherstratification
dc.subject.othervariable selection
dc.titlePoststratification fusion learning in longitudinal data analysis
dc.typeArticle
dc.rights.robotsIndexNoFollow
dc.subject.hlbsecondlevelMathematics
dc.subject.hlbtoplevelScience
dc.description.peerreviewedPeer Reviewed
dc.description.bitstreamurlhttp://deepblue.lib.umich.edu/bitstream/2027.42/170201/1/biom13333.pdf
dc.description.bitstreamurlhttp://deepblue.lib.umich.edu/bitstream/2027.42/170201/2/biom13333-sup-0002-Appendices.pdf
dc.description.bitstreamurlhttp://deepblue.lib.umich.edu/bitstream/2027.42/170201/3/biom13333_am.pdf
dc.identifier.doi10.1111/biom.13333
dc.identifier.sourceBiometrics
dc.identifier.citedreferenceShen, X. and Huang, H.-C. (2010) Grouping pursuit through a regularization solution surface. Journal of the American Statistical Association, 105, 727-739.
dc.identifier.citedreferenceLittle, R.J. (1993) Pattern-mixture models for multivariate incomplete data. Journal of the American Statistical Association, 88, 125-134.
dc.identifier.citedreferenceJohnson, B.A., Lin, D. and Zeng, D. (2008) Penalized estimating functions and variable selection in semiparametric regression models. Journal of the American Statistical Association, 103, 672-680.
dc.identifier.citedreferenceJorgensen, B. (1997) The Theory of Dispersion Models. Boca Raton, FL: CRC Press.
dc.identifier.citedreferenceKe, Z.T., Fan, J. and Wu, Y. (2015) Homogeneity pursuit. Journal of the American Statistical Association, 110, 175-194.
dc.identifier.citedreferenceKroenke, K., Spitzer, R.L. and Williams, J.B. (2001) The PHQ-9: validity of a brief depression severity measure. Journal of General Internal Medicine, 16, 606-613.
dc.identifier.citedreferenceMa, S. and Huang, J. (2017) A concave pairwise fusion approach to subgroup analysis. Journal of the American Statistical Association, 112, 410-423.
dc.identifier.citedreferenceNeuhaus, J.M., Kalbfleisch, J.D. and Hauck, W.W. (1991) A comparison of cluster-specific and population-averaged approaches for analyzing correlated binary data. International Statistical Review, 59, 25-35.
dc.identifier.citedreferenceOllier, E., Samson, A., Delavenne, X. and Viallon, V. (2016) A SAEM algorithm for fused lasso penalized nonlinear mixed effect models: application to group comparison in pharmacokinetics. Computational Statistics & Data Analysis, 95, 207-221.
dc.identifier.citedreferenceQu, A. and Song, P.X.-K. (2002) Testing ignorable missingness in estimating equation approaches for longitudinal data. Biometrika, 89, 841-850.
dc.identifier.citedreferenceQu, A., Yi, G., Song, P.X.-K. and Wang, P. (2011) Assessing the validity of weighted generalized estimating equations. Biometrika, 98, 215-224.
dc.identifier.citedreferenceSen, S., Kranzler, H.R., Krystal, J.H., Speller, H., Chan, G., Gelernter, J. and Guille, C. (2010) A prospective cohort study investigating factors associated with depression during medical internship. Archives of General Psychiatry, 67, 557-565.
dc.identifier.citedreferenceLiang, K.-Y. and Zeger, S.L. (1986) Longitudinal data analysis using generalized linear models. Biometrika, 73, 13-22.
dc.identifier.citedreferenceSong, P.X.-K. (2007) Correlated Data Analysis: Modeling, Analytics, and Applications. New York, NY: Springer.
dc.identifier.citedreferenceSpitzer, R.L., Kroenke, K., Williams, J.B. and Löwe, B. (2006) A brief measure for assessing generalized anxiety disorder: the GAD-7. Archives of Internal Medicine, 166, 1092-1097.
dc.identifier.citedreferenceTang, L. and Song, P.X. (2016) Fused lasso approach in regression coefficients clustering: learning parameter heterogeneity in data integration. The Journal of Machine Learning Research, 17, 3915-3937.
dc.identifier.citedreferenceTibshirani, R., Saunders, M., Rosset, S., Zhu, J. and Knight, K. (2005) Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67, 91-108.
dc.identifier.citedreferenceTibshirani, R.J. and Taylor, J. (2011) The solution path of the generalized lasso. The Annals of Statistics, 39, 1335-1371.
dc.identifier.citedreferenceVan de Geer, S., Bühlmann, P., Ritov, Y. and Dezeure, R. (2014) On asymptotically optimal confidence regions and tests for high-dimensional models. The Annals of Statistics, 42, 1166-1202.
dc.identifier.citedreferenceWang, F., Wang, L. and Song, P.X.-K. (2016) Fused lasso with the adaptation of parameter ordering in combining multiple studies with repeated measurements. Biometrics, 72, 1184-1193.
dc.identifier.citedreferenceWang, L., Zhou, J. and Qu, A. (2012) Penalized generalized estimating equations for high-dimensional longitudinal data analysis. Biometrics, 68, 353-360.
dc.identifier.citedreferenceZhang, C.-H. (2010) Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38, 894-942.
dc.identifier.citedreferenceZhang, C.-H. and Zhang, S.S. (2014) Confidence intervals for low dimensional parameters in high dimensional linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76, 217-242.
dc.identifier.citedreferenceZou, H. (2006) The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101, 1418-1429.
dc.identifier.citedreferenceBach, F., Jenatton, R., Mairal, J. and Obozinski, G. (2012) Structured sparsity through convex optimization. Statistical Science, 27, 450-468.
dc.identifier.citedreferenceBondell, H.D. and Reich, B.J. (2008) Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR. Biometrics, 64, 115-123.
dc.identifier.citedreferenceBondell, H.D. and Reich, B.J. (2009) Simultaneous factor selection and collapsing levels in ANOVA. Biometrics, 65, 169-177.
dc.identifier.citedreferenceChen, J. and Chen, Z. (2012) Extended BIC for small-n-large-p sparse GLM. Statistica Sinica, 22, 555-574.
dc.identifier.citedreferenceChen, H.Y. and Little, R. (1999) A test of missing completely at random for generalised estimating equations with missing data. Biometrika, 86, 1-13.
dc.identifier.citedreferenceDawson, J.D. (1994) Stratification of summary statistic tests according to missing data patterns. Statistics in Medicine, 13, 1853-1863.
dc.identifier.citedreferenceDiggle, P.J. (1989) Testing for random dropouts in repeated measurement data. Biometrics, 45, 1255-1258.
dc.identifier.citedreferenceFan, J. and Li, R. (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348-1360.
dc.identifier.citedreferenceFu, W.J. (2003) Penalized estimating equations. Biometrics, 59, 126-132.
dc.identifier.citedreferenceHao, B., Sun, W.W., Liu, Y. and Cheng, G. (2018) Simultaneous clustering and estimation of heterogeneous graphical models. Journal of Machine Learning Research, 18, 1-58.
dc.identifier.citedreferenceHunter, D.R. and Li, R. (2005) Variable selection using MM algorithms. The Annals of Statistics, 33, 1617.
dc.working.doiNO
dc.owningcollnameInterdisciplinary and Peer-Reviewed