It’s all relative: Regression analysis with compositional predictors

dc.contributor.author: Li, Gen
dc.contributor.author: Li, Yan
dc.contributor.author: Chen, Kun
dc.date.accessioned: 2023-07-14T13:58:07Z
dc.date.available: 2024-07-14 09:58:06 (en)
dc.date.available: 2023-07-14T13:58:07Z
dc.date.issued: 2023-06
dc.identifier.citation: Li, Gen; Li, Yan; Chen, Kun (2023). "It’s all relative: Regression analysis with compositional predictors." Biometrics 79(2): 1318-1329.
dc.identifier.issn: 0006-341X
dc.identifier.issn: 1541-0420
dc.identifier.uri: https://hdl.handle.net/2027.42/177287
dc.description.abstract: Compositional data reside in a simplex and measure fractions or proportions of parts to a whole. Most existing regression methods for such data rely on log-ratio transformations that are inadequate or inappropriate in modeling high-dimensional data with excessive zeros and hierarchical structures. Moreover, such models usually lack a straightforward interpretation due to the interrelation between parts of a composition. We develop a novel relative-shift regression framework that directly uses proportions as predictors. The new framework provides a paradigm shift for regression analysis with compositional predictors and offers a superior interpretation of how shifting concentration between parts affects the response. New equi-sparsity and tree-guided regularization methods and an efficient smoothing proximal gradient algorithm are developed to facilitate feature aggregation and dimension reduction in regression. A unified finite-sample prediction error bound is derived for the proposed regularized estimators. We demonstrate the efficacy of the proposed methods in extensive simulation studies and a real gut microbiome study. Guided by the taxonomy of the microbiome data, the framework identifies important taxa at different taxonomic levels associated with the neurodevelopment of preterm infants.
dc.publisher: Springer
dc.publisher: Wiley Periodicals, Inc.
dc.subject.other: relative shift
dc.subject.other: tree-guided regularization
dc.subject.other: microbiome
dc.subject.other: feature aggregation
dc.subject.other: equi-sparsity
dc.title: It’s all relative: Regression analysis with compositional predictors
dc.type: Article
dc.rights.robots: IndexNoFollow
dc.subject.hlbsecondlevel: Mathematics
dc.subject.hlbtoplevel: Science
dc.description.peerreviewed: Peer Reviewed
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/177287/1/biom13703.pdf
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/177287/2/biom13703_am.pdf
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/177287/3/biom13703-sup-0002-SuppMat.pdf
dc.identifier.doi: 10.1111/biom.13703
dc.identifier.source: Biometrics
dc.working.doi: NO (en)
dc.owningcollname: Interdisciplinary and Peer-Reviewed
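
The abstract's motivation, that compositional data lie in a simplex and that log-ratio transformations break down when parts are zero, can be illustrated with a minimal sketch. This is not the authors' relative-shift method; it only shows the problem the abstract describes. The function names `close` and `clr` and the example counts are assumptions made for illustration, and the centered log-ratio is used here as one common example of a log-ratio transformation.

```python
import math


def close(counts):
    """Normalize nonnegative counts to proportions that sum to 1 (a point in the simplex)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive total")
    return [c / total for c in counts]


def clr(proportions):
    """Centered log-ratio transform; undefined when any part is zero."""
    if any(p <= 0 for p in proportions):
        raise ValueError("log-ratio transform is undefined for zero parts")
    logs = [math.log(p) for p in proportions]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical read counts for four taxa; the second taxon is unobserved.
counts = [30, 0, 50, 20]
x = close(counts)  # proportions, usable directly as predictors

try:
    clr(x)
except ValueError as err:
    # The zero part has no finite logarithm, so the transform cannot be applied
    # without first replacing or imputing the zero.
    print("clr failed:", err)
```

Proportions such as `x` stay well defined however many zeros the sample contains, which is why a framework that uses proportions directly avoids the zero-replacement step that log-ratio methods need.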

