Statistical inference for Cox proportional hazards models with a diverging number of covariates

dc.contributor.authorXia, Lu
dc.contributor.authorNan, Bin
dc.contributor.authorLi, Yi
dc.date.accessioned2023-06-01T20:51:06Z
dc.date.available2024-07-01 16:51:05
dc.date.available2023-06-01T20:51:06Z
dc.date.issued2023-06
dc.identifier.citationXia, Lu; Nan, Bin; Li, Yi (2023). "Statistical inference for Cox proportional hazards models with a diverging number of covariates." Scandinavian Journal of Statistics 50(2): 550-571.
dc.identifier.issn0303-6898
dc.identifier.issn1467-9469
dc.identifier.urihttps://hdl.handle.net/2027.42/176874
dc.description.abstractFor statistical inference on regression models with a diverging number of covariates, the existing literature typically makes sparsity assumptions on the inverse of the Fisher information matrix. Such assumptions, however, are often violated under Cox proportional hazards models, leading to biased estimates and confidence intervals with under-coverage. We propose a modified debiased lasso method, which solves a series of quadratic programming problems to approximate the inverse information matrix without imposing sparsity assumptions on that matrix. We establish asymptotic results for the estimated regression coefficients when the dimension of covariates diverges with the sample size. As demonstrated by extensive simulations, our proposed method provides consistent estimates and confidence intervals with nominal coverage probabilities. The utility of the method is further demonstrated by assessing the effects of genetic markers on patients’ overall survival with the Boston Lung Cancer Survival Cohort, a large-scale epidemiological study investigating the mechanisms underlying lung cancer.
dc.publisherCambridge University Press
dc.publisherWiley Periodicals, Inc.
dc.subject.otherdebiased lasso
dc.subject.otherlung cancer
dc.subject.otherprecision matrix
dc.subject.otherquadratic programming
dc.subject.othersparsity
dc.subject.othercancer epidemiology
dc.titleStatistical inference for Cox proportional hazards models with a diverging number of covariates
dc.typeArticle
dc.rights.robotsIndexNoFollow
dc.subject.hlbsecondlevelStatistics (Mathematical)
dc.subject.hlbtoplevelScience
dc.description.peerreviewedPeer Reviewed
dc.description.bitstreamurlhttp://deepblue.lib.umich.edu/bitstream/2027.42/176874/1/sjos12595_am.pdf
dc.description.bitstreamurlhttp://deepblue.lib.umich.edu/bitstream/2027.42/176874/2/sjos12595.pdf
dc.identifier.doi10.1111/sjos.12595
dc.identifier.sourceScandinavian Journal of Statistics
dc.identifier.citedreferencePurcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A. R., Bender, D., … and Sham, P. ( 2007 ). PLINK: A tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics, 81, 559 – 575.
dc.identifier.citedreferenceFan, J., & Li, R. ( 2002 ). Variable selection for Cox’s proportional hazards model and frailty model. The Annals of Statistics, 30, 74 – 99.
dc.identifier.citedreferenceFang, E. X., Ning, Y., & Liu, H. ( 2017 ). Testing and confidence intervals for high dimensional proportional hazards models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79, 1415 – 1437.
dc.identifier.citedreferenceFei, Z., & Li, Y. ( 2021 ). Estimation and inference for high dimensional generalized linear models: A splitting and smoothing approach. Journal of Machine Learning Research, 22, 1 – 32.
dc.identifier.citedreferenceGui, J., & Li, H. ( 2005 ). Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data. Bioinformatics, 21, 3001 – 3008.
dc.identifier.citedreferenceHerndon, J. E., Kornblith, A. B., Holland, J. C., & Paskett, E. D. ( 2008 ). Patient education level as a predictor of survival in lung cancer clinical trials. Journal of Clinical Oncology, 26, 4116 – 4123.
dc.identifier.citedreferenceHouston, K. A., Mitchell, K. A., King, J., White, A., & Ryan, B. M. ( 2018 ). Histologic lung cancer incidence rates and trends vary by race/ethnicity and residential county. Journal of Thoracic Oncology, 13, 497 – 509.
dc.identifier.citedreferenceHuang, J., Sun, T., Ying, Z., Yu, Y., & Zhang, C.-H. ( 2013 ). Oracle inequalities for the lasso in the Cox model. The Annals of Statistics, 41, 1142 – 1165.
dc.identifier.citedreferenceJanssen-Heijnen, M. L. G., & Coebergh, J.-W. W. ( 2001 ). Trends in incidence and prognosis of the histological subtypes of lung cancer in North America, Australia, New Zealand and Europe. Lung Cancer, 31, 123 – 137.
dc.identifier.citedreferenceJavanmard, A., & Montanari, A. ( 2014 ). Confidence intervals and hypothesis testing for high-dimensional regression. Journal of Machine Learning Research, 15, 2869 – 2909.
dc.identifier.citedreferenceKong, S., & Nan, B. ( 2014 ). Non-asymptotic oracle inequalities for the high-dimensional Cox regression via lasso. Statistica Sinica, 24, 25 – 42.
dc.identifier.citedreferenceKong, S., Yu, Z., Zhang, X., & Cheng, G. ( 2021 ). High-dimensional robust inference for Cox regression models using desparsified lasso. Scandinavian Journal of Statistics, 48, 1068 – 1095.
dc.identifier.citedreferenceMcKay, J. D., Hung, R. J., Han, Y., Zong, X., Carreras-Torres, R., Christiani, D. C., … and Amos, C. I. ( 2017 ). Large-scale association analysis identifies new lung cancer susceptibility loci and heterogeneity in genetic susceptibility across histological subtypes. Nature Genetics, 49, 1126 – 1132.
dc.identifier.citedreferenceNing, Y., & Liu, H. ( 2017 ). A general theory of hypothesis tests and confidence regions for sparse high dimensional models. The Annals of Statistics, 45, 158 – 195.
dc.identifier.citedreferenceQiu, L.-X., Yao, L., Xue, K., Zhang, J., Mao, C., Chen, B., … and Hu, X.-C. ( 2010 ). BRCA2 N372H polymorphism and breast cancer susceptibility: A meta-analysis involving 44,903 subjects. Breast Cancer Research and Treatment, 123, 487 – 490.
dc.identifier.citedreferenceSimon, N., Friedman, J., Hastie, T., & Tibshirani, R. ( 2011 ). Regularization paths for Cox’s proportional hazards model via coordinate descent. Journal of Statistical Software, 39, 1 – 13.
dc.identifier.citedreferenceTang, D., Zhao, Y. C., Qian, D., Liu, H., Luo, S., Patz, E. F., … and Wei, Q. ( 2020 ). Novel genetic variants in HDAC2 and PPARGC1A of the CREB-binding protein pathway predict survival of non-small-cell lung cancer. Molecular Carcinogenesis, 59, 104 – 115.
dc.identifier.citedreferenceTibshirani, R. ( 1997 ). The lasso method for variable selection in the Cox model. Statistics in Medicine, 16, 385 – 395.
dc.identifier.citedreferencevan de Geer, S., Bühlmann, P., Ritov, Y., & Dezeure, R. ( 2014 ). On asymptotically optimal confidence regions and tests for high-dimensional models. The Annals of Statistics, 42, 1166 – 1202.
dc.identifier.citedreferenceWang, S., van der Vaart, A. D., Xu, Q., Seneviratne, C., Pomerleau, O. F., Pomerleau, C. S., … and Li, M. D. ( 2014 ). Significant associations of CHRNA2 and CHRNA6 with nicotine dependence in European American and African American populations. Human Genetics, 133, 575 – 586.
dc.identifier.citedreferenceXia, L., Nan, B., & Li, Y. ( 2021 ). Debiased lasso for generalized linear models with a diverging number of covariates. Biometrics in press. https://doi.org/10.1111/biom.13587
dc.identifier.citedreferenceYu, H., Zhao, H., Wang, L.-E., Han, Y., Chen, W. V., Amos, C. I., Rafnar, T., Sulem, P., Stefansson, K., Landi, M. T., & Caporaso, N. ( 2011 ). An analysis of single nucleotide polymorphisms of 125 DNA repair genes in the Texas genome-wide association study of lung cancer with a replication for the XRCC4 SNPs. DNA Repair, 10, 398 – 407.
dc.identifier.citedreferenceYu, Y., Bradic, J., & Samworth, R. J. ( 2021 ). Confidence intervals for high-dimensional Cox models. Statistica Sinica, 31, 243 – 267.
dc.identifier.citedreferenceZhang, C.-H., & Zhang, S. S. ( 2014 ). Confidence intervals for low dimensional parameters in high dimensional linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76, 217 – 242.
dc.identifier.citedreferenceZhang, H., Huang, J., & Sun, L. ( 2022 ). Projection-based and cross-validated estimation in high-dimensional Cox model. Scandinavian Journal of Statistics, 49, 353 – 372.
dc.identifier.citedreferenceAndersen, P. K., & Gill, R. D. ( 1982 ). Cox’s regression model for counting processes: A large sample study. The Annals of Statistics, 10, 1100 – 1120.
dc.identifier.citedreferenceAntoniadis, A., Fryzlewicz, P., & Letué, F. ( 2010 ). The Dantzig selector in Cox’s proportional hazards model. Scandinavian Journal of Statistics, 37, 531 – 552.
dc.identifier.citedreferenceBossé, Y., & Amos, C. I. ( 2018 ). A decade of GWAS results in lung cancer. Cancer Epidemiology, Biomarkers & Prevention, 27, 363 – 379.
dc.identifier.citedreferenceBoyd, S., & Vandenberghe, L. ( 2004 ). Convex optimization. Cambridge University Press.
dc.identifier.citedreferenceCai, T., Liu, W., & Luo, X. ( 2011 ). A constrained ℓ1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106, 594 – 607.
dc.identifier.citedreferenceCai, T. T., Liu, W., & Zhou, H. H. ( 2016 ). Estimating sparse precision matrix: Optimal rates of convergence and adaptive estimation. The Annals of Statistics, 44, 455 – 488.
dc.identifier.citedreferenceCox, D. R. ( 1972 ). Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34, 187 – 202.
dc.working.doiNO
dc.owningcollnameInterdisciplinary and Peer-Reviewed

