Drawing inferences for high‐dimensional linear models: A selection‐assisted partial regression and smoothing approach

dc.contributor.author: Fei, Zhe
dc.contributor.author: Zhu, Ji
dc.contributor.author: Banerjee, Moulinath
dc.contributor.author: Li, Yi
dc.date.accessioned: 2019-09-30T15:31:24Z
dc.date.available: WITHHELD_10_MONTHS
dc.date.available: 2019-09-30T15:31:24Z
dc.date.issued: 2019-06
dc.identifier.citation: Fei, Zhe; Zhu, Ji; Banerjee, Moulinath; Li, Yi (2019). "Drawing inferences for high‐dimensional linear models: A selection‐assisted partial regression and smoothing approach." Biometrics 75(2): 551-561.
dc.identifier.issn: 0006-341X
dc.identifier.issn: 1541-0420
dc.identifier.uri: https://hdl.handle.net/2027.42/151307
dc.description.abstract: Drawing inferences for high‐dimensional models is challenging as regular asymptotic theories are not applicable. This article proposes a new framework of simultaneous estimation and inferences for high‐dimensional linear models. By smoothing over partial regression estimates based on a given variable selection scheme, we reduce the problem to low‐dimensional least squares estimations. The procedure, termed as Selection‐assisted Partial Regression and Smoothing (SPARES), utilizes data splitting along with variable selection and partial regression. We show that the SPARES estimator is asymptotically unbiased and normal, and derive its variance via a nonparametric delta method. The utility of the procedure is evaluated under various simulation scenarios and via comparisons with the de‐biased LASSO estimators, a major competitor. We apply the method to analyze two genomic datasets and obtain biologically meaningful results.
dc.publisher: Wiley Periodicals, Inc.
dc.subject.other: Selection‐assisted Partial Regression and Smoothing (SPARES)
dc.subject.other: confidence intervals
dc.subject.other: high‐dimensional inference
dc.subject.other: hypothesis testing
dc.subject.other: multisample‐splitting
dc.title: Drawing inferences for high‐dimensional linear models: A selection‐assisted partial regression and smoothing approach
dc.type: Article
dc.rights.robots: IndexNoFollow
dc.subject.hlbsecondlevel: Mathematics
dc.subject.hlbtoplevel: Science
dc.description.peerreviewed: Peer Reviewed
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/151307/1/biom13013.pdf
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/151307/2/biom13013-sup-0001-SuppData.pdf
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/151307/3/biom13013_am.pdf
dc.identifier.doi: 10.1111/biom.13013
dc.identifier.source: Biometrics
dc.owningcollname: Interdisciplinary and Peer-Reviewed
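
The abstract describes the procedure only in outline: split the data, select variables on one part, run a low-dimensional partial regression on the other part, and smooth (average) the resulting estimates over repeated splits. The following is a minimal sketch of that outline and not the authors' implementation. The function name `spares_sketch`, the parameters `n_splits` and `n_keep`, the half-and-half split, and the use of marginal covariance screening as the selection scheme are all assumptions made for illustration. The variance estimate via the nonparametric delta method mentioned in the abstract is not attempted here.

```python
import numpy as np


def spares_sketch(X, y, j, n_splits=200, n_keep=5, seed=0):
    """Average partial-regression estimates of coefficient j over random splits.

    Hypothetical illustration only: the selection rule and split sizes
    are assumptions, not taken from the published article.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    estimates = np.empty(n_splits)
    for b in range(n_splits):
        # Randomly split the sample into a selection half and a fitting half.
        perm = rng.permutation(n)
        sel_idx, fit_idx = perm[: n // 2], perm[n // 2:]

        # Selection step (assumed scheme): keep the n_keep predictors with
        # the largest absolute marginal covariance with y on the first half.
        Xs = X[sel_idx] - X[sel_idx].mean(axis=0)
        ys = y[sel_idx] - y[sel_idx].mean()
        score = np.abs(Xs.T @ ys)
        selected = np.argsort(score)[-n_keep:]

        # Partial regression step: ordinary least squares on the second half,
        # using predictor j together with the selected predictors.
        cols = np.concatenate(([j], np.setdiff1d(selected, [j])))
        design = np.column_stack([np.ones(fit_idx.size), X[fit_idx][:, cols]])
        coef, *_ = np.linalg.lstsq(design, y[fit_idx], rcond=None)
        estimates[b] = coef[1]  # coefficient on predictor j

    # Smoothing step: average the per-split estimates.
    return estimates.mean()
```

Each split's regression involves at most `n_keep + 1` predictors plus an intercept, so it stays low-dimensional even when the number of columns in `X` is large. Calling `spares_sketch(X, y, j=0)` returns a single smoothed point estimate for the first predictor.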

