Post hoc Analysis for Detecting Individual Rare Variant Risk Associations Using Probit Regression Bayesian Variable Selection Methods in Case-Control Sequencing Studies

Larson, Nicholas B.; McDonnell, Shannon; Albright, Lisa Cannon; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A.; Isaacs, William B.; Xu, Jianfeng; Cooney, Kathleen A.; Lange, Ethan; Schleutker, Johanna; Carpten, John D.; Powell, Isaac; Bailey‐wilson, Joan; Cussenot, Olivier; Cancel‐tassin, Geraldine; Giles, Graham; MacInnis, Robert; Maier, Christiane; Whittemore, Alice S.; Hsieh, Chih‐lin; Wiklund, Fredrik; Catolona, William J.; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote‐jarai, Zsofia; Ackerman, Michael J.; Olson, Timothy M.; Klein, Christopher J.; Thibodeau, Stephen N.; Schaid, Daniel J.

dc.contributor.author	Larson, Nicholas B.
dc.contributor.author	McDonnell, Shannon
dc.contributor.author	Albright, Lisa Cannon
dc.contributor.author	Teerlink, Craig
dc.contributor.author	Stanford, Janet
dc.contributor.author	Ostrander, Elaine A.
dc.contributor.author	Isaacs, William B.
dc.contributor.author	Xu, Jianfeng
dc.contributor.author	Cooney, Kathleen A.
dc.contributor.author	Lange, Ethan
dc.contributor.author	Schleutker, Johanna
dc.contributor.author	Carpten, John D.
dc.contributor.author	Powell, Isaac
dc.contributor.author	Bailey‐wilson, Joan
dc.contributor.author	Cussenot, Olivier
dc.contributor.author	Cancel‐tassin, Geraldine
dc.contributor.author	Giles, Graham
dc.contributor.author	MacInnis, Robert
dc.contributor.author	Maier, Christiane
dc.contributor.author	Whittemore, Alice S.
dc.contributor.author	Hsieh, Chih‐lin
dc.contributor.author	Wiklund, Fredrik
dc.contributor.author	Catolona, William J.
dc.contributor.author	Foulkes, William
dc.contributor.author	Mandal, Diptasri
dc.contributor.author	Eeles, Rosalind
dc.contributor.author	Kote‐jarai, Zsofia
dc.contributor.author	Ackerman, Michael J.
dc.contributor.author	Olson, Timothy M.
dc.contributor.author	Klein, Christopher J.
dc.contributor.author	Thibodeau, Stephen N.
dc.contributor.author	Schaid, Daniel J.
dc.date.accessioned	2016-10-17T21:19:22Z
dc.date.available	2017-11-01T15:31:29Z	en
dc.date.issued	2016-09
dc.identifier.citation	Larson, Nicholas B.; McDonnell, Shannon; Albright, Lisa Cannon; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A.; Isaacs, William B.; Xu, Jianfeng; Cooney, Kathleen A.; Lange, Ethan; Schleutker, Johanna; Carpten, John D.; Powell, Isaac; Bailey‐wilson, Joan; Cussenot, Olivier; Cancel‐tassin, Geraldine; Giles, Graham; MacInnis, Robert; Maier, Christiane; Whittemore, Alice S.; Hsieh, Chih‐lin; Wiklund, Fredrik; Catolona, William J.; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote‐jarai, Zsofia; Ackerman, Michael J.; Olson, Timothy M.; Klein, Christopher J.; Thibodeau, Stephen N.; Schaid, Daniel J. (2016). "Post hoc Analysis for Detecting Individual Rare Variant Risk Associations Using Probit Regression Bayesian Variable Selection Methods in Case-Control Sequencing Studies." Genetic Epidemiology 40(6): 461-469.
dc.identifier.issn	0741-0395
dc.identifier.issn	1098-2272
dc.identifier.uri	https://hdl.handle.net/2027.42/134215
dc.description.abstract	Rare variants (RVs) have been shown to be significant contributors to complex disease risk. By definition, these variants have very low minor allele frequencies and traditional single-marker methods for statistical analysis are underpowered for typical sequencing study sample sizes. Multimarker burden-type approaches attempt to identify aggregation of RVs across case-control status by analyzing relatively small partitions of the genome, such as genes. However, it is generally the case that the aggregative measure would be a mixture of causal and neutral variants, and these omnibus tests do not directly provide any indication of which RVs may be driving a given association. Recently, Bayesian variable selection approaches have been proposed to identify RV associations from a large set of RVs under consideration. Although these approaches have been shown to be powerful at detecting associations at the RV level, there are often computational limitations on the total quantity of RVs under consideration and compromises are necessary for large-scale application. Here, we propose a computationally efficient alternative formulation of this method using a probit regression approach specifically capable of simultaneously analyzing hundreds to thousands of RVs. We evaluate our approach to detect causal variation on simulated data and examine sensitivity and specificity in instances of high RV dimensionality as well as apply it to pathway-level RV analysis results from a prostate cancer (PC) risk case-control sequencing study. Finally, we discuss potential extensions and future directions of this work.
dc.publisher	Wiley Periodicals, Inc.
dc.publisher	Clarendon Press
dc.subject.other	MCMC
dc.subject.other	Next-generation sequencing
dc.subject.other	burden testing
dc.subject.other	prostate cancer
dc.title	Post hoc Analysis for Detecting Individual Rare Variant Risk Associations Using Probit Regression Bayesian Variable Selection Methods in Case-Control Sequencing Studies
dc.type	Article	en_US
dc.rights.robots	IndexNoFollow
dc.subject.hlbsecondlevel	Genetics
dc.subject.hlbsecondlevel	Molecular, Cellular and Developmental Biology
dc.subject.hlbsecondlevel	Biological Chemistry
dc.subject.hlbtoplevel	Health Sciences
dc.subject.hlbtoplevel	Science
dc.description.peerreviewed	Peer Reviewed
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/134215/1/gepi21983.pdf
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/134215/2/gepi21983_am.pdf
dc.identifier.doi	10.1002/gepi.21983
dc.identifier.source	Genetic Epidemiology
dc.owningcollname	Interdisciplinary and Peer-Reviewed
