A Generalized Genetic Random Field Method for the Genetic Association Analysis of Sequencing Data

Li, Ming; He, Zihuai; Zhang, Min; Zhan, Xiaowei; Wei, Changshuai; Elston, Robert C.; Lu, Qing

dc.contributor.author	Li, Ming	en_US
dc.contributor.author	He, Zihuai	en_US
dc.contributor.author	Zhang, Min	en_US
dc.contributor.author	Zhan, Xiaowei	en_US
dc.contributor.author	Wei, Changshuai	en_US
dc.contributor.author	Elston, Robert C.	en_US
dc.contributor.author	Lu, Qing	en_US
dc.date.accessioned	2014-05-21T18:02:33Z
dc.date.available	WITHHELD_13_MONTHS	en_US
dc.date.available	2014-05-21T18:02:33Z
dc.date.issued	2014-04	en_US
dc.identifier.citation	Li, Ming; He, Zihuai; Zhang, Min; Zhan, Xiaowei; Wei, Changshuai; Elston, Robert C.; Lu, Qing (2014). "A Generalized Genetic Random Field Method for the Genetic Association Analysis of Sequencing Data." Genetic Epidemiology 38(3): 242-253.	en_US
dc.identifier.issn	0741-0395	en_US
dc.identifier.issn	1098-2272	en_US
dc.identifier.uri	https://hdl.handle.net/2027.42/106664
dc.description.abstract	With the advance of high‐throughput sequencing technologies, it has become feasible to investigate the influence of the entire spectrum of sequencing variations on complex human diseases. Although association studies utilizing the new sequencing technologies hold great promise to unravel novel genetic variants, especially rare genetic variants that contribute to human diseases, the statistical analysis of high‐dimensional sequencing data remains a challenge. Advanced analytical methods are in great need to facilitate high‐dimensional sequencing data analyses. In this article, we propose a generalized genetic random field (GGRF) method for association analyses of sequencing data. Like other similarity‐based methods (e.g., SIMreg and SKAT), the new method has the advantages of avoiding the need to specify thresholds for rare variants and allowing for testing multiple variants acting in different directions and magnitude of effects. The method is built on the generalized estimating equation framework and thus accommodates a variety of disease phenotypes (e.g., quantitative and binary phenotypes). Moreover, it has a nice asymptotic property, and can be applied to small‐scale sequencing data without need for small‐sample adjustment. Through simulations, we demonstrate that the proposed GGRF attains an improved or comparable power over a commonly used method, SKAT, under various disease scenarios, especially when rare variants play a significant role in disease etiology. We further illustrate GGRF with an application to a real dataset from the Dallas Heart Study. By using GGRF, we were able to detect the association of two candidate genes, ANGPTL3 and ANGPTL4, with serum triglyceride.	en_US
dc.publisher	Wiley Periodicals, Inc.	en_US
dc.publisher	Springer	en_US
dc.subject.other	Generalized Estimating Equation	en_US
dc.subject.other	Small‐Scale Sequencing Studies	en_US
dc.subject.other	Rare Variants	en_US
dc.title	A Generalized Genetic Random Field Method for the Genetic Association Analysis of Sequencing Data	en_US
dc.type	Article	en_US
dc.rights.robots	IndexNoFollow	en_US
dc.subject.hlbsecondlevel	Genetics	en_US
dc.subject.hlbsecondlevel	Molecular, Cellular and Developmental Biology	en_US
dc.subject.hlbsecondlevel	Biological Chemistry	en_US
dc.subject.hlbtoplevel	Science	en_US
dc.subject.hlbtoplevel	Health Sciences	en_US
dc.description.peerreviewed	Peer Reviewed	en_US
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/106664/1/gepi21790.pdf
dc.identifier.doi	10.1002/gepi.21790	en_US
dc.identifier.source	Genetic Epidemiology	en_US
dc.identifier.citedreference	Schuster SC. 2008. Next‐generation sequencing transforms today's biology. Nat Methods 5 ( 1 ): 16 – 18.	en_US
dc.identifier.citedreference	Eichler EE, Flint J, Gibson G, Kong A, Leal SM, Moore JH, Nadeau JH. 2010. Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet 11 ( 6 ): 446 – 50.	en_US
dc.identifier.citedreference	Han F, Pan W. 2010. A data‐adaptive sum test for disease association with multiple common or rare variants. Hum Hered 70 ( 1 ): 42 – 54.	en_US
dc.identifier.citedreference	He Z, Zhang M, Zhan X, Lu Q. 2013. Modeling and Testing for Joint Association Using a Genetic Random Field Model. http://arxiv‐web3.library.cornell.edu/abs/1302.5493. eprint arXiv:1302.5493.	en_US
dc.identifier.citedreference	Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA. 2009. Potential etiologic and functional implications of genome‐wide association loci for human diseases and traits. Proc Natl Acad Sci USA 106 ( 23 ): 9362 – 9367.	en_US
dc.identifier.citedreference	Kathiresan S, Melander O, Guiducci C, Surti A, Burtt NP, Rieder MJ, Cooper GM, Roos C, Voight BF, Havulinna AS and others. 2008. Six new loci associated with blood low‐density lipoprotein cholesterol, high‐density lipoprotein cholesterol or triglycerides in humans. Nat Genet 40 ( 2 ): 189 – 197.	en_US
dc.identifier.citedreference	Koster A, Chao YB, Mosior M, Ford A, Gonzalez‐DeWhitt PA, Hale JE, Li D, Qiu Y, Fraser CC, Yang DD and others. 2005. Transgenic angiopoietin‐like (angptl)4 overexpression and targeted disruption of angptl4 and angptl3: regulation of triglyceride metabolism. Endocrinology 146 ( 11 ): 4943 – 4950.	en_US
dc.identifier.citedreference	Kwee LC, Liu D, Lin X, Ghosh D, Epstein MP. 2008. A powerful and flexible multilocus association test for quantitative traits. Am J Hum Genet 82 ( 2 ): 386 – 397.	en_US
dc.identifier.citedreference	Lee S, Emond MJ, Bamshad MJ, Barnes KC, Rieder MJ, Nickerson DA, Team NGESP‐ELP, Christiani DC, Wurfel MM, Lin X. 2012. Optimal unified approach for rare‐variant association testing with application to small‐sample case‐control whole‐exome sequencing studies. Am J Hum Genet 91 ( 2 ): 224 – 237.	en_US
dc.identifier.citedreference	Li B, Leal SM. 2008. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am J Hum Genet 83 ( 3 ): 311 – 321.	en_US
dc.identifier.citedreference	Lin DY, Tang ZZ. 2011. A general framework for detecting disease associations with rare variants in sequencing studies. Am J Hum Genet 89 ( 3 ): 354 – 367.	en_US
dc.identifier.citedreference	Liu D, Lin X, Ghosh D. 2007. Semiparametric regression of multidimensional genetic pathway data: least‐squares kernel machines and linear mixed models. Biometrics 63 ( 4 ): 1079 – 1088.	en_US
dc.identifier.citedreference	Liu K, Fast S, Zawistowski M, Tintle NL. 2013. A geometric framework for evaluating rare variant tests of association. Genet Epidemiol 37 ( 4 ): 345 – 357.	en_US
dc.identifier.citedreference	Madsen BE, Browning SR. 2009. A groupwise association test for rare mutations using a weighted sum statistic. PLoS Genet 5 ( 2 ): e1000384.	en_US
dc.identifier.citedreference	Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A and others. 2009. Finding the missing heritability of complex diseases. Nature 461 ( 7265 ): 747 – 753.	en_US
dc.identifier.citedreference	McClellan J, King MC. 2010. Genetic heterogeneity in human disease. Cell 141 ( 2 ): 210 – 217.	en_US
dc.identifier.citedreference	Morris AP, Zeggini E. 2010. An evaluation of statistical approaches to rare variant analysis in genetic association studies. Genet Epidemiol 34 ( 2 ): 188 – 193.	en_US
dc.identifier.citedreference	Price AL, Kryukov GV, de Bakker PI, Purcell SM, Staples J, Wei LJ, Sunyaev SR. 2010. Pooled association tests for rare variants in exon‐resequencing studies. Am J Hum Genet 86 ( 6 ): 832 – 838.	en_US
dc.identifier.citedreference	Romeo S, Yin W, Kozlitina J, Pennacchio LA, Boerwinkle E, Hobbs HH, Cohen JC. 2009. Rare loss‐of‐function mutations in ANGPTL family members contribute to plasma triglyceride levels in humans. J Clin Invest 119 ( 1 ): 70 – 79.	en_US
dc.identifier.citedreference	Schork NJ, Murray SS, Frazer KA, Topol EJ. 2009. Common vs. rare allele hypotheses for complex diseases. Curr Opin Genet Dev 19 ( 3 ): 212 – 219.	en_US
dc.identifier.citedreference	Shimizugawa T, Ono M, Shimamura M, Yoshida K, Ando Y, Koishi R, Ueda K, Inaba T, Minekura H, Kohama T and others. 2002. ANGPTL3 decreases very low density lipoprotein triglyceride clearance by inhibition of lipoprotein lipase. J Biol Chem 277 ( 37 ): 33742 – 33748.	en_US
dc.identifier.citedreference	Tzeng JY, Zhang D, Chang SM, Thomas DC, Davidian M. 2009. Gene‐trait similarity regression for multimarker‐based association analysis. Biometrics 65 ( 3 ): 822 – 832.	en_US
dc.identifier.citedreference	Tzeng JY, Zhang D, Pongpanich M, Smith C, McCarthy MI, Sale MM, Worrall BB, Hsu FC, Thomas DC, Sullivan PF. 2011. Studying gene and gene‐environment effects of uncommon and common variants on continuous traits: a marker‐set approach using gene‐trait similarity regression. Am J Hum Genet 89 ( 2 ): 277 – 288.	en_US
dc.identifier.citedreference	Wessel J, Schork NJ. 2006. Generalized genomic distance‐based regression methodology for multilocus association analysis. Am J Hum Genet 79 ( 5 ): 792 – 806.	en_US
dc.identifier.citedreference	Wu MC, Kraft P, Epstein MP, Taylor DM, Chanock SJ, Hunter DJ, Lin X. 2010. Powerful SNP‐set analysis for case‐control genome‐wide association studies. Am J Hum Genet 86 ( 6 ): 929 – 942.	en_US
dc.identifier.citedreference	Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X. 2011. Rare‐variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet 89 ( 1 ): 82 – 93.	en_US
dc.identifier.citedreference	Zawistowski M, Gopalakrishnan S, Ding J, Li Y, Grimm S, Zollner S. 2010. Extending rare‐variant testing strategies: analysis of noncoding sequence and imputed genotypes. Am J Hum Genet 87 ( 5 ): 604 – 617.	en_US
dc.identifier.citedreference	Adler RJ, Taylor JE. 2007. Random Fields and Geometry. Springer, New York.	en_US
dc.identifier.citedreference	Almasy L, Dyer TD, Peralta JM, Kent JW, Jr., Charlesworth JC, Curran JE, Blangero J. 2011. Genetic Analysis Workshop 17 mini‐exome simulation. BMC Proc 5 ( Suppl 9 ): S2.	en_US
dc.identifier.citedreference	Ansorge WJ. 2009. Next‐generation DNA sequencing techniques. N Biotechnol 25 ( 4 ): 195 – 203.	en_US
dc.identifier.citedreference	Besag J. 1974. Spatial interaction and the statistical analysis of lattice systems. J R Stat Soc B 36 ( 2 ): 192 – 236.	en_US
dc.identifier.citedreference	Bodmer W, Bonilla C. 2008. Common and rare variants in multifactorial susceptibility to common diseases. Nat Genet 40 ( 6 ): 695 – 701.	en_US
dc.identifier.citedreference	Davies R. 1980. The distribution of a linear combination of Chi‐square random variables. Appl Stat 29: 323 – 333.	en_US
dc.owningcollname	Interdisciplinary and Peer-Reviewed

Files in this item

Name: gepi21790.pdf
Size: 555.0KB
Format: PDF