A Generalized Genetic Random Field Method for the Genetic Association Analysis of Sequencing Data

dc.contributor.author: Li, Ming [en_US]
dc.contributor.author: He, Zihuai [en_US]
dc.contributor.author: Zhang, Min [en_US]
dc.contributor.author: Zhan, Xiaowei [en_US]
dc.contributor.author: Wei, Changshuai [en_US]
dc.contributor.author: Elston, Robert C. [en_US]
dc.contributor.author: Lu, Qing [en_US]
dc.date.accessioned: 2014-05-21T18:02:33Z
dc.date.available: WITHHELD_13_MONTHS [en_US]
dc.date.available: 2014-05-21T18:02:33Z
dc.date.issued: 2014-04 [en_US]
dc.identifier.citation: Li, Ming; He, Zihuai; Zhang, Min; Zhan, Xiaowei; Wei, Changshuai; Elston, Robert C.; Lu, Qing (2014). "A Generalized Genetic Random Field Method for the Genetic Association Analysis of Sequencing Data." Genetic Epidemiology 38(3): 242-253. [en_US]
dc.identifier.issn: 0741-0395 [en_US]
dc.identifier.issn: 1098-2272 [en_US]
dc.identifier.uri: https://hdl.handle.net/2027.42/106664
dc.description.abstract: With the advance of high‐throughput sequencing technologies, it has become feasible to investigate the influence of the entire spectrum of sequencing variations on complex human diseases. Although association studies utilizing the new sequencing technologies hold great promise to unravel novel genetic variants, especially rare genetic variants that contribute to human diseases, the statistical analysis of high‐dimensional sequencing data remains a challenge. Advanced analytical methods are greatly needed to facilitate high‐dimensional sequencing data analyses. In this article, we propose a generalized genetic random field (GGRF) method for association analyses of sequencing data. Like other similarity‐based methods (e.g., SIMreg and SKAT), the new method has the advantages of avoiding the need to specify thresholds for rare variants and allowing for testing multiple variants with different directions and magnitudes of effect. The method is built on the generalized estimating equation framework and thus accommodates a variety of disease phenotypes (e.g., quantitative and binary phenotypes). Moreover, it has a nice asymptotic property, and can be applied to small‐scale sequencing data without the need for small‐sample adjustment. Through simulations, we demonstrate that the proposed GGRF attains improved or comparable power relative to a commonly used method, SKAT, under various disease scenarios, especially when rare variants play a significant role in disease etiology. We further illustrate GGRF with an application to a real dataset from the Dallas Heart Study. By using GGRF, we were able to detect the association of two candidate genes, ANGPTL3 and ANGPTL4, with serum triglyceride. [en_US]
dc.publisher: Wiley Periodicals, Inc. [en_US]
dc.publisher: Springer [en_US]
dc.subject.other: Generalized Estimating Equation [en_US]
dc.subject.other: Small‐Scale Sequencing Studies [en_US]
dc.subject.other: Rare Variants [en_US]
dc.title: A Generalized Genetic Random Field Method for the Genetic Association Analysis of Sequencing Data [en_US]
dc.type: Article [en_US]
dc.rights.robots: IndexNoFollow [en_US]
dc.subject.hlbsecondlevel: Genetics [en_US]
dc.subject.hlbsecondlevel: Molecular, Cellular and Developmental Biology [en_US]
dc.subject.hlbsecondlevel: Biological Chemistry [en_US]
dc.subject.hlbtoplevel: Science [en_US]
dc.subject.hlbtoplevel: Health Sciences [en_US]
dc.description.peerreviewed: Peer Reviewed [en_US]
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/106664/1/gepi21790.pdf
dc.identifier.doi: 10.1002/gepi.21790 [en_US]
dc.identifier.source: Genetic Epidemiology [en_US]
dc.owningcollname: Interdisciplinary and Peer-Reviewed

