Improving power for rare‐variant tests by integrating external controls

dc.contributor.authorLee, Seunggeun
dc.contributor.authorKim, Sehee
dc.contributor.authorFuchsberger, Christian
dc.date.accessioned2017-10-23T17:32:07Z
dc.date.available2019-01-07T18:34:38Z
dc.date.issued2017-11
dc.identifier.citationLee, Seunggeun; Kim, Sehee; Fuchsberger, Christian (2017). "Improving power for rare‐variant tests by integrating external controls." Genetic Epidemiology 41(7): 610-619.
dc.identifier.issn0741-0395
dc.identifier.issn1098-2272
dc.identifier.urihttps://hdl.handle.net/2027.42/138932
dc.description.abstractDue to the drop in sequencing cost, the number of sequenced genomes is increasing rapidly. To improve the power of rare‐variant tests, these sequenced samples could be used as external control samples in addition to control samples from the study itself. However, when using external controls, possible batch effects due to the use of different sequencing platforms or genotype calling pipelines can dramatically increase type I error rates. To address this, we propose novel summary‐statistics‐based single‐variant and gene‐ or region‐based rare‐variant tests that allow the integration of external controls while controlling type I error. Our approach is based on the insight that batch effects on a given variant can be assessed by comparing odds ratio estimates obtained using internal controls only vs. using the combined internal and external controls. From simulation experiments and the analysis of data from age‐related macular degeneration and type 2 diabetes studies, we demonstrate that our method can substantially improve power while controlling the type I error rate.
dc.publisherSpringer
dc.publisherWiley Periodicals, Inc.
dc.subject.otherRare‐variant test
dc.subject.otherexternal controls
dc.subject.othernext‐generation sequencing
dc.titleImproving power for rare‐variant tests by integrating external controls
dc.typeArticle
dc.rights.robotsIndexNoFollow
dc.subject.hlbsecondlevelGenetics
dc.subject.hlbsecondlevelMolecular, Cellular and Developmental Biology
dc.subject.hlbsecondlevelBiological Chemistry
dc.subject.hlbtoplevelHealth Sciences
dc.subject.hlbtoplevelScience
dc.description.peerreviewedPeer Reviewed
dc.description.bitstreamurlhttps://deepblue.lib.umich.edu/bitstream/2027.42/138932/1/gepi22057.pdf
dc.description.bitstreamurlhttps://deepblue.lib.umich.edu/bitstream/2027.42/138932/2/gepi22057_am.pdf
dc.identifier.doi10.1002/gepi.22057
dc.identifier.sourceGenetic Epidemiology
dc.owningcollnameInterdisciplinary and Peer-Reviewed

