Improving power of association tests using multiple sets of imputed genotypes from distributed reference panels

dc.contributor.author: Zhou, Wei
dc.contributor.author: Fritsche, Lars G.
dc.contributor.author: Das, Sayantan
dc.contributor.author: Zhang, He
dc.contributor.author: Nielsen, Jonas B.
dc.contributor.author: Holmen, Oddgeir L.
dc.contributor.author: Chen, Jin
dc.contributor.author: Lin, Maoxuan
dc.contributor.author: Elvestad, Maiken B.
dc.contributor.author: Hveem, Kristian
dc.contributor.author: Abecasis, Goncalo R.
dc.contributor.author: Kang, Hyun Min
dc.contributor.author: Willer, Cristen J.
dc.date.accessioned: 2017-12-15T16:47:37Z
dc.date.available: 2019-02-01T19:56:25Z
dc.date.issued: 2017-12
dc.identifier.citation: Zhou, Wei; Fritsche, Lars G.; Das, Sayantan; Zhang, He; Nielsen, Jonas B.; Holmen, Oddgeir L.; Chen, Jin; Lin, Maoxuan; Elvestad, Maiken B.; Hveem, Kristian; Abecasis, Goncalo R.; Kang, Hyun Min; Willer, Cristen J. (2017). "Improving power of association tests using multiple sets of imputed genotypes from distributed reference panels." Genetic Epidemiology 41(8): 744-755.
dc.identifier.issn: 0741-0395
dc.identifier.issn: 1098-2272
dc.identifier.uri: https://hdl.handle.net/2027.42/139954
dc.description.abstract: The accuracy of genotype imputation depends upon two factors: the sample size of the reference panel and the genetic similarity between the reference panel and the target samples. When consent restrictions prevent multiple reference panels from being combined, it is unclear how to combine the imputation results to optimize the power of genetic association studies. We compared the imputation accuracy for 9,265 Norwegian genomes imputed from three reference panels: 1000 Genomes phase 3 (1000G), the Haplotype Reference Consortium (HRC), and a panel of 2,201 Norwegian participants from the population‐based Nord Trøndelag Health Study (HUNT), built from low‐pass genome sequencing. We observed that the population‐matched reference panel allowed for imputation of more population‐specific variants with lower frequency (minor allele frequency (MAF) between 0.05% and 0.5%). The overall imputation accuracy from the population‐specific panel was substantially higher than that from 1000G and was comparable with that from HRC, despite HRC being 15‐fold larger. These results reaffirm the value of population‐specific reference panels for genotype imputation. We also evaluated different strategies for using multiple sets of imputed genotypes to increase the power of association studies. We observed that testing association for all variants imputed from any panel gives higher power to detect association than the alternative strategy of including only one version of each genetic variant, namely the version with the highest imputation quality metric. This was particularly true for lower frequency variants (MAF < 1%), even after adjusting for the additional multiple testing burden.
dc.publisher: Wiley Periodicals, Inc.
dc.subject.other: genotype imputation
dc.subject.other: multiple reference panels
dc.subject.other: GWAS
dc.subject.other: study power
dc.subject.other: population‐specific
dc.title: Improving power of association tests using multiple sets of imputed genotypes from distributed reference panels
dc.type: Article
dc.rights.robots: IndexNoFollow
dc.subject.hlbsecondlevel: Genetics
dc.subject.hlbsecondlevel: Molecular, Cellular and Developmental Biology
dc.subject.hlbsecondlevel: Biological Chemistry
dc.subject.hlbtoplevel: Health Sciences
dc.subject.hlbtoplevel: Science
dc.description.peerreviewed: Peer Reviewed
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/139954/1/gepi22067_am.pdf
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/139954/2/gepi22067.pdf
dc.identifier.doi: 10.1002/gepi.22067
dc.identifier.source: Genetic Epidemiology
dc.owningcollname: Interdisciplinary and Peer-Reviewed
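
The two analysis strategies compared in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `ImputedResult` record, the function names, the toy numbers, and the use of a plain Bonferroni correction (the abstract says only that the comparison adjusts for the additional multiple testing burden). It is not the authors' pipeline and the numbers are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImputedResult:
    """One association test for one variant as imputed from one panel (hypothetical record)."""
    variant: str    # variant identifier
    panel: str      # reference panel the genotypes were imputed from
    quality: float  # imputation quality metric (higher is better)
    p_value: float  # association test p-value


def best_per_variant(results):
    """Strategy 'one version per variant': keep the result with the highest imputation quality."""
    best = {}
    for r in results:
        current = best.get(r.variant)
        if current is None or r.quality > current.quality:
            best[r.variant] = r
    return list(best.values())


def significant(results, alpha=0.05):
    """Bonferroni correction (an assumption): compare each p-value with alpha / number of tests."""
    if not results:
        return []
    threshold = alpha / len(results)
    return [r for r in results if r.p_value < threshold]


# Made-up toy data: two variants, each imputed from two panels.
toy = [
    ImputedResult("v1", "HRC", quality=0.9, p_value=0.2),
    ImputedResult("v1", "HUNT", quality=0.8, p_value=0.001),
    ImputedResult("v2", "1000G", quality=0.7, p_value=0.5),
    ImputedResult("v2", "HUNT", quality=0.6, p_value=0.4),
]

# Strategy 'all variants from any panel': 4 tests, so a stricter threshold.
all_hits = significant(toy)
# Strategy 'best version only': 2 tests, so a looser threshold but fewer candidates.
best_hits = significant(best_per_variant(toy))
```

In this toy case, testing every imputed version pays a larger multiple-testing penalty but still picks up the `v1` signal from the lower-quality version, while the best-version-only strategy discards that version before testing. Bonferroni is likely conservative here, since tests of the same variant imputed from different panels are correlated.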

