Improving power of association tests using multiple sets of imputed genotypes from distributed reference panels

Zhou, Wei; Fritsche, Lars G.; Das, Sayantan; Zhang, He; Nielsen, Jonas B.; Holmen, Oddgeir L.; Chen, Jin; Lin, Maoxuan; Elvestad, Maiken B.; Hveem, Kristian; Abecasis, Goncalo R.; Kang, Hyun Min; Willer, Cristen J.

dc.contributor.author	Zhou, Wei
dc.contributor.author	Fritsche, Lars G.
dc.contributor.author	Das, Sayantan
dc.contributor.author	Zhang, He
dc.contributor.author	Nielsen, Jonas B.
dc.contributor.author	Holmen, Oddgeir L.
dc.contributor.author	Chen, Jin
dc.contributor.author	Lin, Maoxuan
dc.contributor.author	Elvestad, Maiken B.
dc.contributor.author	Hveem, Kristian
dc.contributor.author	Abecasis, Goncalo R.
dc.contributor.author	Kang, Hyun Min
dc.contributor.author	Willer, Cristen J.
dc.date.accessioned	2017-12-15T16:47:37Z
dc.date.available	2019-02-01T19:56:25Z	en
dc.date.issued	2017-12
dc.identifier.citation	Zhou, Wei; Fritsche, Lars G.; Das, Sayantan; Zhang, He; Nielsen, Jonas B.; Holmen, Oddgeir L.; Chen, Jin; Lin, Maoxuan; Elvestad, Maiken B.; Hveem, Kristian; Abecasis, Goncalo R.; Kang, Hyun Min; Willer, Cristen J. (2017). "Improving power of association tests using multiple sets of imputed genotypes from distributed reference panels." Genetic Epidemiology 41(8): 744-755.
dc.identifier.issn	0741-0395
dc.identifier.issn	1098-2272
dc.identifier.uri	https://hdl.handle.net/2027.42/139954
dc.description.abstract	The accuracy of genotype imputation depends on two factors: the sample size of the reference panel and the genetic similarity between the reference panel and the target samples. When consent restrictions prevent multiple reference panels from being merged, it is unclear how best to combine the separate imputation results to optimize the power of genetic association studies. We compared the imputation accuracy for 9,265 Norwegian genomes imputed from three reference panels: 1000 Genomes phase 3 (1000G), the Haplotype Reference Consortium (HRC), and a panel of 2,201 Norwegian participants from the population‐based Nord Trøndelag Health Study (HUNT), built from low‐pass genome sequencing. The population‐matched reference panel allowed imputation of more population‐specific variants at lower frequencies (minor allele frequency (MAF) between 0.05% and 0.5%). The overall imputation accuracy from the population‐specific panel was substantially higher than that from 1000G and comparable with that from HRC, even though HRC is 15‐fold larger. These results recapitulate the value of population‐specific reference panels for genotype imputation. We also evaluated different strategies for using multiple sets of imputed genotypes to increase the power of association studies. Testing association for all variants imputed from any panel gave higher power to detect association than the alternative strategy of including only one version of each genetic variant, chosen as the version with the highest imputation quality metric. This was particularly true for lower‐frequency variants (MAF < 1%), even after adjusting for the additional multiple‐testing burden.
dc.publisher	Wiley Periodicals, Inc.
dc.subject.other	genotype imputation
dc.subject.other	multiple reference panels
dc.subject.other	GWAS
dc.subject.other	study power
dc.subject.other	population‐specific
dc.title	Improving power of association tests using multiple sets of imputed genotypes from distributed reference panels
dc.type	Article	en_US
dc.rights.robots	IndexNoFollow
dc.subject.hlbsecondlevel	Genetics
dc.subject.hlbsecondlevel	Molecular, Cellular and Developmental Biology
dc.subject.hlbsecondlevel	Biological Chemistry
dc.subject.hlbtoplevel	Health Sciences
dc.subject.hlbtoplevel	Science
dc.description.peerreviewed	Peer Reviewed
dc.description.bitstreamurl	https://deepblue.lib.umich.edu/bitstream/2027.42/139954/1/gepi22067_am.pdf
dc.description.bitstreamurl	https://deepblue.lib.umich.edu/bitstream/2027.42/139954/2/gepi22067.pdf
dc.identifier.doi	10.1002/gepi.22067
dc.identifier.source	Genetic Epidemiology
dc.owningcollname	Interdisciplinary and Peer-Reviewed

Files in this item

Name: gepi22067_am.pdf
Size: 1.799MB
Format: PDF

Name: gepi22067.pdf
Size: 799.0KB
Format: PDF
