Genotype Imputation in Genome‐Wide Association Studies

dc.contributor.author: Porcu, Eleonora
dc.contributor.author: Sanna, Serena
dc.contributor.author: Fuchsberger, Christian
dc.contributor.author: Fritsche, Lars G.
dc.date.accessioned: 2020-01-13T15:11:36Z
dc.date.available: 2020-01-13T15:11:36Z
dc.date.issued: 2013-07
dc.identifier.citation: Porcu, Eleonora; Sanna, Serena; Fuchsberger, Christian; Fritsche, Lars G. (2013). "Genotype Imputation in Genome‐Wide Association Studies." Current Protocols in Human Genetics 78(1): 1.25.1-1.25.14.
dc.identifier.issn: 1934-8266
dc.identifier.issn: 1934-8258
dc.identifier.uri: https://hdl.handle.net/2027.42/152850
dc.description.abstract: Imputation is an in silico method that can increase the power of association studies by inferring missing genotypes, harmonizing data sets for meta‐analyses, and increasing the overall number of markers available for association testing. This unit provides an introductory overview of the imputation method and describes a two‐step imputation approach that consists of the phasing of the study genotypes and the imputation of reference panel genotypes into the study haplotypes. Detailed steps for data preparation and quality control illustrate how to run the computationally intensive two‐step imputation with the high‐density reference panels of the 1000 Genomes Project, which currently integrates more than 39 million variants. Additionally, the influence of reference panel selection, input marker density, and imputation settings on imputation quality is demonstrated with a simulated data set to give insight into crucial points of successful genotype imputation. Curr. Protoc. Hum. Genet. 78:1.25.1‐1.25.14. © 2013 by John Wiley & Sons, Inc.
dc.publisher: Wiley Periodicals, Inc.
dc.subject.other: genotyping arrays
dc.subject.other: linkage disequilibrium
dc.subject.other: genome‐wide association studies
dc.subject.other: imputation
dc.subject.other: inference
dc.subject.other: 1000 Genomes Project
dc.subject.other: HapMap Project
dc.subject.other: rare variants
dc.title: Genotype Imputation in Genome‐Wide Association Studies
dc.type: Article
dc.rights.robots: IndexNoFollow
dc.subject.hlbsecondlevel: Genetics
dc.subject.hlbsecondlevel: Biological Chemistry
dc.subject.hlbsecondlevel: Molecular, Cellular and Developmental Biology
dc.subject.hlbtoplevel: Science
dc.subject.hlbtoplevel: Health Sciences
dc.description.peerreviewed: Peer Reviewed
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/152850/1/cphg0125.pdf
dc.identifier.doi: 10.1002/0471142905.hg0125s78
dc.identifier.source: Current Protocols in Human Genetics
dc.owningcollname: Interdisciplinary and Peer-Reviewed

