Genotype Imputation in Diverse Populations: Empirical and Theoretical Approaches.

dc.contributor.author: Huang, Juichi (Lucy) [en_US]
dc.date.accessioned: 2012-01-26T20:06:45Z
dc.date.available: NO_RESTRICTION [en_US]
dc.date.available: 2012-01-26T20:06:45Z
dc.date.issued: 2011 [en_US]
dc.date.submitted: 2011 [en_US]
dc.identifier.uri: https://hdl.handle.net/2027.42/89807
dc.description.abstract: Genome-wide association (GWA) studies, in which dense genotypes in a large sample of individuals are tested for disease associations, represent a powerful approach for uncovering disease-susceptibility genes. Genotype imputation is a statistical procedure that enables evaluation of disease associations at markers beyond those experimentally measured, by using chromosomal stretches shared between study and reference individuals to infer unmeasured genotypes in GWA samples. Crucial to the success of imputation procedures is the representation of GWA samples in reference datasets that contain “template” sequences from which the unmeasured genotypes are inferred. In this dissertation, I study the design of reference datasets for use in genetic studies in diverse human populations. First, I devise a mixture approach for selecting panels of reference data. Using genotype data from 29 worldwide populations, I show that nearly all populations benefit from the mixture approach in that the mixture approach reduces imputation error. Focusing on African populations whose genotypes are particularly difficult to impute, I investigate haplotype variation and imputation in Africa. Using various statistics on haplotype variation to explain variation in imputation accuracy, I find that simple statistics, such as Fst, which measure genetic distance between study and reference populations, are useful metrics for guiding the selection of reference panels. Next, I quantify the increase in the minimal sample size, due to imperfect imputation, that would be required to provide the same level of statistical evidence of disease predisposition for genetic variants that are imputed rather than experimentally measured. Finally, I develop a coalescent model for evaluating imputation accuracy. Under this model, use of reference sequences selected based on observed genetic similarity to a study sequence targeted for imputation produces higher imputation accuracy than use of reference sequences selected based on population of origin. This result suggests a reference-selection strategy that chooses template sequences from multiple populations, including the target population itself. Together, results from this dissertation can inform study design for future GWA studies. In particular, they can facilitate the design of reference datasets for use in imputation-based studies, thereby improving the search for genetic determinants that affect human health in populations worldwide. [en_US]
dc.language.iso: en_US [en_US]
dc.subject: Genotype Imputation [en_US]
dc.subject: Genome-wide Association Study [en_US]
dc.title: Genotype Imputation in Diverse Populations: Empirical and Theoretical Approaches. [en_US]
dc.type: Thesis [en_US]
dc.description.thesisdegreename: PhD [en_US]
dc.description.thesisdegreediscipline: Bioinformatics [en_US]
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies [en_US]
dc.contributor.committeemember: Rosenberg, Noah A. [en_US]
dc.contributor.committeemember: Zoellner, Sebastian K. [en_US]
dc.contributor.committeemember: Boehnke, Michael Lee [en_US]
dc.contributor.committeemember: Gruber, Stephen B. [en_US]
dc.contributor.committeemember: Li, Jun [en_US]
dc.subject.hlbsecondlevel: Genetics [en_US]
dc.subject.hlbtoplevel: Health Sciences [en_US]
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/89807/1/hlucy_1.pdf
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

