Statistical Methods for Analyzing Human Genetic Variation in Diverse Populations.
dc.contributor.author | Wang, Chaolong | en_US |
dc.date.accessioned | 2013-02-04T18:04:56Z | |
dc.date.available | NO_RESTRICTION | en_US |
dc.date.available | 2013-02-04T18:04:56Z | |
dc.date.issued | 2012 | en_US |
dc.date.submitted | 2012 | en_US |
dc.identifier.uri | https://hdl.handle.net/2027.42/96024 | |
dc.description.abstract | The recent expansion of genetic datasets in diverse populations has allowed researchers to investigate human genetic structure and evolutionary history with unprecedented resolution. The huge amount of data also poses new statistical challenges, in both quality control and data analysis. In this dissertation, I develop statistical methods to address some challenges arising from recent population-genetic studies, and apply the methods to study the geographic structure of human genetic variation. First, I develop a method to correct for allelic dropout, a common source of genotyping error in microsatellite data. Traditional solutions for allelic dropout often require replicate genotyping, which is costly and often impossible in population-genetic studies. To address this problem, I propose a maximum likelihood approach to estimate dropout rates from nonreplicated microsatellite genotypes. Based on simulations and empirical data, I show that this method is both accurate and fairly robust to some violations of model assumptions. Next, I introduce a Procrustes analysis approach to compare spatial maps of genetic variation. Multivariate techniques, such as principal components analysis (PCA), have been widely used to summarize population structure, typically in two-dimensional maps, which often resemble the geographic maps of sampling locations. Using the Procrustes approach, I quantitatively demonstrate that genetic coordinates based on SNPs and CNVs are similar to each other, and are highly concordant with the geographic coordinates. Finally, applying PCA and Procrustes analysis to SNP data from worldwide populations, I perform a systematic study to compare genes and geography across the globe. By considering examples in different regions, I find that significant similarity between genes and geography exists in general. Further, the similarity is highest in Asia and, once isolated populations have been removed, in Sub-Saharan Africa. The results provide a quantitative assessment of the geographic structure of human genetic variation worldwide. In summary, this dissertation contributes both statistical tools for analyzing large-scale genetic data and biological insights into the spatial patterns of human genetic variation. Results from this dissertation provide a basis for evaluating the role of geography in giving rise to human population structure, and can facilitate statistical methods for inferring individual geographic origin from genetic variation. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | Allelic Dropout | en_US |
dc.subject | EM Algorithm | en_US |
dc.subject | Genetic Variation | en_US |
dc.subject | Population Structure | en_US |
dc.subject | Principal Components Analysis | en_US |
dc.subject | Procrustes Analysis | en_US |
dc.title | Statistical Methods for Analyzing Human Genetic Variation in Diverse Populations. | en_US |
dc.type | Thesis | en_US |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Bioinformatics | en_US |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | en_US |
dc.contributor.committeemember | Boehnke, Michael Lee | en_US |
dc.contributor.committeemember | Rosenberg, Noah A. | en_US |
dc.contributor.committeemember | Zoellner, Sebastian K. | en_US |
dc.contributor.committeemember | Zhu, Ji | en_US |
dc.contributor.committeemember | Burmeister, Margit | en_US |
dc.subject.hlbsecondlevel | Genetics | en_US |
dc.subject.hlbtoplevel | Science | en_US |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/96024/1/chaolong_1.pdf | |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |