Methods for statistical and population genetics analyses.
dc.contributor.author | Gopalakrishnan, Shyam S. | en_US |
dc.date.accessioned | 2012-01-26T19:59:26Z | |
dc.date.available | NO_RESTRICTION | en_US |
dc.date.available | 2012-01-26T19:59:26Z | |
dc.date.issued | 2011 | en_US |
dc.date.submitted | | en_US |
dc.identifier.uri | https://hdl.handle.net/2027.42/89609 | |
dc.description.abstract | Genetics studies have advanced rapidly, from candidate region studies to genome-wide association studies (GWAS) and next-generation sequencing projects. The emergence of new technologies has brought with it an array of statistical challenges. In this thesis, we propose methods for statistical and population genetics in our effort to better understand the underlying architecture of our genomes. GWAS rely on indirect association, testing a reduced set of representative markers (tagSNPs) instead of all variants present in the genome. In the first chapter, we propose a graph-based method to select the optimal set of tagSNPs. We apply our method to chromosome-wide data and show that it outperforms the widely used greedy approach, selecting fewer tagSNPs while maintaining high correlation with non-tagSNP variants. Alignment to a reference sequence is an integral step in many sequencing studies. Multiply mapped reads, which align to multiple locations in the reference, are discarded from downstream analyses, resulting in a loss of information. We develop a Gibbs sampling approach to identify the true location of multiply mapped reads obtained from the alignment step. We validate our method using simulation studies. We use the improvement in variant discovery to quantify the effect of including multiply mapped reads in downstream analyses. In the third chapter, we explore the feasibility of admixture mapping, a population genetics tool, in identifying regions harboring rare susceptibility variants. We compare the power of admixture mapping to that of single marker association studies in detecting causal regions. We find that admixture mapping performs better over a wide range of risk allele frequencies. The site frequency spectrum (SFS) is an important summary statistic in population genetics, encompassing information on selection and demographic history. We show that estimates of the SFS obtained from genotype calling methods underestimate the number of rare variants, especially singletons and doubletons. We derive a maximum likelihood estimate for the SFS. We demonstrate, using both simulated and real data examples, that our method performs better than SFS estimates obtained from genotype calling algorithms. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | Statistical Genetics | en_US |
dc.subject | Population Genetics | en_US |
dc.subject | Admixture Mapping | en_US |
dc.subject | Site Frequency Spectrum Estimation | en_US |
dc.subject | TagSNP Selection | en_US |
dc.subject | Next Generation Sequence Read Remapping | en_US |
dc.title | Methods for statistical and population genetics analyses. | en_US |
dc.type | Thesis | en_US |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Biostatistics | en_US |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | en_US |
dc.contributor.committeemember | Zoellner, Sebastian K. | en_US |
dc.contributor.committeemember | Boehnke, Michael Lee | en_US |
dc.contributor.committeemember | Li, Jun | en_US |
dc.contributor.committeemember | Qin, Zhaohui | en_US |
dc.contributor.committeemember | Rosenberg, Noah A. | en_US |
dc.subject.hlbsecondlevel | Genetics | en_US |
dc.subject.hlbsecondlevel | Statistics and Numeric Data | en_US |
dc.subject.hlbtoplevel | Health Sciences | en_US |
dc.subject.hlbtoplevel | Science | en_US |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/89609/1/gopalakr_1.pdf | |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |