Show simple item record

Methods for statistical and population genetics analyses.

dc.contributor.authorGopalakrishnan, Shyam S.en_US
dc.date.accessioned2012-01-26T19:59:26Z
dc.date.availableNO_RESTRICTIONen_US
dc.date.available2012-01-26T19:59:26Z
dc.date.issued2011en_US
dc.date.submitteden_US
dc.identifier.urihttps://hdl.handle.net/2027.42/89609
dc.description.abstractGenetics studies have advanced rapidly, from candidate region studies to genome wide association studies (GWAS) and next generation sequencing projects. The emergence of new technologies has brought with it an array of statistical challenges. In this thesis, we propose methods for statistical and population genetics in our effort to better understand the underlying architecture of our genomes. GWAS rely on indirect association, testing a reduced set of representative markers (tagSNPs) instead of all variants present in the genome. In the first chapter, we propose a graph-based method to select the optimal set of tagSNPs. We apply our method to chromosome-wide data and show that it outperforms the widely used greedy approach, selecting fewer tagSNPs while maintaining high correlation with non-tagSNPs variants. Alignment to a reference sequence is an integral step in many sequencing studies. Multiply mapped reads, reads that align to multiple locations in the reference, are discarded from downstream analyses, resulting in a loss of information. We develop a Gibbs sampling approach to identify the true location of multiply mapped reads obtained from the alignment step. We validate our method using simulation studies. We use the improvement in variant discovery to quantify the effect of including multiply mapped reads in downstream analyses. In the third chapter, we explore the feasibility of admixture mapping, a population genetics tool, in identifying regions harboring rare susceptibility variants. We compare the power of admixture mapping to single marker association studies in detecting causal regions. We find that admixture mapping performs better over a wide range of risk allele frequencies. The site frequency spectrum (SFS) is an important summary statistic in population genetics, encompassing information on selection and demographic history. We show that estimates of the SFS obtained from genotype calling methods underestimate the number of rare variants, especially singletons and doubletons. We derive a maximum likelihood estimate for the SFS. We demonstrate that our method performs better than SFS obtained from genotype calling algorithms using both simulated and real data examples.en_US
dc.language.isoen_USen_US
dc.subjectStatistical Geneticsen_US
dc.subjectPopulation Geneticsen_US
dc.subjectAdmixture Mappingen_US
dc.subjectSite Frequency Spectrum Estimationen_US
dc.subjectTagSNP Selectionen_US
dc.subjectNext Generation Sequence Read Remappingen_US
dc.titleMethods for statistical and population genetics analyses.en_US
dc.typeThesisen_US
dc.description.thesisdegreenamePhDen_US
dc.description.thesisdegreedisciplineBiostatisticsen_US
dc.description.thesisdegreegrantorUniversity of Michigan, Horace H. Rackham School of Graduate Studiesen_US
dc.contributor.committeememberZoellner, Sebastian K.en_US
dc.contributor.committeememberBoehnke, Michael Leeen_US
dc.contributor.committeememberLi, Junen_US
dc.contributor.committeememberQin, Zhaohuien_US
dc.contributor.committeememberRosenberg, Noah A.en_US
dc.subject.hlbsecondlevelGeneticsen_US
dc.subject.hlbsecondlevelStatistics and Numeric Dataen_US
dc.subject.hlbtoplevelHealth Sciencesen_US
dc.subject.hlbtoplevelScienceen_US
dc.description.bitstreamurlhttp://deepblue.lib.umich.edu/bitstream/2027.42/89609/1/gopalakr_1.pdf
dc.owningcollnameDissertations and Theses (Ph.D. and Master's)


Files in this item

Show simple item record

Remediation of Harmful Language

The University of Michigan Library aims to describe library materials in a way that respects the people and communities who create, use, and are represented in our collections. Report harmful or offensive language in catalog records, finding aids, or elsewhere in our collections anonymously through our metadata feedback form. More information at Remediation of Harmful Language.

Accessibility

If you are unable to use this file in its current format, please select the Contact Us link and we can modify it to make it more accessible to you.