Methods for statistical and population genetics analyses.

Gopalakrishnan, Shyam S.

dc.contributor.author	Gopalakrishnan, Shyam S.	en_US
dc.date.accessioned	2012-01-26T19:59:26Z
dc.date.available	NO_RESTRICTION	en_US
dc.date.available	2012-01-26T19:59:26Z
dc.date.issued	2011	en_US
dc.date.submitted		en_US
dc.identifier.uri	https://hdl.handle.net/2027.42/89609
dc.description.abstract	Genetics studies have advanced rapidly, from candidate region studies to genome-wide association studies (GWAS) and next-generation sequencing projects. The emergence of new technologies has brought with it an array of statistical challenges. In this thesis, we propose methods for statistical and population genetics in our effort to better understand the underlying architecture of our genomes. GWAS rely on indirect association, testing a reduced set of representative markers (tagSNPs) instead of all variants present in the genome. In the first chapter, we propose a graph-based method to select the optimal set of tagSNPs. We apply our method to chromosome-wide data and show that it outperforms the widely used greedy approach, selecting fewer tagSNPs while maintaining high correlation with non-tagSNP variants. Alignment to a reference sequence is an integral step in many sequencing studies. Multiply mapped reads, reads that align to multiple locations in the reference, are discarded from downstream analyses, resulting in a loss of information. We develop a Gibbs sampling approach to identify the true location of multiply mapped reads obtained from the alignment step. We validate our method using simulation studies. We use the improvement in variant discovery to quantify the effect of including multiply mapped reads in downstream analyses. In the third chapter, we explore the feasibility of admixture mapping, a population genetics tool, in identifying regions harboring rare susceptibility variants. We compare the power of admixture mapping to single marker association studies in detecting causal regions. We find that admixture mapping performs better over a wide range of risk allele frequencies. The site frequency spectrum (SFS) is an important summary statistic in population genetics, encompassing information on selection and demographic history. We show that estimates of the SFS obtained from genotype calling methods underestimate the number of rare variants, especially singletons and doubletons. We derive a maximum likelihood estimate for the SFS. We demonstrate that our method performs better than SFS estimates obtained from genotype calling algorithms, using both simulated and real data examples.	en_US
dc.language.iso	en_US	en_US
dc.subject	Statistical Genetics	en_US
dc.subject	Population Genetics	en_US
dc.subject	Admixture Mapping	en_US
dc.subject	Site Frequency Spectrum Estimation	en_US
dc.subject	TagSNP Selection	en_US
dc.subject	Next Generation Sequence Read Remapping	en_US
dc.title	Methods for statistical and population genetics analyses.	en_US
dc.type	Thesis	en_US
dc.description.thesisdegreename	PhD	en_US
dc.description.thesisdegreediscipline	Biostatistics	en_US
dc.description.thesisdegreegrantor	University of Michigan, Horace H. Rackham School of Graduate Studies	en_US
dc.contributor.committeemember	Zoellner, Sebastian K.	en_US
dc.contributor.committeemember	Boehnke, Michael Lee	en_US
dc.contributor.committeemember	Li, Jun	en_US
dc.contributor.committeemember	Qin, Zhaohui	en_US
dc.contributor.committeemember	Rosenberg, Noah A.	en_US
dc.subject.hlbsecondlevel	Genetics	en_US
dc.subject.hlbsecondlevel	Statistics and Numeric Data	en_US
dc.subject.hlbtoplevel	Health Sciences	en_US
dc.subject.hlbtoplevel	Science	en_US
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/89609/1/gopalakr_1.pdf
dc.owningcollname	Dissertations and Theses (Ph.D. and Master's)

Files in this item

Name: gopalakr_1.pdf
Size: 863.7KB
Format: PDF
