Bayesian Modeling for High Throughput Genomic Data.

Hu, Ming

dc.contributor.author	Hu, Ming	en_US
dc.date.accessioned	2011-01-18T16:20:50Z
dc.date.available	NO_RESTRICTION	en_US
dc.date.available	2011-01-18T16:20:50Z
dc.date.issued	2010	en_US
dc.date.submitted		en_US
dc.identifier.uri	https://hdl.handle.net/2027.42/78939
dc.description.abstract	The explosion of high throughput genomic data in recent years has already altered our view of the extent and complexity of biology. Technologically specific features, heterogeneous data structures and massive sample sizes present great challenges and opportunities to develop novel statistical methodologies in computational biology. This dissertation presents three Bayesian modeling methods in high throughput genomic data analysis. In chapter 2, we develop a model-based gene expression query algorithm built under the Bayesian model selection framework. This algorithm is capable of detecting co-expression profiles under a subset of samples/experimental conditions. In addition, it allows linearly transformed expression patterns to be recognized and is robust in the presence of sporadic outliers in the data. Our simulation studies suggest that this method outperforms existing query tools. When we apply this new method to the Escherichia coli microarray compendium data, it identifies a majority of known regulons, as well as novel potential target genes of numerous key transcription factors. In chapter 3, we introduce a novel computational algorithm named Hybrid Motif Sampler (HMS), specifically designed for transcription factor binding sites (TFBS) motif discovery in ChIP-Seq data. HMS incorporates sequencing depth information to aid motif identification, allows intra-motif dependency to describe more accurately the underlying motif pattern and combines stochastic sampling and deterministic search to accelerate the computation process. Simulation studies demonstrate favorable performance of HMS compared to other existing methods. When applying HMS to real ChIP-Seq datasets, we find that the accuracy of existing TFBS motif patterns can be significantly improved. In chapter 4, we propose a spatial Poisson regression model to provide a portrait of base-level sequencing depth in RNA-Seq data. 
The model uses two random effects to explain the spatial correlation and the non-spatial variation, and incorporates GC content effects into the mean structure for a better fit. Both the simulation study and the real data analysis demonstrate that this method can capture local genomic features that affect coverage depth and therefore offers improved quantification of the true underlying expression levels. The research in this dissertation demonstrates that Bayesian modeling methods have achieved great success and have the potential to accelerate biomedical research.	en_US
dc.format.extent	6538053 bytes
dc.format.extent	1373 bytes
dc.format.mimetype	application/pdf
dc.format.mimetype	text/plain
dc.language.iso	en_US	en_US
dc.subject	Bayesian Modeling	en_US
dc.subject	High Throughput Genomic Data	en_US
dc.subject	MCMC	en_US
dc.subject	ChIP-Seq	en_US
dc.subject	RNA-Seq	en_US
dc.subject	Microarray	en_US
dc.title	Bayesian Modeling for High Throughput Genomic Data.	en_US
dc.type	Thesis	en_US
dc.description.thesisdegreename	PhD	en_US
dc.description.thesisdegreediscipline	Biostatistics	en_US
dc.description.thesisdegreegrantor	University of Michigan, Horace H. Rackham School of Graduate Studies	en_US
dc.contributor.committeemember	Qin, Zhaohui	en_US
dc.contributor.committeemember	Abecasis, Goncalo	en_US
dc.contributor.committeemember	Johnson, Timothy D.	en_US
dc.contributor.committeemember	Kumar, Chandan	en_US
dc.contributor.committeemember	Lin, Jiandie	en_US
dc.contributor.committeemember	Taylor, Jeremy M.	en_US
dc.subject.hlbsecondlevel	Genetics	en_US
dc.subject.hlbsecondlevel	Public Health	en_US
dc.subject.hlbsecondlevel	Statistics and Numeric Data	en_US
dc.subject.hlbtoplevel	Health Sciences	en_US
dc.subject.hlbtoplevel	Science	en_US
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/78939/1/hming_1.pdf
dc.owningcollname	Dissertations and Theses (Ph.D. and Master's)

Files in this item

Name: hming_1.pdf
Size: 6.235MB
Format: PDF
