Reconstructing signaling pathways from high throughput data.

dc.contributor.author: Zhu, Dongxiao
dc.contributor.advisor: Hero, Alfred O., III
dc.date.accessioned: 2016-08-30T16:05:18Z
dc.date.available: 2016-08-30T16:05:18Z
dc.date.issued: 2006
dc.identifier.uri: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:3224798
dc.identifier.uri: https://hdl.handle.net/2027.42/125940
dc.description.abstract: Many bioinformatics problems can be tackled from a fresh angle offered by the network perspective. Taking into account the network constraints on gene interaction, we propose a series of logically coherent approaches to reconstruct signaling pathways from high throughput expression profiling data. These approaches proceed in three consecutive steps: co-expression network construction with controlled biological and statistical significance, network-constrained clustering, and reconstruction of the order of pathway components. The first step relies on detecting pairwise co-expression of genes. We attack the problem from both frequentist and Bayesian perspectives. We designed and implemented a frequentist two-stage co-expression detection algorithm that controls both the statistical significance (False Discovery Rate, FDR) and the biological significance (Minimum Acceptable Strength, MAS) of the discovered co-expressions. To regularize the variance of the correlation estimates in small-sample scenarios, we also designed and implemented a Bayesian hierarchical model in which correlation parameters are assumed to be exchangeable and sampled from a parental Gaussian distribution. Using simulated data and the galactose metabolism data, we demonstrated the advantages of our approaches and compared the differences among them. The second problem considered is distance-based clustering that accounts for network constraints extracted from the Giant Connected Component (GCC) of the network discovered from the data. The clustering is performed using a hybrid distance matrix composed of the direct distance between adjacent genes and the shortest-path distance between non-adjacent genes in the network. The third problem considered is the reconstruction of the order of pathway components. We applied a first-order Markov model, originally developed for and applied to a network tomography problem in telecommunication networks, to reconstruct three well-known signaling pathways from unordered pathway components. We suggest that the methods proposed here can also be applied to other high throughput data analysis problems.
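
The hybrid distance matrix described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `hybrid_distance`, the input layout (a full pairwise distance matrix plus an edge list for the connected component), and the toy distances are all assumptions made for illustration. Adjacent genes keep their direct distance; non-adjacent genes receive the length of the shortest path through the network, computed here with the Floyd-Warshall algorithm.

```python
import math

def hybrid_distance(direct, edges):
    """Build a hybrid distance matrix (illustrative sketch, hypothetical API).

    direct: n x n symmetric list of lists of pairwise distances.
    edges:  iterable of (i, j) index pairs that are adjacent in the network,
            assumed to form a single connected component.
    """
    n = len(direct)
    adjacent = set()
    for i, j in edges:
        adjacent.add((i, j))
        adjacent.add((j, i))

    # Start from network edges only; every other pair is unreachable for now.
    shortest = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        shortest[i][i] = 0.0
    for i, j in adjacent:
        shortest[i][j] = direct[i][j]

    # Floyd-Warshall: relax every pair through each intermediate gene k.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = shortest[i][k] + shortest[k][j]
                if via_k < shortest[i][j]:
                    shortest[i][j] = via_k

    # Adjacent pairs keep the direct distance; the rest use the path length.
    return [
        [direct[i][j] if (i, j) in adjacent else shortest[i][j] for j in range(n)]
        for i in range(n)
    ]

# Toy example: four genes connected as a chain 0 - 1 - 2 - 3.
# Distances for non-adjacent pairs (0.9 here) are ignored by the sketch.
toy_direct = [
    [0.0,  0.25, 0.9,  0.9],
    [0.25, 0.0,  0.5,  0.9],
    [0.9,  0.5,  0.0,  0.125],
    [0.9,  0.9,  0.125, 0.0],
]
toy_edges = [(0, 1), (1, 2), (2, 3)]
toy_hybrid = hybrid_distance(toy_direct, toy_edges)
```

The resulting matrix can be passed to any distance-based clustering routine. Floyd-Warshall is chosen only for brevity; for a network with many genes, a per-source shortest-path search would scale better.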
dc.format.extent: 170 p.
dc.language: English
dc.language.iso: EN
dc.subject: Clustering
dc.subject: Expression Profiling
dc.subject: High-throughput Data
dc.subject: Reconstructing
dc.subject: Signaling Pathways
dc.title: Reconstructing signaling pathways from high throughput data.
dc.type: Thesis
dc.description.thesisdegreename: PhD
dc.description.thesisdegreediscipline: Bioinformatics
dc.description.thesisdegreediscipline: Biological Sciences
dc.description.thesisdegreediscipline: Biostatistics
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/125940/2/3224798.pdf
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

