Developing and Application of Statistical Algorithms for High-Dimensional Biological Data Analysis

dc.contributor.author: Senbabaoglu, Yasin [en_US]
dc.date.accessioned: 2012-10-12T15:25:40Z
dc.date.available: NO_RESTRICTION [en_US]
dc.date.available: 2012-10-12T15:25:40Z
dc.date.issued: 2012 [en_US]
dc.date.submitted: 2012 [en_US]
dc.identifier.uri: https://hdl.handle.net/2027.42/94038
dc.description.abstract: Various high-throughput technologies have fueled advances in biomedical research in the last decade. Two typical examples are gene expression and genomic hybridization microarrays, which quantify RNA and DNA levels, respectively. High-dimensional data sets generated by these technologies have presented novel opportunities to discover relationships not only among interrogating probes (i.e., genes) but also among interrogated specimens (i.e., samples). At the same time, however, the necessity to model the variability within and between different high-throughput platforms has created novel statistical challenges. In this thesis, I address these opportunities and challenges with three algorithms. First, I present DynBoost, a new method to infer gene-gene dependence relationships and nonlinear dynamics in gene regulatory networks. DynBoost is a flexible boosting algorithm that shares features with L2-boosting and randomization-based algorithms to perform the tasks of parameter learning and network inference. The performance of the proposed algorithm was evaluated on a number of benchmark data sets from the DREAM3 challenge, and the results strongly indicated that it outperformed existing approaches. Second, I revisit consensus clustering (CC) and some other clustering methods in the context of unsupervised sample subtype discovery. I show that many unsupervised partitioning methods are able to divide homogeneous data into pre-specified numbers of clusters, and that CC is able to show apparent stability of such chance partitioning of random data. I conclude that CC is a powerful tool for minimizing false negatives in the presence of genuine structure, but that it can lead to false positives in the exploratory phase of many studies if the implementation and inference are not carried out with caution and in line with prudent practices.
Lastly, I present MPCBS, a new method that integrates DNA copy number analysis across different platforms by pooling statistical evidence during segmentation. By comparing the integrated analysis of Affymetrix and Illumina SNP array data with Agilent and fosmid clone end-sequencing results on 8 HapMap samples, I show that MPCBS achieves improved spatial resolution and detection power and provides a natural consensus across platforms. [en_US]
dc.language.iso: en_US [en_US]
dc.subject: Consensus Clustering [en_US]
dc.subject: Unsupervised Class Discovery [en_US]
dc.subject: Reverse-engineering Gene Regulatory Networks [en_US]
dc.subject: DNA Copy Number Estimation [en_US]
dc.subject: Operator-valued Kernels [en_US]
dc.subject: TCGA Glioblastoma Multiforme [en_US]
dc.title: Developing and Application of Statistical Algorithms for High-Dimensional Biological Data Analysis [en_US]
dc.type: Thesis [en_US]
dc.description.thesisdegreename: PhD [en_US]
dc.description.thesisdegreediscipline: Bioinformatics [en_US]
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies [en_US]
dc.contributor.committeemember: Li, Jun [en_US]
dc.contributor.committeemember: Michailidis, George [en_US]
dc.contributor.committeemember: Burns Jr., Daniel M. [en_US]
dc.contributor.committeemember: Sartor, Maureen A. [en_US]
dc.contributor.committeemember: D'alche-Buc, Florence [en_US]
dc.subject.hlbsecondlevel: Computer Science [en_US]
dc.subject.hlbsecondlevel: Molecular, Cellular and Developmental Biology [en_US]
dc.subject.hlbsecondlevel: Science (General) [en_US]
dc.subject.hlbsecondlevel: Statistics and Numeric Data [en_US]
dc.subject.hlbtoplevel: Engineering [en_US]
dc.subject.hlbtoplevel: Health Sciences [en_US]
dc.subject.hlbtoplevel: Science [en_US]
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/94038/1/yasinsen_1.pdf
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

