Machine Learning for Flow Cytometry Data Analysis.

Lee, Gyemin

dc.contributor.author	Lee, Gyemin	en_US
dc.date.accessioned	2012-01-26T20:07:24Z
dc.date.available	NO_RESTRICTION	en_US
dc.date.available	2012-01-26T20:07:24Z
dc.date.issued	2011	en_US
dc.date.submitted		en_US
dc.identifier.uri	https://hdl.handle.net/2027.42/89818
dc.description.abstract	This thesis concerns the problem of automatic flow cytometry data analysis. Flow cytometry is a technique for rapid cell analysis that is widely used in biomedical and clinical laboratories. Quantitative measurements from a flow cytometer provide rich information about various physical and chemical characteristics of a large number of cells. In clinical applications, flow cytometry data is visualized on a sequence of two-dimensional scatter plots and analyzed through a manual process called “gating”. This conventional analysis process requires a large amount of time and labor and is highly subjective and inefficient. In this thesis, we present novel machine learning methods for flow cytometry data analysis to address these issues. We begin with a method for generating a high-dimensional flow cytometry dataset from multiple low-dimensional datasets. We present an imputation algorithm based on clustering and show that it improves upon a simple nearest-neighbor-based approach, which often induces spurious clusters in the imputed data. This technique enables the analysis of multi-dimensional flow cytometry data beyond the fundamental measurement limits of instruments. We then present two machine learning methods for automatic gating problems. Gating is the process of identifying interesting subsets of cell populations. Pathologists make clinical decisions by inspecting the results of gating. Unfortunately, this process is performed manually in most clinical settings and poses many challenges for high-throughput analysis. The first approach is an unsupervised learning technique based on multivariate mixture models. Since measurements from a flow cytometer are often censored and truncated, standard model-fitting algorithms can introduce biases and lead to poor gating results. We propose novel algorithms for fitting multivariate Gaussian mixture models to data that is truncated, censored, or both truncated and censored.
Our second approach is a transfer learning technique combined with the low-density separation principle. Unlike conventional unsupervised learning approaches, this method can leverage existing datasets previously gated by domain experts to automatically gate a new flow cytometry dataset. Moreover, the proposed algorithm can adaptively account for biological variations across multiple datasets. We demonstrate these techniques on clinical flow cytometry data and evaluate their effectiveness.	en_US
dc.language.iso	en_US	en_US
dc.subject	Machine Learning	en_US
dc.subject	Flow Cytometry	en_US
dc.subject	Support Vector Machine	en_US
dc.subject	Mixture Models	en_US
dc.subject	EM Algorithms	en_US
dc.subject	Statistical File Matching	en_US
dc.title	Machine Learning for Flow Cytometry Data Analysis.	en_US
dc.type	Thesis	en_US
dc.description.thesisdegreename	PhD	en_US
dc.description.thesisdegreediscipline	Electrical Engineering: Systems	en_US
dc.description.thesisdegreegrantor	University of Michigan, Horace H. Rackham School of Graduate Studies	en_US
dc.contributor.committeemember	Scott, Clayton D.	en_US
dc.contributor.committeemember	Fessler, Jeffrey A.	en_US
dc.contributor.committeemember	Finn, William G.	en_US
dc.contributor.committeemember	Hero III, Alfred O.	en_US
dc.contributor.committeemember	Nguyen, Long	en_US
dc.subject.hlbsecondlevel	Electrical Engineering	en_US
dc.subject.hlbtoplevel	Engineering	en_US
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/89818/1/gyemin_1.pdf
dc.owningcollname	Dissertations and Theses (Ph.D. and Master's)

Files in this item

Name: gyemin_1.pdf
Size: 1.736MB
Format: PDF
