Unsupervised Bayesian linear unmixing of gene expression microarrays

Bazot, Cécile; Dobigeon, Nicolas; Tourneret, Jean-Yves; Zaas, Aimee K; Ginsburg, Geoffrey S; O Hero III, Alfred

dc.contributor.author	Bazot, Cécile
dc.contributor.author	Dobigeon, Nicolas
dc.contributor.author	Tourneret, Jean-Yves
dc.contributor.author	Zaas, Aimee K
dc.contributor.author	Ginsburg, Geoffrey S
dc.contributor.author	O Hero III, Alfred
dc.date.accessioned	2015-08-07T17:39:46Z
dc.date.available	2015-08-07T17:39:46Z
dc.date.issued	2013-03-19
dc.identifier.citation	BMC Bioinformatics. 2013 Mar 19;14(1):99
dc.identifier.uri	https://hdl.handle.net/2027.42/112688	en_US
dc.description.abstract	Background: This paper introduces a new constrained model and the corresponding algorithm, called unsupervised Bayesian linear unmixing (uBLU), to identify biological signatures from high-dimensional assays such as gene expression microarrays. uBLU is based on a Bayesian model in which each data sample is represented as an additive mixture of random positive gene signatures, called factors, with random positive mixing coefficients, called factor scores, that specify the relative contribution of each signature to a specific sample. The distinguishing feature of uBLU is that it constrains the factor loadings to be non-negative and the factor scores to be probability distributions over the factors. It also provides estimates of the number of factors. A Gibbs sampling strategy is adopted to generate random samples from the posterior distribution of the factors, factor scores, and number of factors. These samples are then used to estimate all the unknown parameters.
Results: First, the proposed uBLU method is applied to several simulated datasets with known ground truth and compared with previous factor decomposition methods, such as principal component analysis (PCA), non-negative matrix factorization (NMF), Bayesian factor regression modeling (BFRM), and the gradient-based algorithm for general matrix factorization (GB-GMF). Second, we illustrate the application of uBLU to a real, time-evolving gene expression dataset from a recent viral challenge study in which individuals were inoculated with influenza A/H3N2/Wisconsin. We show that the uBLU method significantly outperforms the other methods on the simulated and real datasets considered here.
Conclusions: The results obtained on synthetic and real data illustrate the accuracy of the proposed uBLU method when compared to other factor decomposition methods from the literature (PCA, NMF, BFRM, and GB-GMF). The uBLU method identifies an inflammatory component closely associated with clinical symptom scores collected during the study. Using a constrained model allows recovery of all the inflammatory genes in a single factor.
dc.title	Unsupervised Bayesian linear unmixing of gene expression microarrays
dc.type	Article	en_US
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/112688/1/12859_2012_Article_5920.pdf
dc.identifier.doi	10.1186/1471-2105-14-99	en_US
dc.language.rfc3066	en
dc.rights.holder	Bazot et al.; licensee BioMed Central Ltd.
dc.date.updated	2015-08-07T17:39:47Z
dc.owningcollname	Interdisciplinary and Peer-Reviewed
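
The constrained mixing model summarized in the abstract can be illustrated with a short simulation. The sketch below is not the authors' uBLU implementation and does not perform Gibbs sampling; it only generates data that satisfy the two stated constraints (non-negative factor loadings, and factor scores that form a probability distribution over the factors for each sample). The function name `simulate_mixture`, the dimensions, the gamma and Dirichlet sampling choices, and the additive Gaussian noise are all assumptions made for illustration.

```python
import random


def simulate_mixture(n_genes=50, n_samples=8, n_factors=3, noise_sd=0.05, seed=0):
    """Simulate data from a constrained linear mixing model (illustrative only)."""
    rng = random.Random(seed)

    # Non-negative factor loadings: one gene signature (column) per factor.
    # The gamma distribution is an arbitrary choice made for this sketch.
    loadings = [[rng.gammavariate(2.0, 1.0) for _ in range(n_factors)]
                for _ in range(n_genes)]

    # Factor scores: for each sample, non-negative weights that sum to one.
    # Normalizing independent gamma draws yields a Dirichlet-distributed vector.
    scores = []
    for _ in range(n_samples):
        draws = [rng.gammavariate(1.0, 1.0) for _ in range(n_factors)]
        total = sum(draws)
        scores.append([d / total for d in draws])

    # Each observed sample is the score-weighted sum of the signatures,
    # plus Gaussian noise (the noise model is an assumption of this sketch).
    data = [[sum(loadings[g][r] * scores[s][r] for r in range(n_factors))
             + rng.gauss(0.0, noise_sd)
             for s in range(n_samples)]
            for g in range(n_genes)]
    return loadings, scores, data


if __name__ == "__main__":
    loadings, scores, data = simulate_mixture()
    # Report the simulated matrix size: genes by samples.
    print(len(data), "genes x", len(data[0]), "samples")
```

An unmixing method such as uBLU works in the opposite direction: given only `data`, it estimates the loadings, the scores, and the number of factors.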

Files in this item

Name: 12859_2012_Article_5920.pdf
Size: 1.037MB
Format: PDF