A co-training algorithm for multi-view data with applications in data fusion
dc.contributor.author | Culp, Mark | en_US |
dc.contributor.author | Michailidis, George | en_US |
dc.date.accessioned | 2009-07-06T15:37:11Z | |
dc.date.available | 2010-08-02T17:56:56Z | en_US |
dc.date.issued | 2009-06 | en_US |
dc.identifier.citation | Culp, Mark; Michailidis, George (2009). "A co-training algorithm for multi-view data with applications in data fusion." Journal of Chemometrics 23(6): 294-303. <http://hdl.handle.net/2027.42/63041> | en_US |
dc.identifier.issn | 0886-9383 | en_US |
dc.identifier.issn | 1099-128X | en_US |
dc.identifier.uri | https://hdl.handle.net/2027.42/63041 | |
dc.description.abstract | In several scientific applications, data are generated from two or more diverse sources (views) with the goal of predicting an outcome of interest. Often it is the case that the outcome is not associated with any single view. However, the synergy of all measurements from each view may yield a more predictive classifier. For example, consider a drug discovery application in which individual molecules are described partially by several assay screens based on diverse profiles and partially by their chemical structural fingerprints. A common classification problem is to determine whether the molecule is associated with a particular disease. In this paper, a co-training algorithm is developed to utilize data from diverse sources to predict the common class variable. Novel enhancements for variable importance, robustness to a mislabeled class variable, and a technique to handle unbalanced classes are applied to the motivating data set, highlighting that the approach attains strong performance and provides useful diagnostics for data analytic purposes. In addition, comparisons to a framework with data fusion using partial least squares (PLS) are also assessed on real data. An R package for performing the proposed approach is provided as Supporting information. Copyright © 2003 John Wiley & Sons, Ltd. | en_US |
dc.format.extent | 38428 bytes | |
dc.format.extent | 3118 bytes | |
dc.format.mimetype | application/pdf | |
dc.format.mimetype | text/plain | |
dc.publisher | John Wiley & Sons, Ltd. | en_US |
dc.subject.other | Chemistry | en_US |
dc.subject.other | Analytical Chemistry and Spectroscopy | en_US |
dc.title | A co-training algorithm for multi-view data with applications in data fusion | en_US |
dc.type | Article | en_US |
dc.rights.robots | IndexNoFollow | en_US |
dc.subject.hlbsecondlevel | Chemical Engineering | en_US |
dc.subject.hlbsecondlevel | Chemistry | en_US |
dc.subject.hlbsecondlevel | Materials Science and Engineering | en_US |
dc.subject.hlbtoplevel | Engineering | en_US |
dc.subject.hlbtoplevel | Science | en_US |
dc.description.peerreviewed | Peer Reviewed | en_US |
dc.contributor.affiliationum | Department of Statistics, University of Michigan, Ann Arbor, MI, USA | en_US |
dc.contributor.affiliationother | Department of Statistics, West Virginia University, Morgantown, WV, USA | en_US |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/63041/1/cem_1233_sm_suppmaterial.pdf | |
dc.identifier.doi | 10.1002/cem.1233 | en_US |
dc.identifier.source | Journal of Chemometrics | en_US |
dc.owningcollname | Interdisciplinary and Peer-Reviewed |