
A co-training algorithm for multi-view data with applications in data fusion

dc.contributor.author: Culp, Mark [en_US]
dc.contributor.author: Michailidis, George [en_US]
dc.date.accessioned: 2009-07-06T15:37:11Z
dc.date.available: 2010-08-02T17:56:56Z [en_US]
dc.date.issued: 2009-06 [en_US]
dc.identifier.citation: Culp, Mark; Michailidis, George (2009). "A co-training algorithm for multi-view data with applications in data fusion." Journal of Chemometrics 23(6): 294-303. <http://hdl.handle.net/2027.42/63041> [en_US]
dc.identifier.issn: 0886-9383 [en_US]
dc.identifier.issn: 1099-128X [en_US]
dc.identifier.uri: https://hdl.handle.net/2027.42/63041
dc.description.abstract: In several scientific applications, data are generated from two or more diverse sources (views) with the goal of predicting an outcome of interest. Often it is the case that the outcome is not associated with any single view. However, the synergy of all measurements from each view may yield a more predictive classifier. For example, consider a drug discovery application in which individual molecules are described partially by several assay screens based on diverse profiles and partially by their chemical structural fingerprints. A common classification problem is to determine whether the molecule is associated with a particular disease. In this paper, a co-training algorithm is developed to utilize data from diverse sources to predict the common class variable. Novel enhancements for variable importance, robustness to a mislabeled class variable, and a technique to handle unbalanced classes are applied to the motivating data set, highlighting that the approach attains strong performance and provides useful diagnostics for data analytic purposes. In addition, comparisons to a framework with data fusion using partial least squares (PLS) are also assessed on real data. An R package for performing the proposed approach is provided as Supporting information. Copyright © 2003 John Wiley & Sons, Ltd. [en_US]
dc.format.extent: 38428 bytes
dc.format.extent: 3118 bytes
dc.format.mimetype: application/pdf
dc.format.mimetype: text/plain
dc.publisher: John Wiley & Sons, Ltd. [en_US]
dc.subject.other: Chemistry [en_US]
dc.subject.other: Analytical Chemistry and Spectroscopy [en_US]
dc.title: A co-training algorithm for multi-view data with applications in data fusion [en_US]
dc.type: Article [en_US]
dc.rights.robots: IndexNoFollow [en_US]
dc.subject.hlbsecondlevel: Chemical Engineering [en_US]
dc.subject.hlbsecondlevel: Chemistry [en_US]
dc.subject.hlbsecondlevel: Materials Science and Engineering [en_US]
dc.subject.hlbtoplevel: Engineering [en_US]
dc.subject.hlbtoplevel: Science [en_US]
dc.description.peerreviewed: Peer Reviewed [en_US]
dc.contributor.affiliationum: Department of Statistics, University of Michigan, Ann Arbor, MI, USA [en_US]
dc.contributor.affiliationother: Department of Statistics, West Virginia University, Morgantown, WV, USA ; Department of Statistics, West Virginia University, Morgantown, WV, USA. [en_US]
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/63041/1/cem_1233_sm_suppmaterial.pdf
dc.identifier.doi: 10.1002/cem.1233 [en_US]
dc.identifier.source: Journal of Chemometrics [en_US]
dc.owningcollname: Interdisciplinary and Peer-Reviewed


