
Unsupervised Graph-Based Similarity Learning Using Heterogeneous Features.

dc.contributor.author [en_US]: Muthukrishnan, Pradeep
dc.date.accessioned: 2012-01-26T20:07:35Z
dc.date.available [en_US]: NO_RESTRICTION
dc.date.available: 2012-01-26T20:07:35Z
dc.date.issued [en_US]: 2011
dc.date.submitted [en_US]: 2011
dc.identifier.uri: https://hdl.handle.net/2027.42/89824
dc.description.abstract [en_US]: Relational data refers to data that contains explicit relations among objects. Relational data are now ubiquitous and appear in many different application domains. Estimating similarity between objects is a core requirement for many standard Machine Learning (ML), Natural Language Processing (NLP) and Information Retrieval (IR) problems such as clustering, classification, and word sense disambiguation. Traditional machine learning approaches represent the data using simple, concise representations such as feature vectors. While this works very well for homogeneous data, i.e., data with a single feature type such as text, it does not fully exploit the availability of different feature types. For example, scientific publications have text, citations, authorship information, and venue information, and each of these features can be used for estimating similarity. Representing such objects has been a key issue in efficient mining (Getoor and Taskar, 2007). In this thesis, we propose natural representations for relational data using multiple, connected layers of graphs, one for each feature type. We also propose novel algorithms for estimating similarity using multiple heterogeneous features, and we present novel algorithms for tasks like topic detection and music recommendation using the estimated similarity measure. On large, standard data sets, we demonstrate superior performance of the proposed algorithms (root mean squared error of 24.81 on the Yahoo! KDD Music recommendation data set and classification accuracy of 88% on the ACL Anthology Network data set) over baselines and over many state-of-the-art algorithms, such as Latent Semantic Analysis (LSA), Multiple Kernel Learning (MKL) and spectral clustering.
dc.language.iso [en_US]: en_US
dc.subject [en_US]: Unsupervised Algorithms
dc.subject [en_US]: Graph-based Learning
dc.subject [en_US]: Similarity Learning
dc.subject [en_US]: Machine Learning
dc.subject [en_US]: Multiple Heterogeneous Features
dc.title [en_US]: Unsupervised Graph-Based Similarity Learning Using Heterogeneous Features.
dc.type [en_US]: Thesis
dc.description.thesisdegreename [en_US]: PhD
dc.description.thesisdegreediscipline [en_US]: Computer Science & Engineering
dc.description.thesisdegreegrantor [en_US]: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember [en_US]: Radev, Dragomir Radkov
dc.contributor.committeemember [en_US]: Abney, Steven P.
dc.contributor.committeemember [en_US]: Lee, Honglak
dc.contributor.committeemember [en_US]: Mei, Qiaozhu
dc.contributor.committeemember [en_US]: Syed, Zeeshan
dc.subject.hlbsecondlevel [en_US]: Computer Science
dc.subject.hlbtoplevel [en_US]: Engineering
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/89824/1/mpradeep_1.pdf
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

