Semantic Feature Extraction Using Multi-Sense Embeddings and Lexical Chains

Ruas, Terry L.

dc.contributor.author	Ruas, Terry L.
dc.contributor.advisor	Grosky, William
dc.date.accessioned	2019-06-26T14:13:22Z
dc.date.available	NO_RESTRICTION	en_US
dc.date.available	2019-06-26T14:13:22Z
dc.date.issued	2019-08-23
dc.date.submitted	2019-06-13
dc.identifier.uri	https://hdl.handle.net/2027.42/149647
dc.description.abstract	The relationships between words in a sentence often tell us more about the underlying semantic content of a document than its individual words do. In the last few years, natural language understanding has seen an increasing effort to develop techniques that produce non-trivial features, especially after robust word embedding models became prominent by proving able to capture and represent semantic relationships from massive amounts of data. These new dense vector representations indeed raise the baseline in natural language processing, but they still fall short in dealing with intrinsic linguistic issues such as polysemy and homonymy. Systems that use natural language at their core can be affected by a weak semantic representation of human language, resulting in inaccurate outcomes based on poor decisions. In this context, word sense disambiguation and lexical chains have been explored as alternatives to alleviate several problems in linguistics, such as semantic representation, definitions, differentiation, polysemy, and homonymy. However, little effort has been made to combine recent advances in token embeddings (e.g., words, documents) with word sense disambiguation and lexical chains. To help bridge these areas, this work proposes, as its main contribution, a collection of algorithms to extract semantic features from large corpora: MSSA, MSSA-D, MSSA-NR, FLLC II, and FXLC II. The MSSA techniques focus on disambiguating and annotating each word by its specific sense, considering the semantic effects of its context. The lexical chain algorithms derive the semantic relations between consecutive words in a document in a dynamic and pre-defined manner.
These original techniques aim to uncover the implicit semantic links between words using their lexical structure, incorporating multi-sense embeddings, word sense disambiguation, lexical chains, and lexical databases. A few natural language problems are selected to validate the contributions of this work, and on these our techniques outperform state-of-the-art systems. All the proposed algorithms can be used separately as independent components or combined into a single system to improve the semantic representation of words, sentences, and documents. They can also be applied recurrently to refine their results further.	en_US
dc.language.iso	en_US	en_US
dc.subject	Synsets	en_US
dc.subject	WordNet	en_US
dc.subject	MSSA	en_US
dc.subject	Natural language processing	en_US
dc.subject	Semantics	en_US
dc.subject	Lexical chains	en_US
dc.subject.other	Computer and Information Science	en_US
dc.title	Semantic Feature Extraction Using Multi-Sense Embeddings and Lexical Chains	en_US
dc.type	Thesis
dc.description.thesisdegreename	PhD	en_US
dc.description.thesisdegreediscipline	College of Engineering & Computer Science	en_US
dc.description.thesisdegreegrantor	University of Michigan-Dearborn	en_US
dc.contributor.committeemember	Abouelenien, Mohamed
dc.contributor.committeemember	Agrawal, Rajeev
dc.contributor.committeemember	Kessentini, Marouane
dc.contributor.committeemember	Ortiz, Luis
dc.contributor.committeemember	Zakarian, Armen
dc.identifier.uniqname	7512 2669	en_US
dc.description.bitstreamurl	https://deepblue.lib.umich.edu/bitstream/2027.42/149647/1/Terry Ruas Final Dissertation.pdf
dc.identifier.orcid	0000-0002-9440-780X	en_US
dc.description.filedescription	Description of Terry Ruas Final Dissertation.pdf : Dissertation
dc.identifier.name-orcid	Ruas, Terry; 0000-0002-9440-780X	en_US
dc.owningcollname	Dissertations and Theses (Ph.D. and Master's)

Files in this item

Name: Terry Ruas Final Dissertation.pdf
Size: 2.691MB
Format: PDF
Description: Dissertation
