Enhanced word embeddings using multi-semantic representation through lexical chains

Ruas, Terry; Ferreira, Charles Henrique Porto; Grosky, William; Olivetti de França, Fabrício; Rossi de Medeiros, Débora Maria

dc.contributor.author	Ruas, Terry
dc.contributor.author	Ferreira, Charles Henrique Porto
dc.contributor.author	Grosky, William
dc.contributor.author	Olivetti de França, Fabrício
dc.contributor.author	Rossi de Medeiros, Débora Maria
dc.date.accessioned	2020-05-13T17:36:38Z
dc.date.available	2020-05-13T17:36:38Z
dc.date.issued	2020-09
dc.identifier.citation	Terry Ruas, Charles Henrique Porto Ferreira, William Grosky, Fabrício Olivetti de França, Débora Maria Rossi de Medeiros, "Enhanced word embeddings using multi-semantic representation through lexical chains," Information Sciences, Volume 532, 2020, Pages 16-32, https://doi.org/10.1016/j.ins.2020.04.048	en_US
dc.identifier.uri	https://hdl.handle.net/2027.42/155353
dc.description.abstract	The relationships between words in a sentence often tell us more about the underlying semantic content of a document than the individual words themselves. In this work, we propose two novel algorithms, called Flexible Lexical Chain II and Fixed Lexical Chain II. These algorithms combine the semantic relations derived from lexical chains, prior knowledge from lexical databases, and the robustness of the distributional hypothesis in word embeddings as building blocks forming a single system. In short, our approach has three main contributions: (i) a set of techniques that fully integrate word embeddings and lexical chains; (ii) a more robust semantic representation that considers the latent relations between words in a document; and (iii) lightweight word embedding models that can be extended to any natural language task. We assess the knowledge of pre-trained models to evaluate their robustness in the document classification task. The proposed techniques are tested against seven word embedding algorithms using five different machine learning classifiers over six scenarios in the document classification task. Our results show that the integration between lexical chains and word embedding representations sustains state-of-the-art results, even against more complex systems.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Elsevier	en_US
dc.subject	Lexical chains	en_US
dc.subject	Natural language processing	en_US
dc.subject	Word embeddings	en_US
dc.subject	Document classification	en_US
dc.subject	Synsets	en_US
dc.title	Enhanced word embeddings using multi-semantic representation through lexical chains	en_US
dc.type	Article	en_US
dc.subject.hlbsecondlevel	Computer Science
dc.subject.hlbtoplevel	Engineering
dc.description.peerreviewed	Peer Reviewed	en_US
dc.contributor.affiliationum	Computer and Information Science, Department of (UM-Dearborn)	en_US
dc.contributor.affiliationother	Federal University of ABC, Brazil	en_US
dc.contributor.affiliationother	University of Wuppertal	en_US
dc.contributor.affiliationumcampus	Dearborn	en_US
dc.description.bitstreamurl	https://deepblue.lib.umich.edu/bitstream/2027.42/155353/1/Ruas_EtAl_Enhanced word embeddings_preprint_2020.pdf
dc.identifier.doi	https://doi.org/10.1016/j.ins.2020.04.048
dc.identifier.source	Information Sciences	en_US
dc.description.filedescription	Description of Ruas_EtAl_Enhanced word embeddings_preprint_2020.pdf : preprint of article published in the journal Information Sciences
dc.owningcollname	Computer and Information Science, Department of (UM-Dearborn)

Files in this item

Name: Ruas_EtAl_Enhanced word embedd ...
Size: 505.0KB
Format: PDF
Description: preprint of article published ...
