Work Description

Title: Supplementary Materials for "Crowdsourced Detection of Emotionally Manipulative Language"
Open Access, Deposited

Attribute Value
Methodology
  • Please see our full article and the attached readme for details about the methodology used to create this dataset.
Description
  • These files are supplementary materials for our CHI 2020 paper "Crowdsourced Detection of Emotionally Manipulative Language". Specifically, they include the dataset used in the evaluation. See the paper for more details.
Creator
Depositor
  • jhuffak@umich.edu
Contact information
Discipline
Funding agency
  • National Aeronautics and Space Administration (NASA)
Keyword
Citations to related material
  • J.S. Huffaker, J.K. Kummerfeld, W.S. Lasecki, M.S. Ackerman. Crowdsourced Detection of Emotionally Manipulative Language. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2020). Honolulu, HI. 2020.
Resource type
Last modified
  • 11/18/2022
Published
  • 03/25/2020
Language
DOI
  • https://doi.org/10.7302/yhpy-e679
License
To Cite this Work:
Huffaker, J. S., Kummerfeld, J. K., Lasecki, W. S., Ackerman, M. S. (2020). Supplementary Materials for "Crowdsourced Detection of Emotionally Manipulative Language" [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/yhpy-e679


Files (Count: 3; Size: 83.2 KB)

The following dataset is provided in conjunction with the paper "Crowdsourced Detection of Emotionally Manipulative Language", published in the Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2020). The purpose of the dataset is to provide clear examples of emotionally manipulative language (EML) within text while controlling for a variety of factors, such as whether the topic of the text is intrinsically emotional content (IEC). We intend this dataset to be used to benchmark the performance of systems that detect EML within text; however, we expect that it may be useful for other applications as well.

Detecting EML is non-trivial because it must be distinguished from IEC. One example of EML is a segment in which the Fox News pundit Tucker Carlson plays to people's xenophobia by calling Congresswoman Ilhan Omar a "living fire alarm. A warning to the rest of us that we better change our immigration system immediately. Or else." IEC, by contrast, is content that is inherently emotional no matter how it is conveyed, such as an immigrant describing the challenges she experienced crossing the U.S. southern border.

Methodology

To facilitate the development of EML detectors, we created a dataset of twenty text snippets adapted from news articles. We systematically modified each text snippet to create one version with heavy EML and one with very little EML, while keeping the information the same across both versions. We selected news articles so that half of them include IEC, balancing the dataset across four conditions (with EML and IEC, only EML, only IEC, and no EML or IEC). We hired a journalist and a member of the editorial staff of a nationally prominent news magazine to verify the two dimensions of our dataset and to verify that some of the snippets would be publishable in a reputable news source. Comparing the ratings of the two experts, we found that they tended to agree, achieving high interrater reliability scores and supporting the validity of the dataset.
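As an illustration of how agreement between two raters can be quantified, the sketch below computes Cohen's kappa for a pair of label lists. This is only an assumption for illustration: this description does not state which reliability statistic was used (see the paper), and the function name `cohens_kappa` and the example ratings are hypothetical, not taken from the dataset.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: fraction of items on which the raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical EML judgments (True = heavy EML) for eight snippet versions.
# These values are made up for the example; they are not the experts' ratings.
expert_1 = [True, True, False, False, True, False, True, False]
expert_2 = [True, True, False, False, True, False, False, False]
kappa = cohens_kappa(expert_1, expert_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is often preferred over a simple percentage when there are only two labels.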

Warning

The dataset covers a variety of topics that are emotionally loaded and that some readers may find difficult to read. We suggest that readers find a comfortable location to open the documents (i.e., not in a public space) and take a moment to prepare themselves emotionally.

