Human-Centered Natural Language Processing for Countering Misinformation
dc.contributor.author | Kazemi, Ashkan | |
dc.date.accessioned | 2024-05-22T17:22:55Z | |
dc.date.available | 2024-05-22T17:22:55Z | |
dc.date.issued | 2024 | |
dc.date.submitted | 2024 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/193271 | |
dc.description.abstract | As curbing the spread of online misinformation has proven challenging, we look to artificial intelligence (AI) and natural language technology to help individuals and society counter and limit it. Despite recent advances, state-of-the-art natural language processing (NLP) and AI still struggle to automatically identify and understand misinformation. People exposed to harmful content may experience lasting negative consequences in real life, and once wrong beliefs are formed, they are often difficult to change. Addressing these interwoven technical and social challenges requires research into the core mechanisms that drive the phenomenon of misinformation. This thesis introduces human-centered NLP tasks and methods that prioritize human welfare in countering misinformation. We present findings on how people of different backgrounds perceive misinformation differently, and on how misinformation unfolds under different conditions, such as on end-to-end encrypted social media in India. We build on this understanding to create models and datasets for identifying misinformation at scale that put humans in the decision-making seat: claim matching, which matches claims with fact-check reports, and query rewriting, both of which scale the efforts of fact-checkers. Our work highlights the global impact of misinformation and contributes to advancing the equitability of available language technologies through models and datasets spanning a variety of high- and low-resource languages.
We also make fundamental contributions to data, algorithms, and models: multilingual and low-resource embeddings and retrieval for better claim matching; reinforcement learning for reformulating queries for better misinformation discovery; unsupervised, graph-based focused content extraction through the Biased TextRank algorithm; and explanation generation through extractive (Biased TextRank) and abstractive (GPT-2) summarization. Through this thesis, we aim to promote individual and social well-being by creating language technologies built on a deeper understanding of misinformation, and to provide tools that help journalists as well as internet users identify and navigate around it. | |
dc.language.iso | en_US | |
dc.subject | natural language processing | |
dc.subject | misinformation | |
dc.subject | Human-Centered NLP | |
dc.title | Human-Centered Natural Language Processing for Countering Misinformation | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | |
dc.description.thesisdegreediscipline | Computer Science & Engineering | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Mihalcea, Rada | |
dc.contributor.committeemember | Perez-Rosas, Veronica | |
dc.contributor.committeemember | Budak, Ceren | |
dc.contributor.committeemember | Hale, Scott | |
dc.contributor.committeemember | Wang, Lu | |
dc.subject.hlbsecondlevel | Computer Science | |
dc.subject.hlbtoplevel | Engineering | |
dc.contributor.affiliationumcampus | Ann Arbor | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/193271/1/ashkank_1.pdf | |
dc.identifier.doi | https://dx.doi.org/10.7302/22916 | |
dc.identifier.orcid | 0000-0002-2475-1007 | |
dc.identifier.name-orcid | Kazemi, Ashkan; 0000-0002-2475-1007 | en_US |
dc.working.doi | 10.7302/22916 | en |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |