Neural Language Generation for Content Adaptation: Explainable, Efficient Low-Resource Text Simplification and Evaluation
dc.contributor.author | Garbacea, Georgeta-Cristina | |
dc.date.accessioned | 2023-09-22T15:38:35Z | |
dc.date.available | 2023-09-22T15:38:35Z | |
dc.date.issued | 2023 | |
dc.date.submitted | 2023 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/178028 | |
dc.description.abstract | There are rich opportunities to reduce the language complexity of professional content (either human-written or computer-generated) and make it accessible to a broad audience. As a sub-task of natural language generation (NLG), text simplification has considerable potential to improve the fairness and transparency of text information systems. Recent approaches to text simplification usually complete the task in an end-to-end fashion, employing neural machine translation models in a monolingual setting regardless of the type of simplifications to be done. These models are limited on the one hand due to the absence of large-scale parallel (complex → simple) monolingual training data, and on the other hand due to the lack of interpretability of their black-box procedures. Furthermore, despite fast development of algorithms, there is an urgency to fill the huge gap in evaluating NLG systems in general (including text simplification systems). Indeed, given no clear model of text quality and no agreed objective criterion for comparing the “goodness of texts”, the evaluation of NLG systems is inherently difficult. The present work addresses these problems: i) sample-efficient approaches to NLG that improve the fairness and transparency of text information systems by adapting their content to the literacy level of the target audience, ii) systematic analysis of evaluation metrics for NLG models informed by theory and empirical evidence. In particular, we show that text simplification can be decomposed into a compact pipeline of tasks to ensure the transparency and explainability of the process; low-resource text simplification can be framed from a task and domain adaptation perspective which can be decomposed into multiple adaptation steps via meta-learning and transfer learning; and evaluators for NLG can be evaluated at scale and compared with human judgements. 
Beyond the problem of low-resource text simplification, the methodology proposed in this dissertation (explainable decomposition, chain of adaptations to new tasks and domains, and meta-evaluation) may benefit other research areas related to generative artificial intelligence (AI). | |
dc.language.iso | en_US | |
dc.subject | Neural Language Generation | |
dc.subject | Low-Resource Text Simplification | |
dc.subject | Content Adaptation | |
dc.subject | Explainable Prediction of Text Complexity | |
dc.subject | Natural Language Evaluation | |
dc.subject | Artificial Intelligence | |
dc.title | Neural Language Generation for Content Adaptation: Explainable, Efficient Low-Resource Text Simplification and Evaluation | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | en_US |
dc.description.thesisdegreediscipline | Computer Science & Engineering | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Mei, Qiaozhu | |
dc.contributor.committeemember | Collins-Thompson, Kevyn | |
dc.contributor.committeemember | Chai, Joyce | |
dc.contributor.committeemember | Mower Provost, Emily | |
dc.contributor.committeemember | Wang, Lu | |
dc.subject.hlbsecondlevel | Computer Science | |
dc.subject.hlbtoplevel | Engineering | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/178028/1/garbacea_1.pdf | |
dc.identifier.doi | https://dx.doi.org/10.7302/8485 | |
dc.identifier.orcid | 0000-0001-5340-594X | |
dc.identifier.name-orcid | Garbacea, Georgeta-Cristina; 0000-0001-5340-594X | en_US |
dc.working.doi | 10.7302/8485 | en |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |