Detecting Machine-obfuscated Plagiarism
dc.contributor.author | Foltynek, Tomas | |
dc.contributor.author | Ruas, Terry | |
dc.contributor.author | Scharpf, Philipp | |
dc.contributor.author | Meuschke, Norman | |
dc.contributor.author | Schubotz, Moritz | |
dc.contributor.author | Grosky, William | |
dc.contributor.author | Gipp, Bela | |
dc.date.accessioned | 2019-12-13T13:52:54Z | |
dc.date.available | 2019-12-13T13:52:54Z | |
dc.date.issued | 2019-12-13 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/152346 | |
dc.description | Related dataset is at https://doi.org/10.7302/bewj-qx93 and also listed in the dc.relation field of the full item record. | |
dc.description.abstract | Research on academic integrity has identified online paraphrasing tools as a severe threat to the effectiveness of plagiarism detection systems. To enable the automated identification of machine-paraphrased text, we make three contributions. First, we evaluate the effectiveness of six prominent word embedding models in combination with five classifiers for distinguishing human-written from machine-paraphrased text. The best performing classification approach achieves an accuracy of 99.0% for documents and 83.4% for paragraphs. Second, we show that the best approach outperforms human experts and established plagiarism detection systems for these classification tasks. Third, we provide a Web application that uses the best performing classification approach to indicate whether a text underwent machine-paraphrasing. The data and code of our study are openly available. | en_US |
dc.language.iso | en_US | en_US |
dc.relation | https://doi.org/10.7302/bewj-qx93 | |
dc.title | Detecting Machine-obfuscated Plagiarism | en_US |
dc.type | Conference Paper | en_US |
dc.subject.hlbsecondlevel | Computer Science | |
dc.subject.hlbtoplevel | Engineering | |
dc.description.peerreviewed | Peer Reviewed | en_US |
dc.contributor.affiliationum | University of Michigan Dearborn | en_US |
dc.contributor.affiliationum | University of Michigan Dearborn | en_US |
dc.contributor.affiliationother | University of Wuppertal, Mendel University in Brno | en_US |
dc.contributor.affiliationother | University of Wuppertal | en_US |
dc.contributor.affiliationother | University of Konstanz | en_US |
dc.contributor.affiliationother | University of Wuppertal, University of Konstanz | en_US |
dc.contributor.affiliationother | University of Wuppertal | en_US |
dc.contributor.affiliationother | University of Wuppertal | en_US |
dc.contributor.affiliationumcampus | Dearborn | en_US |
dc.description.bitstreamurl | https://deepblue.lib.umich.edu/bitstream/2027.42/152346/1/Foltynek2020_Paraphrase_Detection.pdf | |
dc.identifier.orcid | 0000-0001-8412-5553 | en_US |
dc.identifier.orcid | 0000-0002-9440-780X | en_US |
dc.identifier.orcid | 0000-0002-4212-0508 | en_US |
dc.identifier.orcid | 0000-0003-4648-8198 | en_US |
dc.identifier.orcid | 0000-0001-7141-4997 | en_US |
dc.identifier.orcid | 0000-0002-2775-2806 | en_US |
dc.identifier.orcid | 0000-0001-6522-3019 | en_US |
dc.description.filedescription | Description of Foltynek2020_Paraphrase_Detection.pdf : Foltynek2020_Paraphrase_Detection | |
dc.identifier.name-orcid | Schubotz, Moritz; 0000-0001-7141-4997 | en_US |
dc.identifier.name-orcid | Foltýnek, Tomáš; 0000-0001-8412-5553 | en_US |
dc.identifier.name-orcid | Meuschke, Norman; 0000-0003-4648-8198 | en_US |
dc.identifier.name-orcid | Grosky, William; 0000-0002-2775-2806 | en_US |
dc.identifier.name-orcid | Ruas, Terry; 0000-0002-9440-780X | en_US |
dc.identifier.name-orcid | Scharpf, Philipp; 0000-0002-4212-0508 | en_US |
dc.identifier.name-orcid | Gipp, Béla; 0000-0001-6522-3019 | en_US |
dc.owningcollname | Computer and Information Science, Department of (UM-Dearborn) |