Detecting Machine-obfuscated Plagiarism

Foltynek, Tomas; Ruas, Terry; Scharpf, Philipp; Meuschke, Norman; Schubotz, Moritz; Grosky, William; Gipp, Bela

dc.contributor.author	Foltynek, Tomas
dc.contributor.author	Ruas, Terry
dc.contributor.author	Scharpf, Philipp
dc.contributor.author	Meuschke, Norman
dc.contributor.author	Schubotz, Moritz
dc.contributor.author	Grosky, William
dc.contributor.author	Gipp, Bela
dc.date.accessioned	2019-12-13T13:52:54Z
dc.date.available	2019-12-13T13:52:54Z
dc.date.issued	2019-12-13
dc.identifier.uri	https://hdl.handle.net/2027.42/152346
dc.description	Related dataset is at https://doi.org/10.7302/bewj-qx93 and also listed in the dc.relation field of the full item record.
dc.description.abstract	Research on academic integrity has identified online paraphrasing tools as a severe threat to the effectiveness of plagiarism detection systems. To enable the automated identification of machine-paraphrased text, we make three contributions. First, we evaluate the effectiveness of six prominent word embedding models in combination with five classifiers for distinguishing human-written from machine-paraphrased text. The best performing classification approach achieves an accuracy of 99.0% for documents and 83.4% for paragraphs. Second, we show that the best approach outperforms human experts and established plagiarism detection systems for these classification tasks. Third, we provide a Web application that uses the best performing classification approach to indicate whether a text underwent machine-paraphrasing. The data and code of our study are openly available.	en_US
dc.language.iso	en_US	en_US
dc.relation	https://doi.org/10.7302/bewj-qx93
dc.title	Detecting Machine-obfuscated Plagiarism	en_US
dc.type	Conference Paper	en_US
dc.subject.hlbsecondlevel	Computer Science
dc.subject.hlbtoplevel	Engineering
dc.description.peerreviewed	Peer Reviewed	en_US
dc.contributor.affiliationum	University of Michigan Dearborn	en_US
dc.contributor.affiliationum	University of Michigan Dearborn	en_US
dc.contributor.affiliationother	University of Wuppertal, Mendel University in Brno	en_US
dc.contributor.affiliationother	University of Wuppertal	en_US
dc.contributor.affiliationother	University of Konstanz	en_US
dc.contributor.affiliationother	University of Wuppertal, University of Konstanz	en_US
dc.contributor.affiliationother	University of Wuppertal	en_US
dc.contributor.affiliationother	University of Wuppertal	en_US
dc.contributor.affiliationumcampus	Dearborn	en_US
dc.description.bitstreamurl	https://deepblue.lib.umich.edu/bitstream/2027.42/152346/1/Foltynek2020_Paraphrase_Detection.pdf
dc.identifier.orcid	0000-0001-8412-5553	en_US
dc.identifier.orcid	0000-0002-9440-780X	en_US
dc.identifier.orcid	0000-0002-4212-0508	en_US
dc.identifier.orcid	0000-0003-4648-8198	en_US
dc.identifier.orcid	0000-0001-7141-4997	en_US
dc.identifier.orcid	0000-0002-2775-2806	en_US
dc.identifier.orcid	0000-0001-6522-3019	en_US
dc.description.filedescription	Description of Foltynek2020_Paraphrase_Detection.pdf : Foltynek2020_Paraphrase_Detection
dc.identifier.name-orcid	Schubotz, Moritz; 0000-0001-7141-4997	en_US
dc.identifier.name-orcid	Foltýnek, Tomáš; 0000-0001-8412-5553	en_US
dc.identifier.name-orcid	Meuschke, Norman; 0000-0003-4648-8198	en_US
dc.identifier.name-orcid	Grosky, William; 0000-0002-2775-2806	en_US
dc.identifier.name-orcid	Ruas, Terry; 0000-0002-9440-780X	en_US
dc.identifier.name-orcid	Scharpf, Philipp; 0000-0002-4212-0508	en_US
dc.identifier.name-orcid	Gipp, Béla; 0000-0001-6522-3019	en_US
dc.owningcollname	Computer and Information Science, Department of (UM-Dearborn)

Files in this item

Name: Foltynek2020_Paraphrase_Detect ...
Size: 345.9KB
Format: PDF
Description: Foltynek2020_Paraphrase_Detection
