Detecting Informal Data References in Academic Literature
dc.contributor.author | Lafia, Sara | |
dc.contributor.author | Ko, Jeong-Woo | |
dc.contributor.author | Moss, Elizabeth | |
dc.contributor.author | Kim, Jinseok | |
dc.contributor.author | Thomer, Andrea | |
dc.contributor.author | Hemphill, Libby | |
dc.date.accessioned | 2021-07-22T21:09:38Z | |
dc.date.available | 2021-07-22T21:09:38Z | |
dc.date.issued | 2021-07-22 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/168392 | en |
dc.description.abstract | The Inter-university Consortium for Political and Social Research (ICPSR) is developing a machine learning approach using natural language processing (NLP) to assist in the detection of informal data references. Formal data citations that reference unique identifiers are readily discoverable; however, informal references indicating research data reuse are challenging to infer and detect. We contribute a model that uses a combination of cues, such as the presence of indicator terms and syntactical patterns, to assign a likelihood score to dataset mentions and extract candidate data citations from academic text. In production, the model will support the evaluation of candidate documents for ingest into the ICPSR Bibliography of Data-related Literature. This work supports a larger effort to measure the impact of research data. | en_US |
dc.language.iso | en_US | en_US |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | data citation | en_US |
dc.subject | data reference | en_US |
dc.subject | machine learning | en_US |
dc.subject | research data metrics | en_US |
dc.title | Detecting Informal Data References in Academic Literature | en_US |
dc.type | Preprint | en_US |
dc.subject.hlbsecondlevel | Social Sciences (General) | |
dc.subject.hlbtoplevel | Social Sciences | |
dc.contributor.affiliationum | Inter-university Consortium for Political and Social Research (ICPSR) | en_US |
dc.contributor.affiliationum | Institute for Social Research (ISR) | en_US |
dc.contributor.affiliationum | School of Information (UMSI) | en_US |
dc.contributor.affiliationumcampus | Ann Arbor | en_US |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/168392/1/Detecting_Informal_Data_Refs.pdf | |
dc.identifier.doi | https://dx.doi.org/10.7302/1671 | |
dc.identifier.doi | https://doi.org/10.1002/asi.24646 | en_US |
dc.identifier.orcid | 0000-0002-5896-7295 | en_US |
dc.identifier.orcid | 0000-0001-5464-8716 | en_US |
dc.identifier.orcid | 0000-0001-6481-2065 | en_US |
dc.identifier.orcid | 0000-0001-6238-3498 | en_US |
dc.identifier.orcid | 0000-0002-3793-7281 | en_US |
dc.description.filedescription | Description of Detecting_Informal_Data_Refs.pdf : Preprint | |
dc.description.depositor | SELF | en_US |
dc.identifier.name-orcid | Lafia, Sara; 0000-0002-5896-7295 | en_US |
dc.identifier.name-orcid | Moss, Elizabeth; 0000-0001-5464-8716 | en_US |
dc.identifier.name-orcid | Kim, Jinseok; 0000-0001-6481-2065 | en_US |
dc.identifier.name-orcid | Thomer, Andrea; 0000-0001-6238-3498 | en_US |
dc.identifier.name-orcid | Hemphill, Libby; 0000-0002-3793-7281 | en_US |
dc.working.doi | 10.7302/1671 | en_US |
dc.owningcollname | Institute for Social Research (ISR) |