An active learning-enabled annotation system for clinical named entity recognition

Chen, Yukun; Lasko, Thomas A; Mei, Qiaozhu; Chen, Qingxia; Moon, Sungrim; Wang, Jingqi; Nguyen, Ky; Dawodu, Tolulola; Cohen, Trevor; Denny, Joshua C; Xu, Hua

dc.contributor.author	Chen, Yukun
dc.contributor.author	Lasko, Thomas A
dc.contributor.author	Mei, Qiaozhu
dc.contributor.author	Chen, Qingxia
dc.contributor.author	Moon, Sungrim
dc.contributor.author	Wang, Jingqi
dc.contributor.author	Nguyen, Ky
dc.contributor.author	Dawodu, Tolulola
dc.contributor.author	Cohen, Trevor
dc.contributor.author	Denny, Joshua C
dc.contributor.author	Xu, Hua
dc.date.accessioned	2017-07-09T03:18:52Z
dc.date.available	2017-07-09T03:18:52Z
dc.date.issued	2017-07-05
dc.identifier.citation	BMC Medical Informatics and Decision Making. 2017 Jul 05;17(Suppl 2):82
dc.identifier.uri	http://dx.doi.org/10.1186/s12911-017-0466-9
dc.identifier.uri	https://hdl.handle.net/2027.42/137676
dc.description.abstract	Background: Active learning (AL) has shown promising potential to minimize annotation cost while maximizing performance in building statistical natural language processing (NLP) models. However, very few studies have investigated AL in a real-life setting in the medical domain. Methods: In this study, we developed the first AL-enabled annotation system for clinical named entity recognition (NER) with a novel AL algorithm. In addition to a simulation study to evaluate the novel AL algorithm, we conducted user studies in which two nurses used this system, to assess the performance of AL in real-world annotation processes for building clinical NER models. Results: The simulation results show that the novel AL algorithm outperformed the traditional AL algorithm and random sampling. However, the user study tells a different story: AL methods did not always perform better than random sampling for different users. Conclusions: We found that the increased information content of actively selected sentences is strongly offset by the increased time required to annotate them. Moreover, annotation time was not considered in the querying algorithms. Our future work includes developing better AL algorithms that incorporate estimates of annotation time and evaluating the system with a larger number of users.
dc.title	An active learning-enabled annotation system for clinical named entity recognition
dc.type	Article	en_US
dc.description.bitstreamurl	https://deepblue.lib.umich.edu/bitstream/2027.42/137676/1/12911_2017_Article_466.pdf
dc.language.rfc3066	en
dc.rights.holder	The Author(s).
dc.date.updated	2017-07-09T03:18:57Z
dc.owningcollname	Interdisciplinary and Peer-Reviewed

Files in this item

Name: 12911_2017_Article_466.pdf
Size: 969.6 KB
Format: PDF