Doc2Vec on similar document suggestion for pharmaceutical collections
dc.contributor.author | Zhu, Hongting | |
dc.contributor.author | Pothukuchi, Ashwin | |
dc.contributor.author | Guo, Joel | |
dc.date.accessioned | 2021-04-29T19:13:43Z | |
dc.date.available | 2021-04-29T19:13:43Z | |
dc.date.issued | 2020 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/167258 | |
dc.identifier.uri | https://youtu.be/Q5ARFlxFZNI | |
dc.description.abstract | ProQuest Dialog is a powerful search engine for pharmaceutical and biomedical papers, but its document retrieval algorithm is becoming outdated. In this paper, we present a way to improve the similar-document suggestions in the Dialog interface. The Doc2Vec PV-DBOW NLP model embeds documents so that similar ones cluster together, and both evaluation methods score it higher than the baseline TF-IDF method: textual coherence is 36.6% higher on bigram count vectors and 8.3% higher on trigram count vectors, and grant-to-article linkage is 6.1% higher on the Herfindahl-Hirschman index. | |
dc.subject | Machine Learning | |
dc.subject | Natural Language Processing | |
dc.subject | Clinical NLP | |
dc.title | Doc2Vec on similar document suggestion for pharmaceutical collections | |
dc.type | Technical Report | |
dc.subject.hlbtoplevel | Engineering | |
dc.contributor.affiliationum | Electrical Engineering and Computer Science | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/167258/1/Capstone_Final_Report_Hongting_Zhu.pdf | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/167258/2/Capstone_Presentation_Hongting_Zhu.pdf | |
dc.identifier.doi | https://dx.doi.org/10.7302/933 | |
dc.working.doi | 10.7302/933 | en |
dc.owningcollname | Honors Program, The College of Engineering |
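The abstract reports grant-to-article linkage on the Herfindahl-Hirschman index, which in its general form is the sum of squared category shares. The following is a minimal sketch of that general formula only; this record does not say how the report applies it, so the function name, the grant labels, and the idea of scoring one cluster of suggested documents by grant ID are all assumptions made for illustration.

```python
from collections import Counter


def herfindahl_hirschman_index(labels):
    """Return the sum of squared category shares for a list of labels.

    The result is 1.0 when every label is the same and approaches
    1 / (number of categories) when labels are spread evenly.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels must not be empty")
    return sum((n / total) ** 2 for n in counts.values())


# Hypothetical cluster of five suggested documents, labeled by grant ID.
cluster = ["grant_A", "grant_A", "grant_A", "grant_B", "grant_C"]
hhi = herfindahl_hirschman_index(cluster)
print(round(hhi, 4))
```

Under this assumed setup, a higher value means the documents in a cluster concentrate on fewer grants, which is one plausible reading of stronger grant-to-article linkage.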