Work Description

Title: Images of pills inside medication bottles dataset (Open Access, Deposited)

Attribute Value
Methodology
  • This dataset includes images of pills inside medication bottles. The images were taken by a pharmacy robot that counts pills into the bottle and then photographs the pills inside the bottle from a top-down view. The data were obtained from a mail order pharmacy in the United States. Each image is labeled with an ID number and is associated with the national drug code (NDC) of the medication inside the bottle. The NDC identifies the medication product on the basis of ingredient, strength, dose form, and manufacturer. The dataset was split into training/validation/testing subsets for each NDC using a ratio of 6:2:2. Each sub-folder is labeled with an NDC, and the images inside that sub-folder correspond to that NDC.
Description
  • The dataset contains top-down images of pills inside medication bottles. It was used to build an image classification model that predicts the national drug code (NDC) of the medication seen in the image. There are 13,955 images covering 20 distinct NDCs.
Creator
Depositor
  • lesterca@umich.edu
Contact information
Discipline
Funding agency
  • National Institutes of Health (NIH)
ORSP grant number
  • 20-PAF07658
Keyword
Resource type
Last modified
  • 11/25/2022
Published
  • 07/07/2022
DOI
  • https://doi.org/10.7302/zacw-p603
License
To Cite this Work:
Lester, C. A., Al Kontar, R., Chen, Q. (2022). Images of pills inside medication bottles dataset [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/zacw-p603

Relationships

This work is not a member of any user collections.

Files (Count: 2; Size: 1.15 GB)

Images of pills inside medication bottles dataset Readme

Research Overview
The dataset contains top-down images of pills inside medication bottles. It was used to build an image classification model that predicts the national drug code (NDC) of the medication seen in the image. There are 13,955 images covering 20 distinct NDCs. This dataset was used in grant R01LM013624 from the National Library of Medicine at the National Institutes of Health. The grant was awarded to Corey Lester, Raed Al Kontar, and Jessie Xi Yang at the University of Michigan.

Methodology
This dataset includes images of pills inside medication bottles. The images were taken by a pharmacy robot that counts pills into the bottle and then photographs the pills inside the bottle from a top-down view. The data were obtained from a mail order pharmacy in the United States. Each image is labeled with an ID number and is associated with the national drug code (NDC) of the medication inside the bottle. The NDC identifies the medication product on the basis of ingredient, strength, dose form, and manufacturer. The dataset was split into training/validation/testing subsets for each NDC using a ratio of 6:2:2. Each sub-folder is labeled with an NDC, and the images inside that sub-folder correspond to that NDC.
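
The per-NDC 6:2:2 split described above can be illustrated with a short Python sketch. This is not the authors' code: the function name `split_per_ndc`, the random shuffle, the seed, and the rounding rule are all assumptions made for illustration. The README states only the ratio and that the split was made within each NDC.

```python
import random
from collections import defaultdict


def split_per_ndc(labeled_ids, ratios=(0.6, 0.2, 0.2), seed=0):
    """Split (image_id, ndc) pairs into train/valid/test within each NDC.

    Hypothetical helper: shuffling, seeding, and rounding are assumptions,
    not details taken from the dataset documentation.
    """
    # Group image IDs by their NDC so each NDC is split independently.
    by_ndc = defaultdict(list)
    for image_id, ndc in labeled_ids:
        by_ndc[ndc].append(image_id)

    rng = random.Random(seed)
    splits = {"train": [], "valid": [], "test": []}
    for ndc, ids in sorted(by_ndc.items()):
        ids = sorted(ids)  # fixed starting order so the seed is reproducible
        rng.shuffle(ids)
        n_train = round(len(ids) * ratios[0])
        n_valid = round(len(ids) * ratios[1])
        splits["train"] += [(i, ndc) for i in ids[:n_train]]
        splits["valid"] += [(i, ndc) for i in ids[n_train:n_train + n_valid]]
        # Whatever remains after train and valid goes to test.
        splits["test"] += [(i, ndc) for i in ids[n_train + n_valid:]]
    return splits
```

Splitting within each NDC, rather than over the pooled images, keeps every NDC represented in all three subsets at roughly the same proportion.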

File Inventory
The image dataset contains 3 main folders: 1. Train, 2. Test, and 3. Valid. Each of these folders contains 20 sub-folders. Each sub-folder is labeled with the NDC to which the images inside it correspond. The NDCs are the same in each main folder. Within each NDC folder, the .jpg images of pills for that NDC use the image ID as the file name. The following shows the folder hierarchy:
* Main folder (i.e., train, test, valid)
  * NDC (e.g., 00378-0208)
    * Image ID (e.g., 2082.jpg)
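
A minimal Python sketch of how this hierarchy could be traversed to count images per main folder and NDC. The function name `count_images` and the root path passed to it are hypothetical. The sketch matches the main folder names case-insensitively because the README writes them both capitalized and in lowercase, and the exact spelling on disk is not stated.

```python
from collections import Counter
from pathlib import Path

# Main folder names from the README, compared in lowercase (an assumption).
SPLITS = ("train", "test", "valid")


def count_images(root):
    """Return a Counter mapping (split, ndc) to the number of .jpg files."""
    counts = Counter()
    for split_dir in sorted(Path(root).iterdir()):
        split = split_dir.name.lower()
        if not split_dir.is_dir() or split not in SPLITS:
            continue
        # Each sub-folder of a main folder is named after one NDC.
        for ndc_dir in sorted(split_dir.iterdir()):
            if not ndc_dir.is_dir():
                continue
            n_jpg = sum(
                1
                for f in ndc_dir.iterdir()
                if f.is_file() and f.suffix.lower() == ".jpg"
            )
            counts[(split, ndc_dir.name)] = n_jpg
    return counts
```

If the download is complete, the counts summed over all keys should match the 13,955 images reported in the Research Overview, and each main folder should list the same 20 NDCs.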

Definition of Terms and Variables
* Train: the folder containing the pill images used to train the model.
* Test: the folder containing the pill images used to test the model.
* Valid: the folder containing the pill images used to validate the model.
* National Drug Code (NDC): a unique identifier for medication ingredient, strength, dose form, and manufacturer. Each NDC folder contains the pill images corresponding to the respective NDC.

