Work Description

Title: UWHandles Dataset

Methodology
  • We collected fisheye images of three different types of graspable handles, randomly arranged in different natural seafloor environments of the Costa Rican shelf break. Two of the handle types are actively used by Schmidt Ocean Institute and Woods Hole Oceanographic Institution to manipulate tools with Remotely Operated Vehicles (ROVs) during underwater operations. AprilTag fiducials were randomly dispersed throughout the scene and on mount plates attached to the base of the handle objects in order to provide ground truth poses of the handles in the image sequences. The camera system was a FLIR BFLY-PGE-50S5C-C with a Fujinon FE185C086HA-1 fisheye lens centered in a dome housing that was mounted on the wrist of a Schilling Titan 4 hydraulic manipulator.

    This dataset is relevant to automating underwater manipulation tasks; if the pose of a handle attached to a known tool type can be accurately estimated, the handle can be autonomously grasped and manipulated to perform a desired manipulation task. The dataset is composed of 25 training image sequences with a total of 18,329 images, 1 validation sequence with 910 images, and 2 testing sequences with 1,188 images.

    At various locations on the seafloor, the ROV was set down and the handle objects were randomly dispersed throughout the reachable area of the manipulator. Four metal tag plates with attached 4” AprilTag fiducial stickers were randomly scattered around the handle objects. Initially, we also attached 4” AprilTag stickers to the mount plates at the base of the handle objects, but through trial and error, we found that the most robust detection results were obtained with multiple 2” stickers attached to the object mount plates rather than a single 4” sticker. The smaller tags on the objects enabled better detection at close range, and the large tags on the scattered metal plates provided good detection at greater distances.

    Full 5MP resolution images were recorded at 3Hz, with the manipulator moving around the objects in various motion paths to obtain a diverse set of viewpoints. We used the ROS TagSLAM package to process the image sequences and obtain globally consistent camera poses for each image in the sequence. In order to make use of the full fisheye view, tags were detected in the raw fisheye images, and the detected tag poses were then calculated using a pinhole equidistant distortion model calibrated with the Kalibr ROS package. The pinhole model was adequate for this camera system because the effective usable field of view was less than 180°.

    We created an OpenGL-based annotation tool called VisPose, which takes in an image sequence with the associated camera pose file from the TagSLAM output. VisPose provides an interface to project models of the different objects into the image sequence, play through the sequence, and tweak the fit of the models to obtain accurate 6D pose annotations. Pose outliers can be filtered from the image sequence, and a COCO-style annotation file, including 6D pose and 2D bounding box annotations, can then be exported for the full image sequence.
Description
  • UWHandles is a dataset for 6D object pose estimation in underwater fisheye images. It provides 6D pose and 2D bounding box annotations for 3 different graspable handle objects used for ROV manipulation. The dataset consists of 28 image sequences collected in natural seafloor environments with a total of 20,427 annotated frames.

  • Meta repository for the dataset: https://github.com/gidobot/UWHandles
Creator
  • Billings, G.
  • Johnson-Roberson, M.
Depositor
  • gidobot@umich.edu
Contact information
  • Gideon Billings, gidobot@umich.edu
Discipline
Funding agency
  • National Aeronautics and Space Administration (NASA)
Keyword
Date coverage
  • 2019-01-01 to 2020-01-01
Citations to related material
  • Billings, G., & Johnson-Roberson, M. (2020). SilhoNet-fisheye: Adaptation of a ROI based object pose estimation network to monocular fisheye images. IEEE Robotics and Automation Letters, 5(3), 4241-4248.
Resource type
  • Dataset
Last modified
  • 03/27/2023
Published
  • 03/27/2023
Language
  • English
DOI
  • https://doi.org/10.7302/3vcq-wk15
License
  • Creative Commons Public Domain Dedication (CC0 1.0)
To Cite this Work:
Billings, G. H., & Johnson-Roberson, M. (2023). UWHandles Dataset [Data set]. University of Michigan - Deep Blue Data. https://doi.org/10.7302/3vcq-wk15


Files (Count: 2; Size: 213 GB)

Date: 23 March, 2023

Dataset Title: Underwater Handles Dataset (UWHandles_dataset)

Dataset Creators: G. Billings, M. Johnson-Roberson

Dataset Contact: Gideon Billings gidobot@umich.edu

Funding: NNX16AL08G (NASA)

Key Points
==========

- This is a dataset for developing and testing methods for 6D object pose estimation from underwater monocular images captured in natural deep sea environments.
- The dataset was collected with a monocular fisheye camera and provides annotations for raw fisheye images as well as center rectified perspective images projected from the raw images.
- The dataset provides 6D pose and 2D bounding box annotations for 3 different graspable handle objects, typical of the type used for tool manipulation with Remotely Operated Vehicles (ROVs).
- The dataset has 28 unique image sequences with a total of 20,427 annotated images.

Research Overview
=================

Computer vision in underwater environments is significantly more difficult than in terrestrial environments, due to the absorption and scattering of light in water. Induced lighting effects like haze, backscatter, caustics, and vignetting all compound to increase the noise and artifacts present in underwater imagery. Datasets are also difficult and expensive to collect in underwater environments, due to the specialized, sealed hardware needed to work in these environments. This dataset provides annotated imagery for object pose estimation in deep ocean environments, which are particularly difficult to reach. While this dataset is useful for developing general computer vision methods for object detection and pose estimation in the underwater environment, it is particularly relevant to automating underwater manipulation tasks; if the pose of a handle attached to a known tool type can be accurately estimated, the handle can be autonomously grasped and manipulated to perform a desired manipulation task.

Methodology
===========

We collected fisheye images of three different types of graspable handles, randomly arranged in different natural seafloor environments of the Costa Rican shelf break. Two of the handle types are actively used by Schmidt Ocean Institute and Woods Hole Oceanographic Institution to manipulate tools with Remotely Operated Vehicles (ROVs) during underwater operations. AprilTag fiducials were randomly dispersed throughout the scene and on mount plates attached to the base of the handle objects in order to recover ground truth poses of the camera in the image sequences. The image sequences were then post-processed with an annotation tool to obtain labeled 6D object poses and bounding boxes. The camera system was a FLIR BFLY-PGE-50S5C-C with a Fujinon FE185C086HA-1 fisheye lens centered in a dome housing that was mounted on the wrist of a Schilling Titan 4 hydraulic manipulator. The dataset is composed of 25 training image sequences with a total of 18,329 images, 1 validation sequence with 910 images, and 2 testing sequences with 1,188 images.
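
For concreteness, the pinhole equidistant (Kannala-Brandt) model referenced above maps a 3D point in the camera frame to a pixel via the angle it makes with the optical axis. Below is a minimal sketch of the forward projection in Python; the intrinsic and distortion values are illustrative placeholders, not the dataset's calibration (the calibrated values are in the yaml files under calibration/):

    import numpy as np

    def project_equidistant(point_cam, fx, fy, cx, cy, dist):
        # Forward projection under the pinhole equidistant (Kannala-Brandt)
        # fisheye model used by Kalibr's 'equidistant' distortion option.
        x, y, z = point_cam
        r = np.hypot(x, y)
        theta = np.arctan2(r, z)  # angle from the optical axis
        k1, k2, k3, k4 = dist
        theta_d = theta * (1 + k1*theta**2 + k2*theta**4
                             + k3*theta**6 + k4*theta**8)
        scale = theta_d / r if r > 1e-9 else 0.0  # on-axis point maps to (cx, cy)
        return fx * scale * x + cx, fy * scale * y + cy

    # Illustrative values only -- see calibration/fisheye_calib.yaml for
    # the real intrinsics of the 5MP fisheye camera.
    u, v = project_equidistant((0.10, -0.05, 0.80),
                               fx=600.0, fy=600.0, cx=1224.0, cy=1024.0,
                               dist=(-0.01, 0.002, -0.0005, 0.0001))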

Files Contained Here
====================

For each sequence, the images are numbered incrementally from 0 and stored as .png files. The fisheye camera was calibrated using the Kalibr toolbox with the equidistant distortion model, and the calibration yaml files for both the raw fisheye and the center rectified images are in the format output by Kalibr (https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration).
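
As a sketch of how these files can be consumed, the snippet below reads a Kalibr-style calibration and builds a center-rectified perspective view from a raw fisheye frame with OpenCV's fisheye module. The yaml key names (cam0, intrinsics, distortion_coeffs, resolution) assume Kalibr's camchain output convention and the file paths are illustrative; this is one plausible rectification recipe, not necessarily the exact pipeline used to generate the shipped rect/ images:

    import cv2
    import numpy as np
    import yaml

    # Load a Kalibr-style calibration; key names assume Kalibr's camchain
    # layout -- check the shipped yaml files for the exact structure.
    with open("calibration/fisheye_calib.yaml") as f:
        cam = yaml.safe_load(f)["cam0"]
    fu, fv, pu, pv = cam["intrinsics"]
    K = np.array([[fu, 0.0, pu], [0.0, fv, pv], [0.0, 0.0, 1.0]])
    D = np.asarray(cam["distortion_coeffs"], dtype=float)
    w, h = cam["resolution"]

    # Build remap tables for a center-rectified perspective view using
    # OpenCV's equidistant fisheye model (cv2.fisheye).
    P = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(
        K, D, (w, h), np.eye(3), balance=0.0)
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        K, D, np.eye(3), P, (w, h), cv2.CV_16SC2)

    raw = cv2.imread("data/set1/images/raw/0.png")  # path illustrative
    rect = cv2.remap(raw, map1, map2, cv2.INTER_LINEAR)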

The camera_poses.txt CSV file for each sequence gives a globally referenced camera pose for each image in the sequence, with the line format ", x, y, z, qw, qx, qy, qz", where [x,y,z] is the translation in meters and [qw,qx,qy,qz] is the rotation as a quaternion. The annotation json files use COCO-style 6D pose and 2D bounding box annotations (https://cocodataset.org/#home). These annotations were generated with the VisPose tool (https://github.com/gidobot/VisPose), which was created as part of this work. VisPose can be used to visualize the annotations. Note that the VisPose version used to generate this dataset has the commit date May 13, 2020.
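
A minimal parsing sketch, assuming the leading CSV field identifies the image the pose belongs to (check a shipped camera_poses.txt to confirm) and using illustrative paths:

    import json

    # Parse camera_poses.txt: ", x, y, z, qw, qx, qy, qz" per line.
    poses = {}
    with open("data/set1/camera_poses.txt") as f:  # path illustrative
        for line in f:
            fields = [s.strip() for s in line.split(",")]
            image_id = fields[0]  # assumed to name/index the image
            x, y, z, qw, qx, qy, qz = map(float, fields[1:8])
            poses[image_id] = ((x, y, z), (qw, qx, qy, qz))

    # The annotation files are plain JSON in COCO style, so json.load or any
    # standard COCO reader applies; the per-annotation 6D pose fields are
    # best confirmed against a shipped file.
    with open("data/set1/images/annotations_culled.json") as f:
        coco = json.load(f)
    print(len(coco.get("images", [])), "images;",
          len(coco.get("annotations", [])), "annotations")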

The object models were created in Blender and use material definitions for realistic textures when creating renders.
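
To place a model in an image, an annotated 6D pose (translation plus quaternion, taken here in the same [qw, qx, qy, qz] order as camera_poses.txt) can be applied to the model vertices. Below is a minimal sketch with a hand-rolled .obj vertex reader; the folder name and pose values are placeholders, and libraries such as trimesh work equally well:

    import numpy as np

    def quat_to_rot(qw, qx, qy, qz):
        # Rotation matrix from a quaternion in [qw, qx, qy, qz] order.
        n = np.sqrt(qw*qw + qx*qx + qy*qy + qz*qz)
        qw, qx, qy, qz = qw/n, qx/n, qy/n, qz/n
        return np.array([
            [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)],
            [2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)],
            [2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)],
        ])

    # Read vertex positions from an .obj model (folder name is a placeholder).
    verts = []
    with open("models/handle1/textured_real.obj") as f:
        for line in f:
            if line.startswith("v "):
                verts.append([float(v) for v in line.split()[1:4]])
    verts = np.asarray(verts)

    # Apply an object pose to bring vertices into the camera frame
    # (numbers illustrative -- real poses come from the annotation json).
    t = np.array([0.05, -0.02, 0.70])
    R = quat_to_rot(0.98, 0.05, -0.10, 0.15)
    verts_cam = verts @ R.T + t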

File Structure
==============

UWHandles
├───calibration
│       fisheye_calib.yaml -> calibration file for the raw fisheye images
│       rectified_calib.yaml -> calibration file for the center rectified images
├───data
│   └───set#
│       │   camera_poses.txt -> CSV file of globally referenced camera poses for each image in the sequence, with each line formatted as ", x, y, z, qw, qx, qy, qz"
│       └───images
│           │   annotations.json -> COCO-style 6D pose and 2D bounding box annotations generated by the VisPose annotation tool for the rectified image sequence
│           │   annotations_culled.json -> culled annotations with bounding boxes for rectified images
│           │   annotations_fisheye_culled.json -> culled annotations with bounding boxes regenerated for raw fisheye images
│           ├───raw -> folder containing raw fisheye images
│           └───rect -> folder containing center rectified images
├───image_sets -> text files listing the image sets for training, testing, and validation
└───models
    └─── (one subfolder per handle object, 3 in total)
            textured_real.obj -> textured obj file for the handle object
            textured_real.mtl -> material definition for the obj file

Use and Access
==============

This data set is made available under a Creative Commons Public Domain Dedication (CC0 1.0).

How to Cite
===========

Billings, G., & Johnson-Roberson, M. (2020). SilhoNet-fisheye: Adaptation of a ROI based object pose estimation network to monocular fisheye images. IEEE Robotics and Automation Letters, 5(3), 4241-4248.

