Work Description
Title: CHANGES Project - Fish Growth Curated Data (Open Access, Deposited)
(2025). CHANGES Project - Fish Growth Curated Data [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/h8hp-gw58
Files (Count: 4; Size: 10.3 MB)
Title | Original Upload | Last Modified | File Size | Access |
---|---|---|---|---|
grow_data.csv | 2025-05-05 | 2025-05-05 | 10.3 MB | Open Access |
grow_datadictionary.csv | 2025-05-05 | 2025-05-05 | 2.26 KB | Open Access |
grow_species_table.csv | 2025-05-05 | 2025-05-05 | 1.85 KB | Open Access |
_grow_data_Readme.txt | 2025-05-05 | 2025-05-05 | 3.83 KB | Open Access |
Date: 05 Feb, 2025
Dataset Title: CHANGES Project - Fish Growth Curated Data
Dataset Creators: King, Katelyn; Schell, Justin; Alofs, Karen; Thomer, Andrea; Wehrly, Kevin; Lenard, Michael; and Lopez-Fernandez, Hernan
Dataset Contact: Katelyn King [email protected]
Funding: Michigan Institute For Data & AI In Society (MIDAS) Propelling Original Data Science Grant
Research Overview:
Archives at the Institute for Fisheries Research (IFR) hold records of thousands of lake surveys from the University of Michigan and Michigan Department of Natural Resources.
Fish growth cards document fish that were aged and measured during fish surveys. The data transcribed from these cards and included in this dataset (grow_data.csv) give, for each fish species, the number of fish measured in each age group and the minimum, maximum, and average length of the fish in each age group. The final growth dataset includes length-at-age information for 36 different species (grow_species_table.csv). For a description of all fields in this data table, see grow_datadictionary.csv.
Methodology:
The Michigan Department of Natural Resources (MDNR) historically collected lake survey data on index cards. We used the Zooniverse crowdsourcing platform for volunteer transcription of these records, with several workflows that each captured different data. To be included in the dataset, each card had to be transcribed by three or more volunteers.
Zooniverse transcriptions require significant cleaning and curation before the data are in a usable format. We used code to aggregate the transcriptions from each volunteer into a consensus-based “final answer” and a confidence score for each data field, based on how well the entries from the different volunteers matched. We then standardized the data using techniques such as changing all text to lowercase, trimming excess whitespace, and converting fractions to decimals. We separated numeric and alphabetic values into different data columns. Finally, we standardized each variable to a single unit and, when applicable, converted to metric units (e.g. inches to millimeters). We checked numeric values by plotting them, identifying outliers, and reviewing the original document.
To combine multiple sampling events for one lake, or to connect the transcribed data to more contemporary MDNR survey data, we matched the records with the corresponding MDNR unique lake identifiers. The transcribed data included each lake’s name, county, and in some instances geographic reference data in the form of Township, Range, and Section (TRS) from the United States Public Land Survey System. We joined data entries on lake name, county, and TRS when available. Lakes that remained unmatched because of issues such as crossing county lines or changing names over time were matched manually by experts from the research team. We were unable to match some of the historical data because of insufficient geographic information.
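The aggregation and standardization steps described above can be illustrated with a minimal Python sketch. This is not the project's actual code. The function names, the majority-vote rule, and the definition of the confidence score as the share of volunteers who agree are all assumptions made for illustration; the README does not specify how consensus or confidence were computed.

```python
from collections import Counter
from fractions import Fraction

INCH_TO_MM = 25.4  # exact definition of the inch in millimeters


def normalize_text(value: str) -> str:
    """Lowercase the text and collapse/trim excess whitespace."""
    return " ".join(value.lower().split())


def fraction_to_decimal(value: str) -> float:
    """Convert strings such as '7 1/2', '1/4', or '7.5' to a float."""
    total = Fraction(0)
    for part in value.split():
        total += Fraction(part)  # Fraction accepts '7', '1/2', and '7.5'
    return float(total)


def consensus(entries: list[str]) -> tuple[str, float]:
    """Return the most common normalized entry and an agreement score.

    Assumption: the score is the fraction of volunteers whose normalized
    entry equals the winning entry. Ties go to the entry seen first.
    """
    cleaned = [normalize_text(e) for e in entries]
    answer, votes = Counter(cleaned).most_common(1)[0]
    return answer, votes / len(cleaned)


def inches_to_mm(inches: float) -> float:
    """Convert a length in inches to millimeters."""
    return inches * INCH_TO_MM


if __name__ == "__main__":
    # Hypothetical transcriptions of one length field by three volunteers.
    volunteer_entries = ["7 1/2", "  7  1/2 ", "7 1/4"]
    answer, confidence = consensus(volunteer_entries)
    length_mm = inches_to_mm(fraction_to_decimal(answer))
    print(answer, round(confidence, 2), round(length_mm, 1))
```

A low agreement score in a sketch like this would flag a field for review against the original card, which mirrors the outlier review described above.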
Instrument and/or Software specifications: NA
Files contained here:
grow_data.csv
grow_species_table.csv
grow_datadictionary.csv
We tried to use a standard naming convention for all of our data fields except for identifiers, dates, and comments. The naming convention is as follows: [variable name]_[min or max]_[unit].
Related publication(s):
King, K.B.S., Schell, J., Wehrly, K.E., Lenard, M., Singer, R., López-Fernández, H., Thomer, A.K., & Alofs, K.M. Community science helps digitize 78 years of fish and habitat data for thousands of lakes in Michigan, USA. Under review.
Use and Access:
This data set is made available under a Creative Commons Public Domain license (CC0 1.0).
To Cite Data:
King, K.B.S., K.M. Alofs, J. Schell, A. Thomer, K. Wehrly, M. Lenard, & H. Lopez-Fernandez (2025). CHANGES Project - Fish Growth Curated Data [Data set]. University of Michigan - Deep Blue Data. https://doi.org/10.7302/h8hp-gw58