This is the experimental data referenced in our manuscript entitled "Influence of CUREs on STEM retention depends on demographic identities." The dataset comprises CSV files with results from surveys given to students enrolled in Biology 173 from Fall 2015 through Fall 2019, institutional data on their course grades and their cumulative GPA at the time they enrolled in Biology 173, and graduation and major data for students who had graduated by 2021. The survey questions used in the analysis and the IRB consent form are also included as PDFs.
Bradshaw, L., Vernon, J., Schmidt, T., James, T., Zhang, J., Archbold, H., Cadigan, K., Wolfe, J.P. & Goldberg, D. 2023. Research article: Influence of CUREs on STEM retention depends on demographic identities. J Microbiol Biol Educ (accepted)
PedX is a large-scale multi-modal collection of pedestrians at complex urban intersections. The dataset provides high-resolution stereo images and LiDAR data with manual 2D and automatic 3D annotations. The data was captured using two pairs of stereo cameras and four Velodyne LiDAR sensors.
The items in this bundle are supporting videos for a study of the subsea seismo-acoustics of an earthquake in the Persian Gulf. The main data used in the study is a diver's recording of the acoustic waves from the earthquake. The epicenter and topography data used in this study are publicly available, as cited in the README.txt file.
The work guides the processing of CAM6 data for use in machine learning applications. We also provide workflow scripts for training both random forests and neural networks to emulate physics schemes from the data, as well as analysis scripts written in both Python and NCL to process our results.
Limon, G. C., Jablonowski, C. (2022) Probing the Skill of Random Forest Emulators for Physical Parameterizations via a Hierarchy of Simple CAM6 Configurations [Preprint]. ESSOAr. https://doi.org/10.1002/essoar.10512353.1
The collection contains the code and the data used to train machine learning algorithms to emulate simplified physical parameterizations within the Community Atmosphere Model (CAM6). CAM6 is the atmospheric general circulation model (GCM) within the Community Earth System Model (CESM) framework, developed by the National Center for Atmospheric Research (NCAR). GCMs are made up of a dynamical core, responsible for the geophysical fluid flow calculations, and physical parameterization schemes, which estimate various unresolved processes. Simple physics schemes were used to train both random forests and neural networks to explore whether machine learning techniques can be used in conjunction with the dynamical core to improve the efficiency of future climate and weather models. The results of the research show that various physical forcing tendencies and precipitation rates can be effectively emulated by the machine learning models.
The data represents weekly output from three 60-year CAM6 model runs. The output includes state (.h0. files) and tendency (.h1. files) fields for three different model configurations of increasing complexity. State fields include temperature, surface pressure, and specific humidity, among others, while tendency fields include temperature tendencies, specific humidity tendencies, and precipitation rates. Using the state variables at a given time step, machine learning techniques can be trained to predict the following tendency field, which can then be applied to the state variables to provide the state at the next physics time step of the model.
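The state-to-tendency-to-next-state cycle described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the workflow scripts in this collection: the dictionary layout of the fields, the field names "T" and "PS", the forward-Euler update, the 1800 s time step, and the toy_emulator function, which is only a placeholder standing in for a trained random forest or neural network.

```python
from typing import Callable, Dict, List

# Hypothetical layout: field name -> one value per grid column.
Fields = Dict[str, List[float]]


def apply_tendency(state: Fields, tendency: Fields, dt: float) -> Fields:
    """Advance every field that has a tendency by one step of length dt (seconds).

    Assumes a simple forward-Euler update: new = old + dt * tendency.
    Fields without a tendency are copied unchanged.
    """
    new_state: Fields = {}
    for name, values in state.items():
        if name in tendency:
            new_state[name] = [v + dt * t for v, t in zip(values, tendency[name])]
        else:
            new_state[name] = list(values)
    return new_state


def emulate_step(state: Fields, emulator: Callable[[Fields], Fields], dt: float) -> Fields:
    """Predict tendencies from the current state, then apply them to that state."""
    return apply_tendency(state, emulator(state), dt)


def toy_emulator(state: Fields) -> Fields:
    """Placeholder for a trained model: relax temperature toward 300 K over one day."""
    return {"T": [(300.0 - t) / 86400.0 for t in state["T"]]}


# Two grid columns with temperature (K) and surface pressure (Pa); values are made up.
state = {"T": [290.0, 310.0], "PS": [100000.0, 100000.0]}
next_state = emulate_step(state, toy_emulator, dt=1800.0)
print(next_state)
```

In an actual application, toy_emulator would be replaced by a model fitted on pairs of state fields and the tendency fields that follow them, and the update rule would need to match the one the host model uses.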
Limon, G. C., Jablonowski, C. (2022) Probing the Skill of Random Forest Emulators for Physical Parameterizations via a Hierarchy of Simple CAM6 Configurations [Preprint]. ESSOAr. https://doi.org/10.1002/essoar.10512353.1