Read Me

Dataset Title: Influence of CUREs (Course Based Undergraduate Research Experience) on STEM retention depends on demographic identities

Dataset Creators: L. Bradshaw, J. Vernon, T. Schmidt, T. James, J. Zhang, H. Archbold, K. Cadigan, J.P. Wolfe & D. Goldberg

Dataset Contact: degold@umich.edu

Funding: Howard Hughes Medical Institute

Related publication and citation for this data set (where complete study description and methodology can be found):
Bradshaw, L., Vernon J., Schmidt T., James T., Zhang J., Archbold H., Cadigan K., Wolfe J.P., & Goldberg D. 2023. Research article: Influence of CUREs on STEM retention depends on demographic identities. J Microbiol Biol Educ
Insert doi here when available

The following files are contained in this data set:

File Name
    Description

Graduated_students.csv
    Stream enrolled, demographic info, intended major and degree obtained for Bio 173 students who had graduated or left UM at time of analysis

Pre_and_post_semester_major.csv
    Stream enrolled, demographic info, intended majors at the beginning and end of the semester during which students were enrolled in Bio 173 for first and second year students

Motivation_factors.csv
    Stream enrolled, demographic info, pre-semester scores, post-semester scores and change in scores for 5 motivational factors of first and second year students in Bio 173

Class_environment.csv
    Stream enrolled, demographic info, pre-semester score, post-semester score and change in score for classroom environment (student cohesiveness) of first and second year students in Bio 173

Grade_data.csv
    Stream enrolled, demographic info, Bio 173 course grade and cumulative GPA for all students enrolled in Bio 173 during the period of the study

Comments_from_surveys.csv
    Stream enrolled and optional written comments on surveys from all students enrolled in Bio 173 during the period of the study, with names of instructors removed.

Survey_questions.pdf
    Survey questions given to students at the beginning and end of the semester that were included for analysis in this paper

IRB_consent_form.pdf
    Consent form included at the beginning of each survey

Abbreviations and terms used in data files:

semester = semester of enrollment in Bio 173, e.g. fall15 = Fall 2015; win16 = Winter 2016

labtype:
    reg = regular or traditional lab course
    arc = CURE or research-based lab course

stream:
    nonapp or non-applicant = student didn't apply to participate in CURE and enrolled in regular lab course
    nonpar or non-participant = student applied to CURE but enrolled in regular lab course
    evolyst = Yeast evolution CURE
    fly = Fly genetics CURE
    microb = Human microbiome CURE

pgmyr = program year:
    frosh = first year
    soph = second year

ethnic = self-identified ethnicity, categorized as:
    peer = Black or African American, Latinx or Hispanic and peoples indigenous to the U.S. and its territories
    nonpeer = all other ethnicities

generation:
    first = parents of student did not attend college or university
    notfst = at least 1 parent of student attended college or university

premaj or premajor = student's intended major when taking the survey at the beginning of the semester they were enrolled in Bio 173, coded as:
    stem = science, technology, engineering or math majors
    other = any other major
    undec = undecided

pstmaj or post major = student's intended major when taking the survey at the end of the semester they were enrolled in Bio 173; same classification as the premaj variable (above)

degree = major or field in which the student obtained a degree, coded as stem, other or nofinish, where:
    nofinish = no degree granted at UM

persist = whether a student kept their intended major from the initial survey date until graduation, coded as:
    staystem = remained in stem
    staynon = remained in other major
    smtonon = left stem for other major
    nontosm = left other major and became stem

grade = grade obtained in Bio 173 course on a 4 point scale where A = 4.0, A- = 3.7, etc.

cumgpa = cumulative grade point average over the student's time at UM, on the 4 point scale described above

Motivation factors abbreviations:

Average scores from 1-5 scaled responses to survey questions related to the defined motivational factors described below.

Prefixes:
    pre = from survey at the beginning of the semester
    pst = from survey at the end of the semester
    chng = change in score, calculated as post score - pre score (used for these five variables in the Motivation_factors.csv file)

interest = subject (biology) interest
effbel = effort belief, or how much a student believes that increased effort, as opposed to innate ability, will have a positive impact on academic outcomes
intlacc = intellectual accessibility, or how obtainable knowledge of the subject is to the student
useful = perceived importance and usefulness of biology
labcon = laboratory confidence, or student's perceived self-efficacy in completing lab/research based tasks

Class environment abbreviations:

Cohes = student cohesiveness, or perceived cooperation among students, a proxy for science identity

Prefixes:
    pre = from survey at the beginning of the semester
    pst = from survey at the end of the semester
    chng = change in score, calculated as post score - pre score

The study period ran from the Winter 2015 through Winter 2020 semesters, with data analyzed from the period encompassing Fall 2015 through Fall 2019.

Use and Access:

This data set is made available under a Creative Commons Public Domain license.
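
Example of working with the coded variables (illustrative sketch only):

The Python sketch below shows one way the chng and persist definitions above could be reproduced when checking the files. It is not the authors' analysis code. It assumes that column names are formed by joining a prefix and a factor abbreviation (e.g. preinterest, pstinterest), which should be verified against the actual CSV headers. The function names are hypothetical, and the sample rows are made up for illustration and are not taken from the data set. This Read Me does not state how undec or nofinish values map to persist, so the sketch returns None for any combination not listed above.

```python
import csv
import io

# Factor abbreviations listed in this Read Me.
FACTORS = ["interest", "effbel", "intlacc", "useful", "labcon"]

# Made-up rows for illustration only; NOT taken from the data set.
# Column names (prefix + factor) are an assumption about the CSV headers.
SAMPLE = """stream,preinterest,pstinterest,preeffbel,psteffbel
fly,3.5,4.0,4.0,3.75
nonapp,4.25,4.0,3.0,3.5
"""


def change_scores(row, factors=FACTORS):
    """Return {factor: post score - pre score} for factors present in row."""
    changes = {}
    for factor in factors:
        pre = row.get("pre" + factor)
        pst = row.get("pst" + factor)
        if pre in (None, "") or pst in (None, ""):
            continue  # skip factors with a missing pre or post score
        changes[factor] = round(float(pst) - float(pre), 2)
    return changes


def persist_code(premaj, degree):
    """Map (intended major, degree) to the persist codes defined above.

    Returns None for combinations this Read Me does not define,
    such as premaj = undec or degree = nofinish.
    """
    table = {
        ("stem", "stem"): "staystem",
        ("other", "other"): "staynon",
        ("stem", "other"): "smtonon",
        ("other", "stem"): "nontosm",
    }
    return table.get((premaj, degree))


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
results = [(row["stream"], change_scores(row)) for row in rows]
for stream, changes in results:
    print(stream, changes)
```

To run the same check against a real file, the io.StringIO(SAMPLE) argument could be replaced with an open file handle for Motivation_factors.csv, after adjusting the column names to match its header row.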