Work Description
Title: Estimating Student Capital of Community College Student Populations
(2020). Estimating Student Capital of Community College Student Populations [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/attc-n529
Files (Count: 4; Size: 52.2 KB)
Title | Original Upload | Last Modified | File Size | Access |
---|---|---|---|---|
readme.txt | 2020-05-15 | 2020-05-15 | 2.69 KB | Open Access |
mlecens.R | 2020-05-15 | 2020-05-15 | 12.7 KB | Open Access |
code_from_QBR_paper.R | 2020-05-15 | 2020-05-15 | 18.1 KB | Open Access |
sample_data.csv | 2020-05-15 | 2020-05-15 | 18.6 KB | Open Access |
DATE: May 13, 2020
TITLE: Code and sample data to accompany "The Shape of Educational Inequality" by Quarles, Budak & Resnick, published in Science Advances
AUTHOR: Christopher L. Quarles, chrisquarles@gmail.com
This repository contains four files:
-- readme.txt: The file you're reading now.
-- mlecens.R: This file contains the R code for estimating student capital in a population of students. It contains one function, mlecens, which performs right-censored maximum likelihood estimation to fit a distribution to a data set.
-- code_from_QBR_paper.R: This file contains the R code used to make most of the images and tables in the paper. Because our data is not publicly available, all of the code runs on sample_data.csv instead. To recreate an image from the paper with your own data, format your data like sample_data.csv and then run the code in this file.
-- sample_data.csv: For privacy reasons, the data used in the paper is not available to the public. This dataset mimics the type of data used for the analysis. The dataset has 4 variables:
- credits_earned = number of credits earned by a given student, rounded to the nearest positive integer
- droppedout = FALSE if the student graduated or transferred, TRUE otherwise
- transferred = TRUE iff the student transferred to a 4-year college
- transnograd = TRUE iff the student transferred but didn't graduate
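To illustrate what right-censored maximum likelihood estimation means for data shaped like sample_data.csv, here is a minimal sketch. It is NOT the code in mlecens.R. It assumes, purely for illustration, that student capital follows a geometric distribution on 1, 2, 3, ... with per-credit retention rate q, that a dropout's capital is observed exactly at credits_earned, and that a student who graduated or transferred is right-censored (capital exceeds credits_earned). The function names censored_geom_loglik and fit_censored_geom and the toy data are hypothetical.

```r
# Hypothetical illustration only; this is not the mlecens function from mlecens.R.
# Assumed model: capital S is geometric, P(S = k) = (1 - q) * q^(k - 1), k = 1, 2, ...

censored_geom_loglik <- function(q, credits, droppedout) {
  observed <- as.logical(droppedout)  # accepts TRUE/FALSE or 1/0
  # Dropouts: capital is observed exactly at `credits`.
  ll_obs <- sum(log(1 - q) + (credits[observed] - 1) * log(q))
  # Graduates/transfers: right-censored, so only P(S > credits) = q^credits is known.
  ll_cens <- sum(credits[!observed] * log(q))
  ll_obs + ll_cens
}

fit_censored_geom <- function(credits, droppedout) {
  # One-dimensional numerical maximization of the log-likelihood over q in (0, 1).
  opt <- optimize(censored_geom_loglik,
                  interval = c(1e-6, 1 - 1e-6),
                  credits = credits, droppedout = droppedout,
                  maximum = TRUE, tol = 1e-10)
  q_hat <- opt$maximum
  list(q = q_hat, mu_s = 1 / (1 - q_hat))
}

# Made-up toy data in the same two-column layout as sample_data.csv.
toy <- data.frame(
  credits_earned = c(12, 45, 90, 30, 60, 8),
  droppedout     = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)
)
fit <- fit_censored_geom(toy$credits_earned, toy$droppedout)
print(fit)
```

Under this assumed model the maximizer also has a closed form, q = T / (T + D), where D is the number of dropouts and T is the sum of (credits - 1) over dropouts plus the sum of credits over censored students. That gives a quick check on the numerical fit. Results from mlecens on real data may differ, since its distributional assumptions may not match this sketch.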
IF YOU JUST WANT TO CALCULATE THE STUDENT CAPITAL OF A GROUP OF STUDENTS:
- Make sure that your cohort is large enough. In simulations based on real data, the standard error of the estimated average student capital was roughly: SE = 150/sqrt(sample size).
- Also, make sure that a middling share of your students dropped out. Otherwise, you won't observe enough students' capital to make an accurate inference. I don't have a good rule of thumb here, but a cohort with 20% or fewer dropouts probably won't work, nor will one with more than 90% dropouts.
- Save your data in the same format as sample_data.csv, or you can copy and paste over the sample data. You only need two variables: credits_earned and droppedout. droppedout can be either TRUE/FALSE or 1/0.
- Make sure all the files are in the same directory.
- Make sure that you have the VGAM package installed. You can run install.packages("VGAM") or install it through the Tools menu in RStudio.
- Run the following lines in R. You'll have to change the file name to match your file. (The sample file should give q=.9917 and mu_s=120.2.)
source("mlecens.R")
coldat <- read.csv("sample_data.csv")
q = mlecens(x=coldat$credits_earned, yc=coldat$droppedout) # This returns the "per-credit retention rate"
mu_s = 1/(1-q) # This returns the "average student capital", measured in credits.
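
Before running the lines above on your own cohort, the two rules of thumb from the checklist (SE roughly 150/sqrt(sample size), and a dropout share above 20% but no higher than 90%) can be checked with a short sketch like the one below. The helper names check_cohort and required_n and the toy data are hypothetical and are not part of the deposited files. The cutoffs are only the rough guidance given above, not tested thresholds.

```r
# Hypothetical helpers for illustration; not part of the deposited files.

check_cohort <- function(dat) {
  stopifnot(all(c("credits_earned", "droppedout") %in% names(dat)))
  n <- nrow(dat)
  dropout_share <- mean(as.logical(dat$droppedout))  # accepts TRUE/FALSE or 1/0
  list(
    n = n,
    approx_se = 150 / sqrt(n),  # README rule of thumb, measured in credits
    dropout_share = dropout_share,
    # Rough guidance only: 20% or fewer, or more than 90%, probably won't work.
    dropout_share_ok = dropout_share > 0.20 && dropout_share <= 0.90
  )
}

# Invert SE = 150 / sqrt(n) to get the cohort size needed for a target SE.
required_n <- function(target_se) {
  ceiling((150 / target_se)^2)
}

# Made-up toy cohort; replace with read.csv("your_file.csv") for real use.
toy <- data.frame(
  credits_earned = c(12, 45, 90, 30, 60, 8, 75, 20, 15, 100),
  droppedout     = c(1, 1, 0, 1, 0, 1, 0, 1, 1, 0)
)
report <- check_cohort(toy)
print(report)
print(required_n(5))
```

A cohort of only ten students, as in the toy data, has a very large approximate standard error, which is why the first checklist item asks for a large cohort.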