Kelli A. Bird
Predictive analytics are increasingly pervasive in higher education. However, algorithmic bias has the potential to reinforce racial inequities in postsecondary success. We provide a comprehensive and translational investigation of algorithmic bias in two separate prediction models: one predicting course completion and the other predicting degree completion. Our results show that algorithmic bias in both models could result in at-risk Black students receiving fewer success resources than White students at comparatively lower risk of failure. We also find the magnitude of algorithmic bias to vary within the distribution of predicted success. With the degree completion model, the amount of bias is nearly four times higher when we define at-risk using the bottom decile than when we focus on students in the bottom half of predicted scores. Between the two models, the magnitude and pattern of bias and the efficacy of basic bias mitigation strategies differ meaningfully, emphasizing the contextual nature of algorithmic bias and attempts to mitigate it. Our results moreover suggest that algorithmic bias is due in part to currently available administrative data being less useful at predicting Black student success compared with White student success, particularly for new students; this suggests that additional data collection efforts have the potential to mitigate bias.
Prediction algorithms are used across public policy domains to aid in the identification of at-risk individuals and guide service provision or resource allocation. While growing research has investigated concerns of algorithmic bias, much less research has compared algorithmically-driven targeting to the counterfactual: human prediction. We compare algorithmic and human predictions in the context of a national college advising program, focusing in particular on predicting high-achieving, lower-income students’ college enrollment quality. College advisors slightly outperform a prediction algorithm; however, greater advisor accuracy is concentrated among students with whom advisors had more interactions. The algorithm achieved similar accuracy among students lower in the distribution of interactions, despite advisors having substantially more information. We find no evidence that the advisors or algorithm exhibit bias against vulnerable populations. Our results suggest that, especially at scale, algorithms have the potential to provide efficient, accurate, and unbiased predictions to target scarce social services and resources.
Data science applications are increasingly entwined in students’ educational experiences. One prominent application of data science in education is to predict students’ risk of failing a course or dropping out of college. There is growing interest among higher education researchers and administrators in whether learning management system (LMS) data, which capture very detailed information on students’ engagement in and performance on course activities, can improve model performance. We systematically evaluate whether incorporating LMS data into course performance prediction models improves model performance. We conduct this analysis within an entire state community college system. Among students with prior academic history in college, administrative data-only models substantially outperform LMS data-only models and are quite accurate at predicting whether students will struggle in a course. Among first-time students, LMS data-only models outperform administrative data-only models. We achieve the highest performance for first-time students with models that include data from both sources. We also show that models achieve similar performance with a small and judiciously selected set of predictors; models trained on system-wide data perform similarly to models trained on individual courses.
Despite decades and hundreds of billions of dollars of federal and state investment in policies to promote postsecondary educational attainment as a key lever for increasing the economic mobility of lower-income populations, research continues to show large and meaningful differences in the mid-career earnings of students from families in the bottom and top income quintiles. Prior research has not disentangled whether these disparities are due to differential sorting into colleges and majors, or due to barriers lower socioeconomic status (SES) graduates encounter during the college-to-career transition. Using linked individual-level higher education and Unemployment Insurance (UI) records for nearly a decade of students from the Virginia Community College System (VCCS), we compare the labor market outcomes of higher- and lower-SES community college graduates within the same college, program, and academic performance level. Our analyses show that, conditional on employment, lower-SES graduates earn nearly $500/quarter less than their higher-SES peers one year after graduation, relative to a higher-SES graduate average of $10,846/quarter. The magnitude of this disparity persists through at least three years after graduation. Disparities are concentrated among non-Nursing programs, in which gaps persist seven years from graduation. Our results highlight the importance of greater focus on the college-to-career transition.
Non-traditional students disproportionately enroll in institutions with weaker graduation and earnings outcomes. One hypothesis is that these students would have made different choices had they been provided with better information or supports during the decision-making process. We conducted a large-scale, multi-arm field experiment with the U.S. Army to investigate whether personalized information and the offer of advising assistance affect postsecondary choices and attainment among non-traditional adult populations. We provided U.S. Army service members transitioning out of the military with a package of research-based information and prompts, including quality and cost information on a personalized set of matched colleges, messages targeted at addressing veteran-specific concerns or needs, and reminders about key stages in the college and financial aid application process. For a randomly selected subset of the experimental sample, we also provided service members with opportunities to connect with a college advisor. We find no overall impact of the intervention on whether service members enroll in college, on the quality of their college enrollment, or on their persistence in college. We find suggestive evidence of a modest increase in degree completion within the period of observation, with these impacts mainly driven by increased attainment at for-profit institutions. Our results suggest that influencing non-traditional populations’ educational decisions and outcomes will require substantially more intensive programs and significant resources.
The COVID-19 pandemic led to an abrupt shift from in-person to virtual instruction in Spring 2020. We use two complementary difference-in-differences frameworks, one that leverages within-instructor-by-course variation in whether students started their Spring 2020 courses in person or online and another that incorporates student fixed effects. We estimate the impact of this shift on the academic performance of Virginia’s community college students. With both approaches, we find modest negative impacts (three to six percent) on course completion. Our results suggest that faculty experience teaching a given course online does not mitigate the negative effects. In an exploratory analysis, we find minimal long-term impacts of the switch to online instruction.
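For readers unfamiliar with the difference-in-differences logic mentioned in the abstract above, the following is a minimal sketch of the textbook two-group, two-period version. It is not the authors' specification, which relies on instructor-by-course and student fixed effects. The function name `did_estimate`, the group labels, and the toy completion indicators are all hypothetical and made up for illustration; they are not figures from the paper.

```python
from statistics import mean


def did_estimate(rows):
    """Return the 2x2 difference-in-differences estimate.

    rows: iterable of (treated, post, outcome) tuples, where `treated`
    and `post` are booleans and `outcome` is numeric.
    """
    # Collect outcomes into the four group-by-period cells.
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)
    means = {key: mean(values) for key, values in cells.items()}

    # Change over time in the treated group minus the change in the comparison group.
    treated_change = means[(True, True)] - means[(True, False)]
    comparison_change = means[(False, True)] - means[(False, False)]
    return treated_change - comparison_change


# Hypothetical toy data (assumption): 1 = course completed, 0 = not completed.
# "treated" = course started in person and then moved online;
# "post" = the term in which the shift occurred.
rows = (
    [(True, False, y) for y in (1, 1, 1, 0)]     # treated, earlier term
    + [(True, True, y) for y in (1, 1, 0, 0)]    # treated, shift term
    + [(False, False, y) for y in (1, 1, 1, 0)]  # comparison, earlier term
    + [(False, True, y) for y in (1, 1, 1, 0)]   # comparison, shift term
)

estimate = did_estimate(rows)
print(f"DiD estimate: {estimate:+.2f}")
```

In this toy setup the comparison group's completion rate does not change, so the estimate equals the treated group's change. The estimate has a causal reading only under the usual parallel-trends assumption, which the sketch does not test.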
Recent state policy efforts have focused on increasing attainment among adults with some college but no degree (SCND). Yet little is actually known about the SCND population. Using data from the Virginia Community College System (VCCS), we provide the first detailed profile on the academic, employment, and earnings trajectories of the SCND population, and how these compare to VCCS graduates. We show that the share of SCND students who are academically ready to reenroll and would benefit from doing so may be substantially lower than policy makers anticipate. Specifically, we estimate that few SCND students (approximately three percent) could fairly easily re-enroll in fields of study from which they could reasonably expect a sizable earnings premium from completing their degree.
Colleges have increasingly turned to predictive analytics to target at-risk students for additional support. Most of the predictive analytic applications in higher education are proprietary, with private companies offering little transparency about their underlying models. We address this lack of transparency by systematically comparing two important dimensions: (1) different approaches to sample and variable construction and how these affect model accuracy; and (2) how the selection of predictive modeling approaches, ranging from methods many institutional researchers would be familiar with to more complex machine learning methods, impacts model performance and the stability of predicted scores. The relative ranking of students’ predicted probability of completing college varies substantially across modeling approaches. While we observe substantial gains in performance from models trained on a sample structured to represent the typical enrollment spells of students and with a robust set of predictors, we observe similar performance between the simplest and most complex models.