Program and policy effects
Barriers to accessing financial aid may keep students from matriculating to college. To test whether FAFSA completion is one of these barriers, I utilize a natural experiment brought about by a Louisiana mandate for seniors to file the FAFSA upon graduation from high school. Exploiting pre-treatment FAFSA completion rates as a treatment intensity in a dosage differences-in-differences specification, I find that a 10 percentage point lower pre-treatment FAFSA completion rate for a school implies a 1 percentage point larger increase in post-mandate college enrollment.
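One common way to write a dosage difference-in-differences model is sketched below. The notation is assumed for illustration and is not taken from the paper: $Y_{st}$ is the college enrollment rate of school $s$ in year $t$, $R_s$ is the school's pre-mandate FAFSA completion rate, $\mathrm{Post}_t$ indicates post-mandate years, and $\alpha_s$ and $\gamma_t$ are school and year fixed effects.

```latex
Y_{st} = \alpha_s + \gamma_t + \beta \,(R_s \times \mathrm{Post}_t) + \varepsilon_{st}
```

Under this assumed form, with both rates measured in percentage points, the reported result corresponds to $\beta \approx -0.1$: a school whose pre-mandate completion rate is 10 points lower has a post-mandate enrollment change of $\beta \times (-10) \approx +1$ point relative to an otherwise comparable school.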
In conversation, uptake happens when a speaker builds on the contribution of their interlocutor by, for example, acknowledging, repeating or reformulating what they have said. In education, teachers' uptake of student contributions has been linked to higher student achievement. Yet measuring and improving teachers' uptake at scale is challenging, as existing methods require expensive annotation by experts. We propose a framework for computationally measuring uptake, by (1) releasing a dataset of student-teacher exchanges extracted from US math classroom transcripts annotated for uptake by experts; (2) formalizing uptake as pointwise Jensen-Shannon Divergence (pJSD), estimated via next utterance classification; (3) conducting a linguistically-motivated comparison of different unsupervised measures and (4) correlating these measures with educational outcomes. We find that although repetition captures a significant part of uptake, pJSD outperforms repetition-based baselines, as it is capable of identifying a wider range of uptake phenomena like question answering and reformulation. We apply our uptake measure to three different educational datasets with outcome indicators. Unlike baseline measures, pJSD correlates significantly with instruction quality in all three, providing evidence for its generalizability and for its potential to serve as an automated professional development tool for teachers.
Advanced course-taking in high school sends an important signal to college admissions officers, helps reduce the cost and time to complete a post-secondary degree, and increases educational attainment and future earnings. However, Black and Hispanic students in the U.S. are underrepresented in Advanced Placement (AP) coursework and dual enrollment (DE, i.e., early college). In this paper, we systematically examine the social, demographic, economic, and policy factors that are predictive of racial gaps in AP enrollment and access to DE across the U.S. We find that many of the same factors that predict higher AP access overall also predict higher racial/ethnic gaps in AP, suggesting that policies aimed at increasing AP access need to specifically attend to the inequitable access, rather than simply focusing on increasing access overall. We also find evidence that might indicate that opportunity hoarding by White families contributes to AP gaps – but not DE gaps – suggesting that DE acts as a more equitable avenue for access to college coursework. Our most novel contribution to the literature is our analysis of policies aimed at reducing teacher shortages in high-needs areas, in which we find no evidence that the disparities in access to advanced coursework were reduced following implementation of these policies.
In multisite experiments, we can quantify treatment effect variation with the cross-site treatment effect variance. However, there is no standard method for estimating cross-site treatment effect variance in multisite regression discontinuity designs (RDD). This research addresses this gap in the literature by systematically exploring and evaluating methods for estimating the cross-site treatment effect variance in multisite RDDs. Specifically, we formalize a fixed intercepts/random coefficients (FIRC) RDD model and develop a random effects meta-analysis (Meta) RDD model for estimating cross-site treatment effect variance. We find that a restricted FIRC model works best when the running variables' relationship to the outcome is stable across sites but can be biased otherwise. In those instances, we recommend using either the unrestricted FIRC model or the meta-analysis model, with the unrestricted FIRC model generally performing better when the average number of in-bandwidth observations is less than 120 and the meta-analysis model performing better when the average number of in-bandwidth observations is above 120. We apply our models to a high school exit exam policy in Massachusetts that required students who passed the high school exit exam but were still determined to be nonproficient to complete an "Education Proficiency Plan" (EPP). We find the EPP policy had a positive local average treatment effect on whether students completed a math course their senior year on average across sites, but that the impact varied enough such that a third of schools could have had a negative impact.
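The link between a cross-site variance and a statement like "a third of schools could have had a negative impact" can be illustrated with a small calculation. The sketch below assumes site-level effects are normally distributed around the average effect, which is an assumption made here for illustration and not necessarily the paper's model; the function names are hypothetical.

```python
from statistics import NormalDist


def share_negative(mean_effect: float, cross_site_sd: float) -> float:
    """Share of sites with a negative true effect.

    Assumes (for illustration) that site effects are
    Normal(mean_effect, cross_site_sd).
    """
    if cross_site_sd <= 0:
        raise ValueError("cross_site_sd must be positive")
    return NormalDist(mu=mean_effect, sigma=cross_site_sd).cdf(0.0)


def implied_sd_ratio(share_neg: float) -> float:
    """Cross-site SD, as a multiple of a positive mean effect,
    that would produce the given share of negative site effects."""
    if not 0.0 < share_neg < 0.5:
        raise ValueError("share_neg must be between 0 and 0.5")
    z = NormalDist().inv_cdf(share_neg)  # negative when share_neg < 0.5
    return -1.0 / z


if __name__ == "__main__":
    ratio = implied_sd_ratio(1 / 3)
    print(f"SD / mean ratio implying one third negative: {ratio:.2f}")
    print(f"Check: share negative = {share_negative(1.0, ratio):.3f}")
```

Under the normality assumption, one third of sites having negative effects corresponds to a cross-site standard deviation a little more than twice the average effect.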
Federal law defines eligibility for English learner (EL) classification differently for Indigenous students compared to non-Indigenous students. Indigenous students, unlike non-Indigenous students, are not required to have a non-English home or primary language. A critical question, therefore, is how EL classification impacts Indigenous students’ educational outcomes. This study explores this question for Alaska Native students, drawing on data from five Alaska school districts. Using a regression discontinuity design, we find evidence that among students who score near the EL classification threshold in kindergarten, EL classification has a large negative impact on Alaska Native students’ academic outcomes, especially in the 3rd and 4th grades. Negative impacts are not found for non-Alaska Native students in the same districts.
A core motivation for the widespread teacher evaluation reforms of the last decade was the belief that these new systems would promote teacher development through high-quality feedback. We examine this theory by studying teachers’ perceptions of evaluation feedback in Boston Public Schools and evaluating the district’s efforts to improve feedback through an administrator training program. Teachers generally reported that evaluators were trustworthy, fair, and accurate, but that they struggled to provide high-quality feedback. We find little evidence the training program improved perceived feedback quality, classroom instruction, teacher self-efficacy, or student achievement. Our results illustrate the challenges of using evaluation systems as engines for professional growth when administrators lack the time and skill necessary to provide frequent, high-quality feedback.
From 2010 onwards, most US states have aligned their education standards by adopting the Common Core State Standards (CCSS) for math and English Language Arts. The CCSS did not target other subjects such as science and social studies. We estimate spillovers of the CCSS on student achievement in non-targeted subjects in models with state and year fixed effects. Using student achievement data from the NAEP, we show that the CCSS had a negative effect on student achievement in non-targeted subjects. This negative effect is largest for underprivileged students, exacerbating racial and socioeconomic student achievement gaps. Using teacher surveys, we show that the CCSS caused a reduction in instructional focus on non-targeted subjects.
Four-day school weeks have proliferated across the United States in recent years, reaching over 650 public school districts in 24 states as of 2019, but little is known about the effects of the four-day school week on high school students. This study uses district-level panel data from Oklahoma and a difference-in-differences research design to provide the first estimates of the causal effect of the four-day school week on high school students’ ACT scores, attendance, and disciplinary incidents during school. Results indicate that four-day school weeks decrease per-pupil bullying incidents by approximately 31% and per-pupil fighting incidents by approximately 27%, but have no detectable effect on other incident types, ACT scores, or attendance.
Poor program implementation constitutes one explanation for null results in trials of educational interventions. For this reason, researchers often collect data about implementation fidelity when conducting such trials. In this article, we document whether and how researchers report and measure program fidelity in recent cluster-randomized trials. We then create two measures—one describing the level of fidelity reported by authors and another describing whether the study reports null results—and examine the correspondence between the two. We also explore whether fidelity is influenced by study size, type of fidelity measured and reported, and features of the intervention. We find that, as expected, fidelity level relates to student outcomes; we also find that the presence of new curriculum materials positively predicts fidelity level.
Free and reduced-price meal (FRM) eligibility is commonly used in education research and policy applications as an indicator of student poverty. However, using multiple data sources external to the school system, we show that FRM status is a poor proxy for poverty, with eligibility rates far exceeding what would be expected based on stated income thresholds for program participation. This is true even without accounting for community eligibility for free meals, although community eligibility has exacerbated the problem in recent years. Over the course of showing the limitations of using FRM data to measure poverty, we provide promising validity evidence for a new, publicly-available measure of school poverty based on local-area family incomes.