K-12 Education

Isaac M. Opper, Umut Özek.

We use a marginal treatment effect (MTE) representation of a fuzzy regression discontinuity setting to propose a novel estimator. The estimator can be thought of as extrapolating the traditional fuzzy regression discontinuity estimate or as an observational study that adjusts for endogenous selection into treatment using information at the discontinuity. We show in a frequentist framework that it is consistent under weaker assumptions than existing approaches and then discuss conditions in a Bayesian framework under which it can be considered the posterior mean given the observed conditional moments. We then use this approach to examine the effects of early grade retention. We show that the benefits of early grade retention policies are larger for students with lower baseline achievement and smaller for low-performing students who are exempt from retention. These findings imply that (1) the benefits of early grade retention policies are larger than have been estimated using traditional fuzzy regression discontinuity designs but that (2) retaining additional students would have a limited effect on student outcomes.


Deven Carlson, Adam Shepardson.

As students are exposed to extreme temperatures with ever-increasing frequency, it is important to understand how such exposure affects student learning. In this paper we draw upon detailed student achievement data, combined with high-resolution weather records, to paint a clear portrait of the effect of temperature on student learning across a six-year period for students in Tulsa, Oklahoma. The detailed, longitudinal nature of our data allows us to estimate the effects of both test-day and longer-term temperature on student test performance, and to examine how the effects of both temperature measures vary across seasons, student background, and the distribution of student achievement. First, our results show that test-day temperature has no significant effect on student test performance in fall or winter, but a clear negative effect on students’ spring performance, particularly in math. Second, we find that summer temperature has a positive, statistically significant, and substantively meaningful effect on student performance on the fall MAP assessment; these effects appear in both math and reading. The results also illustrate that 90-day temperature affects math performance in winter and spring, but these estimates are modest in substantive magnitude.


Brian Jacob.

Media reports suggest that parent frustration with COVID school policies and the growing politicization of education have increased community engagement with local public schools. However, there is no evidence to date on whether these factors have translated into greater engagement at the ballot box. This paper uses a novel data set to explore how school board elections changed following the start of the COVID-19 pandemic. I find that school board elections post-COVID were more likely to be contested, and that voter turnout in contested elections increased. These changes were large in magnitude and varied with several district characteristics.


Joshua B. Gilbert.

When analyzing treatment effects on test scores, researchers face many choices and competing guidance for scoring tests and modeling results. This study examines the impact of scoring choices through simulation and an empirical application. Results show that estimates from multiple methods applied to the same data will vary because two-step models using sum or factor scores provide attenuated standardized treatment effects compared to latent variable models. This bias dominates any other differences between models or features of the data generating process, such as the use of scoring weights. An errors-in-variables (EIV) correction removes the bias from two-step models. An empirical application to data from a randomized controlled trial demonstrates the sensitivity of the results to model selection. This study shows that the psychometric principles most consequential in causal inference are related to attenuation bias rather than optimal scoring weights.
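
The attenuation described above can be illustrated with a short sketch. It assumes classical test theory, in which mean-zero measurement error leaves the raw treatment effect unchanged but inflates the observed-score standard deviation, so the standardized effect on observed scores equals the latent standardized effect times the square root of the score reliability. This is a textbook simplification, not the paper's errors-in-variables model, and the function names and example numbers are hypothetical.

```python
import math


def attenuated_effect(latent_effect_sd, reliability):
    """Standardized effect expected on an error-prone observed score.

    Assumes classical test theory: observed SD = latent SD / sqrt(reliability),
    so the standardized effect shrinks by a factor of sqrt(reliability).
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return latent_effect_sd * math.sqrt(reliability)


def disattenuated_effect(observed_effect_sd, reliability):
    """Undo the shrinkage by dividing by sqrt(reliability)."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return observed_effect_sd / math.sqrt(reliability)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    observed = attenuated_effect(0.5, reliability=0.64)
    print(observed)
    print(disattenuated_effect(observed, reliability=0.64))
```

Under this assumption, a lower reliability produces a larger gap between the observed-score and latent standardized effects, which is the direction of bias the abstract reports for two-step models.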


Joshua B. Gilbert, Luke W. Miratrix, Mridul Joshi, Benjamin W. Domingue.

Analyzing heterogeneous treatment effects (HTE) plays a crucial role in understanding the impacts of educational interventions. A standard practice for HTE analysis is to examine interactions between treatment status and pre-intervention participant characteristics, such as pretest scores, to identify how different groups respond to treatment. This study demonstrates that identical patterns of HTE on test score outcomes can emerge either from variation in treatment effects due to a pre-intervention participant characteristic or from correlations between treatment effects and item easiness parameters. We demonstrate analytically and through simulation that these two scenarios cannot be distinguished if analysis is based on summary scores alone. We then describe a novel approach that identifies the relevant data-generating process by leveraging item-level data. We apply our approach to a randomized trial of a reading intervention in second grade, and show that any apparent HTE by pretest ability is driven by the correlation between treatment effect size and item easiness. Our results highlight the potential of employing measurement principles in causal analysis, beyond their common use in test construction.


Lucy C. Sorensen, Andrea Headley, Stephen B. Holt.

Involvement with the juvenile justice system carries immense personal costs to youth: 30% of detained youth drop out of school (relative to 5% nationally) and 55% are re-arrested within one year. These personal costs are compounded by societal costs, both directly in $214,000 of expenses per confined youth per year and indirectly in lost social and economic productivity. While much of the extant research on the “school-to-prison pipeline” focuses on school disciplinary practices such as suspension, less attention has been given to understanding the impact of school referrals to the juvenile justice system on students’ relationship with school. Using novel administrative data from North Carolina, we link 3 years of individual educational and disciplinary infraction records to juvenile justice system records to identify the effect of juvenile justice referrals for school-based offenses on student academic and behavioral outcomes. We find that, even for the same offense type and circumstance, relative to students punished for infractions only internally within the school, students referred to juvenile justice experience lower academic achievement and increased absenteeism, and are more likely to have future contact with the juvenile justice system. We show that these juvenile referrals are not inevitable and instead reflect a series of discretionary choices made by school administrators and law enforcement. Moreover, we examine demographic disparities in school-based referrals to juvenile justice and find that female students, Black students, and economically disadvantaged students are more likely to receive referrals even for the same offense type and circumstances.


David Figlio, Cassandra M. D. Hart, Krzysztof Karbownik.

Using a rich dataset that merges student-level school records with birth records, and leveraging three alternative identification strategies, we explore how increased access to charter schools in twelve Florida districts affects students remaining in traditional public schools (TPS). We consistently find that competition stemming from the opening of new charter schools improves reading (but not math) performance and decreases absenteeism among students who remain in TPS. Results are modest in magnitude.


S. Michael Gaddis, Charles Crabtree, John B. Holbein, Steven Pfaff.

Although numerous studies document different forms of discrimination in the U.S. public education system, very few provide plausibly causal estimates. Thus, it is unclear to what extent public school principals discriminate against racial and ethnic minorities. Moreover, no studies test for heterogeneity in racial/ethnic discrimination by individual-level resource needs and school-level resource strain – potentially important moderators in the education context. Using a correspondence audit, we examine bias against Black, Hispanic, and Chinese American families in interactions with 52,792 public K-12 principals in 33 states. Our research provides causal evidence that Hispanic and Chinese American families face significant discrimination in initial interactions with principals, regardless of individual-level resource needs. Black families, however, only face discrimination when they have high resource needs. Additionally, principals in schools with greater resource strain discriminate more against Chinese American families. This research uncovers complexities of racial/ethnic discrimination in the K-12 context because we examine multiple racial/ethnic groups and test for heterogeneity across individual- and school-level variables. These findings highlight the need for researchers conducting future correspondence audits to expand the scope of their research to provide a more comprehensive analysis of racial/ethnic discrimination in the U.S.


Jing Liu, Cameron Conrad, David Blazar.

This study provides the first causal analysis of the impact of expanding Computer Science (CS) education in U.S. K-12 schools on students’ choice of college major and early career outcomes. Utilizing rich longitudinal data from Maryland, we exploit variation from the staggered rollout of CS course offerings across high schools. Our findings suggest that taking a CS course increases students’ likelihood of declaring a CS major by 10 percentage points and receiving a CS BA degree by 5 percentage points. Additionally, access to CS coursework raises students’ likelihood of being employed and their early career earnings. Notably, students who are female, Black, or of low socioeconomic status experience larger benefits in terms of CS degree attainment and earnings. However, the lower take-up rates of these groups in CS courses highlight a pressing need for targeted efforts to enhance their participation as policymakers continue to expand CS curricula in K-12 education.


Paul T. von Hippel.

Educational researchers often report effect sizes in standard deviation units (SD), but SD effects are hard to interpret. Effects are easier to interpret in percentile points, but converting SDs to percentile points involves a calculation that is not transparent to educational stakeholders. We show that if the outcome variable is normally distributed, simply multiplying the SD effect by 37 usually gives an excellent approximation to the percentile-point effect. For students in the middle three-fifths of a normal distribution, this rule of thumb is always accurate to within 1.6 percentile points for effect sizes of up to 0.8 SD. Two examples show that the rule can be just as accurate for empirical effects from real studies. Applying the rule to Kraft’s empirical benchmarks, we find that the least effective third of educational interventions raise scores by 0 to 2 percentile points; the middle third raise scores by 2 to 7 percentile points; and the most effective third raise scores by more than 7 percentile points.
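
The rule of thumb above can be checked against an exact calculation. The sketch below assumes a standard normal outcome and a student who starts at a given percentile and moves up by the stated effect; the function names and the default starting percentile are illustrative choices, not taken from the paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1


def rule_of_thumb_points(effect_sd):
    """Approximate percentile-point gain: multiply the SD effect by 37."""
    return 37.0 * effect_sd


def exact_points(effect_sd, start_percentile=50.0):
    """Exact percentile-point gain for a student who starts at
    start_percentile of a normal distribution and moves up by effect_sd."""
    z_start = STD_NORMAL.inv_cdf(start_percentile / 100.0)
    z_end = z_start + effect_sd
    return 100.0 * STD_NORMAL.cdf(z_end) - start_percentile


if __name__ == "__main__":
    # Compare the approximation with the exact value for a median student.
    for d in (0.1, 0.2, 0.5, 0.8):
        approx = rule_of_thumb_points(d)
        exact = exact_points(d)
        print(f"{d:.1f} SD: rule {approx:.1f}, exact {exact:.1f}")
```

Changing `start_percentile` shows how the gap between the two numbers depends on where in the distribution the student begins.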
