Methods
Experimental education research: clarifying why, how and when to use random assignment
Over the last twenty years, education researchers have increasingly conducted randomised experiments with the goal of informing the decisions of educators and policymakers. Such experiments have generally employed broad, consequential, standardised outcome measures in the hope that this would…
An Improved Method for Estimating School-Level Characteristics from Census Data
We propose a new method for estimating school-level characteristics from publicly available census data. We use a school’s location to impute its catchment area by aggregating the nearest n census block groups such that the number of school-aged children in those n block groups is just over the…
Implementation Matters: Generalizing Treatment Effects in Education
Targeted instruction is one of the most effective educational interventions in low- and middle-income countries, yet reported impacts vary by an order of magnitude. We study this variation by aggregating evidence from prior randomized trials across five contexts, and use the results to inform a…
A Global Regression Discontinuity Design: Theory and Application to Grade Retention Policies
We use a marginal treatment effect (MTE) representation of a fuzzy regression discontinuity setting to propose a novel estimator. The estimator can be thought of as extrapolating the traditional fuzzy regression discontinuity estimate or as an observational study that adjusts for endogenous…
Identification of Non-Additive Fixed Effects Models: Is the Return to Teacher Quality Homogeneous?
Panel or grouped data are often used to allow for unobserved individual heterogeneity in econometric models via fixed effects. In this paper, we discuss identification of a panel data model in which the unobserved heterogeneity both enters additively and interacts with treatment variables. We…
How Measurement Affects Causal Inference: Attenuation Bias is (Usually) More Important Than Scoring Weights
When analyzing treatment effects on test scores, researchers face many choices and competing guidance for scoring tests and modeling results. This study examines the impact of scoring choices through simulation and an empirical application. Results show that estimates from multiple methods applied…
Heterogeneity of item-treatment interactions masks complexity and generalizability in randomized controlled trials
Researchers use test outcomes to evaluate the effectiveness of education interventions across numerous randomized controlled trials (RCTs). Aggregate test data—for example, simple measures like the sum of correct responses—are compared across treatment and control groups to determine whether an…
Lottery-Based Evaluations of Early Education Programs: Opportunities and Challenges for Building the Next Generation of Evidence
Lottery-based identification strategies offer potential for generating the next generation of evidence on U.S.
Measuring returns to experience using supervisor ratings of observed performance: The case of classroom teachers
We study the returns to experience in teaching, estimated using supervisor ratings from classroom observations. We describe the assumptions required to interpret changes in observation ratings over time as the causal effect of experience on performance. We compare two difference-in-differences…
The NCTE Transcripts: A Dataset of Elementary Math Classroom Transcripts
Classroom discourse is a core medium of instruction; analyzing it can provide a window into teaching and learning as well as drive the development of new tools for improving instruction. We introduce the largest dataset of mathematics classroom transcripts available to researchers, and…
Estimating Treatment Effects with the Explanatory Item Response Model
This simulation study examines the characteristics of the Explanatory Item Response Model (EIRM) for estimating treatment effects, compared to classical test theory (CTT) sum and mean scores and item response theory (IRT)-based theta scores. Results show that the EIRM and IRT theta scores…
Rethinking Principal Effects on Student Outcomes
School principals are viewed as critical actors to improve student outcomes, but there remain important methodological questions about how to measure principals’ effects. We propose a framework for measuring principals’ contributions to student outcomes and apply it empirically using data from…
Modeling Item-Level Heterogeneous Treatment Effects with the Explanatory Item Response Model: Leveraging Online Formative Assessments to Pinpoint the Impact of Educational Interventions
Analyses that reveal how treatment effects vary allow researchers, practitioners, and policymakers to better understand the efficacy of educational interventions. In practice, however, standard statistical methods for addressing Heterogeneous Treatment Effects (HTE) fail to address the HTE that…
Correspondence Measures for Assessing Replication Success
Given recent evidence challenging the replicability of results in the social and behavioral sciences, critical questions have been raised about appropriate measures for determining replication success in comparing effect estimates across studies. At issue is the fact that…
Racial Category Usage in Education Research: Examining the Publications from AERA Journals
How scholars name different racial groups has powerful salience for understanding what researchers study. We explored how education researchers used racial terminology in recently published, high-profile, peer-reviewed studies. Our sample included all original empirical studies published in the…
Assessors influence results: Evidence on enumerator effects and educational impact evaluations
A significant share of education and development research uses data collected by workers called “enumerators.” It is well documented that “enumerator effects” (inconsistent practices between the individual people who administer measurement tools) can be a key source of error in survey data…
Design and Analytic Features for Reducing Biases in Skill-Building Intervention Impact Forecasts
Despite policy relevance, longer-term evaluations of educational interventions are relatively rare. A common approach to this problem has been to rely on longitudinal research to determine targets for intervention by looking at the correlation between children’s early skills (e.g., preschool…
Signal Weighted Value-Added Models
This study introduces the signal weighted teacher value-added model (SW VAM), a value-added model that weights student-level observations based on each student’s capacity to signal their assigned teacher’s quality. Specifically, the model leverages the repeated appearance of a given student to…
How to “QuantCrit:” Practices and Questions for Education Data Researchers and Users
‘QuantCrit’ (Quantitative Critical Race Theory) is a rapidly developing approach that seeks to challenge and improve the use of statistical data in social research by applying the insights of Critical Race Theory. As originally formulated, QuantCrit rests on five principles: 1) the centrality of…
Impact Evaluations of Teacher Preparation Practices: Challenges and Opportunities for More Rigorous Research
Many teacher education researchers have expressed concerns with the lack of rigorous impact evaluations of teacher preparation practices. I summarize these various concerns as they relate to issues of internal validity, external validity, and measurement. I then assess the prevalence of these…