Methods
Assessors influence results: Evidence on enumerator effects and educational impact evaluations
A significant share of education and development research uses data collected by workers called “enumerators.” It is well documented that “enumerator effects” (inconsistent practices between the individual people who administer measurement tools) can be a key source of error in survey data… more →
Design and Analytic Features for Reducing Biases in Skill-Building Intervention Impact Forecasts
Despite policy relevance, longer-term evaluations of educational interventions are relatively rare. A common approach to this problem has been to rely on longitudinal research to determine targets for intervention by looking at the correlation between children’s early skills (e.g., preschool… more →
Signal Weighted Value-Added Models
This study introduces the signal weighted teacher value-added model (SW VAM), a value-added model that weights student-level observations based on each student’s capacity to signal their assigned teacher’s quality. Specifically, the model leverages the repeated appearance of a given student to… more →
How to “QuantCrit:” Practices and Questions for Education Data Researchers and Users
‘QuantCrit’ (Quantitative Critical Race Theory) is a rapidly developing approach that seeks to challenge and improve the use of statistical data in social research by applying the insights of Critical Race Theory. As originally formulated, QuantCrit rests on five principles: 1) the centrality of… more →
Impact Evaluations of Teacher Preparation Practices: Challenges and Opportunities for More Rigorous Research
Many teacher education researchers have expressed concerns with the lack of rigorous impact evaluations of teacher preparation practices. I summarize these various concerns as they relate to issues of internal validity, external validity, and measurement. I then assess the prevalence of these… more →
Can learning be measured by phone? Evidence from Kenya
School closures induced by COVID-19 placed heightened emphasis on alternative ways to measure student learning besides in-person exams. We leverage the administration of phone-based assessments (PBAs) measuring numeracy and literacy for primary school children in Kenya, along with in-person… more →
Bridging human and machine scoring in experimental assessments of writing: tools, tips, and lessons learned from a field trial in education
In a randomized trial that collects text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by human raters. An impact analysis can then be conducted to compare treatment and control groups, using the… more →
Common support violations in clustered observational studies of educational interventions
In education settings, treatments are often non-randomly assigned to clusters, such as schools or classrooms, while outcomes are measured for students. This research design is called the clustered observational study (COS). We examine the consequences of common support violations in the COS… more →
Bringing Transparency to Predictive Analytics: A Systematic Comparison of Predictive Modeling Methods in Higher Education
Colleges have increasingly turned to predictive analytics to target at-risk students for additional support. Most of the predictive analytic applications in higher education are proprietary, with private companies offering little transparency about their underlying models.
Measuring Conversational Uptake: A Case Study on Student-Teacher Interactions
In conversation, uptake happens when a speaker builds on the contribution of their interlocutor by, for example, acknowledging, repeating or reformulating what they have said. In education, teachers' uptake of student contributions has been linked to higher student achievement. Yet measuring and… more →
Characterizing Cross-Site Variation in Local Average Treatment Effects in Multisite Regression Discontinuity Design Contexts with an Application to Massachusetts High School Exit Exam
In multisite experiments, we can quantify treatment effect variation with the cross-site treatment effect variance. However, there is no standard method for estimating cross-site treatment effect variance in multisite regression discontinuity designs (RDD). This research addresses this gap in… more →
Using Implementation Fidelity to Aid in Interpreting Program Impacts: A Brief Review
Poor program implementation constitutes one explanation for null results in trials of educational interventions. For this reason, researchers often collect data about implementation fidelity when conducting such trials. In this article, we document whether and how researchers report and measure… more →
Connected Networks in Principal Value-Added Models
A growing literature uses value-added (VA) models to quantify principals' contributions to improving student outcomes. Principal VA is typically estimated using a connected networks model that includes both principal and school fixed effects (FE) to isolate principal effectiveness from fixed… more →
Intersecting Inequalities: Racial/Ethnic and Socioeconomic Differences in Math Achievement and School Contexts in California
Past research extensively documents inequalities in educational opportunity and achievement by students’ race/ethnicity or socioeconomic status (SES). Less scholarship focuses on how race/ethnicity and SES interact and jointly contribute to educational inequalities. We advance this burgeoning… more →
Measuring Teaching Practices at Scale: A Novel Application of Text-as-Data Methods
Valid and reliable measurements of teaching quality facilitate school-level decision-making and policies pertaining to teachers. Using nearly 1,000 word-to-word transcriptions of 4th- and 5th-grade English language arts classes, we apply novel text-as-data methods to develop automated measures… more →
The Dynamics and Measurement of High School Homelessness and Achievement Disparities
There is no national consensus on how school districts calculate high school achievement disparities between students who experience homelessness and those who do not. Using administrative student-level data from a mid-sized public school district in the Southern United States, we show that… more →
Improving Average Treatment Effect Estimates in Small-Scale Randomized Controlled Trials
Researchers often include covariates when they analyze the results of randomized controlled trials (RCTs), valuing the increased precision of the estimates over the potential of inducing small-sample bias when doing so. In this paper, we develop a sufficient condition which ensures that the… more →
How Much Does Teacher Quality Vary Across Teacher Preparation Programs? Reanalyses from Six States
At least sixteen US states have taken steps toward holding teacher preparation programs (TPPs) accountable for teacher value-added to student test scores. Yet it is unclear whether teacher quality differences between TPPs are large enough to make an accountability system worthwhile. Several… more →
Using Semantic Similarity to Assess Adherence and Replicability of Intervention Delivery
Researchers are rarely satisfied to learn only whether an intervention works; they also want to understand why and under what circumstances interventions produce their intended effects. These questions have led to increasing calls for implementation research to be… more →
Design-Based Approaches to Causal Replication Studies
Recent interest in promoting and supporting replication efforts assumes that there is well-established methodological guidance for designing and implementing these studies. However, no such consensus exists in the methodology literature. This article addresses these challenges by describing design-… more →