Search EdWorkingPapers

Matthew A. Kraft

In this thought experiment, we explore how tutoring could be scaled nationally to address COVID-19 learning loss and become a permanent feature of the U.S. public education system. We outline a blueprint centered on ten core principles and a federal architecture to support adoption, while providing for local ownership over key implementation features. High school students would tutor in elementary schools via an elective class, college students in middle schools via federal work-study, and full time 2- and 4-year college graduates in high schools via AmeriCorps. We envision an incremental, demand-driven expansion process with priority given to high-needs schools. Our blueprint highlights a range of design tradeoffs and implementation challenges as well as estimates of program costs. Our estimates suggest that targeted approaches to scaling school-wide tutoring nationally, such as focusing on K-8 Title I schools, would cost between $5 and $15 billion annually.

More →


We study the adoption and implementation of a new mobile communication app among a sample of 132 New York City public schools. The app provides a platform for sharing general announcements and news as well as engaging in personalized two-way communication with individual parents. We provide participating schools with free access to the app and randomize schools to receive intensive support (training, guidance, monitoring, and encouragement) for maximizing the efficacy of the app. Although user supports led to higher levels of communication within the app in the treatment year, overall usage remained low and declined in the following year when treatment schools no longer received intensive supports. We find few subsequent effects on perceptions of communication quality or student outcomes. We leverage rich internal user data to explore how take-up and usage patterns varied across staff and school characteristics. These analyses help to identify early adopters and reluctant users, revealing both opportunities and obstacles to engaging parents through new communication technology.

More →


Researchers commonly interpret effect sizes by applying benchmarks proposed by Cohen over a half century ago. However, effects that are small by Cohen’s standards are large relative to the impacts of most field-based interventions. These benchmarks also fail to consider important differences in study features, program costs, and scalability. In this paper, I present five broad guidelines for interpreting effect sizes that are applicable across the social sciences. I then propose a more structured schema with new empirical benchmarks for interpreting a specific class of studies: causal research on education interventions with standardized achievement outcomes. Together, these tools provide a practical approach for incorporating study features, cost, and scalability into the process of interpreting the policy importance of effect sizes.

More →


In recent years, states have sought to increase accountability for public school teachers by implementing a package of reforms centered on high-stakes evaluation systems. We examine the effect of these reforms on the supply and quality of new teachers. Leveraging variation across states and time, we find that accountability reforms reduced the number of newly licensed teacher candidates and increased the likelihood of unfilled teaching positions, particularly in hard-to-staff schools. Evidence also suggests that reforms increased the quality of new labor supply by reducing the likelihood new teachers attended unselective undergraduate institutions. Decreases in job security, satisfaction, and autonomy are likely mechanisms for these effects.

More →


This paper describes and evaluates a web-based coaching program designed to support teachers in implementing Common Core-aligned math instruction. Web-based coaching programs can be operated at relatively lower costs, are scalable, and make it more feasible to pair teachers with coaches who have expertise in their content area and grade level. Results from our randomized field trial document sizable and sustained effects on both teachers’ ability to analyze instruction and on their instructional practice, as measured the Mathematical Quality of Instruction (MQI) instrument and student surveys. However, these improvements in instruction did not result in corresponding increases in math test scores as measured by state standardized tests or interim assessments. We discuss several possible explanations for this pattern of results.

More →


We examine the dynamic nature of teacher skill development using panel data on principals’ subjective performance ratings of teachers. Past research on teacher productivity improvement has focused primarily on one important but narrow measure of performance: teachers’ value-added to student achievement on standardized tests. Unlike value-added, subjective performance ratings provide detailed information about specific skill dimensions and are available for the many teachers in non-tested grades and subjects. Using a within-teacher returns to experience framework, we find, on average, large and rapid improvements in teachers’ instructional practices throughout their first ten years on the job as well as substantial differences in improvement rates across individual teachers. We also document that subjective performance ratings contain important information about teacher effectiveness. In the district we study, principals appear to differentiate teacher performance throughout the full distribution instead of just in the tails. Furthermore, prior performance ratings and gains in these ratings provide additional information about teachers’ ability to improve test scores that is not captured by prior value-added scores. Taken together, our study provides new insights on teacher performance improvement and variation in teacher development across instructional skills and individual teachers.

More →