- Matthew Kraft
We examine the dynamic nature of teacher skill development using panel data on principals’ subjective performance ratings of teachers. Past research on teacher productivity improvement has focused primarily on one important but narrow measure of performance: teachers’ value-added to student achievement on standardized tests. Unlike value-added, subjective performance ratings provide detailed information about specific skill dimensions and are available for the many teachers in non-tested grades and subjects. Using a within-teacher returns-to-experience framework, we find, on average, large and rapid improvements in teachers’ instructional practices throughout their first ten years on the job, as well as substantial differences in improvement rates across individual teachers. We also document that subjective performance ratings contain important information about teacher effectiveness. In the district we study, principals appear to differentiate teacher performance throughout the full distribution rather than only in the tails. Furthermore, prior performance ratings and gains in these ratings provide additional information about teachers’ ability to improve test scores that is not captured by prior value-added scores. Taken together, our study provides new insights into teacher performance improvement and variation in teacher development across instructional skills and individual teachers.
Starting in 2011, Boston Public Schools (BPS) implemented major reforms to its teacher evaluation system with a focus on promoting teacher development. We administered independent district-wide surveys in 2014 and 2015 to capture BPS teachers’ perceptions of the evaluation feedback they receive. Teachers generally reported that evaluators were fair and accurate, but that evaluators struggled to provide high-quality feedback. We conduct a randomized controlled trial to evaluate the district’s efforts to improve this feedback through an intensive training program for evaluators. We find little evidence that the program affected evaluators’ feedback, teacher retention, or student achievement. Our results suggest that improving the quality of evaluation feedback may require more fundamental changes to the design and implementation of teacher evaluation systems.
Researchers commonly interpret effect sizes by applying benchmarks proposed by Cohen over a half century ago. However, effects that are small by Cohen’s standards are often large in the context of field-based education interventions. This focus on magnitude also obscures important differences in study features, program costs, and scalability. In this paper, I propose a new framework for interpreting effect sizes of education interventions, which consists of five broadly applicable guidelines and a detailed schema for interpreting effects from causal studies with standardized achievement outcomes. The schema introduces new effect-size and cost benchmarks, while also considering program scalability. Together, these guidelines and the schema provide scholars and research consumers with an empirically based, practical approach for interpreting the policy importance of effect sizes from education interventions.