Heather C. Hill

Same Idea, Shifting Standards: An Experimental Study of Racial-Ethnic Biases in Ambitious Math Teaching

Teacher expectations and judgments about student capabilities are predictive of student achievement, yet such judgments may be influenced by salient dimensions of student identity and invite biases. Moreover, ambitious math teaching may also invite teacher biases due to the emphasis on student-generated inputs and ideas. In this pre-registered audit experiment, we investigate teacher biases in a) expectations and judgments about student capabilities in math and b) teacher responsiveness to students’ mathematical thinking. Through a between-subjects design, we randomly assigned teachers to a simulated classroom composed of predominantly Black, Latinx/e, or White students and prompted them to respond to a student’s mathematical solution. We also prompted teachers to judge the quality of the student’s mathematical thinking and rate their expectations about the difficulty of the problem for the typical student. Our findings show teachers expected greater task difficulty in both the Latinx/e and Black classroom conditions relative to the White. We also found teachers may be more likely to support student sense-making and provide more positive, substantive affirmations to Black students relative to White students for the same mathematical solution. We did not find differences by condition in other dimensions. Our findings have implications for teacher training and reform-oriented mathematics instruction.

A Quantitative Study of Mathematical Language in Upper Elementary Classrooms

This study provides the first large-scale quantitative exploration of mathematical language use in upper elementary U.S. classrooms. Our approach employs natural language processing techniques to describe variation in teachers’ and students’ use of mathematical language in 1,657 fourth- and fifth-grade lessons in 317 classrooms in four districts over three years. Students’ exposure to mathematical language varies substantially across lessons and between teachers. Results suggest that teacher modeling, defined as the density of mathematical terms in teacher talk, does not substantially cause students to take up mathematical language, but that teachers may encourage student use of mathematical vocabulary by means other than mere modeling or exposure. However, we also find that teachers who use more mathematical language are more effective at raising student test scores, suggesting that they are more effective math teachers.

A Meta-Analysis of the Experimental Evidence Linking Mathematics and Science Professional Development Interventions to Teacher Knowledge, Classroom Instruction, and Student Achievement

Despite evidence that teacher professional development interventions in mathematics and science can increase student achievement, our understanding of the mechanisms by which this occurs – particularly how these interventions affect teachers themselves, and whether teacher-level changes predict student learning – remains limited. The current meta-analysis synthesizes 46 experimental studies of preK-12 mathematics and science professional development interventions to investigate how these interventions affect teachers’ knowledge and classroom instruction, and how these impacts relate to intervention effects on student achievement. Compared with controls, treatment group teachers had stronger performance on measures of knowledge and classroom instruction (pooled average impact estimate: +0.53 SD). Programs with larger impacts on teacher practice had significantly larger mean effects on student achievement. However, mean effects on student achievement were not significantly related to impacts on teacher knowledge. We discuss implications for future research and practice.

Structured Reporting Guidelines for Classroom Intervention Research

Inconsistent reporting of critical facets of classroom interventions and their related impact evaluations hinders the field’s ability to describe and synthesize the existing evidence base. In this essay, we present a set of reporting guidelines intended to steer authors of classroom intervention studies toward providing more systematic reporting of key intervention features and setting-level factors that may affect interventions’ success. The guidelines were iteratively developed using recommendations and feedback from scholars active in conducting and synthesizing classroom intervention research. This effort aims to open wider the ‘black box’ in classroom research, communicating key information with more precision and detail to practitioners and future researchers, and permitting the field to more efficiently accumulate and synthesize findings on classroom interventions, determining what works, for whom, and under what conditions.

Improving Teachers’ Questioning Quality through Automated Feedback: A Mixed-Methods Randomized Controlled Trial in Brick-and-Mortar Classrooms

While recent studies have demonstrated the potential of automated feedback to enhance teacher instruction in virtual settings, its efficacy in traditional classrooms remains unexplored. In collaboration with TeachFX, we conducted a pre-registered randomized controlled trial involving 523 Utah mathematics and science teachers to assess the impact of automated feedback in K-12 classrooms. This feedback targeted “focusing questions” – questions that probe students’ thinking by pressing for explanations and reflection. Our findings indicate that automated feedback increased teachers’ use of focusing questions by 20%. However, there was no discernible effect on other teaching practices. Qualitative interviews revealed mixed engagement with the automated feedback: some teachers noticed and appreciated the reflective insights from the feedback, while others had no knowledge of it. Teachers also expressed skepticism about the accuracy of feedback, concerns about data security, and/or noted that time constraints prevented their engagement with the feedback. Our findings highlight avenues for future work, including integrating this feedback into existing professional development activities to maximize its effect.

Practice-Based Teacher Education Pedagogies Improve Responsiveness: Evidence from a Lab Experiment

Practice-based teacher education has increasingly been adopted as an alternative to more traditional, conceptually-focused pedagogies, yet the field lacks causal evidence regarding the relative efficacy of these approaches. To address this issue, we randomly assigned 185 college students to one of three experimental conditions reflective of common conceptually-focused and practice-based teacher preparation pedagogies. We find significant and large positive effects of practice-based pedagogies on participants’ skills in eliciting and responding to student thinking as demonstrated through a written assessment and a short teaching episode. Our findings contribute to a developing evidence base that can assist policymakers and teacher educators in designing effective teacher preparation at scale.

Can Automated Feedback Improve Teachers’ Uptake of Student Ideas? Evidence From a Randomized Controlled Trial In a Large-Scale Online Course

Providing consistent, individualized feedback to teachers is essential for improving instruction but can be prohibitively resource-intensive in most educational contexts. We develop M-Powering Teachers, an automated tool based on natural language processing that gives teachers feedback on their uptake of student contributions, a high-leverage dialogic teaching practice that makes students feel heard. We conduct a randomized controlled trial in an online computer science course (n=1,136 instructors) to evaluate the effectiveness of our tool. We find that M-Powering Teachers improves instructors’ uptake of student contributions by 13% and present suggestive evidence that it also improves students’ satisfaction with the course and assignment completion. These results demonstrate the promise of M-Powering Teachers to complement existing efforts in teachers’ professional development.

The NCTE Transcripts: A Dataset of Elementary Math Classroom Transcripts

Classroom discourse is a core medium of instruction; analyzing it can provide a window into teaching and learning and can drive the development of new tools for improving instruction. We introduce the largest dataset of mathematics classroom transcripts available to researchers, and demonstrate how this data can help improve instruction. The dataset consists of 1,660 fourth- and fifth-grade elementary mathematics observations, each 45 to 60 minutes long, collected by the National Center for Teacher Effectiveness (NCTE) between 2010 and 2013. The anonymized transcripts represent data from 317 teachers across 4 school districts that serve largely historically marginalized students. The transcripts come with rich metadata, including turn-level annotations for dialogic discourse moves, classroom observation scores, demographic information, survey responses and student test scores. We demonstrate that our natural language processing model, trained on our turn-level annotations, can learn to identify dialogic discourse moves and that these moves are correlated with better classroom observation scores and learning outcomes. This dataset opens up several possibilities for researchers, educators and policymakers to learn about and improve K-12 instruction.

The data and its terms of use can be accessed here: https://github.com/ddemszky/classroom-transcript-analysis

Computationally Identifying Funneling and Focusing Questions in Classroom Discourse

Responsive teaching is a highly effective strategy that promotes student learning. In math classrooms, teachers might funnel students towards a normative answer or focus students to reflect on their own thinking, deepening their understanding of math concepts. When teachers focus, they treat students’ contributions as resources for collective sensemaking, and thereby significantly improve students’ achievement and confidence in mathematics. We propose the task of computationally detecting funneling and focusing questions in classroom discourse. We do so by creating and releasing an annotated dataset of 2,348 teacher utterances labeled for funneling and focusing questions, or neither. We introduce supervised and unsupervised approaches to differentiating these questions. Our best model, a supervised RoBERTa model fine-tuned on our dataset, has a strong linear correlation of .76 with human expert labels and with positive educational outcomes, including math instruction quality and student achievement, showing the model’s potential for use in automated teacher feedback tools. Our unsupervised measures show significant but weaker correlations with human labels and outcomes, and they highlight interesting linguistic patterns of funneling and focusing questions. The high performance of the supervised measure indicates its promise for supporting teachers in their instruction.

U.S. Middle School Mathematics Instruction, 2016

In recent decades, U.S. education leaders have advocated for more intellectually ambitious mathematics instruction in classrooms. Evidence about whether more ambitious mathematics instruction has filtered into contemporary classrooms, however, is largely anecdotal. To address this issue, we analyzed 93 lessons recorded by a national random sample of middle school mathematics teachers. We find that lesson quality varies, with the typical lesson containing some elements of mathematical reasoning and sense-making, but also teacher-directed instruction with limited student input. Lesson quality correlates with teachers’ use of a textbook and with teachers’ mathematical background. We consider these findings in light of efforts to transform U.S. mathematics instruction.
