Heather C. Hill

Dorottya Demszky, Jing Liu, Heather C. Hill, Dan Jurafsky, Chris Piech.

Providing consistent, individualized feedback to teachers is essential for improving instruction but can be prohibitively resource intensive in most educational contexts. We develop an automated tool based on natural language processing to give teachers feedback on their uptake of student contributions, a high-leverage teaching practice that supports dialogic instruction and makes students feel heard. We conduct a randomized controlled trial as part of an online computer science course, Code in Place (n=1,136 instructors), to evaluate the effectiveness of the feedback tool. We find that the tool improves instructors’ uptake of student contributions by 24% and present suggestive evidence that our tool also improves students’ satisfaction with the course. These results demonstrate the promise of our tool to complement existing efforts in teachers’ professional development.

Dorottya Demszky, Jing Liu, Zid Mancenido, Julie Cohen, Heather C. Hill, Dan Jurafsky, Tatsunori Hashimoto.

In conversation, uptake happens when a speaker builds on the contribution of their interlocutor by, for example, acknowledging, repeating or reformulating what they have said. In education, teachers' uptake of student contributions has been linked to higher student achievement. Yet measuring and improving teachers' uptake at scale is challenging, as existing methods require expensive annotation by experts. We propose a framework for computationally measuring uptake, by (1) releasing a dataset of student-teacher exchanges extracted from US math classroom transcripts annotated for uptake by experts; (2) formalizing uptake as pointwise Jensen-Shannon Divergence (pJSD), estimated via next utterance classification; (3) conducting a linguistically motivated comparison of different unsupervised measures; and (4) correlating these measures with educational outcomes. We find that although repetition captures a significant part of uptake, pJSD outperforms repetition-based baselines, as it is capable of identifying a wider range of uptake phenomena like question answering and reformulation. We apply our uptake measure to three different educational datasets with outcome indicators. Unlike baseline measures, pJSD correlates significantly with instruction quality in all three, providing evidence for its generalizability and for its potential to serve as an automated professional development tool for teachers.
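To make the idea of a repetition-based baseline concrete, the following is a minimal sketch of one possible token-overlap score. It is an illustration only, not the measure used in the paper: the function names `tokenize` and `repetition_score`, the simple regex tokenizer, and the example utterances are all assumptions introduced here.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def repetition_score(student_utterance, teacher_reply):
    """Return the fraction of distinct student tokens that reappear in the teacher's reply.

    Hypothetical baseline for illustration; the paper's own baselines may be defined differently.
    """
    student_tokens = set(tokenize(student_utterance))
    if not student_tokens:
        return 0.0
    teacher_tokens = set(tokenize(teacher_reply))
    return len(student_tokens & teacher_tokens) / len(student_tokens)


# Made-up exchange, not taken from the released dataset.
student = "I added thirty to fifty"
print(repetition_score(student, "Okay, you added thirty to fifty. Why?"))  # 0.8
print(repetition_score(student, "Good. Next question."))                   # 0.0
```

A score of this kind rewards only literal repetition. A teacher reply that answers a student's question or reformulates the idea in new words would score near zero, which is the sort of case the abstract says pJSD is better able to identify.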

Heather C. Hill, Anna Erickson.

Poor program implementation constitutes one explanation for null results in trials of educational interventions. For this reason, researchers often collect data about implementation fidelity when conducting such trials. In this article, we document whether and how researchers report and measure program fidelity in recent cluster-randomized trials. We then create two measures (one describing the level of fidelity reported by authors and another describing whether the study reports null results) and examine the correspondence between the two. We also explore whether fidelity is influenced by study size, type of fidelity measured and reported, and features of the intervention. We find that, as expected, fidelity level relates to student outcomes; we also find that the presence of new curriculum materials positively predicts fidelity level.

Heather C. Hill, Zid Mancenido, Susanna Loeb.

Despite calls for more evidence regarding the effectiveness of teacher education practices, causal research in the field remains rare. One reason is that we lack designs and measurement approaches that appropriately meet the challenges of causal inference in the context of teacher education programs. This article provides a framework for how to fill this gap. We first outline the difficulties of doing causal research in teacher education. We then describe a set of replicable practices for developing measures of key teaching outcomes, and propose causal research designs suited to the needs of the field. Finally, we identify community-wide initiatives that are necessary to advance effectiveness research in teacher education at scale.

Heather C. Hill, Erica Litke, Kathleen Lynch.

Background:
For nearly three decades, policy-makers and researchers in the United States have promoted more intellectually rigorous standards for mathematics teaching and learning. Yet, to date, we have limited descriptive evidence on the extent to which reform-oriented instruction has been enacted at scale.

Purpose:
The purpose of the study is to examine the prevalence of reform-aligned mathematics instructional practices in five U.S. school districts. We also seek to describe the range of instruction students experience by presenting case studies of teachers at high, medium and low levels of reform alignment.

Participants:
We draw on 1,735 video-recorded lessons from 329 elementary teachers in these five U.S. urban districts.

Research Design:
We present descriptive analyses of lesson scores on a mathematics-focused classroom observation instrument. We also draw upon interviews with district personnel, rater-written lesson summaries, and lesson video in order to develop case studies of instructional practice.

Findings:
We find that teachers in our sample do use reform-aligned instructional practices, but that they do so within the confines of traditional lesson formats. We also find that the implementation of these instructional practices varies in quality. Furthermore, the prevalence and strength of these practices correspond to the coherence of district efforts at instructional reform.

Conclusions:
Our findings suggest that, in contrast to other studies in which reform-oriented instruction rarely occurred (e.g., Kane & Staiger, 2012), reform practices do appear to some degree in study classrooms. In addition, our analyses suggest that implementation of these reform practices corresponds to the strength and coherence of district efforts to change instruction.

Heather C. Hill, Derek C. Briggs.

Federal policy has both incentivized and supported better use of research evidence by educational leaders. However, the extent to which these leaders are well-positioned to understand foundational principles from research design and statistics, including those that underlie the What Works Clearinghouse ratings of research studies, remains an open question. To investigate educational leaders’ knowledge of these topics, we developed a construct map and items representing key concepts, then conducted surveys containing those items with a small pilot sample (n=178) and a larger nationally representative sample (n=733) of educational leaders. We found that leaders’ knowledge was surprisingly inconsistent across topics. We also found most items were answered correctly by fewer than half of respondents, with cognitive interviews suggesting that some of those correct answers derived from guessing or test-taking techniques. Our findings identify a roadblock to policymakers’ contention that educational leaders should use research in decision-making.

Kathleen Lynch, Heather C. Hill, Kathryn Gonzalez, Cynthia Pollard.

More than half of U.S. children fail to meet proficiency standards in mathematics and science in fourth grade. Teacher professional development and curriculum improvement are two of the primary levers that school leaders and policymakers use to improve children’s science, technology, engineering and mathematics (STEM) learning, yet until recently, the evidence base for understanding their effectiveness was relatively thin. In recent years, a wealth of rigorous new studies using experimental designs have investigated whether and how STEM instructional improvement programs work. This article highlights contemporary research on how to improve classroom instruction and subsequent student learning in STEM. Instructional improvement programs that feature curriculum integration, teacher collaboration, content knowledge, pedagogical content knowledge, and attention to how students learn are all linked to stronger student achievement outcomes. We discuss implications for policy and practice.

Heather C. Hill, Kathleen Lynch, Kathryn Gonzalez, Cynthia Pollard.

How should teachers spend their STEM-focused professional learning time? To answer this question, we analyzed a recent wave of rigorous new studies of STEM instructional improvement programs. We found that programs work best when focused on building knowledge teachers can use during instruction: knowledge of the curriculum materials they will use, knowledge of content and how content can be represented for learners, and knowledge of how students learn that content. We argue that such learning opportunities improve teachers’ professional knowledge and skill, potentially by supporting teachers in making more informed in-the-moment instructional decisions.

Matthew A. Kraft, Heather C. Hill.

This paper describes and evaluates a web-based coaching program designed to support teachers in implementing Common Core-aligned math instruction. Web-based coaching programs can be operated at relatively low cost, are scalable, and make it more feasible to pair teachers with coaches who have expertise in their content area and grade level. Results from our randomized field trial document sizable and sustained effects on both teachers’ ability to analyze instruction and on their instructional practice, as measured by the Mathematical Quality of Instruction (MQI) instrument and student surveys. However, these improvements in instruction did not result in corresponding increases in math test scores as measured by state standardized tests or interim assessments. We discuss several possible explanations for this pattern of results.

Kathleen Lynch, Heather C. Hill, Kathryn Gonzalez, Cynthia Pollard.

We present results from a meta-analysis of 95 experimental and quasi-experimental preK-12 science, technology, engineering, and mathematics (STEM) professional development and curriculum programs, seeking to understand what content, activities and formats relate to stronger student outcomes. Across rigorously conducted studies, we found an average weighted impact estimate of +0.21 standard deviations. Programs saw stronger outcomes when they helped teachers learn to use curriculum materials; focused on improving teachers' content knowledge, pedagogical content knowledge and/or understanding of how students learn; incorporated summer workshops; and included teacher meetings to troubleshoot and discuss classroom implementation. We discuss implications for policy and practice.
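As a rough illustration of what a weighted average impact estimate involves, the sketch below pools study-level effect sizes using inverse-variance weights. This is an assumption made for illustration: the abstract does not state the weighting scheme, and the meta-analysis itself may rely on a different model. The function name `weighted_mean_effect` and the input numbers are hypothetical and are not taken from the 95 studies.

```python
def weighted_mean_effect(effects, std_errors):
    """Pool effect sizes with inverse-variance weights (fixed-effect style).

    effects:    study-level effect sizes in standard deviation units
    std_errors: the corresponding standard errors
    """
    if len(effects) != len(std_errors) or not effects:
        raise ValueError("effects and std_errors must be non-empty and of equal length")
    # More precise studies (smaller standard errors) receive larger weights.
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


# Made-up inputs for three hypothetical studies.
effects = [0.10, 0.30, 0.20]
std_errors = [0.05, 0.10, 0.10]
print(round(weighted_mean_effect(effects, std_errors), 3))  # 0.15
```

In this example the first study has the smallest standard error, so it pulls the pooled estimate toward its own effect of 0.10, below the unweighted mean of 0.20.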
