- Jing Liu
Responsive teaching is a highly effective strategy that promotes student learning. In math classrooms, teachers might funnel students towards a normative answer or focus students to reflect on their own thinking, deepening their understanding of math concepts. When teachers focus, they treat students’ contributions as resources for collective sensemaking, and thereby significantly improve students’ achievement and confidence in mathematics. We propose the task of computationally detecting funneling and focusing questions in classroom discourse. We do so by creating and releasing an annotated dataset of 2,348 teacher utterances labeled for funneling and focusing questions, or neither. We introduce supervised and unsupervised approaches to differentiating these questions. Our best model, a supervised RoBERTa model fine-tuned on our dataset, has a strong linear correlation of .76 with human expert labels and with positive educational outcomes, including math instruction quality and student achievement, showing the model’s potential for use in automated teacher feedback tools. Our unsupervised measures show significant but weaker correlations with human labels and outcomes, and they highlight interesting linguistic patterns of funneling and focusing questions. The high performance of the supervised measure indicates its promise for supporting teachers in their instruction.
Providing consistent, individualized feedback to teachers is essential for improving instruction but can be prohibitively resource intensive in most educational contexts. We develop an automated tool based on natural language processing to give teachers feedback on their uptake of student contributions, a high-leverage teaching practice that supports dialogic instruction and makes students feel heard. We conduct a randomized controlled trial as part of an online computer science course, Code in Place (n=1,136 instructors), to evaluate the effectiveness of the feedback tool. We find that the tool improves instructors’ uptake of student contributions by 27% and present suggestive evidence that our tool also improves students’ satisfaction with the course and assignment completion. These results demonstrate the promise of our tool to complement existing efforts in teachers’ professional development.
Teachers' sense-making of student behavior determines whether students get in trouble and are formally disciplined. Status categories, such as race, can influence perceptions of student culpability, but the degree to which this contributes to racial disproportionality in discipline receipt is unknown. This study provides the first systematic documentation of teachers' use of office discipline referrals (ODRs) in a large, diverse urban school district in California that specifies the identity of both the referred and referring individuals in all ODRs. We identify teachers exhibiting extensive referral behavior, defined as the top 5% of referrers by the number of ODRs they make in a given year, and evaluate their contributions to disciplinary disparities. We find that "top referrers" effectively double the racial gaps in ODRs for both Black-White and Hispanic-White comparisons. These gaps are mainly driven by higher numbers of ODRs issued to Black and Hispanic students for interpersonal offences and defiance, and they partially convert to racial gaps in suspensions. Both the level and racial composition of the school sites where "top referrers" serve and their personal traits seem to explain some of their frequent referring behavior. Targeting supports and interventions to "top referrers" might afford an important opportunity to reduce racial disciplinary gaps.
Student absenteeism is often conceptualized and quantified in a static, uniform manner, providing an incomplete understanding of this important phenomenon. Applying growth curve models to detailed class-attendance data, we document that secondary school students' unexcused absences grow steadily throughout a school year and over grades, while excused absences remain essentially unchanged. Importantly, students starting the school year with a high number of unexcused absences, Black and Hispanic students, and low-income students accumulate unexcused absences at a significantly faster rate than their counterparts. Lastly, students with higher growth rates in unexcused absences consistently report lower perceptions of all aspects of school culture than their peers. Interventions targeting unexcused absences and/or improving school culture can be crucial to mitigating disengagement.
We use novel data on disciplinary referrals, including those that do not lead to suspensions, to better understand the origins of racial disparities in exclusionary discipline. We find significant differences between Black and white students in both referral rates and the rate at which referrals convert to suspensions. An infraction fixed-effects research design that compares the disciplinary outcomes of white and non-white students who were involved in the same multi-student incident identifies systematic racial biases in sentencing decisions. On both the intensive and extensive margins, Black and Hispanic students receive harsher sentences than their white co-conspirators. This result is driven by high school infractions and mainly applies to “more severe” infractions that involve fights or drugs. Reducing racial disparities in exclusionary discipline will require addressing underlying gaps in disciplinary referrals and the systematic biases that appear in the adjudication process.
In conversation, uptake happens when a speaker builds on the contribution of their interlocutor by, for example, acknowledging, repeating or reformulating what they have said. In education, teachers' uptake of student contributions has been linked to higher student achievement. Yet measuring and improving teachers' uptake at scale is challenging, as existing methods require expensive annotation by experts. We propose a framework for computationally measuring uptake, by (1) releasing a dataset of student-teacher exchanges extracted from US math classroom transcripts annotated for uptake by experts; (2) formalizing uptake as pointwise Jensen-Shannon Divergence (pJSD), estimated via next utterance classification; (3) conducting a linguistically-motivated comparison of different unsupervised measures and (4) correlating these measures with educational outcomes. We find that although repetition captures a significant part of uptake, pJSD outperforms repetition-based baselines, as it is capable of identifying a wider range of uptake phenomena like question answering and reformulation. We apply our uptake measure to three different educational datasets with outcome indicators. Unlike baseline measures, pJSD correlates significantly with instruction quality in all three, providing evidence for its generalizability and for its potential to serve as an automated professional development tool for teachers.
We provide novel evidence on the causal impacts of student absences in middle and high school on state test scores, course grades, and educational attainment using a rich administrative dataset that tracks the date and class period of each absence. We use two similar but distinct identification strategies that address potential endogeneity due to time-varying student-level shocks by exploiting within-student, between-subject variation in class-specific absences. We also leverage information on the timing of absences to show that absences that occur after the annual window for state standardized testing do not affect test scores, providing a further check of our identification strategy. Both approaches yield similar results. We find that absences in middle and high school harm contemporaneous student achievement and longer-term educational attainment: On average, missing 10 classes reduces math or English Language Arts test scores by 3-4% of a standard deviation and course grades by 17-18% of a standard deviation. Ten total absences across all subjects in 9th grade reduce both the probability of on-time graduation and ever enrolling in college by 2%. Learning loss due to school absences can have profound economic and social consequences.
Valid and reliable measurements of teaching quality facilitate school-level decision-making and policies pertaining to teachers. Using nearly 1,000 word-to-word transcriptions of 4th- and 5th-grade English language arts classes, we apply novel text-as-data methods to develop automated measures of teaching to complement classroom observations traditionally done by human raters. This approach is free of rater bias and enables the detection of three instructional factors that are well aligned with commonly used observation protocols: classroom management, interactive instruction, and teacher-centered instruction. The teacher-centered instruction factor is a consistent negative predictor of value-added scores, even after controlling for teachers’ average classroom observation scores. The interactive instruction factor predicts positive value-added scores. Our results suggest that the text-as-data approach has the potential to enhance existing classroom observation systems through collecting far more data on teaching with a lower cost, higher speed, and the detection of multifaceted classroom practices.
Classroom teachers in the US are absent on average approximately six percent of a school year. Despite the prevalence of teacher absences, surprisingly little research has assessed the key source of replacement instruction: substitute teachers. Using detailed administrative and survey data from a large urban school district, we document the prevalence, predictors, and variation of substitute coverage across schools. Less advantaged schools systematically exhibit lower rates of substitute coverage compared with peer institutions. Observed school, teacher, and absence characteristics account for only part of this school variation. In contrast, substitute teachers’ preferences for specific schools, mainly driven by student behavior and support from teachers and school administrators, explain a sizable share of the unequal distribution of coverage rates above and beyond standard measures in administrative data.
Valid and reliable measurements of teaching quality facilitate school-level decision-making and policies pertaining to teachers, but conventional classroom observations are costly, prone to rater bias, and hard to implement at scale. Using nearly 1,000 word-to-word transcriptions of 4th- and 5th-grade English language arts classes, we apply novel text-as-data methods to develop automated, objective measures of teaching to complement classroom observations. This approach is free of rater bias and enables the detection of three instructional factors that are well aligned with commonly used observation protocols: classroom management, interactive instruction, and teacher-centered instruction. The teacher-centered instruction factor is a consistent negative predictor of value-added scores, even after controlling for teachers’ average classroom observation scores. The interactive instruction factor predicts positive value-added scores.