- Jing Liu
Although learners are being connected 1:1 with instructors at an increasing scale, most of these instructors do not receive effective, consistent feedback to help them improve. We deployed M-Powering Teachers, an automated tool based on natural language processing to give instructors feedback on dialogic instructional practices (including their uptake of student contributions, talk time, and questioning practices) in a 1:1 online learning context. We conducted a randomized controlled trial on Polygence, a research mentorship platform for high schoolers (n=414 mentors), to evaluate the effectiveness of the feedback tool. We find that the intervention improved mentors’ uptake of student contributions by 10%, reduced their talk time by 5%, and improved students’ experience with the program as well as their relative optimism about their academic future. These results corroborate existing evidence that scalable and low-cost automated feedback can improve instruction and learning in online educational contexts.
Teachers’ sense-making of student behavior determines whether students get in trouble and are formally disciplined. Status categories, such as race, can influence perceptions of student culpability, but the degree to which teachers’ initial identification of student misbehavior exacerbates racial disproportionality in discipline receipt is unknown. This study provides the first systematic documentation of teachers’ use of office discipline referrals (ODRs) in a large, diverse urban school district in California that specifies the identity of both the referred and referring individuals in all ODRs. We identify teachers exhibiting extensive referring behavior, defined as the top 5 percent of referrers based on the number of ODRs they make in a given year, and evaluate their contributions to disciplinary disparities. We find that “top referrers” effectively double the racial gaps in ODRs for both Black-White and Hispanic-White comparisons. These gaps are mainly driven by higher numbers of ODRs issued to Black and Hispanic students for interpersonal offenses and defiance, and they partially convert to racial gaps in suspensions. Both the level and racial composition of the school sites where “top referrers” serve and their personal traits seem to explain some of their frequent referring behavior. Targeting supports and interventions to “top referrers” might afford an important opportunity to reduce racial disciplinary gaps.
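As a rough illustration of the "top 5 percent of referrers in a given year" rule described in this abstract, the sketch below flags high-volume referrers from a list of referral records. It is a minimal sketch, not the authors' procedure: the function name `top_referrers`, the `(year, teacher_id)` record layout, ranking only among teachers with at least one ODR, and keeping everyone tied at the cutoff are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def top_referrers(referrals, share=0.05):
    """Return {year: set of teacher ids} for the top `share` of referrers.

    `referrals` is an iterable of (year, teacher_id) pairs, one per ODR.
    Hypothetical layout; the paper's actual data structure is not given.
    """
    # Count ODRs per teacher within each year.
    counts = defaultdict(Counter)
    for year, teacher in referrals:
        counts[year][teacher] += 1

    flagged = {}
    for year, by_teacher in counts.items():
        # Number of teachers to flag: the top `share`, rounded up, at least one.
        k = max(1, math.ceil(share * len(by_teacher)))
        ranked = sorted(by_teacher.values(), reverse=True)
        cutoff = ranked[k - 1]
        # Assumption: teachers tied at the cutoff count are all kept.
        flagged[year] = {t for t, n in by_teacher.items() if n >= cutoff}
    return flagged


# Toy data: ten teachers in one year, one of whom writes most of the ODRs.
records = [(2019, "t0")] * 12 + [(2019, f"t{i}") for i in range(1, 10)]
result = top_referrers(records)
```

With these toy records, only `t0` is flagged for 2019. Ranking within each year mirrors the abstract's "in a given year" wording; whether teachers with zero referrals belong in the denominator is left open here.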
Teachers affect a wide range of students’ educational and social outcomes, but how they contribute to students’ involvement in school discipline is less understood. We estimate the impact of teacher demographics and other observed qualifications on students’ likelihood of receiving a disciplinary referral. Using data that track all disciplinary referrals and the identity of both the referred and referring individuals from a large and diverse urban school district in California, we find students are about 0.2 to 0.5 percentage points (7% to 18%) less likely to receive a disciplinary referral from teachers of the same race or gender than from teachers of different demographic backgrounds. Students are also less likely to be referred by more experienced teachers and by teachers who hold either an English language learners or special education credential. These results are mostly driven by referrals for defiance and violence infractions, Black and Hispanic male students, and middle school students. While it is unclear whether these findings are due to variation in teachers’ effects on actual student behavior, variation in teachers’ proclivities to make disciplinary referrals, or a combination of the two, these results nonetheless suggest that teachers play a central role in the prevalence of, and inequities in, office referrals and subsequent student discipline.
Providing consistent, individualized feedback to teachers is essential for improving instruction but can be prohibitively resource intensive in most educational contexts. We develop an automated tool based on natural language processing to give teachers feedback on their uptake of student contributions, a high-leverage teaching practice that supports dialogic instruction and makes students feel heard. We conduct a randomized controlled trial as part of an online computer science course, Code in Place (n=1,136 instructors), to evaluate the effectiveness of the feedback tool. We find that the tool improves instructors’ uptake of student contributions by 27% and present suggestive evidence that our tool also improves students’ satisfaction with the course and assignment completion. These results demonstrate the promise of our tool to complement existing efforts in teachers’ professional development.
Noncognitive constructs such as self-efficacy, social awareness, and academic engagement are widely acknowledged as critical components of human capital, but systematic data collection on such skills in school systems is complicated by conceptual ambiguities, measurement challenges, and resource constraints. This study addresses this issue by comparing the predictive validity of the two most widely used metrics of noncognitive outcomes, namely observable academic behaviors (e.g., absenteeism, suspensions) and student self-reported social and emotional learning (SEL) skills, for the likelihood of high school graduation and postsecondary attainment. Our findings suggest that, conditional on student demographics and achievement, academic behaviors are several-fold more predictive than SEL skills for all long-run outcomes, and adding SEL skills to a model with academic behaviors improves the model's predictive power minimally. In addition, academic behaviors are particularly strong predictors for low-achieving students' long-run outcomes. Part-day absenteeism (as a result of class skipping) is the largest driver behind the strong predictive power of academic behaviors. Developing more nuanced behavioral measures in existing administrative data systems might be a fruitful strategy for schools whose intended goal centers on predicting students' educational attainment.
Student absenteeism is often conceptualized and quantified in a static, uniform manner, providing an incomplete understanding of this important phenomenon. Applying growth curve models to detailed class-attendance data, we document that secondary school students' unexcused absences grow steadily throughout a school year and over grades, while the growth of excused absences remains essentially unchanged. Importantly, students starting the school year with a high number of unexcused absences, Black and Hispanic students, and low-income students accumulate unexcused absences at a significantly faster rate than their counterparts. Lastly, students with higher growth rates in unexcused absences consistently report lower perceptions of all aspects of school culture than their peers. Interventions targeting unexcused absences and/or improving school culture can be crucial to mitigating disengagement.
Responsive teaching is a highly effective strategy that promotes student learning. In math classrooms, teachers might funnel students towards a normative answer or focus students to reflect on their own thinking, deepening their understanding of math concepts. When teachers focus, they treat students’ contributions as resources for collective sensemaking, and thereby significantly improve students’ achievement and confidence in mathematics. We propose the task of computationally detecting funneling and focusing questions in classroom discourse. We do so by creating and releasing an annotated dataset of 2,348 teacher utterances labeled for funneling and focusing questions, or neither. We introduce supervised and unsupervised approaches to differentiating these questions. Our best model, a supervised RoBERTa model fine-tuned on our dataset, has a strong linear correlation of .76 with human expert labels and with positive educational outcomes, including math instruction quality and student achievement, showing the model’s potential for use in automated teacher feedback tools. Our unsupervised measures show significant but weaker correlations with human labels and outcomes, and they highlight interesting linguistic patterns of funneling and focusing questions. The high performance of the supervised measure indicates its promise for supporting teachers in their instruction.
We use novel data on disciplinary referrals, including those that do not lead to suspensions, to better understand the origins of racial disparities in exclusionary discipline. We find significant differences between Black and white students in both referral rates and the rate at which referrals convert to suspensions. An infraction fixed-effects research design that compares the disciplinary outcomes of white and non-white students who were involved in the same multi-student incident identifies systematic racial biases in sentencing decisions. On both the intensive and extensive margins, Black and Hispanic students receive harsher sentences than their white co-conspirators. This result is driven by high school infractions and mainly applies to “more severe” infractions that involve fights or drugs. Reducing racial disparities in exclusionary discipline will require addressing underlying gaps in disciplinary referrals and the systematic biases that appear in the adjudication process.
In conversation, uptake happens when a speaker builds on the contribution of their interlocutor by, for example, acknowledging, repeating or reformulating what they have said. In education, teachers' uptake of student contributions has been linked to higher student achievement. Yet measuring and improving teachers' uptake at scale is challenging, as existing methods require expensive annotation by experts. We propose a framework for computationally measuring uptake, by (1) releasing a dataset of student-teacher exchanges extracted from US math classroom transcripts annotated for uptake by experts; (2) formalizing uptake as pointwise Jensen-Shannon Divergence (pJSD), estimated via next utterance classification; (3) conducting a linguistically-motivated comparison of different unsupervised measures and (4) correlating these measures with educational outcomes. We find that although repetition captures a significant part of uptake, pJSD outperforms repetition-based baselines, as it is capable of identifying a wider range of uptake phenomena like question answering and reformulation. We apply our uptake measure to three different educational datasets with outcome indicators. Unlike baseline measures, pJSD correlates significantly with instruction quality in all three, providing evidence for its generalizability and for its potential to serve as an automated professional development tool for teachers.
We provide novel evidence on the causal impacts of student absences in middle and high school on state test scores, course grades, and educational attainment using a rich administrative dataset that tracks the date and class period of each absence. We use two similar but distinct identification strategies that address potential endogeneity due to time-varying student-level shocks by exploiting within-student, between-subject variation in class-specific absences. We also leverage information on the timing of absences to show that absences that occur after the annual window for state standardized testing do not affect test scores, providing a further check of our identification strategy. Both approaches yield similar results. We find that absences in middle and high school harm contemporaneous student achievement and longer-term educational attainment: On average, missing 10 classes reduces math or English Language Arts test scores by 3-4% of a standard deviation and course grades by 17-18% of a standard deviation. Ten total absences across all subjects in 9th grade reduce both the probability of on-time graduation and ever enrolling in college by 2%. Learning loss due to school absences can have profound economic and social consequences.