Assessment
Estimating Treatment Effects with the Explanatory Item Response Model
Topics: Methods
This simulation study examines the characteristics of the Explanatory Item Response Model (EIRM) for estimating treatment effects, compared with classical test theory (CTT) sum and mean scores and item response theory (IRT)-based theta scores. Results show that the EIRM and IRT theta scores…
On the Threshold: Impacts of Barely Passing High-School Exit Exams on Post-Secondary Enrollment and Completion
Many states use high-school exit examinations to assess students’ career and college readiness in core subjects. We find meaningful consequences of barely passing the mathematics examination in Massachusetts, as opposed to just failing it. However, these impacts operate at different educational…
The Other Half of the Story: Does Excluding the Early Grades from School Ratings Matter?
Because high-stakes testing for school accountability does not begin until third grade, accountability ratings for elementary schools do not directly measure students’ academic progress in grades K through 2. While it is possible that children’s test scores in grades 3 and above are highly…
Modeling Item-Level Heterogeneous Treatment Effects with the Explanatory Item Response Model: Leveraging Online Formative Assessments to Pinpoint the Impact of Educational Interventions
Topics: Methods
Tags: Assessment
Analyses that reveal how treatment effects vary allow researchers, practitioners, and policymakers to better understand the efficacy of educational interventions. In practice, however, standard statistical methods for addressing Heterogeneous Treatment Effects (HTE) fail to address the HTE that…
Promises, Pitfalls, and Tradeoffs in Identifying Gifted Learners: Evidence from a Curricular Experiment
Disparities in gifted representation across demographic subgroups represent a large and persistent challenge in U.S. public schools. In this paper, we measure the impacts of a school-wide curricular intervention designed to address such disparities. We implemented Nurturing for a Bright…
Does Teacher Professional Development Improve Student Learning? Evidence from Leading Educators’ Fellowship Model
Topics: Teacher and Leader Development
Teachers are the most important school-specific factor in student learning. Yet, little evidence exists linking teacher professional development programs and the strategies or activities that comprise them to student achievement. In this paper, we examine a fellowship model for professional…
Assessors influence results: Evidence on enumerator effects and educational impact evaluations
Topics: Methods
A significant share of education and development research uses data collected by workers called “enumerators.” It is well documented that “enumerator effects,” or inconsistent practices between the individual people who administer measurement tools, can be a key source of error in survey data…
Signal Weighted Value-Added Models
Topics: Methods
Tags: Assessment, Efficacy
This study introduces the signal weighted teacher value-added model (SW VAM), a value-added model that weights student-level observations based on each student’s capacity to signal their assigned teacher’s quality. Specifically, the model leverages the repeated appearance of a given student to…
Bias in kindergarten ability group placement: Does parental lobbying make it worse? Do formal assessments make it better?
Von Hippel & Cañedo (2021) reported that US kindergarten teachers placed girls, Asian-Americans, and children from families of high socioeconomic status (SES) into higher ability groups than their test scores alone would warrant. The results fit the view that teachers were biased.
Navigating Remote Delivery of Assessments for Head Start Children During the COVID-19 Pandemic
Leiah Groom-Thomas, Monica G. Lee, Cate Smith Todd, Kathleen Lynch, Susanna Loeb, Scott McConnell, Lydia Carlis.
Many preschool agencies nationwide continue to experience closures and/or conversions to virtual or hybrid instruction due to the ongoing COVID-19 pandemic. Despite the importance of understanding young children’s learning and development during the COVID emergency, limited knowledge exists on…
A Bridge to Graduation: Post-Secondary Effects of an Alternative Pathway for Students Who Fail High School Exit Exams
Tags: High schools, Assessment
High school exit exams are meant to standardize the quality of public high schools and to ensure that students graduate with a set of basic skills and knowledge. Evidence suggests that a common perverse effect of exit exams is an increase in dropout for students who have difficulty passing tests…
Using Predicted Academic Performance to Identify At-Risk Students in Public Schools
Topics: Student Learning
Tags: Assessment, Equity
Measures of student disadvantage, or risk, are critical components of equity-focused education policies. However, the risk measures used in contemporary policies have significant limitations, and despite continued advances in data infrastructure and analytic capacity, there has been little…
Can learning be measured by phone? Evidence from Kenya
Topics: Methods
School closures induced by COVID-19 placed heightened emphasis on alternative ways to measure student learning besides in-person exams. We leverage the administration of phone-based assessments (PBAs) measuring numeracy and literacy for primary school children in Kenya, along with in-person…
Students' Grade Satisfaction Influences Evaluations of Teaching: Evidence from Individual-level Data and an Experimental Intervention
Student surveys are widely used to evaluate university teaching and increasingly adopted at the K-12 level, although there remains considerable debate about what they measure. Much disagreement focuses on the well-documented correlation between student grades and their evaluations of instructors…
Bridging human and machine scoring in experimental assessments of writing: tools, tips, and lessons learned from a field trial in education
Topics: Methods
In a randomized trial that collects text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by human raters. An impact analysis can then be conducted to compare treatment and control groups, using the…
Measuring and Summarizing the Multiple Dimensions of Teacher Effectiveness
Topics: Teacher and Leader Development
Tags: Assessment
There is an emerging consensus that teachers impact multiple student outcomes, but it remains unclear how to measure and summarize the multiple dimensions of teacher effectiveness into simple metrics for research or personnel decisions. We present a multidimensional empirical Bayes framework and…
Teacher Preparation Programs and Graduates' Growth in Instructional Effectiveness
Topics: Teacher and Leader Development
Many prior studies have examined whether there are average differences in levels of teaching effectiveness among graduates from different teacher preparation programs (TPPs); other studies have investigated which features of preparation predict graduates’ average levels of teaching effectiveness…
College Entrance Exam-Taking Strategies in Georgia
Using administrative data from Georgia, we provide the first study of the full set of college entrance exam-taking strategies, including who takes the ACT and the SAT (or both), when they take the exams, and how many times they take each exam. We have several main findings. First, one-third of…
Characterizing Cross-Site Variation in Local Average Treatment Effects in Multisite Regression Discontinuity Design Contexts with an Application to Massachusetts High School Exit Exam
Topics: Methods
Tags: Assessment, High schools
In multisite experiments, we can quantify treatment effect variation with the cross-site treatment effect variance. However, there is no standard method for estimating cross-site treatment effect variance in multisite regression discontinuity designs (RDD). This research rectifies this gap in…
Using Implementation Fidelity to Aid in Interpreting Program Impacts: A Brief Review
Topics: Methods
Tags: Assessment, Curriculum
Poor program implementation constitutes one explanation for null results in trials of educational interventions. For this reason, researchers often collect data about implementation fidelity when conducting such trials. In this article, we document whether and how researchers report and measure…