
The Sensitivity of Value-Added Estimates to Test Scoring Decisions

Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) affect VA estimates is less studied. We examine the sensitivity of VA estimates to scoring method using empirical item response data from 23 education datasets. We show that VA estimates are frequently highly sensitive to scoring method, holding students and items constant. Although the test scores produced by different methods are highly correlated, on average the scoring approaches yield VA percentile ranks that differ by more than 20 points, and over 50% of teachers or schools are ranked in more than one quartile of the VA distribution. Dispersion in VA ranks is reduced when item response data are complete and when correlations between baseline and endline scores are more consistent across scoring methods. We conclude that consideration of both measurement error and model uncertainty is necessary for appropriate interpretation of VAMs.
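To illustrate the kind of comparison the abstract describes, the sketch below simulates dichotomous item responses and computes teacher VA percentile ranks under two scoring rules: a mean (proportion-correct) score and a crude logit-transformed score standing in for an IRT-based ability estimate. This is a minimal illustrative simulation, not the authors' analysis; all sample sizes, parameter values, and function names are assumptions, and a real analysis would estimate IRT scores with dedicated software.

```python
# Minimal sketch (illustrative only, not the paper's code): compare teacher VA
# percentile ranks when the same item responses are scored two different ways.
import numpy as np

rng = np.random.default_rng(0)

n_teachers, students_per_teacher, n_items = 50, 30, 20
n_students = n_teachers * students_per_teacher
teacher_id = np.repeat(np.arange(n_teachers), students_per_teacher)

# True quantities: baseline ability, teacher effects, and endline ability
theta_base = rng.normal(0, 1, n_students)
teacher_effect = rng.normal(0, 0.3, n_teachers)
theta_end = theta_base + teacher_effect[teacher_id] + rng.normal(0, 0.5, n_students)

# Rasch-style item difficulties and dichotomous item responses
difficulty = rng.normal(0, 1, n_items)

def simulate_responses(theta):
    p = 1 / (1 + np.exp(-(theta[:, None] - difficulty[None, :])))
    return (rng.random((len(theta), n_items)) < p).astype(float)

resp_base, resp_end = simulate_responses(theta_base), simulate_responses(theta_end)

# Scoring method 1: mean score (proportion correct)
def mean_score(resp):
    return resp.mean(axis=1)

# Scoring method 2: logit of the clipped proportion correct, a crude stand-in
# for an IRT ability estimate (a real analysis would fit a Rasch/2PL model).
def logit_score(resp):
    p = np.clip(resp.mean(axis=1), 1 / (2 * n_items), 1 - 1 / (2 * n_items))
    return np.log(p / (1 - p))

def va_percentile_ranks(score_fn):
    base, end = score_fn(resp_base), score_fn(resp_end)
    # Simple VA model: regress endline on baseline, average residuals by teacher
    slope, intercept = np.polyfit(base, end, 1)
    resid = end - (intercept + slope * base)
    va = np.array([resid[teacher_id == t].mean() for t in range(n_teachers)])
    ranks = va.argsort().argsort()  # 0 = lowest VA
    return 100 * ranks / (n_teachers - 1)

pct_mean = va_percentile_ranks(mean_score)
pct_logit = va_percentile_ranks(logit_score)

print("Correlation of VA ranks across scoring methods:",
      round(np.corrcoef(pct_mean, pct_logit)[0, 1], 3))
print("Mean absolute difference in percentile rank:",
      round(np.abs(pct_mean - pct_logit).mean(), 1))
```

Even in this toy setup, the two scoring rules can reorder teachers' percentile ranks despite highly correlated underlying scores, which is the phenomenon the paper examines with empirical item response data.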

Keywords
value-added model, item response theory, test scoring, reliability, education policy
Digital Object Identifier (DOI)
10.26300/g4gn-s810
EdWorkingPaper suggested citation:
Gilbert, Joshua B., James G. Soland, and Benjamin W. Domingue. (). The Sensitivity of Value-Added Estimates to Test Scoring Decisions. (EdWorkingPaper: -1226). Retrieved from Annenberg Institute at Brown University: https://doi.org/10.26300/g4gn-s810
