When analyzing treatment effects on test scores, researchers face many choices and
competing guidance for scoring tests and modeling results. This study examines the
impact of scoring choices through simulation and an empirical application. Results
show that estimates from different methods applied to the same data vary because
two-step models using sum or factor scores yield attenuated standardized treatment
effects relative to latent variable models. This attenuation bias dominates any other differences
between models or features of the data generating process, such as the use of scoring
weights. An errors-in-variables (EIV) correction removes the bias from two-step models.
An empirical application to data from a randomized controlled trial demonstrates the
sensitivity of the results to model selection. This study shows that the psychometric
principles most consequential in causal inference are related to attenuation bias rather
than optimal scoring weights.
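
The attenuation described above can be illustrated with a minimal simulation sketch. It is not the study's own simulation design: the sample size, the true effect `DELTA`, the measurement error `ERROR_SD`, and the function names are all hypothetical, and the correction shown is the classical disattenuation by the square root of a known reliability, used here as a stand-in for an EIV correction. Under these assumptions, standardizing by the error-inflated standard deviation of the observed score shrinks the effect by a factor of about the square root of the reliability.

```python
import numpy as np

# Hypothetical settings chosen for illustration only.
N = 200_000      # sample size
DELTA = 0.3      # true standardized effect on the latent scale
ERROR_SD = 0.75  # SD of measurement error added to the latent trait


def simulate(n=N, delta=DELTA, error_sd=ERROR_SD, seed=0):
    """Randomized treatment, latent outcome, and an error-prone observed score."""
    rng = np.random.default_rng(seed)
    treat = rng.integers(0, 2, size=n)                    # 0 = control, 1 = treated
    theta = delta * treat + rng.normal(0.0, 1.0, size=n)  # latent trait, within-group SD 1
    score = theta + rng.normal(0.0, error_sd, size=n)     # observed score with noise
    return treat, theta, score


def standardized_effect(y, treat):
    """Mean difference divided by the pooled within-group SD."""
    y1, y0 = y[treat == 1], y[treat == 0]
    pooled_sd = np.sqrt((y1.var(ddof=1) + y0.var(ddof=1)) / 2.0)
    return (y1.mean() - y0.mean()) / pooled_sd


treat, theta, score = simulate()

# Within-group reliability is known here by construction: var(theta) / var(score).
reliability = 1.0 / (1.0 + ERROR_SD**2)

d_latent = standardized_effect(theta, treat)      # benchmark on the latent scale
d_observed = standardized_effect(score, treat)    # attenuated two-step style estimate
d_corrected = d_observed / np.sqrt(reliability)   # disattenuated estimate

print(f"reliability:      {reliability:.3f}")
print(f"latent effect:    {d_latent:.3f}")
print(f"observed effect:  {d_observed:.3f}")
print(f"corrected effect: {d_corrected:.3f}")
```

In applied work the reliability is estimated rather than known, so the corrected estimate inherits that uncertainty; the sketch ignores this.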
Gilbert, Joshua B. (). How Measurement Affects Causal Inference: Attenuation Bias is (Usually) More Important Than Scoring Weights. (EdWorkingPaper: -766). Retrieved from Annenberg Institute at Brown University: https://doi.org/10.26300/4hah-6s55