Analyzing heterogeneous treatment effects (HTE) plays a crucial role in understanding
the impacts of educational interventions. A standard practice for HTE analysis is to
examine interactions between treatment status and pre-intervention participant characteristics,
such as pretest scores, to identify how different groups respond to treatment.
This study demonstrates that identical patterns of HTE on test score outcomes can
emerge either from variation in treatment effects due to a pre-intervention participant
characteristic or from correlations between treatment effects and item easiness parameters.
We show analytically and through simulation that these two scenarios
cannot be distinguished when the analysis is based on summary scores alone. We then describe
a novel approach that identifies the relevant data-generating process by leveraging
item-level data. We apply our approach to a randomized trial of a reading intervention
in second grade, and show that any apparent HTE by pretest ability is driven by the
correlation between treatment effect size and item easiness. Our results highlight the
potential of employing measurement principles in causal analysis, beyond their common
use in test construction.
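
The second mechanism can be illustrated with a minimal sketch. It assumes a Rasch-type (one-parameter logistic) response model, which is an assumption of this illustration and is not specified above. Every person receives the same treatment effect on the logit scale, so there is no person-level heterogeneity by construction. The only thing that varies between the two scenarios is whether item-level effects are constant or increase with item easiness. All parameter values and function names here are hypothetical and are not taken from the study's simulation design.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping logits to response probabilities."""
    return 1.0 / (1.0 + np.exp(-x))


def expected_sum_score_effect(theta, difficulty, item_effect):
    """Expected treated-minus-control sum score at each ability level.

    Assumes a Rasch-type model: logit P(correct) = theta - difficulty,
    with treatment adding an item-specific shift on the logit scale.
    """
    # Rows index persons (ability levels), columns index items.
    logit_control = theta[:, None] - difficulty[None, :]
    logit_treated = logit_control + item_effect[None, :]
    return (sigmoid(logit_treated) - sigmoid(logit_control)).sum(axis=1)


def interaction_slope(theta, effect):
    """Slope from a linear fit of the sum-score effect on ability."""
    return np.polyfit(theta, effect, 1)[0]


# Hypothetical design: 41 ability levels and 21 items, both centered at 0.
theta = np.linspace(-2.0, 2.0, 41)
difficulty = np.linspace(-2.0, 2.0, 21)
easiness = -difficulty

# Scenario A: the same logit-scale effect on every item.
constant_effect = np.full_like(difficulty, 0.4)
# Scenario B: same average effect, but larger on easier items.
easiness_linked_effect = 0.4 + 0.2 * easiness

slope_constant = interaction_slope(
    theta, expected_sum_score_effect(theta, difficulty, constant_effect)
)
slope_linked = interaction_slope(
    theta, expected_sum_score_effect(theta, difficulty, easiness_linked_effect)
)

print(f"slope with constant item effects:        {slope_constant:.4f}")
print(f"slope with easiness-linked item effects: {slope_linked:.4f}")
```

In this sketch the sum-score treatment effect declines much more steeply with ability when item effects are tied to easiness. An analysis of summary scores alone would read that pattern as a treatment-by-pretest interaction, even though no person-level heterogeneity was built in.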