Drew H. Bailey

Anamarie A. Whitaker, Margaret Burchinal, Jade M. Jenkins, Drew H. Bailey, Tyler W. Watts, Greg J. Duncan, Emma R. Hart, Ellen S. Peisner-Feinberg.

High-quality preschool programs are heralded as effective policy solutions to promote low-income children’s development and life-long wellbeing. Yet evaluations of recent preschool programs produce puzzling findings, including negative impacts, and divergent, weaker results than demonstration programs implemented in the 1960s and 70s. We provide potential explanations for why modern preschool programs have become less effective, focusing on changes in instructional practices and counterfactual conditions. We also address popular theories that likely do not explain weakening program effectiveness, such as lower preschool quality and low-quality subsequent environments. The field must take seriously the smaller positive, null, and negative impacts from modern programs and strive to understand why effects differ and how to improve program effectiveness through rigorous, longitudinal research.


Emma R. Hart, Drew H. Bailey, Sha Luo, Pritha Sengupta, Tyler W. Watts.

Fadeout is a pervasive phenomenon: post-test impacts on cognitive skills commonly decrease in the years following an educational intervention. Less is known, although much is theorized, about social-emotional skill persistence. The current meta-analysis investigated whether educational RCT impacts on social-emotional skills demonstrated greater persistence than impacts on cognitive skills among 87 interventions involving 59,237 participants and 443 outcomes measured at post-test and at least one follow-up. For post-test impacts of the same magnitude, persistence rates were similar (43% of post-test magnitude) across skill types for follow-ups occurring 6 to 12 months after post-test. At 1- to 2-year follow-ups, persistence rates were larger for cognitive skills (37%) than for social-emotional skills. Interestingly, smaller post-test impacts persisted at proportionately higher rates than larger impacts, which may benefit interventions measuring social-emotional outcomes given their smaller post-test impacts. Considered as a whole, social-emotional and cognitive skills demonstrated similar patterns of fadeout.
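To make the persistence rate concrete, it can be read as the follow-up impact expressed as a share of the post-test impact. The sketch below assumes that simple ratio definition, which may differ from the model-based estimates in the meta-analysis. The function names and the post-test impact of 0.20 SD are hypothetical; only the 43% rate comes from the abstract.

```python
def persistence_rate(post_test_impact: float, follow_up_impact: float) -> float:
    """Follow-up impact as a share of the post-test impact (assumed definition)."""
    if post_test_impact == 0:
        raise ValueError("post-test impact must be nonzero")
    return follow_up_impact / post_test_impact


def expected_follow_up(post_test_impact: float, rate: float) -> float:
    """Follow-up impact implied by a given persistence rate."""
    return post_test_impact * rate


# Hypothetical post-test impact of 0.20 SD (illustrative, not from the paper);
# 0.43 is the 6- to 12-month persistence rate reported in the abstract.
implied = expected_follow_up(0.20, 0.43)
print(f"Implied follow-up impact: {implied:.3f} SD")
```

Under these assumptions, an intervention with a larger post-test impact would still retain a larger absolute follow-up impact at the same rate, even though the abstract reports that smaller impacts persist at proportionately higher rates.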


Daniela Alvarez-Vargas, Sirui Wan, Lynn S. Fuchs, Alice Klein, Drew H. Bailey.

Despite policy relevance, longer-term evaluations of educational interventions are relatively rare. A common approach to this problem has been to rely on longitudinal research to determine targets for intervention by looking at the correlation between children’s early skills (e.g., preschool numeracy) and medium-term outcomes (e.g., first-grade math achievement). However, this approach has sometimes over- or under-predicted the long-term effects (e.g., 5th-grade math achievement) of successfully improving early math skills. Using a within-study comparison design, we assess various approaches to forecasting medium-term impacts of early math skill-building interventions. The most accurate forecasts were obtained when including comprehensive baseline controls and using a combination of conceptually proximal and distal short-term outcomes (in the nonexperimental longitudinal data). Researchers can use our approach to establish a set of designs and analyses to predict the impacts of their interventions up to two years post-treatment. The approach can also be applied to power analyses, model checking, and theory revisions to understand mechanisms contributing to medium-term outcomes.


Drew H. Bailey, Greg J. Duncan, Richard J. Murnane, Natalie Au Yeung.

A survey targeting education researchers conducted in November 2020 provides both short- and longer-term predictions of how much achievement gaps between low- and high-income students in U.S. elementary schools will change as a result of COVID-related disruptions to schooling and family life. Relative to a pre-COVID achievement gap of 1.00 SD, respondents’ median forecasts for increases in achievement gaps in elementary school by spring 2021 were very large: from 1.00 SD to 1.30 SD for math and to 1.25 SD for reading. Researchers forecast only small reductions in gaps between spring 2021 and 2022. Although forecasts were heterogeneous, almost all respondents predicted that gaps would grow during the pandemic and would not return to pre-pandemic levels in the following school year. We discuss some implications of these predictions for strategies to reduce learning gaps exacerbated by the pandemic as well as the mental models researchers appear to employ in making their predictions.


Drew H. Bailey, Jade M. Jenkins, Daniela Alvarez-Vargas.

The sustaining environments hypothesis refers to the popular idea, stemming from theories in developmental, cognitive, and educational psychology, that the long-term success of early educational interventions is contingent on the quality of the subsequent learning environment. Several studies have investigated whether specific kindergarten classroom and other elementary school factors account for patterns of persistence and fadeout of early educational interventions. These analyses focus on the statistical interaction between an early educational intervention – usually whether the child attended preschool – and several measures of the quality of the subsequent educational environment. The key prediction of the sustaining environments hypothesis is a positive interaction between these two variables. To quantify the strength of the evidence for such effects, we meta-analyze existing studies that have attempted to estimate interactions between preschool and later educational quality in the United States. We then attempt to establish the consistency of the direction and a plausible range of estimates of the interaction between preschool attendance and subsequent educational quality by using a specification curve analysis in a large, nationally representative dataset that has been used in several recent studies of the sustaining environments hypothesis. The meta-analysis yields small positive interaction estimates ranging from approximately .00 to .04, depending on the specification. The specification curve analyses yield interaction estimates of approximately 0. Results suggest that the current mix of methods used to test the sustaining environments hypothesis cannot reliably detect realistically sized effects. Our recommendations are to combine large sample sizes with strong causal identification strategies, and to study combinations of interventions that have a strong probability of showing large main effects.
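As an illustration of the interaction being tested, a generic moderation model can be written as follows. This is a sketch, not the specification used in any of the meta-analyzed studies; the symbols $Y_i$, $P_i$, $Q_i$, and the coefficients are notation assumed here.

```latex
Y_i = \beta_0 + \beta_1 P_i + \beta_2 Q_i + \beta_3 \,(P_i \times Q_i) + \varepsilon_i
```

Here $Y_i$ is a later outcome for child $i$, $P_i$ indicates preschool attendance, and $Q_i$ measures the quality of the subsequent educational environment. In this notation, the sustaining environments hypothesis predicts $\beta_3 > 0$, and the interaction estimates of roughly .00 to .04 reported above correspond to estimates of $\beta_3$.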


Remy Pages, Dylan Lukes, Drew H. Bailey, Greg J. Duncan.

Using an additional decade of CNLSY data, this study replicated and extended Deming’s (2009) evaluation of Head Start’s life-cycle skill formation impacts in three ways. Extending the measurement interval for Deming’s adulthood outcomes, we found no statistically significant impacts on earnings and mixed evidence of impacts on other adult outcomes. Applying Deming’s sibling comparison framework to more recent birth cohorts born to CNLSY mothers revealed mostly negative Head Start impacts. Combining all cohorts shows generally null impacts on school-age and early adulthood outcomes.
