AEFP 45th Annual Conference

Toward a Meaningful Impact through Research, Policy & Practice

March 19-21, 2020

Leveraging Managerial Autonomy to Turn Around Low-Performing Schools: Evidence from the Innovation Schools program in Denver Public Schools

Philip Gigliotti, Rockefeller College of Public Affairs and Policy, University at Albany, SUNY

Low performance in urban public schools is a persistent problem in American public education. Despite decades of reform efforts, including school finance, accountability, and school choice reforms, these performance deficits stubbornly resist correction. School turnaround reforms have been introduced with the objective of driving rapid performance improvements through transformational managerial interventions. While a number of turnaround reforms have produced large performance improvements, others have failed to produce results. This heterogeneity motivates further contributions to the evaluation literature on school turnarounds, exploring their effects in different contexts and under different reform strategies.
This study evaluates a managerial autonomy-based turnaround intervention to explore how turnarounds impact school performance over time and to test the effects of managerial autonomy-based educational reforms on school performance. After struggling with chronic low performance, Denver Public Schools (DPS) expanded accountability and school choice, spurring significant reform in traditional public schools. To encourage these reforms, DPS developed the Innovation Schools program, which waived district policies governing staffing, scheduling, management, and curriculum to allow the implementation of innovative reforms. Schools were granted autonomy to implement innovative policies, such as in-house teacher evaluation, longer school days and years, private fundraising and consulting, and curricular reforms. Since Innovation Schools were differentiated through the provision of managerial autonomy, this reform presents an opportunity to test both the effects of turnaround reforms in general and the understudied relationship between managerial autonomy and public school performance.
This study employs school-level data from the Colorado Department of Education to evaluate the effect of the Innovation Schools reform on school performance using a two-way fixed effects difference-in-differences design. The dependent variable is the school-level mean score on year-end examinations, and the treatment leverages within-school variation from 13 public schools that transitioned to Innovation status. Robustness checks include estimation with propensity score-matched comparison groups to address non-random selection into treatment, wild bootstrap hypothesis testing to address the small number of treated clusters, graphical examination and placebo testing to assess the parallel trends assumption, estimation with school-specific linear time trends, and event-study models to trace the development of treatment effects over time.
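For readers less familiar with the design, the two-way fixed effects difference-in-differences estimator described above can be sketched on simulated data. Everything in this sketch is an illustrative assumption (panel size, a single adoption year, a true effect of 0.2 SD, the use of NumPy); it is not the paper's actual data or code, which involve staggered adoption and additional robustness checks.

```python
# Minimal sketch of a two-way fixed effects (school + year) DiD estimator
# on a simulated school-by-year panel. All quantities here are invented
# for illustration: 30 schools, 8 years, 13 treated schools, treatment
# switching on in year 4, and a true treatment effect of 0.2 SD.
import numpy as np

rng = np.random.default_rng(0)
n_schools, n_years = 30, 8
treated_schools = rng.choice(n_schools, size=13, replace=False)

school = np.repeat(np.arange(n_schools), n_years)
year = np.tile(np.arange(n_years), n_schools)

# Treatment turns on in year 4 for treated schools and stays on.
treat = (np.isin(school, treated_schools) & (year >= 4)).astype(float)

# Outcome = school effect + year effect + true treatment effect + noise.
school_fe = rng.normal(0, 1, n_schools)
year_fe = np.linspace(0, 0.5, n_years)
y = school_fe[school] + year_fe[year] + 0.2 * treat + rng.normal(0, 0.05, school.size)

# Design matrix: intercept, school dummies, year dummies, treatment indicator
# (one dummy dropped per fixed effect to avoid collinearity).
X = np.column_stack([
    np.ones(school.size),
    *(np.eye(n_schools)[school][:, 1:].T),   # school fixed effects
    *(np.eye(n_years)[year][:, 1:].T),       # year fixed effects
    treat,
])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
did_estimate = beta[-1]  # coefficient on the treatment indicator
print(round(did_estimate, 2))
```

In the paper itself, inference would additionally need to account for the small number of treated clusters (hence the wild bootstrap mentioned above), which this sketch omits.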
District-wide, 50-60 percent of students fail to achieve proficiency on year-end examinations, and proficiency rates are significantly lower in Innovation Schools. I find that transition to Innovation status produced significant positive effects of between .1 and .3 standard deviations on math, reading, and writing test scores. However, event-study models suggest that the positive effects of the reform peaked in year two of implementation and declined steeply thereafter. Findings are robust across different model specifications.
The results suggest that delivering managerial autonomy to struggling public schools might drive significant performance improvements. However, the fade-out of effects after year two suggests that turnaround schools must maintain their efforts beyond the initial implementation period to sustain results. This paper contributes to the school turnaround literature, providing evidence of how turnaround effects develop over time, and to the broader educational policy literature by providing evidence on the effects of managerial autonomy-based reforms.


Very informative work, with helpful visualizations of results. The methods mention an 8-year panel, but it looks like you have 12 years in the time series results, is that right? Did you have data on each of the 13 schools that constitute the treatment group for 7 years prior and 5 years post? Since there are just 13 schools that transitioned to Innovation status, it might be interesting to display what the trends looked like for each school. If it's possible to collect information on what was similar and different across the 13 schools, that could add helpful context. To consider potential mechanisms, do you have data on staff changes: when those occurred, whether schools were able to retain staff, and information on teacher qualifications or performance? Thank you for sharing your work! If you have any questions about my comments, you can reach me at

Thank you very much Cara for your interest in my poster. The event study coefficients measure time from treatment initiation, which is staggered across 5 treatment cohorts over 5 years. While the first treatment cohort (treated in 2009-10) has 5 years of post-treatment data, the last treatment cohort (treated in 2013-14) has only 1. Similarly, the first treatment cohort has 3 years of pre-treatment data, while the last treatment cohort has 7. The event study indicators pool all Innovation Schools currently in year n relative to treatment, so they can give an indication of the average trajectory of pre-treatment and post-treatment test score development across treatment cohorts. For this reason, there is attrition from the treatment-year indicator groups as we move farther from treatment, making interpretation of very early and very late periods nuanced. (The cohorts for the year -4 through year +3 indicators all have at least 10 schools, but membership declines significantly outside that range. The year -7 indicator represents only the 1 school in the last treatment cohort, while the year +5 indicator represents only the 3 schools in the first treatment cohort.) I have explored how these dynamic effect trends develop by treatment cohort, and find that the rapid increase into year 2 and decline thereafter is relatively consistent across cohorts. The paper does include some analysis of non-academic factors, including enrollment, student-teacher ratio, teacher salary, and student discipline, but I was somewhat limited in terms of what was publicly available or able to be requested at the school level. The most notable relationship was that teacher salary decreased significantly, by approximately $2,575, in treatment schools, from a pre-treatment mean of approximately $53,000. This may indicate a shift toward less experienced teachers, and corresponding turnover, consistent with the experience of other turnaround reforms.
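The indicator-membership arithmetic above can be made concrete with a short sketch. The per-cohort school counts below are hypothetical (only the first- and last-cohort sizes are stated in this thread, so the middle cohorts are invented to sum to 13), as is the exact span of panel years; the sketch only illustrates how event-time bins thin out far from treatment under staggered adoption.

```python
# Sketch of how event-time indicator membership attrits with staggered
# adoption. Cohort sizes are hypothetical placeholders that sum to the
# 13 treated schools; treatment years are coded by spring year, and the
# panel is assumed to run 2006-07 through 2014-15.
from collections import Counter

panel_years = range(2007, 2016)                           # assumed panel span
cohorts = {2010: 3, 2011: 4, 2012: 3, 2013: 2, 2014: 1}   # treat year -> n schools

membership = Counter()
for treat_year, n_schools in cohorts.items():
    for year in panel_years:
        membership[year - treat_year] += n_schools  # event time -> schools observed

# Near treatment the bins pool most cohorts; far from treatment only the
# extreme cohorts remain (year -7: last cohort only; year +5: first only).
print(membership[-4], membership[3], membership[-7], membership[5])
```

Under these placeholder counts, the year -4 and year +3 bins each pool 10 schools, while the year -7 bin holds a single school and the year +5 bin holds three, matching the attrition pattern described in the reply.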
Thank you once again for your comments and please let me know if you have any more questions.

This is an interesting observation. I work in school improvement in Kentucky. One of the key pieces that we often discuss is the sustainability of changes made by leadership. In some districts we see greater managerial flexibility offered to low-performing schools, and we also generally see rapid improvement (anecdotally). I would attribute those changes to dramatic shifts in teaching staff rather than to sustainable systematic change efforts. From my vantage point, improvement that happens too quickly is a cause for alarm. I am interested to know if you have any data related to what those principals actually did with their managerial flexibility. Also, is there a union contract in Denver? How was the union contract influenced by this managerial flexibility? Matthew Courtney

Thank you Matthew for your interest in my poster. I also think the finding of unsustainable effects is very interesting, and it has important implications for policy-makers and practitioners considering school improvement reforms. I am very interested to hear that this finding mirrors your experience in Kentucky schools, and that you may be inclined to consider rapid performance increases a cause for concern. This is counter-intuitive to the way I sometimes feel as an academic policy researcher, where a finding of a rapid performance increase resulting from a policy or intervention can seem very exciting. I think this project has pushed me to be more critical in thinking through the implications of performance increases and whether they represent sustainable change. To respond to some of your questions: each school submitted a request for individual waivers from district policies to facilitate its planned reforms. Table 1, under the Adoption and Implementation of Innovation Schools heading, lists some of the specific waivers requested and the percentage of schools requesting them. 100% of schools received waivers for budgetary control, personnel selection and pay, and longer school day and year, and nearly 90% waived a vast array of other policies related to personnel. In terms of budgetary control, schools were released from the responsibility to return unspent funds to the district and were even allowed to engage in external fundraising, which led to partnerships with private consultants that are explored in the paper. In terms of personnel, the program was facilitated by the broader Innovation Schools Act, which was passed by the Colorado State legislature in 2008. This provided schools with the legal ability to waive teacher collective bargaining, which allowed complete reform of teacher human resource policies.
In practice this allowed schools to hire and fire with reduced restrictions, and it allowed them to prevent the district from reallocating their teachers to other schools. Additionally, most schools developed their own employee evaluation and professional development programs, including pay-for-performance initiatives. Thank you once again for your comments and let me know if you have any further questions.

This is an interesting innovation and definitely highlights the value of event studies. It seems odd that we see gigantic positive effects followed by 0 effects. Do you have any idea why? It seems that it would be worth finding out whether there was principal turnover or some change in policy implementation.

Thank you Doug for this important question driving at the mechanisms behind these effects. One approach to addressing your question could be more detailed analysis to identify heterogeneity in the development of these dynamic achievement trends by treatment cohort. As I mentioned in my response to Cara, the best evidence of mechanisms that I uncovered in the non-academic analysis in this paper was that teacher salary declined significantly in treated schools, which could reveal turnover leading to replacement of experienced teachers with less experienced teachers, consistent with the experience of other turnaround reforms. To explore whether these personnel mechanisms are driving these results, I could test the hypothesis that the dynamic teacher salary trajectories by cohort mirror the academic trends. That is, in years where the treatment cohorts saw large academic impacts, I would test for negative impacts on teacher salary, indicating some type of change to personnel. In years with no academic effects, there should be few corresponding impacts on the teacher salary variable. This could indicate that stalled reforms are associated with the difficulty of sustaining academic achievement effects. This would be consistent with Matthew's observation that large performance gains are often associated with dramatic shifts in teaching staff, but that these impacts are then difficult to sustain. While this idea is very preliminary, I think your question has raised some important considerations for me in trying to explore mechanisms, and has suggested some potentially informative approaches for examining mechanisms in revised versions of the paper. Thank you once again for your comments and let me know if you have any further questions.

Are you able to access the raw score data that underlie the proficiency demarcation? If so, then you could check whether the pattern of results holds for a continuous outcome (of average underlying score) instead of a dichotomous one.

Thank you Dan for your interest in my poster. The original version of this paper used the school-level percentage of students meeting proficiency standards in each subject as the dependent variable. However, consistent with feedback I received and some important work by Andrew Ho outlining the problems with using proficiency measures in educational impact analysis, I sought out the school-level mean test scores. The dependent variables in this version of the paper are continuous school-level mean standardized test scores, ranging from 0 to approximately 800, in math, reading, and writing. The main results in both versions were mostly consistent, indicating that the problems with using proficiency rates were not meaningfully biasing results in this case. However, since the interpretation of changes in mean test scores is much clearer given the concerns raised by Ho, and it is better practice to use continuous test scores than proficiency rates, I have used the mean test scores in all recent work on this topic. I think this current approach using the continuous mean test scores is consistent with your recommended approach. Thank you once again for your comments and let me know if you have any further questions.

Neat topic and nice poster! Your estimation strategy is solid and the results are quite interesting. I had a few questions and suggestions for future work: 1) Do the innovation schools all serve the same grade levels? If not, can you look at heterogeneous effects by primary vs. secondary schools? Also, if you have high schools, can you look at graduation rates as an outcome? 2) Do you have access to any data that would allow you to see what school-level factors are changing that may explain the increase in test scores? For example, spending or hiring decisions? 3) Given that you have relatively few treatment groups, you could use a synthetic control approach for each innovation school. This might help you see if a subset of schools are driving your results. 4) The 17 schools that opened as innovation schools are interesting. I realize you don't have pre-treatment data on these schools, but I wonder if you could look descriptively at how they compare to other schools in the district. If they seem to be performing better for the first several years and then about the same, that could provide additional support to your main findings.

Thank you Riley for your interest in my poster. The current version of the paper explores a number of sources of heterogeneity in the performance impacts of the program. While I do not look specifically at primary vs. secondary schools, the small number of treated schools did allow me to explore how each school contributes to the impact of the reform. By exploring this heterogeneity, I found that the effects were driven by a group of 8 "high-performing" schools with effect sizes almost double the main treatment effect. The remaining 5 "low-performing" schools saw null or negative impacts, indicating a lack of progress or possible harm from the reform. I interpret this with the argument that delivering autonomy to schools to develop their own vision of a school improvement reform is a "high-risk, high-reward" intervention; providing discretion has the potential to drive innovation, but it can go awry if plans or implementation are faulty. Another relevant observation is that the "low-performing" schools were mostly high schools while the "high-performing" schools were mostly elementary and middle schools. I have a general impression that younger students are more likely to respond to school improvement reforms, which could be supported by recent meta-analytic evidence on the charter school literature by Betts and Tang (2019) suggesting achievement effects in elementary and middle schools but not in high schools. As I mentioned in my replies to Cara and Doug, I am somewhat limited in terms of publicly available data, and have not been successful requesting data on personnel at the school level. However, I do have data on teacher salaries and find that the reform led to significant negative impacts on teacher pay of approximately $2,500 from a pre-treatment mean of $53,000.
This could be consistent with increased teacher turnover leading to replacement of experienced teachers with inexperienced teachers, a phenomenon that has been identified in other turnaround studies. Matthew notes his perception that these large replacements of teaching staff can lead to dramatic performance increases which may be unsustainable. There seems to be a lot of interest in more granular explorations of how these dynamic performance trends develop. In my response to Doug, I noted that an exploration of how the dynamic academic performance effects, along with the dynamic teacher salary effects, develop across the 5 treatment cohorts could be very informative in explaining mechanisms. I think it is now likely that this type of analysis will make its way into revised versions of the paper. I've also received some interest in what happened to the 17 schools that opened as Innovation Schools. I've tried to focus on those that transitioned to Innovation School status, because I think they really represent what happens when an existing school tries to use this reform to "turn around" performance from its prior trajectory. But I think there are ways to look at the other 17 schools that could have important policy implications or implications for the interpretation of the main results. One approach would be to estimate an event study for this group that excludes their first year of operation. This would allow estimation of how academic trends developed in these schools compared to a group of similar schools. The similarity of these trajectories to the main sample could be interesting and valuable to readers. Thank you once again for your comments on my poster and let me know if you have any further questions.
