In March the Department for Education (DfE) released a working paper called Measuring the Performance of Schools within Academy Chains and Local Authorities. It contains the first official performance ranking of the organisations responsible for groups of schools in England: local authorities and academy chains. The performance of individual schools has been monitored and published in the UK since at least 1992, but until now the performance of groups of schools has not been systematically reported on. The new ranking therefore represents the expansion of data-driven accountability to an additional tier of the school system.
The findings have been analysed and commented on widely. Anti-academy campaigners used them to claim that academy chains do not deliver better results than local authorities. Newsnight’s Chris Cook pointed out that there are high and low performing examples of both local authorities and academy chains and argued that instead of debating their relative merits, we should focus on working out how to emulate the most successful examples, whether that be Ark academies or Hackney local authority. Robert Hill, education adviser in the Blair government, used the findings to do just that. Hill argues that high-ranking chains tend to work in localised clusters, expand slowly and focus on pedagogy and oversight. Although they do not say it explicitly, it is safe to assume that the DfE also intends school commissioners to use the rankings when choosing academy chains to take over ‘failing’ schools.
This may all sound like useful, evidence-based policy analysis. But I want to argue that this sort of performance ranking is so flawed that it is effectively meaningless, and therefore not very useful either for drawing policy lessons or for making commissioning decisions.
The new performance measure developed by the DfE ranks authorities and chains based on the ‘value added’ they achieve for pupils across their schools. This is measured as the difference between pupils’ predicted GCSE grades (based on Key Stage 2 attainment) and their actual GCSE grades. This means that secondary schools do not take the credit (or blame) for the performance of their feeder primary schools, which is sensible. It also places a lower weight on the results of schools that have just joined an academy chain, to reflect the chain’s limited influence on that particular school.
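The arithmetic behind such a measure can be sketched in a few lines. This is a minimal illustration only, not the DfE’s actual model: the linear prediction function, the pupil data and the down-weighting factor are all hypothetical, chosen just to show how predicted grades, actual grades and school weights combine into a single value-added score.

```python
# Illustrative sketch of a weighted value-added score for a chain or
# authority. All numbers and the prediction function are hypothetical.

def predicted_gcse(ks2_score):
    # Hypothetical linear prediction of GCSE points from Key Stage 2 attainment.
    return 2.0 * ks2_score + 10.0

def chain_value_added(pupils):
    # pupils: list of (ks2_score, actual_gcse, weight) tuples.
    # weight < 1 down-weights pupils in schools that only recently
    # joined the chain, reflecting its limited influence on them.
    total_weight = sum(w for _, _, w in pupils)
    weighted_gaps = sum(w * (actual - predicted_gcse(ks2))
                        for ks2, actual, w in pupils)
    return weighted_gaps / total_weight

pupils = [
    (28.0, 70.0, 1.0),  # long-standing school: full weight
    (30.0, 68.0, 1.0),
    (26.0, 66.0, 0.5),  # recently converted school: reduced weight
]
print(round(chain_value_added(pupils), 2))  # prints 1.6
```

A positive score means pupils, on average, beat their predicted grades; a negative one means they fell short. The next paragraph explains why this gap cannot simply be attributed to the chain or authority.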
What it does not take account of, however, are non-school factors which influence pupil progress during secondary school. As the DfE analysts put it, their measure assumes that schools have the same “propensity for improvement” (p16). The problem is that we know this is not true. A pupil’s household income, for example, is known to have a strong relationship with attainment. Leaving it out will therefore create significant inaccuracies in the ranking.
The DfE hint at incorporating such contextual factors in future versions of the ranking (p17). This may sound sensible, but we have been here before. When dissatisfaction grew with the value added measure used to rank individual schools (first introduced in 2002), the government developed a Contextual Value Added (CVA) measure, which tried to control for such non-school factors. But research by Lorraine Dearden and colleagues demonstrated that leaving out the level of education of a pupil’s mother (data which is not generally collected) caused “significant systematic biases in school CVA measures for the large majority of schools.” Stephen Gorard then pointed out that (non-random) missing data meant there were large errors in the estimates and, by extension, in any ranking based on them. Adding contextual information to the DfE’s new measure would therefore only repeat the mistakes made with CVA, which has since been abolished.
These flaws are severe enough to make the new measure effectively meaningless, since it is unclear whether a high score represents the influence of the local authority or chain, the influence of other factors which are not taken into account, or just statistical noise.
There is also a more general sense in which this new measure ignores lessons from recent education research. A great deal of work in the last five years has tried to identify the policies and approaches behind London schools’ relative success. But Simon Burgess has now shown, using census data, that all of London’s superior performance can be accounted for by differences in ethnicity and migration patterns. If he is right, then the hunt for ‘what worked’ in London has largely been a wild goose chase. Indeed, the suspicious concentration of London-based local authorities and academy chains at the top of the DfE’s new ranking suggests that migration patterns might also be what is driving the results of their analysis. Studying successful exemplars, whether cities or academy chains, is difficult and potentially misleading.
A better approach to finding out what works is to study policies. Because we can measure the attainment of the same pupils before and after a policy is implemented, it is possible to rule out the influence of a range of other factors, even when we cannot measure them. Returning to the London example, a highly-aspirational recent immigrant before the policy is implemented is still a highly-aspirational recent immigrant after it is implemented. Their migration status therefore cannot be what is driving any observed changes in outcomes after the policy is implemented. Another benefit of studying policies is that they are easier to replicate. If a specific professional development programme for teachers is found to be effective, for example, it is fairly straightforward to deliver that programme in other schools. Other things equal, knowing that Hackney is an effective local authority just isn’t as useful.
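The within-pupil logic described above can be illustrated with a toy before/after comparison. The numbers here are entirely hypothetical; the point is that differencing each pupil against their own baseline removes fixed characteristics such as migration status, so those characteristics cannot drive the estimated change.

```python
# Toy before/after policy comparison (hypothetical data). Each pupil is
# compared with their own pre-policy attainment, so fixed pupil
# characteristics (e.g. migration status) cancel out of the difference.

pupils = {
    "A": {"before": 55.0, "after": 60.0},
    "B": {"before": 48.0, "after": 51.0},
    "C": {"before": 62.0, "after": 66.0},
}

gains = [p["after"] - p["before"] for p in pupils.values()]
average_gain = sum(gains) / len(gains)
print(round(average_gain, 2))  # average within-pupil change: 4.0
```

In a real evaluation one would also need a comparison group to net out changes that would have happened anyway, but the basic design choice stands: within-pupil differences control for unmeasured fixed factors, which a cross-sectional ranking of providers cannot do.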
The flaws in the DfE’s new accountability measure for local authorities and academy chains are severe enough to make it effectively meaningless. The ranking based on it is therefore not very useful, either for drawing policy lessons or for making commissioning decisions. In general, evaluating policies will provide more reliable and useful insights than trying to identify and analyse examples of effective providers. Let’s not repeat the mistakes of past accountability reforms.
This piece originally appeared on the LSE Politics and Policy Blog.