Comparability of National Tests Over Time: Key Stage Tests Standards 1996 - 2001
Carried out by the University of Cambridge Local Examinations Syndicate for the Qualifications and Curriculum Authority (QCA), this project investigated the equivalence of standards set in national tests over a period of several years. The experimental design used an ‘equivalent-groups’ definition of comparability: children were allocated at random to equivalent groups, each taking either the present or a past version of a test, and the groups were expected to obtain similar results if standards were well aligned.
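The equivalent-groups logic described above can be illustrated with a minimal sketch. It is not the project's actual analysis. The function names (`allocate_equivalent_groups`, `level_distribution`, `mean_level_gap`) and the fixed random seed are assumptions introduced here for illustration only, and the sketch leaves out the statistical controls a real comparison would need.

```python
import random
from collections import Counter


def allocate_equivalent_groups(pupil_ids, seed=0):
    """Randomly split pupils into two groups of (near) equal size.

    Hypothetical helper: random allocation is what makes the two
    groups 'equivalent' in expectation.
    """
    rng = random.Random(seed)
    shuffled = list(pupil_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def level_distribution(levels):
    """Return the proportion of pupils awarded each level."""
    counts = Counter(levels)
    total = len(levels)
    return {level: counts[level] / total for level in sorted(counts)}


def mean_level_gap(levels_past_version, levels_present_version):
    """Mean level on the present version minus mean level on the past version.

    With randomly equivalent groups, a gap near zero is what one would
    expect if the two versions were set to the same standard.
    """
    mean_past = sum(levels_past_version) / len(levels_past_version)
    mean_present = sum(levels_present_version) / len(levels_present_version)
    return mean_present - mean_past
```

In use, one group would sit the past version and the other the present version. The two level distributions would then be compared, and a persistent gap between groups that are otherwise equivalent would point to a difference in test standards rather than in the children.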
Key Stage Two Science: 1996 v 2001
Perhaps the most important feature of the experimental data is that they support the view that there was a great improvement in children's performance on Key Stage Two Science tests between 1996 and 2001. However, as in the Key Stage Two English analysis, there were signs that a small part of the very large improvement in national test results reported between 1996 and 2001 might be a product of a shift in test standards.
Key Stage Three Science: 1996 v 2001
Experimental comparison of the levels achieved, across all tiers, on the 1996 and 2001 versions of the Key Stage Three Science test suggested that, after controlling for variations in ability arising from gender effects and from assignment to the groups taking the two versions, the levels achieved were very similar. These data therefore suggest that the quite substantial gains in Key Stage Three Science test results reported nationally between 1996 and 2001 were merited, reflecting improvements in teaching and learning in schools.
The research team concluded that ‘Like the similar conclusions regarding almost all the curriculum areas at all three key stages investigated, perhaps this should be recognised as the most important inference we have been able to make’.