Exam Results: What can we really learn from them?

Kristian Shanks
Aug 30, 2023

If you’re a current Head of Department or senior leader, then you have probably spent a lot of the last week poring over sets of exam results. However, I have often found that the process gets clouded by a lot of noisy data. We get obsessed with progress data showing how one class performed compared to another, or how this subject did compared to that one.

Some of that can be useful as a very broad indication of relative performance, but even then it’s often fraught with problems. Is it really fair to compare the results of a core subject like Maths (where all students might be setted in one block, with more teaching groups than form groups for the year and priority timetabling, i.e. no last-period lessons) with those of a big option subject like History or Geography (where you might have three completely mixed-attainment classes of 31 students each), or with those of a small, niche option subject whose entry is inherently self-selecting? Subjects inhabit very different contexts within your school ecosystem, and senior leaders would do well to remind themselves of this during the exam review period in September.

As a Head of Department, I always found the materials available to download from the exam board’s website much more useful. There I could see every student’s mark for every question; total up the marks on each paper and add notional grades (which I’d then average for each paper to compare our cohort’s performance across the different papers); compare our cohort’s average on each question with the national average or that of similar centres; and work out areas of strength and weakness from there. This is hard, externally verified data about the effect of our teaching and learning in our subject. Did we prepare the students as well as possible for the different versions of the papers that came out? Is there a paper we seemed to do really well on versus one where our students struggled?
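If it helps to make that concrete, here is a rough sketch in Python (with pandas) of the kind of question-level comparison I mean. The file name and column names are placeholders rather than any board’s actual export format; the real downloads all vary in layout, so you would adapt this to whatever your own file contains.

```python
import pandas as pd

# Hypothetical layout: one row per student per question, with the student's
# mark, the maximum mark available, and the national mean for that question.
marks = pd.read_csv("question_level_marks.csv")
# expected columns: student_id, paper, question, mark, max_mark, national_mean

# Our cohort's average on each question, expressed as a % of marks available,
# set alongside the national figure.
by_question = (
    marks.groupby(["paper", "question"])
    .agg(cohort_mean=("mark", "mean"),
         max_mark=("max_mark", "first"),
         national_mean=("national_mean", "first"))
)
by_question["cohort_pct"] = 100 * by_question["cohort_mean"] / by_question["max_mark"]
by_question["national_pct"] = 100 * by_question["national_mean"] / by_question["max_mark"]
by_question["gap"] = by_question["cohort_pct"] - by_question["national_pct"]

# The questions where we were furthest behind the national average.
print(by_question.sort_values("gap").head(10))

# Paper totals per student, ready for notional grade boundaries to be applied.
paper_totals = marks.groupby(["student_id", "paper"])["mark"].sum().unstack()
print(paper_totals.describe())
```

Even a crude table like that, sorted by the gap to national, is usually enough to start the conversation about which questions, and which topics, need re-teaching.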

This always seemed to provide much more useful insight into where we needed to improve as a Department than data about the gender gap, or the performance of the two EAL students out of 90 in my cohort, or something like that. Telling me that PP kids did way worse than non-PP kids doesn’t help me teach either group any better, to be honest. But if I know that they all messed up the question on nineteenth-century law enforcement, then at least I’ve got something I can re-teach better.

For example, at the end of my first year as Head of Department, after the 2019 GCSE History exams, I was able to identify that Medicine through Time, previously perceived as a strong section for our centre, was actually one where students had under-performed. There was a question on the growth of medical knowledge during the Renaissance that caused students lots of problems, and we were way behind the national average. That question would probably not have been asked on the old specification, and it was a sign that our teaching of the new specification was not yet sharp enough on the Medicine topic, which had been taught in the department for years.

One further curiosity is that in every school I’ve worked in, I have had to go and ask for the relevant accounts, when you would think this is something that should be provided as standard, especially for a new Head of Department (and navigating the secure areas of exam board websites to get to the good stuff should probably be part of a new HoD’s induction CPD). It seems to be assumed that you should be doing this, yet I’ve encountered lots of HoDs across many schools without access to this information.

As a Senior Leader I now have lots of different subjects to think about, and I have tried to apply some of those ideas to them. There are a few challenges. For example, Science has 12 papers in total across the Combined course (6 each for Higher and Foundation), and if your centre does separate sciences as well, that’s a lot of data to wade through. It’s also not as easy to identify the reasons why things happened; all you can really do is hypothesise a bit. One of the things I picked up from Science was that Higher Chemistry was the strongest of the three Sciences and Higher Physics the weakest (with Biology in between), but the trend was reversed at Foundation tier, despite extremely high grade boundaries for Foundation Physics relative to the other Sciences.

In Maths we had very even performance across all three papers (except for Paper 3 on the Foundation tier) in what was a good year for us in this subject. However, it was interesting to look at questions where students performed poorly relative to national. For example, the 2-marker below on the Foundation tier, asking for the perimeter of shapes drawn on a squared grid, was a weaker question for us.

We also had a one-marker below, on finding the mode, that didn’t score well.

One thesis I’ve got, which links to other areas of school improvement we are working on, is around literacy. Have the words ‘perimeter’ and ‘mode’ caused problems for students in a school where reading ages are, on average, low compared to chronological age? Did students confuse ‘perimeter’ with ‘area’, since when you look at the picture you are almost drawn to start counting the squares rather than the lengths of the sides of the shapes? Hammering accuracy on technical vocabulary like this ‘might’ be an area to work on, but I’d still need to talk to the people who have been teaching this course, and to our students, before drawing any firm conclusions.

[As an aside, I had to do quite a lot of invigilation this year and I find watching the kids doing Maths exams so interesting — especially on the Foundation tier where the actual Maths is quite easy, but deciphering what Maths you need to do from the wordy problem is the tricky bit. You can see them tripping up in real time and it takes a lot of restraint for me not to jump in — as well as the fact that clearly I’d get fired!]

A few other trends stood out. In line with wider issues around school attendance, making sure students turned up for every exam was a challenge, and in a number of subjects we saw a tail-off in performance on the later papers in the series. For example, our Paper 3 average grade in History (Edexcel) was lower than for Paper 2, which is surprising given the relative ease of Paper 3 versus Paper 2, which is pretty brutal. We saw this in English and Foundation Maths as well. Perhaps for students who are struggling to even get the all-important ‘standard pass’ Grade 4, there is little incentive to rock up for a mid-June paper when the die is already cast in terms of their result, no matter how many ‘inspirational’ assemblies we give about how every grade counts.

As alluded to above, literacy remains a huge focus. Subjects with lots of writing found it a bit tougher this year. Even in Urdu, a successful subject in our school (where we have a very high number of students from Pakistani backgrounds), writing grades were much, much lower than listening and (especially) speaking grades. I knew this was a trend, but the actual numbers were starker than I expected, and it’s a useful line of enquiry for next year. If we can drive up reading ages in our school, ensure students remain motivated through the whole series, and improve attendance in general, we can make serious strides beyond where we’ve got to so far.

If you’ve got this far, I hope you are able to navigate your own exam review season productively, whichever side of the table you are on! Let’s try and learn something useful this time around!

--

Kristian Shanks

I’m an Assistant Principal (Teaching and Learning) at a secondary school in Bradford. I also teach History (and am a former Head of History).