Assessment in History: A Sisyphean task?

Kristian Shanks
6 min read · Nov 18, 2020


This post is inspired by a rather excellent short blog by David Didau, which can be found here. In it, Didau argues that we need to move away from ‘grades’ to assess pupils, and move towards checking whether pupils have met curriculum expectations. I absolutely, wholeheartedly agree. Didau also rightly attacks some of the ‘backdoor’ grading processes we’ve seen as replacements for Key Stage 3 curriculum levels (such as ‘Emerging’ or ‘Mastering’), and suggests that the danger of these is that students associate themselves with the grade rather than with the actual areas for improvement.

And yet…

I am left with more questions than answers as far as History is concerned.

In particular, the issue is this: why do I assess, and what purpose does it serve? These might seem daft questions, but let’s go back to basics here.

There are two key assessment types: formative and summative. In short, with formative assessment, I want to find out what pupils can and can’t do, and what they do and don’t know. I want to know this so I can help them address those issues and get better at the subject.

With a summative assessment, I want to know how good they are at the thing I’ve taught them, compared either to the rest of the students in the class, to some kind of national benchmark if I’m teaching them at Key Stage 4, or to my own priors about what counts as an appropriate expectation for a student of that age.

However, there are some obvious flaws in my practice here, or at least gaps in the logic of what I’m doing.

  1. I still don’t think I’m very good at distinguishing formative and summative tasks. An essay is probably a summative task (thanks to David Didau again, on Twitter, for helping me clarify my thinking), but I probably treat it more formatively, in that I will give pupils feedback in the hope that they will use it to write better essays in the future. The problem is that future essays will inevitably be on different topics: either because of the linear nature of the curriculum as we move through different time periods, or, if it’s an exam course, because of the low likelihood that the precise essay I set as homework or a mock exam is the one that comes up in the proper exam. So how transferable is feedback from one essay to another? Maybe not that much.
  2. Linked to the above is the issue of DIRT (Dedicated Improvement and Reflection Time). What is the point of getting pupils to improve an essay they’ve already written (especially if we see essays as summative tasks) when the goal is to make them ready for a random unseen essay on any aspect of the topic? How can the conventional DIRT tasks of redrafting and improving previously set work adequately prepare them for that challenge? The whole DIRT policy, or ‘red for reflection’, or whatever your school calls it, has got away from the fact that redrafting is probably better done as part of a drafting process, rather than as a way of improving an ‘end product’ piece. But I don’t think DIRT is done like that in most schools.
  3. Due to the demands of whole-school marking policies, I need to show I am giving feedback on work. But, so I don’t die, I also need to make this manageable from a workload perspective, given that I teach 7 different year groups and 9 different classes with limited non-contact time. That can mean I miss valuable opportunities to give feedback. In addition, I tend not to record formative assessment anywhere (e.g. scores in low-stakes tests, or records of which questions pupils got right or wrong), and when I do, I rarely go back and consult those records. Usually it’s just stored in my head as much as possible.
  4. Do I even know how to produce assessments that measure how good a kid is at History and that are also (thanks to Daisy Christodoulou for helping my thinking here) reliable and valid? I think this is really hard, given the vast amounts of History we expect kids to know about. Are we, as a cohort of History teachers, aware of the limits of what our assessments can do? I am not sure we can meaningfully assess students on everything they should have learned in Key Stage 3 within the constraints of limited teaching time, one-hour lessons and so on. Either the assessment would need to be so long as to be impractical, or, to be short enough, it would only be able to assess a tiny sample of the domain, which could lead to unreliability, in that some students may get lucky or unlucky depending on whether what comes up on the test chimes with what they know well (a toy simulation of this sampling effect follows this list). We know that History is already a highly unreliable subject to assess at GCSE. If the exam boards can’t get it right, what are my chances as a single head of department ploughing my own lonely furrow?
  5. So the counterpoint to the above is that we can assess students as we go along. But then we come back to the earlier point that my feedback becomes less useful, because we’ve probably moved on to the next topic by the time I come to give feedback on work from the previous one, given the domain specificity of history knowledge and its probable lack of transferability.
  6. Now, one area where there may be transferability is in some sort of generic ‘essay-writing skills’. I would say that, until I encountered The Writing Revolution, my ability to give feedback on those was, frankly, ineffective and had precisely zero impact. There is a language around essay construction that I think a lot of history teachers don’t have. TWR can really help there, I think, by providing actionable targets that go beyond ‘add more detail’, but it still won’t help much unless it is allied to good knowledge of the topic you are writing about. Of course, the habit of exam boards of producing predictable question stems and formulaic exams has added to this. Feedback on most exam questions probably tends to revolve around giving students the cheat codes for the exam, rather than helping them get better at History as a discipline. Perhaps instead of exams with more predictability and less choice, we need exams with less predictability (of question style) and more choice.
  7. The purpose of summative assessment (other than terminally assessed end-of-course exams) seems to be to produce a score, or some evidence that leads to a score, to enter into a school reporting or tracking system. Is that a good use of anyone’s time? How much time should be spent feeding back on such a task when it constitutes the end product of a child’s learning on a topic? How do we avoid the problem Didau describes of the child associating themselves with the score? And if we don’t have summative assessments and scores and so on, what do we actually give to parents, as part of our legal duty to report on their child’s progress, that they will actually understand? My general view is that the only things we can really tell them are: a) how their child is doing compared to the other children in the class, and b) a guesstimate of the grade they might get at GCSE, based on the quality of work seen so far and my prior knowledge of previous children who have produced work of a similar quality. That’s it! What else can we meaningfully say that doesn’t get bogged down in teacher jargon?
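
To make the sampling worry in point 4 concrete, here is a toy simulation. It is just a sketch, not a real psychometric model: the domain size, the pupil’s level of knowledge and the test length are all numbers I have invented for illustration.

```python
import random

random.seed(1)

DOMAIN_SIZE = 100   # distinct topics a KS3 course might cover (invented figure)
TEST_LENGTH = 10    # questions on one short assessment
TRIALS = 1000       # simulated versions of that test

# A hypothetical pupil who securely knows 60% of the domain.
known = set(random.sample(range(DOMAIN_SIZE), 60))

scores = []
for _ in range(TRIALS):
    # Each test samples a small random slice of the domain.
    questions = random.sample(range(DOMAIN_SIZE), TEST_LENGTH)
    scores.append(sum(q in known for q in questions) / TEST_LENGTH)

print(f"mean score across tests: {sum(scores) / TRIALS:.0%}")
print(f"lowest / highest single test: {min(scores):.0%} / {max(scores):.0%}")
```

The mean lands near the pupil’s ‘true’ 60%, but individual ten-question tests can plausibly score the same child anywhere from around 20% to 100%, depending purely on which questions happen to be drawn: the lucky/unlucky effect. Making the test longer narrows that spread, which is exactly the impracticality trade-off described above.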

I’m out of thoughts for the time being. It’s an interesting intellectual exercise, and I think we still look far too much at proxies for learning rather than actual learning, probably because it’s incredibly difficult to effectively encapsulate how much a child has learned, especially in a subject as vast as History.



Written by Kristian Shanks

I’m an Assistant Principal (Teaching and Learning) at a secondary school in Bradford. I also teach History (and am a former Head of History).
