Monday, March 30, 2020

How on earth are we going to give grades this summer?


This is a premature blog, given that we will know what the plans are inside a week, but I am writing to get my thoughts in order and to raise some of the potential problems ahead of a final decision being made.

There are three important issues here:
  1. There isn’t a way to do this that everyone will be happy with.
  2. Something has to be done, and the organisations charged with doing this have an impossible job.
  3. You can have a simple system, or you can have a fair system, but you can’t have both.
The third of these is important, because if you have a simple system it will be unfair on very many students.  Yet the more complicated the system is, in a bid to make it fairer, the more difficult it will be to understand, so it won’t appear to be fair. 

One reason for writing this is that I can think of only one system that could be vaguely fair, but it is complicated, and I worry that when it is published it will be difficult to understand. 

The statement from the DfE

On the Gov.uk website (1) the DfE’s priorities are set out.  They are quoted below, with my comments after each passage.

This year’s summer exam series, including A levels, GCSEs and other qualifications, and all primary assessments, have been cancelled as we fight to stop the spread of coronavirus.

This is unsurprising and I can’t see any way around this.  We do not know how long schools and colleges will be closed for and there is no good way for exams to be taken without breaking the government’s rules on congregating in public.  We are not equipped as a country to sit exams online and there is not the time to set up secure ways of doing this (for example, how would you know that the person sitting at home behind their computer is the candidate and not a sibling or parent?). 

So: no exams, then. 

The Government’s priority is now to ensure affected students can move on as planned to the next stage of their lives, including going into employment, starting university, college or sixth form courses, or an apprenticeship in the autumn.

This means having some way of determining which students have achieved qualifications in particular subjects.  If exam results were only used as a way of proceeding to the next stage of education or training then it would be possible not to award grades at all, but it would seem unfair on the current group of students if they were not to receive their own grades.  Universities have made offers based on grades and will presumably continue to do so.

This means ensuring GCSE, A and AS level students are awarded a grade which fairly reflects the work that they have put in.

This doesn’t say “that fairly reflects the grade they would have got in their exams”. 

There will also be an option to sit an exam early in the next academic year for students who wish to.

There seems to be an expectation that this will happen in September.  This seems to be a fair and reasonable thing to offer students, and the exam papers for the summer exams (which have presumably been written already) can be used for that.  Marking will take several months as usual, so such students will not be able to take up university places in the autumn term. 
It will be interesting to see whether students are required to accept the grade they are ‘awarded’ this summer, or can reject it in favour of the grade they will get in the exam, or whether they will keep the higher grade if they subsequently take the exam and score poorly.  If the exam grade is automatically used, whether higher or lower, then this will be a tough call for some students.

Ofqual will develop and set out a process that will provide a calculated grade to each student which reflects their performance as fairly as possible, and will work with the exam boards to ensure this is consistently applied for all students. The exam boards will be asking teachers, who know their students well, to submit their judgement about the grade that they believe the student would have received if exams had gone ahead.

This is where the difficulties arise.  More on this below.

To produce this, teachers will take into account a range of evidence and data including performance on mock exams and non-exam assessment – clear guidance on how to do this fairly and robustly will be provided to schools and colleges. The exam boards will then combine this information with other relevant data, including prior attainment, and use this information to produce a calculated grade for each student, which will be a best assessment of the work they have put in.

Again – more on this below.

Ofqual and exam boards will be discussing with teachers’ representatives before finalising an approach, to ensure that it is as fair as possible. More information will be provided as soon as possible.

I wonder who they are talking to.  Unions?  Universities?  Subject associations?  Head teachers?  MAT leaders?  I hope and expect it will be a wide range of representatives. 

The aim is to provide these calculated grades to students before the end of July. In terms of a permanent record, the grades will be indistinguishable from those provided in other years.

This means a student will have a ‘grade 7’ and not a ‘coronavirus grade 7’.

We will also aim to ensure that the distribution of grades follows a similar pattern to that in other years, so that this year’s students do not face a systematic disadvantage as a consequence of these extraordinary circumstances.

Rather than not facing disadvantage, I wonder whether this is more about not allowing ‘grade inflation’ for this year.  First off, this will be very hard to achieve, given that a ‘take the exam’ option is being offered.  If an identical distribution to last year is created using the techniques they devise, they cannot know how many students will reject that grade and seek to score higher in the exams in the autumn.
This statement does suggest a way they might create the results (more below).

Education Secretary Gavin Williamson said: […] I have asked exam boards to work closely with the teachers who know their pupils best to ensure their hard work and dedication is rewarded and fairly recognised.

This is not a surprise, but it is worth noting that it forms a fundamental shift in the way students are assessed.  Clearly, this will involve some sort of teacher assessment and teacher input.  Many teachers may well welcome this.  But the biggest shift is that previously it has felt like {me and my students} in the red corner, versus {the examiners} in the blue corner.  In the past we could discuss exam technique (“the examiners like to see…”, “they are looking for…”, “they won’t give you marks unless…”) and if a result wasn’t what a pupil hoped for then they could blame it on the exam.  Now the pupils know that their teacher will have some involvement in determining their grade.  All of a sudden the teachers have changed sides and it’s {students} against {teachers, DfE and exam boards}.

We recognise that some students may nevertheless feel disappointed that they haven’t been able to sit their exams. If they do not believe the correct process has been followed in their case they will be able to appeal on that basis.

I am intrigued as to what sort of evidence we might be called on to provide, and how it will be possible to provide that evidence for parents and pupils to see without breaking other pupils’ confidentiality.  It’s going to be interesting!

In addition, if they do not feel their calculated grade reflects their performance, they will have the opportunity to sit an exam at the earliest reasonable opportunity, once schools are open again. Students will also have the option to sit their exams in summer 2021.

This seems absolutely reasonable, with the caveats mentioned above.

There is a very wide range of different vocational and technical qualifications as well as other academic qualifications for which students were expecting to sit exams this summer. These are offered by a large number of awarding organisations, and have differing assessment approaches – in many cases students will already have completed modules or non-exam assessment which could provide evidence to award a grade. We are encouraging these organisations to show the maximum possible flexibility and pragmatism to ensure students are not disadvantaged.

If part of the exam is coursework, or a languages oral exam, then on the one hand it seems reasonable to use that as evidence towards the grade the pupils get.  But it mustn’t be the only thing that is used.  Otherwise, there is little point in students sitting all of the other elements of the course in a ‘normal’ year. 

Ofqual is working urgently with the sector to explore options and we will work with them to provide more details shortly.  The Government will not publish any school or college level educational performance data based on tests, assessments or exams for 2020.

This seems fair on the face of it, and I certainly wouldn’t want schools to be judged on the strength of grades that are not exam grades.  Yet we are saying to pupils “these results are so dodgy we can’t use them to assess schools but we think you should accept them”.  This feels odd.

Some options

Now, let’s look at the meat of this.

Ofqual will develop and set out a process that will provide a calculated grade to each student which reflects their performance as fairly as possible, and will work with the exam boards to ensure this is consistently applied for all students.

A “calculated grade”.  This means there will be some sort of method, formula, or system that will be put in place and which will be applied to all schools and colleges.

The exam boards will be asking teachers, who know their students well, to submit their judgement about the grade that they believe the student would have received if exams had gone ahead.

To lots of people this suggested predicted grades would be used. There are a number of difficulties with just using these grades.

  a)  Predicted grades are notoriously inaccurate, particularly at A-level. This is unsurprising because they are given months ahead of the exam and there is plenty of time for students to change their working practices between the prediction and the exam.
  b)  Sometimes they are given as an ‘aspirational’ grade.
  c)  Sometimes they are depressed so as not to make a student over-confident.
  d)  While teachers will have done their best to produce predicted grades (that are already in the UCAS system), they will not have expected them to be used to determine the pupils’ actual grades.

To produce this, teachers will take into account a range of evidence and data including performance on mock exams and non-exam assessment

Can mock exam grades be used?  Mock exams are carried out in very many different ways and these cannot be compared fairly across schools.

  a)  They happen at different times in different schools. Comparing mocks taken in one school in November with others taken in March is nonsense.
  b)  Some schools set a full exam paper that includes questions on topics that haven’t yet been covered, while others remove such questions.
  c)  Some do all of the papers, while others don’t.
  d)  Some mark the mocks as if they were the real thing, while others take into account that they have been sat months before the actual exam.
  e)  Students know they are mock exams and may not have revised sufficiently, or understood that process, and certainly won’t have expected those exams to create their final grade.

The exam boards will then combine this information with other relevant data, including prior attainment, and use this information to produce a calculated grade for each student, which will be a best assessment of the work they have put in.

Some have worried that this means KS2 levels will be used to give GCSE grades and that GCSE grades will give Year 13 students their A-level grades.  That isn’t what this says, and it would clearly be unfair to say to a student in Yr 11 that everything they have done for the past 5 years will be disregarded and their GCSE grades will instead be based on tests they took at the age of 10 or 11. 

What is to be done?

It is obvious that a grade cannot just “be awarded” by teachers.  In most subjects, the students haven’t been carrying out coursework and there isn’t anything comparable that can be used to moderate grades given by teachers working in different ways in different schools.  Further to this, if there are ‘too many’ grade 7s awarded nationally (remember, there is a desire to keep the overall outcomes this year comparable to those of last year) then how do you decide which students will be given a grade 6 instead (or vice-versa if not enough have been awarded)?

Here is my prediction (with further detail below):

1) Teachers will be asked to rank students in each subject within their school: the highest-attaining student in a particular subject first, then the second highest, and so on down to the student we expect to achieve the lowest grade.
2) The DfE and exam boards will decide, using prior attainment data for these students and prior value-added for the school, how many of each grade in each subject a school ‘ought’ to achieve.
3) The students will receive those grades.

Ranking students

This will not be straightforward to do, but work by Daisy Christodoulou (2) and others involved in ‘comparative judgement’ has shown that deciding whether one piece of work is ‘better’ than another is easier than giving it a grade.  (If two people carry out these two types of grading then they are much more likely to agree on which piece is better than they are to give identical grades.)

Using prior attainment data to give grades for the school

This is not straightforward to explain.  I will start with a naïve way of doing this and will then tweak the model to make it more sophisticated.  This will result in something that is fairer but more difficult to understand. 

We can look at last year’s GCSE results in the school and can see, in a particular subject, how many of each grade there were.  For example, let’s imagine a particular school had 20 students who took GCSE History in 2019 and that these were their grades:

Grade            1    2    3    4    5    6    7    8    9
No. of pupils    1    0    1    4    3    5    4    1    1

We could turn that into percentages:

Grade            1    2    3    4    5    6    7    8    9
% of pupils      5%   0%   5%   20%  15%  25%  20%  5%   5%

Then we multiply the number of candidates for 2020 by these percentages.  For example, if there are 37 candidates for History in 2020 then we multiply those percentages by 37 and round them off:

Grade            1    2    3    4    5    6    7    8    9
No. of pupils    2    0    2    7    6    9    7    2    2

Then the list put together by the teachers is deployed: the first two students on the list are awarded a grade 9, the next two get a grade 8, the subsequent seven get a grade 7, etc.
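
The naïve method above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described here, not anything Ofqual has published: the function names and the placeholder student names are my own assumptions, and the 2019 figures are the ones from the History example.

```python
def scale_grade_counts(last_year_counts, cohort_size):
    """Scale last year's grade counts to this year's cohort size, rounding each one."""
    total = sum(last_year_counts.values())
    return {grade: round(n * cohort_size / total)
            for grade, n in last_year_counts.items()}


def assign_grades(ranked_students, grade_counts):
    """Hand out grades from the top grade down, following the teachers' rank order."""
    awarded = {}
    position = 0
    for grade in sorted(grade_counts, reverse=True):
        n = grade_counts[grade]
        for student in ranked_students[position:position + n]:
            awarded[student] = grade
        position += n
    return awarded


# 2019 GCSE History results from the example: grade -> number of pupils.
history_2019 = {1: 1, 2: 0, 3: 1, 4: 4, 5: 3, 6: 5, 7: 4, 8: 1, 9: 1}

# 37 candidates in 2020, already ranked by their teachers (names are placeholders).
ranked = [f"student_{i:02d}" for i in range(1, 38)]

counts_2020 = scale_grade_counts(history_2019, 37)
grades_2020 = assign_grades(ranked, counts_2020)
```

In this example the rounded numbers happen to add up to exactly 37. That will not always be the case, so a real system would need a rule for handing out (or taking away) the odd grade left over by rounding.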

This doesn’t take into account that different cohorts of students have different prior attainment (remember, the Gov.uk article mentioned that ‘prior attainment’ would be used).  It doesn’t seem fair to use an individual’s KS2 test result (the only comparable ‘prior attainment’ we have for Yr 11 students), but it could be feasible to use that on a school level.

Here is an increased level of complexity:
This new table shows the KS2 levels achieved by the students who took GCSE History at the school last year (down the side), with their GCSE results along the top.  For example, the entry in the KS2 level 4 row and the grade 6 column shows that there were two pupils who got a level 4 at KS2 (in English and Maths) and went on to get a grade 6 in GCSE History.

                      GCSE grades 2019
KS2 level    1    2    3    4    5    6    7    8    9
    2        1    -    1    -    -    -    -    -    -
    3        -    -    -    1    1    -    -    -    -
    4        -    -    -    3    1    2    -    -    -
    5        -    -    -    -    1    3    4    1    -
    6        -    -    -    -    -    -    -    -    1

At this school there were two pupils who took GCSE History in 2019 who had a level 3 at KS2.  Of them, half got a grade 4 and the other half a grade 5.  So we would look at the KS2 levels of the 37 students who were planning to take GCSE History in 2020, and would say that half of those with a level 3 at KS2 will get a grade 4 and half a grade 5.  We will do this for each of the KS2 levels to produce the grades for the school.

Those grades would then be aggregated, so we will still have a total number of grades at grade 9, grade 8, etc, for the school.  The list devised by the teachers will then be used to assign grades to pupils in the same way as previously.
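
Here is a rough sketch of that aggregation step, again in Python and again only my guess at how such a calculation might look. The 2019 table is the one from the example; the KS2 profile of the 2020 cohort is invented purely for illustration, and the function name is hypothetical. I have used exact fractions so that the rounding question is kept separate from the main calculation.

```python
from fractions import Fraction

# 2019 GCSE History results split by KS2 level:
# KS2 level -> {GCSE grade: number of pupils}, taken from the table in the example.
history_2019_by_ks2 = {
    2: {1: 1, 3: 1},
    3: {4: 1, 5: 1},
    4: {4: 3, 5: 1, 6: 2},
    5: {5: 1, 6: 3, 7: 4, 8: 1},
    6: {9: 1},
}


def expected_grade_counts(results_by_ks2, cohort_by_ks2):
    """Share this year's pupils at each KS2 level across the grades in the same
    proportions as last year's pupils at that level, then add up per grade."""
    expected = {grade: Fraction(0) for grade in range(1, 10)}
    for level, n_pupils in cohort_by_ks2.items():
        row = results_by_ks2[level]  # KeyError if there is no prior data for this level
        row_total = sum(row.values())
        for grade, n in row.items():
            expected[grade] += Fraction(n * n_pupils, row_total)
    return expected


# HYPOTHETICAL KS2 profile for the 37 candidates in 2020 (invented for illustration).
cohort_2020 = {2: 2, 3: 4, 4: 12, 5: 17, 6: 2}
expected_2020 = expected_grade_counts(history_2019_by_ks2, cohort_2020)

# Sanity check: feeding in the 2019 cohort's own KS2 profile gives back the 2019 grades.
cohort_2019 = {level: sum(row.values()) for level, row in history_2019_by_ks2.items()}
expected_2019 = expected_grade_counts(history_2019_by_ks2, cohort_2019)
```

The expected numbers come out as fractions (the made-up cohort above would ‘earn’ 53/9 grade 5s, for example), so they would still need rounding to whole students before the teachers’ list is applied.  The sketch also fails outright if a 2020 student has a KS2 level that nobody had in 2019, which is one form of the missing prior data problem.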

This is much more nuanced, in that it takes account of differences between cohorts, but is harder to explain.  A further level of complexity could involve using ‘fine KS2 levels’ (eg 4a, 4b, 4c) or the decimal levels used by Progress 8.

What is good about this method (and how can it be tweaked further)?

It is fortunate that the current Year 10s were the first students to take the new version of KS2 SATS, so the Year 11s have the same type of prior attainment as last year’s cohort (KS2 levels and not a scaled score). 

The vast majority of GCSE subjects have had two years of results under the new 9 to 1 grading system, so it would be possible to combine the school’s results for the past two years for many subjects.  (For English, Maths and English Lit there are three years of grades that could be used.  For some subjects there is only one year of prior data with the 9 to 1 system.)

It is helpful that the teachers are not deciding whether a student should get a particular grade, but instead are only choosing an order for their students.

If the entry cohort to the school changes then that will be reflected by this system.  If the entry cohort for a particular subject changes then that will also be recognised by this system.

Problems with this method

There are lots of potential problems with this system:

1) Schools that have had a significant change (eg they have gone into special measures this year, or have made significant changes, or are improving rapidly) will not see this reflected in the exam results, which will be broadly in line with those of the previous year. I don’t see any way around this.

2) A subject that has undergone a significant change (eg there was a lot of teacher absence in the past but there is now a new and experienced team in place) won’t see this reflected in their results.

3) Previous students with particular issues will affect the current cohort. A school-refuser last year will affect the grade of a student this year.

4) If a subject has a small cohort this could lead to unfair grading.

5) A new subject to the school won’t have prior data that can be used.

6) Schools with a significant number of pupils who join after the start of Yr 7 won’t have as much matched data (KS2 to GCSE) that can be used.

7) Subjects where early entry is reasonable (such as native speakers taking GCSE French in Yr 9) will be difficult to grade.

8) Certain groups of students do not fare well under teacher assessment. For example, a study (3) found that non-white-British pupils achieved lower teacher assessed levels (compared to their test scores) than their white British (WBR) counterparts. How do we avoid any unconscious bias on the grounds of race, and also by gender, background, accent, FSM, etc?

9) There are always surprises on results day, where teachers are amazed (in a positive or a negative way) about the grade a certain student has gained.

10) There may be student perception that there will be favouritism (“my teacher doesn’t like me”).

We will need to compare across classes taught by different teachers.  Determining the order of students is where the data will come in.  Going back to the Gov.uk guidance:

To produce this, teachers will take into account a range of evidence and data including performance on mock exams and non-exam assessment – clear guidance on how to do this fairly and robustly will be provided to schools and colleges.

Because we are only ranking students within a single school, the data we have is likely to be similar and comparable for all our students, and it doesn’t matter if it is not comparable to that of other schools.  While we cannot compare easily between schools, we will be able to use teacher assessment and mock exam results within a school to create an order of students.

The exam boards will then combine this information with other relevant data, including prior attainment, and use this information to produce a calculated grade for each student, which will be a best assessment of the work they have put in.

How might A-levels work?

This blog has focused largely on GCSEs, but a similar process could be used for A-levels, making use of GCSE grades as prior attainment.  There will again be problems with small cohorts in certain subjects and most of the other concerns raised above will still apply.

How might KS2 work?

I am outside my area of expertise here.  Could a similar process work for KS2, using KS1 levels as a baseline?  Or, given that teacher assessment is already part of KS2 and there is a system in place for doing this, will that be used instead of this system?  A final thought on this one is that if this KS2 data won’t be used to publish school level performance data in 2020, presumably that means Progress 8 (which is based on KS2 data) won’t be published in 2025 (when the current Year 6 pupils reach Year 11).

Conclusion

I started with these three issues:

1) There isn’t a way to do this that everyone will be happy with.
2) Something has to be done, and the organisations charged with doing this have an impossible job.
3) You can have a simple system, or you can have a fair system, but you can’t have both.

This blog has explained what I see as being the least-worst way of doing this, and I have tried to acknowledge the problems that are inherent in this, and any other method.

I finish by saying ‘Good luck’ to everyone involved in whatever system is deployed.  Good luck to those who are devising the system, to the teachers who will need to use it, and good luck to those whose grades will depend on it.




Wednesday, March 25, 2020

Some maths with Social Distancing


As part of the advice on what to do to avoid spreading coronavirus, the BBC has some little graphics.  The one about social distancing (where we were advised to keep 2m apart) is this:


Is this to scale?
If we lie the chap down, and assume he is about 6 ft tall, then the circle appears to have a diameter of about 2 metres.


But hold on, we have to be 2m _away_ from other people: I mustn’t just be this distance away from others but should be twice that amount.  The first diagram here is bad, the second is good.


If I remove the major arc of the circle, suddenly this graphic makes sense:


The artist was thinking of us as Subbuteo figures

(Royalty-free image from Shutterstock)

This then leads to the question of how many people could fit in a room, obeying the requirement of being 2m apart.
My first thought was that this was a ‘packing’ problem, and for a 2m by 5m room I produced this diagram:


But we aren’t actually Subbuteo figures, so it’s fine for our ‘base’ to overlap the edge of the room.  In fact, we can have at least 6 people in the room:
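
A quick way to check an arrangement like this is to treat each person as a point and test every pair of points. The sketch below is an assumption on my part: it uses one possible layout of six people (standing against the walls of the 2m by 5m room), which may not be the one drawn in the diagram, and the function name is made up.

```python
from itertools import combinations
from math import dist


def layout_is_valid(points, width, length, min_gap=2.0):
    """True if every point is inside the room and every pair is at least min_gap apart."""
    inside = all(0 <= x <= length and 0 <= y <= width for x, y in points)
    apart = all(dist(p, q) >= min_gap for p, q in combinations(points, 2))
    return inside and apart


# One assumed layout for a 2m by 5m room: two rows of three people along the long walls.
six_people = [(0, 0), (2, 0), (4, 0), (0, 2), (2, 2), (4, 2)]

print(layout_is_valid(six_people, width=2, length=5))
```

The same function can be used to try out candidate answers to the questions below: add another point to the list, or change the room dimensions, and see whether the layout still passes.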


Further questions:

  1. Can we have more than 6 people in this room?
  2. Is it possible for them to get into the room without violating each other’s space?  Does it depend on where the door is?
  3. How many people will fit in rooms of other dimensions?
  4. What is the smallest room that will fit a particular number of people?