We’re going to have predicted grades for the Leaving Cert. We don’t have full details on how the mechanism is going to work yet, but assuming the people in charge are literate in experimental design and statistics we can guess the main ideas. A key question: are they going to publish details of the model?
A lot of people are worried about some specific potential weaknesses in the system. Some are real, some are not, some are real but no worse than in a normal year. Some are systematic, and some are random. Some can be mitigated, some can’t. This post tries to sort them out.
Teachers will tend to award lower grades to students they dislike. Unavoidable. Both conscious and unconscious bias could be in play here. But the effect size can be reduced by requiring evidence-based decisions.
Students who tend to appear weak early in the cycle, relative to exams, will be disadvantaged. Unavoidable, but the effect size can be reduced by taking into account Junior Cert results, which are at the end of a cycle. Also, everyone works harder and knows more at the end of the cycle than they did earlier, so this effect doesn’t pick out a subgroup unfairly.
Some schools use open-book mock exams, or different standards of mock exams, or harsher grading. This doesn’t matter, as long as mock results are only compared within a school. Think of the mock exam result as an independent variable in a linear relationship with the final exam result, within each school: the slope and intercept can differ from school to school, but each school’s own relationship is enough to inform predictions.
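A minimal simulation of this point (all numbers invented): the same cohort is measured under two very different mock regimes, purely to isolate the effect of grading standards, and each school’s own within-school fit is assumed to be linear:

```python
import numpy as np

# Illustrative simulation (all numbers invented): two schools grade their
# mocks on very different standards, yet within each school the mock is
# still a good linear predictor of the final exam. The same cohort is
# used for both "schools" purely to isolate the grading-standard effect.
rng = np.random.default_rng(0)
n = 200
ability = rng.normal(60, 10, size=n)        # underlying ability
final = ability + rng.normal(0, 5, size=n)  # final-exam score

# School A marks mocks harshly; School B runs an easier open-book mock.
mocks = {
    "A (harsh)": 0.8 * ability - 10 + rng.normal(0, 5, size=n),
    "B (open-book)": 1.1 * ability + 15 + rng.normal(0, 5, size=n),
}

corrs = {}
for school, mock in mocks.items():
    # Fit the within-school linear relationship mock -> final.
    slope, intercept = np.polyfit(mock, final, deg=1)
    pred = slope * mock + intercept
    corrs[school] = np.corrcoef(pred, final)[0, 1]
    print(f"school {school}: predicted-vs-final correlation {corrs[school]:.2f}")
```

Despite the 25-point gap in raw mock marks between the two regimes, both within-school fits predict the final exam about equally well.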
Teachers will be under pressure from students and parents. The Minister should loudly and publicly require principals and teachers to log all communication with students, parents, and families – verbal, written, electronic – which touches on the Leaving Certificate. This would go some way toward planting a doubt in parents’ minds about trying to pressure teachers. I think this policy would be better than setting up an official channel of communication via principals, because the effect of that would be that disallowed direct communication with teachers would continue, but would go unlogged.
Students in weaker schools, non-fee paying schools, disadvantaged schools, or schools with lower mean historical grades in the exam will be unfairly adjusted downwards. (A note on terminology: I don’t know a good catch-all term for these schools. What I know for sure is that terms like “under-performing schools” or “weaker schools” might imply the wrong thing. School performance should be measured by outcomes relative to ability and other variables at intake, not by raw outcomes. When people say “under-performing schools” they’re talking about raw outcomes. This is an issue that transcends Covid-19 and Leaving Cert 2020.)

Of course, students in these schools are disadvantaged in many ways – they always were. But the new mechanism will not make this worse. The teacher produces a predicted grade for their whole class. If this distribution is far higher or lower than the historical distribution for that class, it will be adjusted to approximately match. This prevents grade inflation by generous/weak-willed teachers (or the opposite, presumably rarer).

Yes, the distribution for a historically lower-grades school will be lower, but that doesn’t mean a student in such a school is more likely to have their predicted grades adjusted downwards. Consider two students of equal ability, in two schools of very different historical grade distributions. It must follow that they are at very different percentiles within their schools. Also, if neither school’s distribution of predicted grades is out of line with historical data, there’s no adjustment, and they receive equal grades. If the student in the “weaker school” (scare quotes) receives the higher predicted grade, then (in expectation) that is because their teacher gave more generous grades to their class, and they’ll be adjusted downward – (in expectation) to the same grade as the student in the “stronger school”. There is no reason to expect that teachers in “weaker schools” are more prone to inflation or generosity.
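The inflation-cancelling property of the adjustment can be sketched in a few lines. The real standardisation will be more sophisticated than a simple mean shift; this is a crude stand-in, with invented numbers, to show the two properties the argument relies on:

```python
import numpy as np

# Toy version of the adjustment (invented numbers): each class's predicted
# grades are shifted so their mean matches the school's own historical
# mean. Two properties matter for the argument above: a uniform inflation
# by the teacher is cancelled exactly, and students' ranking is preserved.
hist_mean = {"strong": 75.0, "weak": 55.0}

def adjust(predicted, school):
    """Shift a class's predictions onto the school's historical mean
    (a crude stand-in for the real standardisation)."""
    return predicted - predicted.mean() + hist_mean[school]

rng = np.random.default_rng(1)
weak_class = rng.normal(55, 8, size=30)      # one class's predicted grades

honest = adjust(weak_class, "weak")
inflated = adjust(weak_class + 10, "weak")   # teacher adds 10 to everyone

# Uniform inflation is cancelled exactly by the adjustment...
assert np.allclose(honest, inflated)
# ...and, being a shift, the adjustment never reorders students.
assert (np.argsort(honest) == np.argsort(weak_class)).all()
print("inflation cancelled; ranking preserved")
```

The point of the sketch: the adjustment punishes inflation relative to the school’s own history, not low historical grades per se.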
Biased officials in the Department are likely to be more generous to fee-paying or middle-class schools. This is a distinct point from the above, because it’s about the human factor as opposed to the system. As argued above, I don’t buy that “weaker schools” are disadvantaged by the new mechanism, but they could be by the human factor. To mitigate this, again, there has to be evidence-based decision-making. In order to make a downward adjustment to Disadvantaged School D, as opposed to Fee-paying School F, the claim that has to be supported is not that School D historically gets worse grades than School F. Instead, the biased officials must be required to show that the teachers in School D have inflated grades relative to previous years in School D (whereas those in School F haven’t).
How can we know whether a distribution of predicted grades is too far out of line from previous years? We know this because we can examine the year-to-year variation of true grades, and set a confidence interval based on that.
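As a sketch, assuming for illustration that the check is applied to the class mean (the real mechanism may compare full distributions), with invented historical numbers:

```python
import numpy as np

# Sketch of an "out of line" check based on a school's own year-to-year
# variation. The historical class means below are invented.
hist_means = np.array([61.2, 58.9, 63.1, 60.4, 59.8])  # last five years

mu = hist_means.mean()
sigma = hist_means.std(ddof=1)            # year-to-year variability
lo, hi = mu - 2 * sigma, mu + 2 * sigma   # rough 95% interval

def out_of_line(predicted_mean):
    """Flag a class mean that falls outside the historical interval."""
    return bool(predicted_mean < lo or predicted_mean > hi)

print(out_of_line(62.0))  # within normal year-to-year variation -> False
print(out_of_line(72.0))  # far above anything in the history -> True
```

A class mean of 62 would pass unremarked; a mean of 72 would trigger scrutiny, which is where the appeal process of the next paragraph comes in.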
What if a class is particularly strong relative to previous years? Obviously, the teacher will be predicting a higher distribution of grades, and it will appear out of line with their historical distribution, and so would be adjusted downwards. But if a teacher and school believe it is an exceptionally strong year, they should be allowed to make this case – e.g. based on performance at Junior Cert or mock exams, relative to their previous years’ comparable results. This doesn’t involve inter-school comparisons of mock exams.
There just isn’t a sufficiently strong correlation between predicted grades and true grades. I have seen the figure of 16% of grades agreeing exactly, in the UK, where predicted grades have been used for several purposes for many years. Experience matters. The proportion of grades that agree exactly is a crude metric – a plain old Pearson correlation would be better. Either way, this is a real weakness. But – perhaps importantly – while some of the error is due to systematic effects, e.g. conscious or unconscious bias, probably most of the error is unsystematic. Predicting grades is just hard. This introduces something of a lottery to the result, but lotteries are “fair” in a particular sense.
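A toy simulation (invented grade bands, not real UK data) shows why exact agreement is a crude metric: a predictor that is always exactly one band low scores 0% agreement despite perfectly preserving the ranking, while an unbiased but noisy predictor scores higher agreement with a worse correlation:

```python
import numpy as np

# Simulated grades, purely illustrative: compare "% agreeing exactly"
# against Pearson correlation for two imperfect predictors.
rng = np.random.default_rng(2)
true_grade = rng.integers(1, 9, size=1000)           # grade bands 1..8

always_low = true_grade - 1                          # one band low, always
noisy = true_grade + rng.integers(-2, 3, size=1000)  # unbiased, noisy

def agreement(a, b):
    """Proportion of grades that agree exactly."""
    return float((a == b).mean())

def pearson(a, b):
    """Plain Pearson correlation coefficient."""
    return float(np.corrcoef(a, b)[0, 1])

print(agreement(true_grade, always_low), pearson(true_grade, always_low))
# -> zero agreement, yet correlation ~1.0 (ranking perfectly preserved)
print(agreement(true_grade, noisy), pearson(true_grade, noisy))
# -> ~20% agreement, yet a weaker correlation
```

The two metrics rank these predictors in opposite orders, which is why a low exact-agreement figure on its own tells us little about how useful the predictions are.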