Who will take the initiative if not government?
In the discussions about the removal of levels there have been demands that someone (presumably the government) should do something to replace them. There are good reasons why this should not happen. Firstly, the unworkably complex National Strategies were the outcome of government micromanaging a system that teachers are much better placed to manage themselves. There were certainly good aspects of the National Strategies worth preserving, but a one-size-fits-all approach ends up doing more harm than good. My second point needs a little more explanation.
If we want to make forecasts and set institutional targets for internal development and improvement (I think we should, so let's not get diverted into another discussion; if you don't, stop reading here), there is a dilemma: we need nationally representative data in order to compare our individual data against everyone else's, but as soon as we start contributing to national data, our data can be used against us in accountability measures. Not that accountability is a bad thing, just that experience should warn us it can be detrimental if it prevents objective analysis. Political pressure will make it impossible for any government to resist making individual data public. The evidence comes directly from grade inflation, gaming the system and so on, which has resulted in the need for a very expensive reform of qualifications. On these grounds we need to ensure the data can be used for the intended purpose and not get hijacked. How might this be achieved?
Crowd-sourcing useful data
What is required is a system of voluntary testing in which schools can submit their results, have them aggregated into a nationally relevant sample, and then check individual results against that sample to monitor progress. A common multiple-choice test in each subject might not be perfect, but it would have two key benefits: it would be inexpensive to deliver, and it would provide the information needed quickly. If this testing were delivered by a non-governmental organisation, the data would be safe to use for the intended purpose.
This is the idea behind a system we devised with NAACE, Mirandanet and Computing at School, initially for Computing. Test each child using a short multiple-choice test at the end of each year in KS3. If they are consistently in the top 10% each year, they will almost certainly get an A/A* in a GCSE or GCSE equivalent; if they are consistently in the lowest 10%, probably an F or G. Schools can then use the information as they think appropriate. We can deliver these tests for free using online technology. That's another reason not to use the DfE: a tender for national test delivery could run into millions, diverting resources out of the classroom. How can we afford to do it for free? We have access to EU grant funding, and some people might want an official level 1 qualification at the end of KS3 for their learners, including "points". For those that do there is a charge, and that pays for devising the free tests and supporting the technology, which are needed in any case. The technology is already developed and based on free and open source applications, so there is no significant development cost. Think of how Dropbox, LinkedIn and other freemium models operate. So it is sustainable without any dependency on the DfE, Ofsted or Ofqual.
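The percentile banding described above can be sketched in a few lines. This is an illustrative assumption of how such a check might work, not the actual NAACE/Mirandanet/CAS implementation; the function names, bands and sample scores are invented for the example.

```python
# Illustrative sketch only: given a pupil's raw test score and the
# aggregated national sample of scores, compute the pupil's percentile
# rank and map the extremes to the indicative outcomes from the text.

def percentile_rank(score, national_scores):
    """Percentage of the national sample scoring strictly below `score`."""
    below = sum(1 for s in national_scores if s < score)
    return 100.0 * below / len(national_scores)

def indicative_band(score, national_scores):
    """Top 10% -> likely A/A*, bottom 10% -> likely F/G (as in the text)."""
    rank = percentile_rank(score, national_scores)
    if rank >= 90:
        return "likely A/A*"
    if rank <= 10:
        return "likely F/G"
    return "no strong indication"

sample = [12, 15, 18, 21, 24, 27, 30, 33, 36, 39]  # made-up scores
print(indicative_band(39, sample))  # pupil scoring above 90% of the sample
```

In practice the aggregated sample would come from all participating schools, which is exactly why broad participation matters: the bigger the sample, the more the percentile rank means something nationally.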
Precision, accuracy and validity
How accurate will it be? That depends on how many people take part and how well we design the tests. We can certainly improve the tests once a lot of people have taken them and we have had time to analyse the results; the data collected from the tests can be used to improve them, which is an elegant dimension. More people gives a more representative sample, which is another reason for making it free to encourage participation. It certainly won't be perfect, but a focus on perfection tends to be misguided in any case. The question is: overall, is it better than the alternatives? While a lot of effort went into the fine adjustment of multi-level criteria, splitting them into e.g. 5a, 5b, 5c, there is really little evidence that that precision was justified given the spread of possible interpretations. If I took a random sample of 100 pieces of work assessed as level 5a by teachers around the country, removed the assigned levels and gave the work to a random sample of 100 different teachers, how likely is it that they would all come back with the same levels? As far as I am aware, no-one has done that exercise, but I certainly would not want to bet my house on the outcome; the cost of such an exercise is probably why it was never done. Now, we might not have any better precision with the baseline testing model to start with, but we can fairly easily measure the precision from statistical data. If small adjustments that do not carry a big opportunity cost in teaching will make a significant difference, let's do them; if small gains mean disrupting teaching, let's not.
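As a concrete example of "measuring the precision from statistical data", here is a minimal sketch, assuming a simple normal approximation to a binomial proportion, of how the uncertainty on an estimated percentile rank shrinks as more pupils contribute results. The function name and figures are illustrative, not part of any actual system.

```python
# Rough sketch: the standard error of an estimated percentile rank p
# (treated as a proportion) from a sample of n results is
# sqrt(p * (1 - p) / n), so quadrupling participation halves the
# uncertainty. This is the sense in which more participants make the
# national sample more precise.

import math

def percentile_margin(p, n, z=1.96):
    """Approximate 95% margin (in percentage points) on a percentile
    estimate p (0-100) from n results, using the normal approximation."""
    prop = p / 100.0
    return 100.0 * z * math.sqrt(prop * (1 - prop) / n)

for n in (100, 1000, 10000):
    # margin falls from roughly 5.9 to roughly 0.6 percentage points
    print(n, round(percentile_margin(90, n), 2))
```

The point is not the particular formula but that, unlike levels-based judgements, this kind of precision is directly measurable from the data the tests generate.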
Don't lose sight of why we are here
In the end the purpose is to teach kids better. We should never lose sight of the opportunity cost of bureaucratic systems in the context of that overall desired outcome. Bureaucracy is not rigour: although some of it is necessary, any more than is necessary detracts from standards rather than raising them. Criterion referencing is a good idea up to a point. It provides a rationale for baseline competence, e.g. in practical activities, and for tracking progress through breadth of experience, and it is a logical basis for constructing tests and exams. It is not a good means for deciding fine levels of performance across a wide range of content, because all descriptive criteria are open to interpretation. For this reason we are not throwing the baby out with the bathwater. Direct matching of criteria to performance has a place at broad levels such as EQF levels 1, 2 and 3, but it is not sufficient on its own to grade acquired knowledge and understanding, to determine potential or readiness for study in higher education, for example. Controlled exams focused on knowledge and understanding are vital too, grading relative performance efficiently and with some precision. Broadly, criteria matching lends itself to formative assessment, so we are providing optional but flexible, free, cloud-based tools for managing that process in as much or as little detail as the teacher decides is appropriate in their circumstances. The baseline testing can quickly and simply inform the teacher whether their judgements against the criteria are reasonable, or whether they seem too harsh or too lenient. This is using two different assessment methods intelligently to get the best from each, rather than assuming they are mutually exclusive.
The explanation is more complex than the implementation. In summary, all we need to do is:
- Track progress through a set of competence statements as pupils go through the programmes of study. Can they provide plausible evidence that they can cope with the content at broadly the right level?
- Give them a test of their knowledge and understanding that positions them amongst their peers at that time. Do they progress faster or slower than their peers over time?
That's it. If you want to get into self-assessment, peer assessment, IEPs or any other formative activities, the tools are there to do it. They will build a practical e-portfolio for each pupil and provide the evidence needed for coursework for formal qualifications. It is up to the professional teacher to decide, in their context, how far that complexity helps or hinders their teaching in supporting the learning outcomes for their pupils.