Push to have robots mark school tests under fire from prominent US academic
A push to have robots mark English tasks in NAPLAN testing has come under attack, with a prominent US academic calling for a halt to the plans, claiming there are major flaws.
- Academic says assessing "artistic" uses of writing beyond automated systems
- Assessment authority defends system, says it's "as reliable" as human marking
- Teachers Union "seriously concerned" about "computer robot marking"
From next year, NAPLAN writing tasks will be marked by an automated essay scoring system. They will also be double-marked by a teacher.
It is part of a plan to introduce fully automated marking and testing by 2020.
The proposal has outraged teachers' unions, who argue it is impossible for a robot to score the subjective aspects of writing.
Now a US academic has joined the fray, warning Australian educators to proceed with extreme caution.
Les Perelman, a retired professor from the prestigious Massachusetts Institute of Technology (MIT), said it would disadvantage many students because the algorithms used tend to reward "verbose gibberish" and give higher marks to essays that use complex words and sentences.
Dr Perelman, a former director of writing at MIT, has published widely on writing assessment. He was commissioned by the NSW Teachers Federation to review a 2015 paper by the Australian Curriculum, Assessment and Reporting Authority (ACARA), which concluded that automated essay scoring was as effective as human markers, if not more so.
The US academic criticised the document, saying it relies on "hearsay evidence, extremely dubious methodology and incorrect information".
"The flaws in the report are so major that it cannot justify any use of AES in high-testing situations," Dr Perelman said in his report.
"Assessment of creativity, poetry, irony or other more artistic uses of writing is beyond such systems."
Automated marking 'as reliable' as humans, ACARA says
Stanley Rabinowitz, the general manager of assessment and reporting at ACARA, said he had not yet seen Dr Perelman's report but rejected its criticism.
"If we were testing irony and humour I would be just as sceptical," he said, defending the automated system.
"We're not saying it could mark anything, we're not saying it could mark a university paper or a very short answer, but for the narrative and persuasive prompts that we have using the NAPLAN writing rubric, it works as well as human markers."
Dr Perelman is a prominent critic of robot marking in the US and is partly responsible for the winding back of automated essay scoring there.
In 2014, with MIT postgraduate students, he created a program called the Babel Generator, which generates essays, based on three keyword prompts, that computers mark highly even though they read as complete nonsense.
Dr Rabinowitz said ACARA's 2015 paper was well thought-out and evidence-based, and that the authority had additional research to be released next month that would further justify automated marking.
"ACARA's extensive research indicates automated marking is as reliable and valid as human marking," Dr Rabinowitz said.
"There is a significant body of independent international research supporting the validity and reliability of automated marking."
Union 'seriously concerned' about NAPLAN scoring
NSW Teachers Federation acting president Gary Zadkovich called on ACARA to suspend its plan to introduce automated essay scoring.
"We're seriously concerned about the proposal to move to computer robot marking of extended prose by students," Mr Zadkovich said.
"No machine has the capacity to assess creative flair, imaginative use of language, humour, irony, tone, and deliberate repetition.
"We believe that parents expect their students to be taught by teachers, and similarly they expect those teachers to assess their students' work."
The education company Pearson Australia holds contracts to administer NAPLAN testing in most states, and automated essay scoring systems would be used to mark tests.
Mr Zadkovich said the union was concerned that the move towards robot marking would further entrench the influence of profit-making companies in public education.
The NSW English Teachers Association said it did not endorse automated marking outright, and wanted to see firm controls in place.
"If it is applied to extended prose, particularly creative writing, we would like to see that computer marking is tested against human marking, and continually so, to ensure that the programming behind the computer is sufficiently sophisticated, allowing for nuances of expression and creative expression that's so valued in English," said the association's executive officer Eva Gold.