Machine-generated nonsense might score better than you on the GMAT

Robots, like this life-size, humanoid robot from the CeBIT computer technology trade fair, have gotten pretty advanced. But can robo-graders effectively evaluate humans’ writing?
John MacDougall/AFP via Getty Images
Libby Nelson
Libby Nelson was Vox’s editorial director, politics and policy, leading coverage of how government action and inaction shape American life. Libby has more than a decade of policy journalism experience, including at Inside Higher Ed and Politico. She joined Vox in 2014.

A computer-generated essay — which included the line “humankind will always subjugate privateness” — scored 5.4 points out of 6 when graded by the same software used to score essays for the Graduate Management Admission Test.

The writing program, created by Les Perelman, MIT’s former director of undergraduate writing, can generate an essay in under a second from up to three keywords. It’s meant as a theatrical critique of grading essays by machine, writes Steve Kolowich in the Chronicle of Higher Education:

Mr. Perelman’s fundamental problem with essay-grading automatons, he explains, is that they “are not measuring any of the real constructs that have to do with writing.” They cannot read meaning, and they cannot check facts. More to the point, they cannot tell gibberish from lucid writing.

Humans still help grade GMAT essays: each essay receives one human score and one computer score. When the human and the computer disagree, as they should in the case of Perelman’s sample essay, an additional human scorer is brought in to resolve the dispute.

But computer scoring is generally on the rise, helped along by a high-profile study (which Perelman disputes) showing that machines can grade writing about as well as humans.

One of the groups writing Common Core standardized tests hopes to use machines, sometimes called robo-graders, alongside humans when scoring essays on the tests next year. The tests are currently being tried out in classrooms; one goal of the field test is calibrating the software so it can score students’ writing.
