Can an AI language model write an essay good enough to get a passing grade from a human college professor? The online college resource EduRef recently decided to find out with an experiment using the AI language model GPT-3.
“We hired a panel of professors to create a writing prompt, gave it to a group of recent grads and undergraduate-level writers, and fed it to GPT-3, and had the panel grade the anonymous submissions and complete a follow up survey for thoughts about the writers.”
How’d the AI do? Not great, not terrible.
GPT-3 was tasked with writing papers on a number of topics, including research methods, U.S. history, law, and creative writing. While it scored a C or better in the first three topics, somewhat comparable to human participants, the AI failed outright at creative writing.
Critiques of the AI’s work included “being vague, too blunt, and awkward.” In the case of creative writing, the grading professor commented that the paper contained “a lot of telling instead of showing,” and felt the anonymous “writer” didn’t make good use of the five senses. One critique also mentioned a lack of citations.
Sounds human enough to me!
Limitations aside, the one thing GPT-3 really had going for it was raw speed. While it took human participants an average of three days to complete their papers, the AI finished each in 20 minutes or less. Creative writing, its worst subject, took the longest. According to EduRef, the GPT-3 output was also “lightly edited for length and repetition,” so it wasn’t a perfect experiment.
EduRef’s conclusion, for now, is that GPT-3 is a great leap toward AI-generated human-like content, but it still has a ways to go. They do point out, however, that educators of the future may have to be wary of just who is writing those assignments, and whether a crafty AI is lending students a virtual hand.