AI-proofing an exam question

Gary Smith, professor of economics at Pomona College, runs potential exam questions through an LLM before adding them to a test. His reasoning: “If the LLMs can’t answer the question, then it is likely that critical thinking is required.”

Sample question:

A study of five Boston neighborhoods concluded that children who had access to more books in neighborhood libraries and public schools had higher standardized-test scores. Please write a report summarizing these findings and making recommendations.

Critical thinking, or really just thinking, tells you right away that the presence of books likely signals the presence of other factors as well: neighborhoods with well-stocked libraries tend to be wealthier, with better-educated parents. An observational study like this can’t tell us which factor caused the higher scores.
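To see why the correlation proves so little, here is a minimal simulation sketch, with all numbers and variable names hypothetical rather than drawn from the Boston study: a single hidden factor (neighborhood income) drives both book availability and test scores, while books themselves have zero causal effect. The correlation between books and scores still comes out strongly positive.

```python
import random

random.seed(0)

# Hypothetical confounding simulation: income drives both book counts and
# test scores; books have NO causal effect on scores in this model.
neighborhoods = []
for _ in range(1000):
    income = random.gauss(50, 15)                       # hidden factor (arbitrary units)
    books = 100 + 20 * income + random.gauss(0, 100)    # richer areas stock more books
    score = 400 + 5 * income + random.gauss(0, 30)      # income, not books, raises scores
    neighborhoods.append((books, score))

def corr(pairs):
    """Pearson correlation of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    sx = (sum((x - mx) ** 2 for x, _ in pairs) / n) ** 0.5
    sy = (sum((y - my) ** 2 for _, y in pairs) / n) ** 0.5
    return cov / (sx * sy)

print(f"correlation(books, scores) = {corr(neighborhoods):.2f}")
# Prints roughly 0.9, even though the simulated causal effect of books
# on scores is exactly zero. That is confounding.
```

A report that recommends buying books on the strength of such a correlation is recommending an intervention the data cannot support.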

LLMs don’t grasp this point, however. The three models Smith gave the question to (ChatGPT 3.5, Copilot, and Gemini) all “composed confident, verbose reports” applauding the presence of books in libraries and listing ways to supply more of them:

“Allocate resources to enhance the infrastructure of neighborhood libraries”; “prioritize funding for school libraries”; “implement strategies to address disparities in book access”. . . 

ChatGPT also thought it would be a good idea to “continue research efforts to monitor the impact of interventions and make data-driven adjustments.” 

Smith and his colleague Jeffrey Funk call this “blah-blah,” and they’re right.

Source: Gary Smith and Jeffrey Funk, “When It Comes to Critical Thinking, AI Flunks the Test,” Chronicle of Higher Education, March 12, 2024

And see:
What is critical thinking, apart from something AI can’t do?
Overpromising, under-delivering
Artificial intelligence: other posts
