The term AI researchers use for the AI’s unreliability is “hallucination”:
He recently asked both LaMDA and ChatGPT to chat with him as if it were Mark Twain. When he asked LaMDA, it soon described a meeting between Twain and Levi Strauss, and said the writer had worked for the bluejeans mogul while living in San Francisco in the mid-1800s. It seemed true. But it was not. Twain and Strauss lived in San Francisco at the same time, but they never worked together.
Scientists call that problem “hallucination.” Much like a good storyteller, chatbots have a way of taking what they have learned and reshaping it into something new — with no regard for whether it is true.
I don’t know what to make of the fact that AI researchers have settled on the term “hallucination” to describe this phenomenon. I find it interesting.
Actually, I find it intriguing. “Hallucination” implies a form of “inaccuracy” far beyond a simple mistake or even “misinformation.” There’s a nightmarish, dystopian quality to the word, too.
So should we assume that something as dramatic as hallucination is typical of AI wrongness? I don’t have a term for what I’ve seen in the three ChatGPT papers I read this fall, but whatever you call it, it was a lot less dramatic than Mark Twain working for Levi Strauss in mid-1800s San Francisco.
We will see. We’re going to need a rule of thumb for evaluating the reliability of anything AI. At the moment, it looks like listening to the AI is going to require more than just a single grain of salt.