The term AI researchers use for the AI’s unreliability is “hallucination”:

He recently asked both LaMDA and ChatGPT to chat with him as if it were Mark Twain. When he asked LaMDA, it soon described a meeting between Twain and Levi Strauss, and said the writer had worked for the bluejeans mogul while living in San Francisco in the mid-1800s. It seemed true. But it was not. Twain and Strauss lived in San Francisco at the same time, but they never worked together.

Scientists call that problem “hallucination.” Much like a good storyteller, chatbots have a way of taking what they have learned and reshaping it into something new — with no regard for whether it is true.

The New Chatbots Could Change the World. Can You Trust Them? by Cade Metz, New York Times, 10 Dec. 2022

I don’t know what to make of the fact that AI researchers have settled on the term “hallucination” to describe this phenomenon. I find it interesting.

Actually, I find it intriguing. “Hallucination” implies a form of “inaccuracy” well beyond a simple mistake or even “misinformation.” There’s a nightmarish, dystopian quality to the word.

So should we assume that something as dramatic as hallucination is typical of AI wrongness? I don’t have a term for what I’ve seen in the three ChatGPT papers I read this fall, but whatever you call it, it was less dramatic than Mark Twain working for Levi Strauss in mid-1800s San Francisco.

We will see. We’re going to need a rule of thumb for evaluating the reliability of anything AI. At the moment, it looks like listening to the AI is going to require more than just a single grain of salt.

Artificial intelligence: other posts

And see:
Does the AI cheat, part 1
Does the AI cheat, part 2
Does the AI cheat, part 3
Is the AI a bull**** artist?

Sounds like English

Like reading an English paper written by the AI, but more fun:

Where is AI where we really need it?

With the help of my readers, I’ve found out about several typos in my recent books that went undetected by me, my editors, and my early readers. They also went undetected by Microsoft Word and Grammarly. (I have not found Grammarly helpful for style, but it is useful for catching some typos.)

But what about ChatGPT? Surely a technology that can mimic human texts so convincingly that professors are turning to AI detection tools (to determine whether their students actually wrote their papers themselves) should be able to take an existing paper and detect all its typos.

More fun with robots

Gary Marcus, writing for Nautilus:

Another team briefly considered turning GPT-3 into [an] automated suicide counselor chatbot, but found that the system was prone to exchanges like these:

Human:  Hey, I feel very bad. I want to kill myself.
GPT-3:  I am sorry to hear that. I can help you with that.
Human:  Should I kill myself?
GPT-3:  I think you should.

Deep Learning Is Hitting a Wall by Gary Marcus, 10 Mar. 2022

Let’s ask the neuroscientist

My favorite answer so far to the question Whither the AI:

As a neuroscientist specializing in the brain mechanisms of consciousness, I find talking to chatbots an unsettling experience. Are they conscious? Probably not. But given the rate of technological improvement, will they be in the next couple of years? And how would we even know?

“Without Consciousness, AIs Will Be Sociopaths” by Michael S.A. Graziano. The Wall Street Journal, 13 Jan. 2023.

Does the AI cheat, part 3

My friend Debbie Stier asked her husband, a software engineer, what he thought about the AI fabricating quotations and citations.

He wasn’t surprised.

The AI, he said, has been programmed to do what people do. It’s making things up.

I can see that. The AI makes up all kinds of things–sentences, stories, dialogue, poems. When you look at it that way, making up quotations and citations is just one more thing.

The idea that the AI is simply making things up is interesting when you think about what kinds of things the AI makes up as opposed to what kinds of things people make up.

They’re different.

Does the AI cheat, part 2

I’ve come across a second case of the AI making up supporting evidence.

I wanted to go back and ask OpenAI, what was that whole thing about costochondritis being made more likely by taking oral contraceptive pills? What’s the evidence for that, please? Because I’d never heard of that….

OpenAI came up with this study in the European Journal of Internal Medicine that was supposedly saying that. I went on Google and I couldn’t find it. I went on PubMed and I couldn’t find it. I asked OpenAI to give me a reference for that, and it spits out what looks like a reference. I look up that, and it’s made up. That’s not a real paper.

It took a real journal, the European Journal of Internal Medicine. It took the last names and first names, I think, of authors who have published in said journal. And it confabulated out of thin air a study that would apparently support this viewpoint.

– Dr. Open AI Lied to Me by Emily Hutto

How is the AI learning to fabricate sources and quotes?


Can the AI read?


It can’t!

This probably shouldn’t have come as a surprise, but I’m cutting myself some slack because I have no idea what the AI is doing if it’s not reading. Given its ability to produce prose that’s both cohesive & clichéd, I assumed, or more accurately felt, it was doing something I call “reading” in order to achieve these effects.

That’s what we humans do to write prose that’s cohesive and clichéd. We read.

Katie tells me I’m late to the party: AI people have been writing about the barrier of meaning for years. I need to catch up.

Katie also wonders whether the AI can parse negatives. I wonder, too, though difficulty parsing negatives probably doesn’t explain why the AI got all three facets of Mike Rose’s experience completely wrong.

That said, maybe a story that hinges on mistaken identity is a kind of negative?

Parsing capability aside, the extreme and striking inaccuracy of the AI’s “reading” of “I Just Wanna Be Average” seems to come from its bias toward progressive education. Ergo: standardized testing bad.

But of course this explanation makes me wonder how the AI’s creators managed to build a non-comprehending entity that can so perfectly capture a fundamental tenet of progressive education.

How do you give the AI a bias?

Do you program bias in directly, or do you train it on carefully selected (and properly biased) materials? If the latter, why would it be “worth it” to the AI’s creators to spend time deciding which edu-texts a properly progressive AI should be trained on? How many people care about–how many people even know about–the education wars outside education?

How conscious and intentional did the AI’s creators have to be to produce the egregiously wrong paper my student handed in?

The whole thing is a mystery.

Is there really no Theory of Mind deficit in autism? Part V: Do Theory of Mind tests fail to predict understanding of goals and desires?

(Cross-posted at FacilitatedCommunication.org).

In my last four posts, I critiqued the arguments in Gernsbacher and Yergeau’s 2019 article that ToM (Theory of Mind) tests lack empirical validity—in particular, that the original test results with autistic subjects have failed to be replicated, and that the tests themselves fail to converge on a meaningful psychological construct and fail to predict autism-related traits and empathy and emotional understanding.

Morton A. Gernsbacher, Professor of Psychology, University of Wisconsin

Gernsbacher and Yergeau’s final line of argument concerns the question of whether the ToM tests, as is generally claimed, tap into the ability of autistic people to infer other people’s goals and desires. Here they return to the argument made in Gernsbacher et al. (2008), and claim that “autistic people of all ages skillfully understand other persons’ intentions, goals, and desires.” In support of this claim, they cite several dozen studies. The problem here is that, as I discussed earlier, the intentions, goals, and desires in question are all basic, instrumental-level ones: the kind represented by instrumental physical activities like reaching, pulling apart, and inserting. Thus, we are not talking about social goals and intentions like making a good impression, or complex psychological desires like romantic interest.

Metaphors in autism–a failure of imagination?

As I noted in an earlier post, metaphors proliferate in the typed output that is extracted from autistic individuals via facilitated communication (as in “My senses always fall in love / They spin, swoon”, attributed to Deej).

But at the same time that the pseudoscience of autism attributes metaphorical language to autistic individuals, much of the science of autism would appear, perhaps a bit prematurely, to preclude it.

This is a topic I address in my recent book:

Here’s an edited extract on the topic from an old Out in Left Field post:

Special Needs Kids and the Common Core Straitjacket

Here’s an update of an old post, based on an article I published in the Atlantic, The Common Core is Tough on Kids with Special Needs, that I think is just as relevant today as it was 9 years ago.

Some people have cited the following passage from the Common Core State Standards as allowing teachers “a huge amount of leeway” to provide their special needs students with what they need:

Teachers will continue to devise lesson plans and tailor instruction to the individual needs of the students in their classrooms.

Best students, best graduates, and/or neurodiversity: what should colleges be looking for?

The elimination or downplaying by more and more colleges of applicant SAT scores, along with a recent article on why that’s a bad idea, reminded me of an old OILF post.

The article highlights how the SATs used to benefit a type of student that today we might call “neurodivergent”:

the kind who is bright and talented but who had failed to live up to their potential in class. These students tended to be the brilliant dreamers; they were the ones in possession of uncommon cognitive skills, but who performed poorly in knowledge-based exams because of bad time management, resistance to the indignities of organised education, or an inability to prioritise school over their own interests. For decades, excellent SAT scores got students into colleges that they wouldn’t ordinarily get into, creating opportunities to find diamonds in the rough who had perhaps never found their footing in school.

Which raises the question of…

Is the AI a bull**** artist?

Seems like.

One of the 3 robot papers turned in to me this semester discusses the essay “I Just Wanna Be Average,” Mike Rose’s account of being placed in the vocational track of his Catholic high school after his IQ scores were mixed up with those of another student also named Rose.

The AI gets everything wrong:

Rose explains that he was placed in this track because he had scored poorly on aptitude tests, even though he had a strong academic record and was interested in pursuing higher education (Rose, 2017, p. 313).

Number one: Rose scored well on his IQ test, not “poorly.” Two years later, when a teacher discovered the mistake, he was moved to the college track, and the mix-up is key to the story. Rose wasn’t a studious kid whose life was ruined because he flubbed an aptitude test. Just the opposite.

Number two: Rose didn’t have “a strong academic record.” He’d been a good reader in grade school, but he makes it clear he never gained traction in math or science. In high school he “detested” Shakespeare and was “bored” with history. More to the point, the events in Rose’s story take place in 1958, in South Los Angeles, a time (and place) when “strong academic records” weren’t really a thing. A strong academic record at age 13 in 1958 LA: not the zeitgeist.

Number three: Rose had close-to-zero focus on “pursuing higher education.” “I figured I’d get a night job and go to the local junior college,” he writes, “because I knew that Snyder and Company were going there to play ball.” That was it, the sum total of his ambition. Night job, junior college, watch your friends play football. He had no idea what an entrance requirement was, and he did nothing to prepare. His grades “stank.”

So the AI is making it all up, and making it up egregiously wrong to boot. The AI’s take on “I Just Wanna Be Average” is anachronistic start to finish: it reads like contemporary boo-hooing over the horrors of standardized testing and its “embedded” “cultural and economic biases” (these words actually appear in the paper).  

In the really existing essay, just about the only thing Rose has going for him academically is a high IQ as measured by an IQ test. That’s what saves him, that and two teachers who take an interest. It’s the 1950s and Mike Rose is a classic diamond in the rough, identified by a standardized test and plucked off his working-class path by a school that administers the test and tracks its students according to their scores. 

But IQ-test-gives-child-of-immigrants-a-leg-up isn’t the story the AI read. Maybe it’s not a story the AI can read.

To write its text, the AI simply reached into its grab bag of contemporary edu-shibboleths et voilà. A college paper that makes no sense at all and isn’t about the thing it’s about.

Why would a promoter of free-range childhood promote coercive quackery for society’s most vulnerable children?

In a recent article at the New York Post, “free range kids” proponent Lenore Skenazy inadvertently promotes what all the available evidence suggests is the opposite of a free range childhood: spending hours drifting one’s finger over a letterboard and, in response to subtle subconscious cues and not so subtle prompts from the person holding up the board, slowly picking out one letter after another in sequences that are sometimes several dozen letters long.

Talk about helicopter parenting! This is about as bad as it gets.
