Theory of Mind deficit in autism? Part I: is it all about language instead?

In this post and five subsequent posts, I review the final article in my series on Morton Gernsbacher’s FC-friendly articles on the nature of autism. In this last article, Empirical Failures of the Claim That Autistic People Lack a Theory of Mind, Gernsbacher and Yergeau (2019) go further than any of Gernsbacher’s previous articles in making the case against autism as a socio-perceptual, socio-cognitive disorder. In particular, Gernsbacher and Yergeau claim that the original studies that showed Theory of Mind deficits in autism have failed to replicate and have been overturned by later studies.

Morton A. Gernsbacher, Professor of Psychology, University of Wisconsin

Gernsbacher and Yergeau open with a straw-man characterization of what Theory of Mind proponents purportedly have said: “The assertion that autistic people lack a theory of mind—that they fail to understand that other people have a mind or that they themselves have a mind—pervades psychology”. The more common claim, rather, is that autism involves some degree of deficit in the recognition/awareness of emotions in other people (e.g., attending to and recognizing facial expressions, or deducing specific emotions from behavior), combined with some degree of deficit in socio-cognitive perspective taking: in figuring out how to respond appropriately to another person’s emotional needs; in deducing the belief set of another person when those beliefs conflict with one’s own.

Of course, Gernsbacher and Yergeau’s argument depends, to some extent, on what exactly is meant by Theory of Mind (henceforth, ToM). ToM has traditionally been defined largely by the long-standing tests used to assess it: false-belief tests like the Sally-Anne test and the Smarties test; Happé’s Strange Stories; Baron-Cohen’s Reading the Mind in the Eyes test (which we’ll call the “Eyes test” for short); and the Animated Triangles test. Here is a quick summary of the key tests:

  • The Strange Stories Test measures the ability to deduce the social/emotional reasons for characters’ behaviors in a narrative sequence.
  • The Eyes Test measures the ability to recognize emotions from facial information that is restricted to a rectangular region around the eyes.
  • The Animated Triangles Test measures the ability to ascribe emotions to self-propelled, interacting shapes in a short animation.
  • The false-belief tests measure the ability to make inferences about the beliefs held by (or about) individuals who are missing key pieces of information. Two common examples are:
    • The Sally-Anne “unexpected location change” test, in which Sally doesn’t witness Anne moving her marble from a basket to a box
    • The Smarties “unexpected contents” test, in which a candy box contains a pencil rather than candy

False-belief tasks are the most cognitively complex of the ToM tasks: they measure what we can call “cognitive perspective taking”—the ability to put oneself in someone else’s cognitive shoes and deduce their state of knowledge or beliefs. Gernsbacher and Yergeau begin their article by reiterating one of the arguments made in Gernsbacher and Frymiare (2005) about how failure in such tests correlates with language development rather than with autism. Typically developing children need to reach a certain level of language development to pass false-belief tests, and children with non-autism related language delays—e.g., deafness—are also delayed in passing false-belief tests. But Gernsbacher and Frymiare (2005), as I discuss in my *earlier post*, do not address the fact that autistic individuals need to reach a much higher language level than other groups do to pass these tests. This fact suggests that those autistic individuals who do pass false-belief tests may be doing so through an atypical mechanism—i.e., a more deliberative reasoning out, or “hacking out” through language.

Gernsbacher and Yergeau, on the other hand, do attempt to address the “hacking out” hypothesis. Their counterargument is that it is hard to reconcile the “hacking out through language” strategy with the language impairments common to autism. They ask: “How and why would autistic people preferentially use language to ‘hack out’ the answers while non-autistic people, without communication impairments, do not?” The answer, of course, is that language, given the challenges of autism, may nonetheless still be autistic people’s best option—assuming, of course, that they reach the requisite level of linguistic mastery. Moreover, the fact that the level of linguistic mastery required for autistic people to pass false-belief tests is considerably higher than that required for non-autistic individuals suggests that, when autistic individuals pass false-belief tests, language is doing more work—i.e., playing a bigger role—than it does for non-autistic people.

Gernsbacher and Yergeau also reiterate Gernsbacher and Frymiare’s (2005) arguments about the complex language involved in false-belief tests and how, therefore, failing to pass false-belief tasks may merely reflect language difficulties. However, as I noted in reference to the 2005 paper, there are non-verbal versions of the false-belief tests (also called “implicit” false-belief tests) that rely only on anticipatory gazing (e.g., towards where Sally will look for her marble) as opposed to answering a complex question (e.g., “Where does Sally think her marble is?”). One implicit false-belief experiment is discussed in Senju et al. (2009). They find that the autistic participants, unlike the non-autistic participants, failed to look anticipatorily at the location where Sally would look for her marble. Indeed, this was the case even with high-functioning adults with Asperger’s who (as would be expected from their verbal mental ages) were able to pass the standard (verbal) ToM tests, including the Sally-Anne task and a second-order ToM test called the “ice cream test”.

Instead of discussing these eye-tracking versions of the false-belief tests, Gernsbacher and Frymiare cite, as their only example of nonverbal false-belief tasks, the picture-sequencing experiment of Baron-Cohen et al. (1986). This experiment compared autistic and non-autistic performance on sequencing pictures into three kinds of sequences: mechanical sequences (e.g., a balloon rising into the sky and then being punctured by a tree branch), behavioral sequences (e.g., a child entering a store and exchanging money for merchandise), and “intentional”/ToM sequences (e.g., a child putting some candy in the box, leaving the room, and then, after someone else opens the box and eats the candy, returning, re-opening the box, and looking disappointed). Regarding Baron-Cohen et al.’s finding that the autistic participants performed worse than typically developing participants on the “intentional” sequences, Gernsbacher and Yergeau say:

Four research teams, of whom we are aware, have published attempts to directly replicate these results—and none could do so. Using the same stimuli, procedures, and analyses, no other research team has replicated the finding that autistic participants perform significantly worse than typically developing participants on the “intentional” picture sequences.

The studies cited by Gernsbacher and Yergeau here are Ozonoff et al. (1991), Oswald and Ollendick (1989), Buitelaar et al. (1999), and Brent et al. (2004). And while all of them do indeed fail to replicate Baron-Cohen et al.’s results, some of their other findings do suggest that ToM deficits nonetheless characterize autism.

For example, Ozonoff et al. found “significantly poorer emotion perception abilities in this sample of autistic individuals.” They also found that a subset of the autistic group (as opposed to the Asperger’s group) performed significantly less well on the first order ToM test (the Smarties test) and that this performance correlated with the emotion perception scores.

Brent et al., meanwhile, found that the autistic participants “performed worse than typically developing controls on the mentalizing Strange Stories task and the Eyes task.” Their conclusion: “children with ASD are impaired relative to typically developing children in advanced theory of mind abilities.”

As some of these authors acknowledge, the picture sequencing tasks may not accurately capture ToM deficits. This is not the same thing as finding that ToM deficits do not exist.

Despite Ozonoff et al.’s replication of relative difficulty for autistic people on first-order false-belief tests, Gernsbacher and Yergeau claim that “Baron-Cohen, Leslie, and Frith (1985)’s seminal study reporting that autistic participants are prone to fail first-order False Belief tasks… is also prone to fail replication.” Here they cite Dahlgren & Trillingsgaard (1996); Russell & Hill (2001); Oswald & Ollendick (1989); Fitzpatrick et al. (2013); Yirmiya & Shulman (1996); Yirmiya et al. (1998); and Moran et al. (2011).

To the extent that these studies fail to replicate an autism-specific tendency to fail false-belief tasks, it is largely because of the linguistic confounds discussed above: all children fail the standard false-belief tests before they achieve a certain language level, and all children pass the standard false-belief tests above a certain language level. What’s specific to autism, as we discussed, is the need to achieve a significantly higher level of linguistic functioning than in other groups in order to pass explicit false-belief tests. Furthermore, Buitelaar et al. (1999), cited elsewhere in Gernsbacher and Yergeau’s piece, found that performance on first-order false-belief tests correlated with autistic traits as measured by a standard measure of autism symptomatology (the CARS).

Gernsbacher and Yergeau then turn to second-order false-belief tasks. These are tasks that involve calculating what one character will believe about another character’s (faulty) belief (“Where does Anne think that Sally will look for her marble?”). Here they point to studies showing that autistic participants are no more prone to fail second-order false-belief tasks than other populations are, a fact that is indeed borne out by the various studies. However, there are some key things to keep in mind. First, since passing a first-order false-belief test is a prerequisite for passing a second-order false-belief test, these studies only include individuals who pass first-order false-belief tests. In autism, as we discussed, passing first-order false-belief tests means having achieved a significantly higher verbal mental age than in other groups. And what the studies (e.g., Tager-Flusberg & Sullivan, 1994) propose is that, at this point, challenges other than autism-specific ones like mentalizing matter most. Prime candidates are executive functioning (EF) challenges like working memory and cognitive load/processing demands.

In addition, the ability to pass second-order false-belief tests does not rule out ToM difficulties in the real world. For example, Leekam & Prior (1994), despite finding that the autistic participants were no more prone than other groups to fail second-order false-belief tests and could “make appropriate social judgements about lies and jokes”, also found difficulties, as per parent reports, in real-world ToM skills. These reports turned up difficulty both with lying, in particular white lies, and with joking, and here there was no difference between second-order “passers” and “failers.” Overall, Leekam & Prior’s results support the conclusion that “theory of mind development is delayed relative to the mental age of normal children”.

Similarly, Bauminger & Kasari (1999) report that, despite the similar performance by autistic and non-autistic participants on second-order false-belief tests, “the autistic children, as a group, gave more irrelevant or wrong responses when asked to explain their answers to the belief question.” (Such explanations are not generally part of standard false-belief tests). Such performance, they note, suggests the ability to pass second-order false-belief tests does not rule out deficits in social understanding. They cite similar findings by Ozonoff & Miller (1995) and others. As Bauminger & Kasari put it, “Even very able children with autism, or children with Asperger syndrome, have difficulties in applying theory of mind when faced with a real social situation.”

One theory of mind task that is intended to measure real-world skills is Happé’s Strange Stories test, which consists of stories that involve real-world mentalizing situations. But here, too, Gernsbacher and Yergeau argue that there has been a failure to replicate: specifically, a failure to replicate Happé’s (1994) finding that autistic participants who pass first- or second-order False Belief tests nonetheless fail the Strange Stories test.

But once again there is the confounding role of language. The Strange Stories, by their very nature, are grounded in language and involve much more verbiage than false-belief tests. Even if the syntax isn’t as complex, there is a greater range of vocabulary words. Consistent with this, Scheeren et al. found that age and verbal abilities (as well as general reasoning abilities), not autism diagnosis, were the determining factors for success with Strange Stories. Consistent, in turn, with Scheeren et al., those studies that found no deficiencies with Strange Stories involved adults with high-functioning autism or Asperger’s (Senju et al., 2009; Schuwerk et al., 2018; Ponnet et al., 2004).

As for younger children, Gillott et al. (2004) report that, despite similarities in overall performance between autistic and non-autistic groups, the former gave significantly more inappropriate mental state answers (e.g., angry instead of sad) to the Strange Stories questions. This suggests, Gillott et al. propose, that many children with autism, despite “severe ToM difficulties”, had “learned to apply generalized compensatory strategies such as offering an explanation in terms of mental states to any question as to why a person acted in a particular way.”

Another study involving younger children, cited by Gernsbacher and Yergeau as failing to replicate Happé’s (1994) Strange Stories results, is White et al. (2009). While children with autism who showed ToM impairment on other tests performed “significantly more poorly” than controls on the Strange Stories tests, they also performed more poorly on counterparts in which the stories were based on animals. Perhaps this suggested to Gernsbacher and Yergeau, inasmuch as animals don’t have minds, that the issue was something other than ToM. White et al.’s explanation, rather, is that “a mentalizing [ToM] deficit may affect understanding of biologic agents even when this does not explicitly require understanding others’ mental states.”

Returning, now, to autistic adults, even those highly verbal autistic participants who performed well on the Strange Stories tests struggled with other ToM measures. Schuwerk et al. (2018), for example, report that such participants scored significantly lower on the Eyes test and also showed less accurate anticipatory looking in an eye-tracking-based implicit false-belief test akin to the one discussed above. Schneider et al. (2013) also found that, despite similar performance on explicit ToM tests, autistic individuals showed no evidence of anticipatory looking (“tracking”) in an implicit false-belief test, “even over a one-hour period and many trials.” They conclude that “the systems involved in implicit and explicit ToM are distinct” and that “impaired implicit false-belief tracking may play an important role in ASD”.

Furthermore, Murray et al. (2017) show that, when the Strange Stories are replaced with an animated counterpart in which language is less of a confounding factor, the differences between autistic and non-autistic performances increase, with the former showing more relative difficulty. As Murray et al. note,

Adults with ASD had lower scores, indicating difficulties with social cognition that could not be explained by general cognitive factors (e.g., verbal ability) and were specific to understanding the intentions behind nonliteral language in communication.

Ponnet et al.’s (2004) findings were similar. On the standard “static” versions of both the Strange Stories and Eyes tests, the adults with Asperger’s performed similarly to typically developing adults. But in the more naturalistic empathic accuracy task, in which participants attempted to infer a person’s thoughts and feelings in a videotape of that person in a naturally occurring conversation with another person, the individuals with Asperger’s performed significantly worse. Roeyers et al. (2001) obtained similar results in similar experiments with adults with pervasive developmental disorder (PDD). As they put it:

These findings suggest that the mind-reading deficit of a subgroup of able adults with PDD may only be apparent when a sufficiently complex naturalistic assessment method is being used.

One study cited by Gernsbacher and Yergeau as failing to replicate Happé’s (1994) Strange Stories results, Spek et al. (2010), in fact did replicate them. They found that HFA and Asperger’s adults “were impaired in performance of the Strange Stories test and the Faux-pas test [which taps the ability to detect social blunders] and reported more theory of mind problems than the neurotypical adults.” It was only on the Eyes test (which, as we noted earlier, tests the ability to infer specific emotions from the eye region of faces) that their performance was equivalent to that of the other groups.

As for the Eyes test, Gernsbacher and Yergeau note, correctly, that it involves sophisticated vocabulary (words like “envious” and “fantasizing”), though we should add that word definitions are often provided as part of the test. Gernsbacher and Yergeau then cite research, both on autistic individuals and on individuals with non-autistic language challenges, that shows that “the best predictor of Reading-the-Mind-in-the-Eyes” is vocabulary. But one of these studies still shows connections, if weaker, to cognitive empathy and emotion perception (Olderbak et al., 2015). In addition, as we’ve discussed earlier, vocabulary itself is grounded in social engagement: early vocabulary development is a function of frequency of responses to joint attention (RJA). Given how subsequent vocabulary builds on earlier vocabulary, it could be that vocabulary levels, even late in development, correlate with early (and even current) social engagement, the more so with words like “envious” and “fantasizing”. If so, then there may be a social variable that underlies both vocabulary, especially socio-emotional vocabulary, and performance on the Eyes test.

This blog post has gone on long enough. For those who’ve made it this far, here’s a quick recap:

  • Some ToM tests use language as a testing medium, and thus require a certain threshold level of linguistic comprehension. This can result in language skills being a bigger causal factor in test performance than mentalizing/perspective-taking skills are—except when we consider the causal role played by mentalizing/perspective-taking skills in the acquisition of those language skills.
  • Autistic individuals need to acquire significantly more language than non-autistic individuals do before they can pass false-belief tests. This suggests that autistic individuals are using language to reason through these tests in a more deliberate way than their non-autistic counterparts do.
  • Second-order false-belief tests (“Where does Anne think that Sally will look for her marble?”) appear to be primarily measuring something other than social reasoning skills (e.g., working memory and cognitive load/processing demands). But second-order false-belief tests are generally administered only to those who pass first-order false-belief tests: those, in other words, who have already reached a significant social-cognitive milestone.
  • Second-order false-belief tests aside, the patterns of performance on ToM tests observed in autistic vs. non-autistic individuals, including in the articles cited by Gernsbacher and Yergeau, consistently indicate ToM difficulties in autism.
  • The most impaired ToM skills are those involving
    • automatic (as opposed to deliberate) perspective-taking (e.g., automatic gaze shifting to where Sally will look for her marble)
    • socially naturalistic settings.


Bauminger, N., & Kasari, C. (1999). Brief report: Theory of mind in high-functioning children with autism. Journal of Autism and Developmental Disorders, 29(1), 81–86.

Baron-Cohen, S., Leslie, A. M., & Frith, U. (1985). Does the autistic child have a “theory of mind”? Cognition, 21(1), 37–46.

Baron-Cohen, S., Leslie, A. M., & Frith, U. (1986). Mechanical, behavioural and Intentional understanding of picture stories in autistic children. British Journal of Developmental Psychology, 4, 113–125.

Brent, E., Rios, P., Happé, F., & Charman, T. (2004). Performance of children with autism spectrum disorder on advanced theory of mind tasks. Autism, 8(3), 283–299 (p. 286).

Buitelaar, J. K., van der Wees, M., Swaab-Barneveld, H., & van der Gaag, R. J. (1999). Theory of mind and emotion-recognition functioning in autistic spectrum disorders and in psychiatric control and normal children. Development and Psychopathology, 11(1), 39–58.

Dahlgren, S. O., & Trillingsgaard, A. (1996). Theory of mind in non-retarded children with autism and Asperger’s syndrome: A research note. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 37, 759–763.

Fitzpatrick, P., Diorio, R., Richardson, M. J., & Schmidt, R. C. (2013). Dynamical methods for evaluating the time-dependent unfolding of social coordination in children with autism. Frontiers in Integrative Neuroscience, 7, Article 21.

Gernsbacher, M. A., & Frymiare, J. L. (2005). Does the autistic brain lack core modules? The Journal of Developmental and Learning Disorders, 9, 3–16.

Gernsbacher, M. A., & Yergeau, M. (2019). Empirical failures of the claim that autistic people lack a theory of mind. Archives of Scientific Psychology, 7(1), 102–118.

Gillott, A., Furniss, F., & Walter, A. (2004). Theory of mind ability in children with specific language impairment. Child Language Teaching and Therapy, 20, 1–11.

Happé, F. G. E. (1994). An advanced test of theory of mind: Understanding of story characters’ thoughts and feelings by able autistic, mentally handicapped, and normal children and adults. Journal of Autism and Developmental Disorders, 24, 129–154.

Leekam, S. R., & Prior, M. (1994). Can autistic children distinguish lies from jokes? A second look at second-order belief attribution. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 35, 901–915.

Moran, J. M., Young, L. L., Saxe, R., Lee, S. M., O’Young, D., Mavros, P. L., & Gabrieli, J. D. (2011). Impaired theory of mind for moral judgment in high-functioning autism. Proceedings of the National Academy of Sciences of the United States of America, 108, 2688–2692.

Murray, K., Johnston, K., Cunnane, H., Kerr, C., Spain, D., Gillan, N., . . . Happé, F. (2017). A new test of advanced theory of mind: The “Strange Stories Film Task” captures social processing differences in adults with autism spectrum disorders. Autism Research, 10, 1120–1132.

Olderbak, S., Wilhelm, O., Olaru, G., Geiger, M., Brenneman, M. W., & Roberts, R. D. (2015). A psychometric analysis of the reading the mind in the eyes test: Toward a brief form for research and applied settings. Frontiers in Psychology, 6, Article 1503.

Oswald, D. P., & Ollendick, T. H. (1989). Role taking and social competence in autism and mental retardation. Journal of Autism and Developmental Disorders, 19, 119–127.

Ozonoff, S., & Miller, J. N. (1995). Teaching theory of mind: A new approach to social skills training for individuals with autism. Journal of Autism and Developmental Disorders, 25(4), 415–433.

Ozonoff, S., Rogers, S. J., & Pennington, B. F. (1991). Asperger’s syndrome: Evidence of an empirical distinction from high-functioning autism. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 32, 1107–1122.

Ponnet, K., Buysse, A., Roeyers, H., & De Corte, K. (2005). Empathic accuracy in adults with a pervasive developmental disorder during an unstructured conversation with a typically developing stranger. Journal of Autism and Developmental Disorders, 35, 585–600.

Roeyers, H., Buysse, A., Ponnet, K., & Pichal, B. (2001). Advancing advanced mind-reading tests: Empathic accuracy in adults with a pervasive developmental disorder. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 42, 271–278.

Russell, J., & Hill, E. L. (2001). Action-monitoring and intention reporting in children with autism. Journal of Child Psychology and Psychiatry, and Allied Disciplines, 42, 317–328.

Schneider, D., Slaughter, V. P., Bayliss, A. P., & Dux, P. E. (2013). A temporally sustained implicit theory of mind deficit in autism spectrum disorders. Cognition, 129, 410–417.

Schuwerk, T., Priewasser, B., Sodian, B., & Perner, J. (2018). The robustness and generalizability of findings on spontaneous false belief sensitivity: A replication attempt. Royal Society Open Science, 5, Article 172273.

Senju, A., Southgate, V., White, S., & Frith, U. (2009). Mindblind eyes: An absence of spontaneous theory of mind in Asperger syndrome. Science, 325, 883–885.

Spek, A. A., Scholte, E. M., & Van Berckelaer-Onnes, I. A. (2010). Theory of mind in adults with HFA and Asperger syndrome. Journal of Autism and Developmental Disorders, 40, 280–289.

Tager-Flusberg, H., & Sullivan, K. (1994). A second look at second-order belief attribution in autism. Journal of Autism and Developmental Disorders, 24, 577–586.

White, S., Hill, E., Happé, F., & Frith, U. (2009). Revisiting the strange stories: Revealing mentalizing impairments in autism. Child Development, 80, 1097–1117.

Yirmiya, N., Erel, O., Shaked, M., & Solomonica-Levi, D. (1998). Meta-analyses comparing theory of mind abilities of individuals with autism, individuals with mental retardation, and normally developing individuals. Psychological Bulletin, 124, 283–307.

Yirmiya, N., & Shulman, C. (1996). Seriation, conservation, and theory of mind abilities in individuals with autism, individuals with mental retardation, and normally developing children. Child Development, 67, 2045–2059.
