Is diminished Joint Attention really not a problem for word learning in autism?


The standard view of autism as a socio-cognitive disorder accounts for the language delays/limitations in autism by:

  1. Observing that Joint Attention (JA) is diminished in autism and
  2. Pointing to causal connections between JA and word learning.

In my last post on the autism-related research of Morton Ann Gernsbacher, I critiqued Gernsbacher’s claim that JA is impaired in autism and concluded by saying that:

Diminished eye-contact and diminished JA have serious consequences for linguistic and socio-cognitive development. Where language is concerned, treating eyes as socially salient is crucial—especially for basic word learning.

But another article co-authored by Gernsbacher (Akhtar and Gernsbacher, 2007) calls into question some of these connections, arguing that JA is not necessary for word learning. Crucially, this argument depends on a definition of JA that is narrower than the standard one and narrower than the one used in the research that connects JA to word learning.

In this, my third installment of Gernsbacher critiques, I take a look at this argument and at this narrower view of JA.

Morton Ann Gernsbacher, Department of Psychology, University of Wisconsin-Madison

JA, Akhtar and Gernsbacher argue, involves “mutually coordinated attention,” such that JA applies only if both parties are consciously engaged in it. This definition, curiously, conflicts with the definition Gernsbacher uses a year later in the article I critiqued two weeks ago. (That later definition, also problematic, includes reflexive, as opposed to deliberate, JA behaviors: behaviors that, in autism, resemble attention to non-social stimuli like arrows.)

According to the standard definition of JA, in contrast, it’s enough that one person purposefully coordinates their focus of attention with that of another person. This typically involves what’s called “gaze following,” or turning one’s head and eyes in the same direction as someone else’s.

Not only is this the standard definition of JA, it is also the definition used in the various studies that find a role for JA in word learning. For example, in a study by Baron-Cohen et al. (1997), an experimenter presented each child participant with two unusual objects, gave one to the child, held the other, and waited until the child was looking at the child’s own toy. The experimenter then looked intently at the experimenter’s toy and uttered a novel word (e.g., “peri”). This procedure was repeated with a second pair of unusual objects. For a child to learn the made-up words, he/she needed to stop looking at his/her toy and instead look at the experimenter’s toy. Whether or not this shared focus of attention is “mutually coordinated” is beside the point.

The results of the Baron-Cohen et al. experiment established a connection between this not-necessarily-mutually-coordinated JA and autism, on the one hand, and word learning, on the other. When the participants were later presented with bags containing all four unusual objects and asked questions like “Can you find the peri?”, only 29.4% of the autistic children got the right answer, compared to 70.6% of the intellectually impaired children and 79% of non-autistic children of the same verbal mental age. Most of the time, the autistic children assumed that the word referred to the toy they had been holding rather than the one the experimenter was referring to. This indicates that they had not looked up and engaged in JA with the experimenter when she named the object she was holding.

Akhtar and Gernsbacher correctly observe that people, including children as young as two, can learn words as bystanders, i.e., when they are not actively engaged in a conversation. But being a bystander doesn’t prevent you from engaging in JA in the sense of purposefully coordinating your focus of attention with that of the person who is speaking. Whether that person is addressing a third party or speaking to themselves (as was ostensibly the case in the Baron-Cohen et al. experiment), bystander JA (1) is still possible and (2) opens up opportunities for word learning. Indeed, Akhtar and Gernsbacher cite an experiment by Akhtar (2005) that involved a variation on the Baron-Cohen et al. experiment in which children were given an engaging toy to play with while two adults interacted with other toys. In this experiment, the children (who, importantly, were typically developing rather than autistic) were able to learn the novel word.

Another reason that Akhtar and Gernsbacher give for claiming that JA is irrelevant to vocabulary acquisition in autism is that, they claim, the component of JA most affected by autism is initiating Joint Attention (IJA), as opposed to responding to Joint Attention (RJA). IJA occurs when an individual initiates a JA process by attempting to direct the other person to attend to the same thing they’re attending to. RJA occurs when someone responds to that person’s attempts, shifting their attention to the thing the other person is attending to. RJA, which Akhtar and Gernsbacher claim is relatively unimpaired in autism, is what most research implicates as the aspect of JA that is most relevant to vocabulary development.  

To support their claim that RJA is relatively unimpaired in autism, Akhtar and Gernsbacher cite Mundy (2006). But what Mundy states, more precisely, is this: RJA in autism, as measured by gaze following (turning your head in the same direction as someone else’s)

  1. Is initially quite impaired
  2. Improves after the first two years of life
  3. Reaches “adaptive” levels in many older children with autism and those with higher mental ages.

But the process of word learning, importantly, starts well before age 2, and it is in the earliest years of life that the fastest growth in vocabulary typically occurs. Therefore, to the extent that RJA is instrumental to word learning (as Baron-Cohen et al. show it to be), delays in reaching “adaptive” levels of RJA will result in commensurate delays in word learning.

In addition, IJA, Mundy argues, also gives rise to situations that are conducive to language learning. Impairments in IJA, therefore, further impede word learning in autism.

Akhtar and Gernsbacher cite Morgan et al. (2003) as reporting that JA and vocabulary development are uncorrelated in autism. But the JA behaviors Morgan et al. measured don’t include gaze following. Rather, Morgan et al. explored whether a child would look up at the researcher’s eyes when the researcher covered the child’s hands while the child tried to play with a toy, showed the child a toy and then withdrew it when the child reached for it, or deactivated a toy. This is not gaze following; it is something more akin to what’s called social referencing.

Akhtar and Gernsbacher go on to cite a number of additional studies that, they claim, have “failed to show a relation between vocabulary development and joint attention in autism.” But all the articles they cite (listed below) focus on IJA rather than RJA, and therefore don’t address what even Akhtar and Gernsbacher say is the aspect of JA most relevant to word learning:

  1. Mundy et al. (1988) look only at IJA (pointing, showing). They find that, in Down syndrome, there is a high rate of IJA independent of language development.
  2. Lord and Pickles (1996) also focus specifically on IJA (showing, directing attention) or on more general socially directed gaze (i.e., gaze at social stimuli like clapping). They found that the “main effects of the children’s language level” came from other factors rather than these.
  3. Stone and Yoder (2001) also looked at IJA (which was measured by parent survey) and didn’t measure vocabulary but expressive language more generally.
  4. Travis et al. (2001) also focused on IJA (because, as they reported, RJA measures were high for most of the participants in their study). They found language uncorrelated with IJA; they did not tease out vocabulary in particular.

Interestingly, early on in their article, Akhtar and Gernsbacher cite Mundy et al. (1990) as their one example of the “robust literature examining the joint attention skills of autistic children.” But Akhtar and Gernsbacher don’t discuss the details of this article, which doesn’t support their case. Mundy et al., though they mostly looked at IJA (“gestural joint attention,” as measured by pointing and sharing), also included head turns in response to the experimenter’s points (an instance of RJA that is highly connected to word learning). And what did Mundy et al. find?

They found—consistent with most of the research on Joint Attention and language—that these various Joint Attention skills collectively predict language learning.


Akhtar, N. (2005). Is joint attention necessary for early language learning? In B. D. Homer & C. S. Tamis-LeMonda (Eds.), The Development of Social Cognition and Communication (pp. 165–179). Erlbaum.

Baron-Cohen, S., Baldwin, D. A., & Crowson, M. (1997). Do children with autism use the speaker’s direction of gaze strategy to crack the code of language? Child Development, 68(1), 48–57.

Akhtar, N., & Gernsbacher, M. A. (2007). Joint Attention and Vocabulary Development: A Critical Look. Language and Linguistics Compass, 1(3), 195–207.

Lord, C., & Pickles, A. (1996). Language level and nonverbal social-communicative behaviors in autistic and language-delayed children. Journal of the American Academy of Child and Adolescent Psychiatry, 35(11), 1542–1550.

Morgan, B., Maybery, M., & Durkin, K. (2003). Weak central coherence, poor joint attention, and low verbal ability: independent deficits in early autism. Developmental Psychology, 39(4), 646–656.

Mundy, P. (2006). New Perspectives on the Role of Joint Attention in Language Development. Keynote address given to the 27th annual Symposium for Research on Child Language Disorders, Madison, WI.

Mundy, P., Sigman, M., Kasari, C., & Yirmiya, N. (1988). Nonverbal communication skills in Down syndrome children. Child Development, 59(1), 235–249.

Stone, W. L., & Yoder, P. J. (2001). Predicting spoken language level in children with autism spectrum disorders. Autism: The International Journal of Research and Practice, 5(4), 341–361.

Travis, L., Sigman, M., & Ruskin, E. (2001). Links between social understanding and social behavior in verbally able children with autism. Journal of Autism and Developmental Disorders, 31(2), 119–130.
