I spent my last three posts discussing the failure of artificial intelligence, even state-of-the-art AI, to understand what we say to it and what it says to us. But understanding, surely, isn’t everything. Real-world communications aside, how does AI fare with one of Catherine’s and my other core interests: linguistic tools for clear, coherent sentences?
Two elements of clear, coherent sentences are word choice and punctuation, and some of the more common word choice and punctuation errors have long been handled by basic word processors. In the last decade, however, more sophisticated tools have emerged: tools like ProWritingAid, Ginger, WhiteSmoke, and Grammarly. The last of these, with over 10 million users, is hard to miss unless you completely avoid YouTube. If you survey Grammarly’s Internet reviews, you get the sense that it is the most sophisticated, and most expensive, of the new tools. It’s also the easiest one to get information about without buying a subscription. So Grammarly is my focus today.
I left off my last post with a claim that AI systems–even those that can answer questions and engage in conversations–are eluded by little words like “it.”
To flesh this out, I turn to a fantastic new book I just finished by Melanie Mitchell–Artificial Intelligence: A Guide for Thinking Humans. (Her book first crossed my radar back before my Twitter suspension, when “autism gadfly” Jonathan Mitchell–her brother and my Twitter friend–mentioned it in a tweet.)
In a chapter on the limits of natural language processing, Mitchell discusses miniature language-understanding tests called “Winograd schemas”, named for NLP researcher Terry Winograd. Here is the first of her examples:
As we saw in my last post on GPT-3, state-of-the-art AI can write poetry, computer code, and passages that ruminate about creativity. It can also, with priming from a human being, emulate a particular style:
Before asking GPT-3 to generate new text, you can focus it on particular patterns it may have learned during its training, priming the system for certain tasks… If you prime it with dialogue, for instance, it will start chatting with you. https://www.nytimes.com/2020/11/24/science/artificial-intelligence-ai-gpt3.html
And if you prime it to write in the style of, say, Scott Barry Kaufman, it will do that.
But, as the New York Times notes, there are limits:
Here, as promised, is the full text of my critique:
Reviewer II concludes (yes, there is, in fact, a conclusion to their critique) with the second half of my introduction:
This study, however, is based on faulty assumptions that undermine both its rationale and its conclusions: assumptions about test performance; about prerequisites for certain linguistic skills; about a disconnect between speech-based vs. typing-based communication skills; about what it would take for someone to point to letters based on cues from the person holding up the board; and about the study’s central premise: the agency of eye gaze.
Reviewer II views this not as a thesis statement to be supported in subsequent paragraphs, but as an accusation to be dissected as is:
Reviewer II next turns to the concept of message passing. Here’s what I wrote:
The most direct way to validate authorship is through “message passing” tests involving prompts that the facilitator has not seen ahead of time. For contact-based facilitation, studies dating back to the 1990s indicate that facilitators do in fact guide typing.2,3,4 Results show that the overwhelming majority of typed responses, when they occur, are based on prompts that the facilitator witnessed, and not on prompts that only the typist witnessed, strongly suggesting that the facilitator is the actual author. As for assistance via held-up letterboards, practitioners have yet to participate in rigorous, published, message passing experiments.5
2. Moore, S., Donovan, B., & Hudson, A. Facilitator-suggested conversational evaluation of facilitated communication. Journal of Autism and Developmental Disorders, 23, 541–552 (1993).
3. Wheeler, D. L., Jacobson, J. W., Paglieri, R. A., & Schwartz, A. A. An experimental assessment of facilitated communication. Mental Retardation, 31, 49–59 (1993).
4. Saloviita, T., Lepannen, M., & Ojalammi, U. Authorship in facilitated communication: An analysis of 11 cases. Augmentative and Alternative Communication, 3, 213-25 (2014).
5. Tostanoski, A., Lang, R., Raulston, T., Carnett, A., & Davis, T. Voices from the past: Comparing the rapid prompting method and facilitated communication. Developmental Neurorehabilitation, 17, 219–223 (2014).
But for Reviewer II, message passing (MP) means something completely different. Rather than being a means (indeed the best means) to test authorship, it’s a means for interpreting the autistic person’s code.
Having cleared up any confusion about non-stationary letterboards, Reviewer II next moves backwards to this part of my critique of Jaswal et al’s paper:
The authors’ third claim is that the “unfamiliar experimental setting”, combined with “elevated levels of anxiety common in autism”, may help explain difficulties with message passing tests. The anxiety defense is belied by (1) the care taken in many of the tests to make subjects as comfortable as possible12 (in particular, there is no mounting of eye-trackers to subjects’ heads), and (2) the unlikelihood that anxiety would lead, let alone enable, subjects to type out something that the facilitator saw and the typist didn’t. As for “unfamiliar test setting” issues, the authors cite Cardinal et al’s (1996) study in which message passing performance improved over multiple sessions.13 Error rates, however, remained high, and the study has been criticized for several design flaws.14
13. Cardinal, D. N., Hanson, D. & Wakeham, J. Investigation of authorship in facilitated communication. Ment. Retard. 34, 231–242 (1996).
14. Mostert, Mark P. Facilitated Communication Since 1995: A Review of Published Studies. Journal of Autism and Developmental Disorders, Vol. 31, No. 3, 287-313 (2001).
Rather than explain why anxiety would enable subjects to type out something that the facilitator saw but that they themselves didn’t (e.g., by discussing ways in which anxiety might enhance autistic telepathy), Reviewer II invokes the Helsinki Accords:
Having established the depth of their credentials in Brain Machine Interface technologies and pointed out that swallowing food and pointing to letters involve different skills, Reviewer II next (actually, it’s hard to keep track of what’s next, as they’ve gone through my FC critique backwards) turns to my argument that:
Subtle board movements are evident in the study’s supplemental videos, and this highlights a final problem. Even if you accept the authors’ justifications for eschewing message passing tests, and even if you somehow rule out the possibility of board-holding assistants providing cues, a non-stationary letterboard calls into question the study’s central premise: the purported agency of the subjects’ eye gaze. Were subjects intentionally looking at letters, or were letters shifting into their lines of sight? That is something that no eye-tracking equipment, no matter how sophisticated, can answer.
The first part of their response (I’ll go through it in order, rather than backwards) is this:
Moving backwards from the final paragraph of my critique of Jaswal et al’s paper on Facilitated Communication, Reviewer II turns to my statement that:
Jaswal et al, moreover, fail to explain why the subjects were able to deliver the often lengthy responses to the open-ended questions reported in this study when pointing to letters on a letterboard via hunt-and-peck-style typing, but not when articulating words through speech (a much less time-consuming process).
Reviewer II responds by explaining that training people to articulate speech sounds involves techniques that differ from training them to point at letters. This, they suggest, is news to linguists like myself–as well as to the behaviorists who work with autistic children:
As we saw in Parts I and II, the first reviewer of my critique of a paper on Facilitated Communication dismissed it for two reasons. I was, apparently, harping in a petty way on message passing tests (simple tests in which you ask the person being facilitated something that the facilitator doesn’t know the answer to). And I was, apparently, suggesting that there are no language disorders in which people can type things they can’t say.
Reviewer II came up with completely different reasons for dismissing my critique (which they call my “letter”), perhaps because they chose to read it… backwards. “I will start,” they wrote, “from the end of the letter and will work my way up to the top.”
Indeed, their starting point wasn’t even the last sentence of my review, but the final half of that last sentence: “that lets subjects type with their eyes rather than their fingers—as hundreds of children around the world are successfully doing every day.”
My full final sentence was this: