Does AI know what “it” means? A look at Winograd schemas via Melanie Mitchell

I left off my last post with the claim that little words like “it” elude AI systems–even those that can answer questions and engage in conversations.

To flesh this out, I turn to a fantastic new book by Melanie Mitchell that I just finished–Artificial Intelligence: A Guide for Thinking Humans. (Her book first crossed my radar back before my Twitter suspension, when “autism gadfly” Jonathan Mitchell–her brother and my Twitter friend–mentioned it in a tweet.)

In a chapter on the limits of natural language processing, Mitchell discusses miniature language-understanding tests called “Winograd schemas”, named for NLP researcher Terry Winograd. Here is the first of her examples:

SENTENCE 1: “The city council refused the demonstrators a permit because they feared violence.”

QUESTION: Who feared violence?

A. The city council B. The demonstrators

SENTENCE 2: “The city council refused the demonstrators a permit because they advocated violence.”

QUESTION: Who advocated violence?

A. The city council B. The demonstrators

(Mitchell, p. 225)

To figure out what “they” refers to in each sentence, actual understanding of “feared” vs. “advocated” is necessary, along with background knowledge about demonstrators, permits, and city government.

Here are the other examples cited by Mitchell:

SENTENCE 1: “Joe’s uncle can still beat him at tennis, even though he is 30 years older.”

QUESTION: Who is older?

A. Joe B. Joe’s uncle

SENTENCE 2: “Joe’s uncle can still beat him at tennis, even though he is 30 years younger.”

QUESTION: Who is younger?

A. Joe B. Joe’s uncle

(Mitchell, p. 226)

SENTENCE 1: “I poured water from the bottle into the cup until it was full.”

QUESTION: What was full?

A. The bottle B. The cup

SENTENCE 2: “I poured water from the bottle into the cup until it was empty.”

QUESTION: What was empty?

A. The bottle B. The cup

(Mitchell, p. 226)

SENTENCE 1: “The table won’t fit through the doorway because it is too wide.”

QUESTION: What was too wide?

A. The table B. The doorway

SENTENCE 2: “The table won’t fit through the doorway because it is too narrow.”

QUESTION: What was too narrow?

A. The table B. The doorway

(Mitchell, p. 226)
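(As an aside: these pairs all share a completely regular structure–a sentence containing an ambiguous pronoun, a question, and two candidate referents–so they’re easy to capture as data if you want to test a system yourself. Here is a minimal hypothetical sketch in Python; the WinogradSchema class and the accuracy harness are my own illustration, not anything from Mitchell’s book.)

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class WinogradSchema:
    sentence: str                # contains an ambiguous pronoun ("it", "they", ...)
    question: str
    candidates: tuple[str, str]  # the two possible referents
    answer: str                  # the one a human would pick

SCHEMAS = [
    WinogradSchema(
        "The table won't fit through the doorway because it is too wide.",
        "What was too wide?", ("the table", "the doorway"), "the table"),
    WinogradSchema(
        "The table won't fit through the doorway because it is too narrow.",
        "What was too narrow?", ("the table", "the doorway"), "the doorway"),
]

def accuracy(resolver: Callable[[WinogradSchema], str]) -> float:
    """Fraction of schemas on which the resolver matches the human answer."""
    hits = sum(resolver(s) == s.answer for s in SCHEMAS)
    return hits / len(SCHEMAS)

# Always guessing the first candidate scores 50% on balanced pairs --
# the chance baseline against which the 61% figure below is measured.
print(accuracy(lambda s: s.candidates[0]))  # -> 0.5
```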

At the time of Mitchell’s writing, the program with the best reported performance on a sizable set of Winograd schemas achieved only 61% accuracy–only somewhat better than random guessing. How did it manage this? Not, Mitchell explains, through any actual understanding or real-world background knowledge, but simply by searching the Internet for co-occurrences of the relevant phrases–e.g., checking whether “bottle full” or “cup full” turns up more often in the company of phrases like “pour from the bottle” and “pour into the cup.”
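To make that trick concrete, here is a toy sketch of such a co-occurrence heuristic–with the caveat that web_phrase_count below is a hypothetical stand-in for a real web search, and the whole thing is my illustration rather than the actual system Mitchell describes:

```python
# Toy version of the co-occurrence heuristic: substitute each candidate
# referent for the pronoun, then prefer whichever resulting phrase is
# more common "in the wild."

def web_phrase_count(phrase: str) -> int:
    # Hypothetical stand-in: a real system would query a search engine
    # or a large text corpus for the phrase's frequency.
    toy_counts = {  # made-up frequencies standing in for web hits
        "poured water into the cup until the cup was full": 120,
        "poured water into the cup until the bottle was full": 3,
    }
    return toy_counts.get(phrase, 0)

def resolve_pronoun(template: str, candidates: list[str]) -> str:
    """Pick the candidate whose substitution yields the commoner phrase."""
    return max(candidates,
               key=lambda c: web_phrase_count(template.format(referent=c)))

template = "poured water into the cup until the {referent} was full"
print(resolve_pronoun(template, ["cup", "bottle"]))  # -> "cup"
```

Note that nothing in this lookup involves knowing what bottles or cups actually are; the heuristic succeeds only when raw phrase statistics happen to line up with common sense, which is why it barely beats chance.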

Ironically, questions that are utterly straightforward for humans are much tougher for AI systems than beating humans at Jeopardy or passing as a human during a brief online chat. Trivia questions can be answered via key words, Internet look-up, and “answer extraction”; Turing Tests can be passed via canned responses and answer evasion. Neither requires the kind of common sense and linguistic understanding needed to make sense of the pronouns in Winograd schemas.

AI, Mitchell acknowledges, has come a very long way. But much of that distance is due to the exponential increase in online data, and as Mitchell writes, this only gets you so far:

It seems to me to be extremely unlikely that machines would ever reach the level of humans on translation, reading comprehension, and the like by learning exclusively from online data, with essentially no real understanding of the language they process. Language relies on commonsense knowledge and understanding of the world… A table that is too wide won’t fit through a doorway. If you pour all the water out of a bottle, the bottle thereby becomes empty.

(Mitchell, p. 231)

(Hear that, @Twitter, @TwitterSupport, @TwitterSafety???)

Towards the end of her book, Mitchell adds:

…only the right kind of machine–one that is embodied and active in the world–would have human-level intelligence in its reach.

In the meantime, as Allen Institute for AI director Oren Etzioni, cited by Mitchell, puts it:

When AI can’t determine what ‘it’ refers to in a sentence, it’s hard to believe that it will take over the world.

(Mitchell, p. 228)

As far as taking over the world goes, viruses (of the naturally occurring sort) are far, far superior.
