Turing’s Imitation Game: Role of Error-making in Intelligent Thought

Turing100:

Huma Shah (TCAC), Kevin Warwick (TCAC), Ian Bland, Chris Chapman & Marc Allen

Turing in Context II: 10-12 October, 2012

Overview

• Introduction to Turing’s two scenarios for his Imitation Game / Turing test

• University of Reading’s practical imitation game events 2008 & 2012

• Results from 120 (of 276) tests, in which three error types were observed

• Sample conversation – audience participation

• Why intelligent judges make errors in practical Turing tests

Before we start

• Turing the man:

– If Turing, the genius, were alive today (and he’d be 100!), he’d be helping GCHQ fight cybercrime. Need more Turings (Iain Lobban, Head of GCHQ: Oct 2012)

– Lessons learnt (Cambridge historian, Prof Christopher Andrew: Sept 2012)

– Posthumous apology for mistreatment from a British PM (Gordon Brown: Sept 2009)

Turing’s Imitation Game

• Plurality of views on:

– What is it?

– How useful/damaging it is for AI

• All points of view tolerated

• We focus on results from actual imitation games with over 50 judges, 40 hidden humans and 5 machines

Turing’s Imitation Game: Scenario 1

Simultaneous test: judge interrogates two hidden entities in parallel (Computing Machinery and Intelligence, Mind, 1950)

Turing’s Imitation Game: Scenario 2

Viva voce test: direct questioning (1952; 1950: section 6.4, The Argument from Consciousness, p. 446)

Note: Turing’s revised prediction: “At least 100 years” (1952)

What Turing felt

• Idea of intelligence emotional rather than mathematical (1948)

• Learning language one of the most accomplished human feats (1948)

• Question/answer method “suitable for introducing almost any one of the fields of human endeavour” (1950: p. 435)

• Five minutes (1950: p. 442) is a sufficient duration for the game, as supported by work on first impressions (Willis & Todorov, 2005) and thin slices of behaviour (Albrechtsen et al., 2009)

• Thinking examined through satisfactory and sustained responses to any questions (1950: p.447)

Practical Turing tests reported here

• Scenario 1: simultaneous tests

• 120 machine-human Turing tests:

– 60 conducted at Reading in 2008

– 60 conducted at Bletchley Park in 2012

• Interrogators/Judges made errors in identifying one or both hidden chat partners in a simultaneous test.

Participants 2008

[Photographs: judges; machine developers; hidden humans; split-screen simultaneous test]

Participants 2012: 23 June Bletchley Park

[Photographs: judges; hidden humans]

Human participants

• Human participants (interrogators and hidden humans) recruited via calls, including social media, in 2012:

– Local schools

– Local and national newspapers

– UoR internal call

– STEMNET

– Twitter

– Facebook

– Newsgroups

• Human interrogators and hidden humans included members of the public, computer scientists, philosophers and journalists; males and females; adults and teenagers; and native and non-native UK English speakers.

Machines

• 2008: five selected machines invited following their performance in one-to-one online testing prior to the main event

• 2012: five machines invited (four of them from the 2008 tests):

– Cleverbot, Jfred (not in 2008), Elbot, Eugene Goostman, Ultra Hal

• Each machine successful in previous Turing test contests

What the Interrogators were told

• Task of each interrogator is to uncover all the machines and hidden-humans.

• At the end of each 5-minute test, ‘score’ the hidden chatters:

– If machine, score it for conversation ability 0-100

– If human, identify male/female; age range (child/teenager/adult); FLE/non-FLE

– ‘Unsure’ score allowed

• Sample score sheet ….

Sample Judge Score Sheet

Judge writing scores after Test

What the Hidden-humans were told

• Please remember that it is the machines in the contest competing to show they are the humans; please do not make it easier for the machines by answering in robotic fashion. If you are not sure what this means, here is a machine's response to a question from a recent Turing test:

– I can't deal with that syntactic variant yet.

• Also, please be aware the tests will be open to the public for viewing and the transcripts used for research, i.e. anyone will be able to read your answers to the judges' questions! Please do not reveal identity (real name, sex, age range).

Error types in Practical Turing Tests

• Double error in the same test: human interrogator classifies the machine as human (Eliza effect) and the hidden human (foil for the machine) as machine (confederate effect)

• Type a single error – Eliza effect: human interrogator classifies machine as human

• Type b single error – confederate effect: human interrogator classifies hidden human as machine.

• Gender & age blur also feature in tests
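The error types above can be sketched as a small classification rule. This is only an illustration, not the scoring procedure used at the events: the names `Verdict` and `classify_test` are hypothetical, and the handling of a test where one verdict is wrong and the other is "unsure" (counted here as the single error) is an assumption.

```python
from enum import Enum


class Verdict(Enum):
    """A judge's decision about one hidden entity (hypothetical encoding)."""
    HUMAN = "human"
    MACHINE = "machine"
    UNSURE = "unsure"


def classify_test(verdict_on_machine: Verdict, verdict_on_human: Verdict) -> str:
    """Label one simultaneous test by the judge's error, if any."""
    eliza = verdict_on_machine is Verdict.HUMAN        # machine taken for a human
    confederate = verdict_on_human is Verdict.MACHINE  # human taken for a machine
    if eliza and confederate:
        return "double error"
    if eliza:
        return "Eliza effect"
    if confederate:
        return "confederate effect"
    # Assumption: "unsure" is reported only when no outright misclassification occurred.
    if Verdict.UNSURE in (verdict_on_machine, verdict_on_human):
        return "unsure"
    return "correct"


print(classify_test(Verdict.HUMAN, Verdict.MACHINE))  # double error
print(classify_test(Verdict.MACHINE, Verdict.HUMAN))  # correct
```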

Audience participation

120 Simultaneous Tests

Error type            Number of judge errors
Double error          9
Eliza effect          7
Confederate effect    5
Unsure                2 (1 machine; 1 hidden human)
Total errors          23
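As a quick check on the figures in the table, the counts can be tallied and expressed as a share of the 120 tests. This is a minimal sketch: the counts are taken from the table, while the names `ERROR_COUNTS`, `TOTAL_TESTS` and `error_rates` are hypothetical, and it assumes each double error is counted once per test, which is what the stated total of 23 implies.

```python
# Judge errors per type, as reported for the 120 simultaneous tests.
ERROR_COUNTS = {
    "double error": 9,
    "Eliza effect": 7,
    "confederate effect": 5,
    "unsure": 2,
}
TOTAL_TESTS = 120


def error_rates(counts: dict[str, int], total_tests: int) -> dict[str, float]:
    """Return each error type's share of all tests (0.0 to 1.0)."""
    return {label: n / total_tests for label, n in counts.items()}


total_errors = sum(ERROR_COUNTS.values())
print(f"total errors: {total_errors}")  # total errors: 23

for label, rate in error_rates(ERROR_COUNTS, TOTAL_TESTS).items():
    print(f"{label}: {rate:.1%}")

print(f"tests with any error: {total_errors / TOTAL_TESTS:.1%}")
```

On these numbers, roughly one test in five ended with some form of judge error.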

Analysing Transcripts

• English language knowledge features in judges’ decisions:

– First language English (FLE) speaking judges misclassified non-FLE hidden humans as machines, and vice versa

• Lack of mutual knowledge

• Subjective opinion on what constitutes satisfactory responses

Why do intelligent humans err?

• Some humans trust too easily, succumb to deception more than others

• Schulz 2010:

– Humans regard being right as the natural state

– Making errors makes us feel deflated and embarrassed

– Capacity to err is crucial to human cognition

– “wrongness is a window into normal human nature” (p.5)

Problem with misidentification in cyberspace

• The CyberLover chatbot, developed to steal identities and conduct financial fraud in Internet chat rooms, shows that some humans are susceptible to deception

Practical Uses of Imitation Game

• Encourage more children to take up computer science (problem in the UK)

• Scale progress in machine conversation - chatbots used widely in e-commerce

• Neurology: thought translation for locked-in patients (Stins & Laureys, 2009)

• Raise awareness of malfeasant programmes designed to conduct a particular cybercrime

References

– A.M. Turing. Intelligent Machinery (1948). In B. J. Copeland (Ed) The Essential Turing: The ideas that gave birth to the computer age. Clarendon Press: Oxford. 2004

– A.M. Turing (1950). Computing Machinery and Intelligence. MIND. October: 59(236), pp 433-460

– A.M. Turing. Can Automatic Calculating Machines be said to Think? (1952). In B. J. Copeland (Ed) The Essential Turing: The ideas that gave birth to the computer age. Clarendon Press: Oxford. 2004

– H. Shah. Deception-detection and Machine Intelligence in Practical Turing Tests. PhD Thesis. The University of Reading. October 2010

– J.F. Stins and S. Laureys (2009). Thought Translation, tennis and Turing tests in the vegetative state. Phenom Cogn Sci: DOI 10.1007/s11097-009-9124-8

– J.S. Albrechtsen, C.A. Meissner and K.J. Susa (2009). Can intuition improve deception detection performance? Journal of Experimental Social Psychology. 45: pp 1052-1055

– J. Willis and A. Todorov (2005). First Impressions: Making up your mind after 100ms exposure to a face. Psychological Science. 17(7), pp 592-598

– K. Schulz (2010). Being Wrong. Portobello Books: London

• For making Turing100 possible, thank you to:

– UoR SSE

– Price Waterhouse Coopers UK

– Women in Technology, UK

– Artificial Solutions, Daden Ltd, Elzware Conversation Systems

• Thank you for listening
