The psychology of knights and knaves Lance J. Rips, University of Chicago, 1989

Knights and Knaves

(1) We have three inhabitants, A, B, and C, each of whom is a knight or a knave. Two people are said to be of the same type if they are both knights or both knaves. A and B make the following statements:

A: B is a knave.

B: A and C are of the same type.

What is C?

(Smullyan, 1978, p.22)

Protocol evidence

Subjects attempted to solve problems by considering specific assumptions

Worked forward from their assumptions

Subjects sometimes forgot assumptions

Computational model

Based on the idea that people deal with deduction problems by applying mental-deduction rules like those of formal natural deduction systems

Subjects' performance on a deduction problem is predicted in terms of the length of the required derivation and the availability of rules

The shorter the derivation and more available the rules, the faster and more accurate subjects should be

knight(x) – x is a knight

knave(x) – x is a knave

says(x, p) – person x uttered sentence p

Rule 1: says(x, p) and knight(x) entail p.

Rule 2: says(x, p) and knave(x) entail NOT p.

Rule 3: NOT knave(x) entails knight(x).

Rule 4: NOT knight(x) entails knave(x).
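Rules 1 to 4 can be sketched as a small forward-chaining step. This is only an illustration, not the PROLOG program described in the source: the tuple encoding of sentences and the names `apply_rules`, `neg`, `knight`, `knave`, and `says` are assumptions made for the example.

```python
# Sentences are encoded as nested tuples (an assumption of this sketch):
#   ("knight", x), ("knave", x), ("not", p), ("says", x, p)

def knight(x):
    return ("knight", x)

def knave(x):
    return ("knave", x)

def says(x, p):
    return ("says", x, p)

def neg(p):
    return ("not", p)

def apply_rules(facts):
    """Return the closure of `facts` under Rules 1-4, working forward only."""
    facts = set(facts)
    while True:
        new = set()
        for f in facts:
            if f[0] == "says":
                _, x, p = f
                if knight(x) in facts:       # Rule 1: a knight's statement is true
                    new.add(p)
                if knave(x) in facts:        # Rule 2: a knave's statement is false
                    new.add(neg(p))
            elif f[0] == "not":
                inner = f[1]
                if inner[0] == "knave":      # Rule 3: NOT knave(x) entails knight(x)
                    new.add(knight(inner[1]))
                elif inner[0] == "knight":   # Rule 4: NOT knight(x) entails knave(x)
                    new.add(knave(inner[1]))
        if new <= facts:                     # nothing new was derived: stop
            return facts
        facts |= new

# A says "B is a knave"; assume A is a knave.
closure = apply_rules({says("A", knave("B")), knave("A")})
print(knight("B") in closure)
```

Assuming knave(A) yields NOT knave(B) by Rule 2 and then knight(B) by Rule 3, which is the kind of two-step chain the step counts in the experiments are meant to capture.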

PROLOG Program

Stores the logical form of the sentences in the problem and extracts the names of the individuals (A, B, and C)

Assumes the first-mentioned individual is a knight, knight(A)

Draws as many inferences as possible from this assumption

If it derives contradictory sentences (knight(B) and knave(B)), it abandons the assumption that the first-mentioned individual is a knight and continues with the assumption knave(A)

Revises the rule ordering: rules that were successfully applied will be tried first on the next round

Continues until it has found all consistent sets of assumptions about the knight / knave status of each individual
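The end result of this search, the set of consistent assumptions, can be reproduced for problem (1) with a brute-force check. This is a truth-table shortcut, not the stepwise derivation the PROLOG program performs, and it says nothing about step counts. The function name `consistent_assignments` and the encoding of statements as Python functions are assumptions of the sketch.

```python
from itertools import product

def consistent_assignments(names, statements):
    """Return every knight/knave assignment in which each speaker's
    statement is true exactly when that speaker is a knight.

    names:      list of inhabitants
    statements: dict mapping a speaker to a function world -> bool,
                where world[name] is True for a knight, False for a knave
    """
    solutions = []
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        if all(claim(world) == world[speaker]
               for speaker, claim in statements.items()):
            solutions.append(world)
    return solutions

# Problem (1): A says "B is a knave"; B says "A and C are of the same type".
puzzle_1 = {
    "A": lambda w: not w["B"],
    "B": lambda w: w["A"] == w["C"],
}

solutions = consistent_assignments(["A", "B", "C"], puzzle_1)
for world in solutions:
    print({name: "knight" if is_knight else "knave"
           for name, is_knight in world.items()})
```

Two assignments survive, one with A as the knight and one with B as the knight, and C is a knave in both, so the answer to "What is C?" does not depend on which assumption about A is kept.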

All rules operate forward

Assumes subjects' error rates and response times depend on the length of derivations

Experiment 1

Rule 5 (AND Elimination): p AND q entails p, q.

Rule 6 (Modus Ponens): IF p THEN q and p entail q.

Rule 7 (DeMorgan-1): NOT (p OR q) entails NOT p AND NOT q.

Rule 8 (DeMorgan-2): NOT (p AND q) entails NOT p OR NOT q.

Rule 9 (Disjunctive Syllogism-1): p OR q and NOT p entail q. p OR q and NOT q entail p.

Rule 10 (Disjunctive Syllogism-2): NOT p OR q and p entail q. p OR NOT q and q entail p.

Rule 11 (Double Negation Elimination): NOT NOT p entails p.
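As a sanity check on Rules 5 to 11, each one can be tested against a two-variable truth table: a rule is sound if its conclusion is true in every row where its premises are true. The sketch below is an illustration only; the dictionary `RULES`, the helper names, and the descriptive labels for the two DeMorgan rules are assumptions, not part of the source.

```python
from itertools import product

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

# Each entry states "premises imply conclusion" for one rule.
RULES = {
    "Rule 5 (AND Elimination)":
        lambda p, q: implies(p and q, p) and implies(p and q, q),
    "Rule 6 (Modus Ponens)":
        lambda p, q: implies(implies(p, q) and p, q),
    "Rule 7 (DeMorgan, NOT over OR)":
        lambda p, q: implies(not (p or q), (not p) and (not q)),
    "Rule 8 (DeMorgan, NOT over AND)":
        lambda p, q: implies(not (p and q), (not p) or (not q)),
    "Rule 9 (Disjunctive Syllogism-1)":
        lambda p, q: implies((p or q) and not p, q)
                     and implies((p or q) and not q, p),
    "Rule 10 (Disjunctive Syllogism-2)":
        lambda p, q: implies(((not p) or q) and p, q)
                     and implies((p or (not q)) and q, p),
    "Rule 11 (Double Negation Elimination)":
        lambda p, q: implies(not (not p), p),
}

def is_sound(rule):
    """True if the rule holds in all four rows of the truth table."""
    return all(rule(p, q) for p, q in product([True, False], repeat=2))

results = {name: is_sound(rule) for name, rule in RULES.items()}
for name, ok in results.items():
    print(name, "sound" if ok else "NOT sound")
```

Soundness is a logical property only. The model's predictions rest on how many rule applications a derivation needs and how available each rule is to subjects, which a truth table does not capture.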

Experiment 1 - Method

Submitted puzzles to the PROLOG program and counted the number of inference steps it needed to solve them

34 problems

Six problems had 2 speakers, 28 had 3

2-speaker problems had 3 or 4 clauses

3-speaker problems had 4, 5, or 9 clauses

4-clause, 3-speaker problems

(2) A says, “C is a knave.”

B says, “C is a knave.”

C says, “A is a knight and B is a knave.”

(3) A says, “B is a knight.”

B says, “C is a knave or A is a knight.”

C says, “A is a knight.”

Experiment 1 - Subjects

34 subjects

3 groups of 10 to 13 individuals

University of Arizona undergraduates

English speakers, no formal logic courses

10 subjects stopped working on the problems after 15 minutes

Experiment 1 - Results and Discussion

None of the subjects solved the most difficult problem and 35% solved the easiest.

Subjects solved 24% of the problems predicted to be easier and 16% of the problems predicted to be more difficult.

Program used a mean of 19.3 steps in solving simpler problems, 24.2 steps on the more difficult problems.

Core subjects solved 32% of the easier problems and 20% of more difficult problems.

Figure: Percentage of correct solutions in Experiment 1 as a function of the number of inference steps used by the model

3-speaker, 9-clause outlier

(4) A says, “We’re all knaves.”

B says, “A, B, or C is a knight.”

C says, “A, B, or C is a knave.”

The prediction that subjects would score higher on puzzles with a smaller number of inference steps is consistent with the findings.

Binary Connectives

says(A, ((knave(A) AND knave(B)) AND knave(C)))

N-ary Connectives

AND(knave(A), knave(B), knave(C))
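One way to see why the choice of representation could change the model's step count is to count applications of AND Elimination under each encoding. The sketch assumes that a single application to an n-ary AND yields all of its conjuncts at once; the source does not state this, and the function name `and_elimination_steps` and the tuple encoding are hypothetical.

```python
def and_elimination_steps(sentence):
    """Count the AND Elimination applications needed to reach every
    atomic conjunct. A sentence is either a string (atomic) or a tuple
    ("AND", part1, part2, ...)."""
    if isinstance(sentence, tuple) and sentence[0] == "AND":
        # One application splits this AND; nested ANDs need their own.
        return 1 + sum(and_elimination_steps(part) for part in sentence[1:])
    return 0

# "We're all knaves" under the two representations.
binary = ("AND", ("AND", "knave(A)", "knave(B)"), "knave(C)")
n_ary = ("AND", "knave(A)", "knave(B)", "knave(C)")

print(and_elimination_steps(binary))
print(and_elimination_steps(n_ary))
```

Under this assumption the nested binary form needs two applications and the n-ary form needs one, so a problem built from three-way conjunctions and disjunctions, like the 9-clause outlier, would look longer to a model restricted to binary connectives.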

Experiment 2

Predict the amount of time subjects take to reach a correct solution based on the number of steps the model needs to find a correct answer.

Problems were simplified, as longer problems produced longer and more variable times

More difficult problems also resulted in fewer correct answers

Tighter control on the form of the problems

Eliminate irrelevant effects of problem wording and response

Modified rules to allow the program to solve a wider variety of problems

Rules 9 and 10 (Disjunctive Syllogism)

Allowed the program to infer p from any of the following:

(a) OR(knight(x), p) and knave(x);

(b) OR(knave(x), p) and knight(x);

(c) OR(p, knight(x)) and knave(x); and

(d) OR(p, knave(x)) and knight(x).
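The four patterns (a) to (d) share one idea: a known knight or knave fact rules out the opposite disjunct, leaving p. A minimal sketch of that matching step follows. The tuple encoding and the names `extended_disjunctive_syllogism` and `OPPOSITE` are assumptions for illustration, not the program's actual representation.

```python
OPPOSITE = {"knight": "knave", "knave": "knight"}

def extended_disjunctive_syllogism(disjunction, fact):
    """Return p if `disjunction` and `fact` match one of patterns (a)-(d),
    otherwise None.

    disjunction: ("OR", left, right)
    fact:        ("knight", x) or ("knave", x)
    """
    if disjunction[0] != "OR" or fact[0] not in OPPOSITE:
        return None
    _, left, right = disjunction
    # The fact about x excludes the disjunct asserting the opposite type.
    excluded = (OPPOSITE[fact[0]], fact[1])
    if left == excluded:       # patterns (a) and (b)
        return right
    if right == excluded:      # patterns (c) and (d)
        return left
    return None

# From problem (3), B's statement: "C is a knave or A is a knight".
statement = ("OR", ("knave", "C"), ("knight", "A"))
print(extended_disjunctive_syllogism(statement, ("knight", "C")))
```

With knight(C) established, the first disjunct knave(C) is excluded and the function returns knight(A), which is pattern (b) with p = knight(A).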

Experiment 2 - Method

Subjects viewed the problems on a monitor and responded using a response panel.

The monitor presented subjects with feedback about the accuracy of their answer and the amount of time taken.

Submitted problems to the natural-deduction program and chose 12 of the groups based on output.

Each group had the same output but differed in the number of inference steps required to solve

Column 1 (small): 13.1 steps

Column 2 (small): 13.0 steps

Column 3 (large): 16.4 steps

The prediction is that the large-step problems within each row will result in longer response times and more errors.

Experiment 2 - Subjects

53 University of Chicago undergraduates

Native English speakers, no formal logic

$5 bonus, minus 10 cents per trial on which they made an error

Discarded data from subjects who made errors on more than 40% of trials

30 subjects succeeded

Experiment 2 - Results and Discussion

The problems with a larger number of predicted inference steps took longer for the subjects to solve.

Subjects took 25.5s and 23.9s to solve the two types of small-step problems, but 29.5s on the large-step problems.

Error rates

1st small-step: 15.8%

2nd small-step: 9%

Large-step: 14.4%

Knight-knave problems

Took the longest to solve and were the most difficult

Knight-knight: 24.8s, 14.4% errors

Knight-knave: 29.4s, 17.5% errors

Knave-knight: 24.0s, 8% errors

Knave-knave: 26.8s, 12.2% errors

But only a small difference in the number of steps necessary for the program to solve.

Attributed the increase for knight-knave problems to the small-step items

Subjects incorrectly assume a character is lying when they state “I am a knave…”

This would result in knave(A)-knight(B) response

Effects of negatives

Subjects took longer to read and comprehend negative sentences

In the model, extra steps are necessary to transform these negatives to positives

Rule 3 – NOT(knave(x)) to knight(x)

23.4s to solve problems with no negatives, with a 10.6% error rate

27.2s to solve problems with one negative, with a 13.9% error rate

General Discussion - Natural-deduction model

People carry out deduction tasks by constructing mental proofs

Represent information

Make further assumptions

Draw inferences

Make conclusions on the basis of the derivation

The knights and knaves problems extend the model beyond previous experiments, in which subjects judged the validity of arguments

The problems depend on logical properties but do not have a premise-conclusion format

Protocol

Participants followed an assume-and-deduce strategy

Experiment 1

Predicted the probability of subjects solving a set of moderately complex and varied puzzles

Experiment 2

Response times increased with the number of inference steps

Limitations

A large minority found the simpler problems to be extremely difficult and performed below chance level

Results were interpreted using only the natural-deduction framework

Subjects who did not complete the task

Large variation in Experiment 1: some achieved 80% correct, other subjects missed all

Individual Differences

OR Introduction

Avoided problems dependent on OR Introduction

Lack of availability of knight-knave rules

Subjects do not understand that what a knight says is true and what a knave says is false

General Discussion - Alternative Theories

Deduction by heuristic

By responding knave if a character says “I am a knave” and responding knight otherwise

Results in 25% correct versus the 87% obtained

No apparent “non-logical” short cuts

Deduction by pragmatic schemas

Knights and knaves does not follow a real-world schema

Very few situations in which people always tell the truth or always lie

May help with Wason selection task (permission / restrictions)

But no case for people using schemas on most deduction problems

Deduction by mental models

The subject surveys the model for a potential conclusion and, if one is found, attempts to find a counterexample by altering the model.

If no counterexample is found, the subject adopts the initial conclusion as correct.

If counterexample is found, conclusion is rejected and another conclusion is examined.

Continues until acceptable conclusion is found or it is decided that no conclusion is valid.

(1) We have three inhabitants, A, B, and C, each of whom is a knight or a knave. Two people are said to be of the same type if they are both knights or both knaves. A and B make the following statements:

A: B is a knave.

B: A and C are of the same type.

What is C?


Subjects use tokens for each character.

knight A

knave B

knave C

The conclusion is that C is a knave; continue by looking for counterexamples.

knave A

knight B

knave C

Since the conclusion stands in both models, C is a knave.

None of the speak-aloud subjects mentioned tokens

This could be due to a difficulty with describing mental models.

The theory does not account for the process that produces and evaluates the model

Mental modelers deny that it is due to mental inference rules or non-logical heuristics

What cognitive mechanism is responsible for these insights?

Models could be put together in a haphazard manner and checked for consistency.

This fails to give a good account of systematic protocols

It shifts the burden of explanation to the consistency checker

Q&A

Questions? Thoughts?

General Discussion

The natural-deduction model explains where the items come from using intermediate sentences

Challenge to mental modelers
