The psychology of knights and knaves
Lance J. Rips, University of Chicago, 1989
Knights and Knaves
(1) We have three inhabitants, A, B, and C, each of whom is a knight or a knave. Two people are said to be of the same type if they are both knights or both knaves. A and B make the following statements:
A: B is a knave.
B: A and C are of the same type.
What is C?
(Smullyan, 1978, p.22)
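The answer to puzzle (1) can be checked by brute force. The sketch below simply enumerates all eight knight/knave assignments and keeps those in which every knight's statement is true and every knave's statement is false. It is only a check on the answer, not a model of how subjects reason; the function name `consistent` and the True/False encoding are assumptions made for this example.

```python
from itertools import product

def consistent(a, b, c):
    """Each argument is True for a knight and False for a knave."""
    a_claim = not b        # A: "B is a knave"
    b_claim = (a == c)     # B: "A and C are of the same type"
    # A knight's statement must be true, a knave's must be false.
    return a == a_claim and b == b_claim

# All assignments (A, B, C) that satisfy both statements.
solutions = [w for w in product([True, False], repeat=3) if consistent(*w)]

# C's status across the surviving assignments.
c_values = {c for (_, _, c) in solutions}

for a, b, c in solutions:
    print({name: "knight" if v else "knave" for name, v in zip("ABC", (a, b, c))})
```

Two assignments survive, and C is a knave in both of them, so the puzzle fixes C's type without fixing A's or B's.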
Protocol evidence
Subjects attempted to solve problems by considering specific assumptions
Worked forward from their assumptions
Subjects sometimes forgot assumptions
Computational model
Based on the idea that people deal with deduction problems by applying mental-deduction rules like those of formal natural deduction systems
Subjects' performance on a deduction problem is predicted from the length of the required derivation and the availability of the rules.
The shorter the derivation and the more available the rules, the faster and more accurate subjects should be.
Predicates:
knight(x): x is a knight
knave(x): x is a knave
says(x, p): person x uttered sentence p
Rule 1: says(x, p) and knight(x) entail p.
Rule 2: says(x, p) and knave(x) entail NOT p.
Rule 3: NOT knave(x) entails knight(x).
Rule 4: NOT knight(x) entails knave(x).
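As a rough illustration of how Rules 1 to 4 could drive forward inference, the sketch below encodes sentences as nested Python tuples and applies the rules in one pass. The tuple encoding and the function name `apply_rules` are assumptions made for this example; they are not taken from the PROLOG program.

```python
def apply_rules(facts):
    """Return the sentences newly derivable from `facts` in one pass of Rules 1-4.

    Sentences are tuples: ("knight", x), ("knave", x), ("says", x, p), ("not", p).
    """
    new = set()
    for f in facts:
        if f[0] == "says":
            _, x, p = f
            if ("knight", x) in facts:       # Rule 1: a knight's statement holds
                new.add(p)
            if ("knave", x) in facts:        # Rule 2: a knave's statement fails
                new.add(("not", p))
        elif f[0] == "not":
            inner = f[1]
            if inner[0] == "knave":          # Rule 3: NOT knave(x) gives knight(x)
                new.add(("knight", inner[1]))
            elif inner[0] == "knight":       # Rule 4: NOT knight(x) gives knave(x)
                new.add(("knave", inner[1]))
    return new - facts

# A says "B is a knave", and A is assumed to be a knave.
facts = {("says", "A", ("knave", "B")), ("knave", "A")}
step1 = apply_rules(facts)           # Rule 2 yields NOT knave(B)
step2 = apply_rules(facts | step1)   # Rule 3 then yields knight(B)
print(step1, step2)
```

Two rule applications are needed to get from knave(A) to knight(B), which is the kind of step count the model uses as its difficulty measure.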
PROLOG Program
Stores the logical form of the sentences in the problem and extracts the names of the individuals (A, B, and C).
Assumes the first-mentioned individual is a knight: knight(A).
Draws as many inferences as possible from this assumption.
If it derives contradictory sentences (e.g., knight(B) and knave(B)), it abandons the assumption that the first-mentioned individual is a knight and continues with the assumption knave(A).
Revises the rule ordering so that rules that applied successfully are tried first on the next round.
Continues until it has found all consistent sets of assumptions about the knight/knave status of each individual.
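The assume-and-deduce loop described above can be sketched in a few lines. This is a simplified illustration, not the PROLOG program: it uses only Rules 1 to 4, handles only atomic statements, makes an assumption only about the first-mentioned individual, and does no rule reordering. The function names (`close`, `contradictory`, `assume_and_deduce`) and the two toy puzzles are hypothetical.

```python
def close(facts):
    """Forward-chain Rules 1-4 until no new sentence appears."""
    facts = set(facts)
    while True:
        new = set()
        for f in facts:
            if f[0] == "says":
                _, x, p = f
                if ("knight", x) in facts:
                    new.add(p)                      # Rule 1
                if ("knave", x) in facts:
                    new.add(("not", p))             # Rule 2
            elif f[0] == "not" and f[1][0] == "knave":
                new.add(("knight", f[1][1]))        # Rule 3
            elif f[0] == "not" and f[1][0] == "knight":
                new.add(("knave", f[1][1]))         # Rule 4
        if new <= facts:
            return facts
        facts |= new

def contradictory(facts):
    """True if some individual is derived to be both a knight and a knave."""
    return any(("knight", f[1]) in facts for f in facts if f[0] == "knave")

def assume_and_deduce(statements, first):
    """Try knight(first), then knave(first); keep the assumptions that survive."""
    survivors = []
    for assumption in (("knight", first), ("knave", first)):
        facts = close(set(statements) | {assumption})
        if not contradictory(facts):
            survivors.append({f[1]: f[0] for f in facts if f[0] in ("knight", "knave")})
    return survivors

# Toy puzzle: A says "B is a knight", B says "C is a knave", C says "A is a knave".
chain = {("says", "A", ("knight", "B")),
         ("says", "B", ("knave", "C")),
         ("says", "C", ("knave", "A"))}
chain_result = assume_and_deduce(chain, "A")

# Toy paradox: A says "A is a knave". Both assumptions lead to a contradiction.
paradox_result = assume_and_deduce({("says", "A", ("knave", "A"))}, "A")
print(chain_result, paradox_result)
```

In the first toy puzzle both assumptions about A survive, giving two consistent sets; in the second, both are abandoned and nothing is left.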
All rules operate in the forward direction.
The model assumes that subjects' error rates and response times depend on the length of the derivations.
Experiment 1
Rule 5 (AND Elimination): p AND q entails p, q.
Rule 6 (Modus Ponens): IF p THEN q and p entail q.
Rule 7 (DeMorgan-1): NOT (p OR q) entails NOT p AND NOT q.
Rule 8 (DeMorgan-2): NOT (p AND q) entails NOT p OR NOT q.
Rule 9 (Disjunctive Syllogism-1): p OR q and NOT p entail q. p OR q and NOT q entail p.
Rule 10 (Disjunctive Syllogism-2): NOT p OR q and p entail q. p OR NOT q and q entail p.
Rule 11 (Double Negation Elimination): NOT NOT p entails p.
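Each of Rules 5 to 11 is a valid inference in classical propositional logic, which can be confirmed with a truth-table check. The sketch below is such a check, assuming IF p THEN q is read as the material conditional (NOT p OR q); the helper name `entails` is hypothetical. Only the first form of Rules 9 and 10 is tested.

```python
from itertools import product

def entails(premises, conclusion):
    """True if every assignment of p, q that satisfies all premises satisfies the conclusion."""
    return all(conclusion(p, q)
               for p, q in product([True, False], repeat=2)
               if all(prem(p, q) for prem in premises))

checks = {
    # Rule 5: p AND q entails p
    "Rule 5": entails([lambda p, q: p and q], lambda p, q: p),
    # Rule 6: IF p THEN q and p entail q (material conditional assumed)
    "Rule 6": entails([lambda p, q: (not p) or q, lambda p, q: p], lambda p, q: q),
    # Rule 7: NOT (p OR q) entails NOT p AND NOT q
    "Rule 7": entails([lambda p, q: not (p or q)], lambda p, q: (not p) and (not q)),
    # Rule 8: NOT (p AND q) entails NOT p OR NOT q
    "Rule 8": entails([lambda p, q: not (p and q)], lambda p, q: (not p) or (not q)),
    # Rule 9: p OR q and NOT p entail q
    "Rule 9": entails([lambda p, q: p or q, lambda p, q: not p], lambda p, q: q),
    # Rule 10: NOT p OR q and p entail q
    "Rule 10": entails([lambda p, q: (not p) or q, lambda p, q: p], lambda p, q: q),
    # Rule 11: NOT NOT p entails p
    "Rule 11": entails([lambda p, q: not (not p)], lambda p, q: p),
}

# For contrast, a pattern that is not among the rules and is not valid:
# IF p THEN q and q do not entail p (affirming the consequent).
affirming_consequent = entails([lambda p, q: (not p) or q, lambda p, q: q], lambda p, q: p)

print(checks, affirming_consequent)
```

Validity says nothing about how available a rule is to subjects, which is the psychological question the model addresses.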
Experiment 1 - Method
Puzzles were submitted to the PROLOG program, and the number of inference steps it needed to solve each one was counted.
34 problems: 6 had 2 speakers and 28 had 3.
2-speaker problems had 3 or 4 clauses.
3-speaker problems had 4, 5, or 9 clauses.
Examples of 4-clause, 3-speaker problems:
(2) A says, “C is a knave.”
B says, “C is a knave.”
C says, “A is a knight and B is a knave.”
(3) A says, “B is a knight.”
B says, “C is a knave or A is a knight.”
C says, “A is a knight.”
Experiment 1 - Subjects
34 subjects
3 groups of 10 to 13 individuals
University of Arizona undergraduates
English speakers, no formal logic courses
10 subjects stopped working on the problems after 15 minutes
Experiment 1 - Results and Discussion
None of the subjects solved the most difficult problem, and 35% solved the easiest.
Subjects solved 24% of the problems predicted to be easier and 16% of the problems predicted to be more difficult.
The program used a mean of 19.3 steps to solve the simpler problems and 24.2 steps on the more difficult problems.
Core subjects solved 32% of the easier problems and 20% of the more difficult problems.
[Figure: Percentage of correct solutions in Experiment 1 as a function of the number of inference steps used by the model]
3-speaker, 9-clause outlier:
(4) A says, “We’re all knaves.”
B says, “A, B, or C is a knight.”
C says, “A, B, or C is a knave.”
The findings are consistent with the prediction that subjects score higher on puzzles that require fewer inference steps.
Binary connectives:
says(A, (knave(A) AND knave(B)) AND knave(C))
N-ary connectives:
AND(knave(A), knave(B), knave(C))
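The choice between binary and n-ary connectives changes how many rule applications a derivation needs. The sketch below flattens a nested binary conjunction into an n-ary one and counts the AND Elimination applications required to reach every atomic conjunct under each representation. The tuple encoding and both function names are assumptions made for this example, and the step counting is a simplification of what a full model would do.

```python
def flatten_and(expr):
    """Collapse nested ("AND", left, right) terms into one n-ary ("AND", c1, c2, ...)."""
    if isinstance(expr, tuple) and expr[0] == "AND":
        conjuncts = []
        for part in expr[1:]:
            flat = flatten_and(part)
            if isinstance(flat, tuple) and flat[0] == "AND":
                conjuncts.extend(flat[1:])   # splice nested conjuncts in place
            else:
                conjuncts.append(flat)
        return ("AND", *conjuncts)
    return expr

def and_elimination_steps(expr):
    """Count AND Elimination applications needed to reach every atomic conjunct."""
    if isinstance(expr, tuple) and expr[0] == "AND":
        return 1 + sum(and_elimination_steps(part) for part in expr[1:])
    return 0

# "We're all knaves" in binary form: (knave(A) AND knave(B)) AND knave(C)
binary = ("AND", ("AND", "knave(A)", "knave(B)"), "knave(C)")
nary = flatten_and(binary)

binary_steps = and_elimination_steps(binary)
nary_steps = and_elimination_steps(nary)
print(nary, binary_steps, nary_steps)
```

With three conjuncts the binary form needs two applications and the n-ary form one, and the gap grows with the number of conjuncts.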
Experiment 2
Goal: predict the amount of time subjects take to reach a correct solution from the number of steps the model needs to find the correct answer.
Problems were simplified, because longer problems produce longer and more variable response times.
More difficult problems also result in fewer correct answers.
The form of the problems was more tightly controlled.
This eliminates irrelevant effects of problem wording and response.
The rules were modified to allow the program to solve a wider variety of problems.
Rules 9 and 10 (Disjunctive Syllogism) were extended to allow the program to infer p from any of the following:
(a) OR(knight(x), p) and knave(x);
(b) OR(knave(x), p) and knight(x);
(c) OR(p, knight(x)) and knave(x); and
(d) OR(p, knave(x)) and knight(x).
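The four patterns (a) to (d) share one idea: a known type for x rules out the disjunct that assigns x the opposite type, leaving the other disjunct. A minimal sketch of that idea follows, with a tuple encoding and a function name (`extended_disjunctive_syllogism`) that are assumptions for this example rather than the program's actual rule format.

```python
OPPOSITE = {"knight": "knave", "knave": "knight"}

def extended_disjunctive_syllogism(disjunction, fact):
    """Given ("OR", d1, d2) and a fact such as ("knave", "A"),
    return the disjunct that must hold, or None if the rule does not apply."""
    _, left, right = disjunction
    kind, x = fact
    ruled_out = (OPPOSITE[kind], x)   # knave(x) rules out knight(x), and vice versa
    if left == ruled_out:
        return right                  # patterns (a) and (b)
    if right == ruled_out:
        return left                   # patterns (c) and (d)
    return None

p = ("knave", "C")
result_a = extended_disjunctive_syllogism(("OR", ("knight", "A"), p), ("knave", "A"))
result_d = extended_disjunctive_syllogism(("OR", p, ("knave", "A")), ("knight", "A"))
no_match = extended_disjunctive_syllogism(("OR", ("knight", "B"), p), ("knave", "A"))
print(result_a, result_d, no_match)
```

The extension saves the step of first converting knave(x) into NOT knight(x) before the original Disjunctive Syllogism rules could fire.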
Experiment 2 - Method
Subjects viewed the problems on a monitor and responded using a response panel.
The monitor gave subjects feedback about the accuracy of their answer and the amount of time taken.
Problems were submitted to the natural-deduction program, and 12 of the groups were chosen based on its output.
Problems within each group had the same output but differed in the number of inference steps required to solve them:
Column 1 (small): 13.1 steps
Column 2 (small): 13.0 steps
Column 3 (large): 16.4 steps
The prediction is that the large-step problems within each row will result in longer response times and more errors.
Experiment 2 - Subjects
53 University of Chicago undergraduates
Native English speakers, no formal logic
$5 bonus, minus 10 cents per trial on which they made an error
Data were discarded from subjects who made errors on more than 40% of trials
30 subjects succeeded
Experiment 2 - Results and Discussion
The problems with a larger number of predicted inference steps took longer for the subjects to solve.
Subjects took 25.5 s and 23.9 s to solve the two types of small-step problems, but 29.5 s on the large-step problems.
Error rates:
1st small-step: 15.8%
2nd small-step: 9%
Large-step: 14.4%
Knight-knave problems took the longest to solve and were the most difficult:
Problem type     Time      Errors
Knight-knight    24.8 s    14.4%
Knight-knave     29.4 s    17.5%
Knave-knight     24.0 s    8%
Knave-knave      26.8 s    12.2%
But there was only a small difference in the number of steps the program needed to solve them.
The increase for knight-knave problems was attributed to the small-step items.
Subjects incorrectly assume a character is lying when the character states “I am a knave…”
This would result in a knave(A)-knight(B) response.
Effects of negatives
Subjects took longer to read and comprehend negative sentences.
In the model, extra steps are necessary to transform these negatives into positives, e.g., Rule 3 takes NOT(knave(x)) to knight(x).
Problems with no negatives: 23.4 s to solve, 10.6% error rate.
Problems with one negative: 27.2 s to solve, 13.9% error rate.
General Discussion
Natural-deduction model
People carry out deduction tasks by constructing mental proofs:
Represent information
Make further assumptions
Draw inferences
Make conclusions on the basis of the derivation
The knights and knaves problems extend the model beyond previous experiments, in which subjects judged the validity of arguments.
The problems depend on logical properties but do not have a premise-conclusion format.
Protocol: participants followed an assume-and-deduce strategy.
Experiment 1: the model predicted the probability of subjects solving a set of moderately complex and varied puzzles.
Experiment 2: response times increased with the number of inference steps.
Limitations
A large minority found even the simpler problems extremely difficult and performed below chance.
Results were interpreted using only the natural-deduction framework.
Subjects who did not complete the task
Large variation in Experiment 1: some subjects achieved 80% correct, others missed all problems.
Individual differences
OR Introduction: avoided problems dependent on OR Introduction.
Lack of availability of the knight-knave rules: subjects do not understand that what a knight says is true and what a knave says is false.
Alternative Theories
Deduction by heuristic
For example, responding knave if a character says “I am a knave” and responding knight otherwise.
This heuristic results in 25% correct, versus the 87% obtained.
No apparent “non-logical” shortcuts.
Deduction by pragmatic schemas
Knights and knaves puzzles do not follow a real-world schema.
There are very few situations in which people always tell the truth or always lie.
Schemas may help with the Wason selection task (permissions / restrictions).
But there is no case for people using schemas on most deduction problems.
Deduction by mental models
The subject surveys a model for a potential conclusion and, if one is found, attempts to find a counterexample by altering the model.
If no counterexample is found, the subject adopts the initial conclusion as correct.
If a counterexample is found, the conclusion is rejected and another conclusion is examined.
This continues until an acceptable conclusion is found or it is decided that no conclusion is valid.
(1) We have three inhabitants, A, B, and C, each of whom is a knight or a knave. Two people are said to be of the same type if they are both knights or both knaves. A and B make the following statements:
A: B is a knave.
B: A and C are of the same type.
What is C?
Subjects use tokens for each character:
knight A
knave B
knave C
The conclusion is that C is a knave; continue by searching for counterexamples.
Alternative model:
knave A
knight B
knave C
Since the conclusion holds in both models, C is a knave.
None of the think-aloud subjects mentioned tokens.
This could reflect a difficulty with describing mental models.
The theory does not account for the process that produces and evaluates the model.
Mental-model theorists deny that deduction is due to mental inference rules or non-logical heuristics.
What cognitive mechanism is responsible for these insights?
Models could be put together in a haphazard manner and checked for consistency.
This fails to give a good account of the systematic protocols.
It shifts the burden of explanation to the consistency checker.
Q&A Questions? Thoughts?
General Discussion
Natural deduction explains where the items come from, using intermediate sentences.
This is a challenge to mental modelers.