1
Final Catch-up, Review
2
Outline
• Inference in First-Order Logic
• Knowledge Representation using First-Order Logic
• Propositional Logic
• Constraint Satisfaction Problems
• Game-Playing & Adversarial Search
• Questions on any topic
3
Review: Schematic for Follows, Entails, and Derives
If KB is true in the real world,
then any sentence entailed by KB
and any sentence derived from KB by a sound inference procedure
is also true in the real world.
(Diagram: Sentences ─Derives→ Sentence, via an inference procedure.)
4
Schematic Example: Follows, Entails, and Derives
(Diagram.)
Representation: "Mary is Sue's sister and Amy is Sue's daughter." + "An aunt is a sister of a parent."
– Derives → "Mary is Amy's aunt." (Is it provable?)
– Entails → "Mary is Amy's aunt." (Is it true?)
World: Mary, Sue, Amy; Sister(Mary, Sue), Daughter(Amy, Sue)
– Follows → Aunt(Mary, Amy) (Is it the case?)
5
Inference in First-Order Logic --- Summary
• FOL inference techniques:
– Unification
– Generalized Modus Ponens
• Forward-chaining
• Backward-chaining
– Resolution-based inference
• Refutation-complete
6
Unification
• Recall: Subst(θ, p) = result of substituting θ into sentence p
• Unify algorithm: takes 2 sentences p and q and returns a unifier if one exists
Unify(p,q) = θ where Subst(θ, p) = Subst(θ, q)
• Example: p = Knows(John,x) q = Knows(John, Jane)
Unify(p,q) = {x/Jane}
7
Unification examples
• simple example: query = Knows(John,x), i.e., who does John know?
p              q                  θ
Knows(John,x)  Knows(John,Jane)   {x/Jane}
Knows(John,x)  Knows(y,OJ)        {x/OJ, y/John}
Knows(John,x)  Knows(y,Mother(y)) {y/John, x/Mother(John)}
Knows(John,x)  Knows(x,OJ)        fail
• The last unification fails only because x cannot take the values John and OJ at the same time
– But we know that if John knows x, and everyone (x) knows OJ, we should be able to infer that John knows OJ
• Problem is due to use of same variable x in both sentences
• Simple solution: Standardizing apart eliminates overlap of variables, e.g., Knows(z,OJ)
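The table above can be reproduced with a small Python sketch of the Unify algorithm (in the spirit of Figure 9.1; the term encoding — variables as lowercase strings, constants as capitalized strings, compound terms as tuples — and all helper names are my own illustrative choices, not the text's code):

```python
def is_var(t):
    """A variable is a lowercase string, e.g. 'x'; constants are capitalized."""
    return isinstance(t, str) and t[0].islower()

def unify(x, y, theta):
    """Return a most general unifier extending theta, or None if none exists."""
    if theta is None:
        return None
    if x == y:
        return theta
    if is_var(x):
        return unify_var(x, y, theta)
    if is_var(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):          # unify argument lists element-wise
            theta = unify(xi, yi, theta)
            if theta is None:
                return None
        return theta
    return None

def unify_var(var, t, theta):
    if var in theta:
        return unify(theta[var], t, theta)
    if is_var(t) and t in theta:
        return unify(var, theta[t], theta)
    if occurs(var, t, theta):             # occurs check: forbid x = f(x)
        return None
    return {**theta, var: t}

def occurs(var, t, theta):
    if var == t:
        return True
    if is_var(t) and t in theta:
        return occurs(var, theta[t], theta)
    if isinstance(t, tuple):
        return any(occurs(var, ti, theta) for ti in t)
    return False

def subst(theta, t):
    """Apply a substitution to a term, following chained bindings."""
    if is_var(t):
        return subst(theta, theta[t]) if t in theta else t
    if isinstance(t, tuple):
        return tuple(subst(theta, ti) for ti in t)
    return t
```

Running unify on the Knows(John,x) / Knows(y,Mother(y)) pair yields the bindings {y/John, x/Mother(y)}; applying subst then produces Knows(John, Mother(John)), matching the table.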
8
Unification
• To unify Knows(John,x) and Knows(y,z),
θ = {y/John, x/z } or θ = {y/John, x/John, z/John}
• The first unifier is more general than the second.
• There is a single most general unifier (MGU) that is unique up to renaming of variables.
MGU = { y/John, x/z }
• General algorithm in Figure 9.1 in the text
9
Hard matching example
• To unify the grounded propositions with premises of the implication you need to solve a CSP!
• Colorable() is inferred iff the CSP has a solution• CSPs include 3SAT as a special case, hence matching is NP-hard
Diff(wa,nt) ∧ Diff(wa,sa) ∧ Diff(nt,q) ∧ Diff(nt,sa) ∧ Diff(q,nsw) ∧ Diff(q,sa) ∧ Diff(nsw,v) ∧ Diff(nsw,sa) ∧ Diff(v,sa) ⇒ Colorable()
Diff(Red,Blue), Diff(Red,Green), Diff(Green,Red), Diff(Green,Blue), Diff(Blue,Red), Diff(Blue,Green)
10
Inference approaches in FOL
• Forward-chaining
– Uses GMP to add new atomic sentences
– Useful for systems that make inferences as information streams in
– Requires KB to be in the form of first-order definite clauses
• Backward-chaining
– Works backwards from a query to try to construct a proof
– Can suffer from repeated states and incompleteness
– Useful for query-driven inference
– Requires KB to be in the form of first-order definite clauses
• Resolution-based inference (FOL)
– Refutation-complete for general KB
• Can be used to confirm or refute a sentence p (but not to generate all entailed sentences)
– Requires FOL KB to be reduced to CNF
– Uses a generalized version of the propositional inference rule
• Note that all of these methods are generalizations of their propositional equivalents
11
Generalized Modus Ponens (GMP)
p1′, p2′, …, pn′,  (p1 ∧ p2 ∧ … ∧ pn ⇒ q)
──────────────────────────────────────────
Subst(θ, q)

where we can unify pi′ and pi for all i, i.e., Subst(θ, pi′) = Subst(θ, pi)

Example:
p1′ is King(John)      p1 is King(x)
p2′ is Greedy(y)       p2 is Greedy(x)
θ is {x/John, y/John}  q is Evil(x)  Subst(θ, q) is Evil(John)

• Implicit assumption that all variables are universally quantified
12
Completeness and Soundness of GMP
• GMP is sound– Only derives sentences that are logically entailed– See proof in text on p. 326 (3rd ed.; p. 276, 2nd ed.)
• GMP is complete for a KB consisting of definite clauses
– Complete: derives all sentences that are entailed
– OR… answers every query whose answers are entailed by such a KB
– Definite clause: disjunction of literals of which exactly 1 is positive, e.g.,
King(x) ∧ Greedy(x) ⇒ Evil(x)  ≡  ¬King(x) ∨ ¬Greedy(x) ∨ Evil(x)
13
Properties of forward chaining
• Sound and complete for first-order definite clauses
• Datalog = first-order definite clauses + no functions
• FC terminates for Datalog in finite number of iterations
• May not terminate in general if α is not entailed
• Incremental forward chaining: no need to match a rule on iteration k if no premise was added on iteration k−1; match each rule whose premise contains a newly added positive literal
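A minimal forward chainer for Datalog-style definite clauses (flat atoms, no function symbols) can be sketched as follows; the tuple encoding of atoms and the rule format are illustrative assumptions, not the book's code:

```python
def is_var(t):
    """Variables are lowercase strings; constants and predicates are capitalized."""
    return isinstance(t, str) and t[0].islower()

def match(pattern, fact, theta):
    """Match one flat atom against a ground fact, extending binding theta; None on failure."""
    if len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if p in theta and theta[p] != f:
                return None
            theta[p] = f
        elif p != f:
            return None
    return theta

def forward_chain(facts, rules):
    """Run rules (premises, conclusion) to a fixed point; returns the closed fact set."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            bindings = [{}]
            for prem in premises:        # join each premise against all known facts
                bindings = [t2 for t in bindings for f in facts
                            if (t2 := match(prem, f, t)) is not None]
            for theta in bindings:
                new = tuple(theta.get(a, a) for a in conclusion)
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts
```

With facts King(John), Greedy(John) and the rule King(x) ∧ Greedy(x) ⇒ Evil(x), the closure contains Evil(John). Because there are no functions, the fact set is finite and the loop terminates, matching the Datalog termination property above.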
14
Properties of backward chaining
• Depth-first recursive proof search:– Space is linear in size of proof.
• Incomplete due to infinite loops ⇒ fix by checking the current goal against every goal on the stack
• Inefficient due to repeated subgoals (both success and failure) ⇒ fix by caching previous results (memoization)
• Widely used for logic programming
• PROLOG: backward chaining with Horn clauses + bells & whistles.
15
Resolution in FOL
• Full first-order version:
l1 ∨ ··· ∨ lk,    m1 ∨ ··· ∨ mn
──────────────────────────────────────────────────────────────────────
Subst(θ, l1 ∨ ··· ∨ li−1 ∨ li+1 ∨ ··· ∨ lk ∨ m1 ∨ ··· ∨ mj−1 ∨ mj+1 ∨ ··· ∨ mn)
where Unify(li, ¬mj) = θ.
• The two clauses are assumed to be standardized apart so that they share no variables.
• For example,
¬Rich(x) ∨ Unhappy(x),    Rich(Ken)
───────────────────────────────────
Unhappy(Ken)
with θ = {x/Ken}
• Apply resolution steps to CNF(KB ∧ ¬α); complete for FOL
16
Resolution proof
17
Converting FOL sentences to CNF
Original sentence: Everyone who loves all animals is loved by someone:
∀x [∀y Animal(y) ⇒ Loves(x,y)] ⇒ [∃y Loves(y,x)]
1. Eliminate biconditionals and implications:
∀x ¬[∀y ¬Animal(y) ∨ Loves(x,y)] ∨ [∃y Loves(y,x)]
2. Move ¬ inwards. Recall: ¬∀x p ≡ ∃x ¬p, ¬∃x p ≡ ∀x ¬p
∀x [∃y ¬(¬Animal(y) ∨ Loves(x,y))] ∨ [∃y Loves(y,x)]
∀x [∃y ¬¬Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y Loves(y,x)]
∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃y Loves(y,x)]
18
Conversion to CNF contd.
3. Standardize variables: each quantifier should use a different variable.
∀x [∃y Animal(y) ∧ ¬Loves(x,y)] ∨ [∃z Loves(z,x)]
4. Skolemize: a more general form of existential instantiation.
Each existential variable is replaced by a Skolem function of the enclosing universally quantified variables:
∀x [Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
(reason: animal y could be a different animal for each x.)
19
Conversion to CNF contd.
5. Drop universal quantifiers:
[Animal(F(x)) ∧ ¬Loves(x,F(x))] ∨ Loves(G(x),x)
(all remaining variables are assumed to be universally quantified)
6. Distribute ∨ over ∧:
[Animal(F(x)) ∨ Loves(G(x),x)] ∧ [¬Loves(x,F(x)) ∨ Loves(G(x),x)]
The original sentence is now in CNF — the same steps can be applied to all sentences in the KB to convert it into CNF.
Also need to include the negated query.
Then use resolution to attempt to derive the empty clause, which shows that the query is entailed by the KB.
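For the propositional case, the refutation loop just described — add the negated query, resolve pairs of clauses until the empty clause appears or nothing new can be generated — can be sketched as follows (the literal encoding, with a '-' prefix for negation, is my own):

```python
from itertools import combinations

def resolve(c1, c2):
    """All resolvents of two clauses (sets of literals like 'P' or '-P')."""
    resolvents = []
    for lit in c1:
        comp = lit[1:] if lit.startswith('-') else '-' + lit
        if comp in c2:  # complementary pair found: resolve on it
            resolvents.append((c1 - {lit}) | (c2 - {comp}))
    return resolvents

def resolution_entails(kb_clauses, negated_query_clauses):
    """True iff KB ∧ ¬query derives the empty clause (i.e., KB ⊨ query)."""
    clauses = ({frozenset(c) for c in kb_clauses} |
               {frozenset(c) for c in negated_query_clauses})
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:               # empty clause: contradiction reached
                    return True
                new.add(frozenset(r))
        if new <= clauses:              # nothing new: query is not entailed
            return False
        clauses |= new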
20
Knowledge Representation using First-Order Logic
• Propositional Logic is Useful --- but has Limited Expressive Power
• First Order Predicate Calculus (FOPC), or First Order Logic (FOL)
– FOPC has greatly expanded expressive power, though still limited.
• New Ontology
– The world consists of OBJECTS (for propositional logic, the world was facts).
– OBJECTS have PROPERTIES and engage in RELATIONS and FUNCTIONS.
• New Syntax
– Constants, Predicates, Functions, Properties, Quantifiers.
• New Semantics
– Meaning of the new syntax.
• Knowledge engineering in FOL
21
Semantics: Interpretation
• An interpretation of a sentence (wff) is an assignment that maps – Object constant symbols to objects in the world, – n-ary function symbols to n-ary functions in the world,– n-ary relation symbols to n-ary relations in the world
• Given an interpretation, an atomic sentence has the value “true” if it denotes a relation that holds for those individuals denoted in the terms. Otherwise it has the value “false.”– Example: Kinship world:
• Symbols = Ann, Bill, Sue, Married, Parent, Child, Sibling, …– World consists of individuals in relations:
• Married(Ann,Bill) is false, Parent(Bill,Sue) is true, …
22
Review: Models (and in FOL, Interpretations)
• Models are formal worlds in which truth can be evaluated
• We say m is a model of a sentence α if α is true in m
• M(α) is the set of all models of α
• Then KB ╞ α iff M(KB) ⊆ M(α)
– E.g., KB = "Mary is Sue's sister and Amy is Sue's daughter."
– α = "Mary is Amy's aunt."
• Think of KB and α as constraints, and of models m as possible states.
• M(KB) are the solutions to KB and M(α) the solutions to α.
• Then, KB ╞ α, i.e., ╞ (KB ⇒ α), when all solutions to KB are also solutions to α.
23
Review: Wumpus models
• KB = all possible wumpus-worlds consistent with the observations and the “physics” of the Wumpus world.
24
Review: Wumpus models
α1 = "[1,2] is safe", KB ╞ α1, proved by model checking.
Every model that makes KB true also makes α1 true.
25
Review: Syntax of FOL: Basic elements
• Constants KingJohn, 2, UCI,...
• Predicates Brother, >,...
• Functions Sqrt, LeftLegOf,...
• Variables x, y, a, b,...
• Connectives ¬, ∧, ∨, ⇒, ⇔
• Equality =
• Quantifiers ∀, ∃
26
Syntax of FOL: Basic syntax elements are symbols
• Constant Symbols:– Stand for objects in the world.
• E.g., KingJohn, 2, UCI, ...
• Predicate Symbols– Stand for relations (maps a tuple of objects to a truth-value)
• E.g., Brother(Richard, John), greater_than(3,2), ...– P(x, y) is usually read as “x is P of y.”
• E.g., Mother(Ann, Sue) is usually “Ann is Mother of Sue.”
• Function Symbols– Stand for functions (maps a tuple of objects to an object)
• E.g., Sqrt(3), LeftLegOf(John), ...
• Model (world) = set of domain objects, relations, functions• Interpretation maps symbols onto the model (world)
– Very many interpretations are possible for each KB and world!– Job of the KB is to rule out models inconsistent with our knowledge.
27
Syntax of FOL: Terms
• Term = logical expression that refers to an object
• There are two kinds of terms:
– Constant Symbols stand for (or name) objects:• E.g., KingJohn, 2, UCI, Wumpus, ...
– Function Symbols map tuples of objects to an object:• E.g., LeftLeg(KingJohn), Mother(Mary), Sqrt(x)• This is nothing but a complicated kind of name
– No “subroutine” call, no “return value”
28
Syntax of FOL: Atomic Sentences
• Atomic Sentences state facts (logical truth values).– An atomic sentence is a Predicate symbol, optionally
followed by a parenthesized list of any argument terms– E.g., Married( Father(Richard), Mother(John) )– An atomic sentence asserts that some relationship (some
predicate) holds among the objects that are its arguments.
• An Atomic Sentence is true in a given model if the relation referred to by the predicate symbol holds among the objects (terms) referred to by the arguments.
29
Syntax of FOL: Connectives & Complex Sentences
• Complex Sentences are formed in the same way, and are formed using the same logical connectives, as we already know from propositional logic
• The Logical Connectives: ⇔ (biconditional), ⇒ (implication), ∧ (and), ∨ (or), ¬ (negation)
• Semantics for these logical connectives are the same as we already know from propositional logic.
30
Syntax of FOL: Variables
• Variables range over objects in the world.
• A variable is like a term because it represents an object.
• A variable may be used wherever a term may be used.– Variables may be arguments to functions and predicates.
• (A term with NO variables is called a ground term.)• (A variable not bound by a quantifier is called free.)
31
Syntax of FOL: Logical Quantifiers
• There are two Logical Quantifiers:
– Universal: ∀x P(x) means "For all x, P(x)."
• The "upside-down A" reminds you of "ALL."
– Existential: ∃x P(x) means "There exists x such that P(x)."
• The "backwards E" reminds you of "EXISTS."
• Syntactic "sugar" --- we really only need one quantifier.
∀x P(x) ≡ ¬∃x ¬P(x)    ∃x P(x) ≡ ¬∀x ¬P(x)
– You can ALWAYS convert one quantifier to the other.
• RULES: ∀ ≡ ¬∃¬ and ∃ ≡ ¬∀¬
• RULE: To move negation "in" across a quantifier, change the quantifier to "the other quantifier" and negate the predicate on "the other side."
¬∀x P(x) ≡ ∃x ¬P(x)    ¬∃x P(x) ≡ ∀x ¬P(x)
32
Combining Quantifiers --- Order (Scope)
The order of "unlike" quantifiers is important.
∀x ∃y Loves(x,y)
– For everyone ("all x") there is someone ("exists y") whom they love.
∃y ∀x Loves(x,y)
– There is someone ("exists y") whom everyone ("all x") loves.
Clearer with parentheses: ∃y ( ∀x Loves(x,y) )
The order of "like" quantifiers does not matter.
∀x ∀y P(x, y) ≡ ∀y ∀x P(x, y)    ∃x ∃y P(x, y) ≡ ∃y ∃x P(x, y)
33
De Morgan’s Law for Quantifiers
De Morgan's Rule:
¬(P ∨ Q) ≡ ¬P ∧ ¬Q
¬(P ∧ Q) ≡ ¬P ∨ ¬Q
P ∧ Q ≡ ¬(¬P ∨ ¬Q)
P ∨ Q ≡ ¬(¬P ∧ ¬Q)

Generalized De Morgan's Rule:
¬∀x P ≡ ∃x ¬P
¬∃x P ≡ ∀x ¬P
∀x P ≡ ¬∃x ¬P
∃x P ≡ ¬∀x ¬P

The rule is simple: if you bring a negation inside a disjunction or a conjunction, always switch between them (∨ becomes ∧, ∧ becomes ∨).
34
Inference in Formal Symbol Systems:Ontology, Representation, Inference
• Formal Symbol Systems
– Symbols correspond to things/ideas in the world
– Pattern matching corresponds to inference
• Ontology: What exists in the world?
– What must be represented?
• Representation: Syntax vs. Semantics
– What's Said vs. What's Meant
• Inference: Schema vs. Mechanism
– Proof Steps vs. Search Strategy
35
Ontology:What kind of things exist in the world?What do we need to describe and reason about?
(Diagram.) Reasoning =
Representation (a formal symbol system): Syntax (what is said) vs. Semantics (what it means)
+ Inference (formal pattern matching): Schema (rules of inference) vs. Execution (search strategy)
36
Review
• Definitions:
– Syntax, Semantics, Sentences, Propositions, Entails, Follows, Derives, Inference, Sound, Complete, Model, Satisfiable, Valid (or Tautology)
• Syntactic Transformations:
– E.g., (A ⇒ B) ≡ (¬A ∨ B)
• Semantic Transformations:
– E.g., (KB ╞ α) ≡ (╞ (KB ⇒ α))
• Truth Tables:
– Negation, Conjunction, Disjunction, Implication, Equivalence (Biconditional)
– Inference by Model Enumeration
37
Review: Schematic perspective
If KB is true in the real world,
then any sentence entailed by KB
is also true in the real world.
38
So --- how do we keep it from“Just making things up.” ?
“Einstein Simplified:Cartoons on Science”by Sydney Harris, 1992,Rutgers University Press
39
Schematic perspective
If KB is true in the real world,
then any sentence derived from KB by a sound inference procedure
is also true in the real world.
(Diagram: Sentences ─Derives→ Sentence, via an inference procedure.)
40
Logical inference
• The notion of entailment can be used for logical inference.
– Model checking (see wumpus example): enumerate all possible models and check whether α is true in every model in which KB is true.
• Sound (or truth preserving): the algorithm only derives entailed sentences.
– Otherwise it just makes things up.
– i is sound iff whenever KB ├i α, it is also true that KB ╞ α
– E.g., model checking is sound.
• Complete: the algorithm can derive every entailed sentence.
– i is complete iff whenever KB ╞ α, it is also true that KB ├i α
41
Proof methods
• Proof methods divide into (roughly) two kinds:
Application of inference rules: legitimate (sound) generation of new sentences from old.
– Resolution
– Forward & backward chaining
Model checking: searching through truth assignments.
– Improved backtracking: Davis-Putnam-Logemann-Loveland (DPLL)
– Heuristic search in model space: WalkSAT
42
Propositional Logic --- Summary
• Logical agents apply inference to a knowledge base to derive new information and make decisions
• Basic concepts of logic:
– syntax: formal structure of sentences
– semantics: truth of sentences wrt models
– entailment: necessary truth of one sentence given another
– inference: deriving sentences from other sentences
– soundness: derivations produce only entailed sentences
– completeness: derivations can produce all entailed sentences
– validity: sentence is true in every model (a tautology)
• Logical equivalences allow syntactic manipulations
• Propositional logic lacks expressive power
– Can only state specific facts about the world.
– Cannot express general rules about the world (use First Order Predicate Logic).
43
Propositional Logic --- Review
• Definitions:
– Syntax, Semantics, Sentences, Propositions, Entails, Follows, Derives, Inference, Sound, Complete, Model, Satisfiable, Valid (or Tautology)
• Syntactic Transformations:
– E.g., (A ⇒ B) ≡ (¬A ∨ B)
• Semantic Transformations:
– E.g., (KB ╞ α) ≡ (╞ (KB ⇒ α))
• Truth Tables:
– Negation, Conjunction, Disjunction, Implication, Equivalence (Biconditional)
– Inference by Model Enumeration
44
Entailment
• Entailment means that one thing follows from another:
KB ╞ α
• Knowledge base KB entails sentence α if and only if α is true in all worlds where KB is true
– E.g., the KB containing “the Giants won and the Reds won” entails “The Giants won”.
– E.g., x+y = 4 entails 4 = x+y
– E.g., "Mary is Sue's sister and Amy is Sue's daughter" entails "Mary is Amy's aunt."
45
Models
• Logicians typically think in terms of models, which are formally structured worlds with respect to which truth can be evaluated
• We say m is a model of a sentence α if α is true in m
• M(α) is the set of all models of α
• Then KB ╞ α iff M(KB) ⊆ M(α)
– E.g., KB = "Giants won and Reds won", α = "Giants won"
• Think of KB and α as collections of constraints and of models m as possible states. M(KB) are the solutions to KB and M(α) the solutions to α. Then, KB ╞ α when all solutions to KB are also solutions to α.
46
Logical inference
• The notion of entailment can be used for logic inference.– Model checking (see wumpus example):
enumerate all possible models and check whether is true.
• Sound (or truth preserving):The algorithm only derives entailed sentences.– Otherwise it just makes things up.
i is sound iff whenever KB |-i it is also true that KB|= – E.g., model-checking is sound
• Complete:The algorithm can derive every entailed sentence.
i is complete iff whenever KB |= it is also true that KB|-i
47
Inference Procedures
• KB ├i α = sentence α can be derived from KB by procedure i
• Soundness: i is sound if whenever KB ├i α, it is also true that KB╞ α (no wrong inferences, but maybe not all inferences)
• Completeness: i is complete if whenever KB╞ α, it is also true that KB ├i α (all inferences can be made, but maybe some wrong extra ones as well)
48
Recap propositional logic: Syntax
• Propositional logic is the simplest logic --- illustrates basic ideas
• The proposition symbols P1, P2, etc. are sentences
– If S is a sentence, ¬S is a sentence (negation)
– If S1 and S2 are sentences, S1 ∧ S2 is a sentence (conjunction)
– If S1 and S2 are sentences, S1 ∨ S2 is a sentence (disjunction)
– If S1 and S2 are sentences, S1 ⇒ S2 is a sentence (implication)
– If S1 and S2 are sentences, S1 ⇔ S2 is a sentence (biconditional)
49
Recap propositional logic: Semantics
Each model/world specifies true or false for each proposition symbol.
E.g.,  P1,2    P2,2   P3,1
       false   true   false
With these symbols, there are 8 possible models, which can be enumerated automatically.
Rules for evaluating truth with respect to a model m:
¬S is true iff S is false
S1 ∧ S2 is true iff S1 is true and S2 is true
S1 ∨ S2 is true iff S1 is true or S2 is true
S1 ⇒ S2 is true iff S1 is false or S2 is true (i.e., it is false iff S1 is true and S2 is false)
S1 ⇔ S2 is true iff S1 ⇒ S2 is true and S2 ⇒ S1 is true
A simple recursive process evaluates an arbitrary sentence, e.g.,
¬P1,2 ∧ (P2,2 ∨ P3,1) = true ∧ (true ∨ false) = true ∧ true = true
50
Recap truth tables for connectives
OR: P ∨ Q is true if P or Q is true, or both are true.
XOR: P xor Q is true if P or Q is true, but not both.
Implication is always true when the premise is false!
51
Inference by enumeration
• Enumeration of all models is sound and complete.
• For n symbols, the time complexity is O(2^n)...
• We need a smarter way to do inference!
• In particular, we are going to infer new logical sentences from the data-base and see if they match a query.
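Inference by enumeration can be sketched directly from the truth-evaluation rules above; the sentence encoding (nested tuples with operator tags) is an illustrative assumption of mine:

```python
from itertools import product

def pl_true(sentence, model):
    """Evaluate a sentence in a model (dict: symbol -> bool).
    Sentences: 'P', ('not', s), ('and', s1, s2), ('or', s1, s2),
    ('implies', s1, s2), ('iff', s1, s2)."""
    if isinstance(sentence, str):
        return model[sentence]
    op, *args = sentence
    vals = [pl_true(a, model) for a in args]
    if op == 'not':     return not vals[0]
    if op == 'and':     return all(vals)
    if op == 'or':      return any(vals)
    if op == 'implies': return (not vals[0]) or vals[1]   # false iff T => F
    if op == 'iff':     return vals[0] == vals[1]
    raise ValueError(op)

def tt_entails(kb, alpha, symbols):
    """KB ⊨ α iff α holds in every model where KB holds. Enumerates all 2^n models."""
    for values in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if pl_true(kb, model) and not pl_true(alpha, model):
            return False        # found a model of KB where α fails
    return True
```

For instance, with KB = "Giants won ∧ Reds won", tt_entails confirms the entailment of "Giants won", and correctly rejects "Reds won" given only "Giants won". The loop makes the O(2^n) cost explicit.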
52
Logical equivalence
• To manipulate logical sentences we need some rewrite rules.
• Two sentences are logically equivalent iff they are true in same models: α ≡ ß iff α╞ β and β╞ α
You need to know these!
53
Validity and satisfiability
A sentence is valid if it is true in all models,
e.g., True, A ∨ ¬A, A ⇒ A, (A ∧ (A ⇒ B)) ⇒ B
Validity is connected to inference via the Deduction Theorem:
KB ╞ α if and only if (KB ⇒ α) is valid
A sentence is satisfiable if it is true in some model,
e.g., A ∨ B, C
A sentence is unsatisfiable if it is false in all models,
e.g., A ∧ ¬A
Satisfiability is connected to inference via the following:
KB ╞ α if and only if (KB ∧ ¬α) is unsatisfiable
(there is no model for which KB is true and α is false)
54
Propositional Logic --- Summary
• Logical agents apply inference to a knowledge base to derive new information and make decisions
• Basic concepts of logic:
– syntax: formal structure of sentences
– semantics: truth of sentences wrt models
– entailment: necessary truth of one sentence given another
– inference: deriving sentences from other sentences
– soundness: derivations produce only entailed sentences
– completeness: derivations can produce all entailed sentences
• Resolution is refutation-complete for propositional logic.
• Forward and backward chaining are linear-time, and complete for Horn clauses.
• Propositional logic lacks expressive power
55
Constraint Satisfaction Problems
• What is a CSP?
– Finite set of variables X1, X2, …, Xn
– Nonempty domain of possible values for each variable: D1, D2, …, Dn
– Finite set of constraints C1, C2, …, Cm
• Each constraint Ci limits the values that variables can take, e.g., X1 ≠ X2
– Each constraint Ci is a pair <scope, relation>
• Scope = tuple of variables that participate in the constraint.
• Relation = list of allowed combinations of variable values. May be an explicit list of allowed combinations, or an abstract relation supporting membership testing and listing.
• CSP benefits
– Standard representation pattern
– Generic goal and successor functions
– Generic heuristics (no domain-specific expertise).
56
CSPs --- what is a solution?
• A state is an assignment of values to some or all variables.
– An assignment is complete when every variable has a value.
– An assignment is partial when some variables have no values.
• Consistent assignment– assignment does not violate the constraints
• A solution to a CSP is a complete and consistent assignment.
• Some CSPs require a solution that maximizes an objective function.
57
CSP as a standard search problem
• A CSP can easily be expressed as a standard search problem.
• Incremental formulation
– Initial State: the empty assignment {}
– Actions (3rd ed.), Successor function (2nd ed.): Assign a value to an unassigned variable provided that it does not violate a constraint
– Goal test: the current assignment is complete (by construction it is consistent)
– Path cost: constant cost for every step (not really relevant)
• Can also use complete-state formulation– Local search techniques (Chapter 4) tend to work well
58
Improving CSP efficiency
• Previous improvements on uninformed search introduce heuristics
• For CSPs, general-purpose methods can give large gains in speed, e.g.:
– Which variable should be assigned next?
– In what order should its values be tried?
– Can we detect inevitable failure early?
– Can we take advantage of problem structure?
Note: CSPs are somewhat generic in their formulation, and so the heuristics are more general compared to the methods in Chapter 4.
60
Minimum remaining values (MRV)
var ← SELECT-UNASSIGNED-VARIABLE(VARIABLES[csp], assignment, csp)
• A.k.a. the most constrained variable heuristic
• Heuristic Rule: choose the variable with the fewest legal values
– e.g., will immediately detect failure if X has no legal values
61
Degree heuristic for the initial variable
• Heuristic Rule: select variable that is involved in the largest number of constraints on other unassigned variables.
• Degree heuristic can be useful as a tie breaker.
• In what order should a variable’s values be tried?
62
Least constraining value for value-ordering
• Least constraining value heuristic
• Heuristic Rule: given a variable choose the least constraining value– leaves the maximum flexibility for subsequent variable assignments
63
Forward checking
• Assign {Q=green}
• Effects on other variables connected by constraints to Q:
– NT can no longer be green
– NSW can no longer be green
– SA can no longer be green
• The MRV heuristic would automatically select NT or SA next
64
Forward checking
• If V is assigned blue
• Effects on other variables connected by constraints to V:
– NSW can no longer be blue
– SA is empty
• FC has detected that the partial assignment is inconsistent with the constraints, and backtracking can occur.
65
Arc consistency
• An arc X → Y is consistent if for every value x of X there is some value y of Y consistent with x
(note that this is a directed property)
• Consider the state of the search after WA and Q are assigned:
SA → NSW is consistent if SA=blue and NSW=red
66
Arc consistency
• X → Y is consistent if for every value x of X there is some value y consistent with x
• NSW → SA is consistent if NSW=red and SA=blue, NSW=blue and SA=???
67
Arc consistency
• Can enforce arc consistency: the arc can be made consistent by removing blue from NSW
• Continue to propagate constraints…
– Check V → NSW
– Not consistent for V = red
– Remove red from V
68
Arc consistency
• Continue to propagate constraints…
• SA → NT is not consistent
– and cannot be made consistent
• Arc consistency detects failure earlier than forward checking
69
Arc consistency algorithm (AC-3)
function AC-3(csp) returns the CSP, possibly with reduced domains
  inputs: csp, a binary CSP with variables {X1, X2, …, Xn}
  local variables: queue, a queue of arcs, initially all the arcs in csp
  while queue is not empty do
    (Xi, Xj) ← REMOVE-FIRST(queue)
    if REMOVE-INCONSISTENT-VALUES(Xi, Xj) then
      for each Xk in NEIGHBORS[Xi] do
        add (Xk, Xi) to queue

function REMOVE-INCONSISTENT-VALUES(Xi, Xj) returns true iff we remove a value
  removed ← false
  for each x in DOMAIN[Xi] do
    if no value y in DOMAIN[Xj] allows (x,y) to satisfy the constraint between Xi and Xj
      then delete x from DOMAIN[Xi]; removed ← true
  return removed
(from Mackworth, 1977)
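A runnable sketch of AC-3 in Python, assuming binary constraints given as a single predicate (the data-structure choices here are mine, not Mackworth's):

```python
from collections import deque

def revise(domains, xi, xj, constraint):
    """Remove values of xi that have no consistent value in xj; True if any removed."""
    removed = False
    for x in list(domains[xi]):
        if not any(constraint(x, y) for y in domains[xj]):
            domains[xi].remove(x)
            removed = True
    return removed

def ac3(variables, domains, neighbors, constraint):
    """Enforce arc consistency in place; domains maps var -> set of values.
    Returns False as soon as some domain is wiped out (inconsistency detected)."""
    queue = deque((xi, xj) for xi in variables for xj in neighbors[xi])
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, xi, xj, constraint):
            if not domains[xi]:
                return False                     # empty domain: no solution
            for xk in neighbors[xi]:
                if xk != xj:
                    queue.append((xk, xi))       # re-check arcs into xi
    return True
```

On the map-coloring fragment above (WA fixed to red, with NT and SA as mutual neighbors), AC-3 prunes red from both NT and SA; with WA and NT both forced to red, it detects the inconsistency immediately.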
70
Local search for CSP
function MIN-CONFLICTS(csp, max_steps) returns a solution or failure
  inputs: csp, a constraint satisfaction problem
          max_steps, the number of steps allowed before giving up
  current ← an initial complete assignment for csp
  for i = 1 to max_steps do
    if current is a solution for csp then return current
    var ← a randomly chosen, conflicted variable from VARIABLES[csp]
    value ← the value v for var that minimizes CONFLICTS(var, v, current, csp)
    set var = value in current
  return failure
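A Python sketch of MIN-CONFLICTS for binary CSPs (representation choices are illustrative; note the algorithm is randomized local search, so in principle it can fail to find a solution within max_steps even when one exists):

```python
import random

def min_conflicts(variables, domains, neighbors, constraint, max_steps=10000):
    """Local search: start from a random complete assignment, then repeatedly
    reassign a conflicted variable to its minimum-conflict value."""
    current = {v: random.choice(list(domains[v])) for v in variables}

    def conflicts(var, val):
        # number of neighbors whose current value violates the constraint
        return sum(1 for n in neighbors[var] if not constraint(val, current[n]))

    for _ in range(max_steps):
        conflicted = [v for v in variables if conflicts(v, current[v]) > 0]
        if not conflicted:
            return current                    # complete, consistent assignment
        var = random.choice(conflicted)
        current[var] = min(domains[var], key=lambda val: conflicts(var, val))
    return None                               # gave up
```

On the Australia map-coloring problem with three colors this typically converges in a handful of steps, which is the "often effective in practice" behavior the slide refers to.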
71
Graph structure and problem complexity
• Solving disconnected subproblems
– Suppose each subproblem has c variables out of a total of n.
– Worst-case solution cost is O((n/c) · d^c), i.e., linear in n
• Instead of O(d^n), exponential in n
• E.g., n = 80, c = 20, d = 2
– 2^80 = 4 billion years at 1 million nodes/sec
– 4 · 2^20 = 0.4 seconds at 1 million nodes/sec
72
Tree-structured CSPs
• Theorem:
– if a constraint graph has no loops, then the CSP can be solved in O(n·d²) time
– linear in the number of variables!
• Compare the difference with a general CSP, where the worst case is O(d^n)
73
Algorithm for Solving Tree-structured CSPs
– Choose some variable as root; order the variables from root to leaves such that every node's parent precedes it in the ordering.
• Label the variables X1 to Xn
• Every variable now has 1 parent
– Backward Pass
• For j from n down to 2, apply arc consistency to the arc (Parent(Xj) → Xj)
• Remove values from Parent(Xj) if needed
– Forward Pass
• For j from 1 to n, assign Xj consistently with Parent(Xj)
74
Tree CSP complexity
• Backward pass
– n arc checks
– Each has complexity at worst d²
• Forward pass
– n variable assignments, O(nd)
Overall complexity is O(n·d²)
The algorithm works because if the backward pass succeeds, then every variable by definition has a legal assignment in the forward pass.
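The backward/forward passes can be sketched as follows, assuming the root-first ordering and parent pointers are given (function and parameter names are my own illustrative choices):

```python
def solve_tree_csp(parent, order, domains, constraint):
    """Solve a tree-structured binary CSP in O(n·d²).
    order: variables root-first; parent: var -> its parent (None for the root).
    constraint(parent_val, child_val) tests the arc relation."""
    domains = {v: set(domains[v]) for v in order}    # work on copies

    # Backward pass: make each arc (Parent(Xj) -> Xj) consistent, leaves first.
    for xj in reversed(order[1:]):
        p = parent[xj]
        domains[p] = {pv for pv in domains[p]
                      if any(constraint(pv, cv) for cv in domains[xj])}
        if not domains[p]:
            return None                              # inconsistency detected

    # Forward pass: assign each variable consistently with its parent.
    # Cannot fail if the backward pass succeeded (the theorem above).
    assignment = {}
    for v in order:
        if parent[v] is None:
            assignment[v] = next(iter(domains[v]))
        else:
            assignment[v] = next(cv for cv in domains[v]
                                 if constraint(assignment[parent[v]], cv))
    return assignment
```

For example, on the chain A–B–C with the inequality constraint, domains {1,2} everywhere admit a solution, while A={1}, B={1,2}, C={2} is correctly rejected in the backward pass (A=1 forces B=2, which clashes with C=2).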
75
What about non-tree CSPs?
• General idea is to convert the graph to a tree
2 general approaches
1. Assign values to specific variables (Cycle Cutset method)
2. Construct a tree-decomposition of the graph- Connected subproblems (subgraphs) form a
tree structure
76
Cycle-cutset conditioning
• Choose a subset S of variables from the graph so that graph without S is a tree– S = “cycle cutset”
• For each possible consistent assignment for S– Remove any inconsistent values from
remaining variables that are inconsistent with S
– Use tree-structured CSP to solve the remaining tree-structure
• If it has a solution, return it along with S• If not, continue to try other assignments for S
77
78
Tree Decompositions
79
Rules for a Tree Decomposition
• Every variable appears in at least one of the subproblems
• If two variables are connected in the original problem, they must appear together (with the constraint) in at least one subproblem
• If a variable appears in two subproblems, it must appear in each node on the path connecting those subproblems.
80
Summary
• CSPs – special kind of problem: states defined by values of a fixed set of variables,
goal test defined by constraints on variable values
• Backtracking=depth-first search with one variable assigned per node
• Heuristics– Variable ordering and value selection heuristics help significantly
• Constraint propagation does additional work to constrain values and detect inconsistencies
– Works effectively when combined with heuristics
• Iterative min-conflicts is often effective in practice.
• Graph structure of CSPs determines problem complexity– e.g., tree structured CSPs can be solved in linear time.
81
Game-Playing & Adversarial Search --- Overview
• Minimax Search with Perfect Decisions– Impractical in most cases, but theoretical basis for analysis
• Minimax Search with Cut-off– Replace terminal leaf utility by heuristic evaluation function
• Alpha-Beta Pruning– The fact of the adversary leads to an advantage in search!
• Practical Considerations– Redundant path elimination, look-up tables, etc.
• Game Search with Chance– Expectiminimax search
82
Games as Search
• Two players: MAX and MIN
• MAX moves first and they take turns until the game is over– Winner gets reward, loser gets penalty.– “Zero sum” means the sum of the reward and the penalty is a constant.
• Formal definition as a search problem:
– Initial state: set-up specified by the rules, e.g., initial board configuration of chess.
– Player(s): defines which player has the move in a state.
– Actions(s): returns the set of legal moves in a state.
– Result(s,a): transition model; defines the result of a move.
– (2nd ed.: Successor function: list of (move, state) pairs specifying legal moves.)
– Terminal-Test(s): is the game finished? True if finished, false otherwise.
– Utility function(s,p): gives the numerical value of terminal state s for player p.
• E.g., win (+1), lose (−1), and draw (0) in tic-tac-toe.
• E.g., win (+1), lose (0), and draw (1/2) in chess.
• MAX uses search tree to determine next move.
83
An optimal procedure: The Min-Max method
Designed to find the optimal strategy for Max and find best move:
• 1. Generate the whole game tree, down to the leaves.
• 2. Apply utility (payoff) function to each leaf.
• 3. Back-up values from leaves through branch nodes:– a Max node computes the Max of its child values– a Min node computes the Min of its child values
• 4. At root: choose the move leading to the child of highest value.
84
Game Trees
85
Pseudocode for Minimax Algorithm
function MINIMAX-DECISION(state) returns an action
  inputs: state, current state in game
  return arg max over a ∈ ACTIONS(state) of MIN-VALUE(RESULT(state, a))

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a in ACTIONS(state) do v ← MIN(v, MAX-VALUE(RESULT(state, a)))
  return v

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← −∞
  for a in ACTIONS(state) do v ← MAX(v, MIN-VALUE(RESULT(state, a)))
  return v
86
Properties of minimax
• Complete?
– Yes (if the tree is finite).
• Optimal?
– Yes (against an optimal opponent).
– Can it be beaten by an opponent playing sub-optimally? No. (Why not?)
• Time complexity?
– O(b^m)
• Space complexity?
– O(bm) (depth-first search, generate all actions at once)
– O(m) (depth-first search, generate actions one at a time)
87
Static (Heuristic) Evaluation Functions
• An Evaluation Function:
– Estimates how good the current board configuration is for a player.
– Typically, evaluate how good it is for the player and how good it is for the opponent, then subtract the opponent's score from the player's.
– Othello: number of white pieces − number of black pieces
– Chess: value of all white pieces − value of all black pieces
• Typical values from -infinity (loss) to +infinity (win) or [-1, +1].
• If the board evaluation is X for a player, it’s -X for the opponent– “Zero-sum game”
88
89
General alpha-beta pruning
• Consider a node n in the tree ---
• If player has a better choice at:– Parent node of n– Or any choice point further up
• Then n will never be reached in play.
• Hence, when that much is known about n, it can be pruned.
90
Alpha-beta Algorithm
• Depth-first search
– only considers nodes along a single path from the root at any time
α = highest-value choice found at any choice point of path for MAX (initially, α = −∞)
β = lowest-value choice found at any choice point of path for MIN (initially, β = +∞)
• Pass current values of α and β down to child nodes during search.
• Update values of α and β during search:
– MAX updates α at MAX nodes
– MIN updates β at MIN nodes
• Prune remaining branches at a node when α ≥ β
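A sketch of the procedure in Python, using a tiny "game tree as nested lists" stand-in for a real game interface (the `game` object and its method names are illustrative assumptions, not a standard API):

```python
import math

def alpha_beta(state, game, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Return the minimax value of state using alpha-beta pruning.
    game supplies actions(state), result(state, a), terminal(state), utility(state)."""
    if game.terminal(state):
        return game.utility(state)
    if maximizing:
        v = -math.inf
        for a in game.actions(state):
            v = max(v, alpha_beta(game.result(state, a), game, alpha, beta, False))
            alpha = max(alpha, v)        # MAX updates alpha
            if alpha >= beta:
                break                    # prune: MIN above will never allow this
        return v
    else:
        v = math.inf
        for a in game.actions(state):
            v = min(v, alpha_beta(game.result(state, a), game, alpha, beta, True))
            beta = min(beta, v)          # MIN updates beta
            if alpha >= beta:
                break                    # prune
        return v

class ListTreeGame:
    """Toy game: internal nodes are lists of children, leaves are utilities."""
    def actions(self, s):  return range(len(s))
    def result(self, s, a): return s[a]
    def terminal(self, s): return not isinstance(s, list)
    def utility(self, s):  return s
```

On the classic three-branch example tree [[3,12,8],[2,4,6],[14,5,2]] the root value is 3, and the second MIN node is cut off after its first leaf (2 ≤ α = 3), illustrating the α ≥ β rule.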
91
When to Prune
• Prune whenever α ≥ β.
– Prune below a MAX node whose alpha value becomes greater than or equal to the beta value of its ancestors.
• MAX nodes update alpha based on children's returned values.
– Prune below a MIN node whose beta value becomes less than or equal to the alpha value of its ancestors.
• MIN nodes update beta based on children's returned values.
94
Effectiveness of Alpha-Beta Search
• Worst case
– branches are ordered so that no pruning takes place; alpha-beta gives no improvement over exhaustive search
• Best case
– each player's best move is the left-most child (i.e., evaluated first)
– in practice, performance is closer to the best case than the worst case
– E.g., sort moves by the remembered move values found last time.
– E.g., expand captures first, then threats, then forward moves, etc.
– E.g., run Iterative Deepening search, sort by value from the last iteration.
• In practice we often get O(b^(d/2)) rather than O(b^d)
– this is the same as having a branching factor of √b, since (√b)^d = b^(d/2), i.e., we effectively go from b to √b
– e.g., in chess, go from b ≈ 35 to b ≈ 6
• this permits much deeper search in the same amount of time
95
96
97
Game-Playing & Adversarial Search --- Summary
• Game playing is best modeled as a search problem
• Game trees represent alternate computer/opponent moves
• Evaluation functions estimate the quality of a given board configuration for the Max player.
• Minimax is a procedure which chooses moves by assuming that the opponent will always choose the move which is best for them
• Alpha-Beta is a procedure which can prune large parts of the search tree and allow search to go deeper
• For many well-known games, computer algorithms based on heuristic search match or out-perform human world experts.
98
Outline
• Inference in First-Order Logic
• Knowledge Representation using First-Order Logic
• Propositional Logic
• Constraint Satisfaction Problems
• Game-Playing & Adversarial Search
• Questions on any topic