
AI Complete Notes


Complete AI notes (TU), compiled by Najar Aryal.


Chapter 1: Introduction

What is artificial intelligence?

It is the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable.

AI involves:
  Duplication of the human thought process by machine
  Learning from experience
  Interpreting ambiguities
  Rapid response to varying situations
  Applying reasoning to problem solving
  Manipulating the environment by applying knowledge
  Thinking and reasoning

Yes, but what is intelligence?

Intelligence is the computational part of the ability to achieve goals in the world. Varying kinds and degrees of intelligence occur in people, many animals and some machines.

Isn't there a solid definition of intelligence that doesn't depend on relating it to human intelligence?

Not yet. The problem is that we cannot yet characterize in general what kinds of computational procedures we want to call intelligent. We understand some of the mechanisms of intelligence and not others.

Acting humanly: The Turing Test approach
A computer passes the Turing Test if a human interrogator, after posing some written questions, cannot reliably tell whether the written responses come from a person or from a machine.

Fig. The imitation game

Abridged history of AI (summary)

1943     McCulloch & Pitts: Boolean circuit model of the brain
1950     Turing's "Computing Machinery and Intelligence"
1956     Dartmouth meeting: the name "Artificial Intelligence" adopted
1950s    Early AI programs, including Samuel's checkers program, Newell & Simon's Logic Theorist, Gelernter's Geometry Engine
1965     Robinson's complete algorithm for logical reasoning
1966-73  AI discovers computational complexity; neural network research almost disappears
1969-79  Early development of knowledge-based systems
1980-    AI becomes an industry
1986-    Neural networks return to popularity
1987-    AI becomes a science
1995-    The emergence of intelligent agents


Goals of AI

Replicate human intelligence

"AI is the study of complex information processing problems that often have their roots in some aspect of biological information processing. The goal of the subject is to identify solvable and interesting information processing problems, and solve them." -- David Marr

Solve knowledge-intensive tasks

"AI is the design, study and construction of computer programs that behave intelligently." -- Tom Dean

"... to achieve their full impact, computer systems must have more than processing power--they must have intelligence. They need to be able to assimilate and use large bodies of information and collaborate with and help people find new ways of working together effectively. The technology must become more responsive to human needs and styles of work, and must employ more natural means of communication." -- Barbara Grosz and Randall Davis

Intelligent connection of perception and action

AI is not centered around representation of the world, but around action in the world: behavior-based intelligence. (See Rod Brooks in the movie Fast, Cheap and Out of Control.)

Enhance human-human, human-computer and computer-computer interaction/communication

Computer can sense and recognize its users, see and recognize its environment, respond visually and audibly to stimuli. New paradigms for interacting productively with computers using speech, vision, natural language, 3D virtual reality, 3D displays, more natural and powerful user interfaces, etc. (See, for example, projects in Microsoft's "Advanced Interactivity and Intelligence" group.)

Some Application Areas of AI

Game Playing: the Deep Blue chess program beat world champion Garry Kasparov.

Speech Recognition: PEGASUS, a spoken-language interface to American Airlines' EAASY SABRE reservation system, allows users to obtain flight information and make reservations over the telephone. The 1990s saw significant advances in speech recognition, so that limited systems are now successful.

Computer Vision Face recognition programs in use by banks, government, etc. The ALVINN system from CMU autonomously drove a van from Washington, D.C. to San Diego (all but 52 of 2,849 miles), averaging 63 mph day and night, and in all weather conditions. Handwriting recognition, electronics and manufacturing inspection, photointerpretation, baggage inspection, reverse engineering to automatically construct a 3D geometric model.

Expert Systems Application-specific systems that rely on obtaining the knowledge of human experts in an area and programming that knowledge into a system.

o Diagnostic Systems Microsoft Office Assistant in Office 97 provides customized help by decision-theoretic reasoning about an individual user. MYCIN system for diagnosing bacterial infections of the blood and suggesting treatments. Intellipath pathology diagnosis


system (AMA approved). Pathfinder medical diagnosis system, which suggests tests and makes diagnoses. Whirlpool customer assistance center.

o System Configuration DEC's XCON system for custom hardware configuration. Radiotherapy treatment planning.

o Financial Decision Making Credit card companies, mortgage companies, banks, and the U.S. government employ AI systems to detect fraud and expedite financial transactions. For example, AMEX credit check. Systems often use learning algorithms to construct profiles of customer usage patterns, and then use these profiles to detect unusual patterns and take appropriate action.

o Classification Systems Put information into one of a fixed set of categories using several sources of information. E.g., financial decision making systems. NASA developed a system for classifying very faint areas in astronomical images into either stars or galaxies with very high accuracy by learning from human experts' classifications.

Mathematical Theorem Proving Use inference methods to prove new theorems.

Natural Language Understanding: AltaVista's translation of web pages. Translation of Caterpillar truck manuals into 20 languages. (Note: one early system translated the English sentence "The spirit is willing but the flesh is weak" into the Russian equivalent of "The vodka is good but the meat is rotten.")

Scheduling and Planning: automatic scheduling for manufacturing. DARPA's DART system, used in the Desert Storm and Desert Shield operations to plan the logistics of people and supplies. American Airlines' rerouting contingency planner. European Space Agency planning and scheduling of spacecraft assembly, integration and verification.

Some AI "Grand Challenge" Problems

  Translating telephone
  Accident-avoiding car
  Aids for the disabled
  Smart clothes
  Intelligent agents that monitor and manage information by filtering, digesting, abstracting
  Tutors
  Self-organizing systems, e.g., that learn to assemble something by observing a human do it

A Framework for Building AI Systems

Perception
Intelligent biological systems are physically embodied in the world and experience the world through their sensors (senses). For an autonomous vehicle, input might be images from a camera and range information from a rangefinder. For a medical diagnosis system, perception is the set of symptoms and test results that have been obtained and input to the system manually. Includes the areas of vision, speech processing, natural language processing, and signal processing (e.g., market data and acoustic data).

Reasoning
Inference, decision-making, and classification from what is sensed and from the internal "model" of the world. Might be a neural network, a logical deduction system, Hidden Markov Model induction, heuristic search of a problem space, Bayes network inference, genetic algorithms, etc. Includes the areas of knowledge representation, problem solving, decision theory, planning, game theory, machine learning, uncertainty reasoning, etc.


Action
Biological systems interact with their environment by actuation, speech, etc. All behavior is centered around actions in the world. Examples include controlling the steering of a Mars rover or autonomous vehicle, or suggesting tests and making diagnoses for a medical diagnosis system. Includes the areas of robot actuation, natural language generation, and speech synthesis.

Some Fundamental Issues for Most AI Problems

Representation
Facts about the world have to be represented in some way; e.g., mathematical logic is one language that is used in AI. This deals with the questions of what to represent and how to represent it. How to structure knowledge? What is explicit, and what must be inferred? How to encode "rules" for inferencing so as to find information that is only implicitly known? How to deal with incomplete, inconsistent, and probabilistic knowledge? Epistemology issues (what kinds of knowledge are required to solve problems).

Example: "The fly buzzed irritatingly on the window pane. Jill picked up the newspaper." Inference: Jill has malicious intent; she is not intending to read the newspaper, or use it to start a fire, or ...

Example: Given 17 sticks in 3 x 2 grid, remove 5 sticks to leave exactly 3 squares.

Search
Many tasks can be viewed as searching a very large problem space for a solution. For example, checkers has about 10^40 states, and chess has about 10^120 states in a typical game. Use of heuristics (meaning "serving to aid discovery") and constraints.

Inference
From some facts others can be inferred; this is related to search. For example, knowing "All elephants have trunks" and "Clyde is an elephant," can we answer the question "Does Clyde have a trunk?" What about "Peanuts has a trunk, is it an elephant?" Or "Peanuts lives in a tree and has a trunk, is it an elephant?" Deduction, abduction, non-monotonic reasoning, reasoning under uncertainty.

Learning
Inductive inference, neural networks, genetic algorithms, artificial life, evolutionary approaches.

Planning
Starting with general facts about the world, facts about the effects of basic actions, facts about a particular situation, and a statement of a goal, generate a strategy for achieving that goal in terms of a sequence of primitive steps or actions.

The State of the Art
  A computer beats a human in a chess game.
  Computer-human conversation using speech recognition.
  A computer program can chat with a human.
  An expert system controls a spacecraft.
  A robot can walk on stairs and hold a cup of water.
  Language translation for web pages.
  Home appliances use fuzzy logic.


Agent and Environment

An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. A human agent has eyes, ears, and other organs for sensors and hands, legs, mouth, and other body parts for actuators. A robotic agent might have cameras and infrared range finders for sensors and various motors for actuators. A software agent receives keystrokes, file contents, and network packets as sensory inputs and acts on the environment by displaying on the screen, writing files, and sending network packets. We will make the general assumption that every agent can perceive its own actions (but not always their effects).

We use the term percept to refer to the agent's perceptual inputs at any given instant. An agent's percept sequence is the complete history of everything the agent has ever perceived. In general, an agent's choice of action at any given instant can depend on the entire percept sequence observed to date

If we can specify the agent's choice of action for every possible percept sequence, then we have said more or less everything there is to say about the agent. Mathematically speaking, we say that an agent's behavior is described by the agent function that maps any given percept sequence to an action.

f : P* → A
The agent program runs on the physical architecture to produce f.


Fig. Agents interact with environments through sensors and actuators

Fig. Vacuum cleaner world

Percepts: location and contents, e.g., [A, Dirty]
Actions: Left, Right, Suck, NoOp
For the Vacuum Cleaner Agent:

Percept sequence Action

[A, Clean] Right

[A, Dirty] Suck

[B, Clean] Left

[B, Dirty] Suck

[A, Clean], [A, Clean] Right

[A, Clean], [A, Dirty] Suck

function Reflex-Vacuum-Agent([location, status]) returns an action
  if status = Dirty then return Suck
  else if location = A then return Right
  else if location = B then return Left
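The table and pseudocode above translate directly into code. Below is a minimal Python sketch of the reflex vacuum agent, assuming the two-square world with locations A and B; the function name and the string constants are illustrative, not part of the notes.

# A minimal sketch of the reflex vacuum agent from the table above.
# The two-square world (locations "A" and "B") and the percept format
# [location, status] follow the notes; the rest is illustrative.

def reflex_vacuum_agent(percept):
    """Map a single percept [location, status] to an action."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    if location == "A":
        return "Right"
    return "Left"

# Example percepts and the actions the agent selects for them.
if __name__ == "__main__":
    for percept in (["A", "Clean"], ["A", "Dirty"], ["B", "Clean"], ["B", "Dirty"]):
        print(percept, "->", reflex_vacuum_agent(percept))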

Rationality

Definition of Rational Agent:

For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.

Rational ≠ omniscient (percepts may not supply all relevant information)
Rational ≠ clairvoyant (action outcomes may not be as expected)
Hence, rational ≠ successful


Rational ⇒ exploration, learning, autonomy

PEAS (Performance measure, Environment, Actuators, Sensors)

To design a rational agent, we must specify the task environments. Task environments are essentially the "problems" to which rational agents are the "solutions." Those task environments come in a variety of flavors and the flavor of the task environment directly affects the appropriate design for the agent program.

Consider, e.g., the task of designing an automated taxi:

Agent Type: Taxi driver
Performance Measure: Safe, fast, legal, comfortable trip, maximize profits
Environment: Roads, other traffic, pedestrians, customers
Actuators: Steering, accelerator, brake, signal, horn, display
Sensors: Cameras, sonar, speedometer, GPS, odometer, accelerometer, engine sensors, keyboard

Figure: PEAS description of the task environment for an automated taxi.

Agent Type: Medical diagnosis system
Performance Measure: Healthy patient, minimize costs, lawsuits
Environment: Patient, hospital, staff
Actuators: Display of questions, tests, diagnoses, treatments, referrals
Sensors: Keyboard entry of symptoms, findings, patient's answers

Agent Type: Internet shopping agent
Performance Measure: Price, quality, appropriateness, efficiency
Environment: WWW sites, vendors, shippers
Actuators: Display to user, follow URL, fill in form
Sensors: HTML pages (text, graphics, scripts)

The range of task environments that might arise in AI is obviously vast. We can, however, identify a fairly small number of dimensions along which task environments can be categorized.

Fully observable vs. partially observable:

If an agent's sensors give it access to the complete state of the environment at each point in time, then we say that the task environment is fully observable. An environment might be partially observable because of noisy and inaccurate sensors, or because parts of the state are simply missing from the sensor data. For example, a vacuum agent with only a local dirt sensor cannot tell whether there is dirt in other squares, and an automated taxi cannot see what other drivers are thinking.

Deterministic vs. stochastic. If the next state of the environment is completely determined by the current state and the action executed by the agent, then we say the environment is deterministic; otherwise, it is stochastic.

Episodic vs. sequential. In an episodic task environment, the agent's experience is divided into atomic episodes.


Each episode consists of the agent perceiving and then performing a single action. Crucially, the next episode does not depend on the actions taken in previous episodes. In episodic environments, the choice of action in each episode depends only on the episode itself. In sequential environments, on the other hand, the current decision could affect all future decisions. Chess and taxi driving are sequential: in both cases, short-term actions can have long-term consequences. Episodic environments are much simpler than sequential environments because the agent does not need to think ahead.

Static vs. dynamic. If the environment can change while an agent is deliberating, then we say the environment is dynamic for that agent; otherwise, it is static. If the environment itself does not change with the passage of time but the agent's performance score does, then we say the environment is semidynamic. Taxi driving is clearly dynamic: the other cars and the taxi itself keep moving while the driving algorithm dithers about what to do next. Chess, when played with a clock, is semidynamic. Crossword puzzles are static.

Discrete vs. continuous. The discrete/continuous distinction can be applied to the state of the environment, to the way time is handled, and to the percepts and actions of the agent. For example, a discrete-state environment such as a chess game has a finite number of distinct states. Chess also has a discrete set of percepts and actions. Taxi driving is a continuous-state and continuous-time problem.

Single agent vs. multiagent. Single-agent and multiagent environments are distinguished by the number of agents in the environment. For example, an agent solving a crossword puzzle by itself is clearly in a single-agent environment, whereas an agent playing chess is in a two-agent environment.

As one might expect, the hardest case is partially observable, stochastic, sequential, dynamic, continuous, and multiagent; the real world has all of these properties.

There are four basic kinds of agent programs that embody the principles underlying almost all intelligent systems:

• Simple reflex agents;

• Model-based reflex agents;

• Goal-based agents; and

• Utility-based agents.

All these can be turned into learning agents.

Agent types: simple reflex
  Selects actions on the basis of only the current percept, e.g. the vacuum agent.
  Large reduction in possible percept/action situations.
  Implemented through condition-action rules: if dirty then suck.


function REFLEX-VACUUM-AGENT([location, status]) returns an action
  if status == Dirty then return Suck
  else if location == A then return Right
  else if location == B then return Left

Reduction from 4^T to 4 entries.

Agent types: reflex with state (model-based reflex)
  To tackle partially observable environments:

  Maintain an internal state; over time, update the state using world knowledge:

  How does the world change? How do actions affect the world? ⇒ a model of the world.

Agent types: goal-based
  The agent needs a goal to know which situations are desirable.

    o Things become difficult when long sequences of actions are required to reach the goal.

  Typically investigated in search and planning research. Major difference: the future is taken into account. More flexible, since knowledge is represented explicitly and can be manipulated.


Agent types: utility-based
  Certain goals can be reached in different ways.

    o Some ways are better, i.e. have a higher utility. A utility function maps a (sequence of) state(s) onto a real number. Improves on goals by:

    o selecting between conflicting goals
    o selecting appropriately between several goals based on the likelihood of success.

Agent types: learning
  All previous agent programs describe methods for selecting actions.

    o Yet they do not explain the origin of these programs.
    o Learning mechanisms can be used to perform this task.
    o Teach them instead of instructing them.
    o The advantage is the robustness of the program toward initially unknown environments.

  Learning element: introduces improvements in the performance element.
  Critic: provides feedback on the agent's performance based on a fixed performance standard.
  Performance element: selects actions based on percepts; corresponds to the previous agent programs.


Problem generator: suggests actions that will lead to new and informative experiences.

Exploration vs. exploitation

KNOWLEDGE
• Data = collection of facts, measurements, statistics
• Information = organized data
• Knowledge = contextual, relevant, actionable information
  – Strong experiential and reflective elements
  – Good leverage and increasing returns
  – Dynamic
  – Branches and fragments with growth
  – Difficult to estimate impact of investment
  – Uncertain value in sharing
  – Evolves over time with experience

• Explicit knowledge
  – Objective, rational, technical
  – Policies, goals, strategies, papers, reports
  – Codified
  – "Leaky" knowledge

• Tacit knowledge
  – Subjective, cognitive, experiential learning
  – Highly personalized
  – Difficult to formalize
  – "Sticky" knowledge


Chapter 2: Problem Solving

Problem-solving agent
Four general steps in problem solving:

  Goal formulation
    o What are the successful world states?

  Problem formulation
    o What actions and states to consider to reach the goal?

  Search
    o Determine the possible sequences of actions that lead to states of known values, and then choose the best sequence.

  Execute
    o Given the solution, perform the actions.

function SIMPLE-PROBLEM-SOLVING-AGENT(percept) returns an action
  static: seq, an action sequence
          state, some description of the current world state
          goal, a goal
          problem, a problem formulation
  state ← UPDATE-STATE(state, percept)
  if seq is empty then
    goal ← FORMULATE-GOAL(state)
    problem ← FORMULATE-PROBLEM(state, goal)
    seq ← SEARCH(problem)
  action ← FIRST(seq)
  seq ← REST(seq)
  return action

EXAMPLE:

On holiday in Romania; currently in Arad.
  o Flight leaves tomorrow from Bucharest.

Formulate goal:
  o Be in Bucharest

Formulate problem:
  o States: various cities
  o Actions: drive between cities

Find solution:
  o Sequence of cities, e.g. Arad, Sibiu, Fagaras, Bucharest, ...


Selecting a state space
  The real world is absurdly complex.

  The state space must be abstracted for problem solving.
  (Abstract) state = set of real states.
  (Abstract) action = complex combination of real actions.
    o e.g. Arad → Zerind represents a complex set of possible routes, detours, rest stops, etc.
    o The abstraction is valid if the path between two states is reflected in the real world.

  (Abstract) solution = set of real paths that are solutions in the real world.
  Each abstract action should be "easier" than the real problem.

Formulating Problem as a Graph

In the graph

each node represents a possible state; a node is designated as the initial state; one or more nodes represent goal states, states in which the agent’s goal is considered

accomplished. each edge represents a state transition caused by specific agent action; associated to each edge is the cost of performing that transition.

State space graph of the vacuum world
Example: vacuum world
  States?? Two locations, each with or without dirt: 2 × 2² = 8 states.
  Initial state?? Any state can be initial.
  Actions?? {Left, Right, Suck}
  Goal test?? Check whether the squares are clean.
  Path cost?? Number of actions to reach the goal.

Example: 8-puzzle
  States?? Integer location of each tile.
  Initial state?? Any state can be initial.
  Actions?? {Left, Right, Up, Down}
  Goal test?? Check whether the goal configuration is reached.
  Path cost?? Number of actions to reach the goal.


Problem Solving as Search

Search space: set of states reachable from an initial state S0 via a (possibly empty/finite/infinite) sequence of state transitions.

To achieve the problem’s goal

search the space for a (possibly optimal) sequence of transitions starting from S0 and leading to a goal state;

execute (in order) the actions associated to each transition in the identified sequence.

Depending on the features of the agent’s world the two steps above can be interleaved.

How do we reach a goal state?

There may be several possible ways. Or none!

Factors to consider:

cost of finding a path; cost of traversing a path.

Problem Solving as Search
  Reduce the original problem to a search problem.
  A solution for the search problem is a path from the initial state to a goal state.
  The solution for the original problem is either
    o the sequence of actions associated with the path, or
    o the description of the goal state.

Example: The 8-puzzle
It can be generalized to the 15-puzzle, the 24-puzzle, or the (n² − 1)-puzzle for n ≥ 6.

States: configurations of tiles
Operators: move one tile Up/Down/Left/Right


There are 9! = 362,880 possible states (all permutations of {blank, 1, 2, 3, 4, 5, 6, 7, 8}).

There are 16! possible states for the 15-puzzle. Not all states are directly reachable from a given state.

(In fact, exactly half of them are reachable from a given state.)

How can an artificial agent represent the states and the statespace for this problem?

Go from state S to state G.
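One common answer, sketched below in Python and purely illustrative: represent a state as a flat tuple of nine numbers, with 0 standing for the blank cell, and represent the state space implicitly through a successor function that swaps the blank with an adjacent tile.

# Sketch: 8-puzzle states as a flat tuple of 9 numbers, 0 for the blank.
# Successors are generated by swapping the blank with a horizontally or
# vertically adjacent tile; the state space is never built explicitly.

def successors(state):
    """Return the list of states reachable from `state` in one move."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # up, down, left, right
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            other = r * 3 + c
            new_state = list(state)
            new_state[blank], new_state[other] = new_state[other], new_state[blank]
            result.append(tuple(new_state))
    return result

# Example: blank in the bottom-right corner, so only two moves are possible.
goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
print(successors(goal))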

Problem formulation
A problem is defined by:
  o An initial state, e.g. Arad
  o A successor function S(X) = set of action-state pairs, e.g. S(Arad) = {<Arad → Zerind, Zerind>, ...}
      initial state + successor function = state space
  o A goal test, which can be
      explicit, e.g. x = 'at Bucharest'
      implicit, e.g. checkmate(x)
  o A path cost (additive), e.g. sum of distances, number of actions executed, ...
      c(x, a, y) is the step cost, assumed to be >= 0

A solution is a sequence of actions from the initial state to a goal state. An optimal solution has the lowest path cost.

Problem formulation
1. Choose an appropriate data structure to represent the world states.
2. Define each operator as a precondition/effects pair, where the precondition holds exactly in the states the operator applies to, and the effects describe how a state changes into a successor state by the application of the operator.
3. Specify an initial state.
4. Provide a description of the goal (used to check if a reached state is a goal state).


Formulating the 8-puzzle Problem
States: each represented by a 3 × 3 array of numbers in [0 .. 8], where the value 0 is for the empty cell.

Operators: 24 operators of the form Op(r, c, d) where r, c ∈ {1, 2, 3}, d ∈ {L, R, U, D}. Op(r, c, d) moves the empty space at position (r, c) in direction d.

Example: Op(3, 2, R)

We have 24 operators in this problem formulation ... 20 too many!

Problem types

  Deterministic, fully observable ⇒ single-state problem
    o The agent knows exactly which state it will be in; the solution is a sequence.

  Partial knowledge of states and actions:
    o Non-observable ⇒ sensorless or conformant problem
      The agent may have no idea where it is; the solution (if any) is a sequence.
    o Nondeterministic and/or partially observable ⇒ contingency problem
      Percepts provide new information about the current state; the solution is a tree or policy; search and execution are often interleaved.
      If uncertainty is caused by the actions of another agent: adversarial problem.
    o Unknown state space ⇒ exploration problem ("online")
      When the states and actions of the environment are unknown.

  Problem solutions need well-defined problems: a well-defined problem must define the space of possible solutions. We use searching to solve well-defined problems.


Constraint satisfaction problems

What is a CSP?

  Finite set of variables V1, V2, ..., Vn
  Finite set of constraints C1, C2, ..., Cm
  Nonempty domain of possible values for each variable: DV1, DV2, ..., DVn
  Each constraint Ci limits the values that variables can take, e.g., V1 ≠ V2
  A state is defined as an assignment of values to some or all variables.
  Consistent assignment: an assignment that does not violate the constraints.
  An assignment is complete when every variable is assigned a value.
  A solution to a CSP is a complete assignment that satisfies all constraints.
  Some CSPs require a solution that maximizes an objective function.
  Applications: scheduling the time of observations on the Hubble Space Telescope, floor planning, map coloring, cryptography.
  CSPs are a special kind of problem: states are defined by values of a fixed set of variables, and the goal test is defined by constraints on the variable values.

Varieties of Constraints
  Unary constraints involve a single variable, e.g. SA ≠ green
  Binary constraints involve pairs of variables, e.g. SA ≠ WA
  Higher-order constraints involve 3 or more variables, e.g. cryptarithmetic column constraints.
  Preferences (soft constraints), e.g. red is better than green; often representable by a cost for each variable assignment → constrained optimization problems.

CSP example: map coloring

  Variables: WA, NT, Q, NSW, V, SA, T
  Domains: Di = {red, green, blue}
  Constraints: adjacent regions must have different colors.
    o E.g. WA ≠ NT (if the language allows this)
    o E.g. (WA, NT) ∈ {(red, green), (red, blue), (green, red), ...}

Solutions are assignments satisfying all constraints, e.g.


{WA=red,NT=green,Q=red,NSW=green,V=red,SA=blue,T=green}

CSP benefits
  Standard representation pattern
  Generic goal and successor functions
  Generic heuristics (no domain-specific expertise)

Constraint graph
  Constraint graph: nodes are variables, edges show constraints.
  The graph can be used to simplify the search, e.g. Tasmania is an independent subproblem.
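A minimal backtracking solver for the map-colouring CSP above, sketched in Python. The adjacency dictionary encodes the "different colour" constraints of the Australian map from the example; the function names and the plain "first unassigned variable" ordering are illustrative assumptions, not a prescribed implementation.

# Sketch: plain backtracking search for the Australia map-colouring CSP.
# Variables, domains and binary constraints follow the example above;
# no heuristics or constraint propagation are used.

NEIGHBOURS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"], "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"], "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}
COLOURS = ["red", "green", "blue"]

def consistent(var, value, assignment):
    """A value is consistent if no already-assigned neighbour has it."""
    return all(assignment.get(n) != value for n in NEIGHBOURS[var])

def backtrack(assignment=None):
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(NEIGHBOURS):               # complete assignment
        return assignment
    var = next(v for v in NEIGHBOURS if v not in assignment)  # pick a variable
    for value in COLOURS:
        if consistent(var, value, assignment):
            assignment[var] = value
            result = backtrack(assignment)
            if result is not None:
                return result
            del assignment[var]                           # undo and try again
    return None

print(backtrack())   # e.g. {'WA': 'red', 'NT': 'green', 'SA': 'blue', ...}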

Cryptarithmetic conventions

  Each letter or symbol represents only one digit throughout the problem.
  When letters are replaced by their digits, the resultant arithmetical operation must be correct.
  The numerical base, unless specifically stated, is 10.
  Numbers must not begin with a zero.
  There must be only one solution to the problem.

1.   S E N D
   + M O R E
   ---------
   M O N E Y

We see at once that M in the total must be 1, since the total of the column SM cannot reach as high as 20. Now if M in this column is replaced by 1, how can we make this column total as much as 10 to provide the 1 carried over to the left below? Only by making S very large: 9 or 8. In either case the letter O must stand for zero: the summation of SM could produce only 10 or 11, but we cannot use 1 for letter O as we have already used it for M.

If letter O is zero, then in column EO we cannot reach a total as high as 10, so that there will be no 1 to carry over from this column to SM. Hence S must positively be 9.

Since the summation EO gives N, and letter O is zero, N must be 1 greater than E and the column NR must total over 10. To put it into an equation: E + 1 = N

From the NR column we can derive the equation: N + R + (+ 1) = E + 10

We have to insert the expression (+ 1) because we don’t know yet whether 1 is carried over from column DE. But we do know that 1 has to be carried over from column NR to EO.

Subtract the first equation from the second: R + (+1) = 9

We cannot let R equal 9, since we already have S equal to 9. Therefore we will have to make R equal to 8; hence we know that 1 has to be carried over from column DE.

Column DE must total at least 12, since Y cannot be 1 or zero. What values can we give D and E to reach this total? We have already used 9 and 8 elsewhere. The only digits left that are high enough are 7, 6 and 7, 5. But remember that one of these has to be E, and N is 1 greater than E. Hence E must be 5, N must be 6, while D is 7. Then Y turns out to be 2, and the puzzle is completely solved.


  S E N D        9 5 6 7
+ M O R E      + 1 0 8 5
---------      ---------
M O N E Y      1 0 6 5 2
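The same answer can also be found mechanically. Below is a small brute-force Python sketch for SEND + MORE = MONEY that enforces the conventions listed earlier (distinct digits, no leading zeros); it is an illustrative enumeration rather than an efficient CSP solver.

# Sketch: brute-force solver for SEND + MORE = MONEY.
# It enumerates assignments of distinct digits to the eight letters,
# skips leading zeros, and checks the sum, confirming 9567 + 1085 = 10652.

from itertools import permutations

def solve():
    letters = "SENDMORY"                   # the eight distinct letters
    for digits in permutations(range(10), len(letters)):
        a = dict(zip(letters, digits))
        if a["S"] == 0 or a["M"] == 0:     # numbers must not begin with zero
            continue
        send  = 1000 * a["S"] + 100 * a["E"] + 10 * a["N"] + a["D"]
        more  = 1000 * a["M"] + 100 * a["O"] + 10 * a["R"] + a["E"]
        money = (10000 * a["M"] + 1000 * a["O"] + 100 * a["N"]
                 + 10 * a["E"] + a["Y"])
        if send + more == money:
            return send, more, money
    return None

print(solve())   # (9567, 1085, 10652)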

2.   T W O
   + T W O
   -------
   F O U R

Let's first check with F as 0, and imagine O with its highest possible value, 9. Then R must be 8 and T should be 4. Checking the remaining digits, U is 3, and thus W must be 6:

  T W O        4 6 9
+ T W O      + 4 6 9
-------      -------
F O U R      0 9 3 8

Game Playing
Summary
  Games are fun (and dangerous).
  They illustrate several important points about AI:
    Perfection is unattainable -> approximate.
    Good idea of what to think about.
    Uncertainty constrains the assignment of values to states.
  Games are to AI as Grand Prix racing is to automobile design.
  Games are a form of multi-agent environment:


    o What do other agents do and how do they affect our success?
    o Cooperative vs. competitive multi-agent environments.
    o Competitive multi-agent environments give rise to adversarial problems, a.k.a. games.

Why study games?
    o Fun; historically entertaining
    o Interesting subject of study because they are hard
      Chess: average branching factor 35, about 50 moves per player -> a search tree of about 35^100 nodes.

Relation of Search and Games

Search (no adversary)
  Solution is a (heuristic) method for finding a goal.
  Heuristics and CSP techniques can find the optimal solution.
  Evaluation function: estimate of cost from the start to the goal through a given node.
  Examples: path planning, scheduling activities.

Games (adversary)
  Solution is a strategy (a strategy specifies a move for every possible opponent reply).
  Time limits force an approximate solution.
  Evaluation function: evaluates the "goodness" of a game position.
  Examples: chess, checkers, Othello, backgammon.

Types Of Games

Multiplayer Games allow more than one player

Game setup

  Two players: MAX and MIN.
  MAX moves first and they take turns until the game is over. The winner gets an award, the loser gets a penalty.
  Games as search:
    o Initial state: e.g. board configuration of chess
    o Successor function: list of (move, state) pairs specifying legal moves
    o Terminal test: is the game finished?
    o Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1) and draw (0) in tic-tac-toe
  MAX uses the search tree to determine its next move.

Fig. Partial game tree for Tic-Tac-Toe


Optimal strategies
  Find the contingent strategy for MAX assuming an infallible MIN opponent.
  Assumption: both players play optimally.
  Given a game tree, the optimal strategy can be determined by using the minimax value of each node:

MINIMAX-VALUE(n) =
  UTILITY(n)                                    if n is a terminal node
  max of MINIMAX-VALUE(s) over successors s     if n is a MAX node
  min of MINIMAX-VALUE(s) over successors s     if n is a MIN node

Two-Ply Game Tree

Minimax maximizes the worst-case outcome for max.

Production System


Production systems are applied to problem-solving programs that must perform a wide range of searches. Production systems are symbolic AI systems. The difference between these two terms is only one of semantics: a symbolic AI system may not be restricted to the very definition of production systems, but it cannot be much different either.

Production systems are composed of three parts: a global database, production rules, and a control structure.

A production system (or production rule system) is a computer program typically used to provide some form of artificial intelligence, which consists primarily of a set of rules about behavior. These rules, termed productions, are a basic representation found useful in automated planning, expert systems and action selection. A production system provides the mechanism necessary to execute productions in order to achieve some goal for the system.

Productions consist of two parts: a sensory precondition (or "IF" statement) and an action (or "THEN"). If a production's precondition matches the current state of the world, then the production is said to be triggered. If a production's action is executed, it is said to have fired.

The first production systems were built by Newell and Simon in the 1950s, and the idea was written up in their 1972 book Human Problem Solving.

"Production" in the title of these notes (or "production rule") is a synonym for "rule", i.e. for a condition-action rule (see below). The term seems to have originated with the term used for rewriting rules in the Chomsky hierarchy of grammar types, where for example context-free grammar rules are sometimes referred to as context-free productions.

Rules

These are also called condition-action rules. These components of a rule-based system have the form:

if <condition> then <conclusion>

or

if <condition> then <action>

Example:
if patient has high levels of the enzyme ferritin in their blood
   and patient has the Cys282→Tyr mutation in the HFE gene
then conclude patient has haemochromatosis*

* medical validity of this rule is not asserted here

Rules can be evaluated by:

  backward chaining
  forward chaining


Backward Chaining

To determine if a decision should be made, work backwards looking for justifications for the decision.

Eventually, a decision must be justified by facts.

Forward Chaining

Given some facts, work forward through inference net. Discovers what conclusions can be derived from data.

Forward Chaining 2

Until a problem is solved or no rule's 'if' part is satisfied by the current situation:

1. Collect rules whose 'if' parts are satisfied.
2. If more than one rule's 'if' part is satisfied, use a conflict resolution strategy to eliminate all but one.
3. Do what the rule's 'then' part says to do.

Production Rules

A production rule system consists of


  a set of rules
  working memory that stores temporary data
  a forward-chaining inference engine

Match-Resolve-Act Cycle

The match-resolve-act cycle is what the inference engine does.

loop

  match conditions of rules with contents of working memory
  if no rule matches then stop
  resolve conflicts
  act (i.e. perform the conclusion part of the rule)

end loop
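A minimal Python sketch of this match-resolve-act cycle, under illustrative assumptions: rules are (condition set, conclusion) pairs, working memory is a set of facts, and conflict resolution is simply "first applicable rule wins". The rule contents are hypothetical examples, not from the notes.

# Sketch: a tiny forward-chaining production system.
# Working memory is a set of facts; each rule is (conditions, conclusion).
# The match-resolve-act cycle fires one rule per iteration until no rule
# can add anything new (conflict resolution: first applicable rule wins).

RULES = [
    ({"has_fever", "has_rash"}, "suspect_measles"),   # illustrative rules only
    ({"suspect_measles"}, "refer_to_doctor"),
]

def forward_chain(working_memory):
    working_memory = set(working_memory)
    while True:
        # match: rules whose conditions hold and whose conclusion is new
        applicable = [(cond, concl) for cond, concl in RULES
                      if cond <= working_memory and concl not in working_memory]
        if not applicable:              # no rule matches -> stop
            return working_memory
        _, conclusion = applicable[0]   # resolve: pick the first rule
        working_memory.add(conclusion)  # act: assert its conclusion

print(forward_chain({"has_fever", "has_rash"}))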

Chapter 3

3.1. Uninformed Search

3.1.1 Breadth-first search (BFS)

Description

A simple strategy in which the root is expanded first, then all the root's successors are expanded next, then their successors.

We visit the search tree level by level, so that all nodes at a given depth are expanded before any nodes at the next level are expanded.

Order in which nodes are expanded.

Performance Measure:

Completeness:

It is easy to see that breadth-first search is complete: it visits all levels, so given that the depth d of the shallowest solution is finite, it will find a solution at some depth d.

Optimality:

Breadth-first search is not optimal unless all actions have the same cost.

Space complexity and Time complexity:

Consider a state space where each node has a branching factor b: the root of the tree generates b nodes, each of which generates b more nodes, yielding b² nodes; each of these generates b³, and so on.

In the worst case, suppose that our solution is at depth d and we expand all nodes but the last node at level d. Then the total number of generated nodes is b + b² + b³ + ... + b^d + (b^(d+1) − b) = O(b^(d+1)), which is the time complexity of BFS.


As all the nodes must be retained in memory while we expand the search, the space complexity is like the time complexity plus the root node: O(b^(d+1)).

Conclusion:

We see that space complexity is an even bigger problem for BFS than its exponential execution time.

Time complexity is still a major problem; to convince yourself, look at the table below.
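As a concrete companion to the description above, here is a compact Python sketch of breadth-first search over an explicit state-space graph given as an adjacency dictionary; the small example graph is an illustrative assumption, not from the notes.

# Sketch: breadth-first search over an explicit state-space graph.
# Nodes are expanded level by level using a FIFO queue, so the first
# goal found lies at the shallowest possible depth.

from collections import deque

def bfs(graph, start, goal):
    frontier = deque([[start]])          # queue of paths, shallowest first
    explored = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for child in graph.get(node, []):
            if child not in explored:
                explored.add(child)
                frontier.append(path + [child])
    return None

# Illustrative graph (not from the notes).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["G"], "E": ["G"]}
print(bfs(graph, "A", "G"))   # ['A', 'B', 'D', 'G']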

3.1.2. Depth-first search (DFS)

Description:

DFS progresses by expanding the first child node of the search tree that appears, thus going deeper and deeper until a goal node is found or until it hits a node that has no children. Then the search backtracks, returning to the most recent node it hasn't finished exploring.

Order in which nodes are expanded

 

Performance Measure:

Completeness:  

DFS is not complete. To convince yourself, consider a search that starts by expanding the left subtree of the root down a very long (possibly infinite) path, when a different choice near the root could lead to a solution. If the left subtree of the root has no solution and is unbounded, the search will continue going deeper forever; in this case we say that DFS is not complete.

Optimality:

Consider the scenario in which there is more than one goal node and the search decides to first expand the left subtree of the root, where there is a solution at a very deep level, while at the same time the right subtree of the root has a solution near the root. Here lies the non-optimality of DFS: it is not guaranteed that the first goal found is the optimal one, so we conclude that DFS is not optimal.

Time Complexity:

Consider a state space identical to that of BFS, with branching factor b, and start the search from the root.


In the worst case the goal is the last node generated, so DFS generates all of the tree's nodes, which is O(b^m), where m is the maximum depth of the tree.

Space Complexity:

Unlike BFS, DFS has very modest memory requirements: it needs to store only the path from the root to the leaf node, plus the siblings of each node on the path (remember that BFS needs to store all the explored nodes in memory).

DFS removes a node from memory once all of its descendants have been expanded.

With branching factor b and maximum depth m, DFS requires storage of only bm + 1 nodes, which is O(bm), compared to the O(b^(d+1)) of BFS.

Conclusion:

DFS may suffer from non-termination when the length of a path in the search tree is infinite, so we perform DFS only to a limited depth; this is called Depth-limited Search.

3.1.3 Depth Limited Search

• Breadth-first search has computational and, especially, space problems. Depth-first search can run off down a very long (or infinite) path.
• Idea: introduce a depth limit on branches to be expanded.
• Don't expand a branch below this depth.
• Most useful if you know the maximum depth of the solution.

Perform depth-first search but only to a pre-specified depth limit L. No node on a path that is more than L steps from the initial state is placed on the frontier.

We "truncate" the search by looking only at paths of length L or less.

Description:

The unbounded-tree problem that appears in DFS can be fixed by imposing a limit on the depth that DFS can reach; this limit is called the depth limit l. This solves the infinite-path problem.

Performance Measure:

Completeness:

The depth limit introduces another problem: when we choose l < d, DLS will never reach a goal, and in this case DLS is not complete.

Optimality:

One can view DFS as a special case of DLS: DFS is DLS with l = infinity.

DLS is not optimal even if l > d.

Time Complexity: O(b^l)

Space Complexity: O(bl)

Conclusion:

DLS can be used when there is prior knowledge about the problem, which is not always the case; typically we will not know the depth of the shallowest goal of a problem unless we have solved that problem before.

It is depth-first search with depth limit l, i.e. nodes at depth l have no successors. Problem knowledge can be used.

Solves the infinite-path problem. If l < d then incompleteness results. If l > d then it is not optimal. Time complexity: O(b^l). Space complexity: O(bl).

Advantages


  Will always terminate
  Will find a solution if there is one within the depth bound

Disadvantages
• Too small a depth bound misses solutions
• Too large a depth bound may find poor solutions when there are better ones
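A recursive Python sketch of depth-limited search on the same kind of adjacency-dictionary graph used earlier; the distinction between plain failure and hitting the cutoff is an assumption carried over from the standard formulation, and the example graph is illustrative.

# Sketch: recursive depth-limited search (DFS with a depth limit l).
# Returns a path to the goal, None on failure, or the string "cutoff"
# when the limit was reached somewhere, so a caller could deepen l.

def depth_limited_search(graph, node, goal, limit, path=None):
    path = [node] if path is None else path + [node]
    if node == goal:
        return path
    if limit == 0:
        return "cutoff"
    cutoff_occurred = False
    for child in graph.get(node, []):
        if child in path:                 # avoid trivial loops along this path
            continue
        result = depth_limited_search(graph, child, goal, limit - 1, path)
        if result == "cutoff":
            cutoff_occurred = True
        elif result is not None:
            return result
    return "cutoff" if cutoff_occurred else None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["G"], "E": ["G"]}
print(depth_limited_search(graph, "A", "G", limit=2))  # "cutoff": goal is at depth 3
print(depth_limited_search(graph, "A", "G", limit=3))  # ['A', 'B', 'D', 'G']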

3.1.4. Search Strategies’ Comparison:

Here is a table that compares the performance measures of each search strategy.

3.2. Informed Search
- More powerful than uninformed search
- Informed = uses problem-specific knowledge

3.2.1. Hill Climbing
  Here feedback from the test procedure is used to help the generator decide which direction to move in the search space.
  The test function is augmented with a heuristic function that provides an estimate of how close a given state is to the goal state.
  Computation of the heuristic function can be done with a negligible amount of computation.
  Greedy local search.
  Hill climbing is often used when a good heuristic function is available for evaluating states but no other useful knowledge is available.
  A loop that continuously moves in the direction of increasing value; it terminates when it reaches a "peak".


Figure 5.9 Local maxima, plateaus and ridge situations for hill climbing (panels (a), (b), (c))

Problem: depending on initial state, can get stuck in local maxima

This simple policy has three well-known drawbacks:

1. Local Maxima: a local maximum as opposed to global maximum.

2. Plateaus: An area of the search space where evaluation function is flat, thus requiring random walk.

3. Ridge: where there are steep slopes and the search direction is not towards the top but towards the side.

Variations of Hill Climbing

  Stochastic hill-climbing
    o Random selection among the uphill moves.
    o The selection probability can vary with the steepness of the uphill move.
  First-choice hill-climbing
    o cf. stochastic hill climbing: generates successors randomly until a better one is found.
  Random-restart hill-climbing
    o Tries to avoid getting stuck in local maxima.
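A minimal Python sketch of basic (steepest-ascent) hill climbing on a one-dimensional toy landscape. The objective function, step size and starting point are illustrative assumptions; the run deliberately stops at a local maximum, exactly the failure mode described above.

# Sketch: steepest-ascent hill climbing on a toy 1-D landscape.
# The loop keeps moving to the best neighbour with a higher value and
# stops at a peak, which may only be a local maximum.

def hill_climb(value, neighbours, start):
    current = start
    while True:
        best = max(neighbours(current), key=value, default=current)
        if value(best) <= value(current):   # no uphill move left: a peak
            return current
        current = best

# Toy objective with a local maximum near x = -2 and a global one near x = 3.
value = lambda x: -((x - 3) ** 2) * ((x + 2) ** 2) + x   # illustrative only
neighbours = lambda x: [x - 0.1, x + 0.1]

print(hill_climb(value, neighbours, start=0.0))
# stops near x = -2.0, a local (not the global) maximum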

3.2.2. Best First Search
General approach of informed search:
  o Best-first search: a node is selected for expansion based on an evaluation function f(n)

Idea: the evaluation function measures distance to the goal.
  o Choose the node which appears best.

Implementation:
  o The fringe is a queue sorted in decreasing order of desirability.
  o Special cases: greedy search, A* search

Best First Search is a general search strategy. It uses an evaluation function f(n) to decide which node (in the queue) to expand next. Note: "best" could be misleading (it is relative, not absolute). Greedy search is one type of Best First Search.

3.2.2.1. Greedy Search

Use a heuristic h() (cost estimate to goal) as the evaluation function Example: straight-line distance in finding a path from one city to another


Evaluation function f(n) = h(n) (heuristic) = estimate of cost from n to the goal, e.g. hSLD(n) = straight-line distance from n to Bucharest.

Greedy best-first search expands the node that appears to be closest to the goal.
Complete? No: it can get stuck in loops, e.g. Iasi -> Neamt -> Iasi -> Neamt -> ...

Time? O(b^m), but a good heuristic can give dramatic improvement.
Space? O(b^m): keeps all nodes in memory.

Optimal? No, but it can be acceptable in practice.

3.2.2.2. A* Search
  Best-known form of best-first search.
  Idea: avoid expanding paths that are already expensive.
  Evaluation function f(n) = g(n) + h(n)

o g(n) the cost (so far) to reach the node.

o h(n) estimated cost to get from the node to the goal.

o f(n) estimated total cost of path through n to goal.

A* search uses an admissible heuristic:
  o A heuristic is admissible if it never overestimates the cost to reach the goal, i.e. it is optimistic.

Formally:
  1. h(n) <= h*(n), where h*(n) is the true cost from n
  2. h(n) >= 0, so h(G) = 0 for any goal G
e.g. hSLD(n) never overestimates the actual road distance.

Example: find Bucharest starting at Arad.
f(Arad) = c(??, Arad) + h(Arad) = 0 + 366 = 366
Initial state:


Expand Arad and determine f(n) for each node:
  f(Sibiu) = c(Arad, Sibiu) + h(Sibiu) = 140 + 253 = 393
  f(Timisoara) = c(Arad, Timisoara) + h(Timisoara) = 118 + 329 = 447
  f(Zerind) = c(Arad, Zerind) + h(Zerind) = 75 + 374 = 449
Best choice is Sibiu.

And so on…

Admissible Heuristic A heuristic h(n) is admissible if for every node n,

h(n) ≤ h*(n), where h*(n) is the true cost to reach the goal state from n. An admissible heuristic never overestimates the cost to reach the goal, i.e., it is optimistic Example: hSLD(n) (never overestimates the actual road distance) Theorem: If h(n) is admissible, A* using TREE-SEARCH is optimal

A* Search Evaluation
  Completeness: YES
  Time complexity: exponential with path length
  Space complexity: all nodes are stored
  Optimality: YES
    Cannot expand the f_{i+1} contour until the f_i contour is finished.
    A* expands all nodes with f(n) < C*
    A* expands some nodes with f(n) = C*
    A* expands no nodes with f(n) > C*


Also optimally efficient (not including ties)
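A compact Python sketch of A* on a weighted graph, assuming step costs are given in an adjacency dictionary and h is an admissible heuristic supplied by the caller. The tiny four-node graph and heuristic values are illustrative, not the Romania map from the example above.

# Sketch: A* search with f(n) = g(n) + h(n) on a weighted graph.
# `graph` maps a node to {neighbour: step cost}; `h` is an admissible
# heuristic (it never overestimates the remaining cost to the goal).

import heapq

def a_star(graph, h, start, goal):
    frontier = [(h[start], 0, start, [start])]   # entries are (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for child, cost in graph.get(node, {}).items():
            g2 = g + cost
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2
                heapq.heappush(frontier, (g2 + h[child], g2, child, path + [child]))
    return None

# Illustrative graph and admissible heuristic (assumed, not from the notes).
graph = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 6}, "B": {"G": 3}}
h = {"S": 5, "A": 4, "B": 2, "G": 0}
print(a_star(graph, h, "S", "G"))   # (['S', 'A', 'B', 'G'], 6)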

3.2.3. Adversarial Search
MINIMAX procedure
  Perfect play for deterministic games.
  Idea: choose the move to the position with the highest minimax value = best achievable payoff against best play.
  E.g., a 2-ply game:

MINIMAX Algorithm

minimax(player, board)
  if (game over in current board position)
    return winner
  children = all legal moves for player from this board
  if (max's turn)
    return maximal score of calling minimax on all the children
  else (min's turn)
    return minimal score of calling minimax on all the children

  Complete? Yes (if the tree is finite)
  Optimal? Yes (against an optimal opponent)
  Time complexity? O(b^m)
  Space complexity? O(bm) (depth-first exploration)
  For chess, b ≈ 35, m ≈ 100 for "reasonable" games: an exact solution is completely infeasible.
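The recursion above can be written directly in Python. Below is a minimal sketch assuming the game is given as an explicit tree: interior nodes are lists of child subtrees and leaves are utility values from MAX's point of view; the example values are the usual two-ply illustration, not a real game.

# Sketch: minimax on an explicit game tree. Interior nodes are lists of
# child subtrees; leaves are numeric utilities from MAX's point of view.
# MAX picks the child with the highest minimax value, MIN the lowest.

def minimax(node, maximizing=True):
    if not isinstance(node, list):            # terminal state: return its utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# A two-ply example: MAX to move, then MIN, then terminal values.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))   # 3: MAX's best guaranteed payoff against optimal MIN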

Alpha Beta Pruning

ALPHA-BETA pruning is a method that reduces the number of nodes explored in Minimax strategy.

It reduces the time required for the search: the search must be restricted so that no time is wasted exploring moves that are obviously bad for the current player.

The exact implementation of alpha-beta keeps track of the best move for each side as it moves throughout the tree.


Properties of α-β
  Pruning does not affect the final result.
  Good move ordering improves the effectiveness of pruning.


With "perfect ordering," time complexity = O(b^(m/2)), which doubles the depth of search that can be handled.

Why is it called alpha-beta?
  A simple example of the value of reasoning about which computations are relevant.
  α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for MAX.
  If v is worse than α, MAX will avoid it: prune that branch.
  Define β similarly for MIN.
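The same explicit game tree used for the minimax sketch, searched with alpha-beta pruning; this is an illustrative Python sketch showing where branches whose value cannot affect the decision are cut off.

# Sketch: minimax with alpha-beta pruning on an explicit game tree.
# alpha = best value found so far for MAX along the current path,
# beta  = best value found so far for MIN along the current path.
# A branch is pruned as soon as its value can no longer matter.

def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    if not isinstance(node, list):
        return node                              # leaf: utility value
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:                    # MIN will never allow this branch
                break
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if beta <= alpha:                        # MAX already has something better
            break
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))   # 3, same result as minimax, with some leaves never visited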

Chapter 4

4.1.1. Logic
Logics are formal languages for formalizing reasoning, in particular for representing information so that conclusions can be drawn. Logic involves:

– A language with a syntax for specifying what is a legal expression in the language; the syntax defines the well-formed sentences of the language.

– Semantics for associating elements of the language with elements of some subject matter. Semantics defines the "meaning" of sentences (the link to the world); i.e., semantics defines the truth of a sentence with respect to each possible world.

– Inference rules for manipulating sentences in the language.

4.1.2. Syntax (grammar, internal structure of the language)
– Vocabulary: grammatical categories
– Identifying well-formed formulae ("WFFs")

4.1.3. Semantics (pertaining to meaning and truth value)
– Translation
– Truth functions
– Truth tables for the connectives

4.1.4. Connectives ("Sentence-Forming Operators")
  ~  negation: "not," "it is not the case that"
  ⋅  conjunction: "and"
  ∨  disjunction: "or" (inclusive)
  ⊃  conditional: "if ... then," "implies"
  ≡  biconditional: "if and only if," "iff"

• Connectives join sentences to make new sentences.
• Negation attaches to one sentence:
  – It is not raining: ~R


• Conjunction, disjunction, conditional and biconditional attach two sentences together:
  – It is raining and it is cold: R ⋅ C
  – If it rains then it pours: R ⊃ P

4.1.5. Well-Formed Formulae

Rules for WFFs
1. A sentence letter by itself is a WFF: A, B, Z
2. The result of putting ~ immediately in front of a WFF is a WFF: ~A, ~B, ~~B, ~(A ∨ B), ~(~C ⋅ D)
3. The result of putting ⋅, ∨, ⊃, or ≡ between two WFFs and surrounding the whole thing with parentheses is a WFF: (A ⊃ B), (~~C ∨ D), ((~~C ∨ D) ⊃ (E ⋅ (F ≡ ~G)))
4. Outside parentheses may be dropped: A ⊃ B, ~~C ∨ D, (~~C ∨ D) ⊃ (E ⋅ (F ≡ ~G))

A sentence that can be constructed by applying the rules for constructing WFFs one at a time is a WFF; a sentence which can't be so constructed is not a WFF.

– Atomic sentences are wffs: propositional symbols (atoms). Examples: P, Q, R, BlockIsRed, SeasonIsWinter
– Complex or compound wffs. Given wffs w1 and w2:
  ¬w1 (negation)
  (w1 ∧ w2) (conjunction)
  (w1 ∨ w2) (disjunction)
  (w1 → w2) (implication; w1 is the antecedent, w2 is the consequent)
  (w1 ↔ w2) (biconditional)

4.1.6. Tautology
If a wff is True under all interpretations of its constituent atoms, we say that the wff is valid or a tautology.
Examples:
  1. P → P
  2. ¬(P ∧ ¬P)
  3. P → (Q → P)
  4. ((P → Q) → P) → P
An inconsistent sentence or contradiction is a sentence that is False under all interpretations. The world is never like what it describes, as in "It's raining and it's not raining."

4.1.7. Validity
An argument is valid whenever the truth of all its premises implies the truth of its conclusion.

An argument is a sequence of propositions. The final proposition is called the conclusion of the argument, while the other propositions are called the premises or hypotheses of the argument. One can use the rules of inference to show the validity of an argument. Note that p1, p2, ..., q are generally compound propositions or wffs.

4.2. Knowledge-Based Agents

Intelligent agents should have the capacity for:
  Perceiving: acquiring information from the environment,


  Knowledge representation: representing its understanding of the world,
  Reasoning: inferring the implications of what it knows and of the choices it has, and
  Acting: choosing what it wants to do and carrying it out.

4.2.1. Knowledge Base
  Representation of knowledge and the reasoning processes that bring knowledge to life are central to the entire field of AI.
  Knowledge and reasoning also play a crucial role in dealing with partially observable environments.
  The central component of a knowledge-based agent is its knowledge base.
  Knowledge base = set of sentences in a formal language.
  Declarative approach to building an agent (or other system):
    TELL it what it needs to know.
    Then it can ASK itself what to do; answers should follow from the KB.

4.2.2. Entailment
  Entailment means that one thing follows from another: KB ╞ α
  Knowledge base KB entails sentence α if and only if α is true in all worlds where KB is true.
    o E.g., the KB containing "the Giants won" and "the Reds won" entails "Either the Giants won or the Reds won."
    o E.g., x + y = 4 entails 4 = x + y.
    o Entailment is a relationship between sentences (i.e., syntax) that is based on semantics.

Inference
  Notation: KB ├i α means sentence α can be derived from KB by procedure i.
  Soundness: i is sound if whenever KB ├i α, it is also true that KB ╞ α.
  Completeness: i is complete if whenever KB ╞ α, it is also true that KB ├i α.

Sound Rules of Inference

Here are some examples of sound rules of inference. A rule is sound if its conclusion is true whenever the premise is true. Each can be shown to be sound using a truth table.

RULE               PREMISE           CONCLUSION
Modus Ponens       A, A → B          B
And Introduction   A, B              A ∧ B
And Elimination    A ∧ B             A
Double Negation    ¬¬A               A
Unit Resolution    A ∨ B, ¬B         A
Resolution         A ∨ B, ¬B ∨ C     A ∨ C

Soundness of Modus Ponens


A      B      A → B    OK?
True   True   True     ✓
True   False  False    ✓
False  True   True     ✓
False  False  True     ✓

In the only row where both premises A and A → B are true (the first row), the conclusion B is also true, so Modus Ponens is sound; equivalently, (A ∧ (A → B)) → B is a tautology.

Horn Clause

A Horn sentence or Horn clause has the form:
 P1 ∧ P2 ∧ P3 … ∧ Pn → Q
or alternatively
 ¬P1 ∨ ¬P2 ∨ ¬P3 … ∨ ¬Pn ∨ Q
where the Ps and Q are non-negated atoms.

• To get a proof for Horn sentences, apply Modus Ponens repeatedly until nothing more can be done (see the sketch below).
• We will use the Horn clause form later.
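Because each Horn clause has at most one positive literal, repeated application of Modus Ponens (forward chaining) is enough to derive everything a Horn knowledge base entails. The following is an illustrative Python sketch under that assumption; the rule and fact names are invented.

def forward_chain(facts, rules):
    """Apply Modus Ponens repeatedly over Horn rules (body -> head)
    until no new atoms can be derived; returns the set of derived atoms."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(p in derived for p in body):
                derived.add(head)      # Modus Ponens: body is true, so head is true
                changed = True
    return derived

# Hypothetical example: P ∧ Q → R and R → S, with facts P and Q
rules = [({"P", "Q"}, "R"), ({"R"}, "S")]
print(forward_chain({"P", "Q"}, rules))   # {'P', 'Q', 'R', 'S'}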

4.2.3. Propositional Logic

Propositional Logic Syntax

 Propositional logic is the simplest logic; it illustrates the basic ideas.
 All objects described are fixed or unique. E.g. "John is a student": student(john); here John refers to one unique person.
 In propositional logic (PL) a user defines a set of propositional symbols, like P and Q, and defines the semantics of each of these symbols. For example:
  P means "It is hot"
  Q means "It is humid"
  R means "It is raining"
 The proposition symbols S, S1, S2 etc. are sentences.
  If S is a sentence, ¬S is a sentence (negation).
  If S1 and S2 are sentences, S1 ∧ S2 is a sentence (conjunction).
  If S1 and S2 are sentences, S1 ∨ S2 is a sentence (disjunction).
  If S1 and S2 are sentences, S1 ⇒ S2 is a sentence (implication).
  If S1 and S2 are sentences, S1 ⇔ S2 is a sentence (biconditional).

Propositional Logic Semantics

 Each model specifies true/false for each proposition symbol. With the three symbols above there are 8 possible models, which can be enumerated automatically.
 Rules for evaluating truth with respect to a model m:
  ¬S is true iff S is false
  S1 ∧ S2 is true iff S1 is true and S2 is true
  S1 ∨ S2 is true iff S1 is true or S2 is true
  S1 ⇒ S2 is true iff S1 is false or S2 is true, i.e., it is false iff S1 is true and S2 is false


  S1 ⇔ S2 is true iff S1 ⇒ S2 is true and S2 ⇒ S1 is true
 A simple recursive process evaluates an arbitrary sentence, e.g.,
  ¬P1,2 ∧ (P2,2 ∨ P3,1) = true ∧ (true ∨ false) = true ∧ true = true
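The recursive evaluation just described can be written directly as a short function. This is an illustrative Python sketch, not from the notes; sentences are nested tuples and a model is a dictionary of truth values.

def holds(sentence, model):
    """Recursively evaluate a propositional sentence in a model.
    Sentences are atoms (strings) or tuples ('not', s), ('and', s1, s2),
    ('or', s1, s2), ('implies', s1, s2), ('iff', s1, s2)."""
    if isinstance(sentence, str):
        return model[sentence]
    op, *args = sentence
    if op == "not":
        return not holds(args[0], model)
    if op == "and":
        return holds(args[0], model) and holds(args[1], model)
    if op == "or":
        return holds(args[0], model) or holds(args[1], model)
    if op == "implies":
        return (not holds(args[0], model)) or holds(args[1], model)
    if op == "iff":
        return holds(args[0], model) == holds(args[1], model)
    raise ValueError(f"unknown connective {op}")

# The example above: ¬P1,2 ∧ (P2,2 ∨ P3,1) with P1,2 false, P2,2 true, P3,1 false
model = {"P12": False, "P22": True, "P31": False}
print(holds(("and", ("not", "P12"), ("or", "P22", "P31")), model))   # True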

Truth Table for Connectives

Validity and satisfiability

A sentence is valid if it is true in all models,
 e.g., True, A ∨ ¬A, A ⇒ A, (A ∧ (A ⇒ B)) ⇒ B
Validity is connected to inference via the Deduction Theorem:
 KB ╞ α if and only if (KB ⇒ α) is valid
A sentence is satisfiable if it is true in some model,
 e.g., A ∨ B, C
A sentence is unsatisfiable if it is true in no models,
 e.g., A ∧ ¬A
Satisfiability is connected to inference via the following:
 KB ╞ α if and only if (KB ∧ ¬α) is unsatisfiable

Logical Equivalence

 Two sentences are logically equivalent iff they are true in the same models: α ≡ β iff α ╞ β and β ╞ α

Resolution

 Conjunctive Normal Form (CNF): a conjunction of disjunctions of literals (clauses),
  e.g., (A ∨ ¬B) ∧ (B ∨ ¬C ∨ ¬D)
 Resolution is sound and complete for propositional logic.

Conversion to CNF

Example: B1,1 ⇔ (P1,2 ∨ P2,1)

1. Eliminate ⇔, replacing α ⇔ β with (α ⇒ β) ∧ (β ⇒ α):
 (B1,1 ⇒ (P1,2 ∨ P2,1)) ∧ ((P1,2 ∨ P2,1) ⇒ B1,1)


2. Eliminate ⇒, replacing α ⇒ β with ¬α ∨ β:
 (¬B1,1 ∨ P1,2 ∨ P2,1) ∧ (¬(P1,2 ∨ P2,1) ∨ B1,1)

3. Move ¬ inwards using de Morgan's rules and double-negation:
 (¬B1,1 ∨ P1,2 ∨ P2,1) ∧ ((¬P1,2 ∧ ¬P2,1) ∨ B1,1)

4. Apply the distributivity law (∨ over ∧) and flatten:
 (¬B1,1 ∨ P1,2 ∨ P2,1) ∧ (¬P1,2 ∨ B1,1) ∧ (¬P2,1 ∨ B1,1)
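The four conversion steps can be mechanised. The sketch below is illustrative Python, not the notes' own code; it converts a sentence given as nested tuples (as in the evaluation sketch above) into CNF and reproduces the worked example. Flattening of nested disjunctions is omitted for brevity.

def to_cnf(s):
    """Convert a propositional sentence (nested tuples) to CNF."""
    return distribute(push_not(eliminate(s)))

def eliminate(s):
    # Steps 1 and 2: rewrite iff and implies in terms of and, or, not.
    if isinstance(s, str):
        return s
    op, *a = s
    a = [eliminate(x) for x in a]
    if op == "iff":
        p, q = a
        return ("and", ("or", ("not", p), q), ("or", ("not", q), p))
    if op == "implies":
        p, q = a
        return ("or", ("not", p), q)
    return (op, *a)

def push_not(s, negate=False):
    # Step 3: move negations inwards with De Morgan's laws, remove double negations.
    if isinstance(s, str):
        return ("not", s) if negate else s
    op, *a = s
    if op == "not":
        return push_not(a[0], not negate)
    if op == "and":
        return ("or" if negate else "and", *(push_not(x, negate) for x in a))
    if op == "or":
        return ("and" if negate else "or", *(push_not(x, negate) for x in a))
    raise ValueError(op)

def distribute(s):
    # Step 4: distribute or over and so the result is a conjunction of clauses.
    if isinstance(s, str) or s[0] == "not":
        return s
    op, p, q = s[0], distribute(s[1]), distribute(s[2])
    if op == "and":
        return ("and", p, q)
    if not isinstance(p, str) and p[0] == "and":
        return ("and", distribute(("or", p[1], q)), distribute(("or", p[2], q)))
    if not isinstance(q, str) and q[0] == "and":
        return ("and", distribute(("or", p, q[1])), distribute(("or", p, q[2])))
    return ("or", p, q)

# The example from the notes: B11 ⇔ (P12 ∨ P21)
print(to_cnf(("iff", "B11", ("or", "P12", "P21"))))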

Resolution Algorithm

 Proof by contradiction, i.e., show that KB ∧ ¬α is unsatisfiable.

Propositional Resolution
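A minimal sketch of the refutation procedure in Python (illustrative, not from the notes): clauses are sets of literals, a literal is a string optionally prefixed with '-', and KB entails α exactly when the empty clause can be derived from KB ∧ ¬α.

from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def resolve(c1, c2):
    """Return all resolvents of two clauses (frozensets of literals)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents

def resolution_entails(kb_clauses, negated_goal_clauses):
    """Proof by contradiction: saturate KB plus the negated goal under resolution;
    entailment holds iff the empty clause is produced."""
    clauses = set(map(frozenset, kb_clauses)) | set(map(frozenset, negated_goal_clauses))
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:
                    return True          # empty clause: contradiction found
                new.add(frozenset(r))
        if new <= clauses:
            return False                 # nothing new can be derived: no contradiction
        clauses |= new

# Hypothetical example: KB = {P ⇒ Q, P}; does KB entail Q?
print(resolution_entails([{"-P", "Q"}, {"P"}], [{"-Q"}]))   # True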

Advantages of propositional logic:
· Simple.
· No decidability problems.

Limitations of Propositional Calculus

 An argument may not be provable using propositional logic, but may be provable using predicate logic. E.g.:
  All horses are animals.
  Therefore, the head of a horse is the head of an animal.


We know that this argument is correct and yet it cannot be proved under propositional logic, but it can be proved under predicate logic.

Limited representational power. Simple statements may require large and awkward representations.

4.2.4. First Order Predicate Logic (FOPL)

Predicate Logic (FOPL) provides:
i) A language to express assertions (axioms) about certain "worlds".
ii) An inference system or deductive apparatus whereby we may draw conclusions from such assertions, and
iii) A semantics based on set theory.

The language of FOPL consists of:
i) A set of constant symbols (to name particular individuals such as table, a, b, c, d, e etc. - these depend on the application)
ii) A set of variables (to refer to arbitrary individuals)
iii) A set of predicate symbols (to represent relations such as On, Above etc. - these depend on the application)
iv) A set of function symbols (to represent functions - these depend on the application)
v) The logical connectives ∧, ∨, →, ↔, ¬ (to capture and, or, implies, iff and not)
vi) The Universal Quantifier ∀ and the Existential Quantifier ∃ (to capture "all", "every", "some", "few", "there exists" etc.)
vii) Normally a special binary relation of equality (=) is considered (at least in mathematics) as part of the language.

Quantification

Universal Quantification

∀<variables> <sentence>
Everyone at KEC is smart:
 ∀x At(x,KEC) ⇒ Smart(x)
∀x P is true in a model m iff P is true with x being each possible object in the model.
Roughly speaking, this is equivalent to the conjunction of instantiations of P:
 At(KingJohn,KEC) ⇒ Smart(KingJohn)
 At(Richard,KEC) ⇒ Smart(Richard)
Common mistake to avoid: typically, ⇒ is the main connective with ∀. A common mistake is using ∧ as the main connective with ∀:
 ∀x At(x,KEC) ∧ Smart(x) means "Everyone is at KEC and everyone is smart".

Existential Quantification

∃<variables> <sentence>
Someone at KEC is smart:
 ∃x At(x,KEC) ∧ Smart(x)
∃x P is true in a model m iff P is true with x being some possible object in the model.
Typically, ∧ is the main connective with ∃.
Common mistake: using ⇒ as the main connective with ∃:
 ∃x At(x,KEC) ⇒ Smart(x) is true if there is anyone who is not at KEC.

Properties of Quantifiers

 ∀x ∀y is the same as ∀y ∀x
 ∃x ∃y is the same as ∃y ∃x
 ∃x ∀y is not the same as ∀y ∃x
  ∃x ∀y Loves(x,y): "There is a person who loves everyone in the world"
  ∀y ∃x Loves(x,y): "Everyone in the world is loved by at least one person"
 Quantifier duality: each can be expressed using the other
  ∀x Likes(x,IceCream) ≡ ¬∃x ¬Likes(x,IceCream)
  ∃x Likes(x,Broccoli) ≡ ¬∀x ¬Likes(x,Broccoli)

Example 1

Suppose we wish to represent in FOPL the following sentences:
a) "Everyone loves Janet"
b) "Not everyone loves Daphne"
c) "Everyone is loved by their mother"

Introducing constant symbols j and d to represent Janet and Daphne respectively, a binary predicate symbol L to represent loves, and the unary function symbol m to represent the mother of the person given as argument, the above sentences may be represented in FOPL by:
a) ∀x.L(x,j)
b) ∃x.¬L(x,d)
c) ∀x.L(m(x),x)

Example 2

We will express the following in first order predicate calculus:
 "sam is kind"
 "Every kind person has someone who loves them"
 "sam loves someone"
The non-logical symbols of our language are the constant sam, the unary predicate (or property) Kind, and the binary predicate Loves. We may represent the above sentences as:
1. Kind(sam)
2. ∀x.(Kind(x) → ∃y.Loves(y,x))
3. ∃y.Loves(sam,y)

Some Semantic Issues

An interpretation (of the language of FOPL) consists of:
a) a non-empty set of objects (the Universe of Discourse, D) containing designated individuals named by the constant symbols,
b) for each function symbol in the language of FOPL, a corresponding function over D,
c) for each predicate symbol in the language of FOPL, a corresponding relation over D.

An interpretation is said to be a model for a set of sentences Γ if each sentence of Γ is true under the given interpretation.


The interpretation of a formula F in first order predicate logic consists of fixing a non-empty domain of values D and an association of values for every constant, function and predicate in the formula F, as follows:

(1) Every constant has an associated value in D.
(2) Every function f of arity n is defined by a correspondence f: Dⁿ → D, where Dⁿ = {(x1, …, xn) | x1 ∈ D, …, xn ∈ D}.
(3) Every predicate P of arity n is defined by a correspondence P: Dⁿ → {True, False}.

Interpretation Example

Fig. An interpretation example (worked table not reproduced): the wff (∀x)(((A(a,x) ∨ B(f(x))) ∧ C(x)) → D(x)) is evaluated over the domain D = {1, 2}, with the constant a assigned the value 2, the function f given by f(1) = 2 and f(2) = 1, and truth values assigned to A(2,1), A(2,2), B(1), B(2), C(1), C(2), D(1) and D(2); the formula is then checked for x = 1 and x = 2.

Using FOL

 Brothers are siblings:
  ∀x,y Brother(x,y) ⇔ Sibling(x,y)
 One's mother is one's female parent:
  ∀m,c Mother(c) = m ⇔ (Female(m) ∧ Parent(m,c))
 "Sibling" is symmetric:
  ∀x,y Sibling(x,y) ⇔ Sibling(y,x)
 Marcus was a man:
  Man(Marcus)
 Marcus was a Pompeian:
  Pompeian(Marcus)
 All Pompeians were Romans:
  ∀x: Pompeian(x) → Roman(x)
 All Romans were either loyal to Caesar or hated him:
  ∀x: Roman(x) → loyalto(x,Caesar) ∨ hate(x,Caesar)
 Everyone is loyal to someone:
  ∀x ∃y: loyalto(x,y)
 People only try to assassinate rulers they are not loyal to:
  ∀x ∀y: person(x) ∧ ruler(y) ∧ tryassassinate(x,y) → ¬loyalto(x,y)
 All purple mushrooms are poisonous (three equivalent representations):
  ∀x: (Purple(x) ∧ Mushroom(x)) → Poisonous(x)
  ∀x: Purple(x) → (Mushroom(x) → Poisonous(x))
  ∀x: Mushroom(x) → (Purple(x) → Poisonous(x))


4.3. Inference Rules

Complex deductive arguments can be judged valid or invalid based on whether or not the steps in the argument follow the nine basic rules of inference. These rules are all relatively simple, although when presented in formal terms they can look overly complex.

Conjunction
1. P
2. Q
3. Therefore, P and Q.
Example: It is raining in New York. It is raining in Boston. Therefore, it is raining in both New York and Boston.

Simplification
1. P and Q.
2. Therefore, P.
Example: It is raining in both New York and Boston. Therefore, it is raining in New York.

Addition
1. P
2. Therefore, P or Q.
Example: It is raining. Therefore, either it is raining or the sun is shining.

Absorption
1. If P, then Q.
2. Therefore, if P then P and Q.
Example: If it is raining, then I will get wet. Therefore, if it is raining, then it is raining and I will get wet.

Modus Ponens
1. If P then Q.
2. P.
3. Therefore, Q.
Example: If it is raining, then I will get wet. It is raining. Therefore, I will get wet.


Modus Tollens
1. If P then Q.
2. Not Q (~Q).
3. Therefore, not P (~P).
Example: If it had rained this morning, I would have gotten wet. I did not get wet. Therefore, it did not rain this morning.

Hypothetical Syllogism
1. If P then Q.
2. If Q then R.
3. Therefore, if P then R.
Example: If it rains, then I will get wet. If I get wet, then my shirt will be ruined. Therefore, if it rains, then my shirt will be ruined.

Disjunctive Syllogism
1. Either P or Q.
2. Not P (~P).
3. Therefore, Q.
Example: Either it rained or I took a cab to the movies. It did not rain. Therefore, I took a cab to the movies.

Constructive Dilemma
1. (If P then Q) and (If R then S).
2. P or R.
3. Therefore, Q or S.
Example: If it rains, then I will get wet, and if it is sunny, then I will be dry. Either it will rain or it will be sunny. Therefore, either I will get wet or I will be dry.

The above rules of inference, when combined with the rules of replacement, mean that the propositional calculus is "complete": every valid argument expressible in it can be proved. Propositional calculus is simply another name for this formal system of propositional logic.

Unification

Unification, in computer science and logic, is an algorithmic process by which one attempts to solve the satisfiability problem. The goal of unification is to find a substitution which demonstrates that two seemingly different terms are in fact either identical or just equal. Unification is widely used in automated reasoning, logic programming and programming language type system implementation.

Several kinds of unification are commonly studied: that for theories without any equations (the empty theory) is referred to as syntactic unification: one wishes to show that (pairs of) terms are identical. If one has a non-empty equational theory, then one is typically interested in showing the equality of (a pair of) terms; this is referred to as semantic unification. Since substitutions can be ordered into a partial order, unification can be understood as the procedure of finding a join on a lattice.

We also need some way of binding variables to values in a consistent way so that components of sentences can be matched. This is the process of unification.


Binding

A binding list is a set of entries of the form v = e, where v is a variable and e is an expression. Given an expression p and a binding list σ, we write pσ for the instantiation of p using the bindings in σ.

Unifier

Given two expressions p and q, a unifier is a binding list σ such that pσ = qσ.

Most General Unifier

The MGU is a unifier that binds the fewest variables, or binds them to the least specific expressions.

Most General Unifier (MGU) Algorithm for expressions p and q

1. If either p or q is an object constant or a variable, then:
 i) If p = q, then p and q already unify, and we return the empty binding list { }.
 ii) If either p or q is a variable, then return the binding of that variable to the other expression.
 iii) Otherwise return failure.

2. If neither p nor q is an object constant or a variable, then they must both be compound expressions (suppose they are made up of components p1, …, pn and q1, …, qm) and must be unified one component at a time:
 i) If the types and any function/relation constants are not equal, return failure.
 ii) If n ≠ m, return failure.
 iii) Otherwise n = m; do the following:
  a) Set σ = { }, k = 0.
  b) If k = n, then stop and return σ as the MGU of p and q.
  c) Otherwise, increment k and apply the MGU algorithm recursively to pkσ and qkσ.
   If pkσ and qkσ unify, add the new bindings to σ and return to step (b).
   If pkσ and qkσ fail to unify, then return failure for the unification of p and q.
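The MGU computation can be sketched in a few lines of Python (an illustration only; the term encoding is an assumption and the occurs-check is omitted). Constants are strings, variables are strings beginning with '?', and compound expressions are tuples (functor, arg1, ..., argn).

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def substitute(t, s):
    """Apply binding list s (a dict) to term t."""
    if is_var(t):
        return substitute(s[t], s) if t in s else t
    if isinstance(t, tuple):
        return tuple(substitute(a, s) for a in t)
    return t

def unify(p, q, s=None):
    """Return a most general unifier of p and q (extending s), or None on failure.
    Note: the occurs-check is omitted for brevity."""
    if s is None:
        s = {}
    p, q = substitute(p, s), substitute(q, s)
    if p == q:
        return s
    if is_var(p):
        return {**s, p: q}
    if is_var(q):
        return {**s, q: p}
    if isinstance(p, tuple) and isinstance(q, tuple) and len(p) == len(q):
        for a, b in zip(p, q):          # unify one component at a time
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None                          # mismatched constants or arities

# Hypothetical example: unify loves(?x, mother(?x)) with loves(john, ?y)
print(unify(("loves", "?x", ("mother", "?x")), ("loves", "john", "?y")))
# {'?x': 'john', '?y': ('mother', 'john')}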

Resolution Refutation System

 Resolution is a technique for proving theorems in predicate calculus.
 Resolution is a sound inference rule that, when used to produce a refutation, is also complete.
 In an important practical application, resolution theorem proving, particularly the resolution refutation system, has made the current generation of Prolog interpreters possible.
 The resolution principle describes a way of finding contradictions in a database of clauses with minimum substitution.
 Resolution refutation proves a theorem by negating the statement to be proved and adding the negated goal to the set of axioms that are known or have been assumed to be true.
 It then uses the resolution rule of inference to show that this leads to a contradiction.

Steps in a Resolution Refutation Proof

1. Put the premises or axioms into clause form.


2. Add the negation of what is to be proved, in clause form, to the set of axioms.
3. Resolve these clauses together, producing new clauses that logically follow from them.
4. Produce a contradiction by generating the empty clause.

Discussion of the Steps

 Resolution refutation proofs require that the axioms and the negation of the goal be placed in a normal form called clause form.
 Clause form represents the logical database as a set of disjunctions of literals.
 Resolution is applied to two clauses when one contains a literal and the other its negation.
 The substitutions used to produce the empty clause are those under which the opposite of the negated goal is true.
 If these literals contain variables, they must be unified to make them equivalent.
 A new clause is then produced, consisting of the disjunction of all the literals of the two clauses minus the resolved literal and its negative instance (which are said to have been "resolved away").

Example: We wish to prove that "Fido will die" from the statements that "Fido is a dog", "all dogs are animals" and "all animals will die". Convert these predicates to clause form:

Predicate Form                  Clause Form
∀x: [dog(x) → animal(x)]        ¬dog(x) ∨ animal(x)
dog(fido)                       dog(fido)
∀y: [animal(y) → die(y)]        ¬animal(y) ∨ die(y)

Apply Resolution. Negate the conclusion that Fido will die: ¬die(fido).

1. Resolve ¬dog(x) ∨ animal(x) with ¬animal(y) ∨ die(y) under {y/x}, giving ¬dog(y) ∨ die(y).
2. Resolve ¬dog(y) ∨ die(y) with dog(fido) under {fido/y}, giving die(fido).
3. Resolve die(fido) with the negated goal ¬die(fido), giving the null (empty) clause.

Hence Fido will die.

Q.1. Anyone passing the Artificial Intelligence exam and winning the lottery is happy. But anyone who studies or is lucky can pass all their exams. Ali did not study but he is lucky. Anyone who is lucky wins the lottery. Is Ali happy?

Anyone passing the AI exam and winning the lottery is happy:
 ∀x [pass(x,AI) ∧ win(x,lottery) → happy(x)]
Anyone who studies or is lucky can pass all their exams:
 ∀x ∀y [studies(x) ∨ lucky(x) → pass(x,y)]
Ali did not study but he is lucky:
 ¬study(ali) ∧ lucky(ali)
Anyone who is lucky wins the lottery:
 ∀x [lucky(x) → win(x,lottery)]




Change to clausal form:
1. ¬pass(X,AI) ∨ ¬win(X,lottery) ∨ happy(X)
2. ¬study(Y) ∨ pass(Y,Z)
3. ¬lucky(W) ∨ pass(W,V)
4. ¬study(ali)
5. lucky(ali)
6. ¬lucky(U) ∨ win(U,lottery)
7. Add the negation of the conclusion: ¬happy(ali)

Resolution refutation:
 Resolve 1 and 6 under {U/X}: ¬pass(U,AI) ∨ happy(U) ∨ ¬lucky(U)
 Resolve with 7 under {ali/U}: ¬pass(ali,AI) ∨ ¬lucky(ali)
 Resolve with 5, lucky(ali): ¬pass(ali,AI)
 Resolve with 3 under {ali/W, AI/V}: ¬lucky(ali)
 Resolve with 5, lucky(ali): the empty clause.
Hence Ali is happy.

4.4. Symbolic versus Statistical Reasoning

Symbolic methods basically represent belief in a statement as being True, False, or neither True nor False.

Some symbolic methods also had problems with:
 Incomplete knowledge,
 Contradictions in the knowledge.

Statistical methods provide a way of representing beliefs that are not certain (or uncertain) but for which there may be some supporting (or contradictory) evidence. Statistical methods offer advantages in two broad scenarios:

Genuine Randomness
-- Card games are a good example. We may not be able to predict any outcome with certainty, but we have knowledge about the likelihood of certain outcomes (e.g. being dealt an ace) and we can exploit this.

Exceptions
-- Symbolic methods can represent exceptions. However, if the number of exceptions is large, such systems tend to break down; many common sense and expert reasoning tasks are like this. Statistical techniques can summarise large numbers of exceptions without resorting to enumeration.




Basic Statistical Methods -- Probability

The basic approach statistical methods adopt to deal with uncertainty is via the axioms of probability:
 Probabilities are (real) numbers in the range 0 to 1.
 A probability of P(A) = 0 indicates that A is certainly false, P(A) = 1 that A is certainly true, and values in between indicate some degree of (un)certainty.
 Probabilities can be calculated in a number of ways.

Very simply:
 Probability = (number of desired outcomes) / (total number of outcomes)

So given a pack of playing cards, the probability of being dealt an ace from a full normal deck is 4 (the number of aces) / 52 (the number of cards in the deck), which is 1/13. Similarly, the probability of being dealt a spade is 13/52 = 1/4.

If you are choosing k items from a set of n items, the number of ways of making this choice is given by the combination formula (! = factorial):
 C(n, k) = n! / (k! (n - k)!)

So the chance of winning the national lottery (choosing 6 from 49) is C(49, 6) = 13,983,816 to 1.
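These counting formulas translate directly into code; the fragment below is a quick illustration (not part of the notes) of the card and lottery calculations using Python's math.comb.

from math import comb
from fractions import Fraction

# Probability of being dealt an ace / a spade from a full 52-card deck
print(Fraction(4, 52))    # 1/13
print(Fraction(13, 52))   # 1/4

# Number of ways of choosing k items from n: n! / (k! (n-k)!)
ways = comb(49, 6)        # choosing 6 numbers from 49
print(ways)               # 13983816, so the odds are roughly 14 million to 1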

Conditional probability, P(A|B), indicates the probability of event A given that we know event B has occurred.

Bayes' Theorem

This states:

 P(Hi | E) = P(E | Hi) P(Hi) / Σk P(E | Hk) P(Hk)

o This reads: given some evidence E, the probability that hypothesis Hi is true equals the probability that E will be observed given Hi, times the a priori probability of Hi, divided by the sum, over the set of all hypotheses, of the probability of E given each hypothesis times the probability of that hypothesis.
o The set of all hypotheses must be mutually exclusive and exhaustive.
o Thus, to find P(Hi | E) when we examine medical evidence to diagnose an illness, we must know all the prior probabilities of each symptom and also the probability of having an illness given that certain symptoms are observed.
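As a concrete illustration, the theorem can be applied to a small diagnosis problem with mutually exclusive and exhaustive hypotheses. The numbers below are hypothetical, chosen only to show the mechanics (this sketch is not from the notes).

def bayes(priors, likelihoods):
    """Posterior P(H_i | E) for each hypothesis, from priors P(H_i)
    and likelihoods P(E | H_i), using Bayes' theorem."""
    evidence = sum(likelihoods[h] * priors[h] for h in priors)   # P(E) = Σ P(E|H_k) P(H_k)
    return {h: likelihoods[h] * priors[h] / evidence for h in priors}

# Hypothetical illness-diagnosis numbers: P(H), and P(spots | H) for each illness
priors      = {"measles": 0.01, "flu": 0.10, "other": 0.89}
likelihoods = {"measles": 0.90, "flu": 0.05, "other": 0.005}
print(bayes(priors, likelihoods))   # measles becomes the most probable explanation of the evidence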

Bayesian statistics lie at the heart of most statistical reasoning systems. How is Bayes' theorem exploited?

The key is to formulate the problem correctly: P(A|B) states the probability of A given only B's evidence. If there is other relevant evidence, then it must also be considered.

Herein lies a problem:
 All events must be mutually exclusive. However, in real-world problems events are not generally unrelated. For example, in diagnosing measles, the symptoms of spots and a fever are related. This means that computing the conditional probabilities gets complex: in general, given prior evidence p and some new observation N, computing P(H | p, N) grows exponentially for large sets of p.
 All events must be exhaustive. This means that in order to compute all probabilities, the set of possible events must be closed. Thus if new information arises, the set must be created afresh and all probabilities recalculated.

Thus simple Bayes rule-based systems are not suitable for uncertain reasoning:
 Knowledge acquisition is very hard.
 Too many probabilities are needed -- too large a storage space.
 Computation time is too large.


 Updating new information is difficult and time consuming.
 Exceptions like "none of the above" cannot be represented.
 Humans are not very good probability estimators.

However, Bayesian statistics still provide the core of reasoning in many uncertain reasoning systems, with suitable enhancements to overcome the above problems. We will look at three broad categories:
 Certainty factors,
 Dempster-Shafer models,
 Bayesian networks.

Belief Models and Certainty Factors

This approach was suggested by Shortliffe and Buchanan and used in their famous MYCIN medical diagnosis system. MYCIN is essentially an expert system; here we only concentrate on the probabilistic reasoning aspects of MYCIN.

 MYCIN represents knowledge as a set of rules. Associated with each rule is a certainty factor.
 A certainty factor is based on measures of belief MB and disbelief MD of a hypothesis H given evidence E, along the following (standard MYCIN) lines:

 MB(H|E) = 1 if P(H) = 1, otherwise MB(H|E) = (max[P(H|E), P(H)] - P(H)) / (1 - P(H))
 MD(H|E) = 1 if P(H) = 0, otherwise MD(H|E) = (P(H) - min[P(H|E), P(H)]) / P(H)

 where P(H) is the standard (prior) probability of H.
 The certainty factor C of some hypothesis H given evidence E is then defined as:

 CF(H|E) = MB(H|E) - MD(H|E)

Reasoning with Certainty Factors

 Rules are expressed in the form: if <evidence list> then there is suggestive evidence with probability p for <hypothesis>.
 MYCIN uses rules to reason backward from its goal of predicting a disease-causing organism to the clinical data (evidence).
 Certainty factors initially supplied by experts are changed according to the previous formulae.
 How do we perform reasoning when several rules are chained together? Measures of belief and disbelief given several observations E1 and E2 are combined (in the standard certainty-factor treatment) as follows:

 MB(H | E1, E2) = MB(H | E1) + MB(H | E2) (1 - MB(H | E1))

 How about our belief about several hypotheses taken together? Measures of belief given several hypotheses to be combined logically are calculated as:

 MB(H1 ∧ H2 | E) = min[MB(H1 | E), MB(H2 | E)]
 MB(H1 ∨ H2 | E) = max[MB(H1 | E), MB(H2 | E)]

 Disbelief is calculated similarly.
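A small sketch of the combination rules in Python. The formulas follow the standard certainty-factor treatment summarised above rather than anything given explicitly in these notes, and the values are hypothetical.

def combine_mb(mb1, mb2):
    """Belief in H after two pieces of evidence: MB(H|E1,E2) = MB1 + MB2*(1 - MB1)."""
    return mb1 + mb2 * (1 - mb1)

def mb_and(mb_h1, mb_h2):
    """Belief in a conjunction of hypotheses: the minimum of the individual beliefs."""
    return min(mb_h1, mb_h2)

def mb_or(mb_h1, mb_h2):
    """Belief in a disjunction of hypotheses: the maximum of the individual beliefs."""
    return max(mb_h1, mb_h2)

def certainty_factor(mb, md):
    """CF(H|E) = MB(H|E) - MD(H|E)."""
    return mb - md

# Hypothetical values: two rules each lend some belief to the same hypothesis
print(combine_mb(0.4, 0.3))           # 0.58
print(certainty_factor(0.58, 0.1))    # 0.48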

Bayesian Networks

These are also called Belief Networks or Probabilistic Inference Networks, initially developed by Pearl (1988).


The basic idea is:
 Knowledge in the world is modular: most events are conditionally independent of most other events.
 Adopt a model that uses a more local representation, allowing interactions only between events that actually affect each other.
 Some influences may be unidirectional, others bidirectional: make a distinction between these in the model.
 Events may be causal and thus get chained together in a network.

Implementation

 A Bayesian network is a directed acyclic graph:
  o a graph whose directed links indicate the dependencies that exist between nodes,
  o nodes represent propositions about events, or the events themselves,
  o conditional probabilities quantify the strength of the dependencies.

Consider the following example:

 There is some probability that my car won't start. If my car won't start, then it is likely that
  o the battery is flat, or
  o the starting motor is broken.
 In order to decide whether to fix the car myself or send it to the garage, I make the following decision:
  If the headlights do not work, then the battery is likely to be flat, so I fix it myself.
  If the starting motor is defective, then send the car to the garage.
  If the battery and starting motor have both gone, send the car to the garage.

The network to represent this is as follows:

Fig.  A simple Bayesian network

Reasoning in Bayesian (belief) nets

 Probabilities in links obey the standard conditional probability axioms.
 Therefore, follow links in reaching a hypothesis and update beliefs accordingly.
 A few broad classes of algorithms have been used to help with this:
  o Pearl's message passing method.
  o Clique triangulation.
  o Stochastic methods.
  o Basically they all take advantage of clusters in the network and use the limits on their influence to constrain the search through the net.
  o They also ensure that probabilities are updated correctly.
 Since information is local, it can be readily added and deleted with minimum effect on the whole network; only affected nodes need updating.

Example
 o Consider the "block-lifting" problem:
 o B: the battery is charged.
 o L: the block is liftable.


 o M: the arm moves.
 o G: the gauge indicates that the battery is charged.
 o Joint distribution, using the chain rule and the network's independences:
  p(G,M,B,L) = p(G|M,B,L) p(M|B,L) p(B|L) p(L) = p(G|B) p(M|B,L) p(B) p(L)
 o Specification:
  Traditional (full joint): 16 rows
  Bayesian network: 8 rows

Reasoning: top-down
 o Example: if the block is liftable, compute the probability of the arm moving, i.e., compute p(M | L).
 o Solution:
  Insert the parent node: p(M|L) = p(M,B|L) + p(M,¬B|L)
  Use the chain rule: p(M|L) = p(M|B,L) p(B|L) + p(M|¬B,L) p(¬B|L)
  Remove the independent node: p(B|L) = p(B) (B does not have a parent), p(¬B|L) = p(¬B) = 1 - p(B)
  p(M|L) = p(M|B,L) p(B) + p(M|¬B,L) (1 - p(B)) = 0.9 × 0.95 + 0.0 × (1 - 0.95) = 0.855

Reasoning: bottom-up
 Example: if the arm cannot move, compute the probability that the block is not liftable, i.e., compute p(¬L|¬M).
 Use Bayes' rule: p(¬L|¬M) = p(¬M|¬L) p(¬L) / p(¬M)
  p(¬M|¬L) = 0.9525 (by top-down reasoning, left as an exercise)
  p(¬L) = 1 - p(L) = 1 - 0.7 = 0.3
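The two calculations can be reproduced in a few lines of Python (an illustration, not from the notes). The conditional probability table below is an assumption inferred from the worked numbers above (in particular, p(M|B,¬L) = 0.05 is implied by the exercise value p(¬M|¬L) = 0.9525).

# Priors and conditional probabilities for the block-lifting network (assumed values)
p_L, p_B = 0.7, 0.95
p_M_given = {(True, True): 0.9, (True, False): 0.05,
             (False, True): 0.0, (False, False): 0.0}   # keyed by (B, L)

# Top-down: p(M | L) = p(M|B,L) p(B) + p(M|¬B,L) (1 - p(B))
p_M_given_L    = p_M_given[(True, True)] * p_B + p_M_given[(False, True)] * (1 - p_B)
p_M_given_notL = p_M_given[(True, False)] * p_B + p_M_given[(False, False)] * (1 - p_B)
print(p_M_given_L)                      # 0.855

# Bottom-up, by Bayes' rule: p(¬L | ¬M) = p(¬M|¬L) p(¬L) / p(¬M)
p_notM = (1 - p_M_given_L) * p_L + (1 - p_M_given_notL) * (1 - p_L)
p_notL_given_notM = (1 - p_M_given_notL) * (1 - p_L) / p_notM
print(round(1 - p_M_given_notL, 4))     # 0.9525, the exercise value
print(round(p_notL_given_notM, 4))      # about 0.7379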


Chapter -5: Knowledge Representation

Solving complex AI problems requires large amounts of knowledge and mechanisms for manipulating that knowledge. The inference mechanisms that operate on knowledge rely on the way knowledge is represented: a good knowledge representation model allows more powerful inference mechanisms to operate on it. While representing knowledge one has to consider two things:
1. Facts, which are truths in some relevant world.
2. Representation of facts in some chosen formalism. These are the things which are actually manipulated by the inference mechanism.

Knowledge representation schemes are useful only if there are functions that map facts to representations and vice versa. AI is especially concerned with natural language representations of facts and the functions which map natural language sentences into some representational formalism. An appealing way of representing facts is using the language of logic. Logical formalism provides a way of deriving new knowledge from the old through mathematical deduction. In this formalism, we can conclude that a new statement is true by proving that it follows from the statements already known to be facts.

STRUCTURED REPRESENTATION OF KNOWLEDGE

Representing knowledge using a logical formalism, like predicate logic, has several advantages. It can be combined with powerful inference mechanisms like resolution, which makes reasoning with facts easy. But complex structures of the world (objects and their relationships, events, sequences of events etc.) cannot be described easily using logical formalism alone.

A good system for the representation of structured knowledge in a particular domain should possess the following four properties:

(i) Representational Adequacy:- The ability to represent all kinds of knowledge that are needed in that domain.

(ii) Inferential Adequacy :- The ability to manipulate the represented structure and infer new structures.

(iii) Inferential Efficiency:- The ability to incorporate additional information into the knowledge structure that will aid the inference mechanisms.

(iv) Acquisitional Efficiency :- The ability to acquire new information easily, either by direct insertion or by program control.

The techniques that have been developed in AI systems to accomplish these objectives fall under two categories:

1. Declarative Methods: In these, knowledge is represented as a static collection of facts which are manipulated by general procedures. Here the facts need to be stored only once and they can be used in


any number of ways. Facts can be easily added to declarative systems without changing the general procedures.

2. Procedural Methods: In these, knowledge is represented as procedures. Default reasoning and probabilistic reasoning are examples of procedural methods. In these, heuristic knowledge of "how to do things efficiently" can be easily represented.

In practice, most knowledge representations employ a combination of both. Most knowledge representation structures have been developed for programs that handle natural language input. One of the reasons that knowledge structures are so important is that they provide a way to represent information about commonly occurring patterns of things; such descriptions are sometimes called schemas. One definition of schema is:

"Schema refers to an active organization of the past reactions, or of past experience, which must always be supposed to be operating in any well adapted organic response."

By using schemas, people as well as programs can exploit the fact that the real world is not random. Several types of schemas have proved useful in AI programs; they include the semantic nets, frames, conceptual dependency structures and scripts described later in this chapter.

Approaches to Knowledge Representation

We briefly survey some representation schemes:
 Simple relational knowledge
 Inheritable knowledge
 Inferential knowledge
 Procedural knowledge

Simple relational knowledge

The simplest way of storing facts is to use a relational method where each fact about a set of objects is set out systematically in columns (Fig. 7), typically in the form of tables like those in a database. This representation gives little opportunity for inference: each relation (row in a table) by itself provides only very weak inferential capability, but the relations may serve as the input to powerful inference engines, so the tables can be used as the knowledge basis for such engines.

Figure: Simple Relational Knowledge

We can ask things like:
 Who is dead?
 Who plays Jazz/Trumpet etc.?

This sort of representation is popular in database systems.

Inheritable knowledge

Relational knowledge is made up of objects consisting of
 attributes and
 corresponding associated values.
We extend the base further by allowing inference mechanisms:
 Property inheritance
  o elements inherit values from being members of a class,
  o data must be organised into a hierarchy of classes (Fig. 8).


Fig. 8: Property Inheritance Hierarchy

 Boxed nodes represent objects and values of attributes of objects.
 Values can be objects with attributes, and so on.
 Arrows point from an object to its value.
 This structure is known as a slot-and-filler structure, a semantic network, or a collection of frames.

The algorithm to retrieve a value for an attribute of an instance object:
1. Find the object in the knowledge base.
2. If there is a value for the attribute, report it.
3. Otherwise look for a value of instance; if none, fail.
4. Otherwise go to that node and find a value for the attribute and then report it.
5. Otherwise search through using isa links until a value is found for the attribute.
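The retrieval algorithm can be written as a short recursive lookup. This is an illustrative Python sketch; the objects and attributes are invented, not taken from Fig. 8.

# A tiny knowledge base: "instance" links an individual to its class,
# "isa" links a class to its superclass.
kb = {
    "Person":   {"isa": None, "legs": 2},
    "Musician": {"isa": "Person"},
    "Joe":      {"instance": "Musician", "instrument": "trumpet"},
}

def get_value(obj, attribute):
    """Follow the instance link, then isa links, until a value is found."""
    node = kb[obj]
    if attribute in node:                     # value stored on the object itself
        return node[attribute]
    parent = node.get("instance") or node.get("isa")
    if parent is None:
        return None                           # reached the top of the hierarchy: fail
    return get_value(parent, attribute)       # inherit from the class above

print(get_value("Joe", "instrument"))   # trumpet (stored directly)
print(get_value("Joe", "legs"))         # 2 (inherited from Person via Musician)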

Inferential knowledge
 Facts are represented in a logical form, which facilitates reasoning.
 An inference engine is required.

Procedural knowledge
 Coding actions to be performed when a condition is satisfied.
  o Example:
   IF
    has fever of more than 39°C
    is lazy about eating
    skin has red dots
   THEN
    suspect petechial fever

 Implementation:
  o Writing actions in the LISP programming language
  o Writing actions in a production system framework, like CLIPS or JESS

Issues in Knowledge Representation

Below are listed issues that should be raised when using a knowledge representation technique:

Important Attributes
-- Are there any attributes that occur in many different types of problem? There are two, instance and isa, and each is important because each supports property inheritance.

Relationships
-- What about the relationship between the attributes of an object, such as inverses, existence, techniques for reasoning about values, and single-valued attributes? We can consider an example of an inverse:
 inband(John Zorn, Naked City)
This can be treated as "John Zorn plays in the band Naked City" or "John Zorn's band is Naked City". Another representation is:
 band = Naked City
 band-members = John Zorn, Bill Frissell, Fred Frith, Joey Barron, ...

Granularity
-- At what level should the knowledge be represented, and what are the primitives? Primitives are fundamental concepts such as holding, seeing, playing; as English is a very rich language with over half a million words, it is clear we will find difficulty in deciding which words to choose as our primitives in a series of situations.

 At what level of detail should knowledge be represented?
  o Balance the trade-off:
   High-level facts may not be adequate for inference.
   Low-level primitives may require a lot of storage.
 How should sets of objects be represented?
  o By names.
  o By extensional definition.
  o By intensional definition.
 Given a large amount of knowledge stored, how can relevant parts be accessed?
  o Selecting an initial structure.
  o Revising the choice.

If Tom feeds a dog then it could become feeds(tom, dog); if Tom gives the dog a bone, gives(tom, dog, bone). Are these the same? In any sense, does giving an object food constitute feeding? If give(x, food) → feed(x), then we are making progress. But we need to add certain inferential rules.

In the famous program on relationships, "Louise is Bill's cousin". How do we represent this? louise = daughter(brother or sister(father or mother(bill))). Suppose the cousin is Chris: then we do not know whether Chris is male or female, and then son applies as well. Clearly the separate levels of understanding require different levels of primitives, and these need many rules to link together apparently similar primitives. Obviously there is a potential storage problem, and the underlying question must be what level of comprehension is needed.


What are Semantic networks and frames? Explain with suitable examples.

A semantic network is often used as a form of knowledge representation. It is a directed graph consisting of vertices which represent concepts and edges which represent semantic relations between the concepts. The following semantic relations are commonly represented in a semantic net.

Meronymy (A is part of B)

Holonymy (B has A as a part of itself)

Hyponymy (or troponymy) (A is subordinate of B; A is kind of B)

Hypernymy (A is superordinate of B)

Synonymy (A denotes the same as B)

Antonymy (A denotes the opposite of B)

Example

The physical attributes of a person can be represented as in the following figure using a semantic net.

These values can also be represented in logic as: isa(person, mammal), instance(Mike-Hall, person), team(Mike-Hall, Cardiff)

( For detail of semantic network please follow lecture slides)

Frames

Frames are descriptions of conceptual individuals. Frames can exist for "real" objects such as "The Everest Hotel", sets of objects such as "Hotels", or more "abstract" objects such as "Cola-Wars" or "Gulfwar".

A Frame system is a collection of objects. Each object contains a number of slots. A slot represents an attribute. Each slot has a value. The value of an attribute can be another object. Frames are essentially defined by their relationships with other frames. Relationships between frames are represented using slots. If a frame f is in a relationship r to a frame g, then we put the value g in the r slot of f.

For example, suppose we are describing the following genealogical tree:


The frame describing Adam might look something like:

Adam:
 sex: Male
 spouse: Beth
 child: (Charles Donna Ellen)
where sex, spouse, and child are slots.

The genealogical tree would then be described by (at least) seven frames, describing the following individuals: Adam, Beth, Charles, Donna, Ellen, Male, and Female.

A frame can be considered just a convenient way to represent a set of predicates applied to constant symbols (e.g. ground instances of predicates.). For example, the frame above could be written:

sex(Adam, Male)
spouse(Adam, Beth)
child(Adam, Charles)
child(Adam, Donna)
child(Adam, Ellen)

Frames can also be regarded as an extension to semantic nets; indeed, it is not clear where the distinction between a semantic net and a frame ends. Semantic nets were initially used to represent labelled connections between objects. As tasks became more complex, the representation needed to be more structured, and the more structured the system, the more beneficial it becomes to use frames. A frame is a collection of attributes or slots and associated values that describe some real-world entity.
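A frame system maps naturally onto dictionaries or objects. The sketch below is illustrative Python (not from the notes); it encodes the Adam frame, shows a slot whose filler names another frame, and rewrites the frame as ground predicate instances as described above.

frames = {
    "Adam":    {"sex": "Male",   "spouse": "Beth", "child": ["Charles", "Donna", "Ellen"]},
    "Beth":    {"sex": "Female", "spouse": "Adam", "child": ["Charles", "Donna", "Ellen"]},
    "Charles": {"sex": "Male"},
}

def slot(frame_name, slot_name):
    """Read a slot; the filler may itself name another frame."""
    return frames[frame_name].get(slot_name)

print(slot("Adam", "spouse"))                 # Beth
print(slot(slot("Adam", "spouse"), "sex"))    # Female: follow the filler to the Beth frame

# The same frame written as ground predicate instances:
facts = [("sex", "Adam", "Male"), ("spouse", "Adam", "Beth")] + \
        [("child", "Adam", c) for c in frames["Adam"]["child"]]
print(facts)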

Semantic Networks

 First introduced by Quillian back in the late 60s:
  M. Ross Quillian, "Semantic Memory", in M. Minsky, editor, Semantic Information Processing, pages 216-270, Cambridge, MA: MIT Press, 1968.
 A semantic network is a simple representation scheme which uses a graph of labeled nodes and labeled directed arcs to encode knowledge:
  Nodes – objects, concepts, events
  Arcs – relationships between nodes
 The graphical depiction associated with semantic networks is a big reason for their popularity.
 The idea behind a semantic network is that knowledge is often best understood as a set of concepts that are related to one another.
 Arcs define binary relations which hold between objects denoted by the nodes.

Fig. A small semantic network relating John, Sue and Max through mother, father and age links; in logic: mother(john, sue), age(john, 5), wife(sue, max), age(sue, 34), … (diagram not reproduced).

Inheritance reasoning in semantic nets
 follow MemberOf and SubsetOf links up the hierarchy,


 stop at the category with a property link to infer the property for an individual.

Semantic net advantages
 simplicity of inference,
 ease of visualizing, even for large nets,
 ease of representing default values for categories, and ease of overriding defaults by more specific values.

But it is awkward or impossible
 to capture many of FOL's representational capabilities: negation, disjunction, existential quantification, …
 and when extended to do so, the semantic net loses its attractive simplicity.

Frames

 Frames: a semantic net with properties.
 A frame represents an entity as a set of slots (attributes) and associated values.
 A frame can represent a specific entity or a general concept.
 Frames are implicitly associated with one another because the value of a slot can be another frame.
 The three components of a frame:
  · frame name
  · attributes (slots)
  · values (fillers: list of values, range, string, etc.)
 More natural support for values than semantic nets.
 Can be easily implemented using object-oriented programming techniques.
 Inheritance is easily controlled.
 Similar to the object-oriented programming paradigm.

Book Frame
 Slot     Filler
 Title    AI: A Modern Approach
 Author   Russell & Norvig
 Year     2003

Benefits:
 Makes programming easier by grouping related knowledge.
 Easily understood by non-developers.
 Expressive power.
 Easy to set up slots for new properties and relations.
 Easy to include default information and detect missing values.

Drawbacks:
 No standards.
 More of a general methodology than a specific representation:
  • a frame for a classroom will be different for a professor and for a maintenance worker.
 No associated reasoning/inference mechanisms.

Conceptual Dependency (CD)

This representation is used in natural language processing in order to represent the meaning of sentences in such a way that inferences can be made from them. It is independent of the language in which the sentences were originally stated. CD representations of a sentence are built out of primitives, which are not words belonging to the language but are conceptual; these primitives are combined to form the meanings of the words. As an example, consider the CD graph of a simple event sentence (the diagram is not reproduced here). In such a representation the symbols have the following meanings:

 Arrows indicate the direction of dependency.
 A double arrow indicates a two-way link between the actor and the action.
 p indicates past tense.
 ATRANS is one of the primitive acts used by the theory; it indicates transfer of possession.
 o indicates the object case relation.
 R indicates the recipient case relation.

Conceptual dependency provides a structure in which knowledge can be represented and also a set of building blocks from which representations can be built. A typical set of primitive actions is:

ATRANS - Transfer of an abstract relationship (e.g. give)


PTRANS - Transfer of the physical location of an object (e.g. go)
PROPEL - Application of physical force to an object (e.g. push)
MOVE - Movement of a body part by its owner (e.g. kick)
GRASP - Grasping of an object by an actor (e.g. throw)
INGEST - Ingesting of an object by an animal (e.g. eat)
EXPEL - Expulsion of something from the body of an animal (e.g. cry)
MTRANS - Transfer of mental information (e.g. tell)
MBUILD - Building new information out of old (e.g. decide)
SPEAK - Production of sounds (e.g. say)
ATTEND - Focusing of a sense organ toward a stimulus (e.g. listen)

A second set of building blocks is the set of allowable dependencies among the conceptualizations described in a sentence.

Advantages of CD:
 Using these primitives involves fewer inference rules.
 Many inference rules are already represented in the CD structure.
 The holes in the initial structure help to focus on the points still to be established.

Disadvantages of CD:
 Knowledge must be decomposed into fairly low-level primitives.
 It can be impossible or difficult to find the correct set of primitives.
 A lot of inference may still be required.
 Representations can be complex even for relatively simple actions. Consider: "Dave bet Frank five pounds that Wales would win the Rugby World Cup."
 Complex representations require a lot of storage.

Applications of CD:

MARGIE (Meaning Analysis, Response Generation and Inference on English) -- a model of natural language understanding.

SAM (Script Applier Mechanism) -- uses scripts to understand stories. See the next section.

PAM (Plan Applier Mechanism) -- uses plans to understand stories.

Script

A script is a structured representation of background world knowledge. This structure contains knowledge about objects, actions, and situations that are described in the input text. If we consider the knowledge about shopping or entering a restaurant, this kind of stored knowledge about stereotypical events is called a script. A script is a structure that prescribes a set of circumstances which could be expected to follow on from one another. It is similar to a thought sequence or a chain of situations which could be anticipated. It could be considered to consist of a number of slots or frames, but with more specialised roles.

Scripts are beneficial because:
 Events tend to occur in known runs or patterns.
 Causal relationships between events exist.
 Entry conditions exist which allow an event to take place.
 Prerequisites exist for events taking place, e.g. when a student progresses through a degree scheme or when a purchaser buys a house.

The components of a script include:

Entry Conditions

-- These must be satisfied before events in the script can occur.

Results
-- Conditions that will be true after events in the script occur.

Props
-- Slots representing objects involved in the events.

Roles
-- Persons involved in the events.


Track
-- Variations on the script. Different tracks may share components of the same script.

Scenes
-- The sequence of events that occur. Events are represented in conceptual dependency form.

Advantages of Scripts:
 Ability to predict events.
 A single coherent interpretation may be built up from a collection of observations.

Disadvantages:
 Less general than frames.
 May not be suitable to represent all kinds of knowledge.

Scripts are useful in describing certain situations such as robbing a bank.

Chapter-6

Refer pulchowk notes+


What is an artificial neural network?

An artificial neural network is a system based on the operation of biological neural networks; in other words, it is an emulation of a biological neural system. Why would it be necessary to implement artificial neural networks? Although computing these days is truly advanced, there are certain tasks that a program made for a common microprocessor is unable to perform; a software implementation of a neural network can be made, with its own advantages and disadvantages.

Advantages:
 A neural network can perform tasks that a linear program cannot.
 When an element of the neural network fails, it can continue without any problem because of its parallel nature.
 A neural network learns and does not need to be reprogrammed.
 It can be implemented in any application.
 It can be implemented without any problem.

Disadvantages:
 The neural network needs training to operate.
 The architecture of a neural network is different from the architecture of microprocessors and therefore needs to be emulated.
 High processing time is required for large neural networks.

Expert System Architecture

Fig. Expert system architecture: the user interacts, through perceptors, a knowledge base editor and an explanation subsystem, with an inference engine that operates on the knowledge base and a world model (diagram not reproduced).

Explanation Based Learning

The EBL Hypothesis
By understanding why an example is a member of a concept, one can learn the essential properties of the concept.

Trade-off
EBL trades the need to collect many examples for the ability to "explain" single examples (using a "domain" theory).

Learning by Generalizing Explanations
Given:
– a Goal (e.g., some predicate calculus statement),
– a Situation Description (facts),
– a Domain Theory (inference rules),
– an Operationality Criterion.


Use the problem solver to justify, using the rules, the goal in terms of the facts. Generalize the justification as much as possible. The operationality criterion states which other terms can appear in the generalized result.

• An explanation is an inter-connected collection of "pieces" of knowledge (inference rules, rewrite rules, etc.).
• These "rules" are connected using unification, as in Prolog.
• The generalization task is to compute the most general unifier that allows the "knowledge pieces" to be connected together as generally as possible.

Machine Vision

The goal of Machine Vision is to create a model of the real world from images– A machine vision system recovers useful information about a scene from its two

dimensional projectionsThe world is three dimensional– Two dimensional digitized images

Knowledge about the objects (regions) in a scene and projection geometry is required. The information which is recovered differs depending on the application

– Satellite, medical images etc. Processing takes place in stages:

– Enhancement, segmentation, image analysis and matching (pattern recognition).


Machine Vision Stages

Boltzmann Machine

A Boltzmann machine is a type of stochastic recurrent neural network invented by Geoffrey Hinton and Terry Sejnowski. Boltzmann machines can be seen as the stochastic, generative counterpart of Hopfield nets. They were one of the first examples of a neural network capable of learning internal representations, and they are able to represent and (given sufficient time) solve difficult combinatoric problems. However, due to a number of issues discussed below, Boltzmann machines with unconstrained connectivity have not proven useful for practical problems in machine learning or inference. They are still theoretically intriguing, however, due to the locality and Hebbian nature of their training algorithm, as well as their parallelism and the resemblance of their dynamics to simple physical processes. If the connectivity is constrained, the learning can be made efficient enough to be useful for practical problems.

They are named after the Boltzmann distribution in statistical mechanics, which is used in their sampling function.

A Boltzmann machine, like a Hopfield network, is a network of units with an "energy" defined for the network. It also has binary units, but unlike Hopfield nets, Boltzmann machine units are stochastic. The global energy E in a Boltzmann machine is identical in form to that of a Hopfield network:

 E = - Σ_{i<j} w_ij s_i s_j + Σ_i θ_i s_i

Where:
 w_ij is the connection strength between unit j and unit i,
 s_i is the state, s_i ∈ {0, 1}, of unit i,
 θ_i is the threshold of unit i.

The connections in a Boltzmann machine have two restrictions:
 w_ii = 0 for all i (no unit has a connection with itself),
 w_ij = w_ji for all i, j (all connections are symmetric).
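A minimal sketch of the energy computation (illustrative Python with made-up weights, not from the notes) for a small machine with symmetric weights and zero self-connections:

def energy(w, theta, s):
    """Global energy E = -Σ_{i<j} w_ij s_i s_j + Σ_i θ_i s_i
    for binary states s_i in {0, 1}, symmetric w (w_ij = w_ji) and w_ii = 0."""
    n = len(s)
    pairwise = sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))
    bias = sum(theta[i] * s[i] for i in range(n))
    return -pairwise + bias

# Hypothetical 3-unit machine
w = [[0.0, 1.0, -2.0],
     [1.0, 0.0, 0.5],
     [-2.0, 0.5, 0.0]]      # symmetric, zero diagonal
theta = [0.1, -0.3, 0.2]
s = [1, 0, 1]               # binary unit states
print(energy(w, theta, s))  # energy of this global state: 2.3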
