
Objectives of Course (Artificial Intelligence)

– Understand the definition of artificial intelligence

– Machine Learning

– Natural Language

– Expert Systems

– Neural Network

– Have a fair idea of the types of problems that can currently be solved by computers and those that are as yet beyond their ability.

Introduction to AI

Types of AI tasks:

One possible classification of AI tasks is into three classes: mundane tasks, formal tasks and expert tasks.

1. Mundane tasks: by mundane tasks we mean all those tasks that nearly all of us can do routinely in order to act and interact in the world. These include: perception, vision, speech, natural language (understanding, generation and translation), common sense reasoning, and robot control.

2. Formal tasks:

a. Games: Chess, backgammon, Go, etc. To solve these problems we must explore a large

number of solutions quickly and choose the best one.

b. Mathematics:

i. Geometry and logic theory: early theorem-proving programs proved mathematical theorems; they actually proved several theorems from classical math textbooks.

ii. Integral calculus: programs such as Mathematica and Mathcad can perform complicated symbolic integration and differentiation.

c. Proving properties of programs, e.g. correctness, by manipulating symbols to reduce the problem.

3. Expert tasks: by expert tasks we mean things that only some people are good at and which require extensive training. These include:

a. Engineering: Design, Fault finding, Manufacturing

b. Planning

c. Scientific analysis

d. Medical diagnosis

e. Financial analysis

What is AI?

AI is one of the newest disciplines, formally initiated in 1956 by McCarthy when the name was coined. The

advent of computers made it possible for the first time for people to test models they proposed for learning,

reasoning, perceiving etc.

Definitions of AI may be organized into four categories:

1. Systems that think like humans

2. Systems that act like humans

3. Systems that think rationally

4. Systems that act rationally

1. Systems that think like humans:

This requires "getting inside" the human mind to see how it works and then comparing our computer programs to this. This is what cognitive science attempts to do. Another way to do this is to observe a human solving problems and argue that one's programs go about problem solving in a similar way.


For example, General Problem Solver (GPS) was an early computer program that attempted to model human

thinking. The developers were not so interested in whether or not GPS solved problems correctly. They were

more interested in showing that it solved problems like people, going through the same steps and taking

around the same amount of time to perform those steps.

2. Systems that act like humans:

The first proposal for success in building a program that acts humanly was the Turing Test. To be considered

intelligent a program must be able to act sufficiently like a human to fool an interrogator. The machine and

the human are isolated from the person carrying out the test and messages are exchanged via a keyboard and

screen. If the person cannot distinguish between the computer and the human being, then the computer must

be intelligent. To pass this test requires: NLP (natural language processing), knowledge representation,

automated reasoning, machine learning. A total Turing test also requires computer vision and robotics.

3. Systems that think rationally:

Aristotle was one of the first to attempt to codify "thinking". For example, all computers use energy. Using

energy always generates heat. Therefore, all computers generate heat.

This initiated the field of logic. Formal logic was developed in the late nineteenth century. This was the first

step toward enabling computer programs to reason logically. By 1965, programs existed that could, given

enough time and memory, take a description of the problem in logical notation and find the solution, if one

existed.

4. Systems that act rationally:

Acting rationally means acting so as to achieve one’s goals, given one’s beliefs. An agent is just something

that perceives and acts.

In the logical approach to AI, the emphasis is on correct inferences. This is often part of being a rational agent

because one way to act rationally is to reason logically and then act on one's conclusions.

Foundation of AI:

Philosophy: Logic, methods of reasoning, mind as a physical system, foundations of learning, language, rationality.

Mathematics: Formal representation and proof, algorithms, computation, (un)decidability, (in)tractability, probability. Philosophers staked out most of the important ideas of AI, but to move to a formal science requires a level of mathematical formalism in three main areas: computation, logic and probability. Mathematicians proved that there exists an algorithm to prove any true statement in first-order logic. Analogously, Turing showed that there are some functions that no Turing machine can compute. Although undecidability and non-computability are important in the understanding of computation, the notion of intractability has had much greater impact on computer science and AI. A class of problems is called intractable if the time required to solve instances of the class grows at least exponentially with the size of the instances.

Economics: utility, decision theory.

Neuroscience: physical substrate for mental activity.

Psychology: phenomena of perception and motor control, experimental techniques. The principal characteristic of cognitive psychology is that the brain possesses and processes information. The claim is that beliefs, goals, and reasoning steps can be useful components of a theory of human behavior.

The knowledge-based agent has three key steps:

o Stimulus is translated into an internal representation.

o The representation is manipulated by cognitive processes to derive new internal representations

o These are translated into actions.


Computer engineering: building fast computers.

Control theory: design systems that maximize an objective function over time.

Linguistics: knowledge representation, grammar. Having a theory of how humans successfully process

natural language is an AI-complete problem: if we could solve this problem then we would have

created a model of intelligence.

AI History:

Intellectual roots of AI date back to the early studies of the nature of knowledge and reasoning. The dream of

making a computer imitate humans also has a very early history.

The concept of intelligent machines is found in Greek mythology. There is a story from the 8th century A.D. about Pygmalion Olio, the legendary king of Cyprus.

Aristotle (384-322 BC) developed an informal system of syllogistic logic, which is the basis of the first formal

deductive reasoning system.

Early in the 17th century, Descartes proposed that bodies of animals are nothing more than complex machines.

Pascal in 1642 made the first mechanical digital calculating machine.

In 1943: McCulloch and Pits propose modeling neurons using on/off devices.

In 1950’s: Claude Shannon and Alan Turing try to write chess playing programs.

In 1956: John McCarthy thinks of the name "Artificial Intelligence".

In 1960’s: Logic Theorist, GPS (General Problem Solver), micro worlds, neural networks.

In 1971: NP-Completeness theory casts doubt on general applicability of AI methods.

In 1970’s: Knowledge based system and Expert systems were developed.

In 1980’s: AI techniques in widespread use, neural networks rediscovered.

The early AI systems used general methods and little knowledge. AI researchers later realized that specialized knowledge is required for rich tasks, in order to focus reasoning.

The 1990's saw major advances in all areas of AI including the following:

• Machine learning, data mining

• Intelligent tutoring,

• Case-based reasoning,

• Multi-agent planning, scheduling,

• Uncertain reasoning,

• Natural language understanding and translation,

• Vision, virtual reality, games, and other topics.

In 2000, the Nomad robot explores remote regions of Antarctica looking for meteorite samples.

Limits of AI Today

Today’s successful AI systems operate in well-defined domains and employ narrow, specialized knowledge.

Common sense knowledge is needed to function in complex, open-ended worlds. Such a system also needs to

understand unconstrained natural language. However these capabilities are not yet fully present in today’s

intelligent systems.

What can AI systems do?

Today’s AI systems have been able to achieve limited success in some of these tasks.

• In Computer vision, the systems are capable of face recognition


• In Robotics, we have been able to make vehicles that are mostly autonomous.

• In Natural language processing, we have systems that are capable of simple machine translation.

• Today’s Expert systems can carry out medical diagnosis in a narrow domain

• Speech understanding systems are capable of recognizing continuous speech with vocabularies of several thousand words

• Planning and scheduling systems have been employed in scheduling experiments with the Hubble Space Telescope.

• Learning systems are capable of doing text categorization into about 1000 topics

• In Games, AI systems can play at the Grand Master level in chess (world champion), checkers, etc.

What can AI systems NOT do yet?

• Understand natural language robustly (e.g., read and understand articles in a newspaper)

• Surf the web

• Interpret an arbitrary visual scene

• Learn a natural language

• Construct plans in dynamic real-time domains

• Exhibit true autonomy and intelligence

Goals in problem solving:

Goal schemas: To build a system to solve a particular problem, we need to do four things:

1. Define the problem precisely. This definition must include precise specifications of what the initial

situations will be as well as what final situations constitute acceptable solutions to the problem.

2. Analyze the problem. A few very important features can have an immense impact on the

appropriateness of various possible techniques for solving the problem.

3. Isolate and represent the task knowledge that is necessary to solve the problem.

4. Choose the best problem-solving techniques and apply them to the particular problem.

Search Problem

It is characterized by an initial state and a goal-state description. The guesses are called the operators where a

single operator transforms a state into another state which is expected to be closer to a goal state. Here the

objective may be to find a goal state or to find a sequence of operators to a goal state. Additionally, the

problem may require finding just any solution or an optimum solution.
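To make this concrete, here is a minimal sketch (not from the notes; the toy states and operators are hypothetical) that represents a search problem by an initial state, a goal test and a set of operators, and uses breadth-first search to find a sequence of operators leading to a goal state:

from collections import deque

def solve(initial_state, goal_test, operators):
    """Breadth-first search: return a list of operator names that reaches a goal state."""
    frontier = deque([(initial_state, [])])      # (state, sequence of operator names so far)
    visited = {initial_state}
    while frontier:
        state, path = frontier.popleft()
        if goal_test(state):
            return path                          # a shortest sequence of operators
        for name, apply_op in operators:
            new_state = apply_op(state)
            if new_state not in visited:
                visited.add(new_state)
                frontier.append((new_state, path + [name]))
    return None                                  # no goal state is reachable

# Hypothetical toy problem: transform the number 1 into 10 using "double" and "add 1".
operators = [("double", lambda n: n * 2), ("add1", lambda n: n + 1)]
print(solve(1, lambda n: n == 10, operators))    # -> ['double', 'double', 'add1', 'double']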

Constraint Satisfaction (Problem solving):

Here, the main objective is to discover some problem state, which satisfies a given set of constraints. By

viewing problem as one of constraint satisfaction, substantial amount of search can be possibly reduced as

compared to a method that attempts to form a partial solution directly by choosing specific values for

components of the eventual solution. Constraint satisfaction is a search procedure that operates in a space of

constraint sets. The initial state contains the constraints that are originally given in the problem description.

A goal state is any state that has been constrained "enough", where "enough" must be defined for each

problem. Constraint satisfaction is a two-step process; first, constraints are discovered and propagated as far

as possible throughout the system. Then if there is still not a solution, search begins. A guess about something

is made and added as a new constraint. Propagation can then occur with this new constraint and so forth.

It is fairly easy to see that a CSP can be given an incremental formulation as a standard search problem as follows:

Initial state: the empty assignment, in which all variables are unassigned.

Successor function: a value can be assigned to any unassigned variable, provided that it does not conflict with

previously assigned variables.


Goal test: the current assignment is complete.

Path cost: a constant cost (e.g., 1) for every step

What is a CSP?

Finite set of variables V1, V2, …, Vn

Nonempty domain of possible values for each variable DV1, DV2, … DVn

Finite set of constraints C1, C2, …, Cm

Each constraint Ci limits the values that variables can take,

e.g., V1 ≠ V2

A state is defined as an assignment of values to some or all variables.

Consistent assignment: an assignment that does not violate the constraints.

CSP benefits

Standard representation pattern

Generic goal and successor functions

Generic heuristics (no domain specific expertise).

An assignment is complete when every variable is mentioned.

A solution to a CSP is a complete assignment that satisfies all constraints.

Some CSPs require a solution that maximizes an objective function.
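The formulation above can be sketched directly in code. The following is an illustrative Python sketch (the three-region map and its constraints are hypothetical, not from the notes): it extends the empty assignment one variable at a time, rejects values that conflict with previously assigned variables, and stops when the assignment is complete.

def backtrack(assignment, variables, domains, constraints):
    """Return a complete, consistent assignment, or None if no solution exists."""
    if len(assignment) == len(variables):                      # goal test: assignment is complete
        return assignment
    var = next(v for v in variables if v not in assignment)    # pick an unassigned variable
    for value in domains[var]:
        # successor function: the value must not conflict with previously assigned variables
        if all(ok(var, value, assignment) for ok in constraints):
            result = backtrack({**assignment, var: value}, variables, domains, constraints)
            if result is not None:
                return result
    return None

def must_differ(a, b):
    """Constraint: variables a and b may not take the same value (e.g. adjacent map regions)."""
    def ok(var, value, assignment):
        if var == a:
            return assignment.get(b) != value
        if var == b:
            return assignment.get(a) != value
        return True
    return ok

# Hypothetical 3-region map colouring problem: A-B and B-C are adjacent.
variables = ["A", "B", "C"]
domains = {v: ["red", "green", "blue"] for v in variables}
constraints = [must_differ("A", "B"), must_differ("B", "C")]
print(backtrack({}, variables, domains, constraints))          # {'A': 'red', 'B': 'green', 'C': 'red'}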

Examples of Applications:

Scheduling the time of observations on the Hubble Space Telescope

Airline schedules

Cryptography

Computer vision -> image interpretation

Scheduling your MS or PhD thesis exam

Map Coloring

Problem (a cryptarithmetic puzzle):

F O R T Y + T E N + T E N = S I X T Y

Solution:

(digit 0) 2N = 0 (mod 10)

(digit 1) 2E = 0 (mod 10), since no carry from digit 0 is possible (a carry would give 2E + 1 = 0 (mod 10), which has no solution).

Therefore N=0 and E=5.

Then O=9 and I=1, requiring a carry of 2 from digit 2 (and a carry of 1 into digit 4). Further S=F+1.

Digit 2 now gives the equation

R + 2T + 1 = 20 + X

The smallest digit left is 2. So X>=2

Therefore R + 2T >= 21. As R and T cannot both be 7, one of them must be larger, that is, at least 8.

We are left with some case checking.

Case R=8:

Then T>=6.5 that is T=7 and X=3. There are no consecutive

numbers left for F and S.

Case T=8:

Then R>=5 and as 5 is in use R>=6.

Case R=6: As X=3 there are no consecutive numbers left.

Case R=7: As X=4 we get F=2 and S=3. The remaining digit 6 will be Y.

Therefore we get a unique solution.

29786
  850
  850
-----
31486
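The hand derivation above can also be checked mechanically. The short brute-force sketch below (illustrative only, not part of the original notes; it enumerates about 3.6 million permutations, so it may take a little while) confirms that 29786 + 850 + 850 = 31486 is the unique solution:

from itertools import permutations

LETTERS = "FORTYENSIX"                 # the ten distinct letters of the puzzle

for digits in permutations(range(10)):
    value = dict(zip(LETTERS, digits))
    if value["F"] == 0 or value["T"] == 0 or value["S"] == 0:
        continue                       # leading digits may not be zero
    def num(word):
        return int("".join(str(value[c]) for c in word))
    if num("FORTY") + num("TEN") + num("TEN") == num("SIXTY"):
        print(num("FORTY"), num("TEN"), num("SIXTY"))   # prints: 29786 850 31486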

Planning:

The purpose of planning is to find a sequence of actions that achieves a given goal when performed starting in a

given state. In other words, given a set of operator instances (defining the possible primitive actions by the agent),

an initial state description, and a goal state description or predicate, the planning agent computes a plan.

Simple Planning Agent:

Earlier we saw that problem-solving agents are able to plan ahead - to consider the consequences of sequences of

actions - before acting. We also saw that knowledge-based agents can select actions based on explicit, logical

representations of the current state and the effects of actions. This allows the agent to succeed in complex,

inaccessible environments that are too difficult for a problem-solving agent.

Problem Solving Agents + Knowledge-based Agents = Planning Agents

In this module, we put these two ideas together to build planning agents. At the most abstract level, the task of

planning is the same as problem solving. Planning can be viewed as a type of problem solving in which the agent

uses beliefs about actions and their consequences to search for a solution over the more abstract space of plans.

What is a plan? A sequence of operator instances, such that "executing" them in the initial state will change the

world to a state satisfying the goal state description. Goals are usually specified as a conjunction of goals to be

achieved.

Properties of planning algorithm:

Soundness:

A planning algorithm is sound if all solutions found are legal plans

o All preconditions and goals are satisfied.

o No constraints are violated (temporal, variable binding)

Completeness:

A planning algorithm is complete if a solution can be found whenever one actually exists.

A planning algorithm is strictly complete if all solutions are included in the search space.

Optimality:

A planning algorithm is optimal if the order in which solutions are found is consistent with some measure

of plan quality.


Linear planning:

Basic idea: work on one goal until completely solved before moving on to the next goal.

Planning algorithm maintains goal stack.

Implications:

- No interleaving of goal achievement

- Efficient search if goals do not interact (much)

Advantages:

- Reduced search space, since goals are solved one at a time

- Advantageous if goals are (mainly) independent

- Linear planning is sound

Disadvantages:

- Linear planning may produce suboptimal solutions (based on the number of operators in the plan)

- Linear planning is incomplete.

Suboptimal plans:

Result of linearity, goal interactions and poor goal ordering

Initial state: at(obj1, locA), at(obj2, locA), at(747, locA)

Goals: at(obj1, locB), at(obj2, locB)

Plan: [load(obj1, 747, locA); fly(747, locA, locB); unload(obj1, 747, locB);

fly(747, locB, locA); Load(obj2, 747, locA); fly(747, locA, locB); unload(obj2, 747, locB)]

Concept of non-linear planning:

Use goal set instead of goal stack. Include in the search space all possible sub goal orderings.

Handles goal interactions by interleaving.

Advantages:

- Non-linear planning is sound.

- Non-linear planning is complete.

- Non-linear planning may be optimal with respect to plan length (depending on search strategy employed)

Disadvantages:

- Larger search space, since all possible goal orderings may have to be considered.

- Somewhat more complex algorithm; more bookkeeping.

Non-linear planning algorithm:

NLP(initial-state, goals)
  state = initial-state; plan = []; goalset = goals; opstack = []
  Repeat until goalset is empty:
    Choose a goal g from the goalset
    If g does not match state, then
      Choose an operator o whose add-list matches goal g
      Push o on the opstack
      Add the preconditions of o to the goalset
    While all preconditions of the operator on top of opstack are met in state:
      Pop operator o from the top of opstack
      state = apply(o, state)
      plan = [plan, o]
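A direct Python transcription of this algorithm is sketched below. Operators are written in a simple STRIPS-like style with a precondition list, an add list and a delete list; the single pickup operator and the goal used here are hypothetical, and the sketch does no backtracking over operator choices.

def nlp(initial_state, goals, operators):
    """Nonlinear planning with a goal set and an operator stack (sketch of the algorithm above)."""
    state, plan, goalset, opstack = set(initial_state), [], set(goals), []
    while goalset:
        g = goalset.pop()                                   # choose a goal g from the goal set
        if g not in state:
            # choose an operator whose add-list matches goal g (first match, no backtracking)
            op = next(o for o in operators if g in o["add"])
            opstack.append(op)
            goalset |= set(op["pre"]) - state               # its unmet preconditions become new goals
        # apply every operator on top of the stack whose preconditions are now met in state
        while opstack and set(opstack[-1]["pre"]) <= state:
            op = opstack.pop()
            state = (state - set(op["del"])) | set(op["add"])
            plan.append(op["name"])
    return plan

# Hypothetical blocks-world-style operator and goal:
ops = [{"name": "pickup(A)", "pre": ["clear(A)", "handempty"],
        "add": ["holding(A)"], "del": ["clear(A)", "handempty"]}]
print(nlp(["clear(A)", "handempty"], ["holding(A)"], ops))   # -> ['pickup(A)']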

Means-Ends Analysis

The means-ends analysis concentrates around the detection of differences between the current state and the

goal state. Once such difference is isolated, an operator that can reduce the difference must be found.

However, perhaps that operator cannot be applied to the current state. Hence, we set up a subproblem of

getting to a state in which it can be applied. The kind of backward chaining in which the operators are selected

and then sub goals are setup to establish the preconditions of the operators is known as operator subgoaling.

Just like the other problem-solving techniques, means-ends analysis relies on a set of rules that can transform

one problem state into another. However, these rules usually are not represented with complete state

descriptions on each side. Instead, they are represented as left side, which describes the conditions that must

be met for the rule to be applicable and a right side, which describes those aspects of the problem state that

will be changed by the application of the rule. A separate data structure called a difference table indexes the

rules by the differences that they can be used to reduce.

Algorithm: Means-Ends Analysis

1. Compare CURRENT to GOAL. If there are no differences between them, then return.

2. Otherwise, select the most important difference and reduce it by doing the following until success or failure

is signaled:

a) Select a new operator O, which is applicable to the current difference. If there are no such operators, then

signal failure.

b) Apply O to CURRENT. Generate descriptions of two states: O-START, a state in which O’s preconditions

are satisfied and O-RESULT, the state that would result if O were applied in O-START.

c) If

(FIRST-PART ← MEA(CURRENT, O-START)) and

(LAST-PART ← MEA(O-RESULT, GOAL))

are successful, then signal success and return the result of concatenating FIRST-PART, O and LAST-PART.

Production rules Systems:

Since search is a very important process in the solution of hard problems for which no more direct techniques

are available, it is useful to structure AI programs in a way that enables describing and performing the search

process. Production systems provide such structures. A production system consists of:

· A set of rules, each consisting of a left side that determines the applicability of the rule and a right side that

describes the operation to be performed if the rule is applied.

· One or more knowledge or databases that contain whatever information is appropriate for the particular task.

· A control strategy that specifies the order in which the rules will be compared to the database and a way of

resolving the conflicts that arise when several rules match at once.

· A rule applier.

Characteristics of Control Strategy

· The first requirement of a good control strategy is that it causes motion. Control strategies that do not cause motion will never lead to a solution. For example, a strategy that, on each cycle, chooses at random from among the applicable rules does cause motion and will lead to a solution eventually.


· The second requirement of a good control strategy is that it be systematic. Even though a random control strategy causes motion, it is likely to arrive at the same state several times during the search process and to use many more steps than necessary. A systematic control strategy provides global motion (over the course of several steps) as well as local motion (over the course of a single step).

Combinatorial explosion: the phenomenon in which the time required to find an optimal solution (for example an optimal schedule) grows exponentially with the size of the problem. In other words, when the time required to solve the problem grows far beyond any reasonable estimate as the problem gets larger, the phenomenon is known as combinatorial explosion.

Heuristic Function: is a function that maps from problem state descriptions to measures of desirability,

usually represented as numbers. Well-designed heuristic functions can play an important part in efficiently

guiding a search process toward a solution. The purpose of the heuristic function is to guide the search process

in the most profitable direction by suggesting which path to follow first when more than one is available. The

more accurately the heuristic function estimates the correct direction of each node in the search tree, the more

direct the solution process. In the extreme, the heuristic function would be so good that essentially no search

would be required, the system will move directly to a solution.
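As an illustration (a hypothetical example, not from the notes), two well-known heuristic functions for the 8-puzzle map a board state to a number estimating how far it is from the goal: the count of misplaced tiles, and the sum of Manhattan (city-block) distances of the tiles from their goal squares.

GOAL = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)        # 0 marks the blank square

def misplaced_tiles(state):
    """h1: number of tiles that are not in their goal position (blank not counted)."""
    return sum(1 for tile, goal in zip(state, GOAL) if tile != 0 and tile != goal)

def manhattan(state):
    """h2: total city-block distance of every tile from its goal square."""
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        j = GOAL.index(tile)
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total

start = (1, 2, 3, 4, 0, 6, 7, 5, 8)
print(misplaced_tiles(start), manhattan(start))   # 2 2: this state is two moves from the goal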

Problem Characteristics:

In order to choose the most appropriate method for a particular problem, it is necessary to analyze the problem along several key dimensions:

· Is the problem decomposable into a set of independent smaller or easier sub problems?

· Can solution steps be ignored or at least undone if they prove unwise?

· Is the problem’s universe predictable?

· Is a good solution to the problem obvious without comparison to all other possible solutions?

· Is the desired solution a state of the world or a path to a state?

· Is a large amount of knowledge absolutely required to solve the problem or is knowledge important only to

constrain the search?

· Can a computer that is simply given the problem return the solution or will the solution of the problem

require interaction between the computer and a person?

Forward chaining systems:

In a forward chaining system the facts in the system are represented in a working memory which is

continually updated. Rules in the system represent possible actions to take when specified conditions hold on

items in the working memory – they are sometimes called condition-action rules. The conditions are usually

patterns that must match items in the working memory, while the actions usually involve adding or deleting

items from the working memory.

The interpreter controls the application of the rules, given the working memory, thus controlling the system’s

activity. It is based on a cycle of activity sometimes known as a recognize-act cycle. The system first checks

to find all the rules whose conditions hold, given the current state of working memory. It then selects one and

performs the actions in the action part of the rule. The actions will result in a new working memory, and the

cycle begins again. This cycle will be repeated until either no rules fire, or some specified goal state is

satisfied.

Backward chaining systems:


So far we have looked at how rule-based systems can be used to draw new conclusions from existing data,

adding these conclusions to a working memory. This approach is most useful when you know all the initial

facts, but don’t have much idea what the conclusion might be.

If we do know what the conclusion might be, or have some specific hypothesis to test, forward chaining systems may be inefficient. We could keep forward chaining until no more rules apply or the hypothesis has been added to working memory, but in the process the system is likely to do a lot of irrelevant work, adding uninteresting conclusions to working memory.

Forward and backward chaining:

The restriction to just one positive literal may seem somewhat arbitrary and uninteresting, but it is actually very important for three reasons:

1. Every Horn clause can be written as an implication whose premise is a conjunction of positive literals and whose conclusion is a single positive literal.

2. Inference with Horn clauses can be done through the forward chaining and backward chaining algorithms, which we explain next. Both of these algorithms are very natural, in that the inference steps are obvious and easy to follow for humans.

3. Deciding entailment with Horn clauses can be done in time that is linear in the size of the knowledge base.

This last fact is a pleasant surprise. It means that logical inference is very cheap for many propositional knowledge

bases that are encountered in practice.

Forward chaining is an example of the general concept of data-driven reasoning-that is, reasoning in which the focus of

attention starts with the known data. It can be used within an agent to derive conclusions from incoming percepts, often

without a specific query in mind. For example, the wumpus agent might TELL its percepts to the knowledge base using

an incremental forward-chaining algorithm in which new facts can be added to the agenda to initiate new inferences. In

humans, a certain amount of data-driven reasoning occurs as new information arrives.

P ⇒ Q
L Ʌ M ⇒ P
L Ʌ B ⇒ M
A Ʌ B ⇒ L
A Ʌ P ⇒ L
A
B

Figure: A simple knowledge base of Horn clauses and its corresponding AND-OR graph.

The backward-chaining algorithm, as its name suggests, works backwards from the query. If the query q is known to be

true, then no work is needed. Otherwise, the algorithm finds those implications in the knowledge base that conclude q. If

all the premises of one of those implications can be proved true (by backward chaining), then q is true. When applied to

the query Q in Figure, it works back down the graph until it reaches a set of known facts that forms the basis for a proof.

Backward chaining is a form of goal-directed reasoning. It is useful for answering specific questions such as "What

shall I do now?" and "Where are my keys?" Often, the cost of backward chaining is much less than linear in the size of

the knowledge base, because the process touches only relevant facts. In general, an agent should share the work between

forward and backward reasoning, limiting forward reasoning to the generation of facts that are likely to be relevant to

queries that will be solved by backward chaining.
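A minimal propositional forward-chaining sketch over the Horn-clause knowledge base in the figure above (an illustrative version, without the agenda and counters of the full textbook algorithm):

def forward_chain(rules, facts, query):
    """Repeatedly fire any rule whose premises are all known, until the query is derived
    or no new facts can be added.  rules: list of (premises, conclusion)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
                if conclusion == query:
                    return True
    return query in known

# The knowledge base from the figure: P=>Q, L^M=>P, L^B=>M, A^B=>L, A^P=>L, and facts A, B.
rules = [(["P"], "Q"), (["L", "M"], "P"), (["L", "B"], "M"),
         (["A", "B"], "L"), (["A", "P"], "L")]
print(forward_chain(rules, ["A", "B"], "Q"))   # True: A,B derive L, then M, then P, then Q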


MyCIN Style probability and its application

In artificial intelligence, MYCIN was an early expert system designed to identify bacteria causing severe

infections, such as bacteremia and meningitis, and to recommend antibiotics, with the dosage adjusted for

patient's body weight — the name derived from the antibiotics themselves, as many antibiotics have the suffix

"-mycin". The Mycin system was also used for the diagnosis of blood clotting diseases.

MYCIN was developed over five or six years in the early 1970s at Stanford University in Lisp by Edward

Shortliffe. MYCIN was never actually used in practice but research indicated that it proposed an acceptable

therapy in about 69% of cases, which was better than the performance of infectious disease experts who were

judged using the same criteria. MYCIN operated using a fairly simple inference engine, and a knowledge base

of ~600 rules. It would query the physician running the program via a long series of simple yes/no or textual

questions. At the end, it provided a list of possible culprit bacteria ranked from high to low based on the

probability of each diagnosis, its confidence in each diagnosis' probability, the reasoning behind each

diagnosis, and its recommended course of drug treatment.

MYCIN was based on certainty factors rather than probabilities. These certainty factors CF are in the range [–

1,+1] where –1 means certainly false and +1 means certainly true. The system was based on rules of the form:

IF: the patient has signs and symptoms s1, s2, …, sn, and certain background conditions t1, t2, …, tm hold

THEN: conclude that the patient has disease di with certainty CF

The idea was to use production rules of this kind in an attempt to approximate the calculation of the

conditional probabilities p( di | s1 s2 … sn), and provide a scheme for accumulating evidence that

approximated the reasoning process of an expert.
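To make the certainty-factor idea concrete, the sketch below implements one common (EMYCIN-style) formulation: the CF of a conjunctive premise is the minimum of its members, a rule contributes rule-CF × max(0, premise-CF), and evidence from two rules for the same conclusion is combined as shown. This is an illustrative reconstruction, not code from MYCIN itself.

def premise_cf(cfs):
    """CF of a conjunction of conditions: the weakest link."""
    return min(cfs)

def apply_rule(rule_cf, evidence_cf):
    """Contribution of a rule whose premise holds with certainty evidence_cf."""
    return rule_cf * max(0.0, evidence_cf)

def combine(cf1, cf2):
    """Combine two certainty factors for the same hypothesis (EMYCIN-style)."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

# Two rules both suggest the same disease: one contributes 0.6, the other 0.4.
cf_a = apply_rule(0.8, premise_cf([0.9, 0.75]))   # 0.8 * 0.75 = 0.6
cf_b = apply_rule(0.4, 1.0)                       # 0.4
print(round(combine(cf_a, cf_b), 2))              # 0.76  (= 0.6 + 0.4 * (1 - 0.6))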

Practical use / Application:

MYCIN was never actually used in practice. This wasn't because of any weakness in its performance. As

mentioned, in tests it outperformed members of the Stanford medical school faculty. Some observers raised

ethical and legal issues related to the use of computers in medicine — if a program gives the wrong diagnosis

or recommends the wrong therapy, who should be held responsible? However, the greatest problem, and the

reason that MYCIN was not used in routine practice, was the state of technologies for system integration,

especially at the time it was developed. MYCIN was a stand-alone system that required a user to enter all

relevant information about a patient by typing in response to questions that MYCIN would pose.


Chapter 2: Intelligence

Introduction to intelligence:

Definition: Artificial Intelligence is concerned with the design of intelligence in an artificial device. The term was coined by McCarthy in 1956.

There are two ideas in the definition.

1. Intelligence

2. Artificial device

What is intelligence?

Is it that which characterizes humans? Or is there an absolute standard of judgment?

Accordingly there are two possibilities:

o A system with intelligence is expected to behave as intelligently as a human

o A system with intelligence is expected to behave in the best possible manner

Secondly what type of behavior are we talking about?

Are we looking at the thought process or reasoning ability of the system?

Or are we only interested in the final manifestations of the system in terms of its actions?

Given this scenario different interpretations have been used by different researchers as defining the scope and

view of Artificial Intelligence.

1. One view is that artificial intelligence is about designing systems that are as intelligent as humans.

This view involves trying to understand human thought and an effort to build machines that emulate the

human thought process. This view is the cognitive science approach to AI.

2. The second approach is best embodied by the concept of the Turing Test. Turing held that in future

computers could be programmed to acquire abilities rivaling human intelligence. As part of his argument Turing

put forward the idea of an 'imitation game', in which a human being and a computer would be interrogated

under conditions where the interrogator would not know which was which, the communication being entirely

by textual messages. Turing argued that if the interrogator could not distinguish them by questioning, then it

would be unreasonable not to call the computer intelligent. Turing's 'imitation game' is now usually called 'the

Turing test' for intelligence.

Common sense reasoning:

Common sense is the ability to analyze a situation based on its context, using millions of integrated pieces of common knowledge. The ability to use common sense knowledge depends on being able to do commonsense reasoning. Commonsense reasoning is a central part of intelligent behavior.

Example: everyone knows that if you drop a glass of water, the glass will break and the water will spill. However, this knowledge is not obtained from the formula for a falling body or the equations governing fluid flow.

Common sense knowledge: means what everyone knows. Example:

Every person is younger than the person’s mother.

People do not like being repeatedly interrupted.

If you hold a knife by its blade then the blade may cut you.

If you drop paper into a flame then the paper will burn

You start getting hungry again a few hours after eating a meal.

People generally sleep at night.

Common sense reasoning: ability to use common sense knowledge. Example:

If you have a problem, think of a past situation where you solved a similar problem.


If you fail at something, imagine how you might have done things differently.

If you observe an event, try to infer what prior event might have caused it.

A typical pipeline that applies commonsense reasoning to understanding text works as follows:

The template is a frame with slots and slot fillers.

The template is fed to a script classifier, which classifies what script is active in the template.

The template and the script are passed to a reasoning problem builder specific to the script, which

converts the template into a commonsense reasoning problem.

The problem and a commonsense knowledge base are passed to a commonsense reasoner. It infers and

fills in missing details to produce a model of the input text.

The model provides a deeper representation of the input than is provided by the template alone.

Agents:

An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that

environment through actuators

• Human agent: eyes, ears, and other organs for sensors; hands, legs, mouth, and other body parts for actuators

• Robotic agent: cameras and infrared range finders for sensors; various motors for actuators

Agents and environments:

• The agent function f maps percept sequences to actions; the agent program runs on the physical architecture to produce f

• agent = architecture + program


Rational agents:

An agent should strive to "do the right thing", based on what it can perceive and the actions it can perform.

The right action is the one that will cause the agent to be most successful

• Performance measure: An objective criterion for success of an agent's behavior

• E.g., performance measure of a vacuum-cleaner agent could be amount of dirt cleaned up, amount of time

taken, amount of electricity consumed, amount of noise generated, etc.

Rational Agent: For each possible percept sequence, a rational agent should select an action that is expected to

maximize its performance measure, given the evidence provided by the percept sequence and whatever built-

in knowledge the agent has. Rationality is distinct from omniscience (all knowing with infinite knowledge)

• Agents can perform actions in order to modify future percepts so as to obtain useful information

(information gathering, exploration)

• An agent is autonomous if its behavior is determined by its own experience (with ability to learn and adapt)

PEAS: Performance measure, Environment, Actuators, Sensors

• Must first specify the setting for intelligent agent design

• Consider, e.g., the task of designing an automated taxi driver:


– Performance measure: Safe, fast, legal, comfortable trip, maximize profits

– Environment: Roads, other traffic, pedestrians, customers

– Actuators: Steering wheel, accelerator, brake, signal, horn

– Sensors: Cameras, sonar, speedometer, GPS, odometer, engine sensors, keyboard

Agent: Medical diagnosis system

• Performance measure: Healthy patient, minimize costs, lawsuits

• Environment: Patient, hospital, staff

• Actuators: Screen display (questions, tests, diagnoses, treatments, referrals)

• Sensors: Keyboard (entry of symptoms, findings, patient's answers)

Agent: Part-picking robot

• Performance measure: Percentage of parts in correct bins

• Environment: Conveyor belt with parts, bins

• Actuators: Jointed arm and hand

• Sensors: Camera, joint angle sensors

Environment types:

Fully observable (vs. partially observable): An agent's sensors give it access to the complete state of the

environment at each point in time.

• Deterministic (vs. stochastic): The next state of the environment is completely determined by the current

state and the action executed by the agent. (If the environment is deterministic except for the actions of other

agents, then the environment is strategic)

• Episodic (vs. sequential): The agent's experience is divided into atomic "episodes" (each episode consists of

the agent perceiving and then performing a single action), and the choice of action in each episode depends

only on the episode itself.

Static (vs. dynamic): The environment is unchanged while an agent is deliberating. (The environment is semi-

dynamic if the environment itself does not change with the passage of time but the agent's performance score

does)

• Discrete (vs. continuous): A limited number of distinct, clearly defined percepts and actions.


• Single agent (vs. multi-agent): An agent operating by itself in an environment.

Agent types:

Four basic types in order of increasing generality:

• Simple reflex agents

• Model-based reflex agents

• Goal-based agents

• Utility-based agents

Simple reflex agents:

The visual input from a simple camera comes in at the rate of about 50 megabits per second, so a lookup table indexed by an hour of such input would need on the order of 2^(60 x 60 x 50M) entries. However, we can summarize certain portions of the table by noting certain commonly

occurring I/O associations. For example, if the car in front brakes, then the driver should also brake.

In other words, some processing is done on the visual input to establish the condition "Brake lights in front are on", and this triggers some established connection to the action "start braking". Such a connection is called a condition-action rule, written as:

If condition then action.

Fig. Simple reflex agent.
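A simple reflex agent can be sketched as a small table of condition-action rules applied to the current percept only; the percept fields and rules below are hypothetical placeholders.

def simple_reflex_driver(percept):
    """Condition-action rules over the current percept only; the agent keeps no internal state."""
    if percept.get("brake_lights_in_front"):
        return "start braking"
    if percept.get("obstacle_ahead"):
        return "swerve"
    return "keep driving"

print(simple_reflex_driver({"brake_lights_in_front": True}))   # start braking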

Model-based reflex agents:

Fig. model based reflex agent.

A simple reflex agent only works if the correct action can be chosen based only on the current percept. Even

for the simple braking rule above, we need some sort of internal description of the world state. To determine if


the car in front is braking, we would probably need to compare the current image with the previous to see if

the brake light has come on.

For example, from time to time the driver looks in the rear view mirror to check on the location of nearby

vehicles. When the driver is not looking in the mirror, vehicles in the next lane are invisible. However, in

order to decide on a lane change requires that the driver know the location of vehicles in the next lane.

Goal-based agents:

Knowing about the state of the world is not always enough for the agent to know what to do next. For

example, at an intersection, the taxi driver can either turn left, right, or go straight. Which turn it should make

depends on where it is trying to get to: its goal.

Goal information describes states that are desirable and that the agent should try to achieve. The agent can

combine goal information with information about what its actions achieve to plan sequences of actions that

achieve those goals. Search and planning are the sub fields of AI devoted to finding action sequences that

achieve goals. Decision making of this kind is fundamentally different from condition-action rules, in that it

involves consideration of the future. In the reflex agent design this information is not used because the

designer has pre-computed the correct action for the various cases. A goal-based agent could reason that if the

car in front has its brake lights on, it will slow down. From the way in which the world evolves, the only

action that will achieve the goal of not hitting the braking car is to slow down. To do so requires hitting the

brakes. The goal-based agent is more flexible but takes longer to decide what to do.

Fig. goal based agent.

Utility-based agents:

Fig. Utility-based agent


Goals alone are not enough to generate high-quality behavior. For example, there are many action sequences

that will get the taxi to its destination, but some are quicker, safer, more reliable, cheaper, etc.

Goals just provide a crude distinction between "happy" and "unhappy" states, whereas a more general performance measure should allow a comparison of different world states. The "happiness" of an agent is called utility. Utility can be represented as a function that maps states into real numbers: the larger the number, the higher the utility of the state.

A complete specification of the utility function allows rational decisions in two kinds of cases where goals

have trouble. First, when there are conflicting goals, only some of which can be achieved, the utility function

specifies the appropriate trade-off. Second, when there are several goals that the agent can aim for none of

which can be achieved with certainty, utility provides a way in which the likelihood of success can be

weighed up against the importance of the goals.

Learning agents:


Chapter 3 Knowledge Representation

Representation and mapping:

We have some characterisations of AI, so that when an AI problem arises, we will be able to put it into

context, find the correct techniques and apply them. We have introduced the agents’ language so that we can

talk about intelligent tasks and how to carry them out. We have also looked at search in the general case,

which is central to AI problem solving. Most pieces of software have to deal with data of some type, and in AI

we use the more grandiose title of "knowledge" to stand for data including

(i) facts: such as the temperature of a patient

(ii) procedures: such as how to treat a patient with a high temperature and

(iii) Meaning, such as why a patient with a high temperature should not be given a hot bath. Accessing

and utilizing all these kinds of information will be vital for an intelligent agent to act rationally.

For this reason, knowledge representation is our final general consideration before we look at

particular problem types.

To a large extent, the way in which you organize information available to and generated by your intelligent

agent will be dictated by the type of problem you are addressing. Often, the best ways of representing

knowledge for particular techniques are known. However, as with the problem of how to search, we need a lot

of flexibility in the way we represent information. Therefore, it is worth looking at four general schemes for

representing knowledge, namely logic, semantic networks, production rules and frames. Knowledge

representation continues to be a much-researched topic in AI because of the realization fairly early on that

how information is arranged can often make or break an AI application.

Knowledge Representation is a combination of data structures and interpretive procedures that will

lead to "knowledgeable" behavior. This definition is not entirely satisfactory from an anthropological

perspective due to the emphasis on behavior rather than the system of knowledge. But it captures the central

idea that data plus rules results in knowledge. In some sense the anthropologist working in the field is

attempting to acquire and analyze an alien representation of knowledge. The goals of AI and Anthropology

are not identical. Most anthropologists are not interested in writing programs that have knowledgeable

behavior for that purpose; they are interested in representing knowledge.

Fig. Mapping between Facts and Internal Representations (facts are mapped into internal representations by English-understanding programs, reasoning programs operate on the internal representations, and English-generation programs map the results back into facts expressed in English).


Logical Representations

If all human beings spoke the same language, there would be a lot less misunderstanding in the world. The

problem with software engineering in general is that there are often slips in communication which mean that

what we think we've told an agent and what we've actually told it are two different things. One way to reduce

this, of course, is to specify and agree upon some concrete rules for the language we use to represent

information. To define a language, we need to specify the syntax of the language and the semantics. To

specify the syntax of a language, we must say what symbols are allowed in the language and what are legal

constructions (sentences) using those symbols. To specify the semantics of a language, we must say how the

legal sentences are to be read, i.e., what they mean. If we choose a particular well defined language and stick

to it, we are using a logical representation.

Certain logics are very popular for the representation of information, and range in terms of their

expressiveness. More expressive logics allow us to translate more sentences from our natural language into

the language defined by the logic.

Some popular logics are:

Propositional Logic

First Order Predicate Logic

Higher Order Predicate Logic

Fuzzy Logic

Approach to knowledge representation:

A good system for the representation of knowledge in a particular domain should possess the following

properties:

1. Representational Adequacy- the ability to represent all of the kinds of knowledge that are needed in

that domain.

2. Inferential Adequacy- the ability to manipulate the representational structures in such a way as to

derive new structures corresponding to new knowledge inferred from old.

3. Inferential Efficiency- the ability to incorporate into the knowledge structure additional information

that can be used to focus the attention of the inference mechanisms in the most promising directions.

4. Acquisitional Efficiency- the ability to acquire new information easily. The simplest case involves

direct insertion of new knowledge into the database.

Multiple techniques for knowledge representation exist. Many programs rely on more than one technique.

Issues in knowledge representation:

1. Are any attributes of objects so basic that they occur in almost every problem domain? We need to

make sure that they are handled appropriately in each of the mechanisms we propose. If such attributes

exist, what are they?

2. Are there any important relationships that exist among attributes of objects?

3. At what level should knowledge be represented? Is there a good set of primitives into which all

knowledge can be broken down? Is it helpful to use such primitives?

4. How should sets of objects be represented?

5. Given a large amount of knowledge stored in a database, how can relevant parts be accessed when

they are needed?


Logical Agents:

In logical agent we design agents that can form representations of the world, use a process of inference to

derive new representations about the world, and use these new representations to deduce what to do.

Knowledge based agents:

The central component of a knowledge-based agent is its knowledge base, or KB. Informally, a knowledge

base is a set of sentences. (Here "sentence" is used as a technical term. It is related but is not identical to the

sentences of English and other natural languages.) Each sentence is expressed in a language called a

knowledge representation language and represents some assertion about the world.

There must be a way to add new sentences to the knowledge base and a way to query what is known. The

standard names for these tasks are TELL and ASK, respectively. Both tasks may involve inference-that is,

deriving new sentences from old. In logical agents, which are the main subject of study in this chapter,

inference must obey the fundamental requirement that when one ASKS a question of the knowledge base, the

answer should follow from what has been told (or rather, TELLed) to the knowledge base previously.

Figure: A generic knowledge-based agent.

Figure shows the outline of a knowledge-based agent program. Like all our agents, it takes a percept as input

and returns an action. The agent maintains a knowledge base, KB, which may initially contain some

background knowledge. Each time the agent program is called, it does three things. First, it TELLS the

knowledge base what it perceives. Second, it ASKS the knowledge base what action it should perform. In the

process of answering this query, extensive reasoning may be done about the current state of the world, about

the outcomes of possible action sequences, and so on. Third, the agent records its choice with TELL and

executes the action. The second TELL is necessary to let the knowledge base know that the hypothetical

action has actually been executed.

MAKE-PERCEPT-SENTENCE constructs a sentence asserting that the agent perceived the given percept at

the given time. MAKE-ACTION-QUERY constructs a sentence that asks what action should be done at the

current time. Finally, MAKE-ACTION-SENTENCE constructs a sentence asserting that the chosen action

was executed.
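The generic knowledge-based agent described above can be sketched as follows. The knowledge base interface and the sentence-building helpers are stand-ins (assumed names), since the notes do not fix a particular representation.

def make_percept_sentence(percept, t):
    return ("percept", percept, t)          # assert: the agent perceived `percept` at time t

def make_action_query(t):
    return ("action?", t)                   # ask: what action should be done at time t?

def make_action_sentence(action, t):
    return ("did", action, t)               # assert: the chosen action was executed at time t

class DummyKB:
    """Stand-in knowledge base: remembers sentences and always recommends the same action."""
    def __init__(self):
        self.sentences = []
    def tell(self, sentence):
        self.sentences.append(sentence)
    def ask(self, query):
        return "move-forward"

class KBAgent:
    def __init__(self, kb):
        self.kb = kb
        self.t = 0                           # a counter indicating time
    def __call__(self, percept):
        self.kb.tell(make_percept_sentence(percept, self.t))   # 1. TELL the KB what was perceived
        action = self.kb.ask(make_action_query(self.t))        # 2. ASK the KB for an action
        self.kb.tell(make_action_sentence(action, self.t))     # 3. record the chosen action
        self.t += 1
        return action

agent = KBAgent(DummyKB())
print(agent({"bump": False}))                # move-forward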

The knowledge-based agent is not an arbitrary program for calculating actions. It is amenable to a description

at the knowledge level, where we need specify only what the agent knows and what its goals are, in order to

fix its behavior. For example, an automated taxi might have the goal of delivering a passenger to Sitapaila and

might know that it is at Kalanki and that it can follow any link between the two locations if there exist

multiple links. This analysis is independent of how the taxi works at the implementation level.


Formal logic-connectives, syntax, semantics:

Syntax

– Rules for constructing legal sentences in the logic

– Which symbols we can use (English: letters, punctuation)

– How we are allowed to combine symbols

– Propositions, e.g. "it is wet"

– Connectives: and, or, not, implies, iff (equivalent)

– Brackets, T (true) and F (false)

Semantics

– How we interpret (read) sentences in the logic

– Assigns a meaning to each sentence

– Define how connectives affect truth

– "P and Q" is true if and only if P is true and Q is true

– Use truth tables to work out the truth of statements

Example: "All lecturers are seven foot tall"

– A valid sentence (syntax)

– And we can understand the meaning (semantics)

– This sentence happens to be false (there is a counterexample)

Syntax: well-formed formulas

Logical symbols: and, or, not, all, at least one, brackets,

Variables, equality (=), true, false

Predicate and function symbols (for example, Cat(x) for "x is a Cat")

Term: variables and function applications (for example, x, or mother(x))

Formula: any combination of terms, predicates and logical symbols (for example, "Cat(x) and Sleeps(x)")

Sentence: formulas without free variables (for example, "All x: Cat(x) and Sleeps(x)")

Knowledge bases consist of sentences. These sentences are expressed according to the syntax of the

representation language, which specifies all the sentences that are well formed. "X + Y = 4" is a well-formed sentence but "X4Y+=" is not. There are literally dozens of different syntaxes, some with lots of Greek letters

and exotic mathematical symbols, and some with rather visually appealing diagrams with arrows and bubbles.

Logic must also define the semantics of the language. Semantics has to do with the "meaning" of sentences.

In logic, the definition is more precise. The semantics of the language defines the truth of each sentence with

respect to each possible world. For example, the usual semantics adopted for arithmetic specifies that the

sentence "x + y = 4" is true in a world where x is 2 and y is 2, but false in a world where x is 1 and y is 1. In

standard logics, every sentence must be either true or false in each possible world - there is no "in between".

When we need to be precise, we will use the term model in place of "possible world." Now that we have a notion of truth, we are ready to talk about logical reasoning. This involves the relation of logical entailment between sentences - the idea that a sentence follows logically from another sentence. In mathematical notation, we write a |= p to mean that the sentence a entails the sentence p. The formal definition of entailment is that a |= p if and only if, in every model in which a is true, p is also true. Another way to say this is that if a is true, then p must also be true. Informally, the truth of p is "contained" in the truth of a. The relation of entailment is familiar from arithmetic: we can say that the sentence x + y = 4 entails the sentence 4 = x + y.

The property of completeness is also desirable: an inference algorithm is complete if it can derive any

sentence that is entailed. For many knowledge bases the set of consequences is infinite, and completeness becomes

an important issue. Fortunately, there are complete inference procedures for logics that are sufficiently

expressive to handle many knowledge bases. We have described a reasoning process whose conclusions are

guaranteed to be true in any world in which the premises are true; in particular, if KB is true in the real world,

then any sentence ―a‖ derived from KB by a sound inference procedure is also true in the real world.

Equivalence, validity, and satisfiability:

The first concept is logical equivalence: two sentences a and b are logically equivalent if they are true in the

same set of models. We write this as a ≡ b.

The second concept we will need is validity. A sentence is valid if it is true in all models. For example, the

sentence P V ¬P is valid. Valid sentences are also known as tautologies.

A sentence is satisfiable if it is true under at least one interpretation.

The sentence is called valid if it is true under all interpretations. Similarly the sentence is invalid if it is false

under some interpretation.

Example: "All x: Cat(x) and Sleeps(x)"

If this is interpreted on an island which only has one cat that always sleeps, this is satisfiable.

Since not all cats in all interpretations always sleep, the sentence is not valid.

The final concept we will need is satisfiability. A sentence is satisfiable if it is true in some model. For

example, the knowledge base given earlier, (R1 Ʌ R2 Ʌ R3 Ʌ R4 Ʌ R5), is satisfiable because there are three
models in which it is true. If a sentence a is true in a model m, then we say that m satisfies a, or that m is a
model of a. Satisfiability can be checked by enumerating the possible models until one is found that satisfies

the sentence. Determining the satisfiability of sentences in propositional logic was the first problem proved to

be NP-complete.

A |= B if and only if the sentence (A Ʌ ¬B) is unsatisfiable.
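
To make the model-enumeration idea concrete, here is a small illustrative sketch in Python (not part of the original notes; the representation of sentences as Boolean functions and the name entails are assumptions of this example). It decides A |= B by checking that A Ʌ ¬B has no model.

from itertools import product

# Illustrative sketch: decide A |= B by testing whether (A and not B) is unsatisfiable,
# enumerating every model (truth assignment) over the given symbols.
def entails(symbols, A, B):
    for values in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if A(model) and not B(model):   # found a model of A Ʌ ¬B, so A does not entail B
            return False
    return True                         # A Ʌ ¬B is unsatisfiable, so A |= B

# Example: (P Ʌ Q) |= (P V Q), but (P V Q) does not entail (P Ʌ Q).
A = lambda m: m["P"] and m["Q"]
B = lambda m: m["P"] or m["Q"]
print(entails(["P", "Q"], A, B))   # True
print(entails(["P", "Q"], B, A))   # False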

Examples of Tautology:

A tautology is a redundancy, a needless repetition of an idea.

For example: "Best of the best." "Worst of the worst of the worst." "Mother of the mother of the mother." "This is not a teacher, this is a professor." "This is not noise, this is music." "This is not music, this is noise."

a. The propositions α ∨ ¬α and ¬(α ∧ ¬α) are tautologies. Therefore, 1 = P(α ∨ ¬α) = P(α) + P(¬α). Rearranging gives the desired result.
b. The propositions α ↔ ((α ∧ β) ∨ (α ∧ ¬β)) and ¬((α ∧ β) ∧ (α ∧ ¬β)) are tautologies. Thus, P(α) = P((α ∧ β) ∨ (α ∧ ¬β)) = P(α ∧ β) + P(α ∧ ¬β).


Propositional logic (very simple logic):

The syntax of propositional logic defines the allowable sentences. The atomic sentences, the indivisible
syntactic elements, consist of a single proposition symbol. Each such symbol stands for a proposition that can

be true or false. We will use uppercase names for symbols: P, Q, R, and so on. The names are arbitrary but are

often chosen to have some mnemonic value to the reader. There are two proposition symbols with fixed

meanings: True is the always-true proposition and False is the always-false proposition.

Complex sentences are constructed from simpler sentences using logical connectives. There are five

connectives in common use:

¬ (not): A literal is either an atomic sentence (a positive literal) or a negated atomic sentence (a negative literal).
Ʌ (and): A sentence whose main connective is Ʌ is called a conjunction; its parts are the conjuncts.
V (or): A sentence whose main connective is V is called a disjunction. Historically, the V comes from the Latin "vel," which means "or."
⇒ (implies): A sentence such as P ⇒ Q is called an implication (or conditional). Its premise or antecedent is P, and its conclusion or consequent is Q. The implication symbol is sometimes written in other books as ⊃ or →.
⇔ (if and only if): A sentence such as P ⇔ Q is a biconditional.

Figure. A BNF (Backus-Naur Form) grammar of sentences in propositional logic:
Sentence → AtomicSentence | ComplexSentence
AtomicSentence → True | False | Symbol
Symbol → P | Q | R | . . .
ComplexSentence → ¬Sentence | (Sentence Ʌ Sentence) | (Sentence V Sentence) | (Sentence ⇒ Sentence) | (Sentence ⇔ Sentence)

One possible model is: m1 = {P1,2 = false, P2,2 = false, P3,1 = true}

The semantics for propositional logic must specify how to compute the truth value of any sentence, given a

model. This is done recursively. All sentences are constructed from atomic sentences and the five connectives;

therefore, we need to specify how to compute the truth of atomic sentences and how to compute the truth of

sentences formed with each of the five connectives.

Atomic sentences are easy:

True is true in every model and False is false in every model.

The truth value of every other proposition symbol must be specified directly in the model.

For example, in the model m1 given earlier, P1,2 is false.
For complex sentences, we have rules such as:
For any sentence s and any model m, the sentence ¬s is true in m if and only if s is false in m.

Fig. truth table for five logical connectives
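
As an illustration of this recursive evaluation (added here as a sketch, not taken from the notes), the following Python fragment encodes sentences as nested tuples and computes their truth value in a model; the tuple encoding and the function name pl_true are assumptions of this example.

# Illustrative sketch: recursive truth evaluation for propositional logic.
# A sentence is either a proposition symbol (a string) or a tuple
# ("not", s), ("and", s1, s2), ("or", s1, s2), ("implies", s1, s2), ("iff", s1, s2).
def pl_true(sentence, model):
    if isinstance(sentence, str):                 # atomic sentence: look it up in the model
        return model[sentence]
    op, *args = sentence
    vals = [pl_true(a, model) for a in args]      # evaluate subsentences recursively
    if op == "not":
        return not vals[0]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    if op == "implies":
        return (not vals[0]) or vals[1]
    if op == "iff":
        return vals[0] == vals[1]
    raise ValueError("unknown connective: " + op)

# Example: evaluate (not P12) and (P22 or P31) in the model m1 given earlier.
m1 = {"P12": False, "P22": False, "P31": True}
print(pl_true(("and", ("not", "P12"), ("or", "P22", "P31")), m1))   # True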


Semantic Networks:

Fig. Semantic Network

• The idea behind a semantic network is that knowledge is often best understood as a set of concepts that are related to

one another.

The meaning of a concept is defined by its relationship to other concepts.

• A semantic network consists of a set of nodes that are connected by labeled arcs. The nodes represent concepts and the

arcs represent relations between concepts.

Common Semantic Relations:

There is no standard set of relations for semantic networks, but the following relations are very common:

INSTANCE: X is an INSTANCE of Y if X is a specific example of the general concept Y.

Example: Elvis is an INSTANCE of Human

ISA: X ISA Y if X is a subset of the more general concept Y.

Example: sparrow ISA bird

HASPART: X HASPART Y if the concept Y is a part of the concept X. (Or this can be any other property)

Example: sparrow HASPART tail
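
A semantic network of this kind can be stored very simply as a set of (node, relation, node) triples. The sketch below is illustrative only; the node names repeat the examples above, and the helper name related is an assumption of this example.

# Illustrative sketch: a semantic network as (node, relation, node) triples.
semantic_net = [
    ("Elvis",   "INSTANCE", "Human"),
    ("sparrow", "ISA",      "bird"),
    ("bird",    "ISA",      "animal"),
    ("sparrow", "HASPART",  "tail"),
]

def related(net, subject, relation):
    # Return every node linked to `subject` by an arc labelled `relation`.
    return [obj for s, r, obj in net if s == subject and r == relation]

print(related(semantic_net, "sparrow", "ISA"))       # ['bird']
print(related(semantic_net, "sparrow", "HASPART"))   # ['tail']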

Semantic Tree:

A semantic tree is a semantic net in which certain links are designated as branches. Each branch
connects two nodes; the head node is called the parent node and the tail node is called the child node.

· One node has no parent; it is called the root node. Other nodes have exactly one parent.

· Some nodes have no children; they are called leaf nodes.

· When two nodes are connected to each other by a chain of two or more branches, one is said to be the ancestor; the

other is said to be the descendant.

Inheritance:

• Inheritance is a key concept in semantic networks and can be represented naturally by following ISA links.

• In general, if concept X has property P, then all concepts that are a subset of X should also have property P.

• But exceptions are pervasive in the real world!


• In practice, inherited properties are usually treated as default values. If a node has a direct link that contradicts an

inherited property, then the default is overridden.
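
The default-override behaviour just described can be sketched as a lookup that climbs ISA links until a locally stored value is found. The concepts and properties below (penguins as non-flying birds) are only an illustrative assumption, not part of the notes.

# Illustrative sketch: default inheritance over ISA links, with local overrides.
isa = {"penguin": "bird", "sparrow": "bird", "bird": "animal"}
properties = {
    "bird":    {"can_fly": True, "has_feathers": True},
    "penguin": {"can_fly": False},        # exception: overrides the inherited default
}

def get_property(concept, prop):
    while concept is not None:
        if prop in properties.get(concept, {}):
            return properties[concept][prop]   # found at this node (nearest wins)
        concept = isa.get(concept)             # otherwise climb one ISA link
    return None                                # property unknown anywhere in the hierarchy

print(get_property("sparrow", "can_fly"))   # True  (inherited from bird)
print(get_property("penguin", "can_fly"))   # False (local value overrides the default)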

Multiple Inheritances:

• Multiple inheritance allows an object to inherit properties from multiple concepts.

• Multiple inheritance can sometimes allow an object to inherit conflicting properties.

• Conflicts are potentially unavoidable, so conflict resolution strategies are needed.

Predicate calculus (Predicate logic):

In mathematical logic, predicate logic is the generic term for symbolic formal systems like first-order logic,

second-order logic, many-sorted logic, or infinitary logic. This formal system is distinguished from other

systems in that its formulae contain variables which can be quantified. Two common quantifiers are the

existential ∃ ("there exists") and universal ∀ ("for all") quantifiers. The variables could be elements in the

universe under discussion, or perhaps relations or functions over that universe. For instance, an existential

quantifier over a function symbol would be interpreted as modifier "there is a function". In informal usage, the

term "predicate logic" occasionally refers to first-order logic.

Predicate calculus symbols may represent either variables, constants, functions or predicates.

Constants name specific objects or properties in the domain of discourse. Thus George, tree, tall and blue are

examples of well-formed constant symbols. The constants (true) and (false) are sometimes included.

Functions denote a mapping of one or more elements in a set (called the domain of the function) into a unique

element of another set (the range of the function). Elements of the domain and range are objects in the world

of discourse. Every function symbol has an associated arity, indicating the number of elements in the domain

mapped onto each element of range.

A function expression is a function symbol followed by its arguments. The arguments are elements from the

domain of the function; the number of arguments is equal to the arity of the function. The arguments are

enclosed in parentheses and separated by commas. e.g.:

f(X,Y)

father(david)

price(apple)
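
One simple way to hold such function expressions in a program is as nested tuples whose first element is the function symbol. This is only an illustrative encoding, and the helper arity is an assumption of the example.

# Illustrative sketch: function expressions as tuples (functor, arg1, ..., argN);
# constants are plain strings, so their arity is 0.
def arity(term):
    return len(term) - 1 if isinstance(term, tuple) else 0

f_xy            = ("f", "X", "Y")        # f(X, Y)
father_of_david = ("father", "david")    # father(david)
price_of_apple  = ("price", "apple")     # price(apple)

print(arity(f_xy), arity(father_of_david), arity("george"))   # 2 1 0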

First-order logic / First-order predicate logic (FOPL):

First-order logic is sufficiently expressive to represent a good deal of our commonsense knowledge.

It also either subsumes or forms the foundation of many other representation languages and has been studied

intensively for many decades. A procedural approach, such as that taken by ordinary programming languages, can be contrasted with the declarative nature of propositional logic, in which knowledge and inference are separate, and inference is entirely domain-independent.

Propositional logic is a declarative language because its semantics is based on a truth relation between

sentences and possible worlds. It also has sufficient expressive power to deal with partial information, using

disjunction and negation. Propositional logic has a third property that is desirable in representation languages,

namely compositionality.


Syntax and Semantics of First-Order Logic:

The first-order sentence ∀a (Phil(a) ⇒ Schol(a)) asserts that no matter what a represents, if a is a philosopher then a is a scholar. Here the universal quantifier ∀ expresses the idea that the claim in parentheses holds for all choices of a. To show that the claim "If a is a philosopher then a is a scholar" is false, one would show there is some philosopher who is not a scholar. This counterclaim can be expressed with the existential quantifier: ∃a (Phil(a) Ʌ ¬Schol(a)).
Here:
¬ is the negation operator: ¬Schol(a) is true if and only if Schol(a) is false, in other words if and only if a is not a scholar.
Ʌ is the conjunction operator: Phil(a) Ʌ ¬Schol(a) asserts that a is a philosopher and also not a scholar.

The predicates Phil(a) and Schol(a) take only one parameter each. First-order logic can also express predicates

with more than one parameter. For example, "there is someone who can be fooled every time" can be

expressed as:

∃x (Person(x) Ʌ ∀y (Time(y) ⇒ Canfool(x, y)))

Here Person(x) is interpreted to mean x is a person, Time(y) to mean that y is a moment of time, and

Canfool(x,y) to mean that (person) x can be fooled at (time) y. For clarity, this statement asserts that there is at

least one person who can be fooled at all times, which is stronger than asserting that at all times at least one

person exists who can be fooled. This would be expressed as:

∀y (Time(y) ⇒ ∃x (Person(x) Ʌ Canfool(x, y)))

Interpretation: The meaning of a term or formula is a set of elements. The meaning of a sentence is a truth value. The function that

maps a formula into a set of elements is called an interpretation. An interpretation maps an intensional description
(formula/sentence) into an extensional description (set or truth value).

Term: A term is a logical expression that refers to an object. Constant symbols are therefore terms, but it is not always

convenient to have a distinct symbol to name every object. For example, in English we might use the expression "John's

left leg" rather than giving a name to his leg. This is what function symbols are for: instead of using a constant symbol,

we use LeftLeg(John).


Atomic sentences:
Now that we have both terms for referring to objects and predicate symbols for referring to relations, we can put them together to make atomic sentences that state facts. An atomic sentence is formed from a predicate symbol followed by a parenthesized list of terms: Brother(Richard, John).
Married(Father(Richard), Mother(John)) states that Richard's father is married to John's mother.

Complex sentences:
We can use logical connectives to construct more complex sentences, just as in propositional calculus. The semantics of sentences formed with logical connectives is identical to that in the propositional case. Here are four examples:
¬Brother(LeftLeg(Richard), John)

Brother(Richard, John) Ʌ Brother( John, Richard)

King (Richard) V King (John)

¬King (Richard) => King (John).

Quantifiers:
Once we have a logic that allows objects, it is only natural to want to express properties of entire collections of objects,

instead of enumerating the objects by name. Quantifiers let us do this. First-order logic contains two standard

quantifiers, called universal and existential.

Universal quantification (∀):
"All kings are persons" is written in first-order logic as ∀x King(x) ⇒ Person(x).
∀ is usually pronounced "For all . . .". (Remember that the upside-down A stands for "all.") Thus, the sentence says, "For all x, if x is a king, then x is a person." The symbol x is called a variable. By convention, variables are lowercase letters. A variable is a term all by itself, and as such can also serve as the argument of a function, for example LeftLeg(x). A term with no variables is called a ground term.

Intuitively, the sentence ∀x P, where P is any logical expression, says that P is true for every object x.
Existential quantification (∃):
Universal quantification makes statements about every object. Similarly, we can make a statement about some object in the universe without naming it, by using an existential quantifier. To say that King John has a crown on his head, we write ∃x Crown(x) Ʌ OnHead(x, John). ∃x is pronounced "There exists an x such that . . ." or "For some x . . .". Intuitively, the sentence ∃x P says that P is true for at least one object x. More precisely, ∃x P is true in a given model under a given interpretation if P is true in at least one extended interpretation that assigns x to a domain element.

Nested quantifiers:
We will often want to express more complex sentences using multiple quantifiers. The simplest case is where the

quantifiers are of the same type. For example, "Everybody loves somebody" means that for every person, there is

someone that person loves:

∀x ∃y Loves(x, y)

On the other hand, to say "There is someone who is loved by everyone," we write:

∃y ∀x Loves(x, y)

Connections between ∀ and ∃:
The two quantifiers are actually intimately connected with each other, through negation. Asserting that everyone

dislikes parsnips is the same as asserting there does not exist someone who likes them, and vice versa. We can go one

step further: "Everyone likes ice cream" means that there is no one who does not like ice cream:

∀x Likes(x, IceCream) ⇔ ¬∃x ¬Likes(x, IceCream)
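
Over a finite domain this duality can be checked directly, as in the illustrative sketch below; the domain and the predicate likes_ice_cream are arbitrary assumptions of the example.

# Illustrative check of the duality "for all x, P(x)" ≡ "not (exists x, not P(x))"
# on a small finite domain.
domain = range(10)
likes_ice_cream = lambda x: x != 7        # an arbitrary example predicate

forall_form = all(likes_ice_cream(x) for x in domain)
dual_form   = not any(not likes_ice_cream(x) for x in domain)
print(forall_form == dual_form)           # True: the two forms always agree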


The De Morgan rules for quantified and unquantified sentences are as follows:
∀x ¬P ≡ ¬∃x P          ¬(P Ʌ Q) ≡ ¬P V ¬Q
¬∀x P ≡ ∃x ¬P          ¬(P V Q) ≡ ¬P Ʌ ¬Q
∀x P ≡ ¬∃x ¬P          (P Ʌ Q) ≡ ¬(¬P V ¬Q)
∃x P ≡ ¬∀x ¬P          (P V Q) ≡ ¬(¬P Ʌ ¬Q)

Equality:
First-order logic includes one more way to make atomic sentences, other than using a predicate and terms as described

earlier. We can use the equality symbol to make statements to the effect that two terms refer to the same object. For

example,

Father (John) = Henry

Says that the object referred to by Father (John) and the object referred to by Henry are the same. Because an

interpretation fixes the referent of any term, determining the truth of an equality sentence is simply a matter of seeing

that the referents of the two terms are the same object.

The equality symbol can be used to state facts about a given function, as we just did for the Father symbol. It can also

be used with negation to insist that two terms are not the same object. To say that Richard has at least two brothers, we

would write

∃x, y Brother(x, Richard) Ʌ Brother(y, Richard) Ʌ ¬(x = y). The sentence ∃x, y Brother(x, Richard) Ʌ Brother(y, Richard) does not have the same meaning.

The kinship domain:
The domain of family relationships is called kinship. This domain includes facts such as "Sita is the mother of Kush" and "Kush is the father of Hari" and rules such as "One's grandmother is the mother of one's parent." Clearly, the objects in this domain are people. We will have two unary predicates, Male and Female. Kinship relations (parenthood, brotherhood, marriage, and so on) will be represented by binary predicates: Parent, Sibling, Brother, Sister, Child, Daughter, Son, Spouse, Wife, Husband, Grandparent, Grandchild, Cousin, Aunt, and Uncle. We will use functions for Mother and Father, because every person has exactly one of each of these. We can go through each function and predicate, writing down what we know in terms of the other symbols.
One's mother is one's female parent:
∀m, c Mother(c) = m ⇔ Female(m) Ʌ Parent(m, c)
One's husband is one's male spouse:
∀w, h Husband(h, w) ⇔ Male(h) Ʌ Spouse(h, w)
Male and female are disjoint categories:
∀x Male(x) ⇔ ¬Female(x)
Parent and child are inverse relations:
∀p, c Parent(p, c) ⇔ Child(c, p)
A grandparent is a parent of one's parent:
∀g, c Grandparent(g, c) ⇔ ∃p Parent(g, p) Ʌ Parent(p, c)
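
Over a small finite set of people, definitions like the grandparent axiom can be evaluated directly. The sketch below is illustrative; the three-person family and the encoding of Parent as a set of pairs are assumptions of the example.

# Illustrative sketch: checking the Grandparent definition over a tiny family,
# with Parent stored as a set of (parent, child) pairs.
people = {"Sita", "Kush", "Hari"}
parent = {("Sita", "Kush"), ("Kush", "Hari")}

def grandparent(g, c):
    # Grandparent(g, c) holds iff there exists p with Parent(g, p) and Parent(p, c).
    return any((g, p) in parent and (p, c) in parent for p in people)

print(grandparent("Sita", "Hari"))   # True
print(grandparent("Kush", "Sita"))   # False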

Diagnostic rules:
Diagnostic rules lead from observed effects to hidden causes. For finding pits, the obvious diagnostic rules say that if a square is breezy, some adjacent square must contain a pit:
∀s Breezy(s) ⇒ ∃r Adjacent(r, s) Ʌ Pit(r),
and that if a square is not breezy, no adjacent square contains a pit:
∀s ¬Breezy(s) ⇒ ¬∃r Adjacent(r, s) Ʌ Pit(r).
Combining these two, we obtain the bi-conditional sentence:
∀s Breezy(s) ⇔ ∃r Adjacent(r, s) Ʌ Pit(r).


Causal rules:

Causal rules reflect the assumed direction of causality in the world: some hidden property of the world causes certain

percepts to be generated. For example, a pit causes all adjacent squares to be breezy:

∀r Pit(r) ⇒ [∀s Adjacent(r, s) ⇒ Breezy(s)]
And if all squares adjacent to a given square are pitless, the square will not be breezy:
∀s [∀r Adjacent(r, s) ⇒ ¬Pit(r)] ⇒ ¬Breezy(s)
With some work, it is possible to show that these two sentences together are logically equivalent to the biconditional

sentence. The bi-conditional itself can also be thought of as causal, because it states how the truth value of Breezy is

generated from the world state.

Horn clauses:

In computational logic, a Horn clause is a clause with at most one positive literal. Horn clauses are named

after the logician Alfred Horn, who investigated the mathematical properties of similar sentences in the non-

clausal form of first-order logic. Horn clauses play a basic role in logic programming and are important

for constructive logic.

A Horn clause with exactly one positive literal is a definite clause. A definite clause with no negative literals

is also called a fact. The following is a propositional example of a definite Horn clause:
¬p V ¬q V ¬t V u
Such a formula can also be written equivalently in the form of an implication:
(p Ʌ q Ʌ t) ⇒ u

In the non-propositional case, all variables in a clause are implicitly universally quantified with scope the

entire clause. Thus, for example: ¬Human(x) V Mortal(x)
Stands for: ∀x (Human(x) ⇒ Mortal(x))
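
Because every definite Horn clause is an implication with a single conclusion, a knowledge base of such clauses supports very simple forward chaining: keep firing any rule whose body is already known until nothing new can be derived. The sketch below is illustrative only; the clause encoding and the example symbols are assumptions, not part of the notes.

# Illustrative sketch: forward chaining over propositional definite Horn clauses.
# Each clause is (body, head); a fact is a clause with an empty body.
clauses = [
    ((), "p"),                 # fact: p
    ((), "q"),                 # fact: q
    (("p", "q"), "r"),         # p and q => r
    (("r",), "s"),             # r => s
]

def forward_chain(clauses):
    known = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if head not in known and all(b in known for b in body):
                known.add(head)            # fire the rule and record its head
                changed = True
    return known

print(forward_chain(clauses))              # {'p', 'q', 'r', 's'}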

Frames:

• A frame represents an entity as a set of slots (attributes) and associated values.

• Each slot may have constraints that describe legal values that the slot can take.

• A frame can represent a specific entity, or a general concept.

• Frames are implicitly associated with one another because the value of a slot can be another frame.
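
A frame system can be prototyped as a dictionary of named frames, each holding its slots; slot values that name other frames give the implicit links mentioned above. The hotel-room frames below are an illustrative assumption, not part of the notes.

# Illustrative sketch: frames as dictionaries of slots; a slot value may name
# another frame, which links frames to one another.
frames = {
    "Hotel-Room":  {"isa": "Room",  "contains": ["Hotel-Bed", "Hotel-Phone"]},
    "Hotel-Bed":   {"isa": "Bed",   "size": "king", "part-of": "Hotel-Room"},
    "Hotel-Phone": {"isa": "Phone", "billing": "through-room"},
}

def slot(frame_name, slot_name):
    return frames.get(frame_name, {}).get(slot_name)

print(slot("Hotel-Bed", "size"))        # 'king'
print(slot("Hotel-Room", "contains"))   # ['Hotel-Bed', 'Hotel-Phone']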


Demons:

• One of the main advantages of frames is the ability to include demons to compute slot values. A demon is a

function that computes the value of a slot on demand.
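
One simple way to sketch a demon is to store a callable in the slot and run it when the slot is read; the person frame and the fixed current_year below are assumptions of this illustrative example.

# Illustrative sketch: a demon is a slot value that is computed on demand.
person = {
    "name": "Elvis",
    "birth_year": 1935,
    # demon: compute age from the rest of the frame (current_year is an assumed constant)
    "age": lambda frame, current_year=2024: current_year - frame["birth_year"],
}

def get_slot(frame, slot_name):
    value = frame[slot_name]
    return value(frame) if callable(value) else value   # run the demon if there is one

print(get_slot(person, "age"))    # 89 (computed, not stored)
print(get_slot(person, "name"))   # 'Elvis'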

Features of Frame Representations:

• Frames can support values more naturally than semantic nets (e.g. the value 25)

• Frames can be easily implemented using object-oriented programming techniques.

• Demons allow for arbitrary functions to be embedded in a representation.

• But a price is paid in terms of efficiency, generality, and modularity!

• Inheritance can be easily controlled.

Comparative Issues in Knowledge Representation:

• The semantics behind a knowledge representation model depends on the way that it is used (implemented).

Notation is irrelevant!

Whether a statement is written in logic or as a semantic network is not important -- what matters is whether the

knowledge is used in the same manner.

• Most knowledge representation models can be made to be functionally equivalent. It is a useful exercise to try

converting knowledge in one form to another form.

• From a practical perspective, the most important consideration usually is whether the KR model allows the

knowledge to be encoded and manipulated in a natural fashion.

Expressiveness of Semantic Nets:

• Some types of properties are not easily expressed using a semantic network. For example: negation, disjunction,

and general non-taxonomic knowledge.

• There are specialized ways of dealing with these relationships, for example partitioned semantic networks and

procedural attachment. But these approaches are ugly and not commonly used.

• Negation can be handled by having complementary predicates (e.g., A and NOT A) and using specialized

procedures to check for them. Also very ugly, but easy to do.

• If the lack of expressiveness is acceptable, semantic nets have several advantages: inheritance is natural and

modular, and semantic nets can be quite efficient.

Conceptual dependencies and scripts:

It is a theory of how to represent the kind of knowledge about events that is usually contained in natural

language sentences. The objective is to represent the knowledge in a way that:

Facilitates drawing inferences from the sentences.

Is independent of the language in which the sentences were originally stated.

The conceptual dependency representation of a sentence is built not out of primitives corresponding to the

words used in the sentence, but rather out of conceptual primitives that can


be combined to form the meanings of words in any particular language. Hence, conceptual dependency has been implemented in a variety of programs that read and understand natural language text. Semantic nets provide only a structure into which nodes representing information at any level can be placed, whereas conceptual dependency provides both a structure and a specific set of primitives out of which representations of particular pieces of information can be constructed.

For example, the event represented by the sentence "I gave the man a book" is represented in CD as follows:

I <=>(p) ATRANS --o--> book --R--> (to: the man, from: I)

Where the symbols have the following meanings:
arrows indicate the direction of dependency
the double arrow indicates a two-way link between the actor and the action
p indicates the past tense
ATRANS is one of the primitive acts used by the theory; it indicates transfer of possession
o indicates the object case relation
R indicates the recipient case relation

In CD, representations of actions are built from a set of primitive acts. Some of the typical sets of primitive

actions are as follows:

ATRANS transfer of an abstract relationship (example: give)

PTRANS transfer of the physical location of an object (example: go)

MTRANS transfer of mental information (example: tell)

PROPEL application of physical force to an object (example: push)

MOVE movement of a body part by its owner (example: kick)

MBUILD building new information out of old (example: decide)

A second set of CD building blocks is the set of allowable dependencies among the conceptualizations

described in a sentence. There are four primitive conceptual categories from which dependency structures can

be built. They are:

ACTs action

PPs objects (picture procedures)

AAs modifiers of actions (action aiders)

PAs modifiers of PPs (picture aiders)
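
For a program, a CD structure such as the ATRANS example above can be held as ordinary structured data. The sketch below is illustrative only; the field names (actor, act, tense, and so on) are my own and are not part of CD notation.

# Illustrative sketch: the CD structure for "I gave the man a book" as a dictionary.
cd_event = {
    "actor": "I",
    "act": "ATRANS",                               # primitive act: transfer of possession
    "tense": "p",                                  # past tense
    "object": "book",                              # o: object case relation
    "recipient": {"to": "the man", "from": "I"},   # R: recipient case relation
}

print(cd_event["act"], "of", cd_event["object"], "to", cd_event["recipient"]["to"])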

Scripts

A script is a structure that prescribes a set of circumstances which could be expected to follow on from one

another. It is similar to a thought sequence or a chain of situations which could be anticipated. It could be

considered to consist of a number of slots or frames but with more specialized roles.

Scripts are beneficial because:

Events tend to occur in known runs or patterns.

Causal relationships between events exist.



Entry conditions exist which allow an event to take place

Prerequisites exist for events taking place, e.g. when a student progresses through a degree scheme or when a purchaser buys a house.

The components of a script include:

Entry Conditions: these must be satisfied before events in the script can occur.
Results: conditions that will be true after events in the script occur.
Props: slots representing objects involved in events.
Roles: persons involved in the events.
Track: variations on the script; different tracks may share components of the same script.
Scenes: the sequence of events that occur. Events are represented in conceptual dependency form.
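
A script can likewise be prototyped as structured data. The restaurant script below is a common textbook illustration, but the particular slot contents here are simplified assumptions of this sketch.

# Illustrative sketch: a simplified restaurant script with the components listed above.
restaurant_script = {
    "entry_conditions": ["customer is hungry", "customer has money"],
    "results": ["customer is not hungry", "customer has less money"],
    "props": ["table", "menu", "food", "bill"],
    "roles": ["customer", "waiter", "cook", "cashier"],
    "track": "sit-down restaurant",
    "scenes": ["entering", "ordering", "eating", "paying", "leaving"],
}

# Scripts support prediction: if a story mentions ordering and paying,
# the scenes in between (here, eating) can be inferred even if unstated.
order_idx = restaurant_script["scenes"].index("ordering")
pay_idx = restaurant_script["scenes"].index("paying")
print(restaurant_script["scenes"][order_idx + 1 : pay_idx])   # ['eating']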

Advantages of Scripts:

Ability to predict events.

A single coherent interpretation may be built up from a collection of observations.

Disadvantages:

Less general than frames.

May not be suitable to represent all kinds of knowledge.