The Integration of Processing Components and Knowledge

Page 1: The Integration of Processing Components and Knowledge

The Integration of Processing Components and Knowledge

Page 2: The Integration of Processing Components and Knowledge

Introduction

• So far
  – Presented methods of achieving goals

• Integration of methods?
  – Controlling execution
  – Incorporating knowledge

Page 3: The Integration of Processing Components and Knowledge

What knowledge?

• Knowledge representation is one of the major research areas of cognitive science and psychology.

• The assumption in AI is that there is a separate module which represents the information that the program has about the world.

• The Concise Oxford Dictionary defines knowledge as:

  "awareness or familiarity gained by experience (of a person, fact, or thing); theoretical or practical understanding of a subject, language, etc."

Page 4: The Integration of Processing Components and Knowledge

What is Knowledge?

• Facts alone do not constitute knowledge. For information to become knowledge, it must incorporate the relationships between ideas. And for the knowledge to be useful, the links describing how concepts interact must be easily accessed, updated, and manipulated. [Kurzweil 1990:284]

Page 5: The Integration of Processing Components and Knowledge

Domain Knowledge

• Laws of syntactic (literal) equality and similarity define the relation between image pixels or image features regardless of their physical or perceptual causes.

• Laws describing the human perception of equality and similarity.

• Physical laws describing equality and difference of images under differences in sensing and object surface properties. The physics of illumination, surface reflection, and image formation have a general effect on images.

• Geometric and topological rules describe equality and differences of patterns in space.

• Category-based rules encode the characteristics common to class z of the space of all notions Z.

• Finally, man-made customs or man-related patterns introduce rules of culture-based equality and difference.

Page 6: The Integration of Processing Components and Knowledge

Knowledge representation

• Implied

• Feature vectors

• Relational structures

• Hierarchical structures

• Rules

• Frames

Page 7: The Integration of Processing Components and Knowledge

Implied knowledge

• Knowledge encoded in software

• Usually inflexible in
  – Execution
  – Reuse

• Simple to design and implement

• Systems often unreliable

Page 8: The Integration of Processing Components and Knowledge

Feature vectors

• As seen in statistical representations

• Vector elements can be
  – Numerical
  – Symbolic, coded numerically

Page 9: The Integration of Processing Components and Knowledge

Example:

[Diagram: feature vectors for the characters A and N, each described by the features strokes, loops and width-height (w-h) ratio, e.g. strokes = 3, loops = 1, w-h ratio = 1]
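A minimal sketch of how such a feature vector might be held and compared in code; the feature names follow the slide, and the values for 'N' are invented for illustration:

```python
import numpy as np

# Illustrative feature vectors for characters, using the features from the
# slide: number of strokes, number of loops, width-height ratio.
feature_names = ["strokes", "loops", "w_h_ratio"]
letter_a = np.array([3.0, 1.0, 1.0])   # symbolic class 'A', coded numerically
letter_n = np.array([3.0, 0.0, 1.0])   # hypothetical values for 'N'

def distance(u: np.ndarray, v: np.ndarray) -> float:
    """Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(u - v))

print(distance(letter_a, letter_n))
```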

Page 10: The Integration of Processing Components and Knowledge

Relational structures

• Encodes relationships between
  – Objects
  – Parts of objects

• Can become unwieldy for
  – Large scenes
  – Complex objects

Page 11: The Integration of Processing Components and Knowledge

Relational structures

[Diagram: relational structure for a blocks-world scene — blocks connected by the relations supporting, adjacent and above]
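One hedged sketch of such a structure in code, here simply a set of (object, relation, object) triples; the block names are invented for illustration:

```python
# Relational structure held as (subject, relation, object) triples.
# Block names are purely illustrative.
relations = {
    ("block1", "supporting", "block2"),
    ("block2", "above", "block1"),
    ("block2", "adjacent", "block3"),
}

# Query: which objects does block1 support?
supported = {o for (s, r, o) in relations if s == "block1" and r == "supporting"}
print(supported)  # {'block2'}
```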

Page 12: The Integration of Processing Components and Knowledge

Semantic Networks

• Semantic networks are a graphical representation of knowledge showing relationships between objects.

• It is suitable for declarative knowledge and is an excellent visual way to represent knowledge.
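A small sketch of a semantic network as labelled edges, with an "is-a" link used for inheritance; the node and relation names are illustrative only:

```python
# Semantic network as a list of labelled edges (node, relation, node).
edges = [
    ("canary", "is-a", "bird"),
    ("bird", "is-a", "animal"),
    ("bird", "can", "fly"),
    ("canary", "colour", "yellow"),
]

def inherits(node, relation, value):
    """Follow is-a links upwards to see whether a property holds for a node."""
    if (node, relation, value) in edges:
        return True
    parents = [o for (s, r, o) in edges if s == node and r == "is-a"]
    return any(inherits(p, relation, value) for p in parents)

print(inherits("canary", "can", "fly"))  # True, inherited from 'bird'
```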

Page 13: The Integration of Processing Components and Knowledge

Advantage of Semantic Networks

• Flexible

• Economical

• Function similarly to human information storage

Page 14: The Integration of Processing Components and Knowledge

Disadvantage of Semantic Networks

• No standard exists.

• Problems with exceptions in inheritance.

• Facts placed inappropriately cause problems.

• Representing procedural knowledge is a significant problem.

Page 15: The Integration of Processing Components and Knowledge

Hierarchical structures

• Follow the natural division of a scene:

  scene → objects → parts of objects

Page 16: The Integration of Processing Components and Knowledge

Example:

[Diagram: example hierarchy — scene → roadway, building, grassland; roadway → road, junction; grassland → grass, tree; all ultimately derived from edges]
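A minimal sketch of such a hierarchy as nested dictionaries, mirroring the example above (the lowest level, edges, is omitted for brevity):

```python
# Scene hierarchy: scene -> objects -> parts of objects.
scene = {
    "roadway":   {"road": {}, "junction": {}},
    "building":  {},
    "grassland": {"grass": {}, "tree": {}},
}

def walk(node, depth=0):
    """Print the hierarchy; the same structure can guide top-down processing."""
    for name, children in node.items():
        print("  " * depth + name)
        walk(children, depth + 1)

walk(scene)
```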

Page 17: The Integration of Processing Components and Knowledge

Uses

• Structure defines possible appearance of objects

• Structure guides processing

Page 18: The Integration of Processing Components and Knowledge

Rule-Based System

[Diagram: rule-based system — an inference engine connecting the database and the rulebase]

Page 19: The Integration of Processing Components and Knowledge

Rules

• Rules encode quanta of knowledge

• Interpretation
  – Forwards
  – Backwards

  <antecedent> → <action>

  <two antiparallel lines> → <road>

Page 20: The Integration of Processing Components and Knowledge

Forward chaining

• If <antecedent> is TRUE

• Execute <action>

• Antecedent will be a test on some data

• Action might modify the data

• Suitable for low level processing
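A hedged sketch of a forward-chaining loop over rules of the form above; rules test the working data and the action adds a new fact. The rule content is illustrative:

```python
# Forward chaining: repeatedly fire any rule whose antecedent holds on the data,
# letting the action add new facts, until nothing changes.
facts = {"two antiparallel lines"}

# Each rule is (antecedent_facts, fact_added_by_action); content is illustrative.
rules = [
    ({"two antiparallel lines"}, "road"),
    ({"road", "road crossing"}, "junction"),
]

changed = True
while changed:
    changed = False
    for antecedent, action in rules:
        if antecedent <= facts and action not in facts:
            facts.add(action)
            changed = True

print(facts)  # 'road' has been added to the facts
```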

Page 21: The Integration of Processing Components and Knowledge

Backward chaining

• Action is some goal to achieve

• Antecedent defines how it should be achieved

• Suitable for high level processing
  – Guides focus of system
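A corresponding sketch of backward chaining: start from a goal (an action) and recursively try to establish its antecedents from known facts. Again, the rules are illustrative:

```python
# Backward chaining: to prove a goal, find a rule whose action is the goal and
# try to prove every fact in its antecedent; known facts end the recursion.
facts = {"two antiparallel lines", "road crossing"}
rules = [
    ({"two antiparallel lines"}, "road"),
    ({"road", "road crossing"}, "junction"),
]

def prove(goal):
    if goal in facts:
        return True
    for antecedent, action in rules:
        if action == goal and all(prove(f) for f in antecedent):
            return True
    return False

print(prove("junction"))  # True: 'road' is derived, 'road crossing' is a fact
```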

Page 22: The Integration of Processing Components and Knowledge

Advantage of using rules

• Easy to understand, modify and maintain.

• Inference and explanation easily produced.

• Uncertainty is easily combined with rules.

• Good for procedural knowledge; suits a wide range of heuristic knowledge.

Page 23: The Integration of Processing Components and Knowledge

Disadvantage of using rules

• Complex knowledge requires many rules.

• Search limitations when there are many rules

Page 24: The Integration of Processing Components and Knowledge

Frames

• A "data-structure for representing a stereotyped situation"

• Slot (attribute) → Filler (value: atomic, link to another frame, default or empty, call to a function to fill the slot)

Page 25: The Integration of Processing Components and Knowledge

Frames

• Frames represent specific lists of attributes (slots) to be filled in for specific nodes

• Frame-based systems are hierarchical structures that contain a number of slots, or named attributes, which together describe a concept, object or event

• They are well suited to large knowledge areas.
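A minimal sketch of a frame as a Python class with named slots, default fillers, a link to another frame and a procedural attachment (a function called to fill a slot). The frame and slot names are illustrative:

```python
# A frame: named slots whose fillers may be atomic values, links to other
# frames, defaults, or a function called on demand to fill the slot.
class Frame:
    def __init__(self, name, **slots):
        self.name = name
        self.slots = slots

    def get(self, slot, default=None):
        filler = self.slots.get(slot, default)
        return filler() if callable(filler) else filler

# Illustrative frames: a generic 'room' and a specific office linked to it.
room = Frame("room", walls=4, shape="rectangular")
office = Frame("office",
               is_a=room,                    # link to another frame
               occupant="unknown",           # default filler
               area=lambda: 3.0 * 4.5)       # procedural attachment

print(office.get("area"))        # 13.5, computed when the slot is read
print(office.get("is_a").name)   # 'room'
```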

Page 26: The Integration of Processing Components and Knowledge

Methods of control

• How to control how the system’s knowledge is used
  – Hierarchical

Page 27: The Integration of Processing Components and Knowledge

Hierarchical control

• “Algorithm” defines control

• Compare other software:
  – Main programme calls subroutines
  – Achieve a predefined sequence of tasks

• Two extreme variants
  – Bottom-up
  – Top-down

Page 28: The Integration of Processing Components and Knowledge

Bottom-up control

[Diagram: bottom-up control — image → feature extraction → extracted features, attributes, relationships → decision making → object recognition]

Page 29: The Integration of Processing Components and Knowledge

Top-down control

[Diagram: top-down control — hypothesised object → prediction → predicted features, attributes, relationships → directed feature extraction → features in the image that support or refute the hypothesis]

Page 30: The Integration of Processing Components and Knowledge

Critique

• Inflexible methods

• Errors propagate

• Hybrid control
  – Can make predictions
  – Verify
  – Modify predictions

Page 31: The Integration of Processing Components and Knowledge

Hybrid control

[Diagram: hybrid control — feature extraction derives extracted features, attributes and relationships from the image; decision making compares these with the predicted features, attributes and relationships generated by prediction from the hypothesised object, and the cycle repeats until recognition]
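A hedged sketch of the predict–verify–modify loop behind hybrid control; every function name here (extract_features, predict_features, revise, agrees) is a placeholder, not an established API:

```python
# Hybrid control: bottom-up evidence is compared with top-down predictions;
# the hypothesis is revised until prediction and evidence agree.
def hybrid_recognise(image, initial_hypothesis, extract_features,
                     predict_features, revise, agrees, max_iter=10):
    hypothesis = initial_hypothesis
    for _ in range(max_iter):
        evidence = extract_features(image, hypothesis)   # directed, bottom-up
        predicted = predict_features(hypothesis)         # top-down prediction
        if agrees(evidence, predicted):
            return hypothesis                            # hypothesis verified
        hypothesis = revise(hypothesis, evidence)        # modify the prediction
    return None  # no hypothesis could be verified
```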

Page 32: The Integration of Processing Components and Knowledge

Uncertainty Reasoning

• Bayesian methods
  – Define a belief network
  – A tree structure

• Reflects evidential support of a fact

[Diagram: a small belief network — node F1 with children F2 and F3]

Page 33: The Integration of Processing Components and Knowledge

Dempster-Shafer

• Bayesian theory provides a measure of belief only

• No measure of disbelief

• D-S attempts to define this

Page 34: The Integration of Processing Components and Knowledge

Dempster-Shafer

• Bayesian reasoning allows us to state our belief in a hypothesis and our belief in that same hypothesis when some new data are received. Dempster-Shafer theory (D-S) also provides an assessment of belief in some hypothesis which can be modified in the light of new data. Unlike Bayesian reasoning, D-S takes into account the fact that it may not be possible to assign a belief to every hypothesis set.

Page 35: The Integration of Processing Components and Knowledge

D-S scenario

• Mrs Jones has a carton of cream delivered along with the milk early in the morning on some days of the week. On most mornings following delivery of the cream, the carton is found open and the contents are gone. Mrs Jones believes that the culprit is one of the three animals that stalk the area. One animal is a dog, the other a cat and the third a fox. Occasionally a neighbour will catch sight of the thief in the act, but the delivery is before daylight and no neighbour has been certain about their sighting.

Page 36: The Integration of Processing Components and Knowledge

D-S scenario

There are three suspects:
  – Dog – d
  – Cat – c
  – Fox – f

and each suspect represents a hypothesis. Only a single animal is responsible for the theft.

Page 37: The Integration of Processing Components and Knowledge

D-S scenario

• The set of hypotheses is called the frame of discernment Θ. In this example:

• Θ = {d, c, f}

• The thief is either the dog, the cat or the fox. D-S is not limited to assigning a belief to only dog, cat or fox, but can assign beliefs to any element that is a member of the power set of Θ. The belief in an element, x, is referred to as a probability mass, denoted m(x).

Page 38: The Integration of Processing Components and Knowledge

D-S scenario

• The power set of Θ is the set of all subsets of Θ and is denoted by 2^Θ.

• 2^Θ = {Φ, {d}, {c}, {f}, {d,c}, {d,f}, {c,f}, Θ}

• Where Φ denotes the empty set. The power set expresses all possibilities. For example, {d} is the hypothesis that the dog takes the cream and {d,f} is the hypothesis that the culprit is either the dog or the fox.

• There are restrictions on the values of m(x):

  \sum_{x \in 2^\Theta} m(x) = 1, \qquad m(\Phi) = 0

Page 39: The Integration of Processing Components and Knowledge

D-S scenario

• These state that the total mass must sum to 1 and that the empty set is not possible (the closed world assumption, which means that no animal other than the dog, fox or cat is stealing the cream). Any subset x that has a non-zero value for m(x) is called a focal element.

Page 40: The Integration of Processing Components and Knowledge

D-S scenario

• Suppose neighbour 1 states that she believes it is either the dog or cat with probability 0.8. So m({d,c}) = 0.8. The probability must sum to 1, so 0.2 has to be assigned somehow to the other hypothesis sets. The best we can do without any other information is to assign it to the whole frame of discernment: m({d, c, f}) = 0.2.

Page 41: The Integration of Processing Components and Knowledge

D-S scenario

• On the following night, another neighbour spots the thief and states that she believes that it was either the cat or fox with probability 0.7. How should these new data be combined with the original data? D-S theory states that the original mass is combined with the new mass according to the rule

  m(C) = \sum_{A \cap B = C} m(A)\, m(B)    -eqn (1)

Page 42: The Integration of Processing Components and Knowledge

D-S scenario

• A is the set of focal elements identified by neighbour 1 and B those by neighbour 2. This equation states that there is a set C of focal elements formed by the intersection of the sets in A and B, and the mass assigned to an element of C is the product of the masses of the sets whose intersection formed it. The result of applying eqn (1) is given in Table 1.

Page 43: The Integration of Processing Components and Knowledge

D-S scenario

                        Neighbour 2
                        m({c,f}) = 0.7       m({d,c,f}) = 0.3

Neighbour 1
m({d,c}) = 0.8          m({c}) = 0.56        m({d,c}) = 0.24
m({d,c,f}) = 0.2        m({c,f}) = 0.14      m({d,c,f}) = 0.06

Table 1. The probability masses from neighbours 1 and 2 are combined
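A short Python sketch of the combination rule in eqn (1) applied to these two mass assignments; frozensets stand for hypothesis sets, and the output reproduces Table 1:

```python
from collections import defaultdict

def combine(m1, m2):
    """Dempster's rule of combination, eqn (1), without normalisation."""
    m = defaultdict(float)
    for a, ma in m1.items():
        for b, mb in m2.items():
            m[a & b] += ma * mb
    return dict(m)

neighbour1 = {frozenset("dc"): 0.8, frozenset("dcf"): 0.2}
neighbour2 = {frozenset("cf"): 0.7, frozenset("dcf"): 0.3}

m3 = combine(neighbour1, neighbour2)
for s, v in m3.items():
    print(set(s), round(v, 2))
# {'c'} 0.56, {'d','c'} 0.24, {'c','f'} 0.14, {'d','c','f'} 0.06
```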

Page 44: The Integration of Processing Components and Knowledge

D-S scenario

• We shall use the notation mn to indicate the evidence has been encountered at step n. The first step was from neighbour 1 and the second from neighbour 2, which are combined to give a new belief at step 3. So:

  m3({c}) = 0.56
  m3({d,c}) = 0.24
  m3({c,f}) = 0.14
  m3({d,c,f}) = 0.06

Page 45: The Integration of Processing Components and Knowledge

D-S scenario

• Two probability measures are provided which assess the belief (Bel) and plausibility (Pl) of any set of hypotheses:

  Bel(A) = \sum_{B \subseteq A} m(B)    -eqn (2)

  Pl(A) = \sum_{B \cap A \neq \Phi} m(B)    -eqn (3)

Page 46: The Integration of Processing Components and Knowledge

D-S scenario

• These two measures represent lower and upper bounds on the belief in a set of hypotheses. So the belief in the cat being the culprit is the sum of the masses where the set of hypotheses is a subset of {c}, which in this case is simply:

• Bel({c}) = 0.56

• The plausibility is the sum of all masses that contain cat as a member:

• Pl({c}) = 0.56 + 0.24 + 0.14 + 0.06 = 1.0
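Bel and Pl from eqns (2) and (3) can be computed directly from the combined masses; a short sketch, reusing the set-as-frozenset convention from the earlier block:

```python
# Combined masses m3 from neighbours 1 and 2 (see Table 1).
m3 = {frozenset("c"): 0.56, frozenset("dc"): 0.24,
      frozenset("cf"): 0.14, frozenset("dcf"): 0.06}

def bel(m, a):
    """Belief, eqn (2): sum of masses over subsets of a."""
    return sum(v for b, v in m.items() if b <= a)

def pl(m, a):
    """Plausibility, eqn (3): sum of masses over sets that intersect a."""
    return sum(v for b, v in m.items() if b & a)

print(round(bel(m3, frozenset("c")), 2), round(pl(m3, frozenset("c")), 2))   # 0.56 1.0
print(round(bel(m3, frozenset("dc")), 2), round(pl(m3, frozenset("dc")), 2)) # 0.8 1.0
```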

Page 47: The Integration of Processing Components and Knowledge

D-S scenario

• The belief and plausibility in the dog and fox are:

• Bel({d}) = 0

• Pl({d}) = 0.24 + 0.06 = 0.3

• Bel({f}) = 0

• Pl({f}) = 0.14 + 0.06 = 0.2

Page 48: The Integration of Processing Components and Knowledge

D-S scenario

• The belief and plausibility in it being either the dog or cat are:

• Bel({d,c}) = 0.56 + 0.24 = 0.8

• Pl({d,c}) = 0.56 + 0.24 + 0.14 + 0.06 = 1.0

Page 49: The Integration of Processing Components and Knowledge

D-S scenario

                            Neighbour 3
                            m4({f}) = 0.6        m4({d,c,f}) = 0.4

Existing focal elements
m3({c}) = 0.56              null                 m5({c}) = 0.224
m3({d,c}) = 0.24            null                 m5({d,c}) = 0.096
m3({c,f}) = 0.14            m5({f}) = 0.084      m5({c,f}) = 0.056
m3({d,c,f}) = 0.06          m5({f}) = 0.036      m5({d,c,f}) = 0.024

Table 2. Combining evidence from neighbour 3 with the evidence derived from combining the sightings of neighbours 1 and 2

Page 50: The Integration of Processing Components and Knowledge

D-S scenario

• Table 2 is problematic because there are two null entries that indicate an empty intersection between the existing focal elements and the new evidence. In other words, the empty set has been assigned a mass, which violates the earlier condition that it is not possible to have belief in something outside of the sets of hypotheses. The suggested way around this problem is to normalise the entries using the following equation:

  m_n(C) = \frac{\sum_{A \cap B = C} m_{n-1}(A)\, m_{n-2}(B)}{1 - \sum_{A \cap B = \Phi} m_{n-1}(A)\, m_{n-2}(B)}

Page 51: The Integration of Processing Components and Knowledge

D-S scenario

• For our example, this equation suggests that we should divide each new focal element by the sum of all focal elements that do not have a null entry. All we are doing is ensuring that the null entries have a mass of zero and that all other new focal elements sum to 1. The denominator is:

• 0.084 + 0.036 + 0.224 + 0.096 + 0.056 + 0.024 = 0.52

• Each newly calculated focal element in Table 2 is now updated by dividing by 0.52. The updated values are given in Table 3. The final beliefs and plausibilities for each set of hypotheses after all three neighbours have given evidence are listed in Table 4.
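The earlier combination sketch can be extended with this normalisation step; the following self-contained snippet reproduces the masses of Table 3 from m3 and the third neighbour's evidence:

```python
from collections import defaultdict

def combine_normalised(m1, m2):
    """Dempster's rule with normalisation: mass assigned to the empty set
    (the conflict) is redistributed by dividing through by (1 - conflict)."""
    m = defaultdict(float)
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            c = a & b
            if c:
                m[c] += ma * mb
            else:
                conflict += ma * mb
    return {c: v / (1.0 - conflict) for c, v in m.items()}

m3 = {frozenset("c"): 0.56, frozenset("dc"): 0.24,
      frozenset("cf"): 0.14, frozenset("dcf"): 0.06}
neighbour3 = {frozenset("f"): 0.6, frozenset("dcf"): 0.4}

m5 = combine_normalised(m3, neighbour3)
for s, v in m5.items():
    print(set(s), round(v, 3))
# {'c'} 0.431, {'d','c'} 0.185, {'f'} 0.231, {'c','f'} 0.108, {'d','c','f'} 0.046
```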

Page 52: The Integration of Processing Components and Knowledge

D-S scenario

Neighbour 3

m4({f})=0.6 m4=({d,c,f})=0.4

m3({c})=0.56 null m5({c})=0.431

m3({d,c})=0.24 null m5({d,c})=0.185

Existing focal elements

m3({c,f}) =0.14 m5({f})=0.162 m5({d,f})=0.108

m3(d,c,f})=0.06 m5({f})=0.069 m5({d,c,f})=0.046

Page 53: The Integration of Processing Components and Knowledge

D-S scenario

             Belief     Plausibility

{d}          0          0.231
{c}          0.431      0.770
{f}          0.231      0.385
{d,c}        0.616      0.770
{d,f}        0.231      0.570
{c,f}        0.770      1.0
{d,c,f}      1.0        1.0

Table 4. Final beliefs and plausibilities after all three neighbours have given evidence

Page 54: The Integration of Processing Components and Knowledge

Other formalisms

• Belief calculi exist

• Not yet widely used
  – A result is important
  – Confidence in the result is not quantified

Page 55: The Integration of Processing Components and Knowledge

Summary

• Intelligent (vision) systems
  – Knowledge representation
  – Control strategies
  – Integration of belief