We want to design a truly intelligent robot
Or like this??
• Musée Mécanique
• http://museemecanique.citysearch.com/
Application: Knowledge Discovery in Databases
We need AI and ML for all these applications
What is AI?
Discipline that systematizes and automates intellectual tasks to create machines that:
  Think like humans    Think rationally
  Act like humans      Act rationally
(The "rationally" column is the more formal and mathematical view.)
Some Important Achievements of AI
• Logic reasoning (databases): applied in reasoning interactive robot helpers
• Search and game playing: applied in robot motion planning
• Knowledge-based systems: applied in robots that use knowledge, such as internet robots
• Bayesian networks (diagnosis)
• Machine learning and data mining
• Planning and military logistics
• Autonomous robots
MAIN BRANCHES OF AI APPLICABLE TO ROBOTICS
Artificial Intelligence
• Neural Nets
• Fuzzy Logic
• Genetic Algorithms
ARTIFICIAL INTELLIGENCE
• Machine Learning
• Decision-Theoretic Techniques
• Symbolic Concept Acquisition
• Constructive Induction
• Genetic Learning
Examples of applications:
• Unsupervised learning
• Treatment of uncertainty
• Efficient constraint satisfaction
Inductive Learning by Nearest-Neighbor Classification
• One simple approach to inductive learning is to save each training example as a point in feature space.
• Classify a new example by giving it the same classification (+ or -) as its nearest neighbor in feature space.
  – A variation computes a weighted sum of the classes of a set of neighbors, where the weights correspond to distances.
  – Another variation uses the center of each class.
• The problem with this approach is that it doesn't necessarily generalize well if the examples are not well "clustered." (A minimal sketch follows below.)
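The following is a minimal Python sketch of the nearest-neighbor idea above, with optional distance weighting; the toy points, the inverse-distance weighting, and k = 3 are illustrative assumptions, not prescribed by the lecture.

import numpy as np

def knn_classify(train_X, train_y, query, k=3, weighted=True):
    """Classify `query` by its k nearest training points in feature space."""
    dists = np.linalg.norm(train_X - query, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]                   # indices of k closest points
    votes = np.where(train_y[nearest] == 1, 1.0, -1.0)
    if weighted:
        # Weight each neighbor's vote by inverse distance (closer = stronger).
        votes = votes / (dists[nearest] + 1e-9)
    return 1 if votes.sum() > 0 else -1               # '+' or '-' class

# Toy example: two small clusters of - and + points.
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([-1, -1, 1, 1])
print(knn_classify(X, y, np.array([0.8, 0.9])))       # -> 1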
Text Mining: Information Retrieval and Filtering
• 20 USENET Newsgroups
  – comp.graphics, comp.os.ms-windows.misc, comp.sys.ibm.pc.hardware, comp.sys.mac.hardware, comp.windows.x
  – misc.forsale, rec.autos, rec.motorcycles, rec.sport.baseball, rec.sport.hockey
  – sci.space, sci.crypt, sci.electronics, sci.med
  – soc.religion.christian, talk.politics.guns, talk.politics.mideast, talk.politics.misc, talk.religion.misc
  – alt.atheism
• Problem definition [Joachims, 1996]
  – Given: 1000 training documents (posts) from each group
  – Return: a classifier for new documents that identifies the group each belongs to
• Example: recent article from comp.graphics.algorithms
  "Hi all, I'm writing an adaptive marching cube algorithm, which must deal with cracks. I got the vertices of the cracks in a list (one list per crack). Does there exist an algorithm to triangulate a concave polygon? Or how can I bisect the polygon so that I get a set of connected convex polygons? The cases of occurring polygons are these: ..."
• Performance of Newsweeder (Naïve Bayes): 89% Accuracy
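This kind of experiment can be reproduced in a few lines. The sketch below uses scikit-learn (an assumed library choice, not named in the lecture) with TF-IDF features and multinomial Naive Bayes; this is close to, but not identical to, the Newsweeder setup.

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train = fetch_20newsgroups(subset="train")   # 20 groups, ~1000 posts each
test = fetch_20newsgroups(subset="test")

# Bag-of-words features feeding a multinomial Naive Bayes classifier.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train.data, train.target)
print("accuracy:", model.score(test.data, test.target))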
Rule and Decision Tree Learning
• Example: Rule Acquisition from Historical Data
• Data
  – Customer 103 (visit 1): Age: 23, Previous-Purchase: no, Marital-Status: single, Children: none, Annual-Income: 20000, Purchase-Interests: unknown, Store-Credit-Card: no, Homeowner: unknown
  – Customer 103 (visit 2): Age: 23, Previous-Purchase: no, Marital-Status: married, Children: none, Annual-Income: 20000, Purchase-Interests: car, Store-Credit-Card: yes, Homeowner: no
  – Customer 103 (visit n): Age: 24, Previous-Purchase: yes, Marital-Status: married, Children: yes, Annual-Income: 75000, Purchase-Interests: television, Store-Credit-Card: yes, Homeowner: no, Computer-Sales-Target: YES
• Learned Rule
  – IF the customer has made a previous purchase, AND the customer has an annual income over $25000, AND the customer is interested in buying home electronics, THEN the probability of a computer sale is 0.5
– Training set: 26/41 = 0.634, test set: 12/20 = 0.600
– Typical application: target marketing
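A minimal sketch of applying such a learned rule, assuming a simple dictionary record format; the field names follow the slide, but the mapping of "home electronics" to specific interest values is a hypothetical choice.

def computer_sale_probability(customer):
    """IF previous purchase AND income > $25000 AND interested in home
    electronics THEN P(computer sale) = 0.5 (estimated from training data)."""
    if (customer["previous_purchase"] == "yes"
            and customer["annual_income"] > 25000
            and customer["purchase_interests"] in ("television", "computer")):
        return 0.5   # 26/41 on the training set, 12/20 on the test set
    return None      # rule does not fire for this customer

print(computer_sale_probability({
    "previous_purchase": "yes", "annual_income": 75000,
    "purchase_interests": "television"}))  # -> 0.5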
INDUCTIVE LEARNING: Example of Risk Classification
NO.  RISK      CREDIT HISTORY  DEBT  COLLATERAL  INCOME
1.   High      Bad             High  none        $0-$15K
2.   High      Unknown         High  none        $15-$35K
3.   Moderate  Unknown         Low   none        $15-$35K
4.   High      Unknown         Low   none        $0-$15K
5.   Low       Unknown         Low   none        >$35K
6.   Low       Unknown         Low   none        >$35K
Decision variable (output) is RISK
INDUCTIVE LEARNING: Example of Risk Classification (cont.)

[Decision tree: the root tests Income; lower nodes test Debt, Credit History, and Collateral to assign RISK.]
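A hedged sketch of inducing such a tree from the six examples above, assuming scikit-learn and simple ordinal encodings of the attribute values (both are illustrative choices, not from the lecture).

from sklearn.tree import DecisionTreeClassifier, export_text

# Ordinal encodings: credit {Bad:0, Unknown:1}, debt {Low:0, High:1},
# collateral {none:0}, income {$0-$15K:0, $15-$35K:1, >$35K:2}.
X = [[0, 1, 0, 0],   # 1. High risk
     [1, 1, 0, 1],   # 2. High risk
     [1, 0, 0, 1],   # 3. Moderate risk
     [1, 0, 0, 0],   # 4. High risk
     [1, 0, 0, 2],   # 5. Low risk
     [1, 0, 0, 2]]   # 6. Low risk
y = ["High", "High", "Moderate", "High", "Low", "Low"]

tree = DecisionTreeClassifier().fit(X, y)
print(export_text(tree, feature_names=["credit", "debt", "collateral", "income"]))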
The Blocks World
Hand-Coded Knowledge vs. Machine Learning
•How much work would it be to enter knowledge by hand?
•Do we even know what to enter?
1952-62   Samuel's checkers player learned its evaluation function
1975      Winston's system learned structural descriptions from examples and near misses
1984      Probably Approximately Correct learning offered a theoretical foundation
mid-80s   The rise of neural networks
Concept Acquisition
Example of “Arch” concept
two bricks support a brick
two bricks support a pyramid
Concept Acquisition (cont)
Bricks and pyramids are instances of Polygon
ARCH => Two bricks support a polygon
Some Fundamental Issues for Most AI Problems
• Learning: new knowledge is acquired through
  – inductive inference
– neural networks
– artificial life
– evolutionary approaches
What we’ll be doing
• Uncertain knowledge and reasoning
  – probability, Bayes rule
• Machine learning
  – decision trees, computational learning theory, reinforcement learning
a_i = g(in_i),  where  in_i = Σ_j W_{j,i} a_j
A Generalized Model of Learning
[Diagram: the Training System supplies inputs and correct outputs; the Performance Element uses the Knowledge Base to map inputs to outputs; the Feedback Element compares actual and correct outputs; the Learning Element uses this feedback to update the Knowledge Base.]
Training system is used to create learning pairs (input, output) to train our robot
The system starts with some knowledge. The performance element participates in performance; the feedback element compares the actual output with the correct output, and this comparison becomes input to the learning element, which analyzes the differences and updates the knowledge base.
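A skeletal Python rendering of this model; the class and method names are illustrative, and the knowledge base is left abstract (any object with predict and update methods would do).

class LearningAgent:
    def __init__(self, knowledge_base):
        self.kb = knowledge_base          # starts with some knowledge

    def performance_element(self, x):
        return self.kb.predict(x)         # acts using current knowledge

    def feedback_element(self, actual, correct):
        return correct - actual           # compares actual vs. correct output

    def learning_element(self, x, error):
        self.kb.update(x, error)          # analyzes differences, updates the KB

    def train(self, training_pairs):      # (input, output) pairs from the training system
        for x, correct in training_pairs:
            actual = self.performance_element(x)
            error = self.feedback_element(actual, correct)
            self.learning_element(x, error)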
Neural Networks
Standard computers
• Referred to as von Neumann machines
• Follow explicit instructions
• Sample program:
    if (time < noon) print "Good morning"
    else print "Good afternoon"
Neural networks
• Modeled on the human brain
• Does not follow explicit instructions
• Is trained instead of programmed
• Key papers:
  – McCulloch, W. and Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5:115-133.
  – Rosenblatt, Frank. (1958). The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. Psychological Review, 65:386-408.
Sources for lecture
• comp.ai.neural-networks FAQ
• Gurney, Kevin. An Introduction to Neural Networks, 1996.
Neuron drawing
Neuron behavior
• Within a neuron, signals travel as electrical pulses
• Between neurons, communication is through chemical neurotransmitters
• If the inputs to a neuron are greater than its threshold, the neuron fires, sending an electrical pulse to other neurons
This is a simplification.
Perceptron (artificial neuron)
[Diagram: two inputs are multiplied by weights a and b, summed, and the sum is compared with a threshold (> 10?) to produce the output.]
Training
• Inputs and outputs are 0 (no) or 1 (yes)
• Initially, weights are random
• Provide training input
• Compare the output of the neural network to the desired output
  – If same, reinforce patterns
  – If different, adjust weights
Example

If both inputs are 1, the output should be 1. With weights 2 and 3, the inputs (1, 1) give the weighted sum 1·2 + 1·3 = 5. Since 5 is not greater than the threshold of 10, the output is 0, but the desired output is 1. Must increase weights!

Repeat for all inputs until weights stop changing.
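A small Python sketch of this training procedure, using the fixed threshold of 10 from the example; the learning rate, random initial weights, and epoch limit are assumptions.

import random

def train_perceptron(examples, threshold=10, lr=1, max_epochs=100):
    weights = [random.randint(1, 5) for _ in examples[0][0]]  # random start
    for _ in range(max_epochs):
        changed = False
        for inputs, target in examples:
            s = sum(w * x for w, x in zip(weights, inputs))   # weighted sum
            output = 1 if s > threshold else 0
            if output != target:                              # wrong: adjust
                for i, x in enumerate(inputs):
                    weights[i] += lr * (target - output) * x  # raise or lower
                changed = True
        if not changed:       # repeat until weights stop changing
            break
    return weights

# AND-like task from the slides: output 1 only when both inputs are 1.
print(train_perceptron([((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]))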
Function-Learning Formulation
Goal function: f
Training set: (x(i), f(x(i))), i = 1, …, n
Inductive inference: find a function h that fits the points well (same Keep-It-Simple bias)

[Plot: points (x, f(x)) with a smooth hypothesis h drawn through them.]
Perceptron (the goal function f is a boolean one)

y = g(Σ_{i=1,…,n} w_i x_i)

[Diagram: inputs x1, …, xn, each multiplied by its weight w_i, are summed and passed through g to produce y.]
[Plot: + and - examples in the (x1, x2) plane, separated by the line w1 x1 + w2 x2 = 0.]
Perceptron (the goal function f is a boolean one)

y = g(Σ_{i=0,…,n} w_i x_i)   (with x0 a fixed bias input)

[Plot: + and - examples in the (x1, x2) plane, with a query point marked "?".]
Unit (Neuron)
y = g(Σ_{i=1,…,n} w_i x_i)
g(u) = 1/[1 + exp(-u)]   (sigmoid)

[Diagram: inputs x1, …, xn with weights w_i feed the unit g, which outputs y.]
Neural Network
Network of interconnected neurons
[Diagram: several units, each computing y = g(Σ_i w_i x_i), wired together.]
Acyclic (feed-forward) vs. recurrent networks
Two-Layer Feed-Forward Neural Network
[Diagram: inputs feed a hidden layer through weights w1j; the hidden layer feeds the output layer through weights w2k.]
Backpropagation (Principle)
New example: y(k) = f(x(k))
φ(k) = output of the NN with weights w(k-1) for inputs x(k)
Error function: E(k)(w(k-1)) = ||φ(k) - y(k)||²
Weight update: w_ij(k) = w_ij(k-1) - ε ∂E/∂w_ij   (i.e., w(k) = w(k-1) - ε∇E)
Backpropagation algorithm: Update the weights of the inputs to the last layer, then the weights of the inputs to the previous layer, etc.
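A minimal numpy sketch of two-layer backpropagation with sigmoid units and the squared-error function above; the XOR task, layer sizes, learning rate, and iteration count are illustrative assumptions, not from the lecture.

import numpy as np

def g(u):                          # sigmoid g(u) = 1/(1 + exp(-u))
    return 1.0 / (1.0 + np.exp(-u))

def add_bias(A):                   # append a constant-1 input to each row
    return np.hstack([A, np.ones((A.shape[0], 1))])

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)     # XOR targets

W1 = rng.normal(size=(3, 4))       # (2 inputs + bias) -> 4 hidden units
W2 = rng.normal(size=(5, 1))       # (4 hidden + bias) -> 1 output unit
eps = 0.5                          # learning rate (epsilon in the slides)

for _ in range(20000):
    Xb = add_bias(X)
    H = g(Xb @ W1)                 # hidden-layer activations
    Hb = add_bias(H)
    out = g(Hb @ W2)               # network output (phi)
    err = out - Y                  # from E = ||phi - y||^2, up to a factor of 2
    d2 = err * out * (1 - out)              # output-layer deltas
    d1 = (d2 @ W2[:-1].T) * H * (1 - H)     # hidden deltas (bias row dropped)
    W2 -= eps * Hb.T @ d2          # update the last layer first...
    W1 -= eps * Xb.T @ d1          # ...then the previous layer

print(out.round(2))                # should approach [[0], [1], [1], [0]]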
Comments and Issues
How to choose the size and structure of networks?
• If the network is too large, there is a risk of over-fitting (data caching)
• If the network is too small, the representation may not be rich enough
Role of representation: e.g., learning the concept of an odd number
Incremental learning
RECOGNIZING A PERSON
Glasses, hairstyle, facial features, height

Perceptrons: early neural nets
Symbolic vs. Subsymbolic AI
Subsymbolic AI: Model intelligence at a level similar to the neuron. Let such things as knowledge and planning emerge.
Symbolic AI: Model such things as knowledge and planning in data structures that make sense to the programmers who build them.
(blueberry (isa fruit) (shape round) (color purple) (size .4 inch))
The Origins of Subsymbolic AI
1943 McCulloch and Pitts A Logical Calculus of the Ideas Immanent in Nervous Activity
"Because of the 'all-or-none' character of nervous activity, neural events and the relations among them can be treated by means of propositional logic."
Interest in Subsymbolic AI
[Chart: interest in subsymbolic AI by decade, 1940s-2010s.]
Artificial Neural Networks
• Neural networks are composed of:
  – layers of nodes (artificial neurons)
  – weights connecting the layers of nodes
• Different types of neural networks:
  – radial basis function networks
  – multi-layer feed-forward networks
  – recurrent networks (feedback)
• Learning algorithms:
  – modify the weights connecting the nodes
How to Work with Environment?
• Noisy data from sensors
  – the overhead camera is poor for orientation
• Some sensors don't work well (e.g., RPMs)
• Can't add new sensors (rules of the game)
• Other parts of the system constrain the available information (e.g., radio communication takes 25 ms, frame processing takes 15 ms, etc.)
Key Points
• Problems can be specified in a number of ways for neural network creation
• Neural networks are a generic technology
• Problem specification has a direct impact on the application of the generic neural network technology
• Problem specific information is vital
Problem
1. Insufficiently characterized development process compared with conventional software
– What are the steps to create a neural network?
2. How do we create neural networks in a repeatable and predictable manner?
3. Absence of quality assurance methods for neural network models and implementations
– How do I verify my implementation?
Problem 1 – The Steps
Define the process of developing neural networks:
1. Formally capture the specifics of the problem in a document based on a template
2. Define the factors/parameters for creation
  – neural network creation parameters
  – performance requirements
3. Create the neural network
4. Get feedback on performance
Neural Network Development Process
Problem Specification Phase
• Some factors to define in the problem specification:
  – type of neural network (based on experience or published results)
  – how to collect and transform problem data
  – potential input/output representations
  – training & testing method and data selection
  – performance targets (accuracy and precision)
• The most important output is the ranked collection of factors/parameters (a toy example follows below)
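A toy illustration of such a problem specification captured as data; every field name and value here is hypothetical, not the actual template from the lecture.

problem_spec = {
    "network_type": "multi-layer feed-forward",    # based on published results
    "data_collection": "overhead camera logs",     # how to collect/transform data
    "input_representation": ["distance", "orientation (radians)"],
    "output_representation": ["wheel velocities"],
    "train_test_method": "random 70/30 split",
    "performance_targets": {"accuracy": 0.90, "precision": 0.85},
    # The key output of this phase: creation parameters, ranked by importance.
    "ranked_parameters": ["hidden units", "learning rate", "input encoding"],
}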
Problem 2 – Neural Network Creation
• Predictability (with regard to resources)
  – depending on the creation approach used, record the time for one iteration
  – use this timing to predict the maximum and minimum times for all of the combinations specified
• Repeatability
  – the relevant information is captured in the problem specification and the combinations of parameters
Problem 3 - Quality Assurance
• Specification of generic neural network software (models and learning)
• Prototype of specification
• Comparison of a given implementation with specification prototype
• Allows practitioners to create arbitrary neural networks verified against models
Two Methods for Comparison
• Direct comparison of outputs:
• Verification of weights generated by learning algorithm:
20-10-5 network (with particular connections and input):

                  Output vector
  Prototype       <0.123892, 0.567442, 0.981194, 0.321438, 0.699115>
  Implementation  <0.123892, 0.567442, 0.981194, 0.321438, 0.699115>

20-10-5 network, weight states during learning:

                  Iteration 100    Iteration 200    ...   Iteration n
  Prototype       Weight state 1   Weight state 2   ...   Weight state n
  Implementation  Weight state 1   Weight state 2   ...   Weight state n
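Both methods reduce to checking arrays for near-equality; a small numpy sketch (the tolerances are assumptions, not from the lecture):

import numpy as np

prototype_out = np.array([0.123892, 0.567442, 0.981194, 0.321438, 0.699115])
implementation_out = np.array([0.123892, 0.567442, 0.981194, 0.321438, 0.699115])

# Method 1: direct comparison of outputs for the same network and input.
assert np.allclose(prototype_out, implementation_out, atol=1e-6)

# Method 2: compare weight states at checkpoints during learning.
def same_trajectory(proto_states, impl_states, atol=1e-6):
    """Each argument: a list of weight arrays at iterations 100, 200, ..., n."""
    return all(np.allclose(p, i, atol=atol)
               for p, i in zip(proto_states, impl_states))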
Further Work (1)
• Practitioners to use the development process, or at least to document it in the problem specification
• Feedback from the neural network development community on the content of the problem specification template
• Collect problem specifications and analyse them to look for commonalities in problem domains and improve predictability (e.g., control)
• More verification of the specification prototype
Further Work (2)
• Translation methods for formal specification
• Extend the formal specification to new types
• Fully prove aspects of the specification
• Cross-discipline data analysis methods (e.g., ICA, statistical analysis)
• Implementation of learning on distributed systems
  – peer-to-peer network systems (farm each combination of parameters out to a peer)
• Remain unfashionable
Some Applications of ANN
Applications
Pattern recognition:
– face recognition
– optical character recognition (OCR)
– medicine
– games (chess, bridge, go, …)
– finance
– gambling (greyhound racing)
Face recognition
Steve Lawrence, C. Lee Giles, A.C. Tsoi and A.D. Back. Face Recognition: A Convolutional Neural Network Approach. IEEE Transactions on Neural Networks, Special Issue on Neural Networks and Pattern Recognition, Volume 8, Number 1, pp. 98-113, 1997.
Machine Discovery
Famous Applications
• NASA's intelligent flight control system (F15As and C19s; NOT the space shuttle)
• Handwritten number recognition (postcodes on snail mail)
• Robotic arm control
• Number plate recognition on moving cars
• Detection of cancerous cells in smear tests
An Example Application
• Control for robotic soccer
• Part of the system requirements: move the robot to a target position with a certain orientation
• The dynamics of the robot are too hard to model directly
• Overhead camera for position (i.e., some environmental constraints)
How to Represent the Problem?
• Input representation of current & target state:
  – distance and time, or velocity?
  – orientation in degrees? radians?
  – target position and current position in x-y? polar?
• Output representation:
  – wheel velocities, translated to motor currents?
  – number of wheel "clicks" per second?
(One possible encoding is sketched below.)
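One possible input encoding, sketched in Python; the polar form, the sin/cos angle encoding, and the normalization constant are assumptions, not the system's documented choices.

import math

def encode_state(robot_xy, robot_theta, target_xy, target_theta):
    dx, dy = target_xy[0] - robot_xy[0], target_xy[1] - robot_xy[1]
    distance = math.hypot(dx, dy)                  # polar: distance...
    bearing = math.atan2(dy, dx) - robot_theta     # ...and angle relative to robot
    heading_err = target_theta - robot_theta       # orientation error in radians
    return [distance / 3.0,                        # normalize by assumed field size (m)
            math.sin(bearing), math.cos(bearing),  # angles as sin/cos pairs
            math.sin(heading_err), math.cos(heading_err)]

print(encode_state((0, 0), 0.0, (1, 1), math.pi / 2))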
Application of NN to Motion Planning (Climbing Robot)
[Diagram: one-step planning over transitions from an initial 4-hold stance, through 3-hold stances (breaking contact / zero force), to new 4-hold stances, and so on.]
Idea: Learn Feasibility
Create a large database of labeled transitions
Train a NN classifier: transition → {feasible, not feasible}
Learning is possible because the shape of the feasible space is mostly determined by the equilibrium condition, which depends on relatively few parameters
Creation of Database
Sample transitions at random (by picking 4 holds at random within the robot's limb span)
Label each transition as feasible or infeasible by sampling with a high time limit → over 95% of transitions are infeasible
Re-sample around feasible transitions → 35-65% feasible transitions
~1 day of computation to create a database of 100,000 labeled transitions
Training of a NN Classifier
3-layer NN with 9 input units, 100 hidden units, and 1 output unit
Training on 50,000 examples (~3 days of computation)
Validation on the remaining 50,000 examples: ~78% accuracy (ε = 0.22), 0.003 ms average running time
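A rough sketch of such a classifier; scikit-learn's MLPClassifier stands in for the original implementation (an assumption), and random placeholder data replaces the real transition database.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((1000, 9))            # 9 transition parameters per example
y = rng.integers(0, 2, size=1000)    # 1 = feasible, 0 = infeasible (placeholder labels)

clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500)  # 100 hidden units
clf.fit(X[:500], y[:500])            # train on half...
print(clf.score(X[500:], y[500:]))   # ...validate on the rest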
What Have We Learned?
• Useful methods
• Connections between fields, e.g., control theory, game theory, operations research
• Impact of hardware (chess software: brute-force reasoning vs. case-based reasoning)
• Relation between high-level (e.g., search, logic) and low-level (e.g., neural nets) representations: from pixels to predicates
• Beyond learning: what concepts to learn?
• What is intelligence? Impact of other aspects of human nature: fear of dying, appreciation for beauty, self-consciousness, ...
• Should AI be limited to information-processing tasks?
Our methods are better than our understanding
Ellen Spertus
Sources used