1
Adaptive Robotics COM2110
Autumn Semester 2008
Lecturer: Amanda Sharkey
2
Researchers at Georgia Tech have built a biologically inspired robot to perform the tasks of service dogs.
Users issue verbal commands to the robot and indicate the object with a laser pointer.
E.g. fetching items, or closing doors or drawers.
The researchers worked with dog trainers.
The robot could replicate 10 commands and tasks.
Robots could reduce the cost of, and waiting times for, service dogs.
Companionship?
3
Assignment (30%)
Due by Monday 17th Nov at 11 am.
Write an essay (1500-2500 words) on one of the following topics. You should use the lectures as a starting point, but also research the topic yourself. Plan your answer. Include a reference section, with the references cited in full.
1. Identify the main characteristics of Behaviour-based robotics, and contrast the approach with that of “Good old-fashioned AI”.
2. To what extent did Grey Walter’s robots, Elsie and Elmer, differ from robots that preceded or followed them?
3. Explain how the concepts of “emergence” and “embodiment” are related to recent developments in robotics and artificial intelligence.
4
Assignment notes
Essay: try to structure it with an introduction, body and conclusion. Plan out an argument, like pseudo-code.
Include references:
- In the text, either numerical [1], or by name, Cao et al. (1997)
- In a Reference section at the end
Journal articles:
[1] Cao, Fukunaga and Kahn (1997) Cooperative mobile robotics: antecedents and directions. Autonomous Robots, 4, 1, 7-27.
Books:
Gordon, D. (1999) Ants at work: How an insect society is organised. W.W. Norton and Co., London.
Secondary citation:
Sharkey, A.J.C. (1800) How to cite references, cited in Gordon, D. (1999) Ants at work: How an insect society is organised. W.W. Norton and Co., London.
5
[Figure: network with light sensor inputs, a bias unit and motor outputs]
1. Training weights with delta rule to produce target outputs
2. Testing – presenting new inputs to (already trained) weights, to see what the output is. Generalisation performance is the percentage of the test set the trained net gets right.
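The two phases above can be sketched in a few lines. This is only an illustration: the two-sensor, two-motor layout, the learning rate, the tolerance used to decide whether an output counts as "right", and the tiny light-following data set are all assumptions, not taken from the lecture.

```python
import random

def train_delta(samples, n_inputs, n_outputs, rate=0.1, epochs=500, seed=0):
    """Phase 1: train a single-layer linear net with the delta rule.

    samples is a list of (inputs, targets) pairs. The returned weights are
    indexed weights[output][input]; the last column is the bias weight.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
               for _ in range(n_outputs)]
    for _ in range(epochs):
        for inputs, targets in samples:
            x = list(inputs) + [1.0]  # the bias unit is always on
            for j in range(n_outputs):
                output = sum(w * xi for w, xi in zip(weights[j], x))
                error = targets[j] - output
                for i in range(n_inputs + 1):
                    weights[j][i] += rate * error * x[i]  # delta rule update
    return weights

def predict(weights, inputs):
    """Motor outputs of the (already trained) net for one input pattern."""
    x = list(inputs) + [1.0]
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def generalisation(weights, test_set, tolerance=0.25):
    """Phase 2: percentage of unseen patterns the trained net gets right."""
    correct = 0
    for inputs, targets in test_set:
        outputs = predict(weights, inputs)
        if all(abs(o - t) <= tolerance for o, t in zip(outputs, targets)):
            correct += 1
    return 100.0 * correct / len(test_set)

# Hypothetical light-following data: each motor copies the opposite sensor.
train_set = [((0.0, 0.0), (0.0, 0.0)),
             ((1.0, 0.0), (0.0, 1.0)),
             ((0.0, 1.0), (1.0, 0.0))]
test_set = [((1.0, 1.0), (1.0, 1.0)),
            ((0.5, 0.5), (0.5, 0.5))]

trained = train_delta(train_set, n_inputs=2, n_outputs=2)
print(generalisation(trained, test_set))
```

The test patterns never appear in the training set, so the percentage measures generalisation rather than memorisation.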
6
Advantages of Neural Nets for robotics
They provide a straightforward mapping between sensors and motors
They are robust to noise (noisy sensors and environments)
They can provide a biologically plausible metaphor
They offer a relatively smooth search space (gradual changes in weights = gradual changes in behaviour)
7
Example: Hebbian learning for collision avoidance
From Pfeifer and Scheier (1999)
On-line learning
[Figure: network with distance sensors, collision layer and motor action]
Fixed weights between collision layer and motor actions.
When a collision occurs while a distance sensor is active, Hebbian learning is used to strengthen the weight between the two.
After learning, activating the distance sensor will result in the collision detector being activated.
The robot will learn to avoid objects.
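A minimal sketch of this idea follows, assuming a plain Hebbian update (weight change = rate x collision activity x distance activity) and a simple threshold on the collision units. The function names, the learning rate and the threshold are illustrative choices; the model in the cited book may differ in its details.

```python
def hebbian_step(weights, distance, collision, rate=0.2):
    """Strengthen weights[c][d] when collision unit c and distance sensor d
    are active at the same time (plain Hebbian rule, an assumption here)."""
    for c in range(len(collision)):
        for d in range(len(distance)):
            weights[c][d] += rate * collision[c] * distance[d]
    return weights

def collision_layer(weights, distance, collision, threshold=0.5):
    """A collision unit fires on a real bump, or when the learned input
    from the distance sensors is strong enough on its own."""
    active = []
    for c in range(len(collision)):
        drive = collision[c] + sum(w * d for w, d in zip(weights[c], distance))
        active.append(1.0 if drive >= threshold else 0.0)
    return active

# Two collision units (left, right) and two distance sensors (left, right).
weights = [[0.0, 0.0], [0.0, 0.0]]

# Before learning: an object seen on the left does not trigger the collision layer.
before = collision_layer(weights, distance=[1.0, 0.0], collision=[0.0, 0.0])

# Three bumps on the left while the left distance sensor is active.
for _ in range(3):
    hebbian_step(weights, distance=[1.0, 0.0], collision=[1.0, 0.0])

# After learning: the distance sensor alone activates the left collision unit,
# so the fixed collision-to-motor wiring can turn the robot away before contact.
after = collision_layer(weights, distance=[1.0, 0.0], collision=[0.0, 0.0])
print(before, after)
```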
8
Other methods of setting up an ANN controller
Can use the delta rule (for linearly separable patterns).
Or can use the backpropagation learning rule to train a multi-layer net.
This needs a training set: how can you translate the behaviour of not bumping into objects into a training set?
See Sharkey (1998) for one method: write simple code (an innate controller) for avoiding obstacles, and collect examples of inputs and outputs for training.
Or genetic algorithms can be used to evolve a neural net controller.
9
10
Darwin …..
11
Genetic Algorithms: a form of evolutionary computation
A GA operates on a population of artificial chromosomes, selectively reproducing the chromosomes of individuals with better performance, with some random mutations.
Artificial chromosome (genotype) encodes characteristics of individual (phenotype)
E.g. could encode weights of artificial neural network
12
Fitness function:
Used to evaluate the performance of each phenotype.
Higher fitness values are better.
E.g. minimising the difference between the output of a function and a target: smaller difference = higher fitness.
13
Start with population of randomly generated chromosomes
Decode each chromosome and evaluate its fitness
Apply genetic operators to create a new population: crossover, mutation, selective reproduction
Repeat until desired individual found, or best fitness stops increasing.
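The loop above can be sketched as follows. Everything concrete here is an assumption made for illustration: real-valued chromosomes, keeping the top half as parents, the mutation settings, the `patience` stopping rule, and the toy fitness (how closely a line encoded by the chromosome matches a target line).

```python
import random

def evolve(fitness, length, pop_size=30, max_generations=200,
           good_enough=0.999, patience=20, seed=0):
    """Minimal GA loop for real-valued chromosomes (length must be >= 2)."""
    rng = random.Random(seed)
    # Start with a population of randomly generated chromosomes.
    population = [[rng.uniform(-1.0, 1.0) for _ in range(length)]
                  for _ in range(pop_size)]
    best, best_fit, stalled = None, float("-inf"), 0
    for _ in range(max_generations):
        # Evaluate the fitness of every individual, best first.
        scored = sorted(((fitness(c), c) for c in population),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best = scored[0]
            stalled = 0
        else:
            stalled += 1
        # Halt when a desired individual is found or fitness stops increasing.
        if best_fit >= good_enough or stalled >= patience:
            break
        # Selective reproduction: only the better half become parents.
        parents = [c for _, c in scored[:pop_size // 2]]
        population = []
        while len(population) < pop_size:
            mother, father = rng.sample(parents, 2)
            point = rng.randrange(1, length)          # one-point crossover
            child = mother[:point] + father[point:]
            child = [g + rng.gauss(0.0, 0.1) if rng.random() < 0.2 else g
                     for g in child]                  # small random mutations
            population.append(child)
    return best, best_fit

XS = [-1.0, -0.5, 0.0, 0.5, 1.0]

def line_fitness(chromosome):
    """Smaller difference between output and target = higher fitness."""
    a, b = chromosome
    error = sum((a * x + b - (0.5 * x - 0.3)) ** 2 for x in XS)
    return 1.0 / (1.0 + error)

best, best_fit = evolve(line_fitness, length=2)
print(best, best_fit)
```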
14
15
Genetic operators: Crossover and Mutation
New population created by selective reproduction.
Offspring are randomly paired, crossed over and mutated
One-point crossover: for a pair of chromosomes, select a random point at which to exchange material between the two individuals.
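As a small illustration, one-point crossover on two list-based chromosomes could look like this (the function name and the list representation are assumptions):

```python
import random

def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length chromosomes at a random point."""
    point = rng.randrange(1, len(parent_a))  # never 0, so both parents contribute
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

rng = random.Random(0)
zeros = [0] * 8
ones = [1] * 8
child_a, child_b = one_point_crossover(zeros, ones, rng)
print(child_a, child_b)
```

With all-zero and all-one parents the crossover point is visible directly in the offspring: the first child starts with zeros and ends with ones, and the second child is its complement.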
16
17
18
Genetic operators: Mutation
For binary representations, switch the value of selected bits.
For real-valued representations, increment by a small number randomly drawn from a distribution centred on zero.
Other representations: substitute the selected location with a symbol randomly chosen from the same alphabet.
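The three mutation variants can be sketched as below. The per-gene mutation probability `rate`, the Gaussian distribution and its width `sigma` are assumptions; the slides only ask for a distribution centred on zero.

```python
import random

def mutate_bits(chromosome, rate, rng):
    """Binary representation: flip each bit with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]

def mutate_reals(chromosome, rate, sigma, rng):
    """Real-valued representation: add a small zero-centred Gaussian value."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g
            for g in chromosome]

def mutate_symbols(chromosome, alphabet, rate, rng):
    """Other representations: replace with a symbol from the same alphabet."""
    return [rng.choice(alphabet) if rng.random() < rate else g
            for g in chromosome]

rng = random.Random(0)
print(mutate_bits([0, 1, 1, 0], rate=0.25, rng=rng))
print(mutate_reals([0.2, -0.7, 1.5], rate=0.5, sigma=0.1, rng=rng))
print(mutate_symbols(list("ACGT"), alphabet="ACGT", rate=0.5, rng=rng))
```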
19
20
If evolving NN weights, crossover may not work – mutation often more effective.
21
Genetic operators: Selective reproduction
Making copies of the best individuals: more copies in the next generation.
Problem: the method breaks down when all individuals have similar fitness values (it becomes like random search), or when one or two have higher fitness values than the rest of the population (they dominate).
22
Solutions
Scale fitness values to enhance, or decrease individual differences
Rank-based selection: probability of making offspring proportional to rank
Truncation selection: ranking individuals and selecting top M individuals
Tournament-based selection: randomly select 2 individuals and generate a random number between 0 and 1. If the number is smaller than a predefined parameter T, the fitter individual makes offspring; otherwise the other individual reproduces.
Elitism: maintain best individual for next population.
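A hedged sketch of these selection schemes follows. The function names are illustrative, and rank 1 is given to the worst individual so that selection probability grows with rank; that is one common convention, assumed here.

```python
import random

def rank_selection(population, fitness, rng):
    """Pick one parent with probability proportional to its rank."""
    ranked = sorted(population, key=fitness)          # worst first = rank 1
    ranks = list(range(1, len(ranked) + 1))
    return rng.choices(ranked, weights=ranks, k=1)[0]

def truncation_selection(population, fitness, m):
    """Rank the individuals and keep only the top m."""
    return sorted(population, key=fitness, reverse=True)[:m]

def tournament_selection(population, fitness, t, rng):
    """Pick 2 at random; the fitter one reproduces with probability t."""
    a, b = rng.sample(population, 2)
    fitter, other = (a, b) if fitness(a) >= fitness(b) else (b, a)
    return fitter if rng.random() < t else other

def elite(population, fitness):
    """Elitism: the best individual is carried over unchanged."""
    return max(population, key=fitness)

# Toy population where each individual is a number and fitness is its value.
rng = random.Random(0)
population = [3, 1, 4, 1, 5, 9, 2, 6]
identity = lambda x: x
print(truncation_selection(population, identity, 3))
print(tournament_selection(population, identity, 0.8, rng))
print(rank_selection(population, identity, rng))
print(elite(population, identity))
```

Because rank-based and tournament selection depend only on the ordering of fitness values, they are not affected by the two problems noted for plain fitness-proportional copying.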
23
Each generation: fitness evaluation of all individuals in the population, selective reproduction, crossover and mutation
Repeat for several generations – monitor average and best fitness: halt when fitness indicators stop increasing or satisfactory individual found.
24
Aim of using GAs: emergence of complex abilities from interaction between agent and environment
Difficult to establish what abilities and achievements are needed
Other approaches:
Incremental methods: change the fitness evaluation at different stages
Evolution and training: it can often work to both evolve and train.
25
Useful reference (extensive review): Yao, X. (1999) Evolving artificial neural networks. Proceedings of the IEEE, 87, 9, 1423-1447.
Possible to evolve: weights and learning parameters, architectures, learning rules.
26
Evolving weights and learning parameters
Advantage for evolutionary robotics: don’t need to specify the network response to each pattern
Synaptic weights encoded on genotype
Strings of real values, or binary values
GAs used to evolve the weights
Combining evolution with NN learning: use GAs to find initial values of weights for nets subsequently trained with backpropagation
27
Evolving Architectures
Indirect coding of network on genotype, e.g. number of nodes, probability of connecting them, type of activation function
E.g. Harp, Samad and Guha (1989): genetic string encodes blueprint for net. Blueprint consists of several segments, each corresponding to a layer.
Segments have 2 parts: (i) node properties (no. of units, activation functions, geometric layout) (ii) outgoing properties (connection density, learning rate etc.).
Blueprints decoded into networks, which are trained using Backpropagation.
28
Evolving learning rules
Evolvable hardware
E.g. Maris and Boekhorst (1996): sensor position evolved
29
5 min break……
30
Example of evolving robots
Floreano, D. and Mondada, F. (1994) Automatic creation of an autonomous agent: genetic evolution of a neural network driven robot. In D. Cliff, P. Husbands, J. Meyer and S.W. Wilson (Eds) From Animals to Animats 3: Proceedings of Third Conference on Simulation of Adaptive Behaviour. Cambridge, MA: MIT Press/Bradford Books.
Comparison of the evolution of simple navigation with a predesigned architecture
31
32
33
Predesigned architecture:
Braitenberg-type controller to perform straight motion and obstacle avoidance with Khepera robot
Positive connection between wheel and sensors on its own side: rotation speed of wheel proportional to activation of sensor
Negative connection between wheel and sensors on the opposite side: rotation speed of wheel is inversely proportional to sensor activation
Positive offset value to each wheel generates forward motion
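The wiring described above can be sketched as a pair of weighted sums. The gains, the offset, and the split of the sensors into a left and a right group are illustrative assumptions, and the "inversely proportional" contralateral connection is modelled here simply as a negative weight.

```python
def braitenberg_speeds(left_sensors, right_sensors,
                       excite=0.5, inhibit=0.5, offset=0.3):
    """Wheel speeds for a handcrafted obstacle-avoiding controller.

    Sensor activations are assumed to lie in [0, 1], higher = closer object.
    """
    left_activation = sum(left_sensors)
    right_activation = sum(right_sensors)
    # Same-side sensors speed a wheel up, opposite-side sensors slow it down,
    # and the positive offset keeps the robot moving forward in open space.
    left_wheel = offset + excite * left_activation - inhibit * right_activation
    right_wheel = offset + excite * right_activation - inhibit * left_activation
    return left_wheel, right_wheel

print(braitenberg_speeds([0.0, 0.0, 0.0], [0.0, 0.0, 0.0]))  # open space
print(braitenberg_speeds([0.8, 0.0, 0.0], [0.0, 0.0, 0.0]))  # object on the left
```

With an object on the left, the left wheel turns faster than the right one, so the robot steers to the right, away from the object.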
34
35
Weighted sum of incoming signals steers robot away from objects
But design needs prior knowledge of sensors, motors and environments
E.g. if sensor has lower response profile than other sensors, its outgoing connections require stronger weights.
36
Evolving a controller (Floreano and Mondada, 1994)
Goal: evolving a controller to maximise forward motion while avoiding obstacles
Fitness function based on 3 variables
37
Three components in fitness function to encourage: motion, straight displacement, obstacle avoidance
38
$\Phi = V\,(1 - \sqrt{\Delta v})\,(1 - i)$
Where V is the sum of the rotation speeds of the two wheels
Δv is the absolute value of the algebraic difference between the signed speed values of the wheels
i is the normalised activation value of the infrared sensor with the highest value
39
First component V is computed by summing the rotation speeds of the 2 wheels (direction of rotation given by sign of read value, and speed by its absolute value).
$\Phi = V\,(1 - \sqrt{\Delta v})\,(1 - i)$
40
Second component encourages the 2 wheels to rotate in the same direction. The higher the difference in rotation, the closer Δv will be to 1.
E.g. if the left wheel rotates backwards at speed -0.4, and the right wheel rotates forward at speed 0.5, Δv will be 0.9.
The square root gives stronger weight to smaller differences. Since the component is subtracted from 1, it is maximised by robots whose wheels move in the same direction, regardless of speed and overall direction.
$\Phi = V\,(1 - \sqrt{\Delta v})\,(1 - i)$
41
Last component encourages obstacle avoidance
Proximity sensors on Khepera emit a beam of infrared light – and measure quantity of reflected infrared light.
The closer a robot is to an object, the higher the measured value.
Value of i of most active sensor provides a measure of how close nearest object is.
Value subtracted from 1, so this component selects robots that stay away from objects.
Combined result of 3 components: selecting robots that move as straight as possible while avoiding obstacles in path.
$\Phi = V\,(1 - \sqrt{\Delta v})\,(1 - i)$
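The three components described above (V, the square root of Δv subtracted from 1, and i subtracted from 1) multiply together, which can be checked with a short sketch. It assumes wheel speeds scaled to the range -0.5 to 0.5, so that V and Δv both stay between 0 and 1, and sensor readings normalised to the range 0 to 1; the function name is hypothetical.

```python
import math

def fitness_step(left_speed, right_speed, sensor_values):
    """Fitness contribution of one sensory motor loop."""
    v = abs(left_speed) + abs(right_speed)     # motion
    delta_v = abs(left_speed - right_speed)    # difference between signed speeds
    i = max(sensor_values)                     # most active proximity sensor
    return v * (1.0 - math.sqrt(delta_v)) * (1.0 - i)

no_objects = [0.0] * 8

print(fitness_step(0.5, 0.5, no_objects))            # fast, straight, clear
print(fitness_step(-0.4, 0.5, no_objects))           # delta_v = 0.9
print(fitness_step(0.5, -0.5, no_objects))           # spinning on the spot
print(fitness_step(0.5, 0.5, [1.0] + [0.0] * 7))     # touching an object
```

Because the components are multiplied, a robot scores well only if it does all three things at once: spinning on the spot or driving straight into a wall both give a fitness of zero.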
42
Control system for robot
Fixed network architecture
Weights between 8 proximity sensors and 2 motor units (also bias unit)
Recurrent connections at output layer
Synaptic connections (weights) encoded as floating point numbers on chromosome
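One way to picture this encoding is the sketch below. The gene order on the chromosome, the logistic activation, and feeding each motor unit the previous output of both motor units are assumptions made for illustration; the slides give only the overall architecture.

```python
import math

N_SENSORS = 8
N_MOTORS = 2
# Per motor unit: 8 sensor weights + 1 bias weight + 2 recurrent weights.
GENES_PER_MOTOR = N_SENSORS + 1 + N_MOTORS
CHROMOSOME_LENGTH = N_MOTORS * GENES_PER_MOTOR

def decode(chromosome):
    """Split a flat list of floats into one weight row per motor unit."""
    return [chromosome[m * GENES_PER_MOTOR:(m + 1) * GENES_PER_MOTOR]
            for m in range(N_MOTORS)]

def step(weights, sensors, previous_outputs):
    """One sensory motor loop: sensors plus last outputs give new outputs."""
    inputs = list(sensors) + [1.0] + list(previous_outputs)  # 1.0 = bias unit
    outputs = []
    for row in weights:
        net = sum(w * x for w, x in zip(row, inputs))
        outputs.append(0.5 * (1.0 + math.tanh(net / 2.0)))   # logistic function
    return outputs

chromosome = [0.0] * CHROMOSOME_LENGTH
weights = decode(chromosome)
outputs = step(weights, sensors=[0.0] * N_SENSORS, previous_outputs=[0.0, 0.0])
print(CHROMOSOME_LENGTH, outputs)
```

Under these assumptions the chromosome holds 22 floating point numbers, and a GA can search over them directly without any target output being specified for each sensor pattern.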
43
Evolutionary experiments
Each generation: individuals allowed to move for 80 sensory motor loops
Each sensory motor loop lasted 30 ms
Each generation: 80 individuals
Evolved using roulette wheel selection, biased mutations, and one point crossover
After 50 generations: smooth navigation around the maze without bumping into walls.
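Roulette wheel selection gives each individual a slice of the wheel proportional to its fitness. The sketch below is a generic version, not the exact procedure of the experiments; the fallback to a uniform choice when all fitness values are zero is an added assumption.

```python
import random

def roulette_wheel(population, fitnesses, rng):
    """Select one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(population)   # assumption: uniform if nothing scores
    spin = rng.uniform(0.0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return individual
    return population[-1]               # guards against rounding at the edge

rng = random.Random(0)
population = ["a", "b"]
fitnesses = [1.0, 3.0]
draws = [roulette_wheel(population, fitnesses, rng) for _ in range(2000)]
print(draws.count("a"), draws.count("b"))
```

With fitness values 1 and 3, individual "b" should be drawn roughly three times as often as "a".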
44
45
46
Comparison
Comparison between handcrafted and evolved solution
Handcrafted: stopped when in looping maze (when two contralateral sensors receive the same inputs, their signals cancel each other out, and wheels don’t move)
Evolved solution: strong asymmetrical weights on recurrent connections to avoid deadlock situations
Conclusions: evolved solution is competitive with handcrafted solution.
Evolved solution emerges from interaction with environment
47
Dario Floreano
48
2nd example: Modularity
Reasons for modularity: complexity of task, improving performance, design, biological inspiration (and/or modelling)
Examples of modularity: behaviour-based robotics and subsumption architecture
Where do modules come from?
Explicit design, based on analysis/understanding of task
Evolving modularity: an example (Nolfi, 1997)
49
Evolving modularity
Nolfi (1997) Using emergent modularity to develop control systems for mobile robots. Adaptive Behaviour 5, 3-4, 343-364
Usual approach to behaviour-based robotics depends on breaking required behaviour down into subcomponents
50
Stefano Nolfi
51
Experiments: Khepera robot with gripper module
Task: removing rubbish from arena
Subtasks:
Recognise target object and orient towards it
Pick up target object
Move towards walls avoiding other target objects
Recognise walls, and orient so as to be able to drop object
Release object
52
Colombetti, Dorigo and Borghi (1996)
Studied similar task: robot collected food pieces and stored them in nest.
Decomposed target behaviour into simpler behaviours:
Leave nest
Get food
Reach nest
Avoid obstacles
Coordinate behaviours
53
Modules trained separately and frozen
Coordinator module trained to achieve behaviour
But system depended on human division of task into behavioural modules
54
Nolfi (1997): used GAs to evolve NNs for control
Used different network architectures.
55
56
57
Genetic algorithm
For each network architecture, 100 random weight genotypes
Run for 15 epochs, each epoch 200 actions
20 fittest reproduced (5 copies of NNs)
Mutations of NN: substituting 2% of randomly selected bits with new value
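The reproduction scheme above (100 genotypes, the 20 fittest each copied 5 times, 2% of bits replaced) can be sketched as follows. Treating the 2% as a per-bit replacement probability, and drawing the new value at random so that it may equal the old one, are assumptions about details the slide does not spell out.

```python
import random

def next_generation(population, fitnesses, rng,
                    n_parents=20, copies=5, mutation_rate=0.02):
    """Copy the fittest bit-string genotypes and mutate the copies."""
    ranked = [genotype for _, genotype in
              sorted(zip(fitnesses, population),
                     key=lambda pair: pair[0], reverse=True)]
    offspring = []
    for parent in ranked[:n_parents]:
        for _ in range(copies):
            child = [rng.randint(0, 1) if rng.random() < mutation_rate else bit
                     for bit in parent]
            offspring.append(child)
    return offspring

rng = random.Random(0)
# Hypothetical starting point: 100 random 64-bit genotypes, with a stand-in
# fitness (number of 1 bits) in place of the robot's measured performance.
population = [[rng.randint(0, 1) for _ in range(64)] for _ in range(100)]
fitnesses = [sum(genotype) for genotype in population]
offspring = next_generation(population, fitnesses, rng)
print(len(offspring))
```

The population size stays constant because 20 parents times 5 copies gives 100 offspring; no crossover is used in this sketch, matching the slide, which lists only copying and mutation.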
58
Fitness function
How many objects released outside arena?
Ability to pick up targets
Also useful training experiences: when robot carrying target, put new target in front of it.
59
Results
10 simulations for each of 5 architectures
100 networks in each generation, run for 1000 generations
Number of targets released out of arena improves
60
61
Evolution: run in simulation.
Best controllers of generation 999 downloaded onto robots and tested in real environment for 5000 cycles
62
63
Best results from “emergent modular architecture”
How does network work?
No clear relationship between distal description of behaviour and modules
Each sub-behaviour is the result of the contribution of different neural modules
64
Evolving communication for collective navigation
Marocco and Nolfi (2007) Emergence of communication in embodied agents evolved for the ability to solve a collective navigation problem.
A collection of simulated robots, evolved to solve a collective navigation problem, develops a communication system.
Simplified account of paper (some conditions missed out).
65
Davide Marocco
66
[Figure: robot neural controller, labelled with infrared sensors, ground sensors, communication (Com) sensors, motor units and Com unit]
Robots evolved to move onto target sites in pairs.
Developed signals:
a) lone robot outside target
b) single robot in target area
c) robot in target with another robot
d) robot approaching target, interacting with robot inside.
E.g. robot in target area hearing signal c would exit.
E.g. robot outside target hearing b would approach target and signal d.
67
Summary
Use of Genetic Algorithms (GAs) in combination with neural nets
Introduction to GAs
Genotype and phenotype
Fitness function and idea of selective reproduction
Genetic operators: crossover, mutation
Example studies
Example of evolving a robot controller: Floreano and Mondada (1994)
Modular approach to behaviours: example of evolving modular decomposition (Nolfi, 1997)
Marocco and Nolfi (2007): evolving communication for collective navigation