Adaptive Robotics COM2110 Autumn Semester 2008 Lecturer: Amanda Sharkey

1


2

Researchers at Georgia Tech have built a biologically inspired robot to perform actions of service dogs

Users issue verbal commands to robot, and indicate object with laser pointer.

E.g. fetching items, or closing doors or drawers. The researchers worked with dog trainers. The robot could replicate 10 commands and tasks.

Robots could reduce costs, and waiting times for service dogs

Companionship?

3

Assignment (30%)
Due by Monday 17th Nov at 11 am

Write an essay (1500-2500 words) on one of the following topics. You should use the lectures as a starting point, but also research the topic yourself. Plan your answer. Include a reference section, with the references cited in full.

1. Identify the main characteristics of Behaviour-based robotics, and contrast the approach with that of “Good old-fashioned AI”.

2. To what extent did Grey Walter’s robots, Elsie and Elmer, differ from the robots that preceded or followed them?

3. Explain how the concepts of “emergence” and “embodiment” are related to recent developments in robotics and artificial intelligence.

4

Assignment notes

Essay: try to structure it with introduction, body and conclusion. Plan out an argument, like pseudo-code.

Include references:
- In the text, either numerical [1], or by name, Cao et al (1997)
- In a Reference section at the end

Journal articles:
[1] Cao, Fukunaga and Kahn (1997) Cooperative mobile robotics: antecedents and directions. Autonomous Robots, 4, 1, 7-27.

And books:
Gordon, D. (1999) Ants at work: How an insect society is organised. W.W. Norton and Co., London.

Secondary citation:
Sharkey, A.J.C. (1800) How to cite references, cited in Gordon, D. (1999) Ants at work: How an insect society is organised. W.W. Norton and Co., London.

5

Motor outputs

Light sensor inputs

Bias unit

1. Training weights with delta rule to produce target outputs

2. Testing – presenting new inputs to (already trained) weights, to see what the output is. Generalisation performance is the percentage of the test set the trained net gets right.
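The two phases above (training, then testing for generalisation) can be sketched as follows. This is a minimal illustration, not the network from the lecture: the single output unit, the 0.5 decision threshold, the learning rate and the tiny light-sensor data set are all assumptions made for the example.

```python
import random

def train_delta(samples, n_inputs, lr=0.1, epochs=200, seed=0):
    """Train one linear output unit with the delta rule.

    samples: list of (inputs, target) pairs. A bias input of 1.0 is appended.
    """
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
    for _ in range(epochs):
        for inputs, target in samples:
            x = list(inputs) + [1.0]  # bias unit
            output = sum(w * xi for w, xi in zip(weights, x))
            error = target - output
            # Delta rule: change each weight in proportion to error * input.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    return weights

def predict(weights, inputs):
    """Threshold the linear output at 0.5 (an assumed decision rule)."""
    x = list(inputs) + [1.0]
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= 0.5 else 0

def generalisation(weights, test_set):
    """Percentage of the test set that the trained weights get right."""
    correct = sum(predict(weights, x) == t for x, t in test_set)
    return 100.0 * correct / len(test_set)

# Hypothetical task: output 1 when the left light sensor reads more than the right.
train_set = [((0.9, 0.1), 1), ((0.8, 0.3), 1), ((0.2, 0.7), 0), ((0.1, 0.9), 0)]
test_set = [((0.85, 0.15), 1), ((0.15, 0.85), 0)]  # inputs not seen in training
trained = train_delta(train_set, n_inputs=2)
print(generalisation(trained, test_set))
```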

6

Advantages of Neural Nets for robotics

They provide a straightforward mapping between sensors and motors

They are robust to noise (noisy sensors and environments)

They can provide a biologically plausible metaphor

They offer a relatively smooth search space (gradual changes in weights = gradual changes in behaviour)

7

Example: Hebbian learning for collision avoidance
From Pfeifer and Scheier (1999)

On-line learning

Network layers (figure labels): Motor action; Collision layer; Distance sensors

Fixed weights between collision layer and motor actions.

When collision occurs, and distance sensor active, Hebbian learning used to strengthen the weight between the two.

After learning – activating the distance sensor will result in the collision detector being activated.

The robot will learn to avoid objects.
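A minimal sketch of the Hebbian step described above, assuming two distance sensors, two collision units, a plain Hebbian update (learning rate times pre- and post-synaptic activity) and a firing threshold of 0.5. None of these values come from Pfeifer and Scheier; they are chosen only to make the idea concrete. The fixed weights from the collision layer to the motors are not modelled.

```python
def hebbian_step(weights, distance, collision, lr=0.2):
    """Strengthen weights[c][d] when collision unit c and distance sensor d are both active."""
    return [
        [w + lr * collision[c] * distance[d] for d, w in enumerate(row)]
        for c, row in enumerate(weights)
    ]

def collision_activation(weights, distance, collision, threshold=0.5):
    """A collision unit fires on a real bump, or when learned distance input is strong enough."""
    outputs = []
    for c, row in enumerate(weights):
        learned = sum(w * d for w, d in zip(row, distance))
        outputs.append(1.0 if collision[c] or learned >= threshold else 0.0)
    return outputs

# Rows: collision units (left, right). Columns: distance sensors (left, right).
weights = [[0.0, 0.0], [0.0, 0.0]]
before = collision_activation(weights, distance=[1.0, 0.0], collision=[0.0, 0.0])

# The robot bumps into objects on its left while the left distance sensor is active.
for _ in range(5):
    weights = hebbian_step(weights, distance=[1.0, 0.0], collision=[1.0, 0.0])

# After learning, the distance sensor alone activates the left collision unit.
after = collision_activation(weights, distance=[1.0, 0.0], collision=[0.0, 0.0])
print(before, after)
```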

8

Other methods of setting up ANN controller

Can use delta rule (for linearly separable patterns).

Or can use backpropagation learning rule to train multi-layer net.
Needs training set: how can you translate the behaviour of not bumping into objects into a training set?
See Sharkey (1998) for one method: write simple code (innate controller) for avoiding obstacles, and collect examples of inputs and outputs for training.

Or Genetic algorithms can be used to evolve neural net controller

9

10

Darwin …..

11

Genetic Algorithms: a form of evolutionary computation

A GA operates on a population of artificial chromosomes, selectively reproducing the chromosomes of better-performing individuals, with some random mutations.

Artificial chromosome (genotype) encodes characteristics of individual (phenotype)

E.g. could encode weights of artificial neural network

12

Fitness function:
Used to evaluate the performance of each phenotype.
Higher fitness values are better.
E.g. minimising the difference between the output of a function and a target: smaller difference = higher fitness.
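One simple way to turn "smaller difference" into "higher fitness" is sketched below. The 1 / (1 + error) mapping is an assumption chosen for illustration; any monotonically decreasing function of the error would serve.

```python
def fitness(outputs, targets):
    """Map the summed absolute error to a fitness in (0, 1]: smaller error, higher fitness."""
    error = sum(abs(o - t) for o, t in zip(outputs, targets))
    return 1.0 / (1.0 + error)

# A perfect match gets the maximum fitness; a worse match gets a lower one.
print(fitness([1.0, 0.0], [1.0, 0.0]))
print(fitness([0.5, 0.5], [1.0, 0.0]))
```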

13

Start with population of randomly generated chromosomes

Decode each chromosome and evaluate its fitness

Apply genetic operators to create new population:
  Crossover
  Mutation
  Selective reproduction

Repeat until desired individual found, or best fitness stops increasing.
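The steps above can be sketched as a single loop. This is an illustrative sketch only: it assumes bit-string chromosomes used directly as the phenotype, non-negative fitness values, an even population size, and fitness-proportionate reproduction. It halts on a target fitness or a generation limit rather than detecting a fitness plateau. The function name `evolve` and its parameters are hypothetical.

```python
import random

def evolve(fitness, genome_len, pop_size=20, max_generations=100,
           target=None, p_mut=0.02, seed=0):
    """Evolve bit strings; returns (best individual, history of best-so-far fitness)."""
    rng = random.Random(seed)
    # Start with a population of randomly generated chromosomes.
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    best, history = None, []
    for _ in range(max_generations):
        # Evaluate the fitness of every individual.
        scores = [fitness(ind) for ind in pop]
        gen_best = max(zip(scores, pop), key=lambda pair: pair[0])
        if best is None or gen_best[0] > best[0]:
            best = gen_best
        history.append(best[0])
        if target is not None and best[0] >= target:
            break  # desired individual found
        # Selective reproduction: fitter individuals are copied more often.
        parents = rng.choices(pop, weights=[s + 1e-9 for s in scores], k=pop_size)
        # One-point crossover on consecutive pairs of selected parents.
        offspring = []
        for a, b in zip(parents[0::2], parents[1::2]):
            point = rng.randrange(1, genome_len)
            offspring += [a[:point] + b[point:], b[:point] + a[point:]]
        # Mutation: flip each bit with a small probability.
        pop = [[1 - bit if rng.random() < p_mut else bit for bit in ind]
               for ind in offspring]
    return best[1], history

# Toy problem: maximise the number of 1s in a 12-bit chromosome.
best, history = evolve(fitness=sum, genome_len=12, target=12)
print(sum(best), len(history))
```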

14

15

Genetic operators: Crossover and Mutation

New population created by selective reproduction.

Offspring are randomly paired, crossed over and mutated

One point crossover: for a pair of chromosomes, select a random point at which to exchange genetic material between the two individuals.
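A minimal sketch of one-point crossover on list-based chromosomes, assuming both parents have the same length (at least 2). The function name is hypothetical.

```python
import random

def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    point = rng.randrange(1, len(parent_a))  # cut somewhere strictly inside the chromosome
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2

rng = random.Random(1)
child_1, child_2 = one_point_crossover([0] * 8, [1] * 8, rng)
print(child_1, child_2)
```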

16

17

18

Genetic operators: Mutation

For binary representations, switch value of selected bits

For real value representations, increment by small number randomly extracted from distribution centred round zero.

Other representations: substitute selected location with symbol randomly chosen from same alphabet
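The three mutation variants above might look like this. The per-gene mutation rate and the Gaussian noise (mean zero, standard deviation `sigma`) are illustrative assumptions; the slides only require a distribution centred on zero.

```python
import random

def mutate_bits(chromosome, rate, rng):
    """Binary representation: flip each selected bit."""
    return [1 - g if rng.random() < rate else g for g in chromosome]

def mutate_reals(chromosome, rate, sigma, rng):
    """Real-valued representation: add a small number drawn from a zero-centred distribution."""
    return [g + rng.gauss(0.0, sigma) if rng.random() < rate else g for g in chromosome]

def mutate_symbols(chromosome, rate, alphabet, rng):
    """Other representations: replace a selected gene with a random symbol from the same alphabet."""
    return [rng.choice(alphabet) if rng.random() < rate else g for g in chromosome]

rng = random.Random(0)
print(mutate_bits([0, 1, 0, 1], 0.5, rng))
print(mutate_reals([0.5, -0.5], 0.5, 0.1, rng))
print(mutate_symbols(list("abca"), 0.5, "abc", rng))
```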

19

20

If evolving NN weights, crossover may not work – mutation often more effective.

21

Genetic operators: Selective reproduction

Making copies of best individuals – more copies in next generation.

Problem: method breaks down when all individuals have similar fitness values (becomes like random search),

or when one or two have higher fitness values than the rest of the population (they dominate)

22

Solutions

Scale fitness values to enhance, or decrease individual differences

Rank-based selection: probability of making offspring proportional to rank

Truncation selection: ranking individuals and selecting top M individuals

Tournament-based selection: randomly select 2 individuals and generate a random number between 0 and 1. If the number is smaller than a predefined parameter T, the fitter individual makes offspring; otherwise the other individual reproduces.

Elitism: maintain best individual for next population.
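The selection schemes listed above can be sketched as below. Function names are hypothetical, and rank 1 is assigned to the worst individual so that selection probability grows with rank.

```python
import random

def rank_select(pop, fitness, rng):
    """Rank-based: probability of reproducing is proportional to rank (worst = 1)."""
    ranked = sorted(pop, key=fitness)
    ranks = list(range(1, len(ranked) + 1))
    return rng.choices(ranked, weights=ranks, k=1)[0]

def truncation_select(pop, fitness, m):
    """Truncation: rank the individuals and keep the top M."""
    return sorted(pop, key=fitness, reverse=True)[:m]

def tournament_select(pop, fitness, t, rng):
    """Tournament: pick 2 at random; the fitter one reproduces with probability T."""
    a, b = rng.sample(pop, 2)
    fitter, other = (a, b) if fitness(a) >= fitness(b) else (b, a)
    return fitter if rng.random() < t else other

def elite(pop, fitness):
    """Elitism: the best individual is carried over unchanged."""
    return max(pop, key=fitness)

rng = random.Random(0)
population = [1, 5, 3, 9]   # toy individuals
score = lambda x: x         # toy fitness: the value itself
print(truncation_select(population, score, 2), elite(population, score))
print(rank_select(population, score, rng), tournament_select(population, score, 0.8, rng))
```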

23

Each generation:
  Fitness evaluation of all individuals in population
  Selective reproduction
  Crossover and mutation

Repeat for several generations – monitor average and best fitness: halt when fitness indicators stop increasing or satisfactory individual found.

24

Aim of using GAs: emergence of complex abilities from interaction between agent and environment

Difficult to establish what abilities and achievements are needed

Other approaches:
  Incremental methods: change fitness evaluation at different stages
  Evolution and training: can often work to both evolve and to train.

25

Useful reference (extensive review):
Yao, X. (1999) Evolving artificial neural networks. Proceedings of the IEEE, 87, 9, 1423-1447

Possible to evolve:
  Weights and learning parameters
  Architectures
  Learning rules

26

Evolving weights and learning parameters

Advantage for evolutionary robotics: don’t need to specify network response to each pattern
Synaptic weights encoded on genotype
Strings of real values, or binary values
GAs used to evolve the weights
Combining evolution with NN learning: use GAs to find initial values of weights for nets subsequently trained with backpropagation
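A sketch of how a binary genotype could be decoded into synaptic weights. The 8 bits per weight and the weight range [-1, 1] are assumptions for illustration; a real-valued genotype would skip `decode_binary` and be reshaped directly.

```python
def decode_binary(bits, bits_per_weight=8, low=-1.0, high=1.0):
    """Decode consecutive groups of bits into real-valued weights in [low, high]."""
    weights = []
    for start in range(0, len(bits), bits_per_weight):
        chunk = bits[start:start + bits_per_weight]
        value = int("".join(str(b) for b in chunk), 2)
        weights.append(low + (high - low) * value / (2 ** bits_per_weight - 1))
    return weights

def to_matrix(weights, n_outputs, n_inputs):
    """Arrange a flat weight list as one row of input weights per output unit."""
    return [weights[r * n_inputs:(r + 1) * n_inputs] for r in range(n_outputs)]

genotype = [0] * 8 + [1] * 8    # two weights, 8 bits each
print(decode_binary(genotype))  # smallest and largest representable weight
print(to_matrix([1, 2, 3, 4, 5, 6], n_outputs=2, n_inputs=3))
```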

27

Evolving Architectures

Indirect coding of network on genotype, e.g. number of nodes, probability of connecting them, type of activation function

E.g. Harp, Samad and Guha (1989): genetic string encodes blueprint for net. Blueprint consists of several segments, each corresponding to a layer.

Segments have 2 parts: (i) node properties (no. of units, activation functions, geometric layout) (ii) outgoing properties (connection density, learning rate etc).

Blueprints decoded into networks, which are trained using Backpropagation.

28

Evolving learning rules

Evolvable hardware
E.g. Maris and Boekhorst (1996): sensor position evolved

29

5 min break……

30

Example of evolving robots

Floreano, D. and Mondada, F. (1994) Automatic creation of an autonomous agent: genetic evolution of a neural network driven robot. In D. Cliff, P. Husbands, J. Meyer and S.W. Wilson (Eds) From Animals to Animats 3: Proceedings of Third Conference on Simulation of Adaptive Behaviour. Cambridge, MA: MIT Press/Bradford Books

Comparison of evolved simple navigation with a predesigned architecture

31

32

33

Predesigned architecture:
Braitenberg-type controller to perform straight motion and obstacle avoidance with Khepera robot

Positive connection between wheel and sensors on its own side: rotation speed of wheel proportional to activation of sensor

Negative connection between wheel and sensors on the opposite side: rotation speed of wheel is inversely proportional to sensor activation

Positive offset value to each wheel generates forward motion
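The wiring described above can be sketched as a weighted sum per wheel. The gain and offset values are assumptions, and "inversely proportional" is read here as a negative weight, so the wheel slows as the opposite-side sensors become more active.

```python
def braitenberg(left_sensors, right_sensors, gain=0.5, offset=0.5):
    """Return (left_wheel, right_wheel) speeds for a Braitenberg-type avoider."""
    left_act = sum(left_sensors)
    right_act = sum(right_sensors)
    # Positive weight from same-side sensors, negative weight from opposite-side sensors,
    # plus a positive offset that produces forward motion when nothing is sensed.
    left_wheel = offset + gain * left_act - gain * right_act
    right_wheel = offset + gain * right_act - gain * left_act
    return left_wheel, right_wheel

print(braitenberg([0.0], [0.0]))  # no obstacle: both wheels run at the offset speed
print(braitenberg([0.8], [0.0]))  # obstacle on the left: left wheel faster, robot turns right
```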

34

35

Weighted sum of incoming signals steers robot away from objects

But design needs prior knowledge of sensors, motors and environments

E.g. if sensor has lower response profile than other sensors, its outgoing connections require stronger weights.

36

Evolving a controller (Floreano and Mondada, 1994)

Goal: evolving a controller to maximise forward motion while avoiding obstacles

Fitness function based on 3 variables

37

Three components in fitness function to encourage:
  Motion
  Straight displacement
  Obstacle avoidance

38

$$\Phi = V \left(1 - \sqrt{\Delta v}\right)(1 - i)$$

Where V is the sum of the rotation speeds of the two wheels

Δv is the absolute value of the algebraic difference between the signed speed values of the wheels

i is the normalised activation value of the infrared sensor with the highest value

39

First component V is computed by summing the rotation speeds of the 2 wheels (direction of rotation given by sign of read value, and speed by its absolute value).

$$\Phi = V \left(1 - \sqrt{\Delta v}\right)(1 - i)$$

40

Second component encourages the 2 wheels to rotate in the same direction. The higher the difference in rotation, the closer Δv will be to 1.

E.g. if the left wheel rotates backwards at speed -0.4, and the right wheel rotates forward at speed 0.5, Δv will be 0.9.

The square root gives stronger weight to smaller differences. Since the component is subtracted from 1, it is maximised by robots whose wheels move in the same direction, regardless of speed and overall direction.
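Worked through for the example speeds, and for a much smaller difference of 0.01 (the second case is an added illustration, not from the slides), the second component comes out as:

```latex
\Delta v = |{-0.4} - 0.5| = 0.9
\quad\Rightarrow\quad 1 - \sqrt{0.9} \approx 1 - 0.949 = 0.051

\Delta v = 0.01
\quad\Rightarrow\quad 1 - \sqrt{0.01} = 1 - 0.1 = 0.9
```

Even a difference of 0.01 costs 10% of this component, which is the sense in which the square root weights small differences strongly. With identical wheel speeds Δv = 0 and the component takes its maximum value of 1.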

$$\Phi = V \left(1 - \sqrt{\Delta v}\right)(1 - i)$$

41

Last component encourages obstacle avoidance

Proximity sensors on Khepera emit a beam of infrared light – and measure quantity of reflected infrared light.

Closer a robot is to an object, the higher the measured value.

Value of i of most active sensor provides a measure of how close nearest object is.

Value subtracted from 1, so this component selects robots that stay away from objects.

Combined result of 3 components: selecting robots that move as straight as possible while avoiding obstacles in path.

$$\Phi = V \left(1 - \sqrt{\Delta v}\right)(1 - i)$$
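The three components described above can be combined in a few lines. This sketch makes assumptions the slides do not state: wheel speeds are signed values in [-0.5, 0.5], V is taken as the sum of their absolute values, and sensor readings are already normalised to [0, 1].

```python
import math

def fitness_step(left_speed, right_speed, sensor_values):
    """Fitness for one sensory-motor loop: V * (1 - sqrt(delta_v)) * (1 - i)."""
    v = abs(left_speed) + abs(right_speed)     # motion
    delta_v = abs(left_speed - right_speed)    # 0 when the wheels turn identically
    i = max(sensor_values)                     # most active proximity sensor
    return v * (1.0 - math.sqrt(delta_v)) * (1.0 - i)

clear = [0.0] * 8
print(fitness_step(0.5, 0.5, clear))             # straight, fast, nothing nearby
print(fitness_step(-0.4, 0.5, clear))            # wheels opposed: spinning on the spot
print(fitness_step(0.5, 0.5, [1.0] + [0.0] * 7)) # touching an obstacle
```

An individual's overall fitness would then be accumulated over all of its sensory-motor loops.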

42

Control system for robot

Fixed network architecture
Weights between 8 proximity sensors and 2 motor units (also bias unit)
Recurrent connections at output layer
Synaptic connections (weights) encoded as floating point numbers on chromosome
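A sketch of such a controller, under assumptions not given on the slide: each motor unit receives 8 sensor weights, 2 recurrent weights (from both motor units' previous outputs) and 1 bias, laid out consecutively on the chromosome, and the activation is a sigmoid shifted to give signed speeds in (-0.5, 0.5).

```python
import math

N_SENSORS, N_MOTORS = 8, 2
PER_MOTOR = N_SENSORS + N_MOTORS + 1   # sensor weights + recurrent weights + bias
GENES = N_MOTORS * PER_MOTOR           # 22 floating point genes (assumed layout)

def decode(chromosome):
    """Split a flat list of floats into the weights of each motor unit."""
    units = []
    for m in range(N_MOTORS):
        genes = chromosome[m * PER_MOTOR:(m + 1) * PER_MOTOR]
        units.append({
            "sensor": genes[:N_SENSORS],
            "recurrent": genes[N_SENSORS:N_SENSORS + N_MOTORS],
            "bias": genes[-1],
        })
    return units

def step(units, sensors, prev_outputs):
    """One sensory-motor loop: new motor outputs from sensors and previous outputs."""
    outputs = []
    for unit in units:
        net = unit["bias"]
        net += sum(w * s for w, s in zip(unit["sensor"], sensors))
        net += sum(w * o for w, o in zip(unit["recurrent"], prev_outputs))
        outputs.append(1.0 / (1.0 + math.exp(-net)) - 0.5)
    return outputs

units = decode([0.0] * GENES)
print(step(units, [0.0] * N_SENSORS, [0.0, 0.0]))
```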

43

Evolutionary experiments

Each generation: individuals allowed to move for 80 sensory motor loops
Each sensory motor loop lasted 30 ms
Each generation: 80 individuals
Evolved using roulette wheel selection, biased mutations, and one point crossover
After 50 generations: smooth navigation around the maze without bumping into walls.
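Roulette wheel selection gives each individual a slice of the wheel proportional to its fitness. A minimal sketch, assuming non-negative fitness values with at least one above zero (this is the generic technique, not the authors' code):

```python
import random

def roulette_wheel(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    spin = rng.random() * sum(fitnesses)   # where the wheel stops, in [0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return individual
    return population[-1]                  # guard against floating-point rounding

rng = random.Random(0)
picks = [roulette_wheel(["weak", "strong"], [1.0, 9.0], rng) for _ in range(1000)]
print(picks.count("weak"), picks.count("strong"))
```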

44

45

46

Comparison

Comparison between handcrafted, and evolved solution

Handcrafted: stopped when in looping maze (when two contralateral sensors receive the same inputs, their signals cancel each other out, and wheels don’t move)

Evolved solution: strong asymmetrical weights on recurrent connections to avoid deadlock situations

Conclusions:
  Evolved solution is competitive with handcrafted solution.
  Evolved solution emerges from interaction with environment

47

Dario Floreano

48

2nd example: Modularity

Reasons for modularity:
  Complexity of task
  Improving performance
  Design
  Biological inspiration (and/or modelling)

Examples of modularity:
  Behaviour-based robotics and subsumption architecture

Where do modules come from?
  Explicit design, based on analysis/understanding of task
  Evolving modularity: an example (Nolfi, 1997)

49

Evolving modularity

Nolfi (1997) Using emergent modularity to develop control systems for mobile robots. Adaptive Behaviour 5, 3-4, 343-364

Usual approach to behaviour-based robotics depends on breaking required behaviour down into subcomponents

50

Stefano Nolfi

51

Experiments: Khepera robot with gripper module

Task: removing rubbish from arena

Subtasks:
  Recognise target object and orient towards it
  Pick up target object
  Move towards walls avoiding other target objects
  Recognise walls, and orient so as to be able to drop object
  Release object

52

Colombetti, Dorigo and Borghi (1996)

Studied similar task: robot collected food pieces and stored them in nest.

Decomposed target behaviour into simpler behaviours:
  Leave nest
  Get food
  Reach nest
  Avoid obstacles
  Coordinate behaviours

53

Modules trained separately and frozen
Coordinator module trained to achieve behaviour
But system depended on human division of task into behavioural modules

54

Nolfi (1997): used GAs to evolve NNs for control

Used different network architectures.

55


56

57

Genetic algorithm

For each network architecture, 100 random weight genotypes
Run for 15 epochs, each epoch 200 actions
20 fittest reproduced (5 copies of NNs)
Mutations of NN: substituting 2% of randomly selected bits with new value
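The reproduction scheme above (the 20 fittest of 100 each make 5 copies, which are then mutated) can be sketched as follows. The genotype length and the reading of "2% of randomly selected bits" as a 0.02 per-bit probability of being replaced by a fresh random bit are assumptions.

```python
import random

def next_generation(population, scores, n_parents=20, n_copies=5, p_mut=0.02, rng=None):
    """Copy each of the fittest genotypes n_copies times, mutating bits in the copies."""
    rng = rng or random.Random()
    ranked = sorted(zip(scores, population), key=lambda pair: pair[0], reverse=True)
    parents = [genotype for _, genotype in ranked[:n_parents]]
    children = []
    for parent in parents:
        for _ in range(n_copies):
            # Each selected bit is substituted with a new random value.
            children.append([rng.randint(0, 1) if rng.random() < p_mut else bit
                             for bit in parent])
    return children

rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(32)] for _ in range(100)]
scores = [sum(genotype) for genotype in population]   # placeholder fitness
children = next_generation(population, scores, rng=rng)
print(len(children))
```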

58

Fitness function

How many objects released outside arena?
Ability to pick up targets
Also useful training experiences: when robot carrying target, put new target in front of it.

59

Results

10 simulations for each of 5 architectures
100 networks in each generation, run for 1000 generations
Number of targets released out of arena improves

60

61

Evolution: run in simulation.
Best controllers of generation 999 downloaded onto robots and tested in real environment for 5000 cycles

62

63

Best results from “emergent modular architecture”

How does network work?
No clear relationship between distal description of behaviour and modules
Each sub-behaviour is the result of the contribution of different neural modules

64

Evolving communication for collective navigation

Marocco and Nolfi (2007) Emergence of communication in embodied agents evolved for the ability to solve a collective navigation problem.

Collection of simulated robots, evolved to solve collective navigation problem, develop a communication system.

Simplified account of paper (some conditions missed out).

65

Davide Marocco

66

Network units (figure labels): Com unit; Motor units; Infrared sensors; Ground sensors; Com sensors

Robots evolved to move onto target sites in pairs

Developed signals:
a) lone robot outside target
b) single robot in target area
c) robot in target with another robot
d) robot approaching target, interacting with robot inside.

E.g. robot in target area hearing signal c would exit

E.g. robot outside target hearing b would approach target and signal d.

67

Summary

Use of Genetic Algorithms (GAs) in combination with neural nets
Introduction to GAs
  Genotype and phenotype
  Fitness function and idea of selective reproduction
  Genetic operators
    Crossover
    Mutation
Example studies
  Example of evolving a robot controller: Floreano and Mondada (1994)
  Modular approach to behaviours: example of evolving modular decomposition (Nolfi, 1997)
  Marocco and Nolfi (2007) evolving communication for collective navigation
