F. Kaplan, P. Oudeyer, E. Kubinyi and A. Miklosi

Mart van de Sanden

AIBO As a Digital Creature

Animal-like entertainment robot.

A companion.

How to teach it to do new things?

Train it like real pets?

Through interaction!

How Does it Work For Real Pets?

How to teach a dolphin to jump?

Show it to him? Explain it to him?

It needs to discover it on its own!

But what if the action is rare or complex?

We need to guide it!

The same goes for robots!

How Not To Do It

Chanting the command while pushing the dog to sit.

The dog has to split its attention between learning a new move and listening to the trainer.

Which part of the behavior is sit?

Often the command is given while the dog is still standing.

Then How?

First teach the behavior. Then add the command!

Modelling (or molding)

Physically manipulating the animal into the desired position.

Then give positive feedback.

Never used by professional trainers.

The dog is not actively involved.

Learning performance is poor.

Used for teaching industrial robots!

Not convenient for autonomous robots.

Not good for teaching complex movements.

Luring (or “magnet method”)

Same as modelling, but with the use of a lure.

Gives satisfactory results for real dogs.

Can only teach positions or simple movements.

Not really used with robots.

Capturing

Exploits behavior that the animal performs spontaneously.

Wait for the correct behavior and give a positive reinforcement.

Takes too much time when multiple commands need to be learned.

The use of imitation?

Animal anatomy mostly does not resemble ours.

Only higher animals (e.g. primates) are able to imitate.

Has been done with robots.

It can handle the learning of sequences of actions and rare behaviors.

Requires elaborate vision techniques.

Shaping

Breaks behaviors down into small steps.

Each step can be trained using any of the techniques mentioned above.

Clicker training!

Clicker Training

B.F. Skinner: Operant conditioning.

A Clicker emits a brief sharp sound.

The sound is associated with a primary reinforcer (food, toys, etc.).

It becomes a secondary reinforcer.

It will act as a positive cue.

Clicker Training

The clicker can be used to guide animals in the right direction, by sounding it only when the animal performs the desired behavior.

Clicker Training

Four steps:

1. Charging the clicker.

2. Getting the behavior.

3. Adding the command word.

4. Testing the behavior.

It can be used to learn rare behaviors.

It can be used to learn sequences of behaviors.

Discussion!

Would you want to train your robot this way, or would you rather program it with a computer?

Or would you build in another way of training? After all, clicker training does not exactly come naturally.

Robotic Clicker Training

Robot: Hierarchical schemata-based behavior model.

Behavior selection according to:

Opportunities in the environment

Natural instincts

Emotion of the robot

User expectation model (associative memory)
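A minimal sketch of how the four selection factors listed above could be combined. The slides do not say how the factors are weighted, so the plain sum, the `Schema` class, the field names and the numbers below are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Schema:
    """Hypothetical top-level behavior with one score per selection factor."""
    name: str
    opportunity: float   # how well the environment affords the behavior now
    instinct: float      # built-in drive toward the behavior
    emotion: float       # bias from the robot's current emotional state
    expectation: float   # what the user is believed to want

    def activation(self) -> float:
        # Assumption: an unweighted sum; the real model may combine differently.
        return self.opportunity + self.instinct + self.emotion + self.expectation


def select_behavior(schemata):
    """Pick the schema with the highest combined activation."""
    return max(schemata, key=lambda s: s.activation())


candidates = [
    Schema("chase_ball", opportunity=0.9, instinct=0.4, emotion=0.2, expectation=0.0),
    Schema("sit", opportunity=0.5, instinct=0.1, emotion=0.1, expectation=0.9),
    Schema("sleep", opportunity=0.2, instinct=0.6, emotion=0.1, expectation=0.0),
]
chosen = select_behavior(candidates)
print(chosen.name)
```

In this toy setup a strong user expectation is enough to outweigh a more attractive opportunity in the environment, which is the lever that training pulls on.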

Charging the Clicker

Primary reinforcer -> event within 5 seconds.

After 30 times it becomes a secondary reinforcer.

TRAINER scratches the robot’s head and says “Good”.

ROBOT learns association in user’s expectation module.

TRAINER scratches the robot’s head and says “Good”.

ROBOT learns association in user’s expectation module.

Etc.
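The charging step can be sketched as a simple pairing counter. The 5 second window and the count of 30 come from the slide; the class name, the method names and the assumption that the event is observed shortly before (or together with) the primary reinforcer are hypothetical.

```python
from collections import defaultdict

PAIRING_WINDOW_S = 5.0   # from the slide: event within 5 seconds
PAIRINGS_NEEDED = 30     # from the slide: after 30 times


class ClickerCharger:
    """Hypothetical pairing counter inside the user expectation module."""

    def __init__(self):
        self.pairings = defaultdict(int)   # event -> number of pairings so far
        self.last_seen = {}                # event -> time it was last observed

    def observe_event(self, event, t):
        """Record a candidate event, such as hearing the word 'good'."""
        self.last_seen[event] = t

    def observe_primary_reinforcer(self, t):
        """Count a pairing for every event seen within the window before t."""
        for event, seen_at in list(self.last_seen.items()):
            if 0.0 <= t - seen_at <= PAIRING_WINDOW_S:
                self.pairings[event] += 1
                # Each occurrence of an event is paired at most once.
                del self.last_seen[event]

    def is_secondary_reinforcer(self, event):
        return self.pairings[event] >= PAIRINGS_NEEDED


charger = ClickerCharger()
for i in range(PAIRINGS_NEEDED - 1):
    charger.observe_event("good", t=10.0 * i)
    charger.observe_primary_reinforcer(t=10.0 * i + 1.0)   # head scratch
charged_early = charger.is_secondary_reinforcer("good")

charger.observe_event("good", t=1000.0)
charger.observe_primary_reinforcer(t=1001.0)
charged = charger.is_secondary_reinforcer("good")
```

An event that arrives more than 5 seconds before the head scratch is not counted, so only words that reliably accompany the primary reinforcer end up charged.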

Guiding the Robot

The robot starts out just doing what it wants to do.

When the trainer says “good”, the training module reinforces the current top-level schemata.

This means that the robot does the underlying behaviors more often.
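One way to picture the guiding step is a weighted random choice whose weights grow each time the trainer says “good”. This is only a sketch: the `TrainingModule` class, the additive `boost` and the behavior names are assumptions, not the robot's actual mechanism.

```python
import random


class TrainingModule:
    """Hypothetical reinforcement of top-level schemata by selection weight."""

    def __init__(self, schema_names, boost=0.5):
        self.weights = {name: 1.0 for name in schema_names}
        self.boost = boost
        self.current = None

    def choose(self, rng):
        """The robot 'does what it wants': a weighted random pick."""
        names = list(self.weights)
        weights = [self.weights[n] for n in names]
        self.current = rng.choices(names, weights=weights, k=1)[0]
        return self.current

    def hear_good(self):
        """Reinforce whichever schema is currently active."""
        if self.current is not None:
            self.weights[self.current] += self.boost


rng = random.Random(0)
module = TrainingModule(["walk", "sit", "bark"])

# Simulated trainer: says "good" only when the robot happens to sit.
for _ in range(200):
    if module.choose(rng) == "sit":
        module.hear_good()

print(module.weights)
```

Because a reinforced schema is picked more often, it also gets more chances to be reinforced again, so the desired behavior gradually crowds out the others.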

Adding the Command Word

When a word is heard, the expectation module associates it with all the actions reinforced in the training session.

It creates a new schema for them.

A new schema has a confidence level.

After reaching a certain level it becomes permanent.
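The command-word step could look roughly like the sketch below. The slides do not say how confidence grows or what the threshold is, so the `confirm` method, the threshold of 3 and all names here are assumptions for illustration.

```python
from dataclasses import dataclass

PERMANENCE_THRESHOLD = 3   # hypothetical; the slides only say "a certain level"


@dataclass
class CommandSchema:
    word: str
    actions: tuple
    confidence: int = 0
    permanent: bool = False


class ExpectationModule:
    """Hypothetical association of a heard word with reinforced actions."""

    def __init__(self):
        self.session_reinforced = []   # actions reinforced in this session
        self.schemas = {}              # word -> CommandSchema

    def reinforce(self, action):
        if action not in self.session_reinforced:
            self.session_reinforced.append(action)

    def hear_word(self, word):
        """Create a schema linking the word to the session's reinforced actions."""
        if word not in self.schemas:
            self.schemas[word] = CommandSchema(word, tuple(self.session_reinforced))
        return self.schemas[word]

    def confirm(self, word):
        """Assumption: each successful, rewarded use raises confidence by one."""
        schema = self.schemas[word]
        schema.confidence += 1
        if schema.confidence >= PERMANENCE_THRESHOLD:
            schema.permanent = True
        return schema


expectation = ExpectationModule()
expectation.reinforce("bend_hind_legs")
expectation.reinforce("sit")
schema = expectation.hear_word("sit down")
for _ in range(PERMANENCE_THRESHOLD):
    expectation.confirm("sit down")
```

Keeping new schemas provisional until they are confirmed means a word overheard by accident does not immediately become a lasting command.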
