Learning Reactive Behavior in Autonomous Vehicles:
SAMUEL
• Sanaa Kamari
SAMUEL
• Computer system that learns reactive behavior for autonomous vehicles.
  – Reactive behavior is the set of actions taken by an AV as a reaction to sensor readings.
• Uses a genetic algorithm to improve decision-making rules.
• Each individual in SAMUEL is an entire rule set or strategy.
Motivation for SAMUEL
• Learning facilitates extraction of rules from the expert.
• Rules are context-based => impossible to account for every situation in advance.
  – Given a set of conditions, the system is able to learn the rules of operation by observing and recording its own actions.
• SAMUEL uses a simulation environment to learn.
SAMUEL
• Problem-specific module:
  – The world model and its interface.
  – Set of internal and external sensors.
  – Controllers that control the AV simulator.
  – Critic component that judges the success or failure of the AV.
[1]
SAMUEL (cont.)
• Performance module:
  – Matches the rules.
  – Performs conflict resolution.
  – Assigns strength values to the rules.
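As a rough illustration (not SAMUEL's actual implementation; all names and data structures here are made up), rule matching with strength-based conflict resolution could be sketched like this:

```python
# Hypothetical sketch of the performance module's job: find the rules whose
# conditions match the current sensor readings, then resolve conflicts by
# firing the strongest matching rule.

def matches(rule, sensors):
    """A rule matches when every sensed value falls in its condition range."""
    return all(lo <= sensors[name] <= hi
               for name, (lo, hi) in rule["conditions"].items())

def resolve(rules, sensors):
    """Conflict resolution: return the action of the strongest matching rule."""
    matched = [r for r in rules if matches(r, sensors)]
    if not matched:
        return None
    return max(matched, key=lambda r: r["strength"])["action"]

# Two overlapping rules: both match a close obstacle, the stronger one wins.
rules = [
    {"conditions": {"front_sonar": (0, 20)},  "action": {"turn": -24}, "strength": 0.8},
    {"conditions": {"front_sonar": (0, 100)}, "action": {"turn": 0},   "strength": 0.5},
]
print(resolve(rules, {"front_sonar": 10}))  # {'turn': -24}
```

In SAMUEL the strength values themselves are adjusted from experience; here they are fixed constants for the sake of the sketch.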
• Learning module:
  – Uses a GA to develop reactive behavior as a set of condition-reaction rules.
• The GA searches for the behavior that exhibits the best performance:
  – Behaviors are evaluated in the world model.
  – Behaviors are selected for duplication and modification. [1]
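The evaluate / select / duplicate-and-modify cycle can be sketched as a minimal GA loop. This is an illustrative toy, not the authors' code: each individual stands in for a whole strategy, and the fitness function is a stand-in for the episodes SAMUEL would run in the world model.

```python
import random

random.seed(0)  # deterministic toy run

def evaluate(strategy):
    """Stand-in fitness; in SAMUEL this is performance in the world model."""
    return -sum((g - 0.5) ** 2 for g in strategy)

def mutate(strategy, rate=0.2):
    """Modification step: perturb a few of the strategy's parameters."""
    return [g + random.gauss(0, 0.1) if random.random() < rate else g
            for g in strategy]

def ga_step(population):
    """Evaluate all strategies, keep the better half, duplicate and modify it."""
    ranked = sorted(population, key=evaluate, reverse=True)
    survivors = ranked[: len(ranked) // 2]
    children = [mutate(list(s)) for s in survivors]
    return survivors + children

population = [[random.random() for _ in range(5)] for _ in range(20)]
for _ in range(30):
    population = ga_step(population)
best = max(population, key=evaluate)
```

Because survivors are carried over unchanged, the best fitness in the population never decreases from one generation to the next.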
Experiment Domain: Autonomous Underwater Vehicle navigation and collision avoidance
• Training the AUV simulator by virtually positioning it in the center of a field with 25 mines, and an objective outside the field.
• 2D AUV must navigate through a dense mine field toward a stationary object.
• AUV Actions: set speed and direction each decision cycle.
• The system does not learn a path, but a set of rules that reactively decides a move at each step.
Experiment Results
• Great improvement with both static and moving mines.
• SAMUEL shows that reactive behavior can be learned.
[1]
Domain: Robot Continuous and Embedded Learning
• To create autonomous systems that continue to learn throughout their lives.
• To adapt a robot’s behavior in response to changes in its operating environment and capabilities.
• Experiment: the robot learns to adapt to failure in its sonar sensors.
Continuous and Embedded Learning Model
• Execution module: controls the robot’s interaction with its environment.
• Learning module: continuously tests new strategies for the robot against a simulation model of its environment.
[2]
Execution Model
• Includes a rule-based system that operates on reactive (stimulus-response) rules.
  – IF range = [35, 45] AND front sonar < 20 AND right sonar > 50 THEN SET turn = -24 (strength 0.8)
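The slide's example rule can be written down as data. This is only a sketch of one possible encoding (the field names and the `fire` helper are illustrative, not the system's real representation):

```python
# The stimulus-response rule from the slide, encoded as a dictionary of
# condition tests plus an action and a strength value.

rule = {
    "if": {
        "range":       lambda v: 35 <= v <= 45,
        "front_sonar": lambda v: v < 20,
        "right_sonar": lambda v: v > 50,
    },
    "then": {"turn": -24},
    "strength": 0.8,
}

def fire(rule, sensors):
    """Return the rule's action if every condition holds, else None."""
    if all(test(sensors[name]) for name, test in rule["if"].items()):
        return rule["then"]
    return None

print(fire(rule, {"range": 40, "front_sonar": 10, "right_sonar": 60}))  # {'turn': -24}
```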
• Monitor: identifies symptoms of sonar failure.
  – Measures sonar output and compares it to recent readings and the direction of motion.
  – Modifies the simulation used by the learning system to replicate the failure.
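One way such a monitor could work, sketched as a toy (the class, window size, and threshold are all assumptions, not the paper's design): flag a sonar whose output stops varying even though the robot is moving.

```python
from collections import deque

class SonarMonitor:
    """Toy failure monitor: a sonar whose readings barely change while the
    robot moves is suspected to have failed (e.g. a blinded sensor)."""

    def __init__(self, window=10, min_variation=0.5):
        self.recent = deque(maxlen=window)   # sliding window of readings
        self.min_variation = min_variation   # expected spread while moving

    def update(self, reading, robot_is_moving):
        """Return True when a failure is suspected."""
        self.recent.append(reading)
        if not robot_is_moving or len(self.recent) < self.recent.maxlen:
            return False  # not enough evidence yet
        return (max(self.recent) - min(self.recent)) < self.min_variation

# Normally varying readings while moving: no failure flagged.
m_ok = SonarMonitor()
for r in [30, 32, 28, 35, 31, 29, 33, 30, 34, 31]:
    healthy_flag = m_ok.update(r, robot_is_moving=True)

# Output stuck at one value while moving: failure suspected.
m_stuck = SonarMonitor()
for r in [30.0] * 10:
    stuck_flag = m_stuck.update(r, robot_is_moving=True)
```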
Learning Module
• Uses SAMUEL, which applies a genetic algorithm to improve the decision-making rules.
Experiment
• The task requires the robot to go from one side of a room to the other through an opening.
• The robot is placed randomly 4 ft from the back wall.
• The location of the opening is random.
• The center of the front wall is 12.5 ft from the back wall.
Experiment (cont)
• The robot begins with a set of default rules for moving toward the goal.
• Learning starts with a simulation in which all sonars are working.
• After an initial period, one or more sonars are blinded.
• The monitor detects the failed sonars, and the learning simulation is adjusted to reflect the failure.
• The population of competing strategies is re-initialized and learning continues.
• The online robot uses the best rules discovered by the learning system since the last change to the learning simulation model.
Experiment Results
• Robot in motion with all sensors intact: a) during the run and b) at the goal.
• Robot in motion after adapting to the loss of three sensors (front, front right, and right): a) during the run and b) at the goal.
[2]
Experiment Results
• a) Robot with full sensors passing directly through the doorway.
• b) Robot with front sonar covered.
• c) Robot after adapting to the covered sonar: it uses the side sonar to find the opening, then turns into the opening.
[2]
References
• [1] A. C. Schultz and J. J. Grefenstette, “Using a genetic algorithm to learn reactive behavior for autonomous vehicles,” in Proceedings of the AIAA Guidance, Navigation, and Control Conference, Hilton Head, SC, 1992.
• [2] A. C. Schultz and J. J. Grefenstette, “Continuous and embedded learning in autonomous vehicles: Adapting to sensor failures,” in Proceedings of SPIE, vol. 4024, pp. 55-62, 2000.