Learning Reactive Behavior in Autonomous Vehicles:
SAMUEL
• Sanaa Kamari
SAMUEL
• Computer system that learns reactive behavior for autonomous vehicles.
  – Reactive behavior is the set of actions taken by an AV as a reaction to sensor readings.
• Uses a genetic algorithm to improve decision-making rules.
• Each individual in SAMUEL is an entire rule set or strategy.
Motivation for SAMUEL
• Learning facilitates extraction of rules from the expert.
• Rules are context-based => impossible to account for every situation in advance.
  – Given a set of conditions, the system is able to learn the rules of operation by observing and recording its own actions.
• SAMUEL uses a simulation environment to learn.
SAMUEL
• Problem-specific module:
  – The world model and its interface.
  – Set of internal and external sensors.
  – Controllers that control the AV simulator.
  – Critic component that judges the success or failure of the AV.
[1]
SAMUEL (cont.)
• Performance module:
  – Matches the rules.
  – Performs conflict resolution.
  – Assigns strength values to the rules.
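As a rough illustration (not SAMUEL's actual implementation; all names and data structures here are made up), rule matching with strength-based conflict resolution could be sketched like this:

```python
# Hypothetical sketch of the performance module's job: find the rules whose
# conditions match the current sensor readings, then resolve conflicts by
# firing the strongest matching rule.

def matches(rule, sensors):
    """A rule matches when every sensed value falls in its condition range."""
    return all(lo <= sensors[name] <= hi
               for name, (lo, hi) in rule["conditions"].items())

def resolve(rules, sensors):
    """Conflict resolution: return the action of the strongest matching rule."""
    matched = [r for r in rules if matches(r, sensors)]
    if not matched:
        return None
    return max(matched, key=lambda r: r["strength"])["action"]

# Two overlapping rules: both match a close obstacle, the stronger one wins.
rules = [
    {"conditions": {"front_sonar": (0, 20)},  "action": {"turn": -24}, "strength": 0.8},
    {"conditions": {"front_sonar": (0, 100)}, "action": {"turn": 0},   "strength": 0.5},
]
print(resolve(rules, {"front_sonar": 10}))  # {'turn': -24}
```

In SAMUEL the strength values themselves are adjusted from experience; here they are fixed constants for the sake of the sketch.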
• Learning module:
  – Uses a GA to develop reactive behavior as a set of condition-reaction rules.
• The GA searches for the behavior that exhibits the best performance:
  – Behaviors are evaluated in the world model.
  – Behaviors are selected for duplication and modification. [1]
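The evaluate / select / duplicate-and-modify cycle can be sketched as a minimal GA loop. This is an illustrative toy, not the authors' code: each individual stands in for a whole strategy, and the fitness function is a stand-in for the episodes SAMUEL would run in the world model.

```python
import random

random.seed(0)  # deterministic toy run

def evaluate(strategy):
    """Stand-in fitness; in SAMUEL this is performance in the world model."""
    return -sum((g - 0.5) ** 2 for g in strategy)

def mutate(strategy, rate=0.2):
    """Modification step: perturb a few of the strategy's parameters."""
    return [g + random.gauss(0, 0.1) if random.random() < rate else g
            for g in strategy]

def ga_step(population):
    """Evaluate all strategies, keep the better half, duplicate and modify it."""
    ranked = sorted(population, key=evaluate, reverse=True)
    survivors = ranked[: len(ranked) // 2]
    children = [mutate(list(s)) for s in survivors]
    return survivors + children

population = [[random.random() for _ in range(5)] for _ in range(20)]
for _ in range(30):
    population = ga_step(population)
best = max(population, key=evaluate)
```

Because survivors are carried over unchanged, the best fitness in the population never decreases from one generation to the next.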
Experiment Domain: Autonomous Underwater Vehicle navigation and collision avoidance
• Training the AUV simulator by virtually positioning it in the center of a field with 25 mines, and an objective outside the field.
• 2D AUV must navigate through a dense mine field toward a stationary object.
• AUV Actions: set speed and direction each decision cycle.
• The system does not learn a path, but a set of rules that reactively decides a move at each step.
Experiment Results
• Great improvement with both static and moving mines.
• SAMUEL shows that reactive behavior can be learned.
[1]
Domain: Robot Continuous and Embedded Learning
• To create autonomous systems that continue to learn throughout their lives.
• To adapt a robot’s behavior in response to changes in its operating environment and capabilities.
• Experiment: the robot learns to adapt to failure in its sonar sensors.
Continuous and Embedded Learning Model
• Execution module: controls the robot’s interaction with its environment.
• Learning module: continuously tests new strategies for the robot against a simulation model of its environment.
[2]
Execution Model
• Includes a rule-based system that operates on reactive (stimulus-response) rules.
  – IF range = [35, 45] AND front sonar < 20 AND right sonar > 50 THEN SET turn = -24 (strength 0.8)
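The slide's example rule can be written down as data. This is only a sketch of one possible encoding (the field names and the `fire` helper are illustrative, not the system's real representation):

```python
# The stimulus-response rule from the slide, encoded as a dictionary of
# condition tests plus an action and a strength value.

rule = {
    "if": {
        "range":       lambda v: 35 <= v <= 45,
        "front_sonar": lambda v: v < 20,
        "right_sonar": lambda v: v > 50,
    },
    "then": {"turn": -24},
    "strength": 0.8,
}

def fire(rule, sensors):
    """Return the rule's action if every condition holds, else None."""
    if all(test(sensors[name]) for name, test in rule["if"].items()):
        return rule["then"]
    return None

print(fire(rule, {"range": 40, "front_sonar": 10, "right_sonar": 60}))  # {'turn': -24}
```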
• Monitor: identifies symptoms of sonar failure.
  – Measures sonar output and compares it to recent readings and the direction of motion.
  – Modifies the simulation used by the learning system to replicate the failure.
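One way such a monitor could work, sketched as a toy (the class, window size, and threshold are all assumptions, not the paper's design): flag a sonar whose output stops varying even though the robot is moving.

```python
from collections import deque

class SonarMonitor:
    """Toy failure monitor: a sonar whose readings barely change while the
    robot moves is suspected to have failed (e.g. a blinded sensor)."""

    def __init__(self, window=10, min_variation=0.5):
        self.recent = deque(maxlen=window)   # sliding window of readings
        self.min_variation = min_variation   # expected spread while moving

    def update(self, reading, robot_is_moving):
        """Return True when a failure is suspected."""
        self.recent.append(reading)
        if not robot_is_moving or len(self.recent) < self.recent.maxlen:
            return False  # not enough evidence yet
        return (max(self.recent) - min(self.recent)) < self.min_variation

# Normally varying readings while moving: no failure flagged.
m_ok = SonarMonitor()
for r in [30, 32, 28, 35, 31, 29, 33, 30, 34, 31]:
    healthy_flag = m_ok.update(r, robot_is_moving=True)

# Output stuck at one value while moving: failure suspected.
m_stuck = SonarMonitor()
for r in [30.0] * 10:
    stuck_flag = m_stuck.update(r, robot_is_moving=True)
```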
Learning Module
• Uses SAMUEL, which applies a genetic algorithm to improve the decision-making rules.
Experiment
• The task requires the robot to go from one side of a room to the other through an opening.
• The robot is placed randomly 4 ft from the back wall.
• The location of the opening is random.
• The center of the front wall is 12.5 ft from the back wall.
Experiment (cont)
• The robot begins with a set of default rules for moving toward the goal.
• Learning starts with a simulation in which all sonars are working.
• After an initial period, one or more sonars are blinded.
• The monitor detects the failed sonars, and the learning simulation is adjusted to reflect the failure.
• The population of competing strategies is re-initialized and learning continues.
• The online robot uses the best rules discovered by the learning system since the last change to the learning simulation model.
Experiment Results
• Robot in motion with all sensors intact: a) during the run and b) at the goal.
• Robot in motion after adapting to the loss of three sensors (front, front right, and right): a) during the run and b) at the goal.
[2]
Experiment Results
• a) Robot with full sensors passing directly through the doorway.
• b) Robot with front sonar covered.
• c) Robot after adapting to the covered sonar: it uses the side sonar to find the opening, then turns into the opening.
[2]
References
• [1] A. C. Schultz and J. J. Grefenstette, “Using a genetic algorithm to learn reactive behavior for autonomous vehicles,” in Proceedings of the AIAA Guidance, Navigation, and Control Conference, Hilton Head, SC, 1992.
• [2] A. C. Schultz and J. J. Grefenstette, “Continuous and embedded learning in autonomous vehicles: Adapting to sensor failures,” in Proceedings of SPIE, vol. 4024, pp. 55-62, 2000.