Action Selection in a hypothetical house robot: Using those RL numbers

Humphrys, Mark (1996) Action Selection in a hypothetical house robot: Using those RL numbers. [Conference Paper]


Reinforcement Learning (RL) methods, in contrast to many forms of machine learning, build up value functions for actions. That is, an agent not only knows `what' it wants to do, it also knows `how much' it wants to do it. Traditionally, the latter values are used to produce the former and are then ignored, since the agent is assumed to act alone. But these numbers contain useful information: they tell us how much the agent will suffer if its action is not executed (perhaps not much). They tell us which actions the agent can compromise on and which it cannot. It is clear that many interesting systems possess multiple parallel and conflicting goals, all demanding attention, and none of which can be fully satisfied except at the expense of others. Animals are the prime example of such systems. In [Humphrys, 1995], I introduced the W-learning algorithms, showing one method of resolving competition among behaviors automatically by reference to their RL values. The scheme has the unusual feature that behaviors are at all times in selfish pursuit of their own goals and have no explicit concept of cooperation, despite residing in the same body. In this paper, I apply W-learning to the world of a hypothetical house robot, which doubles as family toy, mobile security camera, mobile smoke alarm and occasional vacuum cleaner. I show how a W-learning community of behaviors inside the robot will support a robust behavior pattern, capable of opportunistic behavior, avoiding dithering, and allowing for the concept of default behavior and expression of low-priority goals.
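
The idea that RL values say how much a behavior suffers when it is not obeyed can be illustrated with a small sketch. This is not the W-learning algorithm of [Humphrys, 1995], which learns its weights from experience; it is a static simplification for a single state. The behavior names, action names, Q-values and the functions `suffering` and `select_action` are all hypothetical, chosen only for illustration.

```python
def suffering(q_values, executed_action):
    """Loss to one behavior if executed_action runs instead of its own favorite."""
    return max(q_values.values()) - q_values[executed_action]


def select_action(behaviors):
    """Pick the behavior that would lose the most if another behavior were obeyed.

    behaviors maps a behavior name to its Q-values {action: value} for the
    current state. Returns (winning behavior, action it executes).
    """
    # Each behavior selfishly prefers the action with its own highest Q-value.
    preferred = {name: max(q, key=q.get) for name, q in behaviors.items()}

    # Hypothetical weight: the worst loss a behavior faces if any rival wins.
    weights = {}
    for name, q in behaviors.items():
        rival_actions = [a for other, a in preferred.items() if other != name]
        weights[name] = max((suffering(q, a) for a in rival_actions), default=0.0)

    winner = max(weights, key=weights.get)
    return winner, preferred[winner]


# Made-up Q-values for one state of the house robot (assumption, not data
# from the paper).
behaviors = {
    "smoke_alarm": {"go_to_smoke": 10.0, "vacuum": 0.0, "play": 0.0},
    "vacuum": {"go_to_smoke": 1.0, "vacuum": 2.0, "play": 1.5},
    "toy": {"go_to_smoke": 0.0, "vacuum": 0.5, "play": 1.0},
}

print(select_action(behaviors))
```

With these made-up numbers the smoke alarm behavior wins, because it would lose far more than the others if it were overruled, while the vacuum and toy behaviors can compromise at little cost.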

Item Type: Conference Paper
Keywords: reactive systems, action selection, reinforcement learning, multi-behavior learning
Subjects: Biology > Animal Behavior
Biology > Ethology
Computer Science > Artificial Intelligence
Computer Science > Dynamical Systems
Computer Science > Machine Learning
Computer Science > Robotics
ID Code: 448
Deposited By: Humphrys, Mark
Deposited On: 09 Jun 1998
Last Modified: 11 Mar 2011 08:53

