Intrinsic Motivation Systems for Autonomous Mental Development

Oudeyer, Pierre-Yves and Kaplan, Frédéric and Hafner, Véréna (2007) Intrinsic Motivation Systems for Autonomous Mental Development. [Journal (Paginated)]

Exploratory activities seem to be intrinsically rewarding for children and crucial for their cognitive development. Can a machine be endowed with such an intrinsic motivation system? This is the question we study in this paper, presenting a number of computational systems that try to capture this drive towards novel or curious situations. After discussing related research coming from developmental psychology, neuroscience, developmental robotics, and active learning, this paper presents the mechanism of Intelligent Adaptive Curiosity, an intrinsic motivation system which pushes a robot towards situations in which it maximizes its learning progress. This drive makes the robot focus on situations which are neither too predictable nor too unpredictable, thus permitting autonomous mental development. The complexity of the robot's activities autonomously increases, and complex developmental sequences self-organize without being constructed in a supervised manner. Two experiments are presented illustrating the stage-like organization emerging with this mechanism. In one of them, a physical robot is placed on a baby play mat with objects that it can learn to manipulate. Experimental results show that the robot first spends time in situations which are easy to learn, then shifts its attention progressively to situations of increasing difficulty, avoiding situations in which nothing can be learned. Finally, these various results are discussed in relation to more complex forms of behavioral organization and data coming from developmental psychology.
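The idea of preferring situations with maximal learning progress can be illustrated with a small sketch. This is not the authors' implementation: it assumes that learning progress is estimated as the drop in mean prediction error between two consecutive windows of recent errors, and the names `learning_progress`, `pick_situation`, and the three example error histories are hypothetical. The mechanism described in the paper involves more than this (for example, learned predictors), which the sketch leaves out.

```python
from statistics import mean


def learning_progress(errors, window=5):
    """Drop in mean prediction error between the previous window and the latest one.

    Assumption for this sketch: progress is positive when errors are falling.
    """
    if len(errors) < 2 * window:
        return 0.0  # not enough history to estimate progress
    older = mean(errors[-2 * window:-window])
    recent = mean(errors[-window:])
    return older - recent


def pick_situation(error_histories, window=5):
    """Return the situation whose prediction error is currently falling fastest."""
    return max(
        error_histories,
        key=lambda name: learning_progress(error_histories[name], window),
    )


# Hypothetical prediction-error histories for three kinds of situation.
histories = {
    # Already mastered: error is flat and low, so there is no progress.
    "too_predictable": [0.01] * 10,
    # Essentially noise: error is flat and high, so there is no progress either.
    "too_unpredictable": [0.9, 0.8, 1.0, 0.9, 0.85, 0.95, 0.8, 1.0, 0.9, 0.85],
    # Learnable: error decreases steadily, so progress is high.
    "learnable": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1],
}

chosen = pick_situation(histories)
print(chosen)
```

Under these assumptions, both the flat low-error history and the flat high-error history yield roughly zero progress, so the selection falls on the situation whose error is still decreasing. This mirrors the abstract's point that the drive avoids situations that are either too predictable or too unpredictable.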

Item Type: Journal (Paginated)
Keywords: Active learning, autonomy, behavior, complexity, curiosity, development, developmental trajectory, epigenetic robotics, intrinsic motivation, learning, reinforcement learning, values.
Subjects: Computer Science > Dynamical Systems; Psychology > Developmental Psychology; Computer Science > Artificial Intelligence; Computer Science > Robotics
ID Code: 5473
Deposited By: Oudeyer, Pierre-Yves
Deposited On: 04 Apr 2007
Last Modified: 11 Mar 2011 08:56
