We propose approximating a Poincaré map of biped walking dynamics using Gaussian processes, and we locally optimize the parameters of a given biped walking controller based on the approximated map. Gaussian processes estimate a probability distribution over a target nonlinear function with a given covariance, so the optimization method can take the uncertainty of the approximated map into account throughout the learning process. We use reinforcement learning (RL) as the optimization method. Although RL is a useful nonlinear optimizer, it is usually difficult to apply to real robotic systems because of the large number of iterations required to acquire suitable policies. In this study, we first approximate the Poincaré map from data collected on a real robot, and then apply RL using the estimated map to optimize stepping and walking policies. We show that stepping and walking policies improve in both simulated and real environments, and we present experimental validation of the approach on a humanoid robot.
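As a rough illustration of the idea in the abstract, the sketch below fits a Gaussian process to noisy samples of a one-dimensional Poincaré return map and queries both a predictive mean and a predictive variance. The particular map `true_map`, the kernel hyperparameters, and the noise level are hypothetical stand-ins, not the paper's actual dynamics or settings; the point is only that the GP posterior variance gives the uncertainty signal that a policy-optimization step could exploit.

```python
import numpy as np

def rbf_kernel(A, B, ell=0.3, sf=1.0):
    # Squared-exponential covariance between two sets of 1-D section states.
    d = A[:, None] - B[None, :]
    return sf**2 * np.exp(-0.5 * (d / ell) ** 2)

# Hypothetical 1-D Poincaré return map x_{k+1} = f(x_k), e.g. a section
# coordinate of the walking dynamics with a stable fixed point.
def true_map(x):
    return 0.6 * x + 0.2 * np.sin(3.0 * x) + 0.1

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, 25)                          # observed section crossings
y = true_map(X) + 0.02 * rng.standard_normal(X.shape)   # noisy next crossings

sn = 0.02                                   # assumed observation-noise std
K = rbf_kernel(X, X) + sn**2 * np.eye(len(X))
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))

def gp_predict(Xs):
    # Posterior mean and variance of the approximated map at query states Xs.
    Ks = rbf_kernel(Xs, X)
    mu = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = rbf_kernel(Xs, Xs).diagonal() - np.sum(v**2, axis=0)
    return mu, var

xs = np.linspace(-1.0, 1.0, 9)
mu, var = gp_predict(xs)
# The predictive variance lets the optimizer discount regions of the
# section where the learned map is uncertain, which is the mechanism the
# abstract refers to when it says the uncertainty is "taken into account".
```

In the paper's setting the map would be fit from real-robot crossing data rather than a synthetic function, and the query states would come from rollouts of the stepping or walking policy being optimized.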
Autonomous Robots – Springer Journals
Published: Sep 1, 2009