Online learning is an important property of adaptive dynamic programming (ADP). Online observations carry rich information about the system dynamics, and ADP algorithms can use them to learn the optimal control policy. This paper reviews research on online ADP algorithms for the optimal control of continuous-time systems. Through intensive study, ADP has developed toward model-free and data-efficient methods. After introducing the algorithms separately, we compare their performance on the same problem. This paper aims to provide a comprehensive understanding of continuous-time online ADP algorithms.
Artificial Intelligence Review – Springer Journals
Published: Feb 24, 2017