Adaptive dynamic programming (ADP) is an important branch of reinforcement learning for solving various optimal control problems. Most practical nonlinear systems are controlled by more than one controller; each controller acts as a player, and the tradeoff between cooperation and conflict among these players can be viewed as a game. Multi-player games fall into two main categories: zero-sum games and non-zero-sum games. Obtaining the optimal control policy for each player requires solving the Hamilton–Jacobi–Isaacs equation for zero-sum games and a set of coupled Hamilton–Jacobi equations for non-zero-sum games. Unfortunately, these equations are generally difficult or even impossible to solve analytically. To overcome this bottleneck, two ADP methods are proposed in this paper: a modified gradient-descent-based online algorithm and a novel iterative offline learning approach. Furthermore, to implement the proposed methods we employ a single-network structure, which markedly reduces the computational burden compared with the traditional multiple-network architecture. Simulation results demonstrate the effectiveness of the proposed schemes.
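The gradient-descent idea behind such single-critic ADP schemes can be illustrated on a toy problem. The sketch below is an assumption-laden illustration, not the paper's algorithm: it uses a single-player scalar linear-quadratic system (dynamics, costs, and learning rate all chosen for demonstration), a one-weight critic V(x) ≈ w·x², and a plain gradient step that drives the Hamilton–Jacobi–Bellman residual toward zero, after which the learned weight can be checked against the known scalar Riccati solution.

```python
# Illustrative sketch only (assumed toy system, not the paper's method):
# single-critic gradient descent on the HJB residual for
#   x_dot = a*x + b*u,  cost = integral of q*x^2 + r*u^2 dt,
# with critic V(x) = w*x^2 and policy u = -(b*w/r)*x derived from the critic.
import math
import random

a, b = -1.0, 1.0   # assumed scalar dynamics
q, r = 1.0, 1.0    # assumed quadratic cost weights

w = 0.0            # single critic weight: V(x) ~ w * x^2
lr = 0.05
random.seed(0)

for _ in range(5000):
    x = random.uniform(-1.0, 1.0)          # sampled state
    u = -(b * w / r) * x                   # policy induced by the current critic
    # HJB residual: q x^2 + r u^2 + V'(x)(a x + b u), with V'(x) = 2 w x
    e = q * x**2 + r * u**2 + 2 * w * x * (a * x + b * u)
    # Full derivative de/dw (accounting for u's dependence on w) simplifies to:
    de_dw = 2 * x * x * (a - b * b * w / r)
    w -= lr * e * de_dw                    # gradient step on (1/2) e^2

# Exact weight from the scalar Riccati equation: q + 2 a w - b^2 w^2 / r = 0
w_star = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
print(round(w, 4), round(w_star, 4))
```

In this toy case the residual factors as e = x²(q + 2aw − b²w²/r), so the gradient iteration converges to the positive Riccati root; the paper's multi-player setting replaces this single HJB residual with HJI or coupled HJ residuals, but the critic-update mechanism is of the same gradient-descent flavor.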
Artificial Intelligence Review – Springer Journals
Published: Jan 12, 2018