M. Anthony, P. Bartlett (1999)
Neural Network Learning: Theoretical Foundations
P. Fischer, F. Hecht, Y. Maday (2005)
A Parareal in Time Semi-implicit Approximation of the Navier-Stokes Equations
P. Parpas, Corey Muir (2019)
Predict Globally, Correct Locally: Parallel-in-Time Optimal Control of Neural Networks. arXiv, abs/1902.02542
Dagmar Niebur (1995)
Artificial Neural Networks
T. Chen, Yulia Rubanova, J. Bettencourt, D. Duvenaud (2018)
Neural Ordinary Differential Equations
M. Gander, S. Vandewalle (2007)
Analysis of the Parareal Time-Parallel Time-Integration Method. SIAM J. Sci. Comput., 29
(1992)
Approximation and Learning Theory
Editorial Committee (ed.) (1979)
高等学校計算数学学報 (Numerical Mathematics)
Y. Maday (2009)
Symposium: Recent Advances on the Parareal in Time Algorithms, 1168
Y. Saad, M. Schultz (1986)
GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems. SIAM Journal on Scientific and Statistical Computing, 7
Kaiming He, X. Zhang, Shaoqing Ren, Jian Sun (2015)
Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Stefanie Günther, Lars Ruthotto, J. Schroder, E. Cyr, N. Gauger (2018)
Layer-Parallel Training of Deep Residual Neural Networks. arXiv, abs/1812.04352
R. Falgout, S. Friedhoff, T. Kolev, S. MacLachlan, J. Schroder (2013)
Parallel Time Integration with Multigrid. PAMM, 14
D. Bertsekas, J. Tsitsiklis (1999)
Gradient Convergence in Gradient Methods with Errors. SIAM J. Optim., 10
K. Giannakoglou, D. Papadimitriou (2008)
Adjoint Methods for Shape Optimization
M. Gander, Yaolin Jiang, Rong-Jian Li (2013)
Parareal Schwarz Waveform Relaxation Methods
J. Lions, Y. Maday, Gabriel Turinici (2001)
Résolution d'EDP par un schéma en temps « pararéel » [Solution of PDEs by a "parareal" time scheme]. Comptes Rendus de l'Académie des Sciences, Série I: Mathématique, 332
The introduction in 2015 of Residual Neural Networks (ResNets) enabled substantial performance improvements for learning algorithms on evolution problems with a "large" number of layers. Continuous-depth ResNet-like models, called Neural Ordinary Differential Equations (NODEs), were then introduced in 2018. These have a constant memory cost and avoid specifying the number of hidden layers a priori. In this paper, we derive and analyze a parallel (in parameters and time) version of NODEs, which potentially allows for a more efficient implementation than a standard/naive parallelization of NODEs with respect to the parameters only. We expect this approach to be relevant whenever a very large number of processors is available, or when dealing with high-dimensional ODE systems. Moreover, when implicit ODE solvers are used, solving the resulting nonlinear systems with, for instance, Newton's algorithm requires solving linear systems whose cost can be up to cubic in the problem dimension; since the proposed approach reduces the overall number of time steps, thanks to an iterative increase of the accuracy order of the ODE solvers, it also reduces the number of linear systems to solve, hence benefiting from a scaling effect.
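The abstract builds on the classical parareal scheme of Lions, Maday and Turinici (2001), cited above. The following is a minimal, serial Python sketch of that generic parareal iteration on a simple ODE, not the paper's parallel-in-parameters-and-time NODE algorithm; the function names and solver choices (explicit Euler as the coarse propagator, RK4 as the fine one) are illustrative assumptions.

```python
import numpy as np

def coarse(f, t0, t1, y):
    """Coarse propagator G: one explicit Euler step (cheap, low accuracy)."""
    return y + (t1 - t0) * f(t0, y)

def fine(f, t0, t1, y, m=100):
    """Fine propagator F: m RK4 steps (accurate, expensive)."""
    h = (t1 - t0) / m
    t = t0
    for _ in range(m):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

def parareal(f, y0, T, n_windows=10, n_iters=5):
    """Serial emulation of the parareal iteration.

    In an actual parallel run, the fine solves inside each correction
    sweep are independent and execute concurrently, one per time window.
    """
    ts = np.linspace(0.0, T, n_windows + 1)
    # Iteration 0: purely coarse, sequential sweep.
    y = [np.asarray(y0, dtype=float)]
    for n in range(n_windows):
        y.append(coarse(f, ts[n], ts[n + 1], y[n]))
    for _ in range(n_iters):
        # These fine/coarse solves on the old iterate are parallelizable.
        f_old = [fine(f, ts[n], ts[n + 1], y[n]) for n in range(n_windows)]
        g_old = [coarse(f, ts[n], ts[n + 1], y[n]) for n in range(n_windows)]
        # Sequential coarse correction: y_{n+1} = G(y_n) + F(y_n^old) - G(y_n^old).
        y_new = [y[0]]
        for n in range(n_windows):
            y_new.append(coarse(f, ts[n], ts[n + 1], y_new[n]) + f_old[n] - g_old[n])
        y = y_new
    return ts, y

# Example: y' = -y with exact solution y(t) = exp(-t).
ts, y = parareal(lambda t, y: -y, y0=[1.0], T=2.0)
print(y[-1], np.exp(-2.0))
```

The expensive fine solves dominate the cost and run concurrently across time windows; only the cheap coarse sweep remains sequential, which is where the potential speed-up over step-by-step integration comes from.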
Annals of Mathematics and Artificial Intelligence – Springer Journals
Published: Oct 25, 2020