A Modular Analysis of Provable Acceleration via Polyak’s Momentum:
Training a Wide ReLU Network and a Deep Linear Network
Jun-Kun Wang¹, Chi-Heng Lin², Jacob Abernethy¹
Abstract
Incorporating a so-called “momentum” dynamic in gradient descent methods is widely used in neural net training as it has bee ...

1. Introduction
Momentum methods are very popular for training neural networks in various applications (e.g. He et al. (2016);

