
Search Results

Featured snippet from the web

The Marginal Value of Adaptive Gradient Methods in Machine Learning. Adaptive optimization methods, which perform local optimization with a metric constructed from the history of iterates, are becoming increasingly popular for training deep neural networks. Examples include AdaGrad, RMSProp, and Adam. May 23, 2017
Author: AC Wilson · Cited by 709 — We observe that the solutions found by adaptive methods generalize worse (often significantly worse) than SGD, even when these solutions have better training ...
11 pages · 899 KB
December 4, 2017 — We additionally study the empirical generalization capability of adaptive methods on several state-of-the-art deep learning models. We observe ...
Author: A Wilson · Cited by 709 — Deep learning is an extremely powerful technology, but it has a drawback, which is that in order to actually use it, one has to contend with a sea of ...
81 pages · 1 MB
Author: AC Wilson, 2017 · Cited by 709 — For the specific settings of the parameters for many of the algorithms used in deep learning, see Table 1. Adaptive methods attempt to adjust an ...
May 17, 2017 — ... Adaptive gradient (Adagrad [8]) is the method in which the updating rate of the weight coefficients of the neural network is adapted ...
Figures, Tables, and Topics from this paper · Stochastic gradient descent · Machine learning · Deep learning · Adaptive optimization · Binary classification ...
May 24, 2017 — Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro and Benjamin Recht. The Marginal Value of Adaptive Gradient Methods in Machine Learning.
Paper notes: The Marginal Value of Adaptive Gradient Methods in Machine Learning. WillerW 2018-12-30 18:42:46 · 311 favorites. Category: Paper notes · Tags: notes.
July 26, 2017 — IDS Lab: The Marginal Value of Adaptive Gradient Methods in Machine Learning. Does deep learning really do any generalization?
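
Several of the snippets above describe the mechanism the paper studies: adaptive methods such as AdaGrad scale each weight's update using the history of past gradients. As a rough illustration only (a minimal NumPy sketch of a standard AdaGrad step, assumed here for clarity and not taken from the paper or any of the pages listed), the update looks like this:

import numpy as np

def adagrad_update(w, grad, hist, lr=0.01, eps=1e-8):
    # Accumulate the squared gradients seen so far; coordinates with a
    # larger history get a smaller effective learning rate.
    hist = hist + grad ** 2
    # Per-coordinate adaptive step, as opposed to SGD's single global rate.
    w = w - lr * grad / (np.sqrt(hist) + eps)
    return w, hist

# Usage: initialize hist = np.zeros_like(w) and call once per gradient step.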
