D1070063423 - Indian Journal of Artificial Intelligence and Neural Networking (IJAINN)

Characterizing Adaptive Optimizer in CNN by Reverse Mode Differentiation from Full-Scratch
Ruo Ando¹, Yoshihisa Fukuhara², Yoshiyasu Takefuji³
¹Ruo Ando, National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, Japan.

²Yoshihisa Fukuhara, Musashino University, Department of Data Science, 3-3-3 Ariake, Koto-Ku, Tokyo, Japan.

³Yoshiyasu Takefuji, Musashino University, Department of Data Science, 3-3-3 Ariake, Koto-Ku, Tokyo, Japan.

© The Authors. Published by Lattice Science Publication (LSP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Recently, datasets have been found for which adaptive optimizers are no more than adequate. No evaluation criteria have been established for deciding which optimization algorithm is appropriate. In this paper, we propose a characterization method that implements backward (reverse-mode) automatic differentiation and characterizes the optimizer by tracking the gradient and the value of the signal flowing to the output layer at each epoch. The proposed method was applied to a CNN (Convolutional Neural Network) recognizing CIFAR-10, and experiments were conducted comparing Adam (adaptive moment estimation) and SGD (stochastic gradient descent). The experiments revealed that for batch sizes of 50, 100, 150, and 200, SGD and Adam differ significantly in the characteristics of the time series of signals sent to the output layer. This shows that the Adam optimizer can be clearly characterized from the input signal series for each batch size.
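
The idea of tracking the output-layer signal and its gradient per epoch can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy one-layer linear model with a squared-error loss in place of the CNN and CIFAR-10, a hand-written backward pass, and the standard SGD and Adam update rules. All function names and hyperparameter values (`trace`, `sgd_step`, `adam_step`, `lr=0.1`, the single training sample) are hypothetical.

```python
import math

def forward(w, x):
    # Forward pass: linear output signal z = w . x
    return sum(wi * xi for wi, xi in zip(w, x))

def backward(w, x, y):
    # Reverse-mode step for the loss L = 0.5 * (z - y)^2:
    # dL/dz = z - y is propagated back to dL/dw_i = (z - y) * x_i
    z = forward(w, x)
    dz = z - y
    return z, [dz * xi for xi in x]

def sgd_step(w, g, state, lr=0.1):
    # Plain gradient descent update (state is unused)
    return [wi - lr * gi for wi, gi in zip(w, g)]

def adam_step(w, g, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Standard Adam update with bias-corrected moment estimates
    state["t"] += 1
    t = state["t"]
    state["m"] = [b1 * m + (1 - b1) * gi for m, gi in zip(state["m"], g)]
    state["v"] = [b2 * v + (1 - b2) * gi * gi for v, gi in zip(state["v"], g)]
    m_hat = [m / (1 - b1 ** t) for m in state["m"]]
    v_hat = [v / (1 - b2 ** t) for v in state["v"]]
    return [wi - lr * mh / (math.sqrt(vh) + eps)
            for wi, mh, vh in zip(w, m_hat, v_hat)]

def trace(step, epochs=50):
    # Record the output signal and gradient norm at every epoch
    w = [0.0, 0.0]
    state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}
    x, y = [1.0, 2.0], 3.0  # hypothetical single training sample
    signals, grad_norms = [], []
    for _ in range(epochs):
        z, g = backward(w, x, y)
        signals.append(z)
        grad_norms.append(math.sqrt(sum(gi * gi for gi in g)))
        w = step(w, g, state)
    return signals, grad_norms

sgd_signals, sgd_grads = trace(sgd_step)
adam_signals, adam_grads = trace(adam_step)
```

Comparing `sgd_signals` with `adam_signals` (and the two gradient-norm series) epoch by epoch gives a small-scale analogue of the time series the abstract describes; in this toy setting the two optimizers already diverge after the first update.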

Keywords: Characterization of Optimizers, Adaptive Optimizer, Reverse Mode Differentiation, CNN
Scope of the Article: Neural Networks
