Overview of Configuring Adaptive Activation Functions for Deep Neural Networks - A Comparative Study

How to Cite

Haoxiang, Wang, and S. Smys. 2021. “Overview of Configuring Adaptive Activation Functions for Deep Neural Networks - A Comparative Study”. Journal of Ubiquitous Computing and Communication Technologies 3 (1): 10-22. https://doi.org/10.36548/jucct.2021.1.002.

Keywords

Deep networks, Adaptive activation function
Published: 01-05-2021

Abstract

Deep neural networks (DNNs) have recently demonstrated strong performance in the pattern recognition paradigm. Research on DNNs spans network depth, filters, and training and testing datasets, and DNNs now also provide solutions for nonlinear partial differential equations (PDEs). This article considers networks in which each neuron may draw on a multitude of activation functions, with the function selected node by node so as to minimize the classification error. This motivates adaptive activation functions for deep neural networks: the activation is adapted at every neuron in the network, which reduces the classification error during training. The article discusses a scaling factor for the activation function that yields better optimization as the procedure changes dynamically. The proposed adaptive activation function shows better learning capability than a fixed activation function in any neural network, and its convergence rate, early-training behaviour, and accuracy are compared against existing methods. In addition, this work offers in-depth insight into the learning process of various neural networks; the learning process is tested on solutions spanning various frequency bands. Both forward problems and inverse problems, in which the parameters of the governing equation are identified, are addressed. The proposed method has a very simple architecture, and its efficiency, robustness, and accuracy are high when nonlinear functions are considered. Overall classification performance improves in the resulting networks trained on common datasets. The proposed work is compared with recent findings in neuroscience research and shows better performance.
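
The article itself contains no code; the following is a minimal sketch, in PyTorch, of the scaling-factor idea the abstract describes: an activation of the form tanh(n·a·x), where a is trained jointly with the network weights and n is a fixed scale factor. All names, defaults, and the choice of tanh here are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class AdaptiveTanh(nn.Module):
    """Tanh activation with a trainable scaling factor.

    The slope parameter `a` is learned together with the network
    weights, so the effective activation tanh(n * a * x) adapts
    during training. `n` is a fixed scale factor commonly used in
    the adaptive-activation literature to speed up convergence.
    """
    def __init__(self, n: float = 10.0, a_init: float = 0.1):
        super().__init__()
        self.n = n
        self.a = nn.Parameter(torch.tensor(a_init))  # trainable slope

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.n * self.a * x)

# Example: a small fully connected network in which each hidden
# layer carries its own adaptive activation, so the slope of the
# nonlinearity is adapted layer by layer during training.
model = nn.Sequential(
    nn.Linear(2, 50), AdaptiveTanh(),
    nn.Linear(50, 50), AdaptiveTanh(),
    nn.Linear(50, 1),
)
```

Replacing the scalar `a` with a parameter vector of the layer's width (e.g. `nn.Parameter(torch.full((50,), 0.1))`) would give neuron-wise rather than layer-wise adaptation, closer to the abstract's statement that the activation is adapted at every neuron.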
