Using Particle Swarm Optimization with Gradient Descent for Parameter Learning in Convolutional Neural Networks

Steven Wessels, Dustin van der Haar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

Gradient-based methods are ubiquitously used to update the internal parameters of neural networks. Two problems commonly associated with gradient-based methods are their tendency to get stuck in sub-optimal local minima and their slow convergence rate. Effective remedies for these issues, such as the addition of "momentum" and adaptive learning rates, have been proposed. In this paper, we investigate the efficacy of using particle swarm optimization (PSO) to help gradient-based methods search for the internal parameters that minimize the loss function of a convolutional neural network (CNN). We compare the metric performance of traditional gradient-based methods with and without a PSO that either guides or refines the search for the optimal weights. The gradient-based methods we examine are stochastic gradient descent with and without a momentum term, as well as Adaptive Moment Estimation (Adam). We find that, with the exception of the Adam-optimized networks, plain gradient-based methods achieve better metric scores than when used in conjunction with a PSO. We also observe that using a PSO to refine the solution found by a gradient-descent technique reduces the loss more than using a PSO to dictate the starting solution for gradient descent. Ultimately, the best solution on the MNIST dataset was achieved by the network optimized with stochastic gradient descent and momentum, with an average loss of 0.0092 when evaluated using k-fold cross-validation.
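To illustrate the hybrid scheme the abstract describes, the sketch below first trains with SGD plus momentum and then seeds a PSO swarm around the SGD solution to refine it. This is a minimal, self-contained toy: a least-squares loss stands in for the CNN loss surface, and every hyperparameter (learning rate, inertia weight, cognitive/social coefficients, swarm size) is an illustrative assumption, not a value taken from the paper.

```python
import numpy as np

# Toy least-squares loss standing in for a CNN loss surface (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = rng.normal(size=5)
y = X @ true_w + 0.1 * rng.normal(size=200)

def loss(w):
    return np.mean((X @ w - y) ** 2)

def grad(w):
    # Gradient of the mean-squared error above.
    return 2 * X.T @ (X @ w - y) / len(y)

# Stage 1: SGD with a momentum term (hypothetical hyperparameters).
w = np.zeros(5)
v = np.zeros(5)
lr, beta = 0.05, 0.9
for _ in range(100):
    v = beta * v - lr * grad(w)
    w = w + v

# Stage 2: PSO refinement, with the swarm seeded near the SGD solution.
n_particles, dim = 20, 5
pos = w + 0.01 * rng.normal(size=(n_particles, dim))
vel = np.zeros_like(pos)
pbest = pos.copy()                                 # per-particle best positions
pbest_val = np.array([loss(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()           # global best position

w_inertia, c1, c2 = 0.7, 1.5, 1.5                  # inertia, cognitive, social
for _ in range(50):
    r1 = rng.random((n_particles, dim))
    r2 = rng.random((n_particles, dim))
    vel = w_inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([loss(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print(f"SGD loss: {loss(w):.6f}  PSO-refined loss: {loss(gbest):.6f}")
```

Seeding the swarm tightly around the SGD weights mirrors the "refine" variant the abstract found more effective; the "guide" variant would instead run the PSO first and hand its global best to gradient descent as the starting point.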

Original language: English
Title of host publication: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 25th Iberoamerican Congress, CIARP 2021, Revised Selected Papers
Editors: João Manuel Tavares, João Paulo Papa, Manuel González Hidalgo
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 119-128
Number of pages: 10
ISBN (Print): 9783030934194
DOIs
Publication status: Published - 2021
Event: 25th Iberoamerican Congress on Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, CIARP 2021 - Virtual, Online
Duration: 10 May 2021 – 13 May 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12702 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th Iberoamerican Congress on Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, CIARP 2021
City: Virtual, Online
Period: 10/05/21 – 13/05/21

Keywords

  • Deep learning
  • Meta-heuristics
  • Particle swarm optimization

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
