
dc.contributor.author: Safayenikoo, P.
dc.contributor.author: Aktürk, İsmail
dc.date.accessioned: 2021-12-06T08:29:49Z
dc.date.available: 2021-12-06T08:29:49Z
dc.date.issued: 2021-12
dc.identifier.uri: http://hdl.handle.net/10679/7656
dc.identifier.uri: https://ieeexplore.ieee.org/document/9614191
dc.description.abstract: Artificial Neural Networks (ANNs) are state-of-the-art techniques in Machine Learning (ML) and have achieved outstanding results in data-intensive applications such as recognition, classification, and segmentation. These networks mostly use deep stacks of convolutional and/or fully connected layers with many filters in each layer, demanding a large amount of data and many tunable hyperparameters to achieve competitive accuracy. As a result, the storage, communication, and computational costs of training (in particular, the time spent on training) become limiting factors in scaling them up. In this paper, we propose a new training methodology for ANNs that exploits the observation that the improvement in accuracy shows temporal variations, which allows us to skip updating weights when the variation is minuscule. During such time windows, we keep updating the bias, which ensures that the network still trains and avoids overfitting; however, we selectively skip updating the weights (and their time-consuming computations). This training approach achieves virtually the same accuracy with considerably less computational cost and reduces the time spent on training. We developed two variations of the proposed training method for selectively updating weights, which we call i) Weight Update Skipping (WUS) and ii) Weight Update Skipping with Learning Rate Scheduler (WUS+LR). We evaluate these two approaches on state-of-the-art models, including AlexNet, VGG-11, VGG-16, and ResNet-18, on the CIFAR datasets; we also use the ImageNet dataset for AlexNet, VGG-16, and ResNet-18. On average, WUS and WUS+LR reduced the training time (compared to the baseline) by 54% and 50% on CPU and 22% and 21% on GPU, respectively, for CIFAR-10; by 43% and 35% on CPU and 22% and 21% on GPU, respectively, for CIFAR-100; and by 30% and 27% for ImageNet, respectively. [en_US]
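
To make the mechanism described in the abstract concrete, below is a minimal sketch of the weight-update-skipping idea in PyTorch-style Python. This is not the authors' implementation: the epoch-level skipping granularity, the `improvement_threshold` value, and the `eval_fn` accuracy helper are assumptions made purely for illustration.

```python
def train_with_wus(model, loader, optimizer, loss_fn, epochs, eval_fn,
                   improvement_threshold=1e-3):
    """Sketch of Weight Update Skipping (WUS): when the epoch-to-epoch accuracy
    improvement is minuscule, skip weight updates but keep updating biases."""
    prev_acc = 0.0
    skip_weights = False  # toggled from the accuracy trend after each epoch
    for _ in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            if skip_weights:
                # Zero the gradients of non-bias parameters so optimizer.step()
                # effectively updates only biases in this time window.
                # (With momentum or weight decay, a fuller implementation would
                # instead freeze the weight tensors, e.g. requires_grad_(False).)
                for name, p in model.named_parameters():
                    if not name.endswith("bias") and p.grad is not None:
                        p.grad.zero_()
            optimizer.step()
        acc = eval_fn(model)  # e.g. validation accuracy in [0, 1]
        # Skip weight updates while the accuracy improvement stays minuscule.
        skip_weights = (acc - prev_acc) < improvement_threshold
        prev_acc = acc
    return model
```

The WUS+LR variant named in the abstract couples the same skipping decision with a learning-rate scheduler; that part is not shown in this sketch.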
dc.language.iso: eng [en_US]
dc.publisher: IEEE [en_US]
dc.rights: restrictedAccess
dc.title: Weight update skipping: Reducing training time for artificial neural networks [en_US]
dc.type: Article [en_US]
dc.peerreviewed: yes [en_US]
dc.publicationstatus: Published [en_US]
dc.contributor.department: Özyeğin University
dc.contributor.authorID: (ORCID 0000-0003-1970-2507 & YÖK ID 349836) Aktürk, İsmail
dc.contributor.ozuauthor: Aktürk, İsmail
dc.identifier.volume: 11
dc.identifier.issue: 4
dc.identifier.startpage: 563
dc.identifier.endpage: 574
dc.identifier.wos: WOS:000730514000007
dc.identifier.doi: 10.1109/JETCAS.2021.3127907 [en_US]
dc.subject.keywords: Artificial neural networks [en_US]
dc.subject.keywords: Training time [en_US]
dc.subject.keywords: Temporal variation [en_US]
dc.subject.keywords: Weight update [en_US]
dc.identifier.scopus: SCOPUS:2-s2.0-85119441655
dc.contributor.authorMale: 1
dc.relation.publicationcategory: Conference Paper - International - Institutional Academic Staff


Files in this item


There are no files associated with this item.
