Publication:
Generalization to unseen viewpoint images of objects via alleviated pose attentive capsule agreement

dc.contributor.author: Özcan, Barış
dc.contributor.author: Kınlı, Osman Furkan
dc.contributor.author: Kıraç, Mustafa Furkan
dc.contributor.department: Computer Science
dc.contributor.ozuauthor: KINLI, Osman Furkan
dc.contributor.ozuauthor: KIRAÇ, Mustafa Furkan
dc.contributor.ozugradstudent: Özcan, Barış
dc.date.accessioned: 2024-01-11T06:51:57Z
dc.date.available: 2024-01-11T06:51:57Z
dc.date.issued: 2023-02
dc.description.abstract: Despite their achievements in object recognition, Convolutional Neural Networks (CNNs) notably fail to generalize to unseen viewpoints of a learned object, even when trained on a substantial number of samples. In contrast, the recently introduced capsule networks outperform CNNs in novel viewpoint generalization tasks, even with significantly fewer parameters. Capsule networks group neuron activations to represent higher-level attributes and their interactions, with the aim of achieving equivariance to visual transformations. However, capsule networks incur a high computational cost for learning the interactions between capsules in consecutive layers via the so-called routing algorithm. To address these issues, we propose a novel routing algorithm, Alleviated Pose Attentive Capsule Agreement (ALPACA), which is tailored for capsules that contain pose, feature, and existence probability information together, to enhance the novel viewpoint generalization of capsules on 2D images. For this purpose, we have created the Novel ViewPoint Dataset (NVPD), a viewpoint-controlled, texture-free dataset with 8 different setups in which training and test samples are formed from different viewpoints. In addition to NVPD, we have conducted experiments on the iLab2M dataset, where the dataset is split in terms of object instances. Experimental results show that ALPACA outperforms its capsule network counterparts and state-of-the-art CNNs on the iLab2M and NVPD datasets. Moreover, ALPACA is 10 times faster than routing-based capsule networks. It also outperforms the attention-based routing algorithms of the domain while keeping inference and training times comparable. Lastly, our code, the NVPD dataset, test setups, and implemented models are freely available at https://github.com/Boazrciasn/ALPACA. (en_US)
dc.identifier.doi: 10.1007/s00521-022-07900-3 (en_US)
dc.identifier.endpage: 3536 (en_US)
dc.identifier.issn: 0941-0643 (en_US)
dc.identifier.issue: 4 (en_US)
dc.identifier.scopus: 2-s2.0-85139871182
dc.identifier.startpage: 3521 (en_US)
dc.identifier.uri: http://hdl.handle.net/10679/9030
dc.identifier.uri: https://doi.org/10.1007/s00521-022-07900-3
dc.identifier.volume: 35 (en_US)
dc.identifier.wos: 000867543900001
dc.language.iso: eng (en_US)
dc.peerreviewed: yes (en_US)
dc.publicationstatus: Published (en_US)
dc.publisher: Springer (en_US)
dc.relation.ispartof: Neural Computing and Applications
dc.relation.publicationcategory: International Refereed Journal
dc.rights: restrictedAccess
dc.subject.keywords: Capsule networks (en_US)
dc.subject.keywords: Neural networks (en_US)
dc.subject.keywords: Novel viewpoint generalization (en_US)
dc.subject.keywords: Quaternion neural networks (en_US)
dc.title: Generalization to unseen viewpoint images of objects via alleviated pose attentive capsule agreement (en_US)
dc.type: article (en_US)
dspace.entity.type: Publication
relation.isOrgUnitOfPublication: 85662e71-2a61-492a-b407-df4d38ab90d7
relation.isOrgUnitOfPublication.latestForDiscovery: 85662e71-2a61-492a-b407-df4d38ab90d7
