
dc.contributor.author: Özer, Sedat
dc.contributor.author: Ege, M.
dc.contributor.author: Özkanoglu, M. A.
dc.date.accessioned: 2023-06-15T19:06:36Z
dc.date.available: 2023-06-15T19:06:36Z
dc.date.issued: 2022-09
dc.identifier.issn: 0031-3203 [en_US]
dc.identifier.uri: http://hdl.handle.net/10679/8416
dc.identifier.uri: https://www.sciencedirect.com/science/article/pii/S0031320322001935
dc.description.abstract: Recent developments in pattern analysis have motivated many researchers to focus on developing deep learning based solutions in various image processing applications. Fusing multi-modal images has been one such application area where the interest is combining different information coming from different modalities in a more visually meaningful and informative way. For that purpose, it is important to first extract salient features from each modality and then fuse them as efficiently and informatively as possible. Recent literature on fusing multi-modal images reports multiple deep solutions that combine both visible (RGB) and infra-red (IR) images. In this paper, we study the performance of various deep solutions available in the literature while seeking an answer to the question: “Do we really need deeper networks to fuse multi-modal images?” To have an answer for that question, we introduce a novel architecture based on Siamese networks to fuse RGB (visible) images with infrared (IR) images and report the state-of-the-art results. We present an extensive analysis on increasing the layer numbers in the architecture with the above-mentioned question in mind to see if using deeper networks (or adding additional layers) adds significant performance in our proposed solution. We report the state-of-the-art results on visually fusing given visible and IR image pairs in multiple performance metrics, while requiring the least number of trainable parameters. Our experimental results suggest that shallow networks (as in our proposed solutions in this paper) can fuse both visible and IR images as well as the deep networks that were previously proposed in the literature (we were able to reduce the total number of trainable parameters up to 96.5%, compare 2,625 trainable parameters to the 74,193 trainable parameters). [en_US]
dc.description.sponsorship: TÜBİTAK
dc.language.iso: eng [en_US]
dc.publisher: Elsevier [en_US]
dc.relation: info:turkey/grantAgreement/TUBITAK/118C356
dc.relation.ispartof: Pattern Recognition
dc.rights: restrictedAccess
dc.title: SiameseFuse: A computationally efficient and a not-so-deep network to fuse visible and infrared images [en_US]
dc.type: Article [en_US]
dc.peerreviewed: yes [en_US]
dc.publicationstatus: Published [en_US]
dc.contributor.department: Özyeğin University
dc.contributor.authorID: (ORCID 0000-0002-2069-3807 & YÖK ID 386309) Özer, Sedat
dc.contributor.ozuauthor: Özer, Sedat
dc.identifier.volume: 129 [en_US]
dc.identifier.wos: WOS:000832702600001
dc.identifier.doi: 10.1016/j.patcog.2022.108712 [en_US]
dc.subject.keywords: Efficient learning [en_US]
dc.subject.keywords: Multi-modal fusion [en_US]
dc.subject.keywords: Multi-temporal fusion [en_US]
dc.identifier.scopus: SCOPUS:2-s2.0-85129333215
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Academic Staff


Files in this item

There are no files associated with this item.
