Person: ÖZER, Sedat

First Name: Sedat
Last Name: ÖZER
Organizational Unit: Computer Science

Publication Search Results

Now showing 1 - 7 of 7
  • Conference Object / Publication
    Performance analysis of meta-learning based bayesian deep kernel transfer methods for regression tasks
    (IEEE, 2023) Savaşlı, Ahmet Çağatay; Tütüncü, Damla; Ndigande, Alain Patrick; Özer, Sedat; Computer Science
    Meta-learning aims to apply existing models to new tasks, where the goal is 'learning to learn' so that learning from a limited amount of labeled data or learning in a short amount of time becomes possible. Deep Kernel Transfer (DKT) is a recently proposed meta-learning approach based on a Bayesian framework. DKT's performance depends on the kernel functions used, and it has two implementations, namely DKT and GPNet. In this paper, we use a large set of kernel functions in both the DKT and GPNet implementations on two regression tasks to study their performance, and we train them with different optimizers. Furthermore, we compare the training time of both implementations to clarify the ambiguity regarding which algorithm runs faster for regression-based tasks.
    (A minimal kernel-comparison sketch appears after this publication list.)
  • Conference Object / Publication
    Using different loss functions with YOLACT++ for real-time instance segmentation
    (IEEE, 2023) Köleş, Selin; Karakaş, Selami; Ndigande, Alain Patrick; Özer, Sedat; Computer Science
    In this paper, we study and analyze the performance of various loss functions on a recently proposed real-time instance segmentation algorithm, YOLACT++. In particular, we study loss functions including Huber Loss, Binary Cross Entropy (BCE), Mean Square Error (MSE), Log-Cosh-Dice Loss, and their various combinations within the YOLACT++ architecture. We demonstrate that loss functions other than the default loss function of YOLACT++ (BCE) can be used for improved real-time segmentation results. In our experiments, we show that a certain combination of two loss functions improves the segmentation performance of YOLACT++ in terms of the mean Average Precision (mAP) metric on the Cigarettes dataset, when compared to its original loss function.
    (An illustrative combined-loss sketch follows the list below.)
  • Conference Object / Publication
    ORTPiece: An ORT-based Turkish image captioning network based on transformers and WordPiece
    (IEEE, 2023) Ersoy, Asım; Yıldız, Olcay Taner; Özer, Sedat; Computer Science
    Recent transformer-based systems are advancing image captioning applications. However, those works have mainly been applied to English image captioning problems. In this paper, we introduce a transformer-based Turkish image captioning algorithm. Our proposed algorithm uses appearance and geometry features from the input image and combines them with WordPiece embeddings to generate the Turkish caption. Our experimental results show improvement over other existing techniques, including the original ORT and show-and-tell algorithms.
    (A short WordPiece tokenization sketch appears after the list.)
  • Conference Object / Publication
    YOLODrone+: improved YOLO architecture for object detection in UAV images
    (IEEE, 2022) Şahin, Ö.; Özer, Sedat; Computer Science
    The performance of object detection algorithms running on images taken from Unmanned Aerial Vehicles (UAVs) remains limited when compared to object detection algorithms running on images taken from the ground. Due to their various features, YOLO-based models, as one-stage object detectors, are preferred in many UAV-based applications. In this paper, we propose novel architectural improvements to the YOLOv5 architecture. Our improvements include: (i) increasing the number of detection layers and (ii) the use of transformers in the model. To train and test the performance of our proposed model, we used the VisDrone and SkyData datasets in our paper. Our test results suggest that our proposed solutions can improve the detection accuracy.
    (A small transformer-block sketch follows the publication list.)
  • Article / Publication
    Offloading deep learning powered vision tasks from UAV to 5G edge server with denoising
    (IEEE, 2023-06) Özer, Sedat; Ilhan, H. E.; Özkanoğlu, Mehmet Akif; Cirpan, H. A.; Computer Science
    Offloading computationally heavy tasks from an unmanned aerial vehicle (UAV) to a remote server helps improve battery life and can help reduce resource requirements. Deep learning based state-of-the-art computer vision tasks, such as object segmentation and detection, are computationally heavy algorithms, requiring large memory and computing power. Many UAVs use (pretrained) off-the-shelf versions of such algorithms. Offloading such power-hungry algorithms to a remote server could help UAVs save power significantly. However, deep learning based algorithms are susceptible to noise, and a wireless communication system, by its nature, introduces noise into the original signal. When the signal represents an image, the noise affects the image. There has not been much work studying the effect of the noise introduced by the communication system on pretrained deep networks. In this work, we first analyze how reliable it is to offload deep learning based computer vision tasks (including both object segmentation and detection) by focusing on the effect of various parameters of a 5G wireless communication system on the transmitted image, and we demonstrate how the noise introduced by the 5G system reduces the performance of the offloaded deep learning task. Then, solutions are introduced to eliminate (or reduce) the negative effect of the noise. The proposed framework starts by introducing several classical techniques as alternative solutions and then introduces a novel deep learning based solution to denoise the given noisy input image. The performance of various denoising algorithms on offloading both object segmentation and object detection tasks is compared. Our proposed deep transformer-based denoiser algorithm (NR-Net) yields state-of-the-art results in our experiments.
    (An illustrative noisy-offloading sketch appears after the list.)
  • Article / Publication
    SiameseFuse: A computationally efficient and a not-so-deep network to fuse visible and infrared images
    (Elsevier, 2022-09) Özer, Sedat; Ege, M.; Özkanoglu, M. A.; Computer Science
    Recent developments in pattern analysis have motivated many researchers to focus on developing deep learning based solutions in various image processing applications. Fusing multi-modal images has been one such application area, where the interest is in combining different information coming from different modalities in a more visually meaningful and informative way. For that purpose, it is important to first extract salient features from each modality and then fuse them as efficiently and informatively as possible. Recent literature on fusing multi-modal images reports multiple deep solutions that combine both visible (RGB) and infrared (IR) images. In this paper, we study the performance of various deep solutions available in the literature while seeking an answer to the question: “Do we really need deeper networks to fuse multi-modal images?” To answer that question, we introduce a novel architecture based on Siamese networks to fuse RGB (visible) images with infrared (IR) images and report state-of-the-art results. We present an extensive analysis on increasing the number of layers in the architecture, with the above-mentioned question in mind, to see whether using deeper networks (or adding additional layers) adds significant performance to our proposed solution. We report state-of-the-art results on visually fusing given visible and IR image pairs on multiple performance metrics, while requiring the smallest number of trainable parameters. Our experimental results suggest that shallow networks (as in our proposed solutions in this paper) can fuse both visible and IR images as well as the deep networks that were previously proposed in the literature (we were able to reduce the total number of trainable parameters by up to 96.5%; compare 2,625 trainable parameters to 74,193 trainable parameters).
    (A shallow Siamese fusion sketch follows the list.)
  • Article / Publication
    InfraGAN: A GAN architecture to transfer visible images to infrared domain
    (Elsevier, 2022-03) Özkanoglu, M. A.; Özer, Sedat; Computer Science
    Utilizing both visible and infrared (IR) images in various deep learning based computer vision tasks has been a recent trend. Consequently, datasets having both visible and IR image pairs are desired in many applications. However, while large image datasets taken in the visible spectrum can be found in many domains, large IR-based datasets are not easily available in many domains. The lack of IR counterparts of the available visible image datasets limits how effectively existing deep algorithms can perform on IR images. In this paper, to overcome that challenge, we introduce a generative adversarial network (GAN) based solution and generate the IR equivalent of a given visible image by training our deep network to learn the relation between the visible and IR modalities. In our proposed GAN architecture (InfraGAN), we introduce structural similarity as an additional loss function. Furthermore, in our discriminator, we consider not only the entire image being fake or real but also each pixel being fake or real. We evaluate our comparative results on three different datasets and report state-of-the-art results over five metrics when compared to the Pix2Pix and ThermalGAN architectures from the literature. We report up to +16% better performance in Structural Similarity Index Measure (SSIM) over Pix2Pix and +8% better performance over ThermalGAN on the VEDAI dataset. Further gains on different metrics and on different datasets are also reported in our experiments section.
    (A combined adversarial-plus-SSIM loss sketch follows below.)
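Illustrative Code Sketches

The DKT/GPNet study above compares kernel functions for Bayesian regression. As a minimal sketch of what swapping kernels in a Gaussian-process regressor looks like, the following uses scikit-learn's plain GP regressor on a toy sine-wave task; the kernels, data, and hyperparameters are illustrative assumptions and do not reproduce the paper's deep-kernel DKT/GPNet setup.

```python
# Minimal sketch: comparing kernel functions in Gaussian-process regression.
# scikit-learn's GP regressor stands in here; the paper's DKT/GPNet models
# learn deep kernels, which is not reproduced in this toy example.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic

rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(40, 1))                      # toy 1-D task
y_train = np.sin(X_train).ravel() + 0.1 * rng.standard_normal(40)
X_test = np.linspace(-3, 3, 200).reshape(-1, 1)
y_test = np.sin(X_test).ravel()

kernels = {
    "RBF": RBF(length_scale=1.0),
    "Matern-5/2": Matern(length_scale=1.0, nu=2.5),
    "RationalQuadratic": RationalQuadratic(),
}

for name, kernel in kernels.items():
    gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-2, normalize_y=True)
    gpr.fit(X_train, y_train)
    mse = np.mean((gpr.predict(X_test) - y_test) ** 2)
    print(f"{name:>18s}  test MSE = {mse:.4f}")
```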
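For the YOLACT++ loss-function study, the next sketch shows what combining two mask losses into one weighted objective can look like in PyTorch. The specific pair (BCE plus log-cosh Dice), the weights, and the helper functions are assumptions for illustration; the paper's exact combination and its integration into YOLACT++ are not reproduced.

```python
# Minimal sketch: a weighted combination of BCE (YOLACT++'s default mask loss)
# and a log-cosh Dice term. The pair and the weights are illustrative only.
import torch
import torch.nn.functional as F

def dice_loss(pred, target, eps=1e-6):
    # pred: mask probabilities in [0, 1]; target: binary ground-truth masks.
    inter = (pred * target).sum(dim=(1, 2))
    union = pred.sum(dim=(1, 2)) + target.sum(dim=(1, 2))
    return 1.0 - (2.0 * inter + eps) / (union + eps)

def log_cosh_dice_loss(pred, target):
    return torch.log(torch.cosh(dice_loss(pred, target)))

def combined_mask_loss(logits, target, w_bce=1.0, w_lcd=1.0):
    """Weighted sum of BCE and log-cosh Dice over predicted mask logits."""
    bce = F.binary_cross_entropy_with_logits(logits, target)
    lcd = log_cosh_dice_loss(torch.sigmoid(logits), target).mean()
    return w_bce * bce + w_lcd * lcd

# Example: a batch of four 64x64 predicted mask logits and binary targets.
logits = torch.randn(4, 64, 64)
targets = (torch.rand(4, 64, 64) > 0.5).float()
print(combined_mask_loss(logits, targets))
```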
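For the ORTPiece entry, the sketch below shows only the WordPiece tokenization step for a Turkish caption using the Hugging Face transformers library. The checkpoint name dbmdz/bert-base-turkish-cased is an assumption (a publicly available Turkish BERT tokenizer), not necessarily the vocabulary used in the paper, and the captioning transformer itself is not shown.

```python
# Minimal sketch: WordPiece tokenization of a Turkish caption. The checkpoint
# is an assumption (a public Turkish BERT tokenizer, downloaded from the Hub),
# not necessarily the vocabulary used by ORTPiece.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")

caption = "Sahilde koşan kahverengi bir köpek"  # "A brown dog running on the beach"
pieces = tokenizer.tokenize(caption)            # WordPiece sub-word units
ids = tokenizer.encode(caption)                 # token ids with [CLS]/[SEP] added

print(pieces)
print(ids)
```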
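For YOLODrone+, the sketch below shows a self-attention (transformer encoder) block applied to a convolutional feature map, in the spirit of inserting transformer layers into a YOLOv5-style backbone. The channel width, head count, and placement are illustrative assumptions rather than the paper's configuration.

```python
# Minimal sketch: a transformer encoder block over a CNN feature map, in the
# spirit of adding transformer layers to a YOLOv5-style backbone. Sizes are
# illustrative assumptions.
import torch
import torch.nn as nn

class FeatureMapTransformerBlock(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(channels)
        self.mlp = nn.Sequential(
            nn.Linear(channels, 2 * channels), nn.GELU(),
            nn.Linear(2 * channels, channels),
        )
        self.norm2 = nn.LayerNorm(channels)

    def forward(self, x):                        # x: (B, C, H, W) feature map
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)       # tokens: (B, H*W, C)
        normed = self.norm1(seq)
        seq = seq + self.attn(normed, normed, normed)[0]   # self-attention
        seq = seq + self.mlp(self.norm2(seq))              # position-wise MLP
        return seq.transpose(1, 2).reshape(b, c, h, w)

# Example: apply the block to a 256-channel 20x20 feature map.
feat = torch.randn(1, 256, 20, 20)
print(FeatureMapTransformerBlock(256)(feat).shape)  # torch.Size([1, 256, 20, 20])
```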
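For the 5G offloading work, the sketch below only illustrates where noise and denoising sit in an offloaded detection pipeline: additive Gaussian noise stands in for the channel impairments, a Gaussian blur stands in for a classical denoiser, and a pretrained torchvision detector stands in for the offloaded task. The paper's 5G simulation and its NR-Net denoiser are not reproduced.

```python
# Minimal sketch: channel noise degrading an offloaded detection task, with a
# simple denoiser in the pipeline. AWGN and Gaussian blur are stand-ins; the
# paper's 5G link model and NR-Net denoiser are not reproduced here.
import torch
import torchvision
from torchvision.transforms import functional as TF

# Downloads pretrained detector weights on first use.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

def count_detections(img, threshold=0.5):
    with torch.no_grad():
        out = detector([img])[0]
    return int((out["scores"] > threshold).sum())

clean = torch.rand(3, 480, 640)                    # stand-in frame; use a real image in practice
noisy = (clean + 0.15 * torch.randn_like(clean)).clamp(0, 1)   # simulated channel noise
denoised = TF.gaussian_blur(noisy, kernel_size=5)  # classical denoising baseline

for name, img in [("clean", clean), ("noisy", noisy), ("denoised", denoised)]:
    print(name, count_detections(img), "detections above 0.5 confidence")
```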
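For SiameseFuse, the sketch below is a shallow Siamese fusion network: one small convolutional encoder with shared weights processes the visible and IR inputs, the features are concatenated, and a small decoder produces the fused image. The channel widths, depths, and single-channel inputs (e.g., the luminance of the visible image) are illustrative assumptions, not the exact SiameseFuse configuration or parameter count.

```python
# Minimal sketch: a shallow Siamese fusion network for visible/IR image pairs.
# Widths, depths, and single-channel inputs are illustrative assumptions.
import torch
import torch.nn as nn

class ShallowSiameseFuse(nn.Module):
    def __init__(self, feat: int = 16):
        super().__init__()
        # Shared-weight encoder applied to both modalities (Siamese branches).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Small decoder mapping the concatenated features to a fused image.
        self.decoder = nn.Sequential(
            nn.Conv2d(2 * feat, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, visible, infrared):         # both: (B, 1, H, W)
        f_vis = self.encoder(visible)
        f_ir = self.encoder(infrared)              # same weights, Siamese style
        return self.decoder(torch.cat([f_vis, f_ir], dim=1))

model = ShallowSiameseFuse()
print(sum(p.numel() for p in model.parameters()), "trainable parameters")
fused = model(torch.rand(1, 1, 128, 128), torch.rand(1, 1, 128, 128))
print(fused.shape)                                 # torch.Size([1, 1, 128, 128])
```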
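For InfraGAN, the sketch below shows a generator loss that combines an adversarial term with a structural-similarity term, in the spirit of the paper's extra SSIM loss and per-pixel real/fake discriminator. The SSIM here is a simplified global (non-windowed) variant and the weights are illustrative assumptions; the full generator and discriminator are not reproduced.

```python
# Minimal sketch: generator loss = adversarial term on a per-pixel discriminator
# map + a structural-similarity term. The SSIM is a simplified global variant
# (no sliding window) and the loss weights are illustrative assumptions.
import torch
import torch.nn.functional as F

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # Simplified SSIM computed over whole images in [0, 1].
    mu_x = x.mean(dim=(1, 2, 3), keepdim=True)
    mu_y = y.mean(dim=(1, 2, 3), keepdim=True)
    var_x = ((x - mu_x) ** 2).mean(dim=(1, 2, 3))
    var_y = ((y - mu_y) ** 2).mean(dim=(1, 2, 3))
    cov = ((x - mu_x) * (y - mu_y)).mean(dim=(1, 2, 3))
    mu_x, mu_y = mu_x.flatten(), mu_y.flatten()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

def generator_loss(d_fake_map, fake_ir, real_ir, w_adv=1.0, w_ssim=10.0):
    """Adversarial loss on a per-pixel discriminator map plus an SSIM term."""
    adv = F.binary_cross_entropy_with_logits(d_fake_map, torch.ones_like(d_fake_map))
    ssim_term = 1.0 - global_ssim(fake_ir, real_ir).mean()
    return w_adv * adv + w_ssim * ssim_term

# Random tensors stand in for generated/real IR images and the per-pixel
# discriminator output map of the same spatial size.
fake_ir, real_ir = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
d_fake_map = torch.randn(2, 1, 64, 64)
print(generator_loss(d_fake_map, fake_ir, real_ir))
```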