Publication: Improving regression performance on monocular 3D object detection using bin-mixing and sparse voxel data
Institution Authors
Authors
Balatkan, Eren
Kıraç, Mustafa Furkan
Type
Conference paper
Access
info:eu-repo/semantics/restrictedAccess
Publication Status
Published
Abstract
Accurate and fast 3D object detection is of paramount importance for safe and capable autonomous machines. LiDAR point cloud based methods have demonstrated impressive results, yet expensive LiDAR sensors make such approaches infeasible for wide-scale adoption. Camera based methods, on the other hand, perform sub-optimally given safety and accuracy requirements. Traditionally, camera based 3D object detection is performed by generating pseudo-LiDAR point clouds from RGB-D data and applying point cloud based models. However, the irregular nature of the point cloud representation makes it challenging to exploit local spatial correlations in 3D space, and point cloud based models generally suffer from this. To this end, we propose Sparse Voxel based 3D Object Detection. Our approach differs from traditional approaches by converting point cloud information to a sparse voxel grid and using sub-manifold sparse convolutions, instead of PointNet based models, to extract information. Furthermore, we propose Bin-Mixing layers. Bin-Mixing replaces the output layer of a neural network and boosts performance by representing the regression problem in a form that is easier for the network to learn.
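The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `voxelize` and `soft_bin_regression`, the centroid feature per voxel, and the softmax-weighted combination of bin centers are all assumptions chosen for illustration. The first function turns an irregular point cloud into a sparse voxel grid by storing only occupied cells. The second shows one common way to recast regression over bins, which may differ from the Bin-Mixing layer described in the paper.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Map (x, y, z) points to a sparse grid: {(i, j, k): centroid}.

    Only occupied voxels are stored, which is what makes the grid sparse.
    The centroid feature is an illustrative choice, not taken from the paper.
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        buckets[key].append((x, y, z))

    grid = {}
    for key, pts in buckets.items():
        n = len(pts)
        # Average each coordinate over the points that fell into this voxel.
        grid[key] = tuple(sum(axis) / n for axis in zip(*pts))
    return grid


def soft_bin_regression(logits, bin_centers):
    """Hypothetical bin-based regression: softmax-weighted sum of bin centers.

    This is an assumed formulation for illustration only; the paper's
    Bin-Mixing layer may be defined differently.
    """
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return sum(e / total * c for e, c in zip(exps, bin_centers))


if __name__ == "__main__":
    cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.2, 0.1, -0.1)]
    print(voxelize(cloud, voxel_size=0.5))
    print(soft_bin_regression([0.0, 0.0], [10.0, 20.0]))
```

In practice, a sparse convolution library would consume the voxel indices and features as tensors. The dictionary form here only makes the sparsity explicit.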
Date
2021
Publisher
IEEE