Bimodal SegNet: fused instance segmentation using events and RGB frames

Sanket Kachole, Xiaoqian Huang, Fariborz Baghaei Naeini, Rajkumar Muthusamy, Dimitrios Makris, Yahya Zweiri

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Object segmentation enhances robotic grasping by aiding object identification. Complex environments and dynamic conditions pose challenges such as occlusion, low-light conditions, motion blur and variance in object size. To address these challenges, we propose Bimodal SegNet, a network that fuses two types of visual signal: event-based data and RGB frame data. The proposed network has two distinct encoders, one for the RGB signal input and another for the event signal input, in addition to an Atrous Pyramidal Feature Amplification module. The encoders capture and fuse rich contextual information at different resolutions via a Cross-Domain Contextual Attention layer, while the decoder recovers sharp object boundaries. The proposed method is evaluated on the Event-based Segmentation Dataset (ESD) under five distinct image degradation challenges: occlusion, blur, brightness, trajectory and scale variance. The results show a 4-6% improvement over state-of-the-art methods in mean intersection over union (MIoU) and pixel accuracy. The source code, dataset and model are publicly available at: https://github.com/sanket0707/Bimodal-SegNet.
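
    As a rough illustration of the dual-encoder fusion described in the abstract, the PyTorch sketch below pairs an RGB encoder and an event encoder through a cross-attention layer before a shared decoder. All module names, channel sizes and the two-channel event representation here are illustrative assumptions, not the authors' implementation; the actual architecture, including the Atrous Pyramidal Feature Amplification module, is available in the linked repository.

```python
# Minimal sketch of a bimodal encoder/decoder with cross-domain attention.
# Hypothetical names and layer choices; not the published Bimodal SegNet code.
import torch
import torch.nn as nn


class CrossDomainAttention(nn.Module):
    """Fuses RGB and event features: each modality attends to the other."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.rgb_from_event = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.event_from_rgb = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, rgb: torch.Tensor, event: torch.Tensor) -> torch.Tensor:
        b, c, h, w = rgb.shape
        rgb_seq = rgb.flatten(2).transpose(1, 2)      # (B, H*W, C)
        event_seq = event.flatten(2).transpose(1, 2)  # (B, H*W, C)
        rgb_att, _ = self.rgb_from_event(rgb_seq, event_seq, event_seq)
        event_att, _ = self.event_from_rgb(event_seq, rgb_seq, rgb_seq)
        return (rgb_att + event_att).transpose(1, 2).reshape(b, c, h, w)


class BimodalEncoderDecoder(nn.Module):
    """Two distinct encoders (RGB, events), attention fusion, shared decoder."""

    def __init__(self, num_classes: int, channels: int = 64):
        super().__init__()

        def encoder(in_ch: int) -> nn.Sequential:
            return nn.Sequential(
                nn.Conv2d(in_ch, channels, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
            )

        self.rgb_encoder = encoder(3)    # RGB frames: 3 channels
        self.event_encoder = encoder(2)  # assumed positive/negative event polarities
        self.fusion = CrossDomainAttention(channels)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(channels, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, rgb: torch.Tensor, events: torch.Tensor) -> torch.Tensor:
        fused = self.fusion(self.rgb_encoder(rgb), self.event_encoder(events))
        return self.decoder(fused)  # per-pixel class logits


if __name__ == "__main__":
    model = BimodalEncoderDecoder(num_classes=6)
    rgb = torch.randn(1, 3, 128, 128)
    events = torch.randn(1, 2, 128, 128)  # voxelized event frame (assumption)
    print(model(rgb, events).shape)       # torch.Size([1, 6, 128, 128])
```

    The design choice mirrored here is that fusion happens in feature space rather than at the pixel level: each modality queries the other, so event features can compensate for blur or low light in the RGB stream and vice versa.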
    Original language: English
    Article number: 110215
    Journal: Pattern Recognition
    Volume: 149
    Early online date: 21 Dec 2023
    DOIs
    Publication status: E-pub ahead of print - 21 Dec 2023

    Keywords

    • Robotics
    • Grasping
    • Event vision
    • Deep learning
    • Cross attention
    • Computer science and informatics

