Light field salient object detection network based on feature enhancement and mutual attention
2 September 2024
Xi Zhu, Huai Xia, Xucheng Wang, Zhenrong Zheng
Abstract

Light field salient object detection (SOD) is an essential research topic in computer vision, but robust saliency detection in complex scenes remains challenging. We propose a method for accurate and robust light field SOD based on a convolutional neural network with feature enhancement modules. First, the light field dataset is augmented with geometric transformations such as stretching, cropping, flipping, and rotation. Next, two feature enhancement modules are designed to extract features from RGB images and depth maps, respectively. The resulting feature maps are fed into a two-stream network that is trained for light field SOD. Within this network, we propose a mutual attention approach that extracts and fuses features from the RGB images and depth maps. After training, the network generates an accurate saliency map from input light field images. This saliency map can provide reliable prior information for tasks such as semantic segmentation, target recognition, and visual tracking. Experimental results show that the proposed method achieves excellent detection performance on public benchmark datasets and outperforms state-of-the-art methods. We also verify the generalization and stability of the method in real-world experiments.
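To make two of the steps named in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It illustrates (a) geometric augmentation applied identically to an RGB image and its depth map, and (b) a generic mutual (cross) attention step in which each stream attends to the other. The abstract does not specify the network's layers, learned projections, or fusion rule, so the function names `augment_pair` and `mutual_attention`, the 7/8 crop ratio, the scaled dot-product affinity, and the residual fusion are all assumptions made for illustration. Stretching is omitted to keep the sketch short.

```python
import numpy as np


def augment_pair(rgb, depth, rng):
    """Apply the same random flip, rotation, and crop to RGB and depth.

    rgb is (H, W, 3), depth is (H, W). Hypothetical helper, not from the paper.
    """
    # Horizontal flip with probability 0.5, applied to both modalities.
    if rng.random() < 0.5:
        rgb, depth = rgb[:, ::-1], depth[:, ::-1]
    # Rotate both by the same random multiple of 90 degrees in the image plane.
    k = int(rng.integers(0, 4))
    rgb = np.rot90(rgb, k, axes=(0, 1))
    depth = np.rot90(depth, k, axes=(0, 1))
    # Crop the same random window (assumed 7/8 of each side) from both.
    h, w = depth.shape
    ch, cw = (h * 7) // 8, (w * 7) // 8
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    window = (slice(top, top + ch), slice(left, left + cw))
    return rgb[window], depth[window]


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def mutual_attention(f_rgb, f_depth):
    """Generic cross-attention between two (N, C) feature sets.

    Assumption: no learned query/key/value projections, for brevity.
    """
    scale = np.sqrt(f_rgb.shape[-1])
    # Affinity between every RGB position and every depth position: (N, N).
    affinity = f_rgb @ f_depth.T / scale
    # RGB positions gather depth features; depth positions gather RGB features.
    rgb_from_depth = softmax(affinity, axis=1) @ f_depth
    depth_from_rgb = softmax(affinity.T, axis=1) @ f_rgb
    # Residual fusion: keep each stream's features and add the attended ones.
    return f_rgb + rgb_from_depth, f_depth + depth_from_rgb


rng = np.random.default_rng(0)
rgb = rng.random((64, 64, 3))
depth = rng.random((64, 64))
rgb_aug, depth_aug = augment_pair(rgb, depth, rng)

f_rgb = rng.standard_normal((16, 8))    # 16 spatial positions, 8 channels
f_depth = rng.standard_normal((16, 8))
out_rgb, out_depth = mutual_attention(f_rgb, f_depth)
```

In a real training pipeline the same transform would also be applied to the ground-truth saliency mask, and the attention inputs would come from the two feature enhancement branches rather than from random arrays.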

© 2024 SPIE and IS&T
Xi Zhu, Huai Xia, Xucheng Wang, and Zhenrong Zheng "Light field salient object detection network based on feature enhancement and mutual attention," Journal of Electronic Imaging 33(5), 053001 (2 September 2024). https://doi.org/10.1117/1.JEI.33.5.053001
Received: 16 January 2024; Accepted: 9 August 2024; Published: 2 September 2024
KEYWORDS
Object detection

RGB color model

Depth maps

Education and training

Feature extraction

Computer vision technology

Data modeling
