MSAN: mask semantic attention network for single-frame structured light 3D reconstruction
22 November 2024
YiMing Li, ZiNan Li, WeiKang Chen, Hao Wang, MingFeng Chen, ChaoBo Zhang, XiaoHao Wang, WeiHua Gui, XiaoJun Liang
Abstract
Deep learning-driven structured light 3D measurement has attracted significant attention owing to its speed, high precision, and non-contact nature. However, accurate prediction in edge discontinuity regions remains a challenge. For the single-frame, end-to-end absolute phase prediction task, we propose a mask semantic attention network (MSAN) to improve both edge and overall accuracy. First, a mask partitions the scene into background (shadow) and foreground (object) regions, providing semantic attention for the network. Second, we design a mask fusion (MF) module that effectively integrates feature maps with mask semantics. Building on the MF module and the mask semantic information, we develop a U-shaped network architecture in which the feature map of each decoder layer is fused with the input mask through the MF module. MSAN improves edge prediction accuracy by explicitly identifying edge regions and directing the network's attention to edges and objects rather than shadow areas, which also improves overall prediction accuracy. Validation on real datasets showed that MSAN reduced the mean absolute error by 33% and the root mean square error by 76%, demonstrating the network's ability to improve both overall and edge precision in structured light deep learning tasks. This advance benefits the development of high-precision, rapid structured light 3D measurement technologies.
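The two metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the errors are evaluated only on foreground pixels selected by a binary mask; the abstract does not state the evaluation protocol, so this restriction, the function names `masked_errors` and `relative_decrease`, and the toy values are all hypothetical and are not taken from the paper.

```python
import math


def masked_errors(pred, truth, mask):
    """Return (MAE, RMSE) over the pixels where mask is truthy.

    pred, truth and mask are equally shaped 2D lists (rows of values).
    Restricting the error to foreground pixels is an assumption made
    for illustration, not the paper's documented protocol.
    """
    diffs = [
        p - t
        for row_p, row_t, row_m in zip(pred, truth, mask)
        for p, t, m in zip(row_p, row_t, row_m)
        if m
    ]
    if not diffs:
        raise ValueError("mask selects no pixels")
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


def relative_decrease(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Toy 2x2 phase maps (made-up values); the top-left pixel is background.
pred = [[0.0, 1.0], [2.0, 5.0]]
truth = [[9.0, 1.5], [2.0, 3.0]]
mask = [[0, 1], [1, 1]]

mae, rmse = masked_errors(pred, truth, mask)
print(f"MAE={mae:.4f} RMSE={rmse:.4f}")
```

Because the background pixel is excluded, its large error (9.0) does not affect either metric. A percentage reduction such as the ones reported would then be `relative_decrease(baseline_metric, msan_metric)` for each metric separately.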
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
YiMing Li, ZiNan Li, WeiKang Chen, Hao Wang, MingFeng Chen, ChaoBo Zhang, XiaoHao Wang, WeiHua Gui, and XiaoJun Liang "MSAN: mask semantic attention network for single-frame structured light 3D reconstruction", Proc. SPIE 13239, Optoelectronic Imaging and Multimedia Technology XI, 132390B (22 November 2024); https://doi.org/10.1117/12.3036442
KEYWORDS
Semantics
3D mask effects
3D metrology
Structured light
3D modeling
Shadows
Deep learning