2 December 2024
Two-stage video scene segmentation method based on multimodal semantic interaction
Tengfei Wang, Yihui Liao
Proceedings Volume 13443, Fifth International Conference on Computer Vision and Information Technology (CVIT 2024); 1344303 (2024) https://doi.org/10.1117/12.3055769
Event: 2024 5th International Conference on Computer Vision and Information Technology (CVIT 2024), 2024, Beijing, China
Abstract
This paper proposes a two-stage video scene segmentation method based on multimodal semantic interaction. The method splits the video scene segmentation task into two stages: learning audio-visual representations of shots, and multimodal scene segmentation. In the first stage, it exploits the high correlation and complementarity of audio and visual information, using an interactive attention module to mine audio-visual semantic information. It also introduces a self-supervised learning strategy that uses the temporal structure of scenes to improve the model's generalization ability. In the second stage, it constructs a multimodal feature fusion module that uses an attention mechanism to learn a unified shot representation from the audio and visual representations. It additionally introduces a visual discrimination loss to regulate the influence of the audio and visual features, further enhancing the discriminative power of the shot representation. Experimental results on the MovieNet benchmark dataset show that the proposed method achieves more accurate video scene segmentation.
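The abstract does not specify the design of the interactive attention module. As a rough illustration only, the sketch below assumes it behaves like bidirectional cross-attention between the two modalities: each visual feature attends over the audio features and vice versa, and the attended context is added back through a residual connection. The function names (`cross_attention`, `interactive_attention`), the residual connection, and the absence of learned projections are all assumptions made for this example, not details taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query row attends over all key rows."""
    dim = len(keys[0])
    attended = []
    for q in queries:
        # Similarity of this query to each key, scaled by sqrt(feature dimension).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value rows.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return attended


def interactive_attention(visual, audio):
    """Hypothetical audio-visual interaction (an assumption, not the paper's module).

    Visual features query the audio features and audio features query the
    visual features; each result is added to its input as a residual.
    """
    visual_context = cross_attention(visual, audio, audio)
    audio_context = cross_attention(audio, visual, visual)
    visual_out = [[x + c for x, c in zip(row, ctx)] for row, ctx in zip(visual, visual_context)]
    audio_out = [[x + c for x, c in zip(row, ctx)] for row, ctx in zip(audio, audio_context)]
    return visual_out, audio_out


# Toy input: three visual feature vectors and two audio feature vectors for one shot.
visual_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
audio_feats = [[0.5, 0.5], [1.0, 0.0]]
enriched_visual, enriched_audio = interactive_attention(visual_feats, audio_feats)
```

A practical model would add learned query, key, and value projections and multiple attention heads; they are left out here so that the cross-modal exchange itself stays visible.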
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Tengfei Wang and Yihui Liao "Two-stage video scene segmentation method based on multimodal semantic interaction", Proc. SPIE 13443, Fifth International Conference on Computer Vision and Information Technology (CVIT 2024), 1344303 (2 December 2024); https://doi.org/10.1117/12.3055769