3 January 2025 DFCRN: deep-learning-based audio denoising for bird monitoring
Jinlong Xu, Huaibin Zhang, Lin Han
Proceedings Volume 13442, Fifth International Conference on Signal Processing and Computer Science (SPCS 2024); 134420A (2025) https://doi.org/10.1117/12.3054133
Event: Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 2024, Kaifeng, China
Abstract
As awareness of ecological protection grows, bird sound monitoring has received increasing attention, and its use as an audio recognition task has become an active research topic. Because bird sounds are usually recorded in natural environments, the recordings contain substantial noise, which degrades monitoring results. To address this problem, this paper designs a Convolutional Recurrent Network (CRN) that enhances feature representation along the frequency axis. The method operates on Short-Time Fourier Transform (STFT) features of the sound signal, focuses on complex-valued features in the time-frequency domain, and combines an encoder-decoder architecture with a time-frequency domain enhancement network to reduce the impact of interfering information. We call this network DFCRN. Experimental results on the public datasets Birdsdata and xeno-canto-ca-nv show that, compared with other denoising models, noisy signals enhanced by DFCRN achieve the best segmental SNR (SegSNR) and scale-invariant SNR (SI-SNR) scores, and classification accuracy on xeno-canto-ca-nv improves by 5%, verifying the effectiveness and robustness of the method.
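For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how SI-SNR and SegSNR are commonly computed. It is not taken from the paper: the function names, the frame length of 256 samples, and the clamping range of -10 dB to 35 dB are assumptions based on widespread convention, and the authors' exact evaluation settings may differ.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    n = len(reference)
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Whatever is left after removing the target counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def seg_snr(estimate, reference, frame_len=256, lo=-10.0, hi=35.0, eps=1e-8):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi].

    frame_len, lo and hi are assumed defaults, not values from the paper.
    """
    if len(reference) < frame_len:
        raise ValueError("signal is shorter than one frame")
    scores = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        est = estimate[start:start + frame_len]
        signal_energy = sum(r * r for r in ref)
        noise_energy = sum((r - e) ** 2 for r, e in zip(ref, est))
        snr = 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
        scores.append(min(max(snr, lo), hi))
    return sum(scores) / len(scores)
```

Both functions take the enhanced signal first and the clean reference second. SI-SNR is unchanged when the enhanced signal is rescaled, which makes it suitable for comparing models whose outputs differ in gain; SegSNR averages over short frames, so quiet passages are not swamped by loud ones.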
(2025) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Jinlong Xu, Huaibin Zhang, and Lin Han "DFCRN: deep-learning-based audio denoising for bird monitoring", Proc. SPIE 13442, Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), 134420A (3 January 2025); https://doi.org/10.1117/12.3054133
KEYWORDS
Denoising
Convolution
Interference (communication)
Deep learning
Education and training
Signal to noise ratio
Matrices
