17 October 2024
A multi-level text feature enhancement super-resolution network via group convolution and wavelet transform
Yanting Xiao, Xiaomei Tian, Yanyu Guo, Yongquan Xiao, Yong Deng
Abstract

Scene text image super-resolution aims to provide high-resolution, readable text images to support scene text recognition. Although deep learning-based methods have made significant progress, they often neglect shallow information as the depth of the neural network increases. We propose a multi-level text feature enhancement super-resolution network (TESRN) to address this issue. TESRN adopts a coarse-to-fine feature extraction methodology and consists mainly of a shallow feature enhancement block (SFEB) and a multi-level feature fusion and extraction block (MFFEB). In SFEB, we design a framework based on the wavelet transform for extracting coarse high-frequency signals. This framework works in parallel with convolution to accomplish shallow feature extraction. In MFFEB, we propose sequential group convolution blocks (SGCBs) based on group convolution and an attention mechanism. Multi-level text features are generated step by step by stacking SGCBs. To comprehensively capture text features, we introduce a bottleneck attention mechanism (BAM) that performs feature selection in the spatial and channel dimensions, helping to select the features most relevant to text restoration. Finally, we conduct extensive experiments on the TextZoom dataset to evaluate the performance of TESRN. The results demonstrate that TESRN achieves high-quality image restoration and significantly improves the recognition accuracy of low-resolution text images in downstream text recognition tasks. Notably, our model outperforms existing methods in recognition accuracy on the easy test subset. This further validates that TESRN effectively utilizes the shallow features of text images and underscores the crucial role of shallow features in text reconstruction.
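To illustrate what extracting coarse high-frequency signals with a wavelet transform can look like, the following is a minimal sketch of a single-level 2D Haar decomposition in plain Python. The abstract does not state which wavelet basis SFEB uses or how its subbands are combined with the convolution branch, so the Haar basis, the function name `haar_dwt2`, and the subband names are assumptions made purely for illustration, not the authors' implementation.

```python
def haar_dwt2(image):
    """Single-level 2D Haar transform (orthonormal scaling).

    `image` is a list of equal-length rows with even height and width.
    Returns four half-resolution subbands:
      approx     - local average (low-frequency content)
      col_detail - difference between adjacent columns (responds to vertical strokes)
      row_detail - difference between adjacent rows (responds to horizontal strokes)
      diag       - diagonal difference
    Haar is an assumption here; the abstract does not name the wavelet.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")

    approx, col_detail, row_detail, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block of pixels handled in this step.
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 2)
            c_row.append((tl - tr + bl - br) / 2)
            r_row.append((tl + tr - bl - br) / 2)
            d_row.append((tl - tr - bl + br) / 2)
        approx.append(a_row)
        col_detail.append(c_row)
        row_detail.append(r_row)
        diag.append(d_row)
    return approx, col_detail, row_detail, diag


# A tiny patch containing a vertical stroke edge (dark left, bright right).
patch = [[0, 1],
         [0, 1]]
approx, col_detail, row_detail, diag = haar_dwt2(patch)
# Only the column-difference subband is non-zero for a vertical edge.
```

The three detail subbands carry edge-like, high-frequency content of the kind that stroke boundaries in text produce, while the approximation subband holds the smooth background. A flat patch yields zeros in all three detail subbands.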

© 2024 SPIE and IS&T
Yanting Xiao, Xiaomei Tian, Yanyu Guo, Yongquan Xiao, and Yong Deng "A multi-level text feature enhancement super-resolution network via group convolution and wavelet transform," Journal of Electronic Imaging 33(5), 053045 (17 October 2024). https://doi.org/10.1117/1.JEI.33.5.053045
Received: 25 April 2024; Accepted: 17 September 2024; Published: 17 October 2024
KEYWORDS
Convolution

Feature extraction

Super resolution

Image restoration

Discrete wavelet transforms

Performance modeling

Image processing
