Investigating coupling preprocessing with shallow and deep convolutional neural networks in document image classification

Yi Liu; Leen-Kiat Soh; Elizabeth Lorang

doi:10.1117/1.JEI.30.4.043024

Journal of Electronic Imaging, Vol. 30, Issue 4, 043024 (August 2021). https://doi.org/10.1117/1.JEI.30.4.043024

Abstract

Convolutional neural networks (CNNs) are effective for image classification, and increasingly deep CNNs are being used to improve classification performance. Indeed, as the need to search vast collections of printed document images grows, powerful CNNs have been used in place of conventional image processing. However, the better performance of deep CNNs comes at the cost of greater computational complexity. Is the additional training effort required by deeper CNNs worth the improvement in performance? Or could a shallow CNN coupled with conventional image processing (e.g., binarization and consolidation) outperform deeper CNN-based solutions? We investigate performance gaps among shallow (LeNet-5, -7, and -9), deep (ResNet-18), and very deep (ResNet-152, MobileNetV2, and EfficientNet) CNNs on noisy printed document images, e.g., historical newspapers and document images in the RVL-CDIP repository. Our investigation considers two classification tasks: (1) identifying poems in historical newspapers and (2) classifying document images into 16 document types. Empirical results show that (i) a shallow CNN coupled with computationally inexpensive preprocessing can remain robust when trained on significantly fewer samples; (ii) deep CNNs coupled with preprocessing can outperform very deep CNNs effectively and efficiently; and (iii) aggressive preprocessing is not helpful, as it could remove potentially useful information from document images.
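The abstract names binarization as one example of computationally inexpensive preprocessing but does not say which method is used. As a minimal sketch only, the block below assumes Otsu's global thresholding on an 8-bit grayscale page image; the choice of Otsu's method and the function names `otsu_threshold` and `binarize` are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return Otsu's threshold for an 8-bit grayscale image (dtype uint8).

    Assumption: Otsu's method stands in for the unspecified binarization
    step mentioned in the abstract.
    """
    # Normalized histogram over the 256 possible gray levels.
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()

    omega = np.cumsum(prob)                    # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean up to t
    mu_total = mu[-1]

    # Between-class variance for every candidate threshold t.
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                  # one class is empty: no split

    return int(np.argmax(sigma_b))


def binarize(gray: np.ndarray) -> np.ndarray:
    """Map pixels above the Otsu threshold to 1 and all others to 0."""
    return (gray > otsu_threshold(gray)).astype(np.uint8)
```

A binarized page produced this way could then be passed to any of the CNNs in place of the raw grayscale image. Whether such a step helps depends on the data, which is the question the paper investigates.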

CC BY: © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.

Citation

Yi Liu, Leen-Kiat Soh, and Elizabeth Lorang "Investigating coupling preprocessing with shallow and deep convolutional neural networks in document image classification," Journal of Electronic Imaging 30(4), 043024 (25 August 2021). https://doi.org/10.1117/1.JEI.30.4.043024

Received: 4 April 2021; Accepted: 3 August 2021; Published: 25 August 2021

JOURNAL ARTICLE
30 PAGES