An X-ray security image detection model incorporating a multi-scale fusion module is proposed to address the low accuracy of threat-object detection in X-ray images with complex backgrounds. The model adds a multi-channel fusion convolution block after the Neck layer to perform adaptive feature fusion and refinement, which strengthens the description of the global information and boundary attributes of X-ray threat objects and thereby improves the precision of detecting and identifying them. SIoU replaces CIoU as the bounding-box regression loss function; it redefines the penalty terms and reduces the total degrees of freedom of the loss to achieve high-accuracy localization. On the Tianchi dataset, the model effectively detects five categories of dangerous goods with an mAP of 92.7%, which is 2.1 percentage points higher than YOLOv5s, and satisfies real-time recognition and detection requirements with high accuracy, good robustness, and speed.
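The abstract does not include code, but the SIoU substitution it describes can be illustrated. Below is a minimal pure-Python sketch of the SIoU bounding-box loss following the published SCYLLA-IoU formulation (the function name, corner-coordinate box convention, and `eps` handling are our assumptions, not the authors'):

```python
import math

def siou_loss(box1, box2, eps=1e-7):
    """Sketch of the SIoU bounding-box regression loss.

    box1, box2: (x1, y1, x2, y2) corner coordinates.
    Returns 1 - IoU + (distance_cost + shape_cost) / 2.
    """
    b1x1, b1y1, b1x2, b1y2 = box1
    b2x1, b2y1, b2x2, b2y2 = box2
    w1, h1 = b1x2 - b1x1, b1y2 - b1y1
    w2, h2 = b2x2 - b2x1, b2y2 - b2y1

    # Plain IoU of the two boxes
    iw = max(0.0, min(b1x2, b2x2) - max(b1x1, b2x1))
    ih = max(0.0, min(b1y2, b2y2) - max(b1y1, b2y1))
    inter = iw * ih
    iou = inter / (w1 * h1 + w2 * h2 - inter + eps)

    # Smallest enclosing box (for normalizing center offsets)
    cw = max(b1x2, b2x2) - min(b1x1, b2x1) + eps
    ch = max(b1y2, b2y2) - min(b1y1, b2y1) + eps

    # Angle cost: penalizes center offsets that stray from the x/y axes
    s_cw = (b2x1 + b2x2 - b1x1 - b1x2) * 0.5
    s_ch = (b2y1 + b2y2 - b1y1 - b1y2) * 0.5
    sigma = math.sqrt(s_cw ** 2 + s_ch ** 2) + eps
    sin_a1, sin_a2 = abs(s_cw) / sigma, abs(s_ch) / sigma
    sin_alpha = sin_a2 if sin_a1 > math.sqrt(2) / 2 else sin_a1
    angle_cost = math.cos(2 * math.asin(min(1.0, sin_alpha)) - math.pi / 2)

    # Distance cost, modulated by the angle cost
    gamma = angle_cost - 2
    distance_cost = (2 - math.exp(gamma * (s_cw / cw) ** 2)
                       - math.exp(gamma * (s_ch / ch) ** 2))

    # Shape cost: mismatch between the boxes' widths and heights
    omega_w = abs(w1 - w2) / (max(w1, w2) + eps)
    omega_h = abs(h1 - h2) / (max(h1, h2) + eps)
    shape_cost = (1 - math.exp(-omega_w)) ** 4 + (1 - math.exp(-omega_h)) ** 4

    return 1 - iou + (distance_cost + shape_cost) / 2
```

Unlike CIoU's aspect-ratio term, the angle cost first steers the predicted center onto an axis through the ground-truth center, which is the "reduced degrees of freedom" the abstract refers to; identical boxes give a loss of zero.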
KEYWORDS: Convolution, Optical character recognition, Education and training, Feature extraction, Data modeling, Performance modeling, Overfitting, Matrices, Deep learning, Visual process modeling
Handwritten Chinese Character Recognition (HCCR) is the foundation of document digitization. It is a challenging subject in the field of image classification and recognition for a series of reasons, such as the large number of Chinese characters, the diversity of writing styles, and the many visually similar characters. To solve these problems, this paper designs a four-channel convolution recognition model based on MobileNetV2. First, the input image is fed to four convolution channels with different receptive fields, which extract feature maps at different scales to improve the accuracy of the model. The feature maps are then combined to enrich the diversity of features. Next, the combined features are weighted by an SE Block, which screens out the more useful feature maps and accelerates model convergence. Finally, the lightweight MobileNetV2 network classifies the weighted features. Experimental results show that the model reaches 96.05% recognition accuracy on the offline handwritten Chinese character set CASIA-HWDB1.1 and converges extremely fast. Its memory occupation and parameter count are also far lower than those of other Chinese handwritten character recognition models.
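The channel-reweighting step described above can be sketched in a few lines. Below is a minimal NumPy illustration of a Squeeze-and-Excitation block applied to a feature map such as the concatenation of the four convolution channels; the function name, the `(C, H, W)` layout, and the explicitly passed weight matrices are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def se_block(features, w1, b1, w2, b2):
    """Sketch of Squeeze-and-Excitation channel reweighting.

    features: (C, H, W) feature map, e.g. the concatenated multi-scale maps.
    w1/b1, w2/b2: weights of the bottleneck FC pair (C -> C//r -> C).
    """
    # Squeeze: global average pooling gives one descriptor per channel
    z = features.mean(axis=(1, 2))                  # shape (C,)
    # Excitation: FC -> ReLU -> FC -> sigmoid yields per-channel weights
    s = np.maximum(z @ w1 + b1, 0.0)                # shape (C//r,)
    s = 1.0 / (1.0 + np.exp(-(s @ w2 + b2)))        # shape (C,), in (0, 1)
    # Scale: reweight each channel by its learned importance
    return features * s[:, None, None]
```

Because the sigmoid weights lie in (0, 1), the block can only attenuate channels, which is how less useful feature maps are suppressed before the MobileNetV2 classifier.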