Fine-grained image classification aims to accurately distinguish subclasses within a single category. Because inter-class differences are small and intra-class variations are large, it remains a challenging and valuable research topic in computer vision. Existing neural network-based algorithms lose fine-grained texture details during training and cannot effectively fuse features extracted from different convolution layers of the backbone network. To address these issues, this paper proposes a fine-grained image classification method built on a lightweight feature extraction network with MobileNet v2 as its core, incorporating multi-scale feature fusion and an attention mechanism. Because high-level and low-level features carry rich semantic and textural information, respectively, attention mechanisms are embedded at different scales to capture more diverse feature information. In experiments on the publicly available fine-grained dataset A Large Scale Fish Dataset, the method achieves a classification accuracy of 99.86%. The results demonstrate the effectiveness of the proposed method for fine-grained object classification.
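The idea of applying attention at several scales and then fusing the results can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the abstract does not specify the attention type or the fusion rule, so the parameter-free sigmoid channel gate, the fusion by concatenating globally pooled descriptors, and all function names (`global_avg_pool`, `channel_attention`, `fuse_multiscale`) are assumptions made for illustration. The toy feature maps stand in for outputs of a shallow and a deep backbone layer.

```python
import math

def global_avg_pool(fmap):
    """fmap is a list of channels, each an H x W list of lists.
    Returns one mean activation per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def channel_attention(fmap):
    """Assumed stand-in for a learned channel attention module:
    scale each channel by the sigmoid of its own mean activation."""
    means = global_avg_pool(fmap)
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    return [[[v * g for v in row] for row in ch] for ch, g in zip(fmap, gates)]

def fuse_multiscale(fmaps):
    """Apply attention at each scale, pool each result to a vector,
    and concatenate the vectors into one fused descriptor (assumed fusion rule)."""
    descriptor = []
    for fmap in fmaps:
        descriptor.extend(global_avg_pool(channel_attention(fmap)))
    return descriptor

# Toy inputs: a low-level map (2 channels, 4x4) and a high-level map (3 channels, 2x2).
low = [
    [[1.0, 2.0, 3.0, 4.0] for _ in range(4)],
    [[0.0] * 4 for _ in range(4)],
]
high = [
    [[2.0, 2.0], [2.0, 2.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]

fused = fuse_multiscale([low, high])
print(len(fused))  # one entry per channel across both scales
```

In a real network the gates would come from learned layers and the fused descriptor would feed a classifier head; the sketch only shows how features of different spatial sizes can be reweighted per scale and combined into a single vector.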