15 October 2024
Zero-shot learning with visual-semantic mutual reinforcement for image recognition
Yuhong Zhang, Taohong Chen, Kui Yu, Xuegang Hu
Abstract

Zero-shot learning for image recognition recognizes unseen classes by learning representations that transfer from seen to unseen classes, from either a visual or a semantic perspective. Recently, a few works have focused on learning the transferability of both the visual and semantic modalities. They aim to adapt one modality to the other to achieve more reliable transferability. However, their performance depends on the transferability being consistent across both modalities. When the transferability of one modality is insufficient, the adaptation leads to sparse and indistinguishable representations of unseen classes. To address this, a zero-shot method with visual-semantic mutual reinforcement is proposed, in which visual transferability and semantic transferability reinforce each other so that the transferable representations of the two modalities complement each other, resulting in more discriminative transferable representations. Specifically, in visual transferability learning, local semantics are used to reinforce the key region representations, thus enriching the key regions and excluding the ambiguous regions in the image representations. In semantic transferability learning, visual class prototypes are used to reinforce the semantic representations with higher class discriminability. Finally, the image representations and attribute class prototypes for unseen classes are combined to recognize the unseen samples. Experimental results on multiple datasets show that our method outperforms state-of-the-art methods in both zero-shot and generalized zero-shot recognition.
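The final recognition step, in which an image representation is compared with class prototypes for unseen classes, can be illustrated with a generic nearest-prototype sketch. This is not the authors' implementation: the cosine-similarity scoring, the function names `cosine` and `predict_unseen`, the class labels, and all numeric values are assumptions made purely for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def predict_unseen(image_repr, prototypes):
    """Return the label whose prototype is most similar to the image representation."""
    return max(prototypes, key=lambda label: cosine(image_repr, prototypes[label]))

# Hypothetical 3-dimensional prototypes for two made-up unseen classes.
unseen_prototypes = {
    "zebra": [1.0, 0.0, 1.0],
    "whale": [0.0, 1.0, 0.0],
}

# Hypothetical image representation projected into the same space.
image_repr = [0.9, 0.1, 0.8]
prediction = predict_unseen(image_repr, unseen_prototypes)
print(prediction)
```

In this sketch the prototypes are fixed by hand; in a method such as the one described above, both the image representation and the prototypes would be learned, and the scoring function may differ.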

© 2024 SPIE and IS&T
Yuhong Zhang, Taohong Chen, Kui Yu, and Xuegang Hu "Zero-shot learning with visual-semantic mutual reinforcement for image recognition," Journal of Electronic Imaging 33(5), 053041 (15 October 2024). https://doi.org/10.1117/1.JEI.33.5.053041
Received: 25 June 2024; Accepted: 23 September 2024; Published: 15 October 2024
KEYWORDS
Visualization

Semantics

Prototyping

Education and training

Feature extraction

Classification systems

Picosecond phenomena
