Abstract
Zero-shot learning (ZSL) is an important and rapidly growing area of machine learning that aims to recognize new classes without prior training data. Despite its significance, ZSL faces two challenges: embedding-based methods tend to overfit, and traditional one-directional attention (ODA) approaches are limited. To bridge these gaps, this paper proposes bi-directional attention (BDA), which integrates insights from both embedding-based and attention-based approaches. The proposed BDA system consists of a bi-directional attention network (BDAN) and a synthesized visual embedding network (SVEN) that facilitates visual-semantic interaction for ZSL classification. More specifically, the BDAN employs region self-attention (RSA), semantic synthesis attention (SSA), and visual synthesis attention (VSA) to overcome the overfitting issue in embedding methods and enhance transferability, to associate visual features with semantic property information, and to learn locally improved visual features. Extensive testing on the CUB, SUN, and AWA2 datasets confirms the superiority of the proposed method over traditional approaches.
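The idea of attending in both directions between visual regions and semantic attributes can be illustrated with a minimal sketch. This is not the paper's BDAN: the abstract does not give its formulation, so the sketch assumes plain scaled dot-product attention, and the names `attend` and `bidirectional_attention` as well as the toy vectors are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys):
    """Each query attends over all keys (scaled dot-product, an assumption).

    Returns the attention weights and, per query, the weighted sum of keys.
    """
    d = len(keys[0])
    weights, outputs = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * keys[j][i] for j in range(len(keys))) for i in range(d)]
        )
    return weights, outputs


def bidirectional_attention(regions, attributes):
    """Hypothetical two-way attention between region and attribute vectors."""
    # Visual -> semantic: every region summarises the attributes it matches.
    v2s_w, semantic_per_region = attend(regions, attributes)
    # Semantic -> visual: every attribute summarises the regions it matches.
    s2v_w, visual_per_attribute = attend(attributes, regions)
    return (v2s_w, semantic_per_region), (s2v_w, visual_per_attribute)


# Toy inputs: 3 region features and 2 attribute embeddings, both 2-dimensional.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attributes = [[2.0, 0.0], [0.0, 2.0]]
(v2s_w, sem), (s2v_w, vis) = bidirectional_attention(regions, attributes)
```

A one-directional scheme would compute only one of the two `attend` calls; the sketch simply shows that the same similarity scores can be normalised along either axis to obtain both views.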
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 983-1003 |
| Number of pages | 21 |
| Journal | Computational Visual Media |
| Volume | 11 |
| Issue number | 5 |
| State | Published - 2025 |
Keywords
- bi-directional attention (BDA)
- interaction
- transferability
- zero-shot learning (ZSL)
Title
BDA: Bi-directional attention for zero-shot learning