BDA: Bi-directional attention for zero-shot learning

  • Junseok Lee
  • Jinming Cao
  • Yifang Yin
  • Jihie Kim
  • Roger Zimmermann
  • Seongsik Park

Research output: Contribution to journal › Article › peer-review

Abstract

Zero-shot learning (ZSL) is an important and rapidly growing area of machine learning that aims to recognize new classes without prior training data. Despite its significance, ZSL has faced challenges with overfitting in embedding-based methods and limitations in traditional one-directional attention (ODA) based approaches. To bridge these gaps, this paper proposes the use of bi-directional attention (BDA) to integrate insights from both embedding and attention-based approaches. The proposed BDA system consists of a bi-directional attention network (BDAN) and a synthesized visual embedding network (SVEN) that facilitates visual-semantic interaction for ZSL classification. More specifically, the BDAN employs region self-attention (RSA), semantic synthesis attention (SSA), and visual synthesis attention (VSA) to overcome the overfitting issue in embedding methods and enhance transferability, to associate visual features with semantic property information, and to learn locally improved visual features. Extensive testing on the CUB, SUN, and AWA2 datasets confirms the superiority of our proposed method over traditional approaches.

Original language: English
Pages (from-to): 983-1003
Number of pages: 21
Journal: Computational Visual Media
Volume: 11
Issue number: 5
State: Published - 2025

Keywords

  • bi-directional attention (BDA)
  • interaction
  • transferability
  • zero-shot learning (ZSL)
