Hierarchical semantic loss and confidence estimator for visual-semantic embedding-based zero-shot learning

Sanghyun Seo, Juntae Kim

Research output: Contribution to journal › Article › peer-review


Abstract

Traditional supervised learning depends on the labels of the training data, so class labels that are not included in the training data cannot be recognized properly. Zero-shot learning, which can recognize unseen classes that do not appear during training, is therefore gaining research interest. One approach to zero-shot learning is to embed visual data such as images, together with rich semantic data related to the text labels of that visual data, into a common vector space, and then perform zero-shot cross-modal retrieval on newly input unseen-class data. This paper proposes a hierarchical semantic loss and a confidence estimator to perform zero-shot learning on visual data more efficiently. The hierarchical semantic loss improves learning efficiency by using hierarchical knowledge to select the negative sample of the triplet loss, and the confidence estimator estimates a confidence score to determine whether an input belongs to a seen class or an unseen class. These methods improve the performance of zero-shot learning by adjusting the distances from a semantic vector to visual vectors when performing zero-shot cross-modal retrieval. Experimental results show that the proposed method improves zero-shot learning performance in terms of hit@k accuracy.
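The abstract does not give the exact formulation, but the two ideas it names can be sketched as follows: a standard hinge triplet loss in the shared embedding space, with the negative class drawn from the same branch of a class hierarchy as the anchor's label (a "hard" negative, which is one plausible reading of "hierarchical knowledge in selecting a negative sample"). The hierarchy, class names, and margin below are hypothetical toy values, not the paper's.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge triplet loss: pull the positive closer than the
    negative by at least `margin` in the shared embedding space."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, margin + d_pos - d_neg)

def sample_hierarchical_negative(label, hierarchy, rng):
    """Pick a negative class that shares a parent with `label`
    in the class hierarchy (a hard negative); fall back to any
    other class if `label` has no siblings."""
    siblings = [c for c, parent in hierarchy.items()
                if parent == hierarchy[label] and c != label]
    pool = siblings or [c for c in hierarchy if c != label]
    return rng.choice(pool)

# Hypothetical toy hierarchy mapping class -> parent.
hierarchy = {"cat": "mammal", "dog": "mammal",
             "sparrow": "bird", "eagle": "bird"}
rng = np.random.default_rng(0)
neg = sample_hierarchical_negative("cat", hierarchy, rng)
# "dog" is the only sibling of "cat" under "mammal", so it is chosen.
```

A confidence estimator in this setting could then be as simple as thresholding the distance from an input's embedding to its nearest seen-class semantic vector: far from every seen class suggests an unseen class. This is a sketch under stated assumptions, not the authors' implementation.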

Original language: English
Article number: 3133
Journal: Applied Sciences (Switzerland)
Volume: 9
Issue number: 15
DOIs
State: Published - 1 Aug 2019

Keywords

  • Confidence estimator
  • Hierarchical semantic loss
  • Visual-semantic embedding
  • Zero-shot cross-modal retrieval
  • Zero-shot learning

