Language independent semantic kernels for short-text classification

  • Kwanho Kim
  • Beom Suk Chung
  • Yerim Choi
  • Seungjun Lee
  • Jae Yoon Jung
  • Jonghun Park

Research output: Contribution to journal, Article, peer-review

47 Scopus citations

Abstract

Short-text classification is increasingly used in a wide range of applications. It remains challenging, however, because short-text documents contain too few word occurrences, even though recently developed methods that exploit syntactic or semantic information have improved classification performance. The major drawback of these methods is language dependency: they rely heavily on grammatical tags and lexical databases, which limits their use in applications across diverse languages. In this article, we propose a novel kernel, called the language independent semantic (LIS) kernel, which effectively computes the similarity between short-text documents without using grammatical tags or lexical databases. Experimental results on English and Korean datasets show that the LIS kernel outperforms several existing kernels.

Original language: English
Pages (from-to): 735-743
Number of pages: 9
Journal: Expert Systems with Applications
Volume: 41
Issue number: 2
State: Published - 2014

Keywords

  • Kernel method
  • Language independent semantic kernel
  • Short-text document classification
  • Similarity measure
