Data preprocessing and transformation in the sentiment analysis using a deep learning technique

Seo Hye-Jin, Jeong Ah Shin

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

This study examined how to preprocess and transform data efficiently so that deep learning techniques can be used to analyze linguistic data. Researchers' interest in deep learning techniques has increased explosively worldwide; however, linguists find it difficult to connect linguistics with deep learning techniques or algorithms because they do not know how or where to begin. This study therefore provides a general procedure for training on data with deep learning algorithms in practice. As a worked case, we focused on how to preprocess and transform Tweet data for sentiment analysis using deep learning techniques. In addition, we introduced a recent deep learning algorithm, BERT, into the data preprocessing and transformation procedure. Data preprocessing is particularly important because the results of deep learning can vary significantly depending on it. Although the preprocessing procedure can differ according to the aim of the research, this study introduces the general approach that advanced researchers frequently use for deep learning algorithms. The study is expected to lower the barriers to applying deep learning techniques to linguistic data and to make it easier for researchers to conduct deep learning research related to linguistics.

Original language: English
Pages (from-to): 42-63
Number of pages: 22
Journal: Korean Journal of English Language and Linguistics
Volume: 2020
Issue number: 20
State: Published - 2020

Keywords

  • Data preprocessing
  • Deep learning
  • Sentiment analysis
  • Transformation
