Video Analysis of Small Bowel Capsule Endoscopy Using a Transformer Network

Sang Yup Oh, Dong Jun Oh, Dongmin Kim, Woohyuk Song, Youngbae Hwang, Namik Cho, Yun Jeong Lim

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Although wireless capsule endoscopy (WCE) detects small bowel diseases effectively, it has some limitations. For example, the reading process can be time-consuming because of the numerous images generated per case, and lesion detection accuracy may depend on the operator's skill and experience. Hence, many researchers have recently developed deep-learning-based methods to address these limitations. However, these methods tend to select only a portion of the images from a given WCE video and analyze each image individually. In this study, we note that more information can be extracted from the unused frames and from the temporal relations between sequential frames. Specifically, to increase lesion detection accuracy without depending on experts' frame selection skills, we suggest using the whole set of video frames as the input to the deep learning system. Thus, we propose a new Transformer-based neural encoder that takes the entire video as input, exploiting the power of the Transformer architecture to extract long-term global correlations within and between the input frames. This allows us to capture both the temporal context across input frames and the attentional features within each frame. Tests on benchmark datasets of four WCE videos showed 95.1% sensitivity and 83.4% specificity. These results may significantly advance automated lesion detection techniques for WCE images.
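The sketch below illustrates the general idea described in the abstract: per-frame features from a CNN backbone are passed through a Transformer encoder so that self-attention can model temporal context across the whole frame sequence, with a per-frame lesion/normal head on top. It is a minimal illustration only, not the authors' released code; the backbone choice, dimensions, sequence length, and classification head are assumptions for demonstration.

```python
# Minimal sketch of a video-level Transformer encoder for WCE frame sequences.
# Assumptions (not from the paper): ResNet-18 backbone, 512-d features,
# learned positional embedding, 4 encoder layers, binary per-frame output.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class VideoTransformerClassifier(nn.Module):
    def __init__(self, d_model=512, n_heads=8, n_layers=4, n_classes=2, max_frames=4096):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()              # keep 512-d per-frame features
        self.backbone = backbone
        self.pos_embed = nn.Parameter(torch.zeros(1, max_frames, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)  # per-frame lesion / normal logits

    def forward(self, frames):                   # frames: (B, T, 3, H, W)
        b, t, c, h, w = frames.shape
        feats = self.backbone(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
        feats = feats + self.pos_embed[:, :t]    # add temporal position information
        feats = self.encoder(feats)              # self-attention across all frames
        return self.head(feats)                  # (B, T, n_classes) per-frame scores


# Usage: score every frame of a short capsule-video clip in one pass.
logits = VideoTransformerClassifier()(torch.randn(1, 16, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 16, 2])
```

In practice, a full capsule study contains tens of thousands of frames, so such a model would typically process the video in overlapping chunks or use memory-efficient attention; that engineering detail is outside the scope of this sketch.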

Original language: English
Article number: 3133
Journal: Diagnostics
Volume: 13
Issue number: 19
DOIs
State: Published - Oct 2023

Keywords

  • artificial intelligence
  • capsule endoscopy
  • transformer
  • video-analysis
