Adaptive undersampling and short clip-based two-stream CNN-LSTM model for surgical phase recognition on cholecystectomy videos

Sang Goo Lee, Ga Young Kim, Yoo Na Hwang, Ji Yean Kwon, Sung Min Kim

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Surgical phase recognition is challenging because imbalanced data among surgical phases leads to overfitting. We propose an undersampling method based on an adaptive sampling rate that yields a similar number of samples for each surgical phase, alleviating biased learning. To further improve performance, we also introduce a two-stream CNN-LSTM model that extracts temporal information on behavioral changes between image frames. First, we extracted a total of 40,236 short clips from the entire video using an adaptive subsampling rate. Each short clip was fed into a pre-trained GoogLeNet. The resulting visual features were then fed directly into a sequence-to-sequence LSTM to extract temporal information from neighboring frames within a short clip. In parallel, a second, sequence-to-vector LSTM was used to extract temporal information from all successive image frames and predict the final surgical phase. The proposed method was evaluated on the public Cholec80 dataset. It outperformed state-of-the-art methods, achieving an F1-score of 87.12% and an AUC of 98.00%. In addition, the deviation in F1-score across phases decreased by about 10% compared with that before undersampling was applied. Experimental results confirmed that the proposed method learns enriched temporal information from short clips and outperforms the conventional one-stream CNN-LSTM architecture.
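
The adaptive undersampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact rule used to set each phase's sampling rate, so the rule below (stride = phase frame count divided by the smallest phase count, rounded down) is an assumption for illustration only. The function names `adaptive_strides` and `undersample_indices` and the toy phase labels are likewise hypothetical.

```python
from collections import Counter, defaultdict


def adaptive_strides(labels, target_per_phase=None):
    """Pick one sampling stride per phase so that every phase ends up
    with roughly the same number of kept frames.

    Assumption (not from the paper): stride = count // target, where the
    target defaults to the size of the rarest phase.
    """
    counts = Counter(labels)
    target = target_per_phase or min(counts.values())
    return {phase: max(1, n // target) for phase, n in counts.items()}


def undersample_indices(labels, target_per_phase=None):
    """Return indices of frames kept after phase-wise undersampling.

    Each kept index could serve as the anchor frame of one short clip.
    """
    strides = adaptive_strides(labels, target_per_phase)
    seen = defaultdict(int)  # frames encountered so far, per phase
    kept = []
    for i, phase in enumerate(labels):
        # Keep every stride-th frame of this phase, starting with the first.
        if seen[phase] % strides[phase] == 0:
            kept.append(i)
        seen[phase] += 1
    return kept


# Toy, imbalanced label sequence (hypothetical phase names and counts).
labels = ["P1"] * 100 + ["P2"] * 1000 + ["P3"] * 250
kept = undersample_indices(labels)
balanced = Counter(labels[i] for i in kept)
print(balanced)
```

With these toy counts the strides come out as 1, 10, and 2, so the kept frames number 100, 100, and 125 per phase. The balance is approximate because the stride is rounded down to an integer; a finer rule (for example a fractional rate) would tighten it at the cost of uneven spacing.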

Original language: English
Article number: 105637
Journal: Biomedical Signal Processing and Control
Volume: 88
State: Published - Feb 2024

Keywords

  • Automated surgical phase recognition
  • Cholecystectomy
  • Endoscopic video
  • Short-clip-based
  • Two-stream CNN-LSTMs
  • Undersampling
