Comparing two deep learning algorithms for acute infarct segmentation on diffusion-weighted imaging in routine clinical practice

Hokyu Kim, Moses Lee, Hoyoun Lee, Jinyong Chung, Sang Wuk Jeong, Dong Seok Gwak, Beom Joon Kim, Joon Tae Kim, Keun Sik Hong, Kyung Bok Lee, Tai Hwan Park, Sang Soon Park, Jong Moo Park, Kyusik Kang, Yong Jin Cho, Hong Kyun Park, Byung Chul Lee, Kyung Ho Yu, Mi Sun Oh, Soo Joo Lee, Jae Guk Kim, Jae Kwan Cha, Dae Hyun Kim, Jun Lee, Man Seok Park, Hosung Kim, Hee Joon Bae, Dong Eog Kim, Chi Kyung Kim, Wi Sun Ryu

Research output: Contribution to journal › Article › peer-review

Abstract

Objectives: Infarct volumes on diffusion-weighted imaging (DWI) are critical for predicting stroke outcomes and guiding late-window endovascular thrombectomy. Although 3D U-Net-based deep learning achieves high sensitivity, it often yields false positives due to infarct mimics. We developed a SegMamba-based model to enhance global volumetric feature extraction and compared both approaches on a dataset encompassing multiple DWI hyperintense pathologies.

Methods: Two models were trained on a multicenter dataset of 10,820 DWI scans (2011–2014) and evaluated against manual segmentation on an external test set of 2731 fresh DWI scans. Diagnostic accuracy was assessed in a clinical cohort of 1194 patients from a different center (2017–2020) who underwent DWI for various indications. We compared the models using the Dice similarity coefficient (DSC), average Hausdorff distance (AHD), sensitivity, and specificity.

Results: The training, external test, and clinical test datasets had mean (SD) ages of 67.9 (12.8), 68.2 (12.7), and 63.9 (15.4) years, with 58.9%, 60.4%, and 58.1% male, respectively. In the external test dataset, SegMamba and U-Net achieved similar DSC (0.786 vs 0.785; p = 0.141), but SegMamba outperformed U-Net in AHD (1.25 mm vs 1.76 mm; p < 0.001). In the clinical dataset, SegMamba showed slightly lower sensitivity (96.97% vs 98.79%) but substantially higher specificity (58.80% vs 29.54%), resulting in higher overall accuracy (64.07% vs 39.11%; p < 0.001).

Conclusions: Changing the main architecture of the segmentation model alone maintained segmentation performance within ischemic-stroke cohorts, while achieving better classification in broader disease populations. This study highlights the need for deep-learning models to be validated not only for segmentation performance within target disease cohorts but also across diverse clinical environments to ensure practical utility.
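The metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the flat-list mask representation, and the toy inputs are assumptions made for illustration. The last helper uses the identity accuracy = sensitivity × prevalence + specificity × (1 − prevalence) to back out the infarct prevalence implied by the reported clinical-cohort percentages; that prevalence is an inference from rounded figures, not a value stated in the abstract.

```python
def dice(pred, truth):
    """Dice similarity coefficient between two flat binary masks (0/1 sequences)."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * inter / total


def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


def implied_prevalence(sensitivity, specificity, accuracy):
    """Solve accuracy = sens * prev + spec * (1 - prev) for prev."""
    return (accuracy - specificity) / (sensitivity - specificity)


# Toy masks (hypothetical): one overlapping voxel, two positives in each mask.
toy_dice = dice([1, 1, 0, 0], [1, 0, 1, 0])

# Toy confusion matrix (hypothetical counts, not study data).
toy_sens, toy_spec, toy_acc = classification_metrics(tp=9, fp=2, tn=8, fn=1)

# Percentages reported in the abstract for the clinical cohort.
prev_segmamba = implied_prevalence(0.9697, 0.5880, 0.6407)
prev_unet = implied_prevalence(0.9879, 0.2954, 0.3911)

print(f"toy DSC: {toy_dice:.2f}")
print(f"implied prevalence (SegMamba figures): {prev_segmamba:.3f}")
print(f"implied prevalence (U-Net figures): {prev_unet:.3f}")
```

Both sets of reported figures imply roughly the same prevalence (about 14%), which is what one would expect if both models were scored on the same cohort. At that prevalence most patients are infarct-negative, so overall accuracy is driven mainly by specificity, which is consistent with the large accuracy gap reported despite similar sensitivities.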

Original language: English
Journal: Digital Health
Volume: 11
State: Published - 1 Nov 2025

Keywords

  • Artificial intelligence
  • algorithms
  • deep learning
  • diffusion magnetic resonance imaging
  • ischemic stroke
