Comparing two deep learning algorithms for acute infarct segmentation on diffusion-weighted imaging in routine clinical practice (open access)
- Authors
- Kim, Hokyu; Lee, Moses; Lee, Hoyoun; Chung, Jinyong; Jeong, Sang-Wuk; Gwak, Dong-Seok; Kim, Beom Joon; Kim, Joon-Tae; Hong, Keun-Sik; Lee, Kyung Bok; Park, Tai Hwan; Park, Sang-Soon; Park, Jong-Moo; Kang, Kyusik; Cho, Yong-Jin; Park, Hong-Kyun; Lee, Byung-Chul; Yu, Kyung-Ho; Oh, Mi Sun; Lee, Soo Joo; Kim, Jae Guk; Cha, Jae-Kwan; Kim, Dae-Hyun; Lee, Jun; Park, Man Seok; Kim, Hosung; Bae, Hee-Joon; Kim, Dong-Eog; Kim, Chi Kyung; Ryu, Wi-Sun
- Issue Date
- 2025
- Publisher
- SAGE PUBLICATIONS LTD
- Keywords
- Artificial intelligence; deep learning; algorithms; diffusion magnetic resonance imaging; ischemic stroke
- Citation
- Digital Health, v.11
- Indexed
- SCIE; SSCI; SCOPUS
- Journal Title
- Digital Health
- Volume
- 11
- URI
- https://scholarworks.dongguk.edu/handle/sw.dongguk/62237
- DOI
- 10.1177/20552076251396985
- ISSN
- 2055-2076
- Abstract
- Objectives: Infarct volumes on diffusion-weighted imaging (DWI) are critical for predicting stroke outcomes and guiding late-window endovascular thrombectomy. Although 3D U-Net-based deep learning achieves high sensitivity, it often yields false positives due to infarct mimics. We developed a SegMamba-based model to enhance global volumetric feature extraction and compared both approaches on a dataset encompassing multiple DWI hyperintense pathologies. Methods: Two models were trained on a multicenter dataset of 10,820 DWI scans (2011-2014) and evaluated against manual segmentation on an external test set of 2731 fresh DWI scans. Diagnostic accuracy was assessed in a clinical cohort of 1194 patients from a different center (2017-2020) who underwent DWI for various indications. We compared the models using the Dice similarity coefficient (DSC), average Hausdorff distance (AHD), sensitivity, and specificity. Results: The training, external test, and clinical test datasets had mean (SD) ages of 67.9 (12.8), 68.2 (12.7), and 63.9 (15.4) years, with 58.9%, 60.4%, and 58.1% male, respectively. In the external test dataset, SegMamba and U-Net achieved similar DSC (0.786 vs 0.785; p = 0.141), but SegMamba outperformed U-Net in AHD (1.25 mm vs 1.76 mm; p < 0.001). In the clinical dataset, SegMamba showed slightly lower sensitivity (96.97% vs 98.79%) but substantially higher specificity (58.80% vs 29.54%), resulting in higher overall accuracy (64.07% vs 39.11%; p < 0.001). Conclusions: Changing the main architecture of the segmentation model alone maintained segmentation performance within ischemic-stroke cohorts, while achieving better classification in broader disease populations. This study highlights the need for deep-learning models to be validated not only for segmentation performance within target disease cohorts but also across diverse clinical environments to ensure practical utility.
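The metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function names (`dice`, `accuracy_from_rates`, `implied_prevalence`) and the toy masks are hypothetical, and the prevalence is back-calculated from the reported rates, not taken from the abstract. The sketch assumes the standard Dice definition on binary masks and the identity that overall accuracy is a prevalence-weighted mix of sensitivity and specificity.

```python
def dice(pred, truth):
    """Dice similarity coefficient for two equal-length binary masks (flattened voxels)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Overall accuracy as a prevalence-weighted mix of sensitivity and specificity."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


def implied_prevalence(sensitivity, specificity, accuracy):
    """Solve the identity above for prevalence (back-calculated, not a reported value)."""
    return (accuracy - specificity) / (sensitivity - specificity)


# Toy masks, not study data: 3 predicted voxels, 3 true voxels, 2 shared.
toy_dsc = dice([1, 1, 1, 0, 0], [0, 1, 1, 1, 0])

# Clinical-cohort rates quoted in the abstract, expressed as fractions.
prev_segmamba = implied_prevalence(0.9697, 0.5880, 0.6407)
prev_unet = implied_prevalence(0.9879, 0.2954, 0.3911)
```

Under these assumptions both models imply nearly the same infarct prevalence in the clinical cohort (roughly 14%), which is consistent with the two models having been evaluated on the same patients. Because most patients in such a cohort have no infarct, specificity carries most of the weight in overall accuracy, which is why the gap in specificity translates into a large gap in accuracy despite similar sensitivity.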
- Appears in Collections
- Graduate School > Department of Medicine > 1. Journal Articles
