TY - JOUR
T1 - Systematic Integration of Attention Modules into CNNs for Accurate and Generalizable Medical Image Classification
AU - Ullah, Zahid
AU - Hong, Minki
AU - Mahmood, Tahir
AU - Kim, Jihie
N1 - Publisher Copyright: © 2025 by the authors.
PY - 2025/11
Y1 - 2025/11
N2 - Deep learning has demonstrated significant promise in medical image analysis; however, standard CNNs frequently encounter challenges in detecting subtle and intricate features vital for accurate diagnosis. To address this limitation, we systematically integrated attention mechanisms into five commonly used CNN backbones: VGG16, ResNet18, InceptionV3, DenseNet121, and EfficientNetB5. Each network was modified using either a Squeeze-and-Excitation block or a hybrid Convolutional Block Attention Module, allowing for more effective recalibration of channel and spatial features. We evaluated these attention-augmented models on two distinct datasets: (1) a Products of Conception histopathological dataset containing four tissue categories, and (2) a brain tumor MRI dataset that includes multiple tumor subtypes. Across both datasets, networks enhanced with attention mechanisms consistently outperformed their baseline counterparts on all measured evaluation criteria. Importantly, EfficientNetB5 with hybrid attention achieved superior overall results, with notable enhancements in both accuracy and generalizability. In addition to improved classification outcomes, the inclusion of attention mechanisms also advanced feature localization, thereby increasing robustness across a range of imaging modalities. Our study established a comprehensive framework for incorporating attention modules into diverse CNN architectures and delineated their impact on medical image classification. These results provide important insights for the development of interpretable and clinically robust deep learning-driven diagnostic systems.
KW - attention mechanism
KW - convolutional neural networks
KW - medical image classification
KW - squeeze and excitation
UR - https://www.scopus.com/pages/publications/105023208656
U2 - 10.3390/math13223728
DO - 10.3390/math13223728
M3 - Article
AN - SCOPUS:105023208656
SN - 2227-7390
VL - 13
JO - Mathematics
JF - Mathematics
IS - 22
M1 - 3728
ER -