TY - JOUR
T1 - Automated Detection and Grading of Renal Cell Carcinoma in Histopathological Images via Efficient Attention Transformer Network
AU - Al-kuwari, Hissa
AU - Alshami, Belqes
AU - Al-Khinji, Aisha
AU - Haider, Adnan
AU - Arsalan, Muhammad
N1 - Publisher Copyright:
© 2025 by the authors.
PY - 2025/12
Y1 - 2025/12
N2 - Background: Renal Cell Carcinoma (RCC) is the most common type of kidney cancer and requires accurate histopathological grading for effective prognosis and treatment planning. However, manual grading is time-consuming, subjective, and susceptible to inter-observer variability. Objective: This study proposes EAT-Net (Efficient Attention Transformer Network), a dual-stream deep learning model designed to automate and enhance RCC grade classification from histopathological images. Method: EAT-Net integrates EfficientNetB0 for local feature extraction and a Vision Transformer (ViT) stream for capturing global contextual dependencies. The architecture incorporates Squeeze-and-Excitation (SE) modules to recalibrate feature maps, improving focus on informative regions. The model was trained and evaluated on two publicly available datasets, KMC-RENAL and RCCG-Net. Standard preprocessing was applied, and the model’s performance was assessed using accuracy, precision, recall, and F1-score. Results: EAT-Net achieved superior results compared to state-of-the-art models, with an accuracy of 92.25%, precision of 92.15%, recall of 92.12%, and F1-score of 92.25%. Ablation studies demonstrated the complementary value of the EfficientNet and ViT streams. Additionally, Grad-CAM visualizations confirmed that the model focuses on diagnostically relevant areas, supporting its interpretability and clinical relevance. Conclusion: EAT-Net offers an accurate and explainable framework for RCC grading. Its lightweight architecture and high performance make it well suited for clinical deployment in digital pathology workflows.
AB - Background: Renal Cell Carcinoma (RCC) is the most common type of kidney cancer and requires accurate histopathological grading for effective prognosis and treatment planning. However, manual grading is time-consuming, subjective, and susceptible to inter-observer variability. Objective: This study proposes EAT-Net (Efficient Attention Transformer Network), a dual-stream deep learning model designed to automate and enhance RCC grade classification from histopathological images. Method: EAT-Net integrates EfficientNetB0 for local feature extraction and a Vision Transformer (ViT) stream for capturing global contextual dependencies. The architecture incorporates Squeeze-and-Excitation (SE) modules to recalibrate feature maps, improving focus on informative regions. The model was trained and evaluated on two publicly available datasets, KMC-RENAL and RCCG-Net. Standard preprocessing was applied, and the model’s performance was assessed using accuracy, precision, recall, and F1-score. Results: EAT-Net achieved superior results compared to state-of-the-art models, with an accuracy of 92.25%, precision of 92.15%, recall of 92.12%, and F1-score of 92.25%. Ablation studies demonstrated the complementary value of the EfficientNet and ViT streams. Additionally, Grad-CAM visualizations confirmed that the model focuses on diagnostically relevant areas, supporting its interpretability and clinical relevance. Conclusion: EAT-Net offers an accurate and explainable framework for RCC grading. Its lightweight architecture and high performance make it well suited for clinical deployment in digital pathology workflows.
KW - deep learning
KW - EfficientNet
KW - histopathology
KW - medical image classification
KW - renal cell carcinoma
KW - vision transformer
UR - https://www.scopus.com/pages/publications/105022808621
U2 - 10.3390/medsci13040257
DO - 10.3390/medsci13040257
M3 - Article
C2 - 41283258
AN - SCOPUS:105022808621
SN - 2076-3271
VL - 13
JO - Medical Sciences
JF - Medical Sciences
IS - 4
M1 - 257
ER -