Volume - 7 | Issue - 3 | September 2025
Published
25 July, 2025
Precise Gleason grading of prostate biopsy specimens is vital for determining the appropriate clinical management of prostate cancer. However, conventional manual evaluation by pathologists is subjective and prone to inter-observer variability, which leads to inconsistent diagnoses and can result in suboptimal treatment decisions. We therefore present a hybrid deep-learning architecture that combines a modified ResNet50 convolutional backbone with a Vision Transformer (ViT) module for automated, standardized Gleason classification. The ResNet50 component is a 50-layer network built from bottleneck residual blocks and is used to localize texture and glandular patterns in contrast-enhanced histopathological images. Its spatially rich feature maps are passed to the ViT module, which captures long-range dependencies and contextual relationships across image patches through multi-head self-attention in transformer encoder layers. Combining local feature extraction with global attention allows the model to learn the subtle morphological variations needed to distinguish six Gleason patterns at scale. The model was trained and validated on a balanced multiclass dataset of prostate biopsy images and achieved a classification accuracy of 99%, exceeding several existing deep-learning baselines. The hybrid architecture aims to improve diagnostic consistency while providing a practical, interpretable framework for clinical workflows geared toward high-throughput prostate cancer screening, especially in resource-limited healthcare settings.
Keywords: Hybrid Vision-ResNet50, Gleason Grading, Prostate Biopsy Images, Deep Learning, Histopathology Classification
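
The data flow described in the abstract (convolutional feature map, patch tokens, self-attention, six-way classification) can be illustrated with a toy sketch in plain Python. This is not the authors' implementation. The function names, the tiny channel count, the random inputs, the single attention head with identity query/key/value projections, and the mean-pooled linear head are all assumptions made for illustration. The only details taken from the abstract are the six output classes and the order of the stages.

```python
import math
import random

NUM_CLASSES = 6  # six Gleason patterns, as stated in the abstract


def feature_map_to_tokens(fmap):
    """Flatten a C x H x W feature map (nested lists) into H*W tokens of length C."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[fmap[c][y][x] for c in range(channels)]
            for y in range(height) for x in range(width)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a real ViT learns separate projections
    and uses several heads.
    """
    dim = len(tokens[0])
    attended = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        weights = softmax(scores)
        attended.append([sum(w * tok[i] for w, tok in zip(weights, tokens))
                         for i in range(dim)])
    return attended


def classify(tokens, head_weights, head_bias):
    """Mean-pool the tokens, apply a linear head, return class probabilities."""
    dim = len(tokens[0])
    pooled = [sum(tok[i] for tok in tokens) / len(tokens) for i in range(dim)]
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(head_weights, head_bias)]
    return softmax(logits)


rng = random.Random(0)
# Toy sizes. A standard ResNet50 yields a 2048 x 7 x 7 map for a 224 x 224 input;
# 8 channels are used here only to keep the example fast.
C, H, W = 8, 7, 7
fmap = [[[rng.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]

tokens = feature_map_to_tokens(fmap)   # 49 tokens, one per spatial location
attended = self_attention(tokens)      # each token now mixes in global context
head_w = [[rng.uniform(-0.1, 0.1) for _ in range(C)] for _ in range(NUM_CLASSES)]
head_b = [0.0] * NUM_CLASSES
probs = classify(attended, head_w, head_b)
predicted = max(range(NUM_CLASSES), key=lambda i: probs[i])
```

In this sketch every spatial location of the feature map becomes one token, so the attention step lets a glandular region at one position influence the representation of any other position. That is the "global attention" half of the hybrid design, while the convolutional feature map supplies the local texture information.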