DOI Prefix: 10.47001/IRJIET
Esophageal cancer remains one of the most lethal malignancies worldwide, and early
detection is essential for improving survival outcomes. Traditional diagnostic
methods such as endoscopy and histopathology are time-consuming,
resource-intensive, and subject to human variability. This study presents a
deep learning-based end-to-end diagnostic system for esophageal cancer
detection using image classification. The proposed model integrates a hybrid
architecture combining Swin Transformer and ResNet-50, capturing both global
contextual information and fine-grained local features to enhance
classification accuracy. Due to the absence of pixel-level annotated
segmentation masks, a Grad-CAM-based visualization technique is employed to
localize cancer-affected regions, providing interpretability and visual support
for clinical decisions. A confidence-based grading module estimates cancer
severity (Low, Medium, or High) from the model's prediction probabilities,
compensating for the lack of explicitly labeled grading data. The model is
trained and optimized under low-memory constraints, ensuring
efficient deployment in real-world environments, including low-resource
clinical settings. It is saved in a portable PyTorch .pth format, enabling
consistent inference across platforms. Additionally, a web interface built with
Flask allows users to upload endoscopic images and receive real-time
predictions, visual heatmaps, and grading feedback. Experimental results on a
dataset of cancerous and non-cancerous esophageal images demonstrate high
classification accuracy and reliable visual explanations, validating the
system's effectiveness. This work highlights the potential of artificial
intelligence in advancing diagnostic tools for esophageal cancer and offers a
practical solution for resource-limited healthcare settings.
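
The confidence-based grading step described above can be illustrated with a minimal sketch. The abstract does not state the cut-off values or the class ordering, so the thresholds (0.60 and 0.85), the `cancer_index` default, and all function names below are assumptions chosen for illustration only, not the authors' implementation. The sketch converts raw class scores to probabilities with a softmax and maps the cancer-class probability to a Low, Medium, or High grade.

```python
import math
from typing import Dict, List, Optional, Sequence, Union

# Hypothetical cut-offs: the abstract does not report the actual values.
MEDIUM_THRESHOLD = 0.60
HIGH_THRESHOLD = 0.85


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grade_from_confidence(cancer_probability: float) -> str:
    """Map a cancer-class probability to a severity grade (assumed thresholds)."""
    if not 0.0 <= cancer_probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if cancer_probability >= HIGH_THRESHOLD:
        return "High"
    if cancer_probability >= MEDIUM_THRESHOLD:
        return "Medium"
    return "Low"


def classify_and_grade(
    logits: Sequence[float], cancer_index: int = 1
) -> Dict[str, Union[str, float, Optional[str]]]:
    """Classify one image from its logits and grade it only if cancer is predicted.

    cancer_index is an assumption: index 1 is taken to be the cancerous class.
    """
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    cancer_probability = probs[cancer_index]
    if predicted != cancer_index:
        # No grade is assigned to images predicted as non-cancerous.
        return {"label": "non-cancerous", "probability": cancer_probability, "grade": None}
    return {
        "label": "cancerous",
        "probability": cancer_probability,
        "grade": grade_from_confidence(cancer_probability),
    }
```

In a deployed system, the logits would come from the classifier's final layer for an uploaded endoscopic image, for example `classify_and_grade([0.0, 3.0])`. Because such a grade reflects model confidence rather than labeled severity, the thresholds would need to be tuned and validated against clinical judgment.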
Country: India
IRJIET, Volume 9, Issue 5, May 2025 pp. 35-41