Transformer Based Architecture for Out of Distribution in Polyp Segmentation

Laxmi Jha; Prakash Chandra Prasad

doi:https://doi.org/10.47001/IRJIET/2025.905022

Laxmi Jha, Software Engineer, Nepal Water Supply Corporation, Tripureshwor, Nepal

Prakash Chandra Prasad, Assistant Professor, Department of Computer & Electronics Engineering, Pulchowk Campus, Nepal

Vol 9 No 5 (2025): Volume 9, Issue 5, May 2025 | Pages: 175-180

International Research Journal of Innovations in Engineering and Technology

OPEN ACCESS | Research Article | Published Date: 20-05-2025

Abstract

Accurate, real-time polyp segmentation is critical for early colorectal cancer detection in computer-aided diagnosis systems. We propose a deep learning segmentation model that combines transformer-based global feature extraction with multiscale contextual refinement. The encoder is the Pyramid Vision Transformer V2 (PVTv2-B1), which extracts hierarchical feature maps at four scales with 64, 128, 320, and 512 channels. These multi-resolution features capture the global context needed to segment polyps of varying sizes and shapes. At the core of the model is a dilated bottleneck block that enlarges the receptive field without reducing spatial resolution. It comprises four parallel dilated convolutional branches with dilation rates of 1, 3, 5, and 7, followed by a channel fusion block that uses a 1×1 convolution to aggregate contextual information, enabling the network to learn robust multiscale features. The decoder consists of three hierarchical blocks, each composed of a transposed convolution layer for upsampling, concatenation with the corresponding encoder skip connection, and a double convolutional refinement block. These stages progressively restore spatial resolution and refine boundary details. The final segmentation mask is produced by bilinear upsampling followed by a 1×1 convolution. Evaluated on standard polyp segmentation datasets, the model achieves an IoU of 0.8395, a Dice score of 0.9029, a Recall of 0.9217, a Precision of 0.9072, and a low Hausdorff Distance of 2.8736, indicating precise boundary prediction. The model also runs at 47 FPS, making it suitable for real-time clinical applications. This combination of transformer-based encoding, dilated context aggregation, and U-Net-inspired decoding yields an accurate and efficient architecture for medical image segmentation.
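
The geometry of the dilated bottleneck described above can be checked with a little arithmetic. The sketch below is illustrative only and is not the authors' implementation. It assumes 3×3 kernels with stride 1 in every branch, that each branch keeps 512 channels, and a hypothetical 352×352 input whose final encoder stage is downsampled by 32 to an 11×11 map; none of these values are stated in the abstract. It shows the padding each dilation rate needs to preserve spatial size and the channel count entering the 1×1 fusion convolution.

```python
# Illustrative arithmetic for a four-branch dilated bottleneck.
# Values marked "assumption" are NOT given in the abstract.
KERNEL_SIZE = 3                 # assumption: 3x3 kernels in every branch
DILATION_RATES = (1, 3, 5, 7)   # from the abstract
BRANCH_CHANNELS = 512           # assumption: each branch keeps 512 channels
FEATURE_SIZE = 352 // 32        # assumption: 352x352 input, stride-32 stage -> 11


def effective_kernel(kernel_size, dilation):
    """Spatial extent spanned by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size, dilation):
    """Padding that preserves spatial size at stride 1 (odd kernel sizes)."""
    return dilation * (kernel_size - 1) // 2


def conv_output_size(size, kernel_size, dilation, padding, stride=1):
    """Standard convolution output-size formula."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def describe_branches(feature_size=FEATURE_SIZE):
    """Return (dilation, extent, padding, output size) for each branch."""
    rows = []
    for d in DILATION_RATES:
        pad = same_padding(KERNEL_SIZE, d)
        out = conv_output_size(feature_size, KERNEL_SIZE, d, pad)
        rows.append((d, effective_kernel(KERNEL_SIZE, d), pad, out))
    return rows


branches = describe_branches()

# Concatenating the branch outputs stacks their channels; a 1x1 convolution
# would then project them back down (to 512 here, also an assumption).
fusion_in_channels = BRANCH_CHANNELS * len(DILATION_RATES)

for d, extent, pad, out in branches:
    print(f"dilation={d}: extent={extent}x{extent}, padding={pad}, output={out}x{out}")
print(f"channels entering 1x1 fusion: {fusion_in_channels}")
```

Under these assumptions the four branches span 3, 7, 11, and 15 pixels per side while all producing 11×11 maps, so their outputs can be concatenated directly before the fusion step.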

Keywords

Computer-aided diagnosis, out-of-distribution, polyp segmentation, dilated convolutions, Pyramid Vision Transformer

Citation of this Article

Laxmi Jha, & Prakash Chandra Prasad. (2025). Transformer Based Architecture for Out of Distribution in Polyp Segmentation. International Research Journal of Innovations in Engineering and Technology - IRJIET, 9(5), 175-180. Article DOI https://doi.org/10.47001/IRJIET/2025.905022

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

