ViT-DCNN: Vision Transformer with Deformable CNN Model for Lung and Colon Cancer Detection (open access)
- Authors
- Pal, Aditya; Rai, Hari Mohan; Yoo, Joon; Lee, Sang-Ryong; Park, Yooheon
- Issue Date
- Sep-2025
- Publisher
- MDPI
- Keywords
- lung and colon cancer detection; deep learning; ViT-DCNN; medical image classification; self-attention mechanism; performance evaluation
- Citation
- Cancers, v.17, no.18, pp. 1-26
- Pages
- 26
- Indexed
- SCIE; SCOPUS
- Journal Title
- Cancers
- Volume
- 17
- Number
- 18
- Start Page
- 1
- End Page
- 26
- URI
- https://scholarworks.dongguk.edu/handle/sw.dongguk/61772
- DOI
- 10.3390/cancers17183005
- ISSN
- 2072-6694
- Abstract
- Background/Objectives: Lung and colon cancers remain among the most prevalent and fatal diseases worldwide, and their early detection is a serious challenge. The data used in this study was obtained from the Lung and Colon Cancer Histopathological Images Dataset, which comprises five different classes of image data, namely colon adenocarcinoma, colon normal, lung adenocarcinoma, lung normal, and lung squamous cell carcinoma, split into training (80%), validation (10%), and test (10%) subsets. In this study, we propose the ViT-DCNN (Vision Transformer with Deformable CNN) model, with the aim of improving cancer detection and classification using medical images. Methods: The combination of the ViT's self-attention capabilities with deformable convolutions allows for improved feature extraction, while also enabling the model to learn both holistic contextual information as well as fine-grained localized spatial details. Results: On the test set, the model performed remarkably well, with an accuracy of 94.24%, an F1 score of 94.23%, recall of 94.24%, and precision of 94.37%, confirming its robustness in detecting cancerous tissues. Furthermore, our proposed ViT-DCNN model outperforms several state-of-the-art models, including ResNet-152, EfficientNet-B7, SwinTransformer, DenseNet-201, ConvNext, TransUNet, CNN-LSTM, MobileNetV3, and NASNet-A, across all major performance metrics. Conclusions: By using deep learning and advanced image analysis, this model enhances the efficiency of cancer detection, thus representing a valuable tool for radiologists and clinicians. This study demonstrates that the proposed ViT-DCNN model can reduce diagnostic inaccuracies and improve detection efficiency. Future work will focus on dataset enrichment and enhancing the model's interpretability to evaluate its clinical applicability. This paper demonstrates the promise of artificial-intelligence-driven diagnostic models in transforming lung and colon cancer detection and improving patient diagnosis.
- Files in This Item
- There are no files associated with this item.
- Appears in Collections
- College of Life Science and Biotechnology > Department of Food Science & Biotechnology > 1. Journal Articles
- College of Life Science and Biotechnology > Department of Biological and Environmental Science > 1. Journal Articles
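
The abstract reports accuracy, precision, recall, and F1 for a five-class problem but does not say how the per-class scores were averaged. The sketch below assumes support-weighted averaging, which is only a guess; under that scheme weighted recall always equals accuracy, and the abstract reports 94.24% for both. The function name `weighted_metrics`, the short class labels, and the toy predictions are hypothetical and are not data from the paper.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: per-class scores are averaged with weights proportional
    to each class's share of the true labels. The abstract does not
    state which averaging scheme the authors used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predicted samples per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(true_pos.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = true_pos[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels for the five classes named in the abstract.
y_true = ["colon_aca", "colon_aca", "colon_n", "colon_n", "lung_aca",
          "lung_aca", "lung_n", "lung_n", "lung_scc", "lung_scc"]
y_pred = ["colon_aca", "colon_n", "colon_n", "colon_n", "lung_aca",
          "lung_scc", "lung_n", "lung_n", "lung_scc", "lung_scc"]

metrics = weighted_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With weighted averaging, each class contributes its true positives divided by the total sample count to recall, so the sum is the overall accuracy by construction. Macro averaging would not guarantee that equality.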
