ViT-DCNN: Vision Transformer with Deformable CNN Model for Lung and Colon Cancer Detection

Pal, Aditya; Rai, Hari Mohan; Yoo, Joon; Lee, Sang-Ryong; Park, Yooheon

Access: Open access

Authors: Pal, Aditya; Rai, Hari Mohan; Yoo, Joon; Lee, Sang-Ryong; Park, Yooheon

Issue Date: Sep-2025

Publisher: MDPI

Keywords: lung and colon cancer detection; deep learning; ViT-DCNN; medical image classification; self-attention mechanism; performance evaluation

Citation: Cancers, v.17, no.18, pp. 1-26

Pages: 26

Indexed: SCIE; SCOPUS

Journal Title: Cancers

Volume: 17

Number: 18

Start Page: 1

End Page: 26

URI: https://scholarworks.dongguk.edu/handle/sw.dongguk/61772

DOI: 10.3390/cancers17183005

ISSN: 2072-6694

Abstract:

Background/Objectives: Lung and colon cancers remain among the most prevalent and fatal diseases worldwide, and their early detection is a serious challenge. The data used in this study was obtained from the Lung and Colon Cancer Histopathological Images Dataset, which comprises five classes of image data (colon adenocarcinoma, colon normal, lung adenocarcinoma, lung normal, and lung squamous cell carcinoma), split into training (80%), validation (10%), and test (10%) subsets. In this study, we propose the ViT-DCNN (Vision Transformer with Deformable CNN) model, with the aim of improving cancer detection and classification using medical images.

Methods: Combining the ViT's self-attention capabilities with deformable convolutions improves feature extraction and enables the model to learn both holistic contextual information and fine-grained localized spatial details.

Results: On the test set, the model achieved an accuracy of 94.24%, an F1 score of 94.23%, a recall of 94.24%, and a precision of 94.37%, confirming its robustness in detecting cancerous tissues. Furthermore, the proposed ViT-DCNN model outperforms several state-of-the-art models, including ResNet-152, EfficientNet-B7, SwinTransformer, DenseNet-201, ConvNext, TransUNet, CNN-LSTM, MobileNetV3, and NASNet-A, across all major performance metrics.

Conclusions: By using deep learning and advanced image analysis, this model enhances the efficiency of cancer detection, thus representing a valuable tool for radiologists and clinicians. This study demonstrates that the proposed ViT-DCNN model can reduce diagnostic inaccuracies and improve detection efficiency. Future work will focus on dataset enrichment and enhancing the model's interpretability to evaluate its clinical applicability. This paper demonstrates the promise of artificial-intelligence-driven diagnostic models in transforming lung and colon cancer detection and improving patient diagnosis.
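
The abstract reports accuracy, precision, recall, and F1 but does not say how the per-class scores were averaged across the five classes. The reported recall (94.24%) equals the reported accuracy (94.24%), which is what support-weighted averaging always produces, and what macro averaging produces when every class has the same number of test images. The following is a minimal sketch, not the authors' evaluation code, of support-weighted metrics in plain Python. The function name `weighted_metrics`, the class label strings, and the ten toy predictions are all hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Illustrative helper only: the averaging scheme is an assumption,
    since the abstract does not state which one the paper uses.
    """
    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    pairs = list(zip(y_true, y_pred))

    accuracy = sum(t == p for t, p in pairs) / n
    precision = recall = f1 = 0.0
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / (tp + fn) if tp + fn else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        w = support[c] / n  # weight each class by its share of true samples
        precision += w * p_c
        recall += w * r_c
        f1 += w * f_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data (hypothetical): two samples per class, one lung adenocarcinoma
# sample misclassified as lung squamous cell carcinoma.
y_true = ["colon_aca", "colon_aca", "colon_n", "colon_n", "lung_aca",
          "lung_aca", "lung_n", "lung_n", "lung_scc", "lung_scc"]
y_pred = ["colon_aca", "colon_aca", "colon_n", "colon_n", "lung_aca",
          "lung_scc", "lung_n", "lung_n", "lung_scc", "lung_scc"]

m = weighted_metrics(y_true, y_pred)
print(m)
```

In this toy case the weighted recall matches the accuracy exactly, while precision and F1 differ from it, mirroring the pattern in the reported figures. This is consistent with, but does not confirm, weighted or balanced-macro averaging in the paper.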

Appears in Collections: College of Life Science and Biotechnology > Department of Food Science & Biotechnology > 1. Journal Articles; College of Life Science and Biotechnology > Department of Biological and Environmental Science > 1. Journal Articles

Related Researcher

Lee, Sang-Ryong: College of Life Science and Biotechnology (Department of Convergent Environmental Science)
