Multi-scale feature retention and aggregation for colorectal cancer diagnosis using gastrointestinal images
- Authors
- Haider, Adnan; Arsalan, Muhammad; Nam, Se Hyun; Hong, Jin Seong; Sultan, Haseeb; Park, Kang Ryoung
- Issue Date
- Oct-2023
- Publisher
- Elsevier Ltd
- Keywords
- Polyp segmentation; Instrument segmentation; Colorectal cancer diagnosis; Computer-assisted diagnosis; CCS-Net and MFRA-Net
- Citation
- Engineering Applications of Artificial Intelligence, v.125, pp 1 - 19
- Pages
- 19
- Indexed
- SCIE; SCOPUS
- Journal Title
- Engineering Applications of Artificial Intelligence
- Volume
- 125
- Start Page
- 1
- End Page
- 19
- URI
- https://scholarworks.dongguk.edu/handle/sw.dongguk/21106
- DOI
- 10.1016/j.engappai.2023.106749
- ISSN
- 0952-1976; 1873-6769
- Abstract
- Colonoscopy is considered the gold standard for colorectal cancer diagnosis and prognosis. However, existing methods are less accurate and prone to overlooking lesions during gastrointestinal endoscopic examinations. Computer-assisted diagnosis combined with robot-assisted minimally invasive surgery (RMIS) can significantly help medical practitioners detect and treat lesions. Therefore, two novel architectures are developed for polyp and surgical instrument segmentation to aid colorectal cancer diagnosis, assessment, and treatment. The colorectal cancer segmentation network (CCS-Net) is the base network used in this study. It places most of its convolutional layers near the input image for effective feature extraction from low-level information. In addition, CCS-Net uses an efficient feature upsampling unit to increase the spatial size of the input feature maps. Hence, CCS-Net provides fair performance with satisfactory computational efficiency. The multi-scale feature retention and aggregation network (MFRA-Net) is the final network in this study. MFRA-Net is developed to further improve the segmentation accuracy of CCS-Net: it uses multi-scale feature retention to retain low-level spatial features and transfer them to the deep stages of the network. MFRA-Net also combines multi-scale, high-strided low-level information with high-level information to boost segmentation performance. Finally, all the multi-scale features transferred from the early stages of the network are aggregated with high-level features in its deep levels. This multi-scale feature retention and aggregation mechanism enables the network to maintain better segmentation performance than other methods, even in challenging cases of blur, specular reflection, low contrast, and high variation. We evaluated both architectures on four challenging datasets: Kvasir-SEG, CVC-ClinicDB, Kvasir-Instrument, and UW-Sinus-Surgery-Live.
The proposed method achieves Dice similarity coefficients of 95.98%, 94.19%, 92.81%, and 88.57% on the CVC-ClinicDB, Kvasir-SEG, Kvasir-Instrument, and UW-Sinus-Surgery-Live datasets, respectively. It achieves superior segmentation performance compared with state-of-the-art methods while requiring only 4.9 million trainable parameters. Therefore, the proposed networks can effectively assist health professionals in surgical procedures and colorectal cancer diagnosis through surgical instrument and polyp segmentation, respectively.
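The retention-and-aggregation idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the strided subsampling, tensor shapes, and function names below are assumptions chosen only to show how high-strided low-level features might be spatially matched to a deep stage and concatenated with high-level features.

```python
import numpy as np

def strided_downsample(feat, stride):
    # Keep every `stride`-th spatial location; a cheap stand-in for the
    # high-strided transfer of low-level features (illustrative only).
    return feat[:, ::stride, ::stride]

def retain_and_aggregate(low, high):
    # Match the low-level map's spatial size to the deep stage, then
    # concatenate along the channel axis so deep layers see both
    # fine spatial detail and high-level semantics.
    stride = low.shape[1] // high.shape[1]
    return np.concatenate([strided_downsample(low, stride), high], axis=0)

# Toy feature maps in (channels, height, width) layout.
low_level = np.random.rand(16, 64, 64)   # early-stage features
high_level = np.random.rand(64, 16, 16)  # deep-stage features

fused = retain_and_aggregate(low_level, high_level)
print(fused.shape)  # (80, 16, 16)
```

In a real network the concatenated tensor would feed further convolutions; the sketch only demonstrates the shape bookkeeping behind the aggregation step.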
- Files in This Item
- There are no files associated with this item.
- Appears in
Collections - College of Engineering > Department of Electronics and Electrical Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.