Covariance attention and correlation-based knowledge distillation for semantic segmentation of lens flare-degraded road scenes
  • Song, Hyun Woo
  • Kang, Seon Jong
  • Lee, Yong Ho
  • Jeong, Min Su
  • Jeong, Seong In
  • ... Park, Kang Ryoung
  • and 1 more author
Citations (SCOPUS): 0

Abstract

With the continued advancement of autonomous driving technology, semantic segmentation methods for precisely recognizing forward-facing objects are now essential. However, lens flare, which frequently arises in road environments because of strong light sources such as the sun, street lamps, and vehicle headlights, diminishes image quality and significantly reduces semantic segmentation performance. To address this issue, prior research on frontal-viewing camera images integrated lens flare detection and restoration within the network, but this approach introduces substantial inference time and computational overhead. To solve these problems, this study presents a novel covariance attention and correlation information fusion-based knowledge distillation (CACKD) framework for semantic segmentation of lens flare-degraded road-scene images captured by a frontal-viewing camera. The teacher model in CACKD aggregates multi-layer gradient-weighted class activation mapping (Grad-CAM) information to precisely localize both global and fine-grained regions affected by lens flare, and then concentrates processing on these targeted flare regions to restore the image. During semantic segmentation, the teacher employs channel-wise covariance attention based on a covariance projection, and the student model reproduces this attention at corresponding locations; by re-aligning inter-channel correlations, meaningful co-activations are reinforced and unnecessary dependencies are attenuated, substantially improving prediction consistency. In parallel, inter-channel correlation information and covariance-based attention features from multiple teacher stages are transferred to a lightweight, single-stage student model through a hybrid knowledge distillation (KD) strategy, so that the student achieves teacher-level accuracy at a dramatically lower computational cost and inference time.
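As a rough illustration of what transferring inter-channel correlation information can look like, the sketch below computes a Pearson correlation matrix across feature channels and penalizes the squared difference between the teacher's and the student's matrices. This is a generic correlation-matching loss, not the CACKD formulation, which the abstract does not specify. The function names `channel_correlation` and `correlation_distillation_loss` are hypothetical, and the sketch assumes teacher and student features have already been brought to the same number of channels.

```python
import math


def channel_correlation(features):
    """Pearson correlation between channels.

    `features` is a list of C channels, each a flat list of N activations
    (an assumed layout for illustration). Returns a C x C nested list.
    """
    eps = 1e-8  # guards against division by zero for constant channels
    centered = []
    for channel in features:
        mean = sum(channel) / len(channel)
        centered.append([v - mean for v in channel])
    norms = [math.sqrt(sum(v * v for v in ch)) + eps for ch in centered]
    count = len(centered)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j]))
            / (norms[i] * norms[j])
            for j in range(count)
        ]
        for i in range(count)
    ]


def correlation_distillation_loss(teacher_feats, student_feats):
    """Mean squared difference between the two channel-correlation matrices.

    Assumes both inputs have the same number of channels.
    """
    teacher = channel_correlation(teacher_feats)
    student = channel_correlation(student_feats)
    count = len(teacher)
    total = sum(
        (teacher[i][j] - student[i][j]) ** 2
        for i in range(count)
        for j in range(count)
    )
    return total / (count * count)


# Toy features: 3 channels, 4 spatial positions each.
teacher_feats = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
student_feats = [[1, 2, 3, 4], [4, 3, 2, 1], [4, 3, 2, 1]]
loss = correlation_distillation_loss(teacher_feats, student_feats)
```

The loss is zero when the student's channels co-activate exactly as the teacher's do, and it grows as their correlation structure diverges, so minimizing it pushes the student toward the teacher's inter-channel dependencies.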
Experiments conducted on the synthesized flare (Syn-flare) Cambridge-driving labeled video (CamVid) and Syn-flare Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago (KITTI) datasets demonstrate that the CACKD student model achieves mean intersection over union (mIoU) scores of 74.48% and 66.70%, respectively, while requiring only 32.52 million parameters and outperforming state-of-the-art methods despite its smaller model size. Moreover, the proposed model operates effectively on embedded systems with limited computational resources, underscoring its suitability for real-world autonomous driving environments. © 2026 The Authors.
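For readers unfamiliar with the mIoU metric reported above, here is a minimal sketch of how it is commonly computed for one flattened label map. The function name `mean_iou` and the toy labels are illustrative assumptions. Benchmark evaluations typically accumulate counts over the whole test set and skip void labels, which this sketch does not do.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes.

    `pred` and `target` are flat sequences of integer class labels of equal
    length. Classes absent from both are skipped instead of counted as 0.
    """
    ious = []
    for cls in range(num_classes):
        intersection = sum(
            1 for p, t in zip(pred, target) if p == cls and t == cls
        )
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 4 pixels, 2 classes.
# Class 0: intersection 1, union 2. Class 1: intersection 2, union 3.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```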

Keywords

Camera lens flare; Covariance attention and correlation information fusion; Information fusion of multi-layer gradient-weighted class activation mapping; Knowledge distillation; Semantic segmentation
Title
Covariance attention and correlation-based knowledge distillation for semantic segmentation of lens flare-degraded road scenes
Authors
Song, Hyun Woo; Kang, Seon Jong; Lee, Yong Ho; Jeong, Min Su; Jeong, Seong In; Lee, Ho Won; Park, Kang Ryoung
DOI
10.1016/j.engappai.2026.115139
Publication date
2026-09
Type
Article
Journal
Engineering Applications of Artificial Intelligence
179
Pages
1 ~ 31