VL-OrdinalFormer: Vision-Language-Guided Ordinal Transformers for Interpretable Knee Osteoarthritis Grading

Abstract

Knee osteoarthritis (KOA) severity assessment using the Kellgren-Lawrence (KL) grading system is essential for clinical decision-making, yet reliable discrimination between adjacent early stages, particularly KL1 and KL2, remains challenging due to subtle radiographic differences and inter-observer variability. This study investigates whether integrating ordinal regression with vision-language semantic alignment can improve fine-grained automated KOA grading. We propose VL-OrdinalFormer, a transformer-based framework that models KL severity as an ordered process and aligns visual features with clinically grounded textual descriptions. The model is evaluated using stratified five-fold cross-validation on the publicly available OAI kneeKL224 dataset (1656 test radiographs). The proposed approach achieves 70.29% accuracy, 70.19% macro F1-score, and 81.61% macro AUROC, outperforming both CNN and standard ViT baselines. Notably, class-wise analysis shows consistent improvements for clinically ambiguous intermediate grades, with gains of +6.6% for KL1 and +19.4% for KL2 compared to the VGG19 baseline. Robustness experiments further demonstrate stable performance under simulated acquisition and projection variability. These results indicate that combining ordinal modeling with vision-language alignment enhances discrimination of subtle disease stages while maintaining interpretability, supporting the potential of the proposed framework for reliable and clinically meaningful KOA grading.
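The abstract describes modeling KL severity as an ordered process but does not say which ordinal formulation is used. As a minimal sketch, assuming the common cumulative ("extended binary") encoding rather than the authors' actual method, a KL grade can be represented as four binary targets of the form "grade > k" and decoded by counting how many thresholds are exceeded. The names `encode_ordinal` and `decode_ordinal` are hypothetical and chosen only for illustration.

```python
# Illustrative only: a common cumulative ("extended binary") ordinal encoding.
# The abstract does not state which ordinal formulation VL-OrdinalFormer uses.

NUM_GRADES = 5  # KL0 through KL4


def encode_ordinal(grade: int, num_grades: int = NUM_GRADES) -> list[int]:
    """Encode a grade as num_grades - 1 binary targets; target k is 1 if grade > k."""
    if not 0 <= grade < num_grades:
        raise ValueError(f"grade must be in [0, {num_grades - 1}]")
    return [1 if grade > k else 0 for k in range(num_grades - 1)]


def decode_ordinal(probs: list[float], threshold: float = 0.5) -> int:
    """Decode estimates of P(grade > k) by counting those above the threshold."""
    return sum(1 for p in probs if p > threshold)


if __name__ == "__main__":
    for kl in range(NUM_GRADES):
        print(f"KL{kl} -> {encode_ordinal(kl)}")
```

With this encoding, confusing KL1 with KL2 flips a single binary target, whereas confusing KL1 with KL4 flips three, so the training signal reflects how far apart two grades are.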

Keywords

knee osteoarthritis; Kellgren-Lawrence grading; vision transformer; ordinal regression; vision-language modeling; classification
Title
VL-OrdinalFormer: Vision-Language-Guided Ordinal Transformers for Interpretable Knee Osteoarthritis Grading
Authors
Ullah, Zahid; Kim, Jihie
DOI
10.3390/math14060963
Publication date
2026-03
Type
Article
Journal
Mathematics, vol. 14, no. 6
Pages
1-26