Detailed Information


EdgeV-SE: Self-Reflective Fine-Tuning Framework for Edge-Deployable Vision-Language Models
Open access

Authors
Jeon, Yoonmo; Lee, Seunghun; Kim, Woongsup
Issue Date
Jan-2026
Publisher
MDPI
Keywords
Vision-Language Model (VLM); edge computing; self-reflective learning; consistency regularization; mutual learning; satellite IoT; NVIDIA Jetson; disaster analysis
Citation
Applied Sciences, v.16, no.2, pp 1 - 31
Pages
31
Indexed
SCIE
SCOPUS
Journal Title
Applied Sciences
Volume
16
Number
2
Start Page
1
End Page
31
URI
https://scholarworks.dongguk.edu/handle/sw.dongguk/63570
DOI
10.3390/app16020818
ISSN
2076-3417
Abstract
Featured Application: The proposed framework enables the deployment of robust Vision-Language Models on resource-constrained, off-the-shelf edge devices such as the NVIDIA Jetson series. Its primary application is real-time disaster damage assessment using satellite imagery in communication-denied environments, facilitating immediate decision-making for first responders.

Abstract: The deployment of Vision-Language Models (VLMs) in Satellite IoT scenarios is critical for real-time disaster assessment but is often hindered by the substantial memory and compute requirements of state-of-the-art models. While parameter-efficient fine-tuning (PEFT) enables adaptation with minimal computational overhead, standard supervised methods often fail to ensure robustness and reliability on resource-constrained edge devices. To address this, we propose EdgeV-SE, a self-reflective fine-tuning framework that significantly enhances VLM performance without introducing any inference-time overhead. Our framework incorporates an uncertainty-aware self-reflection mechanism with asymmetric dual pathways: a generative linguistic pathway and an auxiliary discriminative visual pathway. By estimating uncertainty from the linguistic pathway using a log-likelihood margin between class verbalizers, EdgeV-SE identifies ambiguous samples and refines its decision boundaries via consistency regularization and cross-pathway mutual learning. Experimental results on hurricane damage assessment demonstrate that our approach improves image classification accuracy, enhances image-text semantic alignment, and achieves superior caption quality. Notably, our work achieves these gains while maintaining practical deployment on a commercial off-the-shelf edge device such as the NVIDIA Jetson Orin Nano, preserving inference latency and memory footprint. Overall, our work contributes a unified self-reflective fine-tuning framework that improves the robustness, calibration, and deployability of VLMs on edge devices.
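The abstract's uncertainty estimate, a log-likelihood margin between class verbalizers, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the class labels ("damage", "no damage"), the example log-likelihood values, and the 0.5 threshold are all assumptions chosen for illustration, and the paper may define the margin differently.

```python
from typing import Dict, List, Tuple


def log_likelihood_margin(verbalizer_logps: Dict[str, float]) -> Tuple[str, float]:
    """Return the top-scoring class and its log-likelihood margin over the runner-up.

    `verbalizer_logps` maps each class verbalizer (hypothetical labels here)
    to the log-likelihood the language pathway assigns to it.
    """
    if len(verbalizer_logps) < 2:
        raise ValueError("need at least two class verbalizers")
    ranked = sorted(verbalizer_logps.items(), key=lambda kv: kv[1], reverse=True)
    (best_label, best_lp), (_, second_lp) = ranked[0], ranked[1]
    return best_label, best_lp - second_lp


def flag_ambiguous(batch: List[Dict[str, float]], threshold: float = 0.5) -> List[bool]:
    """Mark samples whose margin falls below an assumed threshold as ambiguous."""
    return [log_likelihood_margin(sample)[1] < threshold for sample in batch]


# Illustrative values only; not taken from the paper.
batch = [
    {"damage": -0.2, "no damage": -3.1},  # large margin: confident
    {"damage": -1.1, "no damage": -1.3},  # small margin: ambiguous
]
flags = flag_ambiguous(batch)
```

In a setup like the one the abstract describes, the flagged samples would be the ones receiving the extra consistency-regularization and mutual-learning signal during fine-tuning; since the margin is only computed at training time, it adds no inference-time cost.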
Appears in
Collections
College of Engineering > Department of Information and Communication Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
