Cited 1 time in Web of Science; cited 2 times in Scopus

Multimodal Food Image Classification with Large Language Models

Full metadata record
DC Field | Value | Language
dc.contributor.author | Kim, Jun-Hwa | -
dc.contributor.author | Kim, Nam-Ho | -
dc.contributor.author | Jo, Donghyeok | -
dc.contributor.author | Won, Chee Sun | -
dc.date.accessioned | 2024-12-10T00:00:14Z | -
dc.date.available | 2024-12-10T00:00:14Z | -
dc.date.issued | 2024-11 | -
dc.identifier.issn | 2079-9292 | -
dc.identifier.uri | https://scholarworks.dongguk.edu/handle/sw.dongguk/56353 | -
dc.description.abstract | In this study, we leverage advancements in large language models (LLMs) for fine-grained food image classification. We achieve this by integrating textual features extracted from images using an LLM into a multimodal learning framework. Specifically, semantic textual descriptions generated by the LLM are encoded and combined with image features obtained from a transformer-based architecture to improve food image classification. Our approach employs a cross-attention mechanism to effectively fuse visual and textual modalities, enhancing the model's ability to extract discriminative features beyond what can be achieved with visual features alone. | -
dc.format.extent | 10 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.publisher | MDPI | -
dc.title | Multimodal Food Image Classification with Large Language Models | -
dc.type | Article | -
dc.publisher.location | Switzerland | -
dc.identifier.doi | 10.3390/electronics13224552 | -
dc.identifier.scopusid | 2-s2.0-85210254120 | -
dc.identifier.wosid | 001364377100001 | -
dc.identifier.bibliographicCitation | Electronics, v.13, no.22, pp. 1-10 | -
dc.citation.title | Electronics | -
dc.citation.volume | 13 | -
dc.citation.number | 22 | -
dc.citation.startPage | 1 | -
dc.citation.endPage | 10 | -
dc.type.docType | Article | -
dc.description.isOpenAccess | Y | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalResearchArea | Physics | -
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.relation.journalWebOfScienceCategory | Physics, Applied | -
dc.subject.keywordAuthor | food image classification | -
dc.subject.keywordAuthor | fine-grained visual classification | -
dc.subject.keywordAuthor | multimodal image feature | -
dc.subject.keywordAuthor | large language model | -
dc.subject.keywordAuthor | deep learning | -
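
The abstract in this record describes fusing image features with encoded LLM-generated text through cross-attention. As a rough illustration only, the sketch below shows one way such a fusion step could look in plain Python. It is not the authors' implementation: the record does not say which modality supplies the queries, so treating image tokens as queries and text tokens as keys and values is an assumption, as are the residual addition, the mean pooling, the absence of learned projection weights, and the helper names `softmax`, `cross_attention`, and `fuse`.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where queries come from one modality
    and keys/values come from the other. No learned projections are used
    (an assumption made to keep the sketch small)."""
    dim = len(keys[0])
    out_dim = len(values[0])
    attended = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        attended.append([sum(w * v[j] for w, v in zip(weights, values))
                         for j in range(out_dim)])
    return attended


def fuse(image_tokens, text_tokens):
    """Hypothetical fusion: image tokens attend to text tokens, the result
    is added back to the image tokens, and tokens are mean-pooled into one
    feature vector that a classifier head could consume."""
    attended = cross_attention(image_tokens, text_tokens, text_tokens)
    fused = [[a + b for a, b in zip(img, att)]
             for img, att in zip(image_tokens, attended)]
    count = len(fused)
    return [sum(tok[j] for tok in fused) / count
            for j in range(len(fused[0]))]


# Toy inputs: two 2-dimensional image tokens and two text tokens.
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
text_tokens = [[1.0, 1.0], [1.0, 1.0]]
pooled = fuse(image_tokens, text_tokens)
```

In a real model the token vectors would come from an image transformer and a text encoder, and the attention would normally include learned query, key, and value projections.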
Files in This Item
There are no files associated with this item.
Appears in Collections
College of Engineering > Department of Electronics and Electrical Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
