Detailed Information

Cited 12 times in Web of Science; cited 13 times in Scopus

Global–local feature learning for fine-grained food classification based on Swin Transformer

Full metadata record
DC Field | Value | Language
dc.contributor.author | Kim, Jun-Hwa | -
dc.contributor.author | Kim, Namho | -
dc.contributor.author | Won, Chee Sun | -
dc.date.accessioned | 2024-08-08T11:31:07Z | -
dc.date.available | 2024-08-08T11:31:07Z | -
dc.date.issued | 2024-07 | -
dc.identifier.issn | 0952-1976 | -
dc.identifier.issn | 1873-6769 | -
dc.identifier.uri | https://scholarworks.dongguk.edu/handle/sw.dongguk/21694 | -
dc.description.abstract | Separable object parts, such as the head and tail in a bird, are vital for fine-grained visual classifications. For those objects without separable parts, the classification task relies only on local and global textural image features. Although the Swin Transformer architecture was proposed to efficiently capture both local and global visual features, it still exhibits a bias towards global features. Therefore, our goal is to enhance the local feature learning capability of the Swin Transformer by adding four new modules of the Local Feature Extraction Network (L-FEN), Convolution Patch-Merging (CP), Multi-Path (MP), and Multi-View (MV). The L-FEN enhances the Swin Transformer with the improved local feature capture. The CP is a localized and hierarchical adaptation of the Swin's Patch Merging technique. The MP method integrates features across various Swin stages to accentuate local details. Meanwhile, the MV Swin Transformer block supersedes traditional Swin blocks with those incorporating varied receptive fields, ensuring a broader scope of local feature capture. Our enhanced architecture, named Global–Local Swin Transformer (GL-Swin), is applied to solve a fine-grained food classification task. On three major food datasets: ISIA Food-500, UEC Food-256, and Food-101, our GL-Swin achieved accuracies of 66.75%, 85.78%, and 92.93%, respectively, consistently outperforming other leading methods. © 2024 Elsevier Ltd | -
dc.format.extent | 7 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.publisher | Elsevier Ltd | -
dc.title | Global–local feature learning for fine-grained food classification based on Swin Transformer | -
dc.type | Article | -
dc.publisher.location | Netherlands | -
dc.identifier.doi | 10.1016/j.engappai.2024.108248 | -
dc.identifier.scopusid | 2-s2.0-85187783530 | -
dc.identifier.wosid | 001206555100001 | -
dc.identifier.bibliographicCitation | Engineering Applications of Artificial Intelligence, v.133, pp 1 - 7 | -
dc.citation.title | Engineering Applications of Artificial Intelligence | -
dc.citation.volume | 133 | -
dc.citation.startPage | 1 | -
dc.citation.endPage | 7 | -
dc.type.docType | Article | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Automation & Control Systems | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalWebOfScienceCategory | Automation & Control Systems | -
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | -
dc.relation.journalWebOfScienceCategory | Engineering, Multidisciplinary | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.subject.keywordAuthor | CNN | -
dc.subject.keywordAuthor | Deep learning | -
dc.subject.keywordAuthor | Fine-grained visual classification | -
dc.subject.keywordAuthor | Food dataset | -
dc.subject.keywordAuthor | Vision transformer | -
Files in This Item
There are no files associated with this item.
Appears in Collections
College of Engineering > Department of Electronics and Electrical Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
