Detailed Information

Cited 8 times in Web of Science · Cited 12 times in Scopus

Cross Encoder-Decoder Transformer with Global-Local Visual Extractor for Medical Image Captioning

Full metadata record
DC Field: Value
dc.contributor.author: Lee, Hojun
dc.contributor.author: Cho, Hyunjun
dc.contributor.author: Park, Jieun
dc.contributor.author: Chae, Jinyeong
dc.contributor.author: Kim, Jihie
dc.date.accessioned: 2023-04-27T13:40:40Z
dc.date.available: 2023-04-27T13:40:40Z
dc.date.issued: 2022-02
dc.identifier.issn: 1424-8220
dc.identifier.uri: https://scholarworks.dongguk.edu/handle/sw.dongguk/3672
dc.description.abstract: Transformer-based approaches have shown good results in image captioning tasks. However, current approaches are limited in generating text from the global features of an entire image. We therefore propose two methods for generating better image captions: (1) the Global-Local Visual Extractor (GLVE), which captures both global and local features, and (2) the Cross Encoder-Decoder Transformer (CEDT), which injects multiple levels of encoder features into the decoding process. GLVE extracts not only global visual features obtainable from the entire image, such as organ size or bone structure, but also local visual features from a local region, such as a lesion area. Given an image, CEDT can create a detailed description of the overall features by injecting both low-level and high-level encoder outputs into the decoder. Each method contributes to the performance improvement and generates descriptions of features such as organ size and bone structure. The proposed model was evaluated on the IU X-ray dataset and outperformed the transformer-based baseline by 5.6% in BLEU score, 0.56% in METEOR, and 1.98% in ROUGE-L.
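The core idea in the abstract — a decoder that cross-attends to both low-level and high-level encoder outputs rather than only the final encoder layer — can be illustrated with a minimal sketch. This is an illustrative assumption of the mechanism, not the authors' implementation; all shapes, names, and the single-head attention are simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16          # feature dimension (illustrative)
n_patches = 49  # visual tokens from the visual extractor (assumed)
n_words = 5     # decoder (caption) positions

def attention(q, k, v):
    """Single-head scaled dot-product attention."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ v

# Stand-ins for a shallow and a deep encoder layer's outputs.
low_level = rng.standard_normal((n_patches, d))
high_level = rng.standard_normal((n_patches, d))

# Cross Encoder-Decoder idea: the decoder's cross-attention memory
# combines multiple encoder levels instead of only the last one.
memory = np.concatenate([low_level, high_level], axis=0)  # (2*n_patches, d)

queries = rng.standard_normal((n_words, d))  # decoder hidden states
context = attention(queries, memory, memory)
print(context.shape)  # (5, 16)
```

Each caption position thus attends over both shallow features (local detail such as a lesion area) and deep features (global structure such as organ size), matching the multi-level injection the abstract describes.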
dc.format.extent: 13
dc.language: English
dc.language.iso: ENG
dc.publisher: MDPI
dc.title: Cross Encoder-Decoder Transformer with Global-Local Visual Extractor for Medical Image Captioning
dc.type: Article
dc.publisher.location: Switzerland
dc.identifier.doi: 10.3390/s22041429
dc.identifier.scopusid: 2-s2.0-85124348434
dc.identifier.wosid: 000765175400001
dc.identifier.bibliographicCitation: Sensors, v.22, no.4, pp. 1-13
dc.citation.title: Sensors
dc.citation.volume: 22
dc.citation.number: 4
dc.citation.startPage: 1
dc.citation.endPage: 13
dc.type.docType: Article
dc.description.isOpenAccess: Y
dc.description.journalRegisteredClass: scie
dc.description.journalRegisteredClass: scopus
dc.relation.journalResearchArea: Chemistry
dc.relation.journalResearchArea: Engineering
dc.relation.journalResearchArea: Instruments & Instrumentation
dc.relation.journalWebOfScienceCategory: Chemistry, Analytical
dc.relation.journalWebOfScienceCategory: Engineering, Electrical & Electronic
dc.relation.journalWebOfScienceCategory: Instruments & Instrumentation
dc.subject.keywordAuthor: medical image captioning
dc.subject.keywordAuthor: deep learning
dc.subject.keywordAuthor: transformer
Files in This Item
There are no files associated with this item.
Appears in Collections
College of Advanced Convergence Engineering > Department of Computer Science and Artificial Intelligence > 1. Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kim, Ji Hie
College of Advanced Convergence Engineering (Department of Computer Science and Artificial Intelligence)
