Detailed Information

Cited 21 times in Web of Science · Cited 27 times in Scopus

HMTL: Heterogeneous Modality Transfer Learning for Audio-Visual Sentiment Analysis (Open Access)

Authors
Seo, Sanghyun; Na, Sanghyuck; Kim, Juntae
Issue Date
2020
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Keywords
Sentiment analysis; Visualization; Data integration; Acoustics; Feature extraction; Analytical models; Data models; Multimodal sentiment analysis; heterogeneous transfer learning; data fusion
Citation
IEEE ACCESS, v.8, pp. 140426-140437
Pages
12
Indexed
SCIE
SCOPUS
Journal Title
IEEE ACCESS
Volume
8
Start Page
140426
End Page
140437
URI
https://scholarworks.dongguk.edu/handle/sw.dongguk/7171
DOI
10.1109/ACCESS.2020.3006563
ISSN
2169-3536
Abstract
Multimodal sentiment analysis extends traditional language-based sentiment analysis by using other relevant modality data, typically applying visual, textual, and acoustic representations for sentiment prediction. Recently, various data fusion methodologies have been proposed for multimodal sentiment analysis. In most cases, the textual modality plays the major role, while the visual and acoustic modalities serve as auxiliary sources. However, in general multimedia such as video, text transcripts of an individual's speech are not provided. Research on audio-visual sentiment analysis methodologies that do not depend on the text modality is therefore essential for real-world industrial applications of multimodal sentiment analysis. Improving audio-visual sentiment analysis is important because it currently performs worse than multimodal sentiment analysis that includes the text modality. In this paper, we propose heterogeneous modality transfer learning (HMTL), which uses the knowledge of aligned text data as a source modality in transfer learning to improve audio-visual sentiment analysis performance. Our approach uses a decoder and adversarial learning techniques to reduce the gap between the source and target modalities in the embedding space of the multimodal representation. In experiments, the proposed methodology outperformed recent unimodal and bimodal approaches to audio-visual sentiment analysis.
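The adversarial component described in the abstract can be illustrated with a generic modality-alignment sketch. This is not the paper's implementation; it is a minimal toy example, assuming the standard adversarial setup: a discriminator is trained to tell source (text) embeddings from target (audio-visual) embeddings, and the target encoder's adversarial loss is the discriminator's cross-entropy when target embeddings are labelled as source. All names (`source`, `target`, the 1-D "embeddings") are hypothetical simplifications.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce(p, y):
    # Binary cross-entropy for a single prediction p against label y.
    eps = 1e-9
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

# Toy 1-D "embeddings": source (text) modality clusters near +1.0,
# target (audio-visual) modality clusters near -1.0.
random.seed(0)
source = [1.0 + random.gauss(0, 0.1) for _ in range(100)]
target = [-1.0 + random.gauss(0, 0.1) for _ in range(100)]

# Linear discriminator d(z) = sigmoid(w*z + b); label 1 = source, 0 = target.
# Train it by full-batch gradient descent on the cross-entropy loss.
w, b, lr = 0.0, 0.0, 0.5
data = [(z, 1) for z in source] + [(z, 0) for z in target]
for _ in range(200):
    gw = gb = 0.0
    for z, y in data:
        p = sigmoid(w * z + b)
        gw += (p - y) * z   # d(loss)/dw for logistic regression
        gb += (p - y)       # d(loss)/db
    w -= lr * gw / len(data)
    b -= lr * gb / len(data)

# Adversarial loss seen by the target encoder: target embeddings are
# labelled as source (y = 1), so the encoder is pushed to move them
# toward the source region of the embedding space.
adv_loss = sum(bce(sigmoid(w * z + b), 1) for z in target) / len(target)
print("adversarial loss for unaligned target embeddings:", round(adv_loss, 3))
```

Because the two clusters are well separated, the trained discriminator confidently rejects target embeddings, so the adversarial loss is large; minimizing it with respect to the encoder (while the discriminator keeps training against it) is what drives the modalities together in the shared space.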
Files in This Item
There are no files associated with this item.
Appears in
Collections
College of Advanced Convergence Engineering > Department of Computer Science and Artificial Intelligence > 1. Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher


Kim, Jun Tae
College of Advanced Convergence Engineering (Department of Computer Science and Artificial Intelligence)
