Detailed Information

Cited 21 times in Web of Science · Cited 27 times in Scopus

HMTL: Heterogeneous Modality Transfer Learning for Audio-Visual Sentiment Analysis (Open Access)

Authors
Seo, Sanghyun; Na, Sanghyuck; Kim, Juntae
Issue Date
2020
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Keywords
Sentiment analysis; Visualization; Data integration; Acoustics; Feature extraction; Analytical models; Data models; Multimodal sentiment analysis; heterogeneous transfer learning; data fusion
Citation
IEEE ACCESS, v.8, pp. 140426-140437
Pages
12
Indexed
SCIE
SCOPUS
Journal Title
IEEE ACCESS
Volume
8
Start Page
140426
End Page
140437
URI
https://scholarworks.dongguk.edu/handle/sw.dongguk/7171
DOI
10.1109/ACCESS.2020.3006563
ISSN
2169-3536
Abstract
Multimodal sentiment analysis extends traditional language-based sentiment analysis by using other relevant modality data, typically applying visual, textual, and acoustic representations for sentiment prediction. Recently, various data fusion methodologies have been proposed for multimodal sentiment analysis. In most cases, the textual modality plays the major role, while the visual and acoustic modalities serve as auxiliary sources. However, in general multimedia such as video, text transcripts of an individual's speech are not provided. Research on audio-visual sentiment analysis methodologies that do not depend on the text modality is therefore essential for real-world industrial applications of multimodal sentiment analysis. Improving audio-visual sentiment analysis is important because it currently performs worse than multimodal sentiment analysis that includes the text modality. In this paper, we propose heterogeneous modality transfer learning (HMTL), which uses the knowledge of aligned text data as a source modality in transfer learning to improve audio-visual sentiment analysis performance. Our approach uses a decoder and adversarial learning techniques to reduce the gap between the source and target modalities in the embedding space of the multimodal representation. In experiments, the proposed methodology outperformed recent unimodal and bimodal approaches to audio-visual sentiment analysis.
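The adversarial component described in the abstract can be illustrated with a generic modality-alignment sketch. This is not the paper's implementation; it is a minimal toy example, assuming the standard adversarial setup: a discriminator is trained to tell source (text) embeddings from target (audio-visual) embeddings, and the target encoder's adversarial loss is the discriminator's cross-entropy when target embeddings are labelled as source. All names (`source`, `target`, the 1-D "embeddings") are hypothetical simplifications.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce(p, y):
    # Binary cross-entropy for a single prediction p against label y.
    eps = 1e-9
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

# Toy 1-D "embeddings": source (text) modality clusters near +1.0,
# target (audio-visual) modality clusters near -1.0.
random.seed(0)
source = [1.0 + random.gauss(0, 0.1) for _ in range(100)]
target = [-1.0 + random.gauss(0, 0.1) for _ in range(100)]

# Linear discriminator d(z) = sigmoid(w*z + b); label 1 = source, 0 = target.
# Train it by full-batch gradient descent on the cross-entropy loss.
w, b, lr = 0.0, 0.0, 0.5
data = [(z, 1) for z in source] + [(z, 0) for z in target]
for _ in range(200):
    gw = gb = 0.0
    for z, y in data:
        p = sigmoid(w * z + b)
        gw += (p - y) * z   # d(loss)/dw for logistic regression
        gb += (p - y)       # d(loss)/db
    w -= lr * gw / len(data)
    b -= lr * gb / len(data)

# Adversarial loss seen by the target encoder: target embeddings are
# labelled as source (y = 1), so the encoder is pushed to move them
# toward the source region of the embedding space.
adv_loss = sum(bce(sigmoid(w * z + b), 1) for z in target) / len(target)
print("adversarial loss for unaligned target embeddings:", round(adv_loss, 3))
```

Because the two clusters are well separated, the trained discriminator confidently rejects target embeddings, so the adversarial loss is large; minimizing it with respect to the encoder (while the discriminator keeps training against it) is what drives the modalities together in the shared space.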
Files in This Item
There are no files associated with this item.
Appears in
Collections
College of Advanced Convergence Engineering > Department of Computer Science and Artificial Intelligence > 1. Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher


Kim, Jun Tae
College of Advanced Convergence Engineering (Department of Computer Science and Artificial Intelligence)
