Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Kim, Junghun | - |
| dc.contributor.author | An, Yoojin | - |
| dc.contributor.author | Kim, Jihie | - |
| dc.date.accessioned | 2023-04-27T13:41:12Z | - |
| dc.date.available | 2023-04-27T13:41:12Z | - |
| dc.date.issued | 2022-09 | - |
| dc.identifier.issn | 2958-1796 | - |
| dc.identifier.uri | https://scholarworks.dongguk.edu/handle/sw.dongguk/3829 | - |
| dc.description.abstract | Attention has become one of the most commonly used mechanisms in deep learning approaches. The attention mechanism can help the system focus more on the feature space's critical regions. For example, high amplitude regions can play an important role for Speech Emotion Recognition (SER). In this paper, we identify misalignments between the attention and the signal amplitude in the existing multi-head self-attention. To improve the attention area, we propose to use a Focus-Attention (FA) mechanism and a novel Calibration-Attention (CA) mechanism in combination with the multi-head self-attention. Through the FA mechanism, the network can detect the largest amplitude part in the segment. By employing the CA mechanism, the network can modulate the information flow by assigning different weights to each attention head and improve the utilization of surrounding contexts. To evaluate the proposed method, experiments are performed with the IEMOCAP and RAVDESS datasets. Experimental results show that the proposed framework significantly outperforms the state-of-the-art approaches on both datasets. Copyright © 2022 ISCA. | - |
| dc.format.extent | 5 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | International Speech Communication Association | - |
| dc.title | Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms | - |
| dc.type | Article | - |
| dc.publisher.location | United States | - |
| dc.identifier.doi | 10.21437/Interspeech.2022-299 | - |
| dc.identifier.scopusid | 2-s2.0-85140056928 | - |
| dc.identifier.wosid | 000900724500027 | - |
| dc.identifier.bibliographicCitation | Interspeech 2022, v.2022-September, pp. 136-140 | - |
| dc.citation.title | Interspeech 2022 | - |
| dc.citation.volume | 2022-September | - |
| dc.citation.startPage | 136 | - |
| dc.citation.endPage | 140 | - |
| dc.type.docType | Proceedings Paper | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | foreign | - |
| dc.relation.journalResearchArea | Acoustics | - |
| dc.relation.journalResearchArea | Audiology & Speech-Language Pathology | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalWebOfScienceCategory | Acoustics | - |
| dc.relation.journalWebOfScienceCategory | Audiology & Speech-Language Pathology | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
| dc.subject.keywordAuthor | attention | - |
| dc.subject.keywordAuthor | emotion | - |
| dc.subject.keywordAuthor | speech recognition | - |
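
The abstract above describes two additions to multi-head self-attention: a Focus-Attention (FA) mechanism that steers attention toward the highest-amplitude part of a segment, and a Calibration-Attention (CA) mechanism that assigns a different weight to each attention head. The paper is not open access here, so the following is only a minimal PyTorch sketch of the head re-weighting idea, assuming a squeeze-and-excitation-style gate over heads; the class name, gate architecture, and shapes are illustrative assumptions, not the authors' implementation (which also differs in that CA operates on the true per-head outputs rather than slices of the projected output, as simplified below).

```python
# Illustrative sketch: head-wise "calibration" gating on top of standard
# multi-head self-attention. All names and sizes are assumptions for the
# sake of example, not the method from the cited paper.
import torch
import torch.nn as nn

class CalibratedSelfAttention(nn.Module):
    def __init__(self, embed_dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Tiny gating network: pools each head's output over time and
        # predicts one weight per head (squeeze-and-excitation style).
        head_dim = embed_dim // num_heads
        self.gate = nn.Sequential(
            nn.Linear(head_dim, head_dim // 2),
            nn.ReLU(),
            nn.Linear(head_dim // 2, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, embed_dim)
        out, _ = self.attn(x, x, x)  # standard multi-head self-attention
        b, t, d = out.shape
        # Simplification: split the projected output into per-head slices.
        heads = out.view(b, t, self.num_heads, d // self.num_heads)
        pooled = heads.mean(dim=1)            # (b, num_heads, head_dim)
        w = self.gate(pooled).unsqueeze(1)    # (b, 1, num_heads, 1)
        return (heads * w).reshape(b, t, d)   # re-weighted heads, merged back

if __name__ == "__main__":
    x = torch.randn(2, 100, 256)  # e.g. 100 frames of 256-dim speech features
    layer = CalibratedSelfAttention(embed_dim=256, num_heads=8)
    print(layer(x).shape)  # torch.Size([2, 100, 256])
```

The gate lets the network suppress heads whose attention maps are uninformative for a given utterance, which is one plausible reading of "assigning different weights to each attention head" in the abstract.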
