Detailed Information

Cited 8 times in Web of Science · Cited 12 times in Scopus

Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms

Full metadata record
dc.contributor.author: Kim, Junghun
dc.contributor.author: An, Yoojin
dc.contributor.author: Kim, Jihie
dc.date.accessioned: 2023-04-27T13:41:12Z
dc.date.available: 2023-04-27T13:41:12Z
dc.date.issued: 2022-09
dc.identifier.issn: 2958-1796
dc.identifier.uri: https://scholarworks.dongguk.edu/handle/sw.dongguk/3829
dc.description.abstract: Attention has become one of the most commonly used mechanisms in deep learning approaches. The attention mechanism can help the system focus more on critical regions of the feature space. For example, high-amplitude regions can play an important role in Speech Emotion Recognition (SER). In this paper, we identify misalignments between the attention and the signal amplitude in the existing multi-head self-attention. To improve the attention area, we propose to use a Focus-Attention (FA) mechanism and a novel Calibration-Attention (CA) mechanism in combination with the multi-head self-attention. Through the FA mechanism, the network can detect the largest-amplitude part of the segment. By employing the CA mechanism, the network can modulate the information flow by assigning different weights to each attention head and improve the utilization of surrounding contexts. To evaluate the proposed method, experiments are performed on the IEMOCAP and RAVDESS datasets. Experimental results show that the proposed framework significantly outperforms the state-of-the-art approaches on both datasets. Copyright © 2022 ISCA.
dc.format.extent: 5
dc.language: English
dc.language.iso: ENG
dc.publisher: International Speech Communication Association
dc.title: Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
dc.type: Article
dc.publisher.location: United States
dc.identifier.doi: 10.21437/Interspeech.2022-299
dc.identifier.scopusid: 2-s2.0-85140056928
dc.identifier.wosid: 000900724500027
dc.identifier.bibliographicCitation: Interspeech 2022, v.2022-September, pp. 136-140
dc.citation.title: Interspeech 2022
dc.citation.volume: 2022-September
dc.citation.startPage: 136
dc.citation.endPage: 140
dc.type.docType: Proceedings Paper
dc.description.isOpenAccess: N
dc.description.journalRegisteredClass: foreign
dc.relation.journalResearchArea: Acoustics
dc.relation.journalResearchArea: Audiology & Speech-Language Pathology
dc.relation.journalResearchArea: Computer Science
dc.relation.journalResearchArea: Engineering
dc.relation.journalWebOfScienceCategory: Acoustics
dc.relation.journalWebOfScienceCategory: Audiology & Speech-Language Pathology
dc.relation.journalWebOfScienceCategory: Computer Science, Artificial Intelligence
dc.relation.journalWebOfScienceCategory: Engineering, Electrical & Electronic
dc.subject.keywordAuthor: attention
dc.subject.keywordAuthor: emotion
dc.subject.keywordAuthor: speech recognition
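The abstract describes a Calibration-Attention (CA) mechanism that assigns different weights to each attention head of a multi-head self-attention layer. The paper's exact FA/CA formulation is not reproduced here; the sketch below shows one plausible reading of the calibration idea — softmax-normalized per-head weights scaling each head's output — in NumPy. All names, shapes, and the gating scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def calibrated_mhsa(X, Wq, Wk, Wv, head_logits):
    """Multi-head self-attention whose per-head outputs are scaled by
    softmax-normalized calibration weights (one weight per head).
    This is a hypothetical sketch of a CA-style mechanism, not the
    paper's exact method."""
    n_heads, d_model, d_head = Wq.shape
    gates = softmax(head_logits)                         # per-head weights, sum to 1
    heads = []
    for h in range(n_heads):
        Q, K, V = X @ Wq[h], X @ Wk[h], X @ Wv[h]        # (T, d_head) each
        A = softmax(Q @ K.T / np.sqrt(d_head), axis=-1)  # (T, T) attention map
        heads.append(gates[h] * (A @ V))                 # scale this head's output
    return np.concatenate(heads, axis=-1)                # (T, n_heads * d_head)

rng = np.random.default_rng(0)
T, d_model, n_heads = 10, 16, 4
d_head = d_model // n_heads
X = rng.normal(size=(T, d_model))                        # e.g. frame-level speech features
Wq, Wk, Wv = (rng.normal(size=(n_heads, d_model, d_head)) for _ in range(3))
head_logits = np.zeros(n_heads)                          # uniform calibration to start
out = calibrated_mhsa(X, Wq, Wk, Wv, head_logits)
print(out.shape)                                         # (10, 16)
```

In a trained model the calibration logits would be learned parameters, letting the network down-weight heads whose attention maps are misaligned with informative (e.g. high-amplitude) regions.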
Files in This Item
There are no files associated with this item.
Appears in Collections
College of Advanced Convergence Engineering > Department of Computer Science and Artificial Intelligence > 1. Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher


Kim, Ji Hie
College of Advanced Convergence Engineering (Department of Computer Science and Artificial Intelligence)
