ITRT(IT Research Trends)

Enhancing speech emotion recognition: a deep learning approach with self-attention and acoustic features

Research field: Artificial Intelligence

Keywords: #speech #emotional #emotions #acoustic #spectrograms

Venue: The Journal of Supercomputing

Abstract

Speech emotion recognition (SER), which involves detecting and classifying emotions from speech signals, plays a crucial role in human–computer interaction. However, challenges such as variability in emotional expression and limited labeled data have hindered progress in this area. To address these issues, we propose a novel deep learning framework that combines multiple acoustic features, including MFCCs, Mel-spectrograms, and temporal-frequency domain features. Our model leverages three parallel CNN-LSTM branches for sequential feature extraction, followed by a self-attention mechanism to integrate the extracted representations. A final LSTM layer, along with dense layers, refines the classification process. This innovative fusion of features and attention mechanisms significantly enhances emotion recognition performance. Experimental evaluations demonstrate the effectiveness of our approach in improving classification accuracy.
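The fusion step described in the abstract, where self-attention integrates the outputs of the three parallel branches, can be illustrated with a minimal sketch. The abstract does not specify the attention variant, the embedding size, or any weights, so everything below is an assumption: the sketch uses scaled dot-product self-attention with identity query/key/value projections (no learned parameters), and the three 4-dimensional vectors are made-up stand-ins for the MFCC, Mel-spectrogram, and temporal-frequency branch outputs. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    tokens: list of equal-length float vectors, one per branch (assumed).
    Returns (outputs, weights): each output is a weighted mix of all tokens,
    and weights[i][j] is how much token i attends to token j.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of all tokens (values equal the tokens here).
        outputs.append(
            [sum(row[j] * tokens[j][i] for j in range(len(tokens)))
             for i in range(dim)]
        )
    return outputs, weights


# Hypothetical branch embeddings; the values are illustrative only.
mfcc_vec = [0.2, 0.1, -0.3, 0.5]
mel_vec = [0.4, -0.2, 0.1, 0.3]
tf_vec = [-0.1, 0.6, 0.2, 0.0]

fused, attn = self_attention([mfcc_vec, mel_vec, tf_vec])
```

Here `fused` holds one attention-mixed vector per branch and each row of `attn` sums to 1. In a trained model the projections would be learned, and the fused sequence would then pass to the final LSTM and dense layers mentioned in the abstract.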

Khadijeh Aghajani

Department of Computer Engineering, University of Mazandaran, Babolsar, Iran

Mahbanou Zohrevandi

Department of Computer Engineering, Malayer University, Malayer, Iran

📄 Paper information

Publication year	2025
Citations	0
Country of publication	Iran
Site	Springer
