ITRT (IT Research Trends)

Research on Speech Recognition Methods with Emotional Description

Research field: Artificial Intelligence

Paper keywords: #computer #speech #communication #language #emojis

Conference: 2024 International Conference on Artificial Intelligence and Power Systems (AIPS)

Abstract

Current speech recognition technology focuses only on the linguistic information in speech and ignores emotional information, which impairs the user's accurate understanding of the original speech. It is therefore necessary to realize speech recognition with emotional description and thereby improve the user experience of semantic recognition products. First, emojis, which are widely used in computer communication, are selected as emotion labels and attached to the recognized text as the task output. Based on the meaning and characteristics of emojis, discrete emotion classification and continuous emotion scores are used to convert sample emotion labels into emojis, and the Speech Recognition and emoji Prediction (SReP) dataset is proposed. Second, an end-to-end recognition model is constructed in which emoji recommendation is treated as one round of the autoregressive speech recognition process. A mixed recognition method for characters and emojis, a speech-text fusion module, and smooth regularization of the new labels are designed, and the tasks are implemented with a HuBERT-based feature extractor and a Conformer module. Experimental results on the SReP dataset demonstrate the effectiveness of the proposed method.

Yingjie Qi

School of Software, Xinjiang University, Urumqi, China

📄 Paper information

Publication year	2024
Citations	109
Country of publication	China
Site	IEEE
