Improving Automatic Speech Recognition by Classifying Adult and Child Speakers into Separate Groups using Speech Rate Rhythmicity Parameter


Research field: Artificial Intelligence



Conference: 2020 International Conference on Signal Processing and Communications (SPCOM)


Abstract

When children's speech is transcribed using acoustic models trained on adults' data, recognition performance degrades severely. Similar degradation is observed when adults' speech is recognized by an automatic speech recognition (ASR) system trained on children's speech. This problem can be overcome by using two separate ASR systems for the two groups of speakers, but that approach requires an effective technique for detecting whether a given utterance comes from an adult or a child speaker. In this paper, we present a simple and novel technique for this detection task. The proposed approach is based on the speech-rate rhythmicity parameter (SRRP). Since speaking rates differ significantly between adults and children, the SRRP values also differ markedly between the two groups. Hence, by computing the SRRP value for a given speech utterance, it can be determined whether the utterance is from an adult or a child speaker. The corresponding ASR system can then be used to achieve improved recognition performance. Alternatively, existing techniques for improving children's speech recognition on systems trained on adult data can be applied directly once it is known that the data is from a child speaker. Both aspects have been experimentally validated in this work.
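The routing step described in the abstract can be illustrated with a minimal sketch. It assumes the SRRP value has already been computed for the utterance; the abstract does not give the SRRP formula, the decision threshold, or which side of the threshold corresponds to children, so `threshold`, `child_above`, and the two recognizer callables below are hypothetical placeholders, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable

# Placeholder type: a recognizer maps an utterance identifier to a transcript.
Recognizer = Callable[[str], str]


@dataclass
class SpeakerGroupRouter:
    """Routes an utterance to an adult or child ASR system by its SRRP value."""

    threshold: float       # hypothetical boundary; would be tuned on held-out data
    child_above: bool      # assumption: the abstract does not say which side is "child"
    adult_asr: Recognizer  # hypothetical recognizer trained on adults' speech
    child_asr: Recognizer  # hypothetical recognizer trained on children's speech

    def classify(self, srrp: float) -> str:
        # Compare the precomputed SRRP value against the threshold and
        # interpret the result according to the configured direction.
        is_child = (srrp > self.threshold) == self.child_above
        return "child" if is_child else "adult"

    def transcribe(self, utterance_id: str, srrp: float) -> str:
        # Pick the ASR system that matches the detected speaker group.
        group = self.classify(srrp)
        asr = self.child_asr if group == "child" else self.adult_asr
        return asr(utterance_id)


# Usage with stand-in recognizers that only tag the utterance identifier.
router = SpeakerGroupRouter(
    threshold=0.5,
    child_above=True,
    adult_asr=lambda utt: "adult:" + utt,
    child_asr=lambda utt: "child:" + utt,
)
result = router.transcribe("utt1", srrp=0.7)
```

Keeping the threshold direction configurable avoids committing to a detail the abstract leaves open; in practice both values would be set from labeled adult and child data.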


Author Profile
S. Shahnawazuddin

Department of Electronics and Communication Engineering, NIT Patna, India

Andorra
Author Profile
Tarun Sai Bandarupalli

Department of Electronics and Communication Engineering, NIT Patna, India

Andorra
Author Profile
R Chakravarthy

Department of Electronics and Communication Engineering, NIT Patna, India

Andorra

📄 Paper Information

Publication year: 2020
Citations: 4
Publication country: Andorra
Site: IEEE
