Research field: Artificial Intelligence
Venue: Multimedia Tools and Applications
Speech command identification systems have become a necessary tool for transcribing speech into text and for hands-free control of devices and hazardous processes. They also find applications in voice-based online content search and in speech-to-text conversion for differently-abled persons. This work extracts spectrograms from speech signals, applies 80% of the features to a layered 2D convolutional neural network (CNN) architecture, and creates a group of CNN models. The CNN models are used on test features to recognize the words uttered by normal and hearing-impaired (HI) speakers. System performance is assessed by the recognition rate obtained with spectrogram, Mel-spectrogram, and Gammatonegram features and the CNN. In addition, the intelligibility of HI speech is enhanced using the phase spectrum compensation (PSC) technique. Decision-level fusion of spectrogram features has provided accuracies of 95%, 98%, and 99% for regular speech recognition, HI speech recognition without PSC, and HI speech recognition with PSC, respectively. Twenty isolated words are considered for regular speech command recognition, and ten isolated digits are considered for the HI speech recognition system. The automated speech command recognition is implemented in real time on Raspberry Pi hardware, and the validation error for the test data is 0.57692%.
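The two steps the abstract names, spectrogram extraction and decision-level fusion, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `log_spectrogram` and `fuse_decisions`, the frame length, the hop size, and the fusion rule (averaging class probabilities across models) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time FFT.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Transpose so rows are frequency bins and columns are time frames.
    return np.log(magnitude + 1e-10).T

def fuse_decisions(prob_list):
    """Decision-level fusion by averaging per-model class probabilities.

    Averaging is an assumed rule; the abstract does not state which one is used.
    """
    mean_probs = np.mean(np.stack(prob_list), axis=0)
    return int(np.argmax(mean_probs)), mean_probs
```

In this sketch, each CNN in the group would output one probability vector per utterance, and `fuse_decisions` would combine those vectors into a single predicted word label.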
| Publication year | 2023 |
|---|---|
| Citations | 2 |
| Country of publication | Andorra, India |
| Site | Springer |