Development of Speech Technologies at Trunin-Donskoy’s School: From Sound Recognition to Natural Speech Recognition


Research field: Artificial Intelligence



Venue: Pattern Recognition and Image Analysis


Abstract

The team of the speech recognition sector of the Computing Center of the Russian Academy of Sciences has participated in the development of speech technologies since their appearance in the Soviet Union in the 1960s. Several generations of researchers have worked in this field since then, and the approaches to solving speech recognition problems have repeatedly undergone fundamental changes: methods based on recognizing individual sounds and parts of words using sets of expert-derived rules have given way to methods for the recognition and semantic interpretation of natural continuous speech based on mathematical models trained on large data sets. The review of the results begins with a description of the approach to building speech recognition systems proposed by the head of the sector, V.N. Trunin-Donskoy. This hardware-software approach was long the “calling card” of the team and played a significant role in popularizing speech recognition and building awareness of the benefits of developing speech technologies in the Soviet Union. Solutions based on a combination of software and specialized hardware are now standard, but were new at that time. As the field developed and problems grew more complex with the transition to discrete speech recognition with large dictionaries, recognition methods based on systems of expert rules were replaced by methods based on classical optimization algorithms, which were proposed and improved by the staff and graduate students of the team. A distinctive feature of research at the Computing Center of the USSR Academy of Sciences was an understanding of the importance of collecting, classifying, and annotating representative arrays of speech data. In this area, the work of the team members was significantly ahead of not only domestic but also foreign research of the time.
An important area of applied work in the 1970s–1980s was the use of the then-modern methods of digital speech processing and the creation of problem-oriented tools and language facilities that made it possible to simulate, in real time, the operation of components of speech recognition and signal processing systems, in particular interactive methods for filtering speech signals. The steady increase in computing power and in the volume of data corpora has made it possible to move to probabilistic speech modeling technologies and to formulate and solve natural speech recognition problems. Methods, data corpora, and software systems have been developed for recognizing spontaneous speech, separating voices, determining speaker gender, identifying keywords in a speech stream, and classifying the subject of a speech message. Recent studies concern the development of computationally efficient neural network methods and models for speech recognition and processing, intended for use on mobile devices.


Author Profile
V. Ya. Chuchupal

Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences 119333 Moscow Russian Federation

Andorra
Author Profile
K. A. Makovkin

Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences 119333 Moscow Russian Federation

Andorra

Paper Information

Publication year: 2024
Citations: 0
Country of publication: Andorra
Site: Springer