Generation and Analysis of Vocal Spectrograms: Combining Generative Adversarial Networks


Research field: Artificial Intelligence



Conference: MIDA '24: Proceedings of the 2024 International Conference on Machine Intelligence and Digital Applications


Abstract

A vocal spectrogram is a representation of sound in the frequency domain and has important applications in fields such as music and speech. With a generative adversarial network (GAN), realistic vocal spectrograms can be generated, and the generated spectrograms can then be analyzed. This article introduces the basic principles and structure of GANs, including the design of the generator and discriminator networks; discusses data preparation and the definition of the loss function for vocal spectrogram generation; and describes in detail the steps of training a GAN, including alternating training of the generator and discriminator so that the generated vocal spectrograms become more realistic. It then introduces technologies and algorithms for analyzing the generated spectrograms and examines evaluation indicators for the generated vocal spectrograms or analysis results. The vocal spectrograms generated by the GAN perform well, with the highest clarity reaching 92%. The generation and analysis of vocal spectrograms can play an increasingly important role in audio processing and acoustic research and bring new breakthroughs to the development of audio technology.
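
The abstract does not give the spectrogram parameters or the exact loss used, so the following is only a minimal sketch under stated assumptions. It computes a magnitude spectrogram with a naive short-time Fourier transform and writes out the standard binary cross-entropy GAN losses that alternating generator and discriminator training would minimize. The frame length, hop size, toy test tone, non-saturating generator loss, and all function names are illustrative choices, not details taken from the paper.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Short-time Fourier magnitudes: one row per frame, one column per bin.

    frame_len and hop are assumed values, not taken from the paper.
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        # Naive DFT over the non-negative frequency bins 0..frame_len/2.
        row = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                frame[i] * cmath.exp(-2j * math.pi * k * i / frame_len)
                for i in range(frame_len)
            )
            row.append(abs(acc))
        frames.append(row)
    return frames


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy: real spectrograms labelled 1, generated ones 0.

    d_real and d_fake are discriminator outputs in [0, 1].
    """
    return -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss (an assumed variant): the generator is
    rewarded when the discriminator scores its output as real."""
    return -math.log(d_fake + eps)


# Toy input (hypothetical): a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = magnitude_spectrogram(tone)

# Strongest frequency bin in the first frame, converted to hertz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

In an alternating schedule, each training step would first update the discriminator to lower `discriminator_loss` on a batch of real and generated spectrograms, then update the generator to lower `generator_loss` against the refreshed discriminator.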


Author Profile
Zhe Yang

Weifang Engineering Vocational College, China

Paper Information

Publication year: 2024
Citations: 0
Country of publication: China
Site: ACM