ITRT(IT Research Trends)

TACR-Net: Editing on Deep Video and Voice Portraits

Research area: Strategies

Paper keywords: #challenging #acoustic #distortion #forgery #arbitrary

Conference: MM '21: Proceedings of the 29th ACM International Conference on Multimedia

Abstract

Using an arbitrary speech clip to edit the mouth of the portrait in a target video is a novel yet challenging task. Although impressive results have been achieved, existing methods still have three limitations: 1) because the acoustic features are not completely decoupled from person identity, there is no global method for mapping speech to facial features (i.e., landmarks, expression blendshapes); 2) the audio-driven talking face sequences generated by a simple cascade structure usually lack temporal consistency and spatial correlation, which leads to inconsistent changes in details; 3) the forgery is always performed at the video level, without considering forgery of the voice, especially the synchronization of the converted voice and the mouth. To address these distortion problems, we propose a novel deep learning framework, named Temporal-Refinement Autoregressive-Cascade Rendering Network (TACR-Net), for audio-driven dynamic talking face editing. The proposed TACR-Net encodes facial expression blendshapes from the given acoustic features without separate training for a specific video. TACR-Net also involves a novel autoregressive cascade structure generator for video re-rendering. Finally, we transform the in-the-wild speech to the target portrait and obtain a photo-realistic and audio-realistic video.

📄 Paper information

Publication year	2021
Citations	15
Country of publication	Andorra, China
Site	ACM

Luchuan Song

Bin Liu

Guojun Yin

Xiaoyi Dong

Yufei Zhang

Jiaxuan Bai

Related papers (66)
