Metaverse / Virtual Humans


Updated 2022-09-30

Facial Landmark Predictions with Applications to Metaverse

Authors: Qiao Han, Jun Zhao, Kwok-Yan Lam

This research aims to make metaverse characters more realistic by adding lip animations learnt from videos in the wild. To achieve this, we extend the Tacotron 2 text-to-speech synthesizer to generate lip movements together with the mel spectrogram in a single pass. The encoder and gate-layer weights are pre-trained on the LJ Speech 1.1 dataset, while the decoder is retrained on 93 clips of TED talk videos extracted from the LRS 3 dataset. Our novel decoder predicts the displacement of 20 lip landmark positions across time, using labels automatically extracted by the OpenFace 2.0 landmark predictor. Training converged in 7 hours using less than 5 minutes of video. We conducted an ablation study on the Pre/Post-Net and the pre-trained encoder weights to demonstrate the effectiveness of transfer learning between audio and visual speech data.
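The architectural change the abstract describes lives in the decoder: at each step it emits not only a mel-spectrogram frame and a gate (stop-token) logit, as in standard Tacotron 2, but also displacements for the 20 lip landmarks. Below is a minimal sketch of such a dual-headed decoder step, assuming PyTorch; the class name, module layout, and dimensions are illustrative assumptions, not the authors' actual implementation.

```python
# Illustrative sketch (not the paper's code): a Tacotron 2-style decoder
# step extended with a lip-landmark head alongside the mel and gate heads.
import torch
import torch.nn as nn


class LipAwareDecoderStep(nn.Module):
    """One decoder step: a shared recurrent state feeds three output heads."""

    def __init__(self, context_dim=512, hidden_dim=1024,
                 n_mel=80, n_landmarks=20):
        super().__init__()
        self.rnn = nn.LSTMCell(context_dim, hidden_dim)
        # Standard Tacotron 2 heads: mel frame and stop-token (gate) logit.
        self.mel_proj = nn.Linear(hidden_dim, n_mel)
        self.gate_proj = nn.Linear(hidden_dim, 1)
        # Extra head: an (x, y) displacement for each of the 20 lip landmarks.
        self.lip_proj = nn.Linear(hidden_dim, n_landmarks * 2)

    def forward(self, context, state):
        h, c = self.rnn(context, state)
        mel = self.mel_proj(h)    # (batch, 80): one mel-spectrogram frame
        gate = self.gate_proj(h)  # (batch, 1): stop-token logit
        lips = self.lip_proj(h)   # (batch, 40): landmark displacements
        return mel, gate, lips, (h, c)


# Usage: one decoder step over a batch of synthetic attention contexts.
step = LipAwareDecoderStep()
ctx = torch.randn(4, 512)
state = (torch.zeros(4, 1024), torch.zeros(4, 1024))
mel, gate, lips, state = step(ctx, state)
print(mel.shape, gate.shape, lips.shape)  # (4, 80) (4, 1) (4, 40)
```

Because all three heads share the same recurrent state, the mel and landmark streams stay synchronized frame by frame, which is what lets the pre-trained encoder transfer from the audio-only LJ Speech corpus to the audiovisual LRS 3 clips.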
