GAN


Updated 2022-09-02

REMOT: A Region-to-Whole Framework for Realistic Human Motion Transfer

Authors: Quanwei Yang, Xinchen Liu, Wu Liu, Hongtao Xie, Xiaoyan Gu, Lingyun Yu, Yongdong Zhang

Human Video Motion Transfer (HVMT) aims to generate, from an image of a source person, a video of that person imitating the motion of a driving person. Existing methods for HVMT mainly exploit Generative Adversarial Networks (GANs) to perform the warping operation based on the flow estimated from the source person image and each driving video frame. However, these methods always generate obvious artifacts due to the dramatic differences in poses, scales, and shifts between the source person and the driving person. To overcome these challenges, this paper presents a novel REgion-to-whole human MOtion Transfer (REMOT) framework based on GANs. To generate realistic motions, REMOT adopts a progressive generation paradigm: it first generates each body part in the driving pose without flow-based warping, then composites all parts into a complete person performing the driving motion. Moreover, to preserve the natural global appearance, we design a Global Alignment Module to align the scale and position of the source person with those of the driving person based on their layouts. Furthermore, we propose a Texture Alignment Module to keep each part of the person aligned according to the similarity of the texture. Finally, through extensive quantitative and qualitative experiments, our REMOT achieves state-of-the-art results on two public benchmarks.
PDF 10 pages, 5 figures. Accepted by ACM MM 2022
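
The abstract says the Global Alignment Module aligns the scale and position of the source person with the driving person based on their layouts, but it does not say how. As a rough illustration of the idea only, and not the authors' module, the sketch below assumes each layout is a binary silhouette mask and uses a simple bounding-box heuristic: match the box heights, then make the box centers coincide. The function names `bounding_box` and `alignment_params` are hypothetical.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the nonzero region."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask is empty")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def alignment_params(source_mask, driving_mask):
    """Estimate a uniform scale and a (dy, dx) shift for the source person.

    Assumption (not from the paper): a source pixel at (y, x) is mapped to
    (scale * y + dy, scale * x + dx), so that the source bounding box gets
    the same height and the same center as the driving bounding box.
    """
    st, sl, sb, sr = bounding_box(source_mask)
    dt, dl, db, dr = bounding_box(driving_mask)

    # Ratio of person heights, measured in pixels.
    scale = (db - dt + 1) / (sb - st + 1)

    # Center of the source box after scaling about the image origin.
    src_cy = scale * (st + sb) / 2
    src_cx = scale * (sl + sr) / 2

    # Center of the driving box.
    drv_cy = (dt + db) / 2
    drv_cx = (dl + dr) / 2

    return scale, (drv_cy - src_cy, drv_cx - src_cx)
```

The returned scale and shift could then be used to resample the source image before generation. A learned module would presumably work from richer layout information, such as per-part parsing maps, than a single box.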


HVTR: Hybrid Volumetric-Textural Rendering for Human Avatars

Authors: Tao Hu, Tao Yu, Zerong Zheng, He Zhang, Yebin Liu, Matthias Zwicker

We propose a novel neural rendering pipeline, Hybrid Volumetric-Textural Rendering (HVTR), which synthesizes virtual human avatars from arbitrary poses efficiently and at high quality. First, we learn to encode articulated human motions on a dense UV manifold of the human body surface. To handle complicated motions (e.g., self-occlusions), we then leverage the encoded information on the UV manifold to construct a 3D volumetric representation based on a dynamic pose-conditioned neural radiance field. While this allows us to represent 3D geometry with changing topology, volumetric rendering is computationally heavy. Hence we employ only a rough volumetric representation using a pose-conditioned downsampled neural radiance field (PD-NeRF), which we can render efficiently at low resolutions. In addition, we learn 2D textural features that are fused with rendered volumetric features in image space. The key advantage of our approach is that we can then convert the fused features into a high-resolution, high-quality avatar by a fast GAN-based textural renderer. We demonstrate that hybrid rendering enables HVTR to handle complicated motions, render high-quality avatars under user-controlled poses/shapes and even loose clothing, and most importantly, be efficient at inference time. Our experimental results also demonstrate state-of-the-art quantitative results.
PDF Accepted to 3DV 2022. See more results at https://www.cs.umd.edu/~taohu/hvtr/ Demo: https://www.youtube.com/watch?v=LE0-YpbLlkY


Wavelet-Packets for Deepfake Image Analysis and Detection

Authors: Moritz Wolter, Felix Blanke, Raoul Heese, Jochen Garcke

As neural networks become able to generate realistic artificial images, they have the potential to improve movies, music, video games and make the internet an even more creative and inspiring place. Yet, the latest technology potentially enables new digital ways to lie. In response, the need for a diverse and reliable method toolbox arises to identify artificial images and other content. Previous work primarily relies on pixel-space CNNs or the Fourier transform. To the best of our knowledge, synthesized fake image analysis and detection methods based on a multi-scale wavelet representation, localized in both space and frequency, have been absent thus far. The wavelet transform conserves spatial information to a degree, which allows us to present a new analysis. Comparing the wavelet coefficients of real and fake images allows interpretation. Significant differences are identified. Additionally, this paper proposes to learn a model for the detection of synthetic images based on the wavelet-packet representation of natural and GAN-generated images. Our lightweight forensic classifiers exhibit competitive or improved performance at comparatively small network sizes, as we demonstrate on the FFHQ, CelebA and LSUN source identification problems. Furthermore, we study the binary FaceForensics++ fake-detection problem.
PDF Source code is available at https://github.com/gan-police/frequency-forensics and https://github.com/v0lta/PyTorch-Wavelet-Toolbox
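
To make the wavelet-packet representation concrete, here is a minimal sketch in plain Python. It is not the code from the repositories above: it assumes the Haar wavelet, a grayscale image stored as a list of rows with even side lengths, and hypothetical function names (`haar_step_2d`, `wavelet_packets`, `mean_abs`). Unlike the standard wavelet transform, which splits only the approximation band again, the packet transform splits every sub-band at every level.

```python
def haar_step_2d(img):
    """One level of the orthonormal 2D Haar transform.

    Returns four half-size sub-bands:
      'a' local average, 'h' top-minus-bottom differences,
      'v' left-minus-right differences, 'd' diagonal differences.
    """
    bands = {"a": [], "h": [], "v": [], "d": []}
    for r in range(0, len(img), 2):
        out = {key: [] for key in bands}
        for c in range(0, len(img[0]), 2):
            p, q = img[r][c], img[r][c + 1]
            s, t = img[r + 1][c], img[r + 1][c + 1]
            out["a"].append((p + q + s + t) / 2)
            out["h"].append((p + q - s - t) / 2)
            out["v"].append((p - q + s - t) / 2)
            out["d"].append((p - q - s + t) / 2)
        for key in bands:
            bands[key].append(out[key])
    return bands


def wavelet_packets(img, level):
    """Full packet tree: every sub-band is decomposed again at each level.

    Returns a dict mapping a path such as 'ad' (first 'a', then 'd')
    to its coefficient array; there are 4 ** level packets.
    """
    nodes = {"": img}
    for _ in range(level):
        next_nodes = {}
        for path, band in nodes.items():
            for key, sub in haar_step_2d(band).items():
                next_nodes[path + key] = sub
        nodes = next_nodes
    return nodes


def mean_abs(band):
    """Mean absolute coefficient of one packet, a simple per-packet statistic."""
    values = [abs(v) for row in band for v in row]
    return sum(values) / len(values)
```

Computing `mean_abs` for every packet of a real image and of a generated image, then comparing the two sets of values, is one simple way to look for frequency bands in which the images differ. The per-packet coefficients could also serve as input features for a classifier.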


Author: 木子已
Copyright notice: Unless otherwise stated, all posts on this blog are licensed under CC BY 4.0. Please credit 木子已 as the source when reposting.