2022-06-26 Update
GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds
Authors:Jianfeng Xiang, Jiaolong Yang, Yu Deng, Xin Tong
Recent works have shown that 3D-aware GANs trained on unstructured single-image collections can generate multiview images of novel instances. The key underpinnings to achieve this are a 3D radiance field generator and a volume rendering process. However, existing methods either cannot generate high-resolution images (e.g., up to 256×256) due to the high computation cost of neural volume rendering, or rely on 2D CNNs for image-space upsampling, which jeopardizes the 3D consistency across different views. This paper proposes a novel 3D-aware GAN that can generate high-resolution images (up to 1024×1024) while keeping strict 3D consistency as in volume rendering. Our motivation is to achieve super-resolution directly in 3D space to preserve 3D consistency. We avoid the otherwise prohibitively expensive computation cost by applying 2D convolutions on a set of 2D radiance manifolds defined in the recent generative radiance manifold (GRAM) approach, and we apply dedicated loss functions for effective GAN training at high resolution. Experiments on the FFHQ and AFHQv2 datasets show that our method can produce high-quality, 3D-consistent results that significantly outperform existing methods.
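To make the core idea concrete, here is a minimal sketch (not the authors' code) of super-resolving per-manifold radiance maps with cheap 2D convolutions and then alpha-compositing across manifolds; the `ManifoldUpsampler` network, the tensor shapes, and the compositing rule are all illustrative assumptions.

```python
# Sketch: upsample K 2D radiance manifolds with a 2D CNN, then composite.
import torch
import torch.nn as nn

class ManifoldUpsampler(nn.Module):
    """Upsamples per-manifold radiance maps (RGB + alpha) with a 2D CNN."""
    def __init__(self, channels=4, scale=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),  # cheap spatial upsampling in 2D
        )

    def forward(self, x):  # x: (B*K, 4, H, W)
        return self.net(x)

def composite(manifolds):
    """Front-to-back alpha compositing across K manifolds.
    manifolds: (B, K, 4, H, W) with channels = (r, g, b, alpha)."""
    rgb, alpha = manifolds[:, :, :3], manifolds[:, :, 3:].sigmoid()
    trans = torch.cumprod(torch.cat(
        [torch.ones_like(alpha[:, :1]), 1 - alpha[:, :-1]], dim=1), dim=1)
    return (trans * alpha * rgb).sum(dim=1)  # (B, 3, H*scale, W*scale)

B, K, H, W = 2, 24, 64, 64
low_res = torch.randn(B, K, 4, H, W)   # e.g. sampled from a GRAM generator
up = ManifoldUpsampler()
high_res = up(low_res.flatten(0, 1)).unflatten(0, (B, K))
image = composite(high_res)            # (2, 3, 256, 256)
```

The point of the design is that the expensive step (super-resolution) happens on 2D manifolds shared across all views, so any rendered view sees the same upsampled radiance and 3D consistency is preserved by construction.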
PDF Project page: https://jeffreyxiang.github.io/GRAM-HD/
Paper screenshots
Diffusion-GAN: Training GANs with Diffusion
Authors:Zhendong Wang, Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou
For stable training of generative adversarial networks (GANs), injecting instance noise into the input of the discriminator is considered a theoretically sound solution, which, however, has not yet delivered on its promise in practice. This paper introduces Diffusion-GAN, which employs a Gaussian mixture distribution, defined over all the diffusion steps of a forward diffusion chain, to inject instance noise. A random sample from the mixture, diffused from observed or generated data, is fed as the input to the discriminator. The generator is updated by backpropagating its gradient through the forward diffusion chain, whose length is adaptively adjusted to control the maximum noise-to-data ratio allowed at each training step. Theoretical analysis verifies the soundness of the proposed Diffusion-GAN, which provides model- and domain-agnostic differentiable augmentation. A rich set of experiments on diverse datasets shows that Diffusion-GAN can provide stable and data-efficient GAN training, bringing consistent performance improvements over strong GAN baselines for synthesizing photo-realistic images.
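The noise-injection mechanism can be sketched as follows. This is a simplification: the paper also adapts the chain length during training and defines a mixture over timesteps, while the linear beta schedule and uniform timestep sampling below are assumptions made for brevity.

```python
# Sketch: differentiable forward diffusion as instance noise for D.
import torch

T_max = 50
betas = torch.linspace(1e-4, 0.02, T_max)
alpha_bar = torch.cumprod(1 - betas, dim=0)   # cumulative \bar{alpha}_t

def diffuse(x, t):
    """Forward diffusion q(x_t | x_0). Gradients pass through x, so the
    generator can be trained by backprop through this step."""
    a = alpha_bar[t].view(-1, 1, 1, 1)
    return a.sqrt() * x + (1 - a).sqrt() * torch.randn_like(x)

def d_inputs(real, fake, T_cur):
    """Sample a timestep per example and diffuse real and fake batches.
    Uniform sampling over the first T_cur steps stands in for the
    paper's mixture distribution."""
    t = torch.randint(0, T_cur, (real.shape[0],))
    return diffuse(real, t), diffuse(fake, t), t

real = torch.randn(8, 3, 32, 32)
fake = torch.randn(8, 3, 32, 32, requires_grad=True)  # generator output
real_t, fake_t, t = d_inputs(real, fake, T_cur=25)
# A timestep-conditioned discriminator D(x_t, t) would score these; the
# generator loss backpropagates through `diffuse` into `fake`.
fake_t.sum().backward()
assert fake.grad is not None
```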
PDF Project homepage: https://github.com/Zhendong-Wang/Diffusion-GAN
Paper screenshots
EFFGAN: Ensembles of fine-tuned federated GANs
Authors:Ebba Ekblom, Edvin Listo Zec, Olof Mogren
Generative adversarial networks have proven to be a powerful tool for learning complex and high-dimensional data distributions, but issues such as mode collapse have been shown to make them difficult to train. This is an even harder problem when the data is decentralized over several clients in a federated learning setup, as problems such as client drift and non-iid data make it hard for federated averaging to converge. In this work, we study how to learn a data distribution when training data is heterogeneously decentralized over clients and cannot be shared. Our goal is to sample from this distribution centrally, while the data never leaves the clients. We show, using standard benchmark image datasets, that existing approaches fail in this setting, experiencing so-called client drift when the local number of epochs becomes too large. We thus propose a novel approach we call EFFGAN: Ensembles of fine-tuned federated GANs. Being an ensemble of local expert generators, EFFGAN is able to learn the data distribution over all clients and mitigate client drift. It is able to train with a large number of local epochs, making it more communication-efficient than previous works.
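A minimal sketch of the ensemble idea follows, assuming a toy generator and omitting the federated training and local fine-tuning loops; the class and function names are illustrative, not from the paper.

```python
# Sketch: central sampling from an ensemble of per-client generators.
import torch
import torch.nn as nn

class ToyGenerator(nn.Module):
    def __init__(self, z_dim=64, out_dim=784):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, out_dim), nn.Tanh())

    def forward(self, z):
        return self.net(z)

# One fine-tuned copy per client: each starts from the federated-averaged
# weights and is then fine-tuned locally (fine-tuning elided here).
global_gen = ToyGenerator()
experts = []
for _ in range(5):  # 5 clients
    g = ToyGenerator()
    g.load_state_dict(global_gen.state_dict())
    # ... local fine-tuning on this client's private data ...
    experts.append(g)

def sample(n, z_dim=64):
    """Sample centrally by picking a random expert per draw, covering the
    union of client distributions without any client data leaving home."""
    idx = torch.randint(len(experts), (n,))
    z = torch.randn(n, z_dim)
    return torch.stack([experts[int(i)](z[j]) for j, i in enumerate(idx)])

x = sample(16)  # (16, 784)
```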
PDF
Paper screenshots
Temporally Consistent Semantic Video Editing
Authors:Yiran Xu, Badour AlBahar, Jia-Bin Huang
Generative adversarial networks (GANs) have demonstrated impressive image generation quality and the capability to semantically edit real images, e.g., changing object classes, modifying attributes, or transferring styles. However, applying these GAN-based edits to a video independently for each frame inevitably results in temporal flickering artifacts. We present a simple yet effective method to facilitate temporally coherent video editing. Our core idea is to minimize the temporal photometric inconsistency by optimizing both the latent code and the pre-trained generator. We evaluate the quality of our editing across different domains and GAN inversion techniques and show favorable results against the baselines.
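For intuition, here is a hedged sketch of one way a temporal photometric consistency term could look. The paper's actual loss may differ (e.g., it could involve optical-flow warping); the difference-matching form below is purely an assumption.

```python
# Sketch: penalize frame-to-frame changes in the edited video that
# deviate from the frame-to-frame changes in the source video.
import torch

def temporal_loss(edited, source):
    """edited, source: (T, 3, H, W) video clips."""
    d_edit = edited[1:] - edited[:-1]
    d_src = source[1:] - source[:-1]
    return (d_edit - d_src).pow(2).mean()

T, H, W = 8, 64, 64
source = torch.randn(T, 3, H, W)
edited = source + 0.1 * torch.randn(T, 3, H, W)
print(temporal_loss(edited, source))

# As in the abstract, this term would be minimized jointly over the
# latent codes and the pre-trained generator's weights, e.g.:
#   opt = torch.optim.Adam(list(G.parameters()) + [w], lr=1e-3)
#   loss = edit_loss(G(w), target) + lam * temporal_loss(G(w), source)
# (G and w are hypothetical stand-ins for the generator and latents.)
```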
PDF Project page: https://video-edit-gan.github.io/
Paper screenshots
Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks?
Authors:Hwanil Choi, Wonjoon Chang, Jaesik Choi
Even though Generative Adversarial Networks (GANs) have shown a remarkable ability to generate high-quality images, GANs do not always guarantee photorealistic results. Occasionally, they generate images that contain defective or unnatural objects, referred to as ‘artifacts’. Why these artifacts emerge, and how they can be detected and removed, has not yet been sufficiently investigated. To analyze this, we first hypothesize that rarely activated neurons and frequently activated neurons have different purposes and responsibilities in the image generation process. In this study, by analyzing the statistics and roles of those neurons, we empirically show that rarely activated neurons are related to failures in producing diverse objects and to inducing artifacts. In addition, we suggest a correction method, called ‘Sequential Ablation’, to repair the defective parts of generated images without high computational cost or manual effort.
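One plausible realization of the detect-and-ablate idea is sketched below. The activation-frequency statistic and the hook-based zeroing are assumptions about how such a repair could be implemented, not the paper's exact procedure.

```python
# Sketch: find rarely activated generator units and zero them out.
import torch
import torch.nn as nn

gen_layer = nn.Conv2d(16, 32, 3, padding=1)   # stand-in generator layer

@torch.no_grad()
def activation_frequency(layer, x, thresh=0.0):
    """Fraction of positions where each output channel fires > thresh."""
    a = torch.relu(layer(x))                        # (B, C, H, W)
    return (a > thresh).float().mean(dim=(0, 2, 3))  # (C,)

x = torch.randn(64, 16, 8, 8)
freq = activation_frequency(gen_layer, x)
rare = freq.argsort()[:4]   # the 4 most rarely activated units

def ablate_hook(module, inputs, output):
    output[:, rare] = 0     # zero the rare channels during generation
    return output

handle = gen_layer.register_forward_hook(ablate_hook)
# Re-generate with units ablated; a sequential procedure would keep each
# ablation only if a realism score improves, one unit at a time.
repaired = gen_layer(x)
handle.remove()
```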
PDF Accepted at IJCAI-2022
Paper screenshots
Structure-preserving GANs
Authors:Jeremiah Birrell, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu
Generative adversarial networks (GANs), a class of distribution-learning methods based on a two-player game between a generator and a discriminator, can generally be formulated as a min-max problem based on the variational representation of a divergence between the unknown and the generated distributions. We introduce structure-preserving GANs as a data-efficient framework for learning distributions with additional structure, such as group symmetry, by developing new variational representations for divergences. Our theory shows that we can reduce the discriminator space to its projection onto the invariant discriminator space, using the conditional expectation with respect to the sigma-algebra associated with the underlying structure. In addition, we prove that the discriminator-space reduction must be accompanied by a careful design of structured generators, as flawed designs may easily lead to a catastrophic “mode collapse” of the learned distribution. We contextualize our framework by building symmetry-preserving GANs for distributions with intrinsic group symmetry, and demonstrate that both players, namely the equivariant generator and the invariant discriminator, play important but distinct roles in the learning process. Empirical experiments and ablation studies across a broad range of datasets, including real-world medical imaging, validate our theory and show that our proposed methods achieve significantly improved sample fidelity and diversity (almost an order of magnitude, as measured by Fréchet Inception Distance), especially in the small-data regime.
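For a finite symmetry group, the projection onto the invariant discriminator space has a simple concrete instance: average the discriminator over the group orbit. A minimal sketch with C4 image rotations follows (the toy discriminator is a stand-in, not the paper's architecture):

```python
# Sketch: make a discriminator exactly invariant to 90-degree rotations
# by averaging its scores over the C4 group orbit.
import torch
import torch.nn as nn

base_D = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 1))

def invariant_D(x):
    """Mean of D(g . x) over g in C4; invariant by construction."""
    scores = [base_D(torch.rot90(x, k, dims=(2, 3))) for k in range(4)]
    return torch.stack(scores).mean(dim=0)

x = torch.randn(8, 3, 32, 32)
assert torch.allclose(invariant_D(x),
                      invariant_D(torch.rot90(x, 1, dims=(2, 3))),
                      atol=1e-5)
```

Group averaging is exactly the conditional expectation the abstract refers to when the group is finite; pairing it with an equivariant generator is, per the paper, what prevents the degenerate collapse that an invariant discriminator alone could admit.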
PDF 39 pages, 16 figures