GAN


Updated 2022-03-04

FedMed-GAN: Federated Domain Translation on Unsupervised Cross-Modality Brain Image Synthesis

Authors: Guoyang Xie, Jinbao Wang, Yawen Huang, Yuexiang Li, Yefeng Zheng, Feng Zheng, Yaochu Jin

Utilizing multi-modal neuroimaging data has proved effective for investigating human cognitive activities and certain pathologies. However, it is not practical to obtain the full set of paired neuroimaging data centrally, since collection faces several constraints, e.g., high examination cost, long acquisition time, and image corruption. In addition, these data are dispersed across different medical institutions and thus cannot be aggregated for centralized training due to privacy concerns. There is a clear need for federated learning to facilitate the integration of dispersed data from different institutions. In this paper, we propose a new benchmark for federated domain translation on unsupervised brain image synthesis (termed FedMed-GAN) to bridge the gap between federated learning and medical GANs. FedMed-GAN mitigates mode collapse without sacrificing generator performance, and applies to varying proportions of unpaired and paired data thanks to its variation adaptation property. We handle the gradient penalties with a federated averaging algorithm and then leverage differentially private gradient descent to regularize the training dynamics. A comprehensive evaluation comparing FedMed-GAN with centralized methods shows that FedMed-GAN achieves new state-of-the-art performance. Our code is available at: https://github.com/M-3LAB/FedMed-GAN
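The training recipe the abstract hints at combines federated averaging of client GAN weights with DP-style gradient clipping and noising. The sketch below only illustrates that combination in PyTorch; the names (`dp_clip_and_noise`, `fedavg`, the commented local step) are hypothetical and are not the authors' released API.

```python
# Minimal sketch: FedAvg over client weights plus DP-SGD-style gradient
# privatization, assuming PyTorch. Illustrative only, not FedMed-GAN's code.
import copy
import torch

def dp_clip_and_noise(parameters, max_norm=1.0, noise_std=0.01):
    """Clip each parameter's gradient to max_norm and add Gaussian noise."""
    for p in parameters:
        if p.grad is None:
            continue
        scale = min(1.0, max_norm / (p.grad.norm() + 1e-12))
        p.grad.mul_(scale)
        p.grad.add_(torch.randn_like(p.grad) * noise_std)

def fedavg(client_models):
    """Average the clients' weights into one global state dict (FedAvg)."""
    global_state = copy.deepcopy(client_models[0].state_dict())
    for key in global_state:
        stacked = torch.stack([m.state_dict()[key].float() for m in client_models])
        global_state[key] = stacked.mean(dim=0)
    return global_state

# One communication round (the local GAN step is hypothetical pseudocode):
# for model in client_models:
#     loss = local_adversarial_loss(model)    # per-client generator/discriminator loss
#     loss.backward()
#     dp_clip_and_noise(model.parameters())   # privatize gradients before the update
#     optimizer_for(model).step()
# global_state = fedavg(client_models)
# for model in client_models:
#     model.load_state_dict(global_state)
```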
PDF

Paper screenshots

Learning Efficient GANs for Image Translation via Differentiable Masks and co-Attention Distillation

Authors: Shaojie Li, Mingbao Lin, Yan Wang, Fei Chao, Ling Shao, Rongrong Ji

Generative Adversarial Networks (GANs) have been widely used in image translation, but their high computation and storage costs impede deployment on mobile devices. Prevalent methods for CNN compression cannot be directly applied to GANs due to the peculiarities of GAN tasks and the unstable adversarial training. To address these issues, in this paper we introduce a novel GAN compression method, termed DMAD, by proposing a Differentiable Mask and a co-Attention Distillation. The former searches for a light-weight generator architecture in a training-adaptive manner. To overcome channel inconsistency when pruning the residual connections, an adaptive cross-block group sparsity is further incorporated. The latter simultaneously distills informative attention maps from both the generator and discriminator of a pre-trained model to the searched generator, effectively stabilizing the adversarial training of our light-weight model. Experiments show that DMAD can reduce the Multiply-Accumulate Operations (MACs) of CycleGAN by 13x and those of Pix2Pix by 4x while retaining performance comparable to the full model. Our code is available at https://github.com/SJLeo/DMAD.
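The "Differentiable Mask" is essentially a learnable, soft per-channel gate that training can drive toward zero so the channel can be pruned. The PyTorch sketch below shows one plausible parameterization (a sigmoid gate with an L1-style sparsity penalty); it is an assumption for illustration, not DMAD's actual implementation, and it omits the cross-block group sparsity and the co-attention distillation.

```python
# Illustrative differentiable channel mask; names and parameterization are assumptions.
import torch
import torch.nn as nn

class MaskedConv(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)
        # One learnable gate per output channel; sigmoid keeps it in (0, 1).
        self.mask_logits = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x):
        mask = torch.sigmoid(self.mask_logits).view(1, -1, 1, 1)
        # Gating the output channels makes the pruning decision differentiable.
        return self.conv(x) * mask

    def sparsity_loss(self):
        # L1 penalty pushes gates toward zero; near-zero channels can be pruned
        # after the architecture search converges.
        return torch.sigmoid(self.mask_logits).sum()
```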
PDF Accepted by IEEE Transactions on Multimedia (IEEE TMM)

Paper screenshots

PetsGAN: Rethinking Priors for Single Image Generation

Authors: Zicheng Zhang, Yinglu Liu, Congying Han, Hailin Shi, Tiande Guo, Bowen Zhou

Single image generation (SIG), described as generating diverse samples that share visual content with a given single image, was first introduced by SinGAN, which builds a pyramid of GANs to progressively learn the internal patch distribution of the single image. It also shows great potential in a wide range of image manipulation tasks. However, the SinGAN paradigm has limitations in terms of generation quality and training time. First, due to the lack of high-level information, SinGAN cannot handle object images as well as it does scene and texture images. Second, the separate progressive training scheme is time-consuming and prone to artifact accumulation. To tackle these problems, in this paper we dig into the SIG problem and improve SinGAN by fully utilizing internal and external priors. The main contributions of this paper are: 1) We introduce a regularized latent variable model for SIG. To the best of our knowledge, this is the first clear formulation and optimization goal for SIG, and all existing SIG methods can be regarded as special cases of this model. 2) We design a novel Prior-based end-to-end trained GAN (PetsGAN) to overcome the problems of SinGAN. Our method gets rid of the time-consuming progressive training scheme and can be trained end-to-end. 3) We conduct extensive qualitative and quantitative experiments demonstrating the superiority of our method in generated image quality, diversity, and training speed. Moreover, we apply our method to other image manipulation tasks (e.g., style transfer, harmonization), and the results further prove its effectiveness and efficiency.
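To make "regularized latent variable model" concrete, a generic objective of that family looks like the following; the paper's exact likelihood term, priors, and regularizer are not reproduced here, so treat this purely as a schematic reading of the abstract.

$$
\min_{\theta}\; \mathbb{E}_{z \sim p(z)}\big[\mathcal{L}_{\mathrm{rec}}\big(G_{\theta}(z),\, x\big)\big] \;+\; \lambda\, \mathcal{R}(z, \theta),
$$

where \(x\) is the single training image, \(G_{\theta}\) is the generator, \(\mathcal{L}_{\mathrm{rec}}\) measures how well generated samples match the internal patch statistics of \(x\), and \(\mathcal{R}\) is a regularizer that constrains the latent variable (and, in PetsGAN's case, incorporates external priors).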
PDF AAAI 2022 (oral)

Paper screenshots

Are conditional GANs explicitly conditional?

Authors: Houssem eddine Boulahbal, Adrian Voicila, Andrew Comport

This paper proposes two important contributions for conditional Generative Adversarial Networks (cGANs) to improve the wide variety of applications that exploit this architecture. The first main contribution is an analysis of cGANs showing that they are not explicitly conditional. In particular, it is shown that the discriminator, and consequently the cGAN, does not automatically learn the conditionality between inputs. The second contribution is a new method, called a contrario cGAN, that explicitly models conditionality for both parts of the adversarial architecture via a novel a contrario loss that involves training the discriminator to learn unconditional (adverse) examples. This leads to a novel data augmentation approach for GANs (a contrario learning) which restricts the search space of the generator to conditional outputs using adverse examples. Extensive experimentation evaluates the conditionality of the discriminator via a probability distribution analysis. Comparisons with the cGAN architecture on different applications show significant performance improvements on well-known datasets for semantic image synthesis, image segmentation, monocular depth prediction and “single label”-to-image generation, using metrics including Fréchet Inception Distance (FID), mean Intersection over Union (mIoU), Root Mean Square Error log (RMSE log) and Number of statistically-Different Bins (NDB).
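The core of the a contrario loss, as described above, is that the discriminator also sees real images paired with the wrong condition and must reject them. A minimal PyTorch sketch of such a discriminator loss is given below; the BCE formulation, the shuffled-condition pairing, and all names are assumptions made for illustration, not the authors' code.

```python
# Illustrative discriminator loss with an "a contrario" (adverse) term.
import torch
import torch.nn.functional as F

def a_contrario_d_loss(D, real_img, cond, fake_img):
    # Real image with its true condition -> label 1.
    logits_real = D(real_img, cond)
    loss_real = F.binary_cross_entropy_with_logits(
        logits_real, torch.ones_like(logits_real))

    # Generated image with the condition it was generated from -> label 0.
    logits_fake = D(fake_img.detach(), cond)
    loss_fake = F.binary_cross_entropy_with_logits(
        logits_fake, torch.zeros_like(logits_fake))

    # A contrario term: real image paired with a shuffled (wrong) condition -> label 0,
    # forcing the discriminator to actually check image-condition consistency.
    adverse_cond = cond[torch.randperm(cond.size(0))]
    logits_adv = D(real_img, adverse_cond)
    loss_adverse = F.binary_cross_entropy_with_logits(
        logits_adv, torch.zeros_like(logits_adv))

    return loss_real + loss_fake + loss_adverse
```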
PDF Accepted at BMVC2021

Paper screenshots

ISF-GAN: An Implicit Style Function for High-Resolution Image-to-Image Translation

Authors: Yahui Liu, Yajing Chen, Linchao Bao, Nicu Sebe, Bruno Lepri, Marco De Nadai

Recently, there has been increasing interest in image editing methods that employ pre-trained unconditional image generators (e.g., StyleGAN). However, applying these methods to translate images across multiple visual domains remains challenging. Existing works often fail to preserve the domain-invariant part of the image (e.g., the identity in human face translations), do not handle multiple domains, or do not allow multi-modal translations. This work proposes an implicit style function (ISF) to straightforwardly achieve multi-modal and multi-domain image-to-image translation from pre-trained unconditional generators. The ISF manipulates the semantics of an input latent code so that the image generated from it lies in the desired visual domain. Our experiments on human face and animal manipulations show significant improvements over the baselines. Our model enables cost-effective multi-modal unsupervised image-to-image translation at high resolution using pre-trained unconditional GANs. The code and data are available at: https://github.com/yhlleo/stylegan-mmuit
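Conceptually, the implicit style function is a small trainable network that edits the latent code of a frozen pre-trained generator so the output lands in the target domain. The PyTorch sketch below shows that structure; the layer sizes, the offset parameterization, and the conditioning on a one-hot domain label plus a style code are assumptions, not the released model.

```python
# Illustrative "implicit style function" editing latents of a frozen generator.
import torch
import torch.nn as nn

class ImplicitStyleFunction(nn.Module):
    def __init__(self, latent_dim=512, num_domains=3, style_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + num_domains + style_dim, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, latent_dim),
        )

    def forward(self, w, domain_onehot, style_code):
        # Predict an offset so the edited code stays close to the original latent,
        # which helps preserve domain-invariant content such as identity.
        delta = self.net(torch.cat([w, domain_onehot, style_code], dim=1))
        return w + delta

# Hypothetical usage with a frozen pre-trained generator G (e.g., a StyleGAN
# synthesis network); only the ISF is trained, G's weights stay fixed:
#   w_edit = isf(w, domain_onehot, style_code)
#   img = G(w_edit)
```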
PDF 14 pages, 15 figures

Paper screenshots

Author: Harvey
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit Harvey as the source when reposting!