2022-07-13 更新
Unsupervised Joint Image Transfer and Uncertainty Quantification using Patch Invariant Networks
Authors:Christoph Angermann, Markus Haltmeier, Ahsan Raza Siyal
Unsupervised image transfer enables intra- and inter-modality transfer for medical applications where a large amount of paired training data is not abundant. To ensure a structure-preserving mapping from the input to the target domain, existing methods for unpaired medical image transfer are commonly based on cycle-consistency, causing additional computation resources and instability due to the learning of an inverse mapping. This paper presents a novel method for uni-directional domain mapping where no paired data is needed throughout the entire training process. A reasonable transfer is ensured by employing the GAN architecture and a novel generator loss based on patch invariance. To be more precise, generator outputs are evaluated and compared on different scales, which brings increased attention to high-frequency details as well as implicit data augmentation. This novel term also gives the opportunity to predict aleatoric uncertainty by modeling an input-dependent scale map for the patch residuals. The proposed method is comprehensively evaluated on three renowned medical databases. Superior accuracy on these datasets compared to four different state-of-the-art methods for unpaired image transfer suggests the great potential of this approach for uncertainty-aware medical image translation. Implementation of the proposed framework is released here: https://github.com/anger-man/unsupervised-image-transfer-and-uq.
PDF
点此查看论文截图
Semantic-Aware Latent Space Exploration for Face Image Restoration
Authors:Yanhui Guo, Fangzhou Luo
For image restoration, the majority of existing deep learning-based algorithms have a tendency to overfit the training data, resulting in poor performance when confronted with unseen degradations. To achieve more robust restoration, generative adversarial network (GAN) prior based methods have been proposed, demonstrating a promising capacity to restore photo-realistic and high-quality results. However, these methods are susceptible to semantic ambiguity, particularly with semantically relevant images such as facial images. In this paper, we propose a semantic-aware latent space exploration method for image restoration (SAIR). By explicitly modeling referenced semantics information, SAIR is able to reliably restore severely degraded images not only to high-resolution highly-realistic looks but also to correct semantics. Quantitative and qualitative experiments collectively demonstrate the effectiveness of the proposed SAIR. Our code can be found in https://github.com/Liamkuo/SAIR.
PDF
点此查看论文截图
Composition-aware Graphic Layout GAN for Visual-textual Presentation Designs
Authors:Min Zhou, Chenchen Xu, Ye Ma, Tiezheng Ge, Yuning Jiang, Weiwei Xu
In this paper, we study the graphic layout generation problem of producing high-quality visual-textual presentation designs for given images. We note that image compositions, which contain not only global semantics but also spatial information, would largely affect layout results. Hence, we propose a deep generative model, dubbed as composition-aware graphic layout GAN (CGL-GAN), to synthesize layouts based on the global and spatial visual contents of input images. To obtain training images from images that already contain manually designed graphic layout data, previous work suggests masking design elements (e.g., texts and embellishments) as model inputs, which inevitably leaves hint of the ground truth. We study the misalignment between the training inputs (with hint masks) and test inputs (without masks), and design a novel domain alignment module (DAM) to narrow this gap. For training, we built a large-scale layout dataset which consists of 60,548 advertising posters with annotated layout information. To evaluate the generated layouts, we propose three novel metrics according to aesthetic intuitions. Through both quantitative and qualitative evaluations, we demonstrate that the proposed model can synthesize high-quality graphic layouts according to image compositions.
PDF Accepted by IJCAI 2022 (AI, THE ARTS AND CREATIVITY TRACK)