2022-08-11 Update
Towards Semantic Communications: Deep Learning-Based Image Semantic Coding
Authors:Danlan Huang, Feifei Gao, Xiaoming Tao, Qiyuan Du, Jianhua Lu
Semantic communication has received growing interest since it can remarkably reduce the amount of data to be transmitted without missing critical information. Most existing works explore semantic encoding and transmission for text, applying techniques from Natural Language Processing (NLP) to interpret the meaning of the text. In this paper, we conceive semantic communications for image data, which is much richer in semantics and more bandwidth-sensitive. We propose a reinforcement learning based adaptive semantic coding (RL-ASC) approach that encodes images beyond the pixel level. Firstly, we define the semantic concept of image data, comprising the category, spatial arrangement, and visual feature, as the representation unit, and propose a convolutional semantic encoder to extract semantic concepts. Secondly, we propose an image reconstruction criterion that evolves from traditional pixel similarity to semantic similarity and perceptual performance. Thirdly, we design a novel RL-based semantic bit allocation model, whose reward is the increase in rate-semantic-perceptual performance after encoding a certain semantic concept at an adaptive quantization level. Thus, task-related information is preserved and reconstructed properly while less important data is discarded. Finally, we propose a Generative Adversarial Nets (GANs) based semantic decoder that fuses local and global features via an attention module. Experimental results demonstrate that the proposed RL-ASC is robust to noise, reconstructs visually pleasant and semantically consistent images, and cuts the bit cost severalfold compared to standard codecs and other deep learning-based image codecs.
PDF
Click here to view paper screenshots
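To make the bit-allocation idea in the abstract above concrete: the agent's reward can be read as the marginal gain of a combined rate-semantic-perceptual objective after one more concept is encoded. Below is a minimal Python sketch; the metric names and trade-off weights are illustrative assumptions, not values from the paper.

```python
# Sketch of the RL-ASC-style reward: encoding one more semantic concept at a
# chosen quantization level is rewarded by the resulting increase in a combined
# rate-semantic-perceptual objective. Weights below are assumed, not the paper's.
LAMBDA_SEM, LAMBDA_PERC, LAMBDA_RATE = 1.0, 0.5, 1e-4

def objective(m):
    """m: {'sem': semantic similarity, 'perc': perceptual score, 'bits': cumulative bit cost}"""
    return LAMBDA_SEM * m["sem"] + LAMBDA_PERC * m["perc"] - LAMBDA_RATE * m["bits"]

def step_reward(before, after):
    """Reward for the agent's latest (concept, quantization level) action."""
    return objective(after) - objective(before)

# Toy usage: finely quantizing a salient concept raises semantic and perceptual
# scores at some bit cost; a positive reward means the extra bits paid off.
before = {"sem": 0.62, "perc": 0.55, "bits": 1200}
after = {"sem": 0.78, "perc": 0.70, "bits": 1650}
print(step_reward(before, after))  # positive -> keep this allocation
```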
Attribute Controllable Beautiful Caucasian Face Generation by Aesthetics Driven Reinforcement Learning
Authors:Xin Jin, Shu Zhao, Le Zhang, Xin Zhao, Qiang Deng, Chaoen Xiao
In recent years, image generation has made great strides in quality, producing high-fidelity images. Quite recently, architecture designs have also emerged that enable a GAN to learn, without supervision, the semantic attributes represented in its different layers. However, there is still a lack of research on generating face images that better match human aesthetics. Building on EigenGAN [He et al., ICCV 2021], we integrate reinforcement learning techniques into the EigenGAN generator. The agent learns how to alter the semantic attributes of the generated human faces toward more aesthetically preferable ones. To accomplish this, we trained an aesthetics scoring model that performs facial beauty prediction. We can also use this scoring model to analyze the correlation between face attributes and aesthetics scores. Empirically, off-the-shelf reinforcement learning techniques do not work well here, so we present a new variant incorporating ingredients that have emerged in the reinforcement learning community in recent years. Compared to the original generated images, the adjusted ones show clear distinctions across various attributes. Experimental results using MindSpore show the effectiveness of the proposed method: the altered facial images are generally more attractive, with significantly improved aesthetic levels.
PDF 13 pages, 5 figures. ACM Multimedia 2022 Technical Demos and Videos Program
Click here to view paper screenshots
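As a rough illustration of the abstract above, the sketch below uses a plain REINFORCE update in PyTorch: a policy proposes shifts to the generator's layer-wise semantic coordinates and is rewarded by the facial-beauty scorer. `policy`, `generator`, and `aesthetic_score` are hypothetical stand-ins, and the paper's actual RL variant (and its MindSpore implementation) is more elaborate than this.

```python
import torch

def reinforce_step(policy, generator, aesthetic_score, z, optimizer, sigma=0.1):
    """One REINFORCE update: sample an attribute shift, render, score, reinforce.

    policy          - net mapping latent z to a mean attribute shift (assumed)
    generator       - EigenGAN-style generator: attribute coordinates -> face image
    aesthetic_score - learned facial-beauty predictor: image -> scalar score
    """
    mean = policy(z)                              # proposed shift per attribute
    dist = torch.distributions.Normal(mean, sigma)
    action = dist.sample()                        # stochastic exploration
    reward = aesthetic_score(generator(z + action)).detach()
    # Policy gradient: raise the log-probability of shifts that scored well.
    loss = -(dist.log_prob(action).sum(dim=-1) * reward.squeeze(-1)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward.mean().item()
```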
Disentangled Representation Learning Using ($\beta$-)VAE and GAN
Authors:Mohammad Haghir Ebrahimabadi
Given a dataset of images containing objects with varying features such as shape, size, rotation, and x-y position, and a Variational Autoencoder (VAE), the task of interest in this paper was to create a disentangled encoding of these features in the VAE's hidden-space vector. The dSprites dataset provided the desired features for the experiments in this research. After training the VAE combined with a Generative Adversarial Network (GAN), each dimension of the hidden vector was perturbed in turn to explore the disentanglement along that dimension. Note that the GAN was used to improve the quality of the reconstructed output images.
PDF
Click here to view paper screenshots
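The "perturb each dimension" probe in the abstract above is a standard latent traversal. Here is a minimal PyTorch sketch under assumed `vae.encode`/`vae.decode` interfaces (not the author's code):

```python
import torch

def traverse_latents(vae, image, values=(-3.0, -1.5, 0.0, 1.5, 3.0)):
    """Perturb one latent dimension at a time and decode, yielding a grid of
    reconstructions: if the encoding is disentangled, each row should vary a
    single factor (shape, size, rotation, or position) in isolation."""
    with torch.no_grad():
        mu, _ = vae.encode(image.unsqueeze(0))    # posterior mean as base code
        rows = []
        for dim in range(mu.shape[1]):            # sweep each dimension separately
            z = mu.repeat(len(values), 1)
            z[:, dim] = torch.tensor(values)      # vary only this coordinate
            rows.append(vae.decode(z))            # one row of images per dimension
    return torch.stack(rows)                      # (n_dims, n_values, C, H, W)
```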
Unsupervised-learning-based method for chest MRI-CT transformation using structure constrained unsupervised generative attention networks
Authors:Hidetoshi Matsuo, Mizuho Nishio, Munenobu Nogami, Feibi Zeng, Takako Kurimoto, Sandeep Kaushik, Florian Wiesinger, Atsushi K Kono, Takamichi Murakami
The integrated positron emission tomography/magnetic resonance imaging (PET/MRI) scanner facilitates the simultaneous acquisition of metabolic information via PET and morphological information with high soft-tissue contrast via MRI. Although PET/MRI facilitates the capture of high-accuracy fusion images, its major drawback is the difficulty of performing attenuation correction, which is necessary for quantitative PET evaluation. Combined PET/MRI scanning requires the generation of attenuation-correction maps from MRI because there is no direct relationship between gamma-ray attenuation information and MRI intensities. While MRI-based bone-tissue segmentation can be readily performed for the head and pelvis regions, accurate bone segmentation via chest CT generation remains a challenging task, owing to the respiratory and cardiac motion occurring in the chest as well as its anatomically complicated structure and relatively thin bone cortex. This paper presents a means to minimise anatomical structural changes without human annotation by adding structural constraints, via a modality-independent neighbourhood descriptor (MIND), to a generative adversarial network (GAN) that can transform unpaired images. The results obtained in this study revealed that the proposed U-GAT-IT + MIND approach outperforms all other competing approaches. The findings hint towards the possibility of synthesising clinically acceptable CT images from chest MRI without human annotation, thereby minimising changes in the anatomical structure.
PDF 27 pages, 12 figures
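To give a feel for the MIND constraint described above, the sketch below compares self-similarity descriptors of the input MRI and the generated CT with an L1 penalty, so the generator is penalised when anatomy shifts even though intensities differ across modalities. This is a simplified reading of MIND: the shifts, patch size, and normalisation are illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

SHIFTS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # 4-neighbourhood search region

def mind_descriptor(img, patch=3):
    """Simplified MIND-style descriptor for a batch of (N, 1, H, W) images:
    patch distances to shifted copies, normalised by a local variance estimate,
    so the result reflects structure rather than absolute intensity."""
    pad = patch // 2
    dists = []
    for dy, dx in SHIFTS:
        shifted = torch.roll(img, shifts=(dy, dx), dims=(2, 3))
        # mean squared patch distance between the image and its shifted copy
        dists.append(F.avg_pool2d((img - shifted) ** 2, patch, stride=1, padding=pad))
    d = torch.cat(dists, dim=1)
    v = d.mean(dim=1, keepdim=True) + 1e-6    # local variance estimate
    return torch.exp(-d / v)                  # descriptor values in (0, 1]

def mind_loss(mri, fake_ct):
    """Structural term added to the unpaired translation (U-GAT-IT) objective."""
    return F.l1_loss(mind_descriptor(mri), mind_descriptor(fake_ct))
```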