2023-04-26 更新
Spectral normalized dual contrastive regularization for image-to-image translation
Authors:Chen Zhao, Wei-Ling Cai, Zheng Yuan
Existing image-to-image(I2I) translation methods achieve state-of-the-art performance by incorporating the patch-wise contrastive learning into Generative Adversarial Networks. However, patch-wise contrastive learning only focuses on the local content similarity but neglects the global structure constraint, which affects the quality of the generated images. In this paper, we propose a new unpaired I2I translation framework based on dual contrastive regularization and spectral normalization, namely SN-DCR. To maintain consistency of the global structure and texture, we design the dual contrastive regularization using different feature spaces respectively. In order to improve the global structure information of the generated images, we formulate a semantically contrastive loss to make the global semantic structure of the generated images similar to the real images from the target domain in the semantic feature space. We use Gram Matrices to extract the style of texture from images. Similarly, we design style contrastive loss to improve the global texture information of the generated images. Moreover, to enhance the stability of model, we employ the spectral normalized convolutional network in the design of our generator. We conduct the comprehensive experiments to evaluate the effectiveness of SN-DCR, and the results prove that our method achieves SOTA in multiple tasks.
PDF
点此查看论文截图
Unsupervised CD in satellite image time series by contrastive learning and feature tracking
Authors:Yuxing Chen, Lorenzo Bruzzone
While unsupervised change detection using contrastive learning has been significantly improved the performance of literature techniques, at present, it only focuses on the bi-temporal change detection scenario. Previous state-of-the-art models for image time-series change detection often use features obtained by learning for clustering or training a model from scratch using pseudo labels tailored to each scene. However, these approaches fail to exploit the spatial-temporal information of image time-series or generalize to unseen scenarios. In this work, we propose a two-stage approach to unsupervised change detection in satellite image time-series using contrastive learning with feature tracking. By deriving pseudo labels from pre-trained models and using feature tracking to propagate them among the image time-series, we improve the consistency of our pseudo labels and address the challenges of seasonal changes in long-term remote sensing image time-series. We adopt the self-training algorithm with ConvLSTM on the obtained pseudo labels, where we first use supervised contrastive loss and contrastive random walks to further improve the feature correspondence in space-time. Then a fully connected layer is fine-tuned on the pre-trained multi-temporal features for generating the final change maps. Through comprehensive experiments on two datasets, we demonstrate consistent improvements in accuracy on fitting and inference scenarios.
PDF
点此查看论文截图
CoReFace: Sample-Guided Contrastive Regularization for Deep Face Recognition
Authors:Youzhe Song, Feng Wang
The discriminability of feature representation is the key to open-set face recognition. Previous methods rely on the learnable weights of the classification layer that represent the identities. However, the evaluation process learns no identity representation and drops the classifier from training. This inconsistency could confuse the feature encoder in understanding the evaluation goal and hinder the effect of identity-based methods. To alleviate the above problem, we propose a novel approach namely Contrastive Regularization for Face recognition (CoReFace) to apply image-level regularization in feature representation learning. Specifically, we employ sample-guided contrastive learning to regularize the training with the image-image relationship directly, which is consistent with the evaluation process. To integrate contrastive learning into face recognition, we augment embeddings instead of images to avoid the image quality degradation. Then, we propose a novel contrastive loss for the representation distribution by incorporating an adaptive margin and a supervised contrastive mask to generate steady loss values and avoid the collision with the classification supervision signal. Finally, we discover and solve the semantically repetitive signal problem in contrastive learning by exploring new pair coupling protocols. Extensive experiments demonstrate the efficacy and efficiency of our CoReFace which is highly competitive with the state-of-the-art approaches.
PDF
点此查看论文截图
Multi-crop Contrastive Learning for Unsupervised Image-to-Image Translation
Authors:Chen Zhao, Wei-Ling Cai, Zheng Yuan, Cheng-Wei Hu
Recently, image-to-image translation methods based on contrastive learning achieved state-of-the-art results in many tasks. However, the negatives are sampled from the input feature spaces in the previous work, which makes the negatives lack diversity. Moreover, in the latent space of the embedings,the previous methods ignore domain consistency between the generated image and the real images of target domain. In this paper, we propose a novel contrastive learning framework for unpaired image-to-image translation, called MCCUT. We utilize the multi-crop views to generate the negatives via the center-crop and the random-crop, which can improve the diversity of negatives and meanwhile increase the quality of negatives. To constrain the embedings in the deep feature space,, we formulate a new domain consistency loss function, which encourages the generated images to be close to the real images in the embedding space of same domain. Furthermore, we present a dual coordinate channel attention network by embedding positional information into SENet, which called DCSE module. We employ the DCSE module in the design of generator, which makes the generator pays more attention to channels with greater weight. In many image-to-image translation tasks, our method achieves state-of-the-art results, and the advantages of our method have been proved through extensive comparison experiments and ablation research.
PDF
点此查看论文截图
Unsupervised Synthetic Image Refinement via Contrastive Learning and Consistent Semantic and Structure Constraints
Authors:Ganning Zhao, Tingwei Shen, Suya You, C. -C. Jay Kuo
Ensuring the realism of computer-generated synthetic images is crucial to deep neural network (DNN) training. Due to different semantic distributions between synthetic and real-world captured datasets, there exists semantic mismatch between synthetic and refined images, which in turn results in the semantic distortion. Recently, contrastive learning (CL) has been successfully used to pull correlated patches together and push uncorrelated ones apart. In this work, we exploit semantic and structural consistency between synthetic and refined images and adopt CL to reduce the semantic distortion. Besides, we incorporate hard negative mining to improve the performance furthermore. We compare the performance of our method with several other benchmarking methods using qualitative and quantitative measures and show that our method offers the state-of-the-art performance.
PDF
点此查看论文截图
Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning
Authors:Cheng Lu, Huayu Chen, Jianfei Chen, Hang Su, Chongxuan Li, Jun Zhu
Guided sampling is a vital approach for applying diffusion models in real-world tasks that embeds human-defined guidance during the sampling procedure. This paper considers a general setting where the guidance is defined by an (unnormalized) energy function. The main challenge for this setting is that the intermediate guidance during the diffusion sampling procedure, which is jointly defined by the sampling distribution and the energy function, is unknown and is hard to estimate. To address this challenge, we propose an exact formulation of the intermediate guidance as well as a novel training objective named contrastive energy prediction (CEP) to learn the exact guidance. Our method is guaranteed to converge to the exact guidance under unlimited model capacity and data samples, while previous methods can not. We demonstrate the effectiveness of our method by applying it to offline reinforcement learning (RL). Extensive experiments on D4RL benchmarks demonstrate that our method outperforms existing state-of-the-art algorithms. We also provide some examples of applying CEP for image synthesis to demonstrate the scalability of CEP on high-dimensional data.
PDF Accepted at ICML 2023