2023-04-28 更新
Unsupervised Synthetic Image Refinement via Contrastive Learning and Consistent Semantic-Structural Constraints
Authors:Ganning Zhao, Tingwei Shen, Suya You, C. -C. Jay Kuo
Ensuring the realism of computer-generated synthetic images is crucial to deep neural network (DNN) training. Due to different semantic distributions between synthetic and real-world captured datasets, there exists semantic mismatch between synthetic and refined images, which in turn results in the semantic distortion. Recently, contrastive learning (CL) has been successfully used to pull correlated patches together and push uncorrelated ones apart. In this work, we exploit semantic and structural consistency between synthetic and refined images and adopt CL to reduce the semantic distortion. Besides, we incorporate hard negative mining to improve the performance furthermore. We compare the performance of our method with several other benchmarking methods using qualitative and quantitative measures and show that our method offers the state-of-the-art performance.
PDF
点此查看论文截图
ContraNeRF: 3D-Aware Generative Model via Contrastive Learning with Unsupervised Implicit Pose Embedding
Authors:Mijeoong Kim, Hyunjoon Lee, Bohyung Han
Although 3D-aware GANs based on neural radiance fields have achieved competitive performance, their applicability is still limited to objects or scenes with the ground-truths or prediction models for clearly defined canonical camera poses. To extend the scope of applicable datasets, we propose a novel 3D-aware GAN optimization technique through contrastive learning with implicit pose embeddings. To this end, we first revise the discriminator design and remove dependency on ground-truth camera poses. Then, to capture complex and challenging 3D scene structures more effectively, we make the discriminator estimate a high-dimensional implicit pose embedding from a given image and perform contrastive learning on the pose embedding. The proposed approach can be employed for the dataset, where the canonical camera pose is ill-defined because it does not look up or estimate camera poses. Experimental results show that our algorithm outperforms existing methods by large margins on the datasets with multiple object categories and inconsistent canonical camera poses.
PDF 20 pages. For the project page, see https://cv.snu.ac.kr/research/ContraNeRF/
点此查看论文截图
Towards Precise Weakly Supervised Object Detection via Interactive Contrastive Learning of Context Information
Authors:Lai Qi
Weakly supervised object detection (WSOD) aims at learning precise object detectors with only image-level tags. In spite of intensive research on deep learning (DL) approaches over the past few years, there is still a significant performance gap between WSOD and fully supervised object detection. In fact, most existing WSOD methods only consider the visual appearance of each region proposal but ignore employing the useful context information in the image. To this end, this paper proposes an interactive end-to-end WSDO framework called JLWSOD with two innovations: i) two types of WSOD-specific context information (i.e., instance-wise correlation andsemantic-wise correlation) are proposed and introduced into WSOD framework; ii) an interactive graph contrastive learning (iGCL) mechanism is designed to jointly optimize the visual appearance and context information for better WSOD performance. Specifically, the iGCL mechanism takes full advantage of the complementary interpretations of the WSOD, namely instance-wise detection and semantic-wise prediction tasks, forming a more comprehensive solution. Extensive experiments on the widely used PASCAL VOC and MS COCO benchmarks verify the superiority of JLWSOD over alternative state-of-the-art approaches and baseline models (improvement of 3.6%~23.3% on mAP and 3.4%~19.7% on CorLoc, respectively).
PDF
点此查看论文截图
Density Invariant Contrast Maximization for Neuromorphic Earth Observations
Authors:Sami Arja, Alexandre Marcireau, Richard L. Balthazor, Matthew G. McHarg, Saeed Afshar, Gregory Cohen
Contrast maximization (CMax) techniques are widely used in event-based vision systems to estimate the motion parameters of the camera and generate high-contrast images. However, these techniques are noise-intolerance and suffer from the multiple extrema problem which arises when the scene contains more noisy events than structure, causing the contrast to be higher at multiple locations. This makes the task of estimating the camera motion extremely challenging, which is a problem for neuromorphic earth observation, because, without a proper estimation of the motion parameters, it is not possible to generate a map with high contrast, causing important details to be lost. Similar methods that use CMax addressed this problem by changing or augmenting the objective function to enable it to converge to the correct motion parameters. Our proposed solution overcomes the multiple extrema and noise-intolerance problems by correcting the warped event before calculating the contrast and offers the following advantages: it does not depend on the event data, it does not require a prior about the camera motion, and keeps the rest of the CMax pipeline unchanged. This is to ensure that the contrast is only high around the correct motion parameters. Our approach enables the creation of better motion-compensated maps through an analytical compensation technique using a novel dataset from the International Space Station (ISS). Code is available at \url{https://github.com/neuromorphicsystems/event_warping}
PDF