2023-04-17 更新
Self-Supervised Scene Dynamic Recovery from Rolling Shutter Images and Events
Authors:Yangguang Wang, Xiang Zhang, Mingyuan Lin, Lei Yu, Boxin Shi, Wen Yang, Gui-Song Xia
Scene Dynamic Recovery (SDR) by inverting distorted Rolling Shutter (RS) images to an undistorted high frame-rate Global Shutter (GS) video is a severely ill-posed problem, particularly when prior knowledge about camera/object motions is unavailable. Commonly used artificial assumptions on motion linearity and data-specific characteristics, regarding the temporal dynamics information embedded in the RS scanlines, are prone to producing sub-optimal solutions in real-world scenarios. To address this challenge, we propose an event-based RS2GS framework within a self-supervised learning paradigm that leverages the extremely high temporal resolution of event cameras to provide accurate inter/intra-frame information. % In this paper, we propose to leverage the event camera to provide inter/intra-frame information as the emitted events have an extremely high temporal resolution and learn an event-based RS2GS network within a self-supervised learning framework, where real-world events and RS images can be exploited to alleviate the performance degradation caused by the domain gap between the synthesized and real data. Specifically, an Event-based Inter/intra-frame Compensator (E-IC) is proposed to predict the per-pixel dynamic between arbitrary time intervals, including the temporal transition and spatial translation. Exploring connections in terms of RS-RS, RS-GS, and GS-RS, we explicitly formulate mutual constraints with the proposed E-IC, resulting in supervisions without ground-truth GS images. Extensive evaluations over synthetic and real datasets demonstrate that the proposed method achieves state-of-the-art and shows remarkable performance for event-based RS2GS inversion in real-world scenarios. The dataset and code are available at https://w3un.github.io/selfunroll/.
PDF
点此查看论文截图
Delta Denoising Score
Authors:Amir Hertz, Kfir Aberman, Daniel Cohen-Or
We introduce Delta Denoising Score (DDS), a novel scoring function for text-based image editing that guides minimal modifications of an input image towards the content described in a target prompt. DDS leverages the rich generative prior of text-to-image diffusion models and can be used as a loss term in an optimization problem to steer an image towards a desired direction dictated by a text. DDS utilizes the Score Distillation Sampling (SDS) mechanism for the purpose of image editing. We show that using only SDS often produces non-detailed and blurry outputs due to noisy gradients. To address this issue, DDS uses a prompt that matches the input image to identify and remove undesired erroneous directions of SDS. Our key premise is that SDS should be zero when calculated on pairs of matched prompts and images, meaning that if the score is non-zero, its gradients can be attributed to the erroneous component of SDS. Our analysis demonstrates the competence of DDS for text based image-to-image translation. We further show that DDS can be used to train an effective zero-shot image translation model. Experimental results indicate that DDS outperforms existing methods in terms of stability and quality, highlighting its potential for real-world applications in text-based image editing.
PDF Project page: https://delta-denoising-score.github.io/