Published: 2023-02-01

Updated 2023-02-01

Edge-guided Multi-domain RGB-to-TIR image Translation for Training Vision Tasks with Challenging Labels

Authors: Dong-Guw Lee, Myung-Hwan Jeon, Younggun Cho, Ayoung Kim

The shortage of annotated thermal infrared (TIR) image datasets not only prevents TIR image-based deep learning networks from matching the performance of their RGB counterparts, but also limits supervised learning of TIR image-based tasks with challenging labels. As a remedy, we propose a modified multi-domain RGB-to-TIR image translation model focused on edge preservation, which makes it possible to use annotated RGB images with challenging labels. Our proposed method not only preserves key details of the original image but also leverages the optimal TIR style code to portray accurate TIR characteristics in the translated image, when applied to both synthetic and real-world RGB images. Using our translation model, we enable supervised learning of deep TIR image-based optical flow estimation and object detection, reducing the end-point error of deep TIR optical flow estimation by 56.5% on average and achieving a best object detection mAP of 23.9%. Our code and supplementary materials are available at https://github.com/rpmsnu/sRGB-TIR.
Accepted as a contributed paper at the 2023 IEEE International Conference on Robotics and Automation (ICRA).
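
The optical flow result above is reported as a reduction in end-point error (EPE), which is the mean Euclidean distance between predicted and ground-truth flow vectors. The following is a minimal sketch of how that metric and a relative reduction could be computed. The function names and the toy flow values are illustrative assumptions; they are not taken from the paper or its repository.

```python
import math


def average_endpoint_error(pred_flow, gt_flow):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    Both arguments are lists of (u, v) tuples, one entry per pixel.
    """
    if len(pred_flow) != len(gt_flow) or not pred_flow:
        raise ValueError("flows must be non-empty and of equal length")
    total = 0.0
    for (pu, pv), (gu, gv) in zip(pred_flow, gt_flow):
        total += math.hypot(pu - gu, pv - gv)
    return total / len(pred_flow)


def relative_reduction(baseline, improved):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - improved) / baseline


# Toy values made up for illustration only (not results from the paper).
gt = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
baseline_pred = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
improved_pred = [(1.0, 0.0), (0.0, 1.0), (3.0, 0.0)]

epe_baseline = average_endpoint_error(baseline_pred, gt)
epe_improved = average_endpoint_error(improved_pred, gt)
reduction = relative_reduction(epe_baseline, epe_improved)
```

With these toy inputs the per-pixel errors are 1, 2, 5 for the baseline and 0, 1, 4 for the improved prediction, so the relative reduction is 3/8.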

Extremal Domain Translation with Neural Optimal Transport

Authors: Milena Gazdieva, Alexander Korotin, Daniil Selikhanovych, Evgeny Burnaev

We propose extremal transport (ET), a mathematical formalization of the theoretically best possible unpaired translation between a pair of domains with respect to a given similarity function. Inspired by recent advances in neural optimal transport (OT), we propose a scalable algorithm that approximates ET maps as a limit of partial OT maps. We test our algorithm on toy examples and on the unpaired image-to-image translation task.
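
As a rough intuition for "best possible translation with respect to a similarity function", one can look at a discrete toy case in which each source sample is sent to the target sample with the lowest cost. This is only one reading of the abstract, shown under the assumption that the cost is squared Euclidean distance. It is not the neural algorithm from the paper and it does not model the partial OT limit. All names below are hypothetical.

```python
def squared_distance(x, y):
    """Squared Euclidean distance, used here as an assumed cost function."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def nearest_target_map(sources, targets, cost=squared_distance):
    """Map each source point to the target sample with the lowest cost."""
    return [min(targets, key=lambda t: cost(s, t)) for s in sources]


# Toy 2-D samples made up for illustration.
sources = [(0.0, 0.0), (2.0, 2.0), (5.0, 1.0)]
targets = [(0.0, 1.0), (3.0, 3.0), (6.0, 0.0)]

mapped = nearest_target_map(sources, targets)
```

Unlike a classical OT map, this toy mapping places no constraint on how much of the target domain gets covered; several source points may share one target point.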

Few-shot Face Image Translation via GAN Prior Distillation

Authors: Ruoyu Zhao, Mingrui Zhu, Xiaoyu Wang, Nannan Wang

Face image translation has made notable progress in recent years. However, when training on limited data, the performance of existing approaches declines significantly. Although some studies have attempted to tackle this problem, they either fail to reach the few-shot setting (fewer than 10 samples) or achieve only suboptimal results. In this paper, we propose GAN Prior Distillation (GPD) to enable effective few-shot face image translation. GPD contains two models: a teacher network with a GAN prior and a student network that performs end-to-end translation. Specifically, we adapt the teacher network, trained on large-scale data in the source domain, to the target domain using only a few samples, so that it learns the target domain's knowledge. We then achieve few-shot augmentation by generating source-domain and target-domain images simultaneously from the same latent codes. We propose an anchor-based knowledge distillation module that fully exploits the difference between the training data and the augmented data to distill the knowledge of the teacher network into the student network. By absorbing this additional knowledge, the trained student network achieves excellent generalization performance. Qualitative and quantitative experiments demonstrate that our method achieves results superior to state-of-the-art approaches in the few-shot setting.
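
The augmentation step described above, generating source-domain and target-domain samples from the same latent codes and then training a student on the resulting pairs, can be illustrated with a deliberately tiny sketch. Everything here is an assumption for illustration: the two "generators" are scalar toy functions standing in for the original and adapted teacher, and the student is fitted with ordinary least squares rather than the anchor-based distillation module, which the abstract does not specify.

```python
import random


def source_generator(z):
    """Hypothetical teacher generator for the source domain (toy map)."""
    return 2.0 * z


def adapted_generator(z):
    """Hypothetical teacher after adaptation to the target domain (toy map)."""
    return 3.0 * z + 1.0


def make_pairs(n, seed=0):
    """Create paired samples by feeding one shared latent code to both generators."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # the same latent code is used for both domains
        pairs.append((source_generator(z), adapted_generator(z)))
    return pairs


def fit_student(pairs):
    """Least-squares fit of y = a * x + b, a stand-in for training the student."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


pairs = make_pairs(100)
a, b = fit_student(pairs)
```

Because the toy target is exactly 1.5 times the toy source plus 1, the fitted student recovers a slope of 1.5 and an intercept of 1.0 up to floating-point error. The point is only that shared latent codes give the student aligned pairs it can learn a direct source-to-target mapping from.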

木子已

https://ipaper.today/2023/02/01/2023-02-01-i2i-translation/

Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit 木子已 as the source when reposting.

