I2I Translation


Updated 2023-03-23

MEDIMP: Medical Images and Prompts for renal transplant representation learning

Authors: Leo Milecki, Vicky Kalogeiton, Sylvain Bodard, Dany Anglicheau, Jean-Michel Correas, Marc-Olivier Timsit, Maria Vakalopoulou

Renal transplantation emerges as the most effective solution for end-stage renal disease. Arising from complex causes, a substantial risk of chronic transplant dysfunction persists and may lead to graft loss. Medical imaging plays a substantial role in renal transplant monitoring in clinical practice. However, graft supervision is multi-disciplinary, notably joining nephrology, urology, and radiology, and identifying robust biomarkers for prognosis from such high-dimensional and complex data is challenging. In this work, taking inspiration from the recent success of Large Language Models (LLMs), we propose MEDIMP (Medical Images and Prompts), a model that learns meaningful multi-modal representations of renal transplant Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE MRI) by incorporating structural clinicobiological data after translating them into text prompts. MEDIMP is based on contrastive learning from joint text-image paired embeddings to perform this challenging task. Moreover, we propose a framework that generates medical prompts using automatic textual data augmentations from LLMs. Our goal is to learn meaningful manifolds of renal transplant DCE MRI that are relevant to the prognosis of the transplant or patient status (2, 3, and 4 years after the transplant), fully exploiting the available multi-modal data in the most efficient way. Extensive experiments and comparisons with other renal transplant representation learning methods with limited data prove the effectiveness of MEDIMP in a relevant clinical setting, giving new directions toward medical prompts. Our code is available at https://github.com/leomlck/MEDIMP.
PDF

Click here to view paper screenshots
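To make the contrastive objective described in the MEDIMP abstract above more concrete, here is a minimal PyTorch sketch of a CLIP-style symmetric InfoNCE loss between image and prompt embeddings. The encoder modules, embedding size, and temperature are illustrative assumptions, not the authors' actual configuration (see the linked repository for that).

```python
# Minimal sketch of a CLIP-style contrastive objective over DCE MRI volumes
# and text prompts. Encoders, projection size, and temperature are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImageTextContrastive(nn.Module):
    def __init__(self, image_encoder: nn.Module, text_encoder: nn.Module,
                 temperature: float = 0.07):
        super().__init__()
        self.image_encoder = image_encoder   # e.g. a 3D CNN over DCE MRI volumes
        self.text_encoder = text_encoder     # e.g. a language model over prompts
        self.temperature = temperature

    def forward(self, images: torch.Tensor, text_tokens: torch.Tensor) -> torch.Tensor:
        # Encode both modalities and L2-normalize the embeddings.
        img_emb = F.normalize(self.image_encoder(images), dim=-1)
        txt_emb = F.normalize(self.text_encoder(text_tokens), dim=-1)

        # Pairwise cosine similarities scaled by the temperature.
        logits = img_emb @ txt_emb.t() / self.temperature
        targets = torch.arange(images.size(0), device=images.device)

        # Symmetric InfoNCE: match each image to its prompt and vice versa.
        loss_i2t = F.cross_entropy(logits, targets)
        loss_t2i = F.cross_entropy(logits.t(), targets)
        return 0.5 * (loss_i2t + loss_t2i)
```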

Feature-Conditioned Cascaded Video Diffusion Models for Precise Echocardiogram Synthesis

Authors: Hadrien Reynaud, Mengyun Qiao, Mischa Dombrowski, Thomas Day, Reza Razavi, Alberto Gomez, Paul Leeson, Bernhard Kainz

Image synthesis is expected to provide value for the translation of machine learning methods into clinical practice. Fundamental problems like model robustness, domain transfer, causal modelling, and operator training become approachable through synthetic data. In particular, heavily operator-dependent modalities like ultrasound imaging require robust frameworks for image and video generation. So far, video generation has only been possible by providing input data that is as rich as the output data, e.g., image sequence plus conditioning in, video out. However, clinical documentation is usually scarce and only single images are reported and stored, so retrospective patient-specific analysis or the generation of rich training data becomes impossible with current approaches. In this paper, we extend elucidated diffusion models for video modelling to generate plausible video sequences from single images and arbitrary conditioning with clinical parameters. We explore this idea within the context of echocardiograms by looking into the variation of the Left Ventricle Ejection Fraction, the most essential clinical metric gained from these examinations. We use the publicly available EchoNet-Dynamic dataset for all our experiments. Our image-to-sequence approach achieves an R2 score of 93%, which is 38 points higher than recently proposed sequence-to-sequence generation methods. A public demo is available here: bit.ly/3HTskPF. Code and models will be available at: https://github.com/HReynaud/EchoDiffusion.
PDF Under Review

Click here to view paper screenshots
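As a rough illustration of the conditioning idea in the abstract above (a video denoiser driven by a single image plus a clinical parameter), here is a hedged PyTorch sketch. The channel-wise concatenation of the anatomy frame, the MLP embedding of the ejection fraction, and the backbone interface are assumptions made for illustration, not the authors' published cascaded architecture.

```python
# Sketch of a feature-conditioned video denoiser: a single anatomy frame is
# broadcast over time and concatenated channel-wise, while the scalar LVEF is
# embedded and passed alongside the noise level. All shapes and the backbone
# interface are illustrative assumptions.
import torch
import torch.nn as nn

class ConditionedVideoDenoiser(nn.Module):
    def __init__(self, backbone: nn.Module, cond_dim: int = 256):
        super().__init__()
        self.backbone = backbone              # maps (B, 2C, T, H, W) video + sigma + cond to (B, C, T, H, W)
        self.ef_embed = nn.Sequential(        # embed the scalar ejection fraction target
            nn.Linear(1, cond_dim), nn.SiLU(), nn.Linear(cond_dim, cond_dim)
        )

    def forward(self, noisy_video, sigma, anatomy_frame, ef):
        # noisy_video: (B, C, T, H, W), anatomy_frame: (B, C, H, W), ef: (B, 1)
        T = noisy_video.shape[2]
        frame_cond = anatomy_frame.unsqueeze(2).expand(-1, -1, T, -1, -1)
        x = torch.cat([noisy_video, frame_cond], dim=1)   # image conditioning via channels
        cond = self.ef_embed(ef)                           # clinical-parameter conditioning
        return self.backbone(x, sigma, cond)               # backbone injects sigma and cond internally
```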

AptSim2Real: Approximately-Paired Sim-to-Real Image Translation

Authors: Charles Y Zhang, Ashish Shrivastava

Advancements in graphics technology have increased the use of simulated data for training machine learning models. However, the simulated data often differs from real-world data, creating a distribution gap that can decrease the efficacy of models trained on simulation data in real-world applications. To mitigate this gap, sim-to-real domain transfer modifies simulated images to better match real-world data, enabling the effective use of simulation data in model training. Sim-to-real transfer utilizes image translation methods, which are divided into two main categories: paired and unpaired image-to-image translation. Paired image translation requires a perfect pixel match, making it difficult to apply in practice due to the lack of pixel-wise correspondence between simulation and real-world data. Unpaired image translation, while more suitable for sim-to-real transfer, is still challenging to learn for complex natural scenes. To address these challenges, we propose a third category: approximately-paired sim-to-real translation, where the source and target images do not need to be exactly paired. Our approximately-paired method, AptSim2Real, exploits the fact that simulators can generate scenes loosely resembling real-world scenes in terms of lighting, environment, and composition. Our novel training strategy results in significant qualitative and quantitative improvements, with up to a 24% improvement in FID score compared to state-of-the-art unpaired image-translation methods.
PDF

Click here to view paper screenshots
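To illustrate what an approximately-paired objective could look like in practice, the following PyTorch sketch replaces the strict pixel-wise loss of paired translation with a feature-level loss against the loosely matched real image, alongside a standard adversarial term. The loss form, weights, and feature extractor are assumptions for illustration, not the published AptSim2Real training strategy.

```python
# Hedged sketch of an approximately-paired translation loss: deep features of
# the translated simulation image are matched to its loosely paired real image,
# tolerating misalignment, plus a non-saturating adversarial term. Weights and
# the feature extractor are illustrative assumptions.
import torch
import torch.nn.functional as F

def approx_paired_loss(generator, discriminator, feature_extractor,
                       sim_batch, approx_real_batch, lambda_feat: float = 10.0):
    fake_real = generator(sim_batch)

    # Adversarial term: the translated image should look real to the critic.
    adv = F.softplus(-discriminator(fake_real)).mean()

    # Approximate-pairing term: match deep features rather than exact pixels,
    # since simulation and real counterpart are only loosely aligned.
    feat = F.l1_loss(feature_extractor(fake_real),
                     feature_extractor(approx_real_batch))

    return adv + lambda_feat * feat
```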

Author: 木子已
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under the CC BY 4.0 license. Please credit the source (木子已) when reposting!