Video Generation

Published: 2024-08-31

Updated 2024-08-31

SurGen: Text-Guided Diffusion Model for Surgical Video Generation

Authors: Joseph Cho, Samuel Schmidgall, Cyril Zakka, Mrudang Mathur, Rohan Shad, William Hiesinger

Diffusion-based video generation models have made significant strides, producing outputs with improved visual fidelity, temporal coherence, and user control. These advancements hold great promise for improving surgical education by enabling more realistic, diverse, and interactive simulation environments. In this study, we introduce SurGen, a text-guided diffusion model tailored for surgical video synthesis, producing the highest resolution and longest duration videos among existing surgical video generation models. We validate the visual and temporal quality of the outputs using standard image and video generation metrics. Additionally, we assess their alignment to the corresponding text prompts through a deep learning classifier trained on surgical data. Our results demonstrate the potential of diffusion models to serve as valuable educational tools for surgical trainees.

木子已

https://ipaper.today/2024/08/31/2024-08-31-shi-pin-sheng-cheng/

Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit 木子已 as the source when reposting.
