D-OPSD: On-Policy Self-Distillation for Few-Step Diffusion Tuning

Uploaded: May 7, 2026
Duration: 490 s

Channel: Research Paper Review (1.03K subscribers)

We introduce D-OPSD, a training paradigm that lets step-distilled diffusion models keep learning new concepts and styles while preserving their efficient few-step inference. Existing fine-tuning methods degrade a distilled model's few-step inference ability; the researchers address this by exploiting the in-context understanding of recent diffusion models built on LLM/VLM encoders. The model plays two roles at once, student and teacher, and performs self-distillation on its own sampling trajectory, guided by multimodal context that combines text and images. As a result, it can acquire new knowledge through on-policy learning, without an external reward function, while preserving its existing high-quality image generation. Experiments show strong inference efficiency and generalization in both LoRA adaptation and full fine-tuning settings. https://arxiv.org/pdf/2605.05204
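
The student/teacher idea described above can be illustrated with a toy sketch. This is not the paper's algorithm: the one-parameter `velocity` model, the `IN_CONTEXT_PULL` constant, the squared-error loss, and the assumption that only the teacher pass sees the reference image are all hypothetical choices made for illustration. The sketch only mirrors what the description states: one set of weights acts as both student and teacher, training states come from the student's own few-step rollout, and no reward function is involved.

```python
import math
import random

# Hypothetical constant: how strongly a reference image placed in the
# context pulls the prediction toward it (stand-in for in-context ability).
IN_CONTEXT_PULL = 0.8


def velocity(theta, x, ref_image=None):
    """Toy one-parameter 'denoiser': velocity that moves x toward a concept value."""
    target = theta
    if ref_image is not None:
        # Teacher-only path (assumption): the in-context reference shifts the target.
        target = theta + IN_CONTEXT_PULL * (ref_image - theta)
    return math.tanh(target - x)


def student_rollout(theta, x0, num_steps):
    """Few-step Euler sampling with the text-only student; returns the visited states."""
    states, x, dt = [], x0, 1.0 / num_steps
    for _ in range(num_steps):
        states.append(x)
        x = x + dt * velocity(theta, x)
    return states


def train(ref_image=2.0, num_steps=4, iters=500, lr=1.0, seed=0):
    rng = random.Random(seed)
    theta, losses = 0.0, []
    for _ in range(iters):
        # On-policy: the training states are the student's own trajectory.
        states = student_rollout(theta, rng.gauss(0.0, 1.0), num_steps)
        grad, loss = 0.0, 0.0
        for x in states:
            v_student = velocity(theta, x)
            # Same weights, richer context; treated as a constant (stop-gradient).
            v_teacher = velocity(theta, x, ref_image)
            err = v_student - v_teacher
            loss += 0.5 * err ** 2
            # d/d(theta) of tanh(theta - x) is 1 - tanh(theta - x)^2.
            grad += err * (1.0 - v_student ** 2)
        theta -= lr * grad / num_steps
        losses.append(loss / num_steps)
    return theta, losses


if __name__ == "__main__":
    theta, losses = train()
    print(f"learned concept value: {theta:.3f}, final loss: {losses[-1]:.2e}")
```

In this toy, the student gradually internalizes what the teacher reads from context, so `theta` drifts toward `ref_image` without any external scoring of the outputs. A real diffusion model would replace the scalar with network weights (or LoRA adapters) and the scalar state with image latents.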