
D-OPSD: On-Policy Self-Distillation for Few-Step Diffusion Tuning

May 7, 2026
8:10

We introduce D-OPSD, a training paradigm that lets distilled diffusion models keep learning new concepts and styles while preserving their efficient few-step inference. Existing fine-tuning methods tend to degrade a distilled model's few-step inference ability; D-OPSD addresses this by exploiting the in-context understanding of the LLM/VLM encoders used in recent models. The model plays both roles, student and teacher, and performs self-distillation on its own sampling trajectories, using multimodal context that combines text and images. Because the learning is on-policy, the model can acquire new knowledge while preserving its existing high-quality image generation, without an external reward function. Experiments report strong inference efficiency and generalization under both LoRA adaptation and full fine-tuning. https://arxiv.org/pdf/2605.05204
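The student/teacher loop described above can be illustrated with a deliberately tiny sketch. This is not the paper's implementation: the scalar "denoiser", the names `predict`, `rollout`, `self_distill_step`, and `train`, the choice to give the extra image context only to the teacher, and the use of a frozen snapshot of the initial weights as the teacher are all assumptions made for illustration. It only shows the general shape of on-policy self-distillation: states come from the student's own few-step rollout, and the student is regressed toward a gradient-free teacher prediction at those states.

```python
import random

def predict(weights, x, cond):
    """Toy stand-in for a denoiser: linear velocity from state x and conditioning."""
    w_x, w_c = weights
    return w_x * x + w_c * cond

def rollout(weights, x0, cond, num_steps):
    """Few-step Euler sampling; returns the states visited along the trajectory."""
    dt = 1.0 / num_steps
    states, x = [], x0
    for _ in range(num_steps):
        states.append(x)
        x = x + dt * predict(weights, x, cond)
    return states

def self_distill_step(student, teacher, text, image, x0, num_steps=4, lr=0.05):
    """One update: match the teacher on states sampled by the student itself."""
    # On-policy: the trajectory is produced by the current student weights.
    states = rollout(student, x0, text, num_steps)
    grad_x = grad_c = loss = 0.0
    for x in states:
        # Assumption: the teacher sees text plus image context; it is treated
        # as a constant target (no gradient flows through it).
        target = predict(teacher, x, text + image)
        diff = predict(student, x, text) - target
        loss += diff * diff
        grad_x += 2.0 * diff * x      # d(loss)/d(w_x)
        grad_c += 2.0 * diff * text   # d(loss)/d(w_c)
    n = len(states)
    updated = (student[0] - lr * grad_x / n, student[1] - lr * grad_c / n)
    return updated, loss / n

def train(num_iters=200, seed=0):
    rng = random.Random(seed)
    teacher = (-0.5, 1.0)   # frozen snapshot of the starting weights (assumption)
    student = teacher       # student starts as the same model
    text, image = 1.0, 0.5  # hypothetical scalar "embeddings"
    losses = []
    for _ in range(num_iters):
        x0 = rng.gauss(0.0, 1.0)  # fresh noise for each rollout
        student, loss = self_distill_step(student, teacher, text, image, x0)
        losses.append(loss)
    return student, losses

if __name__ == "__main__":
    student, losses = train()
    print("first loss:", losses[0], "last loss:", losses[-1])
```

No reward model appears anywhere in the loop: the only training signal is the gap between the text-only student and the context-rich teacher on states the student actually visits. A real setup would replace the scalar model with the diffusion network and its LLM/VLM encoder, and could restrict the update to LoRA parameters.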
