1. Introduction
The rapid advancement of generative AI has established diffusion models as powerful tools for image synthesis and restoration. However, their performance in real-world scenarios is often limited by complex degradations, such as simultaneous blur and noise, which standard training paradigms do not address well. This work introduces the Warm Diffusion Model, a framework for robustly modeling and inverting such mixed degradations, and evaluates it against Denoising Diffusion Probabilistic Models (DDPMs) and Score-Based Generative Models (SGMs).
2. Related Work
Previous research on diffusion models has focused primarily on Gaussian noise schedules for denoising, while deblurring is typically handled with separate restoration techniques or treated as an independent degradation. Although some works have explored incorporating other degradations into diffusion processes, a unified and effective approach for blur-noise mixtures remains an open challenge. Our method differs by integrating the modeling of both degradation types directly into the diffusion process, building on the foundational concepts of score matching and Langevin dynamics.
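As background, annealed Langevin dynamics draws samples by repeatedly nudging an image along a learned score (the gradient of the log-density of noise-perturbed data) while injecting Gaussian noise. The sketch below is a generic, minimal illustration assuming a trained score network `score_fn(x, sigma)`; it is not this paper's sampler.

```python
import torch

def annealed_langevin_sample(score_fn, x_init, noise_levels, eps=2e-5, n_steps=100):
    """Generic annealed Langevin dynamics (illustrative sketch, not the paper's sampler).

    score_fn(x, sigma) is assumed to approximate grad_x log p_sigma(x),
    the score of the data distribution perturbed with noise level sigma.
    """
    x = x_init.clone()
    for sigma in noise_levels:                          # anneal from large to small noise
        alpha = eps * (sigma / noise_levels[-1]) ** 2   # step size shrinks with sigma
        for _ in range(n_steps):
            z = torch.randn_like(x)                     # fresh Gaussian noise each step
            x = x + 0.5 * alpha * score_fn(x, sigma) + (alpha ** 0.5) * z
    return x
```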
3. Methodology
The Warm Diffusion framework introduces a specialized forward diffusion process that progressively adds both blur and noise to an image, so the model learns to reverse these combined degradations. The methodology includes a novel sampling strategy and a tailored loss function for estimating the score function at blur-noise corrupted data points. Training proceeds in multiple stages: the model first learns a coarse representation of the degradation and then refines it, enabling more precise reconstruction of clean images from heavily degraded inputs.
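The sketch below illustrates the general shape of such a blur-then-noise forward corruption: a Gaussian blur whose width grows with the timestep, followed by additive Gaussian noise on a matching schedule. The linear schedules, the `gaussian_kernel` helper, and all parameter names are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(sigma, radius=None):
    """Build a 2D Gaussian blur kernel with standard deviation `sigma`."""
    radius = radius or max(1, int(3 * sigma))
    coords = torch.arange(-radius, radius + 1, dtype=torch.float32)
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    return torch.outer(g, g)  # separable Gaussian -> 2D kernel

def warm_forward(x0, t, T, max_blur_sigma=3.0, max_noise_sigma=0.5):
    """Illustrative forward process: blur strength and noise level both grow with t.

    x0: clean images of shape (B, C, H, W); t in [0, T].
    Returns the corrupted sample x_t = blur_t(x0) + noise_t * eps and the noise eps.
    """
    frac = t / T
    blur_sigma = max(1e-3, frac * max_blur_sigma)     # assumed linear blur schedule
    noise_sigma = frac * max_noise_sigma              # assumed linear noise schedule

    kernel = gaussian_kernel(blur_sigma).to(x0)                # (kH, kW)
    weight = kernel.repeat(x0.shape[1], 1, 1, 1)               # depthwise kernel per channel
    pad = kernel.shape[-1] // 2
    x_blur = F.conv2d(F.pad(x0, (pad,) * 4, mode="reflect"),
                      weight, groups=x0.shape[1])

    eps = torch.randn_like(x0)
    return x_blur + noise_sigma * eps, eps
```

In a score-matching setup, the corresponding loss would train a network, conditioned on the timestep, to predict either the injected noise `eps` or the score of the corrupted distribution at each step.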
4. Experimental Results
Experiments were conducted on several benchmark datasets and demonstrate the superior performance of Warm Diffusion compared to baseline models. Quantitative metrics, namely FID, PSNR, and SSIM, consistently show improvements across varying levels of blur and noise, and qualitative analysis confirms that the proposed model generates images with sharper details and fewer artifacts.
The table below summarizes the quantitative performance of Warm Diffusion against leading baseline models across different degradation scenarios. We report Fréchet Inception Distance (FID) for generative quality, and Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) for restoration quality; lower FID is better, while higher PSNR and SSIM are better. Warm Diffusion consistently achieves competitive or superior results, highlighting its effectiveness on blur-noise mixtures. A brief sketch of how PSNR and SSIM can be computed follows the table.
| Model | FID ↓ | PSNR (dB) ↑ | SSIM ↑ |
|---|---|---|---|
| DDPM Baseline | 28.5 | 25.1 | 0.78 |
| SGM Baseline | 26.2 | 26.5 | 0.81 |
| Deblur-Net | 30.1 | 24.8 | 0.77 |
| Warm Diffusion (Ours) | 22.3 | 28.9 | 0.85 |
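As a brief reference for the restoration metrics above, the sketch below computes PSNR directly and delegates SSIM to scikit-image; the dependency on `skimage.metrics.structural_similarity` (with the `channel_axis` argument of recent versions) is an assumption, and FID is omitted since it requires an Inception-based feature extractor.

```python
import numpy as np
from skimage.metrics import structural_similarity  # assumes scikit-image >= 0.19 is installed

def psnr(reference, restored, data_range=1.0):
    """Peak Signal-to-Noise Ratio in dB for images scaled to [0, data_range]."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10((data_range ** 2) / mse)

# Example with random stand-in images of shape (H, W, C) in [0, 1]
ref = np.random.rand(256, 256, 3)
out = np.clip(ref + 0.05 * np.random.randn(256, 256, 3), 0.0, 1.0)

print("PSNR (dB):", psnr(ref, out))
print("SSIM:", structural_similarity(ref, out, data_range=1.0, channel_axis=-1))
```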
5. Discussion
The results confirm that the Warm Diffusion model effectively addresses the challenges posed by blur-noise mixtures, improving both image generation and restoration. The gains can be attributed to the specialized degradation modeling and training strategy, which allow the model to capture complex degradations more faithfully. Future work will extend this framework to other degradation types and real-world inverse problems, further broadening its applicability.