In this work, we present the first attempt at addressing domain adaptation in the noise space for image restoration. In particular, by leveraging the unique property of how auxiliary conditional inputs influence the multi-step denoising process, we derive a meaningful diffusion loss that guides the restoration model in progressively aligning both restored synthetic and real-world outputs with a target clean distribution. We refer to this method as denoising as adaptation. To prevent shortcuts during joint training, we introduce two crucial strategies in the diffusion model: a channel-shuffling layer and residual-swapping contrastive learning. These strategies implicitly blur the boundaries between conditioned synthetic and real data, preventing the model from relying on easily distinguishable features. Experimental results on three classical image restoration tasks, namely denoising, deblurring, and deraining, demonstrate the effectiveness of the proposed method.
Our main idea stems from the observation in figure (a) above. Here, we measure the noise prediction error of a diffusion model conditioned on a noisy version of the target image. The trend shows that conditions with lower corruption levels lead to lower prediction errors of the diffusion model. In other words, ``good'' conditions give low diffusion loss, and ``bad'' conditions lead to high diffusion loss. Our method leverages this phenomenon by training a diffusion model conditioned on both the restored synthetic image and the restored real-world image from the restoration network, as shown in figure (b) above. Both networks are jointly trained, with the restoration network optimized to provide good conditions that minimize the diffusion model's noise prediction error, thereby steering its outputs toward the clean target distribution. After training, the diffusion model is discarded, leaving only the trained restoration network for inference.
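To make this joint training concrete, below is a minimal PyTorch sketch of one training step. It assumes a standard DDPM noise schedule, and the module names (restoration_net, diffusion_net) and loss weighting are illustrative placeholders under these assumptions, not our released implementation.

import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # standard DDPM schedule (assumed)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative product of (1 - beta_t)

def joint_training_step(restoration_net, diffusion_net, syn_lq, syn_gt, real_lq):
    # Restore both the synthetic and the real-world low-quality inputs.
    syn_restored = restoration_net(syn_lq)
    real_restored = restoration_net(real_lq)

    # Forward diffusion: noise the clean target image to a random timestep t.
    b = syn_gt.size(0)
    t = torch.randint(0, T, (b,), device=syn_gt.device)
    noise = torch.randn_like(syn_gt)
    a = alpha_bar.to(syn_gt.device)[t].view(b, 1, 1, 1)
    x_t = a.sqrt() * syn_gt + (1.0 - a).sqrt() * noise

    # Condition the noise prediction on both restored outputs; per the
    # observation above, worse conditions yield a higher prediction error.
    cond = torch.cat([syn_restored, real_restored], dim=1)
    diff_loss = F.mse_loss(diffusion_net(x_t, t, cond), noise)

    # A supervised restoration loss on the synthetic pair is kept as usual.
    rest_loss = F.l1_loss(syn_restored, syn_gt)
    return rest_loss + diff_loss

Minimizing diff_loss with respect to the restoration network pushes both restored outputs toward conditions the diffusion model finds ``good'', i.e., toward the clean target distribution.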
The joint training of the restoration model and the diffusion model, however, could lead to trivial solutions or shortcuts. For example, the diffusion model can easily distinguish the synthetic and real-world conditions by pixel similarity or channel index. Consequently, the restoration network can cheat the diffusion model by crudely degrading the high-frequency information in real-world images.
To mitigate this issue, we propose crucial strategies that fool the diffusion model, making it hard to discriminate between the two conditions. Specifically, we incorporate a channel-shuffling layer into the diffusion model and design a residual-swapping contrastive learning strategy, ensuring that the model genuinely learns to restore images rather than relying on easily distinguishable features. These strategies implicitly blur the boundaries between synthetic and real data, so that both contribute effectively during joint training and align with the target distribution. More details can be found in Section 3.2 of the paper.
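As a small illustration of the channel-shuffling idea, the sketch below randomly permutes the order in which the two restored conditions are concatenated, so the diffusion model cannot identify the synthetic condition by its channel index. The function name and exact form are illustrative assumptions; the residual-swapping contrastive learning strategy is more involved and is detailed in the paper.

import torch

def shuffle_conditions(syn_restored: torch.Tensor,
                       real_restored: torch.Tensor) -> torch.Tensor:
    # Concatenate the two conditions along the channel dimension in a
    # random order, hiding which slot holds the synthetic result.
    if torch.rand(()) < 0.5:
        return torch.cat([syn_restored, real_restored], dim=1)
    return torch.cat([real_restored, syn_restored], dim=1)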
We compare different domain adaptation (DA) strategies here: (a) Feature-space DA aligns the intermediate features across source and target domains. (b) Pixel-space DA translates source data to the ``style'' of the target domain through adversarial learning. (c) The proposed noise-space DA is specifically designed for low-level vision. It gradually adapts the results from both source and target domains to the target clean image distribution via multi-step denoising in the pixel-wise noise space. In the context of image restoration, the function network is instantiated as a restoration network.
Our work serves as a general domain adaptation strategy for various image restoration tasks, and it is scalable and compatible with any restoration network. Notably, for each type of architecture, our method yields larger gains as the complexity of the restoration network increases, demonstrating its effectiveness in mitigating the overfitting problem of large models.
@article{liao2024denoising,
  title={Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration},
  author={Liao, Kang and Yue, Zongsheng and Wang, Zhouxia and Loy, Chen Change},
  journal={arXiv preprint arXiv:2406.18516},
  year={2024}
}