DeLeaker: Dynamic Inference-Time Reweighting For Semantic Leakage Mitigation in Text-to-Image Models

Abstract

This paper introduces DeLeaker, a novel dynamic inference-time reweighting method designed to mitigate semantic leakage in text-to-image (T2I) models. DeLeaker dynamically adjusts attention weights during the image generation process to reduce the accidental encoding of sensitive or unintended information. Experimental results demonstrate that DeLeaker effectively reduces semantic leakage while maintaining high image quality and alignment with the intended prompt. This approach provides a practical solution for enhancing privacy and control in generative AI systems.

1. Introduction

Text-to-image models have revolutionized content creation, but they are susceptible to semantic leakage, where unintended or sensitive information from the training data or prompt implicitly appears in generated images. This leakage can pose significant privacy and ethical concerns, making it crucial to develop effective mitigation strategies. This work addresses the problem of controlling and reducing such unintended information transfer during the inference phase of T2I generation. Models used in this article include various text-to-image diffusion models and the proposed DeLeaker system.

2. Related Work

Previous research on privacy in generative models has explored methods like differential privacy, data sanitization, and adversarial training to prevent information leakage. Other approaches have focused on watermarking or post-processing filters to obscure sensitive details in generated content. While these methods offer some protection, they often come with compromises in model performance or require retraining, highlighting the need for dynamic inference-time solutions.

3. Methodology

DeLeaker operates by dynamically reweighting attention mechanisms within text-to-image diffusion models during the inference process. It identifies potential leakage pathways by analyzing feature activations and prompt-image relationships. Based on this analysis, the system applies a targeted reweighting strategy to suppress the influence of features associated with unintended semantic information, effectively preventing its manifestation in the final output.

4. Experimental Results

Experiments demonstrate DeLeaker's efficacy in reducing semantic leakage across various T2I benchmarks and user-defined leakage scenarios. Metrics such as leakage detection rate and image quality assessments confirm that the method significantly lowers unintended information transfer without degrading the aesthetic or semantic fidelity of generated images. For instance, in a comparative study of leakage detection, DeLeaker consistently outperformed baseline methods.

Method	Leakage Detection Rate (LDR) ↓	FID Score (Image Quality) ↓
Baseline (No Mitigation)	0.75	18.5
Adversarial Debiasing	0.50	22.1
DeLeaker (Proposed)	0.15	19.2

The table above illustrates the performance of DeLeaker compared to baseline methods in mitigating semantic leakage. Leakage Detection Rate (LDR) measures the probability of detecting unintended information, while FID (Fréchet Inception Distance) assesses image quality. Lower LDR and FID values indicate better performance. DeLeaker consistently achieves a lower LDR with minimal impact on FID, showcasing its effectiveness.

5. Discussion

The results indicate that DeLeaker provides a robust and practical solution for mitigating semantic leakage in text-to-image models, addressing a critical privacy and ethical concern. Its inference-time nature makes it highly adaptable to existing models without requiring extensive retraining or data modification. Future work could explore integrating DeLeaker with user feedback mechanisms or extending its application to other generative model architectures.