A Generative Image Steganography Method Based on Dual Diffusion Models for Unknown Source-Target Domains

董云云; 张锦弘; 李钒效; 朱波; 周维

doi:10.7544/issn1000-1239.202550348

Abstract: The rapid growth of social networks has made data transmission over public channels ubiquitous, raising serious security and privacy concerns. Although encryption protects confidentiality, the distinctive format of ciphertext may itself arouse suspicion. Steganography offers a covert alternative by hiding secret data within ordinary media such as images, text, and audio, thereby reducing the risk of detection. Traditional steganographic methods based on pixel editing are vulnerable to steganalysis and to compression, while existing generative approaches mostly depend on tailored prompts or specific data sources, which limits their generalization and flexibility. To address these challenges, we propose a generative image steganography method for unknown source and target domains based on dual diffusion models. The method first uses a latent diffusion model (LDM) to invert arbitrary-source secret images and optimize their reconstruction, obtaining high-fidelity latent representations. Because structural information left over from LDM inversion can interfere with subsequent hiding and extraction, we introduce a Gaussian structural reordering alignment module that encrypts the secret latent representation through pseudo-random orthogonal perturbation and spatial rearrangement. The module preserves the norm of the latent representation while making its structure closer to a Gaussian distribution, which aligns it with the input expected by the downstream module and improves security. Finally, a denoising diffusion implicit model (DDIM) denoises and resamples the encrypted latent representation to generate a stego image that combines the appearance of the target domain with the secret information. Experiments show that the method requires no training for a specific source domain and generalizes well across domains; on four public datasets it significantly outperforms the compared methods in secret-image extraction accuracy, stego-image imperceptibility, and resistance to steganalysis.
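The abstract describes the alignment module only at a high level, so the following is a minimal sketch of the general idea and not the authors' implementation. It assumes the latent is stored as a list of channels, each a flat list of spatial values; it assumes the orthogonal perturbation can be illustrated with key-derived Givens rotations across channels, and the spatial rearrangement with a key-derived permutation. The names `scramble`, `unscramble`, `norm`, and the integer `key` are hypothetical. The sketch shows two properties stated in the abstract: the transform preserves the norm of the latent, and a receiver holding the same key can invert it exactly.

```python
import math
import random


def _keyed_ops(key, channels, positions):
    """Derive rotations (an orthogonal map) and a spatial permutation from a key.

    Assumption: an integer key seeds a pseudo-random generator shared by
    sender and receiver. The paper's actual key handling is not specified here.
    """
    rng = random.Random(key)
    rotations = [(i, j, rng.uniform(0.0, 2.0 * math.pi))
                 for i in range(channels) for j in range(i + 1, channels)]
    perm = list(range(positions))
    rng.shuffle(perm)
    return rotations, perm


def _rotate(latent, i, j, theta):
    """Rotate channels i and j by theta at every spatial position, in place."""
    c, s = math.cos(theta), math.sin(theta)
    for p in range(len(latent[i])):
        a, b = latent[i][p], latent[j][p]
        latent[i][p] = c * a - s * b
        latent[j][p] = s * a + c * b


def scramble(latent, key):
    """Apply the keyed orthogonal mixing, then the keyed spatial permutation."""
    rotations, perm = _keyed_ops(key, len(latent), len(latent[0]))
    out = [row[:] for row in latent]
    for i, j, theta in rotations:
        _rotate(out, i, j, theta)
    return [[row[src] for src in perm] for row in out]


def unscramble(encrypted, key):
    """Undo the permutation, then undo the rotations in reverse order."""
    rotations, perm = _keyed_ops(key, len(encrypted), len(encrypted[0]))
    out = [[0.0] * len(perm) for _ in encrypted]
    for row_out, row_in in zip(out, encrypted):
        for p, src in enumerate(perm):
            row_out[src] = row_in[p]
    for i, j, theta in reversed(rotations):
        _rotate(out, i, j, -theta)
    return out


def norm(latent):
    """Euclidean norm of the whole latent."""
    return math.sqrt(sum(v * v for row in latent for v in row))
```

Both steps are orthogonal operations (a product of plane rotations and a permutation), which is why the norm is unchanged and why the inverse is simply the transposed steps applied in reverse order. A toy usage with 4 channels and 16 spatial positions, sizes chosen only for illustration: `enc = scramble(z, key=1234)` followed by `unscramble(enc, key=1234)` recovers `z` up to floating-point error.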
