VMDiff: Visual Mixing Diffusion for Limitless Cross-Object Synthesis

Zeren Xiong1, Yue Yu1, Zedong Zhang1,
Shuo Chen2, Jian Yang1, Jun Li1*

1Nanjing University of Science and Technology
2Nanjing University
Code (coming soon) | Award



Teaser figure.


Abstract

Creating novel images by fusing visual cues from multiple sources is a fundamental yet underexplored problem in image-to-image generation, with broad applications in artistic creation, virtual reality, and visual media. Existing methods often face two key challenges: coexistent generation, where multiple objects are simply juxtaposed without true integration, and bias generation, where one object dominates the output due to semantic imbalance. To address these issues, we propose Visual Mixing Diffusion (VMDiff), a simple yet effective diffusion-based framework that synthesizes a single, coherent object by integrating two input images at both the noise and latent levels. Our approach comprises: (1) a hybrid sampling process that combines guided denoising, inversion, and spherical interpolation with adjustable parameters to achieve structure-aware fusion, mitigating coexistent generation; and (2) an efficient adaptive adjustment module, which introduces a novel similarity-based score to automatically search for optimal parameters, countering semantic bias. Experiments on a curated benchmark of 780 concept pairs demonstrate that our method outperforms strong baselines in visual quality, semantic consistency, and human-rated creativity.
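To make the two ingredients named in the abstract more concrete, the following is a minimal sketch, not the VMDiff implementation. It shows standard spherical interpolation (slerp) between two vectors standing in for noise or latent tensors, followed by a toy grid search over the interpolation weight. The names `slerp`, `balance_score`, and `search_weight` are hypothetical, and the balance score used here (the smaller of the two cosine similarities) is an assumption chosen for illustration; the paper's actual similarity-based score and search procedure are not specified on this page.

```python
import numpy as np


def slerp(a, b, t, eps=1e-7):
    """Spherical interpolation between arrays a and b with weight t in [0, 1]."""
    a_flat, b_flat = a.ravel(), b.ravel()
    cos_omega = np.dot(a_flat, b_flat) / (
        np.linalg.norm(a_flat) * np.linalg.norm(b_flat)
    )
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Vectors are (anti)parallel: fall back to linear interpolation.
        return (1.0 - t) * a + t * b
    w_a = np.sin((1.0 - t) * omega) / sin_omega
    w_b = np.sin(t * omega) / sin_omega
    return w_a * a + w_b * b


def cosine(u, v):
    """Cosine similarity between two arrays, treated as flat vectors."""
    u, v = u.ravel(), v.ravel()
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def balance_score(sim_a, sim_b):
    # ASSUMPTION: a hypothetical score that is high only when the mix is
    # similar to BOTH sources, so neither one dominates.
    return min(sim_a, sim_b)


def search_weight(a, b, candidates):
    """Pick the interpolation weight with the best (hypothetical) balance score."""
    best_t, best_score = None, -np.inf
    for t in candidates:
        mixed = slerp(a, b, t)
        score = balance_score(cosine(mixed, a), cosine(mixed, b))
        if score > best_score:
            best_t, best_score = float(t), score
    return best_t, best_score


# Toy usage with two orthogonal unit vectors standing in for two sources.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
best_t, best_score = search_weight(a, b, np.linspace(0.0, 1.0, 11))
```

Unlike linear interpolation, slerp keeps the norm of the mixed vector constant when both inputs have equal norm, which is one common reason it is preferred for mixing Gaussian noise in diffusion models. In a real pipeline the similarities would be measured on generated images rather than on the interpolated vectors themselves; the vector-level comparison above is only a stand-in.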

Novel Object Synthesis


[Figure: four example panels, each showing two input images and the synthesized result. Caption on each panel: corgi + coffee machine]



Framework

Breed mixing results


Comparisons with Multi-Concept Generation Methods.

Semantic style transfer results

Comparisons with Mixing and Image Editing Methods.

Novel object synthesis results


More Results.

Novel object synthesis results


