# MOSAIC

MOSAIC: Multi-Subject Personalized Generation via Correspondence-Aware Alignment and Disentanglement

Dong She^*, Siming Fu^*, Mushui Liu^*, Qiaoqiao Jin^*, Hualiang Wang^*,
Mu Liu, Jidong Jiang⁺
Fanqie AI Team, ByteDance

## 🔥 News

- **2026-01-27**: 🎉 Congratulations! MOSAIC has been accepted to ICLR 2026!
- **2026-01-27**: 🔥 Updated the multi-subject data generation pipeline.
- **2025-09-30**: 🔥 Released the training/inference code and [models](https://huggingface.co/ByteDance-FanQie/MOSAIC) (resolution 512x512). The 1024x1024 version is coming soon.
- **2025-09-30**: 🔥 Released the [SemAlign-MS-Subjects200K](https://huggingface.co/datasets/ByteDance-FanQie/SemAlign-MS-Subjects200K) dataset.
- **2025-09-02**: The [arXiv paper](https://arxiv.org/abs/2509.01977v1) of MOSAIC is released.
- **2025-08-20**: The [project page](https://bytedance-fanqie-ai.github.io/MOSAIC/) of MOSAIC is released.

## 📖 Introduction

We present MOSAIC, a representation-centric framework that rethinks multi-subject generation through explicit semantic correspondence and orthogonal feature disentanglement. Our key insight is that multi-subject generation requires precise semantic alignment at the representation level: knowing exactly which regions in the generated image should attend to which parts of each reference.
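To make the idea of representation-level alignment concrete, here is a minimal pure-Python sketch of two objectives of this kind: a correspondence term that rewards attention on matched reference/target pairs, and a disentanglement term that rewards divergence between the attention distributions of different references. It is not the authors' implementation. The function names, the attention layout `attn[r][t]`, the negative-log form, and the use of symmetric KL divergence are all assumptions chosen for illustration.

```python
import math

EPS = 1e-8  # numerical floor so log() stays finite (illustrative choice)


def correspondence_attention_loss(attn, pairs):
    """Mean negative log-attention over matched (reference, target) pairs.

    attn[r][t]: attention weight that reference token r places on target
                location t; each row is assumed to sum to 1 (assumed layout).
    pairs:      list of (r, t) index pairs, e.g. from a point matcher.
    """
    return -sum(math.log(attn[r][t] + EPS) for r, t in pairs) / len(pairs)


def _kl(p, q):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + EPS) / (qi + EPS)) for pi, qi in zip(p, q))


def disentanglement_loss(ref_maps):
    """Negative mean symmetric KL over all pairs of references.

    ref_maps[k][t]: attention distribution of reference k over target
                    locations. Minimizing this value pushes the
                    distributions apart (hypothetical formulation).
    """
    n = len(ref_maps)
    if n < 2:
        return 0.0  # nothing to separate with a single reference
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += _kl(ref_maps[i], ref_maps[j]) + _kl(ref_maps[j], ref_maps[i])
            count += 1
    return -total / count


# Toy example: two reference tokens, two target locations.
attn = [[0.9, 0.1], [0.2, 0.8]]
pairs = [(0, 0), (1, 1)]
print(correspondence_attention_loss(attn, pairs))
print(disentanglement_loss(attn))
```

In this sketch the correspondence term approaches zero as each reference token concentrates its attention on its matched location, and the disentanglement term becomes more negative as the references' attention maps overlap less.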

MOSAIC introduces two key supervisions: (1) Semantic Correspondence Attention Loss (blue region) enforces precise point-to-point alignment between reference tokens and their corresponding locations in the target latent, ensuring high consistency; (2) Multi-Reference Disentanglement Loss (green region) maximizes the divergence between different references' attention distributions, pushing each subject into orthogonal representational subspaces.

## ⚡️ Quick Start

### 🔧 Requirements and Installation

Install the requirements:

```bash
conda create -n mosaic python=3.10 -y
conda activate mosaic
pip install -r requirements.txt
```

### ✍️ Inference

```bash
python inference.py
```

This downloads the Flux-dev model and the MOSAIC checkpoints, then runs the pre-configured demo case. To generate your own outputs, edit [example_cases.json](example_cases.json).

### 🚄 Training

1. Train with the [SemAlign-MS-Subjects200K](https://huggingface.co/datasets/ByteDance-FanQie/SemAlign-MS-Subjects200K) dataset:

   ```bash
   huggingface-cli download --repo-type dataset ByteDance-FanQie/SemAlign-MS-Subjects200K --local-dir your_data_folder/SemAlign-MS-Subjects200K
   ```

2. Train with a customized dataset. To set up the preprocessing environment, first download the [DIFT](https://github.com/Tsingularity/dift) and [GeoAware](https://github.com/Junyi42/GeoAware-SC) codebases. Next, integrate our preprocessing scripts by copying [dift_point_matching.py](preprocess/dift_point_matching.py) and [geoaware_point_matching.py](preprocess/geoaware_point_matching.py) into their respective directories. You can then run these scripts to generate semantic correspondences for your own dataset.

Once the data is prepared, start training with `bash train.sh`.

## TODO

To support research and the open-source community, we will release the entire project, including datasets, inference pipelines, and model weights. Thank you for your patience and continued support! 🌟

- ✅ Release training and inference code.
- ✅ Release the SemAlign-MS-Subjects200K dataset.
- ✅ Release the multi-subject driven dataset generation pipeline.
- ✅ Release model checkpoints (512x512).
- ⬜ Release model checkpoints (1024x1024).

## Citation

```
@misc{she2025mosaicmultisubjectpersonalizedgeneration,
      title={MOSAIC: Multi-Subject Personalized Generation via Correspondence-Aware Alignment and Disentanglement},
      author={Dong She and Siming Fu and Mushui Liu and Qiaoqiao Jin and Hualiang Wang and Mu Liu and Jidong Jiang},
      year={2025},
      eprint={2509.01977},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.01977},
}
```