# RoboTransfer

**Repository Path**: matrix_ming_tsai/RoboTransfer

## Basic Information

- **Project Name**: RoboTransfer
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-08-25
- **Last Updated**: 2025-08-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# *RoboTransfer*: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer

[![🌐 Project Page](https://img.shields.io/badge/🌐-Project_Page-blue)](https://horizonrobotics.github.io/robot_lab/robotransfer)
[![📄 arXiv](https://img.shields.io/badge/📄-arXiv-b31b1b)](https://arxiv.org/abs/2505.23171)
[![🎥 Video](https://img.shields.io/badge/🎥-Video-red)](https://youtu.be/dGXKtqDnm5Q)
[![Chinese Introduction](https://img.shields.io/badge/中文介绍-07C160?logo=wechat&logoColor=white)](https://mp.weixin.qq.com/s/c9-1HPBMHIy4oEwyKnsT7Q)
[![Synced (机器之心) Introduction](https://img.shields.io/badge/机器之心介绍-07C160?logo=wechat&logoColor=white)](https://mp.weixin.qq.com/s/Hj2h3nxO8XxPeqd3OhctKA)

> ***RoboTransfer*** is a diffusion-based video generation framework for robotic data synthesis. Unlike previous methods, RoboTransfer integrates multi-view geometry with explicit control over scene components, such as background and object attributes. By incorporating cross-view feature interactions and global depth/normal conditions, RoboTransfer keeps geometry consistent across views. The framework allows fine-grained control, including background edits and object swaps.

*(Figure: Overall Framework)*

## ✅ Setup Environment

We use uv to manage dependencies. To set up the environment:

```bash
git clone https://github.com/HorizonRobotics/RoboTransfer.git --recursive
cd RoboTransfer
export UV_HTTP_TIMEOUT=600
uv sync
uv pip install -e .
```

## 🚀 Inference

```bash
uv run main.py
```

## 📈 More Inference Data

Install the additional dependencies required by the data pipeline:

```bash
uv sync --extra data
```

### ⚙️ For more sim data

You can obtain more simulation data from the [RoboTwin CVPR Challenge](https://github.com/RoboTwin-Platform/RoboTwin/tree/CVPR-Challenge-2025-Round1). Then use the `process_sim.sh` script to convert the raw data (`.pickle` and `.hdf5` files) into the RoboTransfer format with geometric conditioning.

```bash
script/process_sim.sh
```

### 🤖 For more real data

Real-world data collected with the ALOHA-AgileX robot system is available as the [RoboTransfer-RealData](https://huggingface.co/datasets/HorizonRobotics/RoboTransfer-RealData) dataset. Then use the `process_real.sh` script to convert the raw RGB images into the RoboTransfer format with geometric conditioning.

```bash
script/process_real.sh
```

## 🙌 Acknowledgement

RoboTransfer builds upon the following amazing projects and models:

- 🌟 [Video-Depth-Anything](https://github.com/DepthAnything/Video-Depth-Anything)
- 🌟 [Lotus](https://github.com/EnVision-Research/Lotus)
- 🌟 [GPT4o](https://platform.openai.com/docs/models/gpt-4o)
- 🌟 [GroundSam](https://github.com/IDEA-Research/Grounded-Segment-Anything)
- 🌟 [IOPaint](https://github.com/Sanster/IOPaint)

## ⚖️ License

This project is licensed under the [Apache License 2.0](LICENSE). See the `LICENSE` file for details.

## 📚 Citation

If you use RoboTransfer in your research or projects, please cite:

```bibtex
@misc{liu2025robotransfergeometryconsistentvideodiffusion,
  title={RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer},
  author={Liu Liu and Xiaofeng Wang and Guosheng Zhao and Keyu Li and Wenkang Qin and Jiaxiong Qiu and Zheng Zhu and Guan Huang and Zhizhong Su},
  year={2025},
  eprint={2505.23171},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2505.23171},
}
```
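
## Appendix: checking raw simulation data

Returning to the simulation data step above: before running `script/process_sim.sh`, it can help to confirm that the download actually contains the two raw file types mentioned there (`.pickle` and `.hdf5`). The snippet below is a minimal sketch, not part of RoboTransfer. The helper name `count_raw_files` and the idea that all raw files sit somewhere under one directory are assumptions made for illustration; the layout that `process_sim.sh` expects is not described in this README.

```python
import tempfile
from collections import Counter
from pathlib import Path

# Extensions named in this README for raw simulation data.
RAW_SUFFIXES = {".pickle", ".hdf5"}


def count_raw_files(root):
    """Count raw simulation files under `root`, grouped by extension.

    Hypothetical helper: the directory layout is an assumption,
    so the tree is searched recursively.
    """
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"raw data directory not found: {root}")
    return Counter(
        p.suffix
        for p in root.rglob("*")
        if p.is_file() and p.suffix in RAW_SUFFIXES
    )


if __name__ == "__main__":
    # Self-contained demo on a throwaway directory with placeholder files.
    with tempfile.TemporaryDirectory() as tmp:
        episode = Path(tmp) / "episode0"
        episode.mkdir()
        (episode / "frame0.pickle").touch()
        (episode / "traj.hdf5").touch()
        counts = count_raw_files(tmp)
        for suffix in sorted(RAW_SUFFIXES):
            print(f"{suffix}: {counts[suffix]} file(s)")
```

In practice you would point `count_raw_files` at wherever you stored the RoboTwin download; a count of zero for either extension suggests the download is incomplete or located elsewhere.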