# WebSynthesis

**Repository Path**: nilbody_0/WebSynthesis

## Basic Information

- **Project Name**: WebSynthesis
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-10-30
- **Last Updated**: 2025-10-30

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# WebSynthesis overview
## World Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis

WebSynthesis is a framework that integrates **world model** learning with **Monte Carlo Tree Search** (MCTS) to significantly reduce the cost of synthesizing high-quality Web UI trajectories online. The policy agent acquires web navigation capabilities through a two-stage curriculum: UI fundamental understanding, followed by UI behavior cloning.

![framework](./figure/framework.jpg)

## News

* [08/07/2025] Our paper is now available on [arXiv](https://www.arxiv.org/abs/2507.04370).
* [07/07/2025] We uploaded some of the training data and LoRA weights.
* [06/07/2025] We released the code of WebSynthesis.

## Two-stage Curriculum Learning

![class](./figure/class.jpg)

## Project Structure

The WebSynthesis project is organized as follows:

```shell
WebSynthesis/
├── config_files/          # Configuration files for different tasks (0.json, 1.json, ...)
├── data/                  # Generated data from MCTS runs (created during execution)
├── figure/                # Images and figures used in the README and documentation
├── models/                # Model-related code
│   ├── get_response.py    # Functions for getting responses from models
│   └── models.py          # Model definitions and implementations
├── utils/                 # Utility functions and helper modules
│   ├── new_obs_opt.py     # New observation optimization utilities
│   ├── obs_opt.py         # Observation optimization utilities
│   ├── prune_mcts.py      # MCTS pruning utilities
│   ├── query_llm.py       # LLM query utilities
│   ├── search_utils.py    # Search utilities
│   ├── text_utils.py      # Text processing utilities
│   ├── traj_utils.py      # Trajectory utilities
│   └── treeNode.py        # TreeNode implementation for MCTS
├── webMCTS/               # WebMCTS core implementation
│   ├── base.py            # Base classes for MCTS
│   ├── mcts.py            # MCTS algorithm implementation
│   ├── prompt.py          # Prompt templates for MCTS
│   └── task.py            # Task definitions for MCTS
├── webmcts-ttraj/         # Traceable trajectories (created during execution)
├── webmcts-vtraj/         # Valuable trajectories (created during execution)
├── fuzzy_match.json       # Fuzzy matching cache
├── merge.py               # Script to merge and process MCTS data
├── README.md              # Project documentation
├── run.py                 # Main script to run MCTS tasks
└── run.sh                 # Shell script to run MCTS on all config files
```

Key directories and files:

- `config_files/`: JSON configuration files for the individual MCTS tasks (numbered 0.json, 1.json, etc.).
- `webMCTS/`: Core implementation of the WebMCTS algorithm.
- `utils/`: Helper modules for observation optimization, LLM querying, and trajectory processing.
- `models/`: Model-related code for the world model, reward model, and policy agents.
- `run.py`: Main entry point for running MCTS tasks.
- `run.sh`: Shell script that runs MCTS on all configuration files.
- `merge.py`: Script that processes and merges the generated MCTS data into valuable and traceable trajectories.

## Data Collection (MCTS)

1. **Clone the GitHub Repository:**

   ```
   git clone https://github.com/LucusFigoGao/WebSynthesis.git
   ```

2.
   **Collection:**

   ```bash
   cd WebSynthesis
   bash run.sh       # run MCTS
   python merge.py   # merge the data
   ```

## Data Resources

### UI Fundamental Understanding

| Model Name | Base Model | Training Data | LoRA |
| :---: | :---: | :---: | :---: |
| TextUI-Cap-7B | [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) | [TextUI-dense-caption-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/blob/main/textui-caption2k.json) | [🤗 link](https://huggingface.co/yifeigao/WebSynthesis/tree/main/TextUI-Cap-7B) |
| TextUI-Func-7B | [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) | [TextUI-functionality-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/blob/main/textui-function6k.json) | [🤗 link](https://huggingface.co/yifeigao/WebSynthesis/tree/main/TextUI-Func-7B) |
| TextUI-Trans-7B | [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) | [TextUI-state-transition-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/blob/main/textui-transmission7k.json) | [🤗 link](https://huggingface.co/yifeigao/WebSynthesis/tree/main/TextUI-Trans-7B) |

### UI Behavior Cloning

| Model Name | Base Model | Training Data | LoRA |
| :---: | :---: | :---: | :---: |
| WebSynthesis-7B | [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) | [WebSynthesis-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/tree/main/websynthesis.json) | [🤗 coming soon](https://huggingface.co/yifeigao/WebSynthesis) |
| OS-Genesis-TextUI-7B | [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) | [OS-Genesis-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/tree/main/os_genesis_sft7k.json) | [🤗 coming soon](https://huggingface.co/yifeigao/WebSynthesis) |

### Collection of World Model State Transition

| Model Name | Base Model | Training Data | LoRA |
| :---: | :---: | :---: | :---: |
| World Model 7B | [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) | [world-model-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/blob/main/world-model-training-data-27k.json) | [🤗 coming soon](https://huggingface.co/yifeigao/WebSynthesis/) |

🙏 Many thanks to the following open-source projects for their raw data contributions:

- [ICLR'25] [OS-Genesis-Web-Data](https://huggingface.co/datasets/OS-Copilot/OS-Genesis-web-data)
- [ACL'25] [AgentTrek-Web-Data](https://huggingface.co/datasets/xlangai/AgentTrek)
- [ICLR'24] [WebArena-Web-Data](https://github.com/web-arena-x/webarena/tree/main/resources)

## Main Experiment

![main-exp](./figure/main-exp.png)

## Ablation Studies

![aba-exp](./figure/ablation.png)

## Scaling Analysis

![scaling-exp](./figure/scaling.jpg)

## Case Study on WebMCTS

![case](./figure/case_webmcts.jpg)

## Case Study on World Model

![case](./figure/world_model.jpg)

## Citation 📖

🫶 If you are interested in our work or find this repository or our data helpful, please consider citing our paper:

```bibtex
@misc{gao2025websynthesisworldmodelguidedmctsefficient,
      title={WebSynthesis: World-Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis},
      author={Yifei Gao and Junhong Ye and Jiaqi Wang and Jitao Sang},
      year={2025},
      eprint={2507.04370},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2507.04370},
}
```
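
For readers new to the idea, the toy sketch below illustrates what "world model-guided MCTS" means in general: the search expands and evaluates actions against a learned model's predictions instead of a live website. It is not the code in `webMCTS/`. Everything here is a hypothetical stand-in chosen for illustration: states are integers rather than web pages, `world_model` and `reward_model` are hand-written functions rather than trained LLMs, and the names `Node`, `search`, and `best_trajectory` do not correspond to anything in this repository.

```python
import math
import random
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical toy setting: a "page" is an integer and the task is to reach GOAL.
GOAL = 5
ACTIONS = (-1, +1)  # stand-ins for UI actions such as click or type


def world_model(state: int, action: int) -> int:
    """Predict the next state offline, in place of a live browser."""
    return state + action


def reward_model(state: int) -> float:
    """Score a predicted state in (0, 1]; 1.0 means the goal is reached."""
    return 1.0 / (1.0 + abs(GOAL - state))


@dataclass
class Node:
    state: int
    parent: Optional["Node"] = None
    action: Optional[int] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    value: float = 0.0

    def uct(self, c: float = 1.4) -> float:
        """Upper confidence bound; unvisited nodes are tried first."""
        if self.visits == 0:
            return float("inf")
        exploit = self.value / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore


def search(root_state: int, iterations: int = 200, depth: int = 8, seed: int = 0) -> Node:
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend by UCT until reaching a leaf.
        while node.children:
            node = max(node.children, key=lambda n: n.uct())
        # 2. Expansion: imagine each action's outcome with the world model.
        if node.visits > 0 and node.state != GOAL:
            node.children = [Node(world_model(node.state, a), node, a) for a in ACTIONS]
            node = node.children[0]
        # 3. Simulation: random rollout, again using only the world model.
        state = node.state
        for _ in range(depth):
            if state == GOAL:
                break
            state = world_model(state, rng.choice(ACTIONS))
        reward = reward_model(state)
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return root


def best_trajectory(root: Node) -> List[int]:
    """Follow the most-visited child at each level to read out one action sequence."""
    actions, node = [], root
    while node.children:
        node = max(node.children, key=lambda n: n.visits)
        actions.append(node.action)
    return actions
```

Calling `best_trajectory(search(0))` returns the action sequence the toy search visited most often. In a real setting, the two hand-written functions would be replaced by model calls, which is where the cost savings over interacting with a live site would come from.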