# WebSynthesis

**Repository Path**: nilbody_0/WebSynthesis

## Basic Information

- **Project Name**: WebSynthesis
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-10-30
- **Last Updated**: 2025-10-30

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# WebSynthesis

<img src="./figure/Inro.jpg" alt="overview" style="zoom:20%; margin: 0 auto; display: block;" />

<div style='display:flex; gap: 0.25rem; '>
<a href="https://www.arxiv.org/abs/2507.04370"><img src="https://img.shields.io/badge/arXiv-2507.04370-b31b1b.svg"></a>
<a href="https://huggingface.co/datasets/yifeigao/WebSynthesis"><img src="https://img.shields.io/badge/Project%20Page-onlink-orange"></a>
<a href='LICENSE'><img src='https://img.shields.io/badge/License-MIT.svg'></a>
</div>

## World Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis
WebSynthesis is a framework integrating **world model** learning and **Monte Carlo Tree Search** (MCTS), designed to significantly reduce the cost of online synthesis of high-quality Web UI trajectories. Through a two-stage curriculum, including UI fundamental understanding and UI behavior cloning, the policy agent acquires web navigation capabilities.

![framwork](./figure/framework.jpg)

## News
* [08/07/2025] Our Paper is now available in [Arxiv](https://www.arxiv.org/abs/2507.04370).
* [07/07/2025] We upload some of the training data and LoRA weights.
* [06/07/2025] We release the code of WebSynthesis.


## Two-stage Curriculum Learning
![class](./figure/class.jpg)

## Project Structure

The WebSynthesis project is organized as follows:

```shell
WebSynthesis/
├── config_files/              # Configuration files for different tasks (0.json, 1.json, ...)
├── data/                      # Generated data from MCTS runs (created during execution)
├── figure/                    # Images and figures used in the README and documentation
├── models/                    # Model-related code
│   ├── get_response.py        # Functions for getting responses from models
│   └── models.py              # Model definitions and implementations
├── utils/                     # Utility functions and helper modules
│   ├── new_obs_opt.py         # New observation optimization utilities
│   ├── obs_opt.py             # Observation optimization utilities
│   ├── prune_mcts.py          # MCTS pruning utilities
│   ├── query_llm.py           # LLM query utilities
│   ├── search_utils.py        # Search utilities
│   ├── text_utils.py          # Text processing utilities
│   ├── traj_utils.py          # Trajectory utilities
│   └── treeNode.py            # TreeNode implementation for MCTS
├── webMCTS/                   # WebMCTS core implementation
│   ├── base.py                # Base classes for MCTS
│   ├── mcts.py                # MCTS algorithm implementation
│   ├── prompt.py              # Prompt templates for MCTS
│   └── task.py                # Task definitions for MCTS
├── webmcts-ttraj/             # Traceable trajectories (created during execution)
├── webmcts-vtraj/             # Valuable trajectories (created during execution)
├── data/                      # Data generated by MCTS runs (created during execution)
├── fuzzy_match.json           # Fuzzy matching cache
├── merge.py                   # Script to merge and process MCTS data
├── README.md                  # Project documentation
├── run.py                     # Main script to run MCTS tasks
└── run.sh                     # Shell script to run MCTS on all config files
```

Key directories and files:
- `config_files/`: Contains JSON configuration files for different MCTS tasks (numbered 0.json, 1.json, etc.)
- `webMCTS/`: Core implementation of the WebMCTS algorithm
- `utils/`: Helper modules for various functions like observation optimization, LLM querying, and trajectory processing
- `models/`: Model-related code for the world model, reward model, and policy agents
- `run.py`: Main entry point for running MCTS tasks
- `run.sh`: Shell script to automatically run MCTS on all configuration files
- `merge.py`: Script to process and merge the generated MCTS data into valuable and traceable trajectories

## Data Collection (MCTS)

1. **Clone the GitHub Repository:**
   ```
   git clone https://github.com/LucusFigoGao/WebSynthesis.git
   ```

2. **Collection:**
   ```bash
   cd WebMCTS
   bash run.sh          # run MCTS
   pyhton merge.py      # merge the data
   ```

## Data Resources 

### UI Fundamental Understanding

|   Model Name    |                           Base Model                                            |                           Training Data                                            |                           LoRA                            |
| :-------------: | :-------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------: | :---------------------------------------------------------: |
| TextUI-Cap-7B | [Qwen2.5-Instruct-7B](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)            | [TextUI-dense-caption-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/blob/main/textui-caption2k.json) | [🤗 link](https://huggingface.co/yifeigao/WebSynthesis/tree/main/TextUI-Cap-7B)  |
| TextUI-Func-7B | [Qwen2.5-Instruct-7B](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) | [TextUI-functionality-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/blob/main/textui-function6k.json) | [🤗 link](https://huggingface.co/yifeigao/WebSynthesis/tree/main/TextUI-Func-7B)  |
| TextUI-Trans-7B | [Qwen2.5-Instruct-7B](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)            | [TextUI-state-transition-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/blob/main/textui-transmission7k.json) | [🤗 link](https://huggingface.co/yifeigao/WebSynthesis/tree/main/TextUI-Trans-7B)  |

### UI Behavior Cloning
|   Model Name    |                           Base Model                                            |                           Training Data                                            |                           LoRA                            |
| :-------------: | :-------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------: | :---------------------------------------------------------: |
| WebSynthesis-7B | [Qwen2.5-Instruct-7B](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)            | [WebSynthesis-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/tree/main/websynthesis.json) | [🤗 coming soon](https://huggingface.co/yifeigao/WebSynthesis)  |
| OS-Genesis-TextUI-7B | [Qwen2.5-Instruct-7B](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) | [OS-Genesis-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/tree/main/os_genesis_sft7k.json) | [🤗 coming soon](https://huggingface.co/yifeigao/WebSynthesis)  |

### Collection of World Model State Transition
|   Model Name    |                           Base Model                                            |                           Training Data                                            |                           LoRA                            |
| :-------------: | :-------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------: | :---------------------------------------------------------: |
| World Model 7B | [Qwen2.5-Instruct-7B](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)            | [world-model-training-data](https://huggingface.co/datasets/yifeigao/WebSynthesis/blob/main/world-model-training-data-27k.json) | [🤗 coming soon](https://huggingface.co/yifeigao/WebSynthesis/)  |

🙏 Many thanks to the following open-source projects for their raw data contributions:
- [ICLR'25] [OS-Genesis-Web-Data](https://huggingface.co/datasets/OS-Copilot/OS-Genesis-web-data)
- [ACL'25] [AgentTrek-Web-Data](https://huggingface.co/datasets/xlangai/AgentTrek)
- [ICLR'24] [WebArena-Web-Data](https://github.com/web-arena-x/webarena/tree/main/resources)

## Main Experiment
![main-exp](./figure/main-exp.png)

## Ablation Studies
![aba-exp](./figure/ablation.png)

## Scaling Analysis
![scaling-exp](./figure/scaling.jpg)

## Case Study on WebMCTS
![case](./figure/case_webmcts.jpg)

## Case Study on World Model
![case](./figure/world_model.jpg)


## Citation 📖

🫶 If you are interested in our work or find this repository / our data helpful, please consider using the following citation format when referencing our paper:

```bibtex
@misc{gao2025websynthesisworldmodelguidedmctsefficient,
   title={WebSynthesis: World-Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis}, 
   author={Yifei Gao and Junhong Ye and Jiaqi Wang and Jitao Sang},
   year={2025},
   eprint={2507.04370},
   archivePrefix={arXiv},
   primaryClass={cs.AI},
   url={https://arxiv.org/abs/2507.04370}, 
}
```