# CMoE

**Repository Path**: philip7/CMoE

## Basic Information

- **Project Name**: CMoE
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-12-17
- **Last Updated**: 2025-12-17

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

Implementation for the paper [CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference](https://arxiv.org/abs/2502.04416).

## Dependencies

```bash
conda create -n cmoe python=3.11
conda activate cmoe
conda install pytorch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 pytorch-cuda=12.4 -c pytorch -c nvidia
pip install datasets==2.21.0
pip install transformers==4.47.1
pip install accelerate==1.2.1
pip install sentencepiece==0.2.0
pip install protobuf==5.29.2
pip install matplotlib==3.10.0
pip install lap==0.5.12
pip install peft==0.14.0
```

Note: you may need to adjust some package versions to match your own environment.

## Quick Start

Download the models from [Huggingface](https://huggingface.co/), then run `run_cmoe.py`. Set `MODEL_PATH` to the path of the downloaded model.

You can run the pre-defined testing script `run.sh` with:

```bash
bash run.sh
```

Alternatively, set the hyperparameters yourself to run a custom configuration. For example, to run S2A2E16 (`--nshared 2`, `--nactivated 2`, `--nexperts 16`) with 2,048 fine-tuning samples on wikitext2:

```bash
python run_cmoe.py $MODEL_PATH wikitext2 \
    --nshared 2 \
    --nactivated 2 \
    --nexperts 16 \
    --nsamples 2048 \
    --extra-lr 0.001 --bias-speed 0.001 --new-eval
```

## Evaluation

The code runs perplexity (PPL) evaluation automatically. To also evaluate on downstream tasks, add the `--eval-zero` argument; that evaluation code is implemented by [Wanda](https://github.com/locuslab/wanda).
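For readers unfamiliar with the metric, perplexity is the exponential of the average per-token negative log-likelihood. The sketch below illustrates only that definition; the `perplexity` helper and the sample values are hypothetical and are not taken from this repository's evaluation code.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood) over the evaluated tokens.

    `token_nlls` is a list of per-token negative log-likelihoods (natural log).
    This is a hypothetical helper for illustration, not part of run_cmoe.py.
    """
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Made-up per-token losses, for illustration only.
example_nlls = [2.0, 1.5, 2.5]
print(perplexity(example_nlls))  # mean NLL is 2.0, so this is exp(2.0), about 7.39
```

Lower perplexity means the model assigns higher probability to the evaluation text, which is why it is a common way to compare a converted model against the original dense model.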
## Cite

If you found this work useful, please consider citing:

```
@article{pei2025cmoe,
  title={CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference},
  author={Pei, Zehua and Zou, Lancheng and Zhen, Hui-Ling and Yu, Xianzhi and Liu, Wulong and Pan, Sinno Jialin and Yuan, Mingxuan and Yu, Bei},
  journal={arXiv preprint arXiv:2502.04416},
  year={2025}
}
```