# llm_simple **Repository Path**: wangcaho/llm_simple ## Basic Information - **Project Name**: llm_simple - **Description**: No description available - **Primary Language**: C++ - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2026-04-02 - **Last Updated**: 2026-04-02 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # Orange Pi LLM推理 支持OrangePi LLM推理,当前测试硬件版本: Orange Pi 20T 24GB ## 安装 [安装文档](orangepi_install.md) ## QWen2 支持*Qwen2ForCausalLM*模型 ### 模型下载 建议通过git直接从modelscope下载(需要安装git lfs),比如DeepSeek-R1-Distill-Qwen-1.5B: ```git clone https://www.modelscope.cn/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.git``` ### 权重转换 #### BF16模型 (以Qwen2.5-3B-Instruct为例,请将路径替换为自己的路径) ```python3 /data/llm_simple/scripts/convert_qwen2_weight.py --input_model_path /ssd/models/Qwen2.5-3B-Instruct --output_dir /ssd/models/Qwen2.5-3B-Instruct_converted ``` #### AWQ模型 (以Qwen2.5-14B-Instruct-AWQ为例,请将路径替换为自己的路径) ```python3 /data/llm_simple/scripts/convert_qwen2_awq_weight.py --input_model_path /ssd/models/Qwen2.5-14B-Instruct-AWQ --output_dir /ssd/models/Qwen2.5-14B-Instruct-AWQ_converted``` ### 运行 *请将脚本中的路径改为自己的路径* ```bash scripts/example_text_completion_deepseek_r1_qwen2.5_1.5B_bf16_orangepi.sh``` ### 性能(输入256token/输出256token) |模型大小|ttft(ms)|decode(ms/token)| |---|---|---| |1.5B|461|142| |3B|776|284| |7B|3215|881| |3B-AWQ|3215|113| |7B-AWQ|2358|206 |14B-AWQ|8181|653| ## LLAMA2 ### 权重转换 #### LLAMA2-7B FP16 (支持llama官方发布的格式, 包含tokenizer.model,params.json,consolidated.00.pth文件) ```python3 scripts/convert_llama2_weight.py --input_dir --model_size 7B --output_dir ``` #### LLAMA2-7B-AWQ 4bit 权重下载链接:[model.safetensors](https://huggingface.co/TheBloke/Llama-2-7B-AWQ/blob/main/model.safetensors) ```python3 scripts/convert_llama_awq_4bit.py --input_safetensor --output_dir ``` #### LLAMA2-13B-AWQ 4bit 权重下载链接:[model.safetensors](https://huggingface.co/TheBloke/Llama-2-13B-AWQ/resolve/main/model.safetensors) ```python3 scripts/convert_llama_awq_4bit.py --input_safetensor --output_dir ``` ### 运行 *请将转化后的权重文件夹,配置文件, tokenizer文件拷贝到设备上并修改bash文件中对应的路径* 1. ```bash scripts/example_chat_llama2_7B_fp16_orangepi.sh``` 2. ```bash scripts/example_text_completion_llama2_7B_fp16_orangepi.sh``` 3. ```bash scripts/example_chat_llama2_7B_awq_4bit_orangepi.sh``` 4. ```bash scripts/example_text_completion_llama2_7B_awq_4bit_orangepi.sh``` 5. ```bash scripts/example_chat_llama2_13B_awq_4bit_orangepi.sh``` 6. ```bash scripts/example_text_completion_llama2_13B_awq_4bit_orangepi.sh``` ### 性能 |场景|ttft(ms)|decode(ms/token)| |---|---|---| |llama2-7B-AWQ-4bit|886|176.7| |llama2-7B-FP16|4498|568.4| |llama2-13B-AWQ-4bit|1819|320.1|