# web-rwkv-py

**Repository Path**: sallon/web-rwkv-py

## Basic Information

- **Project Name**: web-rwkv-py
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-05-10
- **Last Updated**: 2025-05-10

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Web-RWKV-Py
Python binding for [`web-rwkv`](https://github.com/cryscan/web-rwkv).

# Todos
- [x] Basic V5 inference support
- [x] Support V4, V5, V6 and V7
- [ ] Batched inference

# Usage
1. Install python and rust.
2. Install maturin by

   ```bash
   $ pip install maturin
   ```
4. Build and install:

   ```bash
   $ maturin develop --release
   ```

5. Try using `web-rwkv` in python:

   ```python
   import web_rwkv_py as wrp

   model = wrp.Model(
      "/path/to/model.st", # model path
      quant=0,             # int8 quantization layers
      quant_nf4=0,         # nf4 quantization layers
      quant_sf4=0,         # sf4 quantization layers
   )
   model.clear_state()
   logits = model.run([114, 514])
   ```
   
# Advanced Usage
1. Get, clone and load current state:

   ```python
   logits = model.run([114, 514])
   state = model.back_state(wrp.StateDevice.Gpu)
   # state = model.back_state(wrp.StateDevice.Cpu)
   state_cloned = state.deep_clone()

   model.load_state(state_cloned)
   logits = model.run([1919, 810])
   ```
   
2. Return predictions of all tokens (not only the last's):

   ```python
   logits, state = model.run_full([114, 514, 1919, 810], state=None)
   assert(len(logits) == 4)
   ```