# har

**Repository Path**: davidfrz123/har

## Basic Information

- **Project Name**: har
- **Default Branch**: master

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-03-05
- **Last Updated**: 2026-03-09

## README

# HarnessRL: Learning to Harness LLM Agents via Reinforcement Learning

HarnessRL is a reinforcement learning framework that trains large language models to operate as autonomous software engineering agents. The system uses Group Relative Policy Optimization (GRPO) with execution-verified rewards from sandboxed Docker environments, enabling LLMs to learn multi-step tool-use trajectories for real-world coding tasks drawn from SWE-bench and similar benchmarks.
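The group-relative part of GRPO can be illustrated with a minimal sketch: several rollouts are sampled for the same task, and each rollout's reward is normalized against the mean and standard deviation of its own group. The function name `group_relative_advantages` and the `eps` parameter are illustrative assumptions, not part of the HarnessRL API, and the repository's actual implementation may differ (for example in how it handles clipping or KL regularization).

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of rollouts for the same task.

    Hypothetical helper for illustration only; not taken from the repository.
    Each advantage is (reward - group mean) / (group std + eps).
    """
    if not rewards:
        raise ValueError("rewards must contain at least one value")
    mean = fmean(rewards)
    std = pstdev(rewards)  # population standard deviation over the group
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Four rollouts of one task: two passed the tests (1.0), two failed (0.0).
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
    # If every rollout gets the same reward, all advantages are zero,
    # so that group contributes no learning signal.
    print(group_relative_advantages([1.0, 1.0, 1.0]))
```

With binary rewards, successful rollouts receive positive advantages and failed ones negative advantages, which is why a group needs a mix of outcomes to be informative.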

## Quick Start

### Installation

```bash
# Clone and install
git clone https://github.com/your-org/harnessrl.git
cd harnessrl
pip install -e ".[dev]"
```

### Configuration

```bash
# Copy environment template and fill in your keys
cp .env.example .env

# Configs live in configs/ -- override via Hydra CLI or edit YAML directly
```

### Run

```bash
# 1. Collect trajectories from sandbox environments
make collect-data CONFIG="--config-name=collect"

# 2. Launch GRPO training
make train CONFIG="--config-name=train_grpo"

# 3. Evaluate the trained checkpoint
make evaluate CONFIG="--config-name=evaluate"
```
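As a rough picture of what an execution-verified reward could look like for a single collected trajectory, the sketch below runs a test command and maps its exit code to a binary reward. This is an assumption made for illustration: the name `execution_reward`, the binary scoring, and the timeout are hypothetical, and the sketch runs the command as a local subprocess rather than inside one of the project's Docker sandboxes.

```python
import subprocess
import sys


def execution_reward(test_cmd, workdir=None, timeout_s=60.0):
    """Return 1.0 if the test command exits with code 0, otherwise 0.0.

    Hypothetical helper for illustration; a real setup would run the
    command inside an isolated sandbox instead of the local machine.
    """
    try:
        result = subprocess.run(
            test_cmd,
            cwd=workdir,
            capture_output=True,
            timeout=timeout_s,
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung or unlaunchable test run counts as a failure.
        return 0.0
    return 1.0 if result.returncode == 0 else 0.0


if __name__ == "__main__":
    # Stand-ins for a real test suite: one command succeeds, one fails.
    passing = [sys.executable, "-c", "raise SystemExit(0)"]
    failing = [sys.executable, "-c", "raise SystemExit(1)"]
    print(execution_reward(passing), execution_reward(failing))
```

Treating timeouts and launch errors as zero reward keeps the signal conservative: a trajectory is only rewarded when the tests demonstrably pass.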

## Project Structure

```
har_code/
  configs/          # Hydra YAML configuration files
  docker/           # Dockerfiles for sandbox environments
  harnessrl/        # Core library (rewards, rollout, sandbox, data, utils)
  scripts/          # Entry-point scripts for training, evaluation, data collection
  tests/            # Unit and integration tests
  notebooks/        # Analysis and visualization notebooks
```

## Citation

```bibtex
@inproceedings{harnessrl2025,
  title     = {HarnessRL: Learning to Harness LLM Agents via Reinforcement Learning},
  author    = {TODO},
  booktitle = {TODO},
  year      = {2025},
}
```