# BCQ

**Repository Path**: lpf521824/BCQ

## Basic Information

- **Project Name**: BCQ
- **Description**: Author's PyTorch implementation of BCQ for continuous and discrete actions
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-01-25
- **Last Updated**: 2021-01-25

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Batch-Constrained Deep Q-Learning (BCQ)

Batch-Constrained deep Q-learning (BCQ) is the first batch deep reinforcement learning algorithm, designed to learn offline, without interacting with the environment. BCQ was first introduced in our [ICML 2019 paper](https://arxiv.org/abs/1812.02900), which focused on continuous action domains. A discrete-action version of BCQ was introduced in a follow-up [NeurIPS 2019 Deep RL workshop paper](https://arxiv.org/abs/1910.01708). Code for each of these algorithms can be found under its corresponding folder.

### Bibtex

```
@inproceedings{fujimoto2019off,
  title={Off-Policy Deep Reinforcement Learning without Exploration},
  author={Fujimoto, Scott and Meger, David and Precup, Doina},
  booktitle={International Conference on Machine Learning},
  pages={2052--2062},
  year={2019}
}
```

```
@article{fujimoto2019benchmarking,
  title={Benchmarking Batch Deep Reinforcement Learning Algorithms},
  author={Fujimoto, Scott and Conti, Edoardo and Ghavamzadeh, Mohammad and Pineau, Joelle},
  journal={arXiv preprint arXiv:1910.01708},
  year={2019}
}
```
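To illustrate the "batch-constrained" idea in its simplest form, below is a minimal NumPy sketch of the discrete-action selection rule from the workshop paper: the greedy argmax over Q-values is restricted to actions that the behaviour policy (as estimated by a generative/imitation model) assigns sufficient probability. The function name, the threshold value, and the use of plain arrays in place of network outputs are illustrative assumptions, not the repository's actual API.

```python
import numpy as np

def bcq_select_action(q_values, imitation_probs, tau=0.3):
    """Discrete BCQ action selection (sketch).

    Only actions whose imitation probability is at least `tau` times
    that of the most likely action are eligible; among those, the
    action with the highest Q-value is chosen. This keeps the learned
    policy close to the actions actually present in the batch.
    """
    q = np.asarray(q_values, dtype=float)
    probs = np.asarray(imitation_probs, dtype=float)
    # Mask out actions the batch (behaviour) policy rarely took.
    allowed = probs / probs.max() >= tau
    # Greedy argmax over Q restricted to the allowed actions.
    masked_q = np.where(allowed, q, -np.inf)
    return int(np.argmax(masked_q))

# Hypothetical example: action 1 has the highest Q-value but was
# almost never taken in the batch, so it is filtered out.
action = bcq_select_action(
    q_values=[1.0, 5.0, 2.0],
    imitation_probs=[0.6, 0.05, 0.35],
)
print(action)  # → 2
```

In the continuous-action version the same constraint is realised differently (a VAE generates candidate actions near the batch distribution and a perturbation network adjusts them), but the underlying principle, restricting the maximisation to in-distribution actions, is the same.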