# BCQ

**Repository Path**: lpf521824/BCQ

## Basic Information

- **Project Name**: BCQ
- **Description**: Author's PyTorch implementation of BCQ for continuous and discrete actions
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-01-25
- **Last Updated**: 2021-01-25

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Batch-Constrained Deep Q-Learning (BCQ)

Batch-Constrained deep Q-learning (BCQ) is the first batch deep reinforcement learning algorithm, designed to learn offline, without interacting with the environment. BCQ was first introduced in our [ICML 2019 paper](https://arxiv.org/abs/1812.02900), which focused on continuous action domains. A discrete-action version of BCQ was introduced in a follow-up [NeurIPS 2019 Deep RL workshop paper](https://arxiv.org/abs/1910.01708). Code for each of these algorithms can be found under its corresponding folder.

### Bibtex

```
@inproceedings{fujimoto2019off,
  title={Off-Policy Deep Reinforcement Learning without Exploration},
  author={Fujimoto, Scott and Meger, David and Precup, Doina},
  booktitle={International Conference on Machine Learning},
  pages={2052--2062},
  year={2019}
}
```

```
@article{fujimoto2019benchmarking,
  title={Benchmarking Batch Deep Reinforcement Learning Algorithms},
  author={Fujimoto, Scott and Conti, Edoardo and Ghavamzadeh, Mohammad and Pineau, Joelle},
  journal={arXiv preprint arXiv:1910.01708},
  year={2019}
}
```
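To illustrate the "batch-constrained" idea in its simplest form, below is a minimal NumPy sketch of the discrete-action selection rule from the workshop paper: the greedy argmax over Q-values is restricted to actions that the behaviour policy (as estimated by a generative/imitation model) assigns sufficient probability. The function name, the threshold value, and the use of plain arrays in place of network outputs are illustrative assumptions, not the repository's actual API.

```python
import numpy as np

def bcq_select_action(q_values, imitation_probs, tau=0.3):
    """Discrete BCQ action selection (sketch).

    Only actions whose imitation probability is at least `tau` times
    that of the most likely action are eligible; among those, the
    action with the highest Q-value is chosen. This keeps the learned
    policy close to the actions actually present in the batch.
    """
    q = np.asarray(q_values, dtype=float)
    probs = np.asarray(imitation_probs, dtype=float)
    # Mask out actions the batch (behaviour) policy rarely took.
    allowed = probs / probs.max() >= tau
    # Greedy argmax over Q restricted to the allowed actions.
    masked_q = np.where(allowed, q, -np.inf)
    return int(np.argmax(masked_q))

# Hypothetical example: action 1 has the highest Q-value but was
# almost never taken in the batch, so it is filtered out.
action = bcq_select_action(
    q_values=[1.0, 5.0, 2.0],
    imitation_probs=[0.6, 0.05, 0.35],
)
print(action)  # → 2
```

In the continuous-action version the same constraint is realised differently (a VAE generates candidate actions near the batch distribution and a perturbation network adjusts them), but the underlying principle, restricting the maximisation to in-distribution actions, is the same.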