# DRL

**Repository Path**: wolf953/DRL

## Basic Information

- **Project Name**: DRL
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2021-01-26
- **Last Updated**: 2022-03-21

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Deep Reinforcement Learning

1. **Overview.**

    * Reinforcement Learning [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_1.pdf)] [[lecture note](https://github.com/wangshusen/DeepLearning/blob/master/LectureNotes/DRL/DRL.pdf)] [[Video (in Chinese)](https://youtu.be/vmkRMvhCW5c)].

    * Value-Based Learning [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_2.pdf)] [[Video (in Chinese)](https://youtu.be/jflq6vNcZyA)].

    * Policy-Based Learning [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_3.pdf)] [[Video (in Chinese)](https://youtu.be/qI0vyfR2_Rc)].

    * Actor-Critic Methods [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_4.pdf)] [[Video (in Chinese)](https://youtu.be/xjd7Jq9wPQY)].

    * AlphaGo [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_5.pdf)] [[Video (in Chinese)](https://youtu.be/zHojAp5vkRE)].

2. **TD Learning.**

    * Sarsa [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/2_TD_1.pdf)] [[Video (in Chinese)](https://youtu.be/-cYWdUubB6Q)].

    * Q-learning [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/2_TD_2.pdf)] [[Video (in Chinese)](https://youtu.be/Ymy2w3DGn2U)].

    * Multi-Step TD Target [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/2_TD_3.pdf)] [[Video (in Chinese)](https://youtu.be/UqTP138IATc)].

3. **Advanced Topics on Value-Based Learning.**

    * Experience Replay (ER) & Prioritized ER [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/3_DQN_1.pdf)] [[Video (in Chinese)](https://youtu.be/rhslMPmj7SY)].

    * Overestimation, Target Network, & Double DQN [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/3_DQN_2.pdf)] [[Video (in Chinese)](https://youtu.be/X2-56QN79zc)].

    * Dueling Networks [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/3_DQN_3.pdf)] [[Video (in Chinese)](https://youtu.be/DBux6cA0EoM)].

4. **Policy Gradient with Baseline.**

    * Policy Gradient with Baseline [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_1.pdf)] [[Video (in Chinese)](https://youtu.be/yNEqbptitZs)].

    * REINFORCE with Baseline [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_2.pdf)] [[Video (in Chinese)](https://youtu.be/Ob78ADXTQNo)].

    * Advantage Actor-Critic (A2C) [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_3.pdf)] [[Video (in Chinese)](https://youtu.be/mtT4TSGSon8)].

    * REINFORCE versus A2C [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_4.pdf)] [[Video (in Chinese)](https://youtu.be/hN9WMIMMeAI)].

5. **Advanced Topics on Policy-Based Learning.**

    * Trust-Region Policy Optimization (TRPO).

    * Policy Network + RNNs.

6. **Dealing with Continuous Action Space.**

    * Discrete versus Continuous Control [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/6_Continuous_1.pdf)].

    * Deterministic Policy Gradient (DPG) for Continuous Control [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/6_Continuous_2.pdf)].

    * Stochastic Policy Gradient for Continuous Control [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/6_Continuous_3.pdf)].

7. **Multi-Agent Reinforcement Learning.**

    * Basics and Challenges [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/7_MARL_1.pdf)] [[Video (in Chinese)](https://youtu.be/KN-XMQFTD0o)].

    * Centralized VS Decentralized [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/7_MARL_2.pdf)] [[Video (in Chinese)](https://youtu.be/0HV1hsjd1y8)].

8. **Imitation Learning.**

    * Inverse Reinforcement Learning.

    * Generative Adversarial Imitation Learning (GAIL).