# DRL_udacity

**Repository Path**: prg/DRL_udacity

## Basic Information

- **Project Name**: DRL_udacity
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-05-14
- **Last Updated**: 2021-05-14

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# DRL_udacity
**Welcome to the world of Deep Reinforcement Learning!**  
This is an explainable, modified version of the Udacity DRL homework.  

- DQN: modified from the Udacity repo, tested on the Breakout-v0 env.  
- PPO: written by myself, tested on the Pendulum-v0 and BipedalWalker-v2 envs.  
- Policy gradient: REINFORCE with baseline and entropy loss, tested on CartPole-v0.  
- Monte Carlo: modified version, tested on the Blackjack env.  
- Temporal Difference: modified version, tested on CliffWalking-v0.
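
As a rough illustration of the "REINFORCE with baseline" item above, the sketch below computes discounted returns-to-go for one episode and subtracts a baseline to get the weights that would multiply the log-probabilities in the policy-gradient loss. It is not taken from this repo: the function names `discounted_returns` and `advantages_with_mean_baseline` are hypothetical, and using the episode mean as the baseline is an assumption (the actual code may use a learned value function instead). The entropy term is omitted.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages_with_mean_baseline(returns: List[float]) -> List[float]:
    """Subtract the episode-mean return (an assumed, simple baseline) from each G_t."""
    if not returns:
        return []
    baseline = sum(returns) / len(returns)
    return [g - baseline for g in returns]


if __name__ == "__main__":
    episode_rewards = [1.0, 1.0, 1.0]  # e.g. three CartPole-style steps
    g = discounted_returns(episode_rewards, gamma=0.5)
    print(g)  # [1.75, 1.5, 1.0]
    print(advantages_with_mean_baseline(g))
```

Subtracting a baseline leaves the expected gradient unchanged but usually lowers its variance, which is why steps with above-average return get positive weight and the rest get negative weight.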