# DRL_udacity

**Repository Path**: prg/DRL_udacity

## Basic Information

- **Project Name**: DRL_udacity
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-05-14
- **Last Updated**: 2021-05-14

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

**Welcome to the world of Deep Reinforcement Learning!**

This is an explainable, modified version of the Udacity DRL homework.

- DQN: modified from the Udacity repo, tested on the Breakout-v0 env.
- PPO: written from scratch, tested on the Pendulum-v0 and BipedalWalker-v2 envs.
- Policy gradient: REINFORCE with baseline and entropy loss, tested on the CartPole-v0 env.
- Monte Carlo: modified version, tested on the Blackjack env.
- Temporal Difference: modified version, tested on the CliffWalking-v0 env.
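
To make the policy gradient entry above more concrete, here is a minimal sketch of how a REINFORCE loss with a baseline and an entropy term is commonly computed for one episode. It is not taken from this repository: the function names (`discounted_returns`, `entropy`, `reinforce_loss`), the default `gamma` and `entropy_coef` values, and the choice to pass baselines in as plain numbers are all assumptions made for illustration. It uses only the Python standard library, so it shows the arithmetic of the loss rather than the gradient step a deep learning framework would perform.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def reinforce_loss(action_probs, actions, rewards, baselines,
                   gamma=0.99, entropy_coef=0.01):
    """Per-step average of -log pi(a_t|s_t) * (G_t - b_t) minus an entropy bonus.

    action_probs: one list of action probabilities per step (hypothetical
                  policy output); actions: the action index taken at each step;
    baselines:    one baseline value b_t per step, e.g. a value estimate
                  (assumed to be supplied by the caller).
    """
    returns = discounted_returns(rewards, gamma)
    policy_term = 0.0
    entropy_term = 0.0
    for probs, a, g, b in zip(action_probs, actions, returns, baselines):
        advantage = g - b  # subtracting the baseline reduces variance
        policy_term += -math.log(probs[a]) * advantage
        entropy_term += entropy(probs)
    # Subtracting the entropy term rewards more exploratory policies.
    return (policy_term - entropy_coef * entropy_term) / len(rewards)


if __name__ == "__main__":
    probs = [[0.5, 0.5], [0.5, 0.5]]
    print(reinforce_loss(probs, actions=[0, 1], rewards=[1.0, 1.0],
                         baselines=[0.0, 0.0]))
```

In a framework-based implementation the probabilities would be differentiable outputs of a policy network and the loss would be minimized by gradient descent; the baseline would typically be detached from the policy gradient.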