CartPole Reinforcement Learning

Comparing Monte Carlo, Q-Learning and SARSA algorithms

Tech Stack: Python, OpenAI Gym, NumPy
Categories: Reinforcement Learning

Environment Details

Actions

  • Discrete (2 actions)
  • 0: Push left
  • 1: Push right

Reward

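  • +1 for each step the episode continues
  • Episode capped at 200 steps (maximum return of 200)
  • Solved at an average reward of 195 or more over 100 consecutive episodes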

Termination

  • Pole angle beyond ±12° from vertical
  • Cart position beyond ±2.4 from the center
  • Episode length reaches 200 steps
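
The rules listed above can be written down as two small checks. This is a minimal sketch in plain Python, not the project's code: the function and constant names are illustrative, and the 100-episode averaging window is the usual Gym convention for CartPole-v0 rather than something stated on this page.

```python
import math

# Thresholds taken from the lists above; the names are illustrative.
ANGLE_LIMIT_RAD = math.radians(12)  # pole angle limit of 12 degrees, in radians
POSITION_LIMIT = 2.4                # cart position limit
MAX_STEPS = 200                     # episode step cap


def episode_over(cart_position, pole_angle_rad, steps_taken):
    """Return True if any of the three listed conditions ends the episode."""
    return (
        abs(pole_angle_rad) > ANGLE_LIMIT_RAD
        or abs(cart_position) > POSITION_LIMIT
        or steps_taken >= MAX_STEPS
    )


def is_solved(episode_returns, window=100, threshold=195.0):
    """Check the 'solved' criterion on the most recent `window` episodes.

    The window length is an assumption (the common CartPole-v0 convention).
    """
    if len(episode_returns) < window:
        return False
    recent = episode_returns[-window:]
    return sum(recent) / window >= threshold
```

Because the reward is +1 per step, an episode's return equals its length, so `is_solved` can be fed a list of episode lengths directly.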

Implemented Algorithms

Monte Carlo
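
The page names this algorithm without describing it. As a rough illustration, here is a first-visit Monte Carlo update applied to one finished episode. It assumes a tabular action-value function stored in a dict keyed by (state, action); the function name, the dict layout, and the toy episode are hypothetical and not taken from the project.

```python
from collections import defaultdict


def mc_update(q, visit_counts, episode, gamma=1.0):
    """First-visit Monte Carlo update from one finished episode.

    episode: list of (state, action, reward) tuples in time order.
    q, visit_counts: dicts keyed by (state, action).
    """
    # Record the first time step at which each (state, action) pair occurs.
    first_visit = {}
    for t, (state, action, _) in enumerate(episode):
        first_visit.setdefault((state, action), t)

    g = 0.0
    # Walk backwards so that g holds the discounted return from step t onward.
    for t in range(len(episode) - 1, -1, -1):
        state, action, reward = episode[t]
        g = reward + gamma * g
        if first_visit[(state, action)] == t:
            visit_counts[(state, action)] += 1
            n = visit_counts[(state, action)]
            # Incremental mean of the returns observed for this pair.
            q[(state, action)] += (g - q[(state, action)]) / n
    return q


# Toy episode with placeholder state labels; every step pays +1 as in CartPole.
q = defaultdict(float)
counts = defaultdict(int)
episode = [("s0", 1, 1.0), ("s1", 0, 1.0), ("s0", 1, 1.0)]
mc_update(q, counts, episode)
```

Monte Carlo waits until the episode ends before updating, which is why the function takes a whole episode rather than a single transition.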

Q-Learning
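
No update rule is given on the page, so the following is only a sketch of the standard one-step Q-learning update, assuming a tabular Q-function in a dict keyed by (state, action). The function name and the default values of `alpha` and `gamma` are illustrative, not the project's settings.

```python
from collections import defaultdict

N_ACTIONS = 2  # 0: push left, 1: push right


def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """One off-policy TD update that bootstraps from the best next action."""
    if done:
        # No future value after the final step of an episode.
        target = reward
    else:
        best_next = max(q[(next_state, a)] for a in range(N_ACTIONS))
        target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]
```

The `max` over next actions is what makes the method off-policy: the target ignores which action the behavior policy actually takes next.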

SARSA
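
As with the other two algorithms, the page gives no details here. The sketch below shows the standard one-step SARSA update together with an epsilon-greedy action picker, assuming a tabular Q-function in a dict keyed by (state, action). All names and default hyperparameters are illustrative assumptions.

```python
import random
from collections import defaultdict

N_ACTIONS = 2  # 0: push left, 1: push right


def epsilon_greedy(q, state, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: q[(state, a)])


def sarsa_update(q, state, action, reward, next_state, next_action, done,
                 alpha=0.1, gamma=0.99):
    """One on-policy TD update that bootstraps from the action actually chosen next."""
    if done:
        target = reward
    else:
        target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]
```

The only difference from the Q-learning target is that SARSA uses the value of `next_action`, the action the policy will really take, instead of the maximum over actions.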

Technical Implementation
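
The page does not say how the continuous CartPole observation is represented. If the three algorithms are tabular, which the tech stack suggests but does not confirm, the observation has to be mapped to discrete bins first. The sketch below shows one possible binning scheme in plain Python. The helper names and bin counts are hypothetical. Only the position and angle bounds come from the environment details above; cart and pole velocities would need clipping ranges chosen by the implementer.

```python
import bisect
import math


def make_edges(low, high, n_bins):
    """Return the n_bins - 1 interior bin edges, evenly spaced in [low, high]."""
    width = (high - low) / n_bins
    return [low + width * i for i in range(1, n_bins)]


def discretize(observation, all_edges):
    """Map each continuous component to a bin index and return a hashable tuple."""
    return tuple(
        bisect.bisect_right(edges, value)
        for value, edges in zip(observation, all_edges)
    )


# Example using only the two bounded quantities listed on the page.
position_edges = make_edges(-2.4, 2.4, 6)
angle_edges = make_edges(-math.radians(12), math.radians(12), 6)
state = discretize((0.1, -0.3), (position_edges, angle_edges))
```

Values outside the range fall into the first or last bin, so the resulting tuple can always be used as a dict key for a Q-table.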
