Deep Reinforcement Learning

Stable Baselines relies on TensorFlow 1.x, whereas Stable-Baselines3 relies on PyTorch.

Benchmarks

Best model from CleanRL:

ppo_atari_envpool.py --exp-name a2c --update-epochs 1 --num-minibatches 1 --norm-adv False --num-envs 64 --clip-vloss False --vf-coef 0.25 --anneal-lr False --num-steps 5 --track
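
The experiment name suggests these flags are meant to make the PPO script behave like A2C. A minimal sketch of why that can work, assuming the standard PPO clipped surrogate: with one update epoch and one minibatch, the policy is unchanged when the gradient is computed, so the probability ratio is exactly 1, clipping is inactive, and the gradient matches the plain A2C policy-gradient term. The function names, the clip coefficient of 0.1, and the sample values below are hypothetical and chosen only for illustration. The batch-size arithmetic assumes the batch is num-envs times num-steps.

```python
import math

# Assumed value for illustration; the command above does not set it explicitly.
CLIP_COEF = 0.1


def ppo_policy_loss(new_logp, old_logp, adv, clip_coef=CLIP_COEF):
    """Standard PPO clipped surrogate loss for a single sample."""
    ratio = math.exp(new_logp - old_logp)
    unclipped = -adv * ratio
    clipped = -adv * max(min(ratio, 1 + clip_coef), 1 - clip_coef)
    return max(unclipped, clipped)


def a2c_policy_loss(new_logp, adv):
    """Plain policy-gradient (A2C-style) loss for a single sample."""
    return -adv * new_logp


def numeric_grad(f, x, eps=1e-6):
    """Central finite difference, used here instead of an autodiff library."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Hypothetical (old log-probability, advantage) pairs.
samples = [(-0.7, 1.5), (-2.3, -0.4), (-0.1, 3.0)]

for old_logp, adv in samples:
    # Single epoch, single minibatch: the gradient is taken at new_logp == old_logp.
    g_ppo = numeric_grad(lambda lp: ppo_policy_loss(lp, old_logp, adv), old_logp)
    g_a2c = numeric_grad(lambda lp: a2c_policy_loss(lp, adv), old_logp)
    print(f"adv={adv:+.2f}  dPPO/dlogp={g_ppo:+.6f}  dA2C/dlogp={g_a2c:+.6f}")

# Batch-size arithmetic under the assumption batch = num_envs * num_steps.
num_envs, num_steps, num_minibatches = 64, 5, 1
batch_size = num_envs * num_steps
minibatch_size = batch_size // num_minibatches
print(f"batch_size={batch_size}  minibatch_size={minibatch_size}")
```

Both gradients equal the negative advantage at the point where the two policies coincide, which is the only point a single-epoch, single-minibatch update ever evaluates. Disabling advantage normalization and value-loss clipping removes the other PPO-specific pieces of the objective.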

Fun fact

  • Training on my CPU was faster than on the Colab GPU (for the MinAtar environment), possibly because the observations are low-dimensional and the network is shallow, so there is little work for the GPU to speed up
