Much recent RL research on continuous action spaces has focused on policy gradient algorithms and actor-critic architectures. A quadrotor is (i) an easy-to-understand mobile robot platform whose (ii) ...