After comparing several algorithms and considering how each could be implemented, I identified Rainbow (Hessel et al., 2017) and Soft Actor-Critic (Haarnoja et al., 2019) as the most in-demand recent algorithms, and ones whose implementation in mlpack would be crucial.

I therefore propose adding them to the existing RL codebase. Here is what I expect to have accomplished by the end of the summer.

  • Improving the current QLearning implementation.
  • Implementing Rainbow as an improvement on DQN. This would include adding the following as extensions:
    • Dueling DQN
    • Noisy DQN
    • Categorical DQN
    • N-step DQN
  • Writing test cases for each of the implementations
  • Implementing Soft Actor-Critic (SAC) for continuous action spaces, along with its tests
  • Writing detailed documentation for all of the above implementations
  • Creating the environments needed to test the above algorithms properly (after discussing with mentors)
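
To make two of the Rainbow extensions listed above concrete, here is a minimal, standalone C++ sketch of the N-step return and the dueling aggregation. It is only an illustration under stated assumptions: the function names NStepReturn and DuelingQ are hypothetical, it uses the standard library only, and it does not reflect mlpack's actual API or the way the final implementation will be structured.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of mlpack): the N-step return
//   G = r_0 + gamma * r_1 + ... + gamma^(n-1) * r_(n-1) + gamma^n * bootstrapValue,
// where bootstrapValue stands for the value estimate of the state reached
// after the n rewards (for example, max_a Q(s_n, a)).
double NStepReturn(const std::vector<double>& rewards,
                   const double gamma,
                   const double bootstrapValue)
{
  assert(gamma >= 0.0 && gamma <= 1.0);

  // Accumulate backwards so each step needs one multiply and one add.
  double g = bootstrapValue;
  for (auto it = rewards.rbegin(); it != rewards.rend(); ++it)
    g = *it + gamma * g;
  return g;
}

// Hypothetical helper (not part of mlpack): the dueling aggregation
//   Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a'),
// which combines a scalar state value with per-action advantages.
std::vector<double> DuelingQ(const double stateValue,
                             const std::vector<double>& advantages)
{
  if (advantages.empty())
    return {};

  const double mean =
      std::accumulate(advantages.begin(), advantages.end(), 0.0) /
      static_cast<double>(advantages.size());

  std::vector<double> q(advantages.size());
  for (std::size_t i = 0; i < advantages.size(); ++i)
    q[i] = stateValue + advantages[i] - mean;
  return q;
}
```

In this sketch, three rewards of 1.0 with gamma = 0.5 and a bootstrap value of 4.0 give an N-step return of 2.25, and a state value of 2.0 with advantages {1.0, 3.0} gives Q-values {1.0, 3.0}. Subtracting the mean advantage is the aggregation used in the dueling architecture to keep the value and advantage streams identifiable.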



Nishant Kumar


  • Rahul Prabhu
  • Marcus Edel