After comparing several algorithms and thinking through how each could be implemented, I identified Rainbow (Hessel et al., 2017) and Soft Actor-Critic (Haarnoja et al., 2019) as the most in-demand recent algorithms, and ones whose implementation in mlpack would be crucial.
Thus, I propose to add them to the existing RL codebase. Here is what I expect to have accomplished by the end of the summer:
- Improving the current QLearning implementation.
- Implementing Rainbow as an improvement on DQN. This would include adding the following as extensions:
  - Dueling DQN
  - Noisy DQN
  - Categorical DQN
  - N-step DQN
- Writing test cases for each of the implementations
- Implementing Soft Actor-Critic (SAC) for continuous action spaces, along with its tests
- Creating detailed docs for all the above implementations
- Creating the environments needed to properly test the algorithms above (after discussing with mentors)
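
To give a rough sense of what two of the Rainbow extensions listed above involve, here is a minimal sketch in plain C++. It assumes nothing about mlpack's actual API: the function names `DuelingQValues` and `NStepTarget` are hypothetical, chosen only for illustration, and a real implementation would work on network outputs and replay-buffer samples rather than on `std::vector<double>`. The first function shows the usual dueling aggregation, Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)); the second shows an n-step bootstrapped target.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper (not an mlpack function).
// Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a').
std::vector<double> DuelingQValues(const double stateValue,
                                   const std::vector<double>& advantages)
{
  assert(!advantages.empty());

  const double meanAdvantage =
      std::accumulate(advantages.begin(), advantages.end(), 0.0) /
      static_cast<double>(advantages.size());

  std::vector<double> qValues;
  qValues.reserve(advantages.size());
  for (const double advantage : advantages)
    qValues.push_back(stateValue + advantage - meanAdvantage);

  return qValues;
}

// Hypothetical helper (not an mlpack function).
// N-step target: r_0 + g * r_1 + ... + g^(n-1) * r_(n-1) + g^n * bootstrap,
// where g is the discount factor and bootstrap is the value estimate of the
// state reached after the n rewards.
double NStepTarget(const std::vector<double>& rewards,
                   const double discount,
                   const double bootstrapValue)
{
  // Fold from the last reward back to the first, so that each step applies
  // one more factor of the discount to everything that comes after it.
  double target = bootstrapValue;
  for (auto it = rewards.rbegin(); it != rewards.rend(); ++it)
    target = *it + discount * target;

  return target;
}
```

For example, `DuelingQValues(1.0, {1.0, 2.0, 3.0})` subtracts the mean advantage of 2.0 from each entry before adding the state value, and `NStepTarget({1.0, 1.0}, 0.5, 4.0)` combines two rewards with a bootstrapped value of 4.0. Keeping such pieces as small, separately testable units is one possible way to structure the test cases mentioned above.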