Shogun is a powerful machine learning toolkit. The project has a long history and a huge codebase, parts of which are outdated and poorly designed. Cleaning up the codebase and bringing Shogun up to modern design standards will make it much easier for developers to work with, and in turn make the project more attractive for scientists to implement their work in. This GSoC project aims to redesign Shogun’s data representation and some of its APIs, including features, labels, and preprocessors, and to introduce new untemplated data classes with support for lazy evaluation. By the end of the project, we expect the Shogun codebase to be more maintainable, more stable, and more elegant.



Wuwei Lin


  • Heiko Strathmann
  • Viktor Gal