The mlr package is a comprehensive machine learning toolkit for R. It provides a standardized interface to over sixty machine learning R packages, together with a wide range of features for visualization, data manipulation, model evaluation and selection, and parameter tuning. Although mlr can perform automatic data preprocessing when applying a machine learning algorithm, the current implementation is limited in scope and functionality. This project seeks to extend mlr's capability in this regard by developing a supplementary package, mlrCPO, with an API that gives the user more flexibility and access to a wider range of preprocessing methods. The project introduces a first-class CPO ("Composable Preprocessing Operator") object that represents a particular data transformation procedure and that can be chained into pipelines using the composition operator %>>%. CPO classes are provided for many of the most widely used preprocessing methods.
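
To make the idea of a composable operator concrete, here is a minimal base-R sketch of how a composition operator of this kind could behave. It is a toy illustration under stated assumptions, not the mlrCPO implementation: `toy_op`, `center`, `unit_sd`, and `standardize` are hypothetical names introduced only for this example, and the `%>>%` defined here is a stand-in that mimics the operator's spirit (operator with operator gives a pipeline, data with operator gives transformed data).

```r
# Toy sketch only: `toy_op` and the helpers below are hypothetical names,
# not part of mlr or mlrCPO.

# An operator is a named function from data.frame to data.frame.
toy_op <- function(name, fn) {
  structure(list(name = name, fn = fn), class = "toy_op")
}

# Composition operator (stand-in, defined here for illustration):
#   op   %>>% op  yields a new, combined operator;
#   data %>>% op  applies the operator to the data.
`%>>%` <- function(lhs, rhs) {
  stopifnot(inherits(rhs, "toy_op"))
  if (inherits(lhs, "toy_op")) {
    toy_op(
      paste(lhs$name, rhs$name, sep = " >> "),
      function(data) rhs$fn(lhs$fn(data))
    )
  } else {
    rhs$fn(lhs)
  }
}

# Two simple column-wise transformations on numeric data.
center <- toy_op("center", function(data) {
  data[] <- lapply(data, function(x) x - mean(x))
  data
})
unit_sd <- toy_op("unit_sd", function(data) {
  data[] <- lapply(data, function(x) x / sd(x))
  data
})

# Build a pipeline once, then apply it to the numeric columns of iris.
standardize <- center %>>% unit_sd
result <- iris[1:4] %>>% standardize
```

Because the pipeline is itself an operator object, it can be stored, inspected (`standardize$name`), and reused on other data, which is the property that treating preprocessing steps as first-class objects is meant to provide.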




  • Lars Kotthoff
  • Bernd Bischl