Contributor
Parismita Das

Max Margin Interval Trees


Mentors
Torsten Hothorn, Alexandre Drouin
Organization
R project for statistical computing

There are few R packages available for interval regression, a machine learning problem that is important in genomics and medicine. As in ordinary regression, the goal is to learn a function that takes a feature vector as input and outputs a real-valued prediction. Unlike ordinary regression, each output in the training set is an interval of acceptable values rather than a single value. In the terminology of the survival analysis literature, this is regression with “left, right, and interval censored” output/response data.
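To make the data format concrete, here is a minimal sketch of how interval-valued targets could be represented and classified. The function names `censoring_type` and `is_acceptable` are hypothetical and are not part of any existing package. The sketch assumes that an unbounded limit is encoded as an infinite value.

```python
import math


def censoring_type(lower, upper):
    """Name the censoring type of a target interval [lower, upper].

    Assumption: a missing limit is encoded as -inf (lower) or +inf (upper).
    """
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if lower == -math.inf and upper == math.inf:
        return "unbounded"  # carries no information about the response
    if lower == -math.inf:
        return "left"       # only an upper limit is known
    if upper == math.inf:
        return "right"      # only a lower limit is known
    if lower == upper:
        return "none"       # exact value, as in ordinary regression
    return "interval"       # both limits are known and differ


def is_acceptable(prediction, lower, upper):
    """Return True when the prediction falls inside the target interval."""
    return lower <= prediction <= upper


# Example targets: left, right, and interval censored.
targets = [(-math.inf, 3.0), (1.5, math.inf), (2.0, 4.0)]
types = [censoring_type(lo, hi) for lo, hi in targets]
```

A prediction of 2.5 would be acceptable for all three example targets, since it lies below 3.0, above 1.5, and between 2.0 and 4.0.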

Max margin interval trees is a new nonlinear model for this problem (Drouin et al., 2017). A dynamic programming algorithm is used to find the optimal split point for each feature. This algorithm has been implemented in C++, and there are wrappers for the solver in R and Python (https://github.com/aldro61/mmit). The Python package includes a decision tree learner. However, the R package does not yet have an implementation of the decision tree learner. The goal of this project is to write an R package that implements the decision tree learner using partykit.
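The split search can be illustrated with a brute-force sketch. This is not the dynamic programming algorithm and not the mmit API: it re-evaluates every candidate threshold from scratch, which is far slower but shows what is being optimized. The names `hinge_cost`, `best_constant`, and `best_split` are hypothetical, and the linear hinge loss with a `margin` parameter is an assumption made for illustration.

```python
import math


def hinge_cost(mu, lower, upper, margin):
    """Linear hinge cost of predicting mu for the interval (lower, upper).

    The cost is zero when mu lies at least `margin` inside the interval
    and grows linearly outside. Infinite limits contribute no cost.
    """
    return max(0.0, lower + margin - mu) + max(0.0, mu - (upper - margin))


def best_constant(intervals, margin):
    """Return (cost, mu) for the best constant prediction on one leaf.

    The total cost is convex and piecewise linear in mu, so a minimum
    is attained at one of the finite breakpoints.
    """
    candidates = [b for lo, hi in intervals
                  for b in (lo + margin, hi - margin) if math.isfinite(b)]
    if not candidates:
        return 0.0, 0.0  # no finite limits: any prediction costs nothing
    return min((sum(hinge_cost(mu, lo, hi, margin) for lo, hi in intervals), mu)
               for mu in candidates)


def best_split(feature, intervals, margin=0.0):
    """Return (total cost, threshold) of the best split on one feature,
    or None when all feature values are identical."""
    order = sorted(range(len(feature)), key=feature.__getitem__)
    best = None
    for k in range(1, len(order)):
        a, b = feature[order[k - 1]], feature[order[k]]
        if a == b:
            continue  # cannot split between equal feature values
        left = [intervals[i] for i in order[:k]]
        right = [intervals[i] for i in order[k:]]
        cost = best_constant(left, margin)[0] + best_constant(right, margin)[0]
        if best is None or cost < best[0]:
            best = (cost, (a + b) / 2)  # threshold halfway between neighbors
    return best


feature = [1.0, 2.0, 3.0, 4.0]
intervals = [(-math.inf, 1.0), (0.0, 1.0), (5.0, math.inf), (4.0, 6.0)]
split = best_split(feature, intervals)
```

In this toy data the first two examples accept values up to 1 and the last two accept values from 5 to 6, so splitting between the feature values 2 and 3 lets each leaf predict a constant with zero cost.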