Contributor
Juan Martín Loyola

Bayesian Additive Regression Trees in PyMC3


Mentors
Austin Rochford, Osvaldo Martin
Organization
NumFOCUS

Bayesian Additive Regression Trees (BART) is a Bayesian nonparametric approach to estimating functions using regression trees. A BART model consist on a sum of regression trees with (homoskedastic) normal additive noise. Regression trees are defined by recursively partitioning the input space, and defining a local model in each resulting region of input space in order to approximate some unknown function. BARTs are useful and flexible model to capture interactions and non-linearities and have been proved useful tools for variable selection.

Bayesian Additive Regression Trees will allow PyMC3 users to perform regressions with a “canned” non-parametric model. By simple calling a method, users will obtain the mean regressor plus the uncertainty estimation in a fully Bayesian way. This can be used later to predict on hold-out data. Furthermore, the implemented BART model will allow experience users to specify their own priors for the specific problem they are tackling, improving performance substantially.