Contributor
Caleb Herbel

Dynamic Partitioning for Interval Joins in Asterix Database


Mentors
prestonc@apache, Tin Vu
Organization
The Apache Software Foundation

AsterixDB currently requires a static range hint in the query when performing interval joins. The range hint supplies split points for the input data, and the data is then partitioned and sent to separate nodes based on those split points. A static hint gives the user direct control over the split points and the resulting partitioning, but it also means the user has to examine the data and work out the split points in advance. It would be useful to have the option to run a query without that step. Within the last year, code was added to AsterixDB that implements a spatial join with a dynamic range hint; this code can pick split points (or tiles) before the join. For my project, I will study how that code works, then refactor and extend it to support interval data and interval joins.
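
To make the idea of split points concrete, here is a minimal Python sketch of range partitioning for intervals. It is an illustration only and does not reflect the actual Asterix code: the function names `choose_split_points` and `partitions_for_interval`, the half-open `[start, end)` interval convention, and the quantile-based way of picking split points from a sample are all assumptions made for this example.

```python
from bisect import bisect_left, bisect_right


def choose_split_points(sample_points, num_partitions):
    """Pick num_partitions - 1 split points as quantiles of a sample.

    Hypothetical stand-in for a "dynamic" hint: the split points are
    derived from the data instead of being written into the query.
    """
    ordered = sorted(sample_points)
    return [
        ordered[(i * len(ordered)) // num_partitions]
        for i in range(1, num_partitions)
    ]


def partitions_for_interval(start, end, split_points):
    """Return every partition index that the interval [start, end) overlaps.

    Partition i covers [split_points[i - 1], split_points[i]), with the
    first and last partitions unbounded on their outer side. An interval
    that crosses a split point is assigned to several partitions.
    """
    first = bisect_right(split_points, start)
    last = bisect_left(split_points, end)
    return list(range(first, last + 1))


# Static hint: the user supplies the split points directly.
static_splits = [20, 40, 60]
print(partitions_for_interval(15, 45, static_splits))

# Dynamic hint (sketch): derive split points from sampled start points.
sample = [0, 10, 20, 30, 40, 50, 60, 70]
dynamic_splits = choose_split_points(sample, 4)
print(dynamic_splits)
```

In this sketch an interval that spans a split point is sent to every partition it overlaps, which is one simple way to make sure overlapping intervals from both join inputs meet on at least one node. Other strategies are possible, and the sketch does not say which one Asterix uses.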