AsterixDB currently tries to execute joins efficiently with a hybrid hash join and falls back to a nested-loop join when a hybrid hash join is not appropriate. When the data is already sorted, a merge join may be more efficient in some cases. This project will migrate an existing merge join from an outdated AsterixDB repository and build a new query plan on top of it to implement a parallel sort-merge join for data spread across many partitions.
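
To illustrate why sorted input helps, here is a minimal single-partition sketch of a merge equi-join. It is not AsterixDB or Hyracks code: the `merge_join` function, the `Row` type, and the sample data are hypothetical names chosen for illustration, and both inputs are assumed to be already sorted by key.

```python
from typing import Iterator, List, Tuple

# Hypothetical row shape for this sketch: (join key, payload).
Row = Tuple[int, str]


def merge_join(left: List[Row], right: List[Row]) -> Iterator[Tuple[int, str, str]]:
    """Equi-join two inputs that are ALREADY sorted by key (element 0)."""
    i = j = 0
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][0], right[j][0]
        if lkey < rkey:
            i += 1  # left key is too small, advance the left cursor
        elif lkey > rkey:
            j += 1  # right key is too small, advance the right cursor
        else:
            # Find the end of the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lkey:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lkey:
                j_end += 1
            # Emit every pairing of the two runs, then skip past both.
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    yield (lkey, l_row[1], r_row[1])
            i, j = i_end, j_end


if __name__ == "__main__":
    orders = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
    items = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
    for joined in merge_join(orders, items):
        print(joined)
```

Each input is scanned once and no hash table is built, which is where the potential advantage over a hash join comes from when no extra sort is needed. A parallel version would additionally need matching keys to land in the same partition, which this sketch does not attempt.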

Student

Stephen Ermshar

Mentors

  • Prestonc@Apache
  • Ali Alsuliman

2019