Beam includes a suite of classic streaming SQL benchmarks known as "Nexmark", implemented both in raw Java and in Beam SQL. So far, the focus of Beam SQL has been on expanding functionality, so little is known about its performance; we know only that it is a fairly straightforward mapping from SQL to Beam that should work acceptably much of the time. It would be interesting to see where the bottlenecks are when these SQL benchmarks are translated via Beam SQL into a Beam pipeline and then translated again to the native capabilities of, for example, Spark, Flink, and Dataflow.

Student

Kai Jiang

Mentors

  • Kenneth Knowles

2018