This project aims to add support for the File Loads method of inserting data into BigQuery from streaming pipelines. PR #7655 for [BEAM-6553] added support in the Python SDK for writing to BigQuery using the File Loads method in batch pipelines. Support still needs to be added for streaming pipelines.

Streaming pipelines with non-default windowing, triggering, and accumulation modes should be able to write data to BigQuery using the File Loads method. In case of failure, the pipeline should fail atomically, meaning that each record is loaded into BigQuery at most once.
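
As a rough illustration of the goal, the sketch below shows how a streaming pipeline might request the File Loads method. It assumes the streaming support is exposed through the same `WriteToBigQuery` transform used for batch, with a `triggering_frequency` setting (in seconds) controlling how often load jobs are issued. The helper names `file_loads_options` and `attach_file_loads_sink`, the table name, and the schema are hypothetical and chosen only for this example.

```python
def file_loads_options(table, schema, triggering_frequency_secs=300):
    """Build keyword arguments for a WriteToBigQuery that uses file loads.

    Hypothetical helper: it only assembles the options and validates the
    triggering frequency, so it can be used without Beam installed.
    """
    if triggering_frequency_secs <= 0:
        raise ValueError("triggering_frequency_secs must be positive")
    return {
        "table": table,
        "schema": schema,
        # Same value as beam.io.WriteToBigQuery.Method.FILE_LOADS.
        "method": "FILE_LOADS",
        # Assumption: load jobs are issued once per this many seconds.
        "triggering_frequency": triggering_frequency_secs,
    }


def attach_file_loads_sink(rows, **options):
    """Attach the sink to a PCollection of dict rows (requires apache_beam)."""
    # Imported lazily so file_loads_options stays usable without Beam.
    import apache_beam as beam

    return rows | "WriteToBigQuery" >> beam.io.WriteToBigQuery(**options)


# Hypothetical table and schema, for illustration only.
options = file_loads_options(
    table="my-project:my_dataset.my_table",
    schema="user:STRING,score:INTEGER",
    triggering_frequency_secs=60,
)
```

In a real pipeline, `rows` would be an unbounded PCollection (for example, parsed messages from a streaming source), and `attach_file_loads_sink(rows, **options)` would add the write step. The windowing, triggering, and accumulation mode applied upstream are left out of this sketch.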

The JIRA issue for this project is [BEAM-6611].

Student

Tanay Tummalapalli

Mentors

  • Pablo Estrada

2019