The BigQuery connector uses the standard BigQuery API, which has a low transfer rate of about 50 MB per batch. In my current situation this results in 60k rows per batch. When analyzing 1-3 million rows, the transfer takes 20-40 minutes when the data is pulled into the in-memory engine.
With the Storage API (the High Throughput setting in the Simba drivers), the transfer rate can be increased significantly for people analyzing bigger datasets.
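To put the numbers above in perspective, here is a rough back-of-the-envelope sketch of how many batches those row counts need. It only uses the figures from this post (about 50 MB and 60k rows per batch); treating 50 MB as decimal megabytes and the helper name `batches_needed` are my own assumptions, not anything from the connector.

```python
import math

# Figures reported in the post above.
BATCH_BYTES = 50 * 1000 * 1000   # ~50 MB per batch (assumption: decimal megabytes)
ROWS_PER_BATCH = 60_000          # rows that fit in one batch in my case


def batches_needed(total_rows, rows_per_batch=ROWS_PER_BATCH):
    """Number of sequential batches required to transfer total_rows (hypothetical helper)."""
    return math.ceil(total_rows / rows_per_batch)


# Approximate row width implied by the two figures above.
bytes_per_row = BATCH_BYTES / ROWS_PER_BATCH

for rows in (1_000_000, 3_000_000):
    print(f"{rows:>9,} rows -> {batches_needed(rows)} batches")
print(f"~{bytes_per_row:.0f} bytes per row")
```

So a 1-3 million row extract means roughly 17 to 50 sequential batches with the standard API, which is where the long load times come from.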
Implemented in: 11.5
Thanks for a great idea. We will have a look at it.