feat: progress bar during query in QueryJob.to_dataframe and QueryJob.to_arrow #343

Closed
tswast opened this issue Oct 26, 2020 · 2 comments · Fixed by #352
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments


tswast commented Oct 26, 2020

Is your feature request related to a problem? Please describe.

When running a query via the %%bigquery magics or waiting for it to finish via QueryJob.to_dataframe or QueryJob.to_arrow, an argument progress_bar_type is accepted.

Currently, the progress bar only tracks the download of the query results. It would be great if it also gave some indication of progress while the query is still executing.

Describe the solution you'd like

When a value is passed to progress_bar_type, show some kind of progress bar. Ideally, it would work similarly to the UI. For example,

  • Show the job state in the progress bar description. For example, is it currently "pending" (queued, waiting for resources) or "running" (actually executing).
  • Show how many "stages" there are via length of the query_plan.
  • Find the latest incomplete stage.
  • Use parallel_inputs as the total amount of work (per stage).
  • Use completed_parallel_inputs as the amount of work completed so far.
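
To make the bullets above concrete, here is a rough sketch of how one progress bar update could be derived from the job state and the query plan. It is only an illustration: `summarize_progress` is a hypothetical helper, the stage objects are stand-ins that mimic the `status`, `parallel_inputs` and `completed_parallel_inputs` attributes, and treating the first stage not marked `COMPLETE` as the "latest incomplete stage" is my interpretation, not settled behavior.

```python
from types import SimpleNamespace


def summarize_progress(state, query_plan):
    """Return (description, completed, total) for one progress bar update.

    `state` is the job state string and `query_plan` is a list of stage
    objects exposing `status`, `parallel_inputs` and
    `completed_parallel_inputs`.
    """
    description = state.lower()
    if not query_plan:
        # No plan yet (for example while the job is still queued).
        return description, 0, 0

    # Interpretation (assumption): the "current" stage is the first one
    # not marked COMPLETE; fall back to the last stage if all are done.
    current_index = len(query_plan) - 1
    for index, stage in enumerate(query_plan):
        if stage.status != "COMPLETE":
            current_index = index
            break

    stage = query_plan[current_index]
    # Either counter may be missing on a stage that has not started.
    total = stage.parallel_inputs or 0
    completed = stage.completed_parallel_inputs or 0
    description += f": stage {current_index + 1}/{len(query_plan)}"
    return description, completed, total


# Stand-in stage objects for illustration only.
plan = [
    SimpleNamespace(status="COMPLETE", parallel_inputs=4, completed_parallel_inputs=4),
    SimpleNamespace(status="RUNNING", parallel_inputs=10, completed_parallel_inputs=3),
]
print(summarize_progress("RUNNING", plan))
```

The returned tuple maps directly onto a progress bar: the description, the amount of work done, and the total for the current stage.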

To populate this, instead of calling result() once:

  • Call result(timeout=[a few seconds]) every few seconds.
  • Call job.reload() to fetch the latest job statistics.
  • Update the progress bar.
  • Repeat.
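
The loop above could look roughly like the sketch below. `wait_with_progress` and `on_update` are hypothetical names. The job is assumed to behave like a `QueryJob`, meaning `result(timeout=...)` raises `concurrent.futures.TimeoutError` when the query has not finished in time and `reload()` refreshes the job statistics. The 0.5 second default is an arbitrary choice.

```python
import concurrent.futures


def wait_with_progress(job, on_update, poll_seconds=0.5):
    """Poll `job` until it finishes, calling `on_update(job)` after each poll.

    `job` only needs `result(timeout=...)` and `reload()` methods.
    """
    while True:
        try:
            # Wait a short time for the query to finish.
            rows = job.result(timeout=poll_seconds)
        except concurrent.futures.TimeoutError:
            # Still executing: refresh statistics, update the bar, repeat.
            job.reload()
            on_update(job)
            continue
        # Finished: one last update so the bar shows the final state.
        on_update(job)
        return rows
```

Passing the update step in as a callback keeps the loop independent of any particular progress bar library.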

Describe alternatives you've considered

Additional context

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Oct 26, 2020
@tswast tswast added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Oct 26, 2020

tswast commented Oct 27, 2020

Some additional thoughts: with dynamic query planning, we don't actually know how many stages will be required, so using these stats as a progress bar could be misleading.

It'd still be useful to show that the query is making progress somehow.


tswast commented Oct 27, 2020

If we do show stage progress, we'll want to periodically update the description with Stage 1/X and elapsed seconds like the UI does.

https://github.com/tqdm/tqdm/blob/0f823e79f303b4a93ef1381badb1e65757e5070f/tqdm/std.py#L1413
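
A minimal sketch of that description update, assuming the bar exposes a `set_description` method as tqdm bars do. `update_description` is a hypothetical helper and the wording of the text is just one option.

```python
import time


def update_description(bar, stage_number, total_stages, started_at, now=None):
    """Set the bar description to the current stage and elapsed seconds.

    `started_at` and `now` are readings from the same monotonic clock.
    """
    now = time.monotonic() if now is None else now
    elapsed = now - started_at
    text = f"Stage {stage_number}/{total_stages}, {elapsed:.0f}s elapsed"
    bar.set_description(text)
    return text
```

Because the number of stages can change with dynamic query planning, `total_stages` would need to be re-read from the plan on every update rather than cached.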
