lidalei
Contributor

@lidalei lidalei commented Jul 27, 2022

This PR fixes an issue with BigQueryInsertJobOperator. If the task hits the timeout set by the task's execution_timeout, on_kill is called, but self.job_id is still None. This happens because _submit_job is a blocking call and self.job_id is only set after it returns, so on_kill has no job ID to cancel. This PR is heavily inspired by #22955.
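
For illustration only, here is a minimal sketch of the ordering problem described above, assuming a submit step that returns immediately and a separate blocking wait. `FakeJob` and `SketchOperator` are hypothetical stand-ins, not the Airflow operator or the BigQuery client; the point is only that the job ID is recorded before the blocking call, so a kill handler has something to cancel.

```python
import threading


class FakeJob:
    """Hypothetical stand-in for a job handle; not the real BigQuery client API."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.cancelled = False
        self._done = threading.Event()

    def begin(self):
        # Assumed to return as soon as the job is registered.
        return self

    def result(self, timeout=None):
        # Blocks until the job finishes or is cancelled.
        self._done.wait(timeout)
        return self

    def cancel(self):
        self.cancelled = True
        self._done.set()


class SketchOperator:
    """Hypothetical operator showing the ordering, not Airflow's implementation."""

    def __init__(self):
        self.job_id = None
        self._job = None
        self.submitted = threading.Event()

    def execute(self):
        job = FakeJob("abc_test").begin()  # non-blocking submit
        self._job = job
        self.job_id = job.job_id           # recorded BEFORE the blocking wait
        self.submitted.set()
        job.result()                       # blocking wait happens last

    def on_kill(self):
        # With the job ID already set, a timeout during result() can still cancel.
        if self.job_id is not None:
            self._job.cancel()
```

If `self.job_id` were assigned only after `job.result()` returned, `on_kill` would find `None` during a timeout and skip the cancel, which is the behaviour this PR describes.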

@lidalei lidalei requested a review from turbaszek as a code owner July 27, 2022 14:40
@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Jul 27, 2022
@lidalei lidalei marked this pull request as draft July 27, 2022 14:44
@lidalei lidalei marked this pull request as ready for review July 29, 2022 08:50
Member

@potiuk potiuk left a comment

That looks cool

@potiuk
Member

potiuk commented Jul 29, 2022

Some errors though

@lidalei
Contributor Author

lidalei commented Jul 29, 2022

> Some errors though

Fixed the failing test case. The Conflict exception is raised when we call _begin, as this traceback shows:

Traceback (most recent call last):
  File "/Users/dalei/.pyenv/versions/3.8.10/lib/python3.8/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/Users/dalei/go/src/github.com/lidalei/airflow/airflow/providers/google/common/hooks/base_google.py", line 463, in inner_wrapper
    return func(self, *args, **kwargs)
  File "/Users/dalei/go/src/github.com/lidalei/airflow/airflow/providers/google/cloud/hooks/bigquery.py", line 1542, in insert_job
    job._begin()
  File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1298, in _begin
    super(QueryJob, self)._begin(client=client, retry=retry, timeout=timeout)
  File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/cloud/bigquery/job/base.py", line 510, in _begin
    api_response = client._call_api(
  File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 759, in _call_api
    return call()
  File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/api_core/retry.py", line 283, in retry_wrapped_func
    return retry_target(
  File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/api_core/retry.py", line 190, in retry_target
    return target()
  File "/Users/dalei/go/src/github.com/lidalei/airflow/venv/lib/python3.8/site-packages/google/cloud/_http/__init__.py", line 494, in api_request
    raise exceptions.from_http_response(response)
google.api_core.exceptions.Conflict: 409 POST https://bigquery.googleapis.com/bigquery/v2/projects/xxx/jobs?prettyPrint=false: Already Exists: Job xxx:EU.abc_test
Location: EU
Job ID: abc_test
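
A rough sketch of what the traceback implies: when a job ID already exists, the conflict surfaces at submission time, so that is where it has to be caught. Everything here is hypothetical: `Conflict` is a local stand-in for `google.api_core.exceptions.Conflict`, and `FakeJobService` and `submit` are made-up names, not part of Airflow or the BigQuery client.

```python
class Conflict(Exception):
    """Local stand-in for google.api_core.exceptions.Conflict (HTTP 409)."""


class FakeJobService:
    """Hypothetical in-memory job registry; not the BigQuery API."""

    def __init__(self):
        self._jobs = {}

    def begin(self, job_id):
        # A duplicate job ID is rejected at submission, before any waiting.
        if job_id in self._jobs:
            raise Conflict(f"Already Exists: Job {job_id}")
        self._jobs[job_id] = "RUNNING"
        return job_id


def submit(service, job_id):
    """Return (job_id, already_existed) instead of letting Conflict escape."""
    try:
        return service.begin(job_id), False
    except Conflict:
        return job_id, True
```

A test that mocks the submission step would therefore need the Conflict side effect on the begin-style call rather than on the blocking wait.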

@lidalei lidalei requested review from josh-fell and potiuk July 29, 2022 15:01
@potiuk potiuk merged commit e84d753 into apache:main Aug 4, 2022
@boring-cyborg

boring-cyborg bot commented Aug 4, 2022

Awesome work, congrats on your first merged pull request!

@hazemAmr0

good work
