Conversation

@jcrist commented Jul 6, 2020

Previously when the executor.start() contextmanager would exit early (say on an interrupt or error), there might be background work still happening, since the executor wouldn't wait for or cancel the remaining submitted tasks.

We now attempt to cancel all pending work running on the executor, and wait for tasks that haven't completed in cases where we can't efficiently/robustly cancel them. This ensures that when executor.start() exits, no tasks will continue to progress.

  • For a DaskExecutor with a temporary cluster, we shut down the cluster, so no lingering work can persist.
  • For a DaskExecutor with an external cluster (specified with an address) or an inproc cluster (using local threads only), we stop all pending tasks and wait for all running tasks to finish.
  • For a LocalDaskExecutor, we terminate the backing pool as quickly as possible, then wait for the pool to close.
    • For a process pool, this terminates the worker processes.
    • For a thread pool, we attempt to interrupt the threads (CPython-specific).
  • A LocalExecutor runs everything in the main thread, so no background tasks can persist.
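The "cancel pending, wait on running" behavior for an external or inproc cluster can be sketched with the standard library, using a `concurrent.futures` pool as a stand-in for the dask client (the real executor drives the dask scheduler; `stop_pending_and_wait` is a hypothetical helper for illustration):

```python
import concurrent.futures
import time

def stop_pending_and_wait(futures):
    """Cancel futures that haven't started yet; block until any
    already-running futures finish.  Illustrative stand-in for the
    executor's cancel-and-wait step (the real code talks to dask)."""
    still_running = []
    for fut in futures:
        # cancel() returns False when the task is already running
        # (or finished), so those are the ones we must wait on.
        if not fut.cancel():
            still_running.append(fut)
    concurrent.futures.wait(still_running)
    return still_running

pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
futures = [pool.submit(time.sleep, 0.2) for _ in range(4)]
time.sleep(0.05)                        # let the first task start
remaining = stop_pending_and_wait(futures)
pool.shutdown()
# Afterwards no future is pending or running.
print(all(f.done() for f in futures))   # True
```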

This is a precursor for the cancellation implementation (#2771).
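The changed `start()` semantics amount to doing the cleanup in a `finally` block, so it runs on normal exit, on an error, or on an interrupt. A minimal sketch (`ToyExecutor` is a hypothetical class, not Prefect's actual API):

```python
import concurrent.futures
import contextlib
import time

class ToyExecutor:
    """Hypothetical executor sketching the new ``start()`` semantics:
    no submitted work keeps running after ``start()`` exits."""

    @contextlib.contextmanager
    def start(self):
        self._pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)
        self._futures = []
        try:
            yield self
        finally:
            # Runs on normal exit *and* on an error or interrupt:
            # cancel anything still pending, then wait for running
            # tasks to finish before returning.
            for fut in self._futures:
                fut.cancel()
            self._pool.shutdown(wait=True)

    def submit(self, fn, *args, **kwargs):
        fut = self._pool.submit(fn, *args, **kwargs)
        self._futures.append(fut)
        return fut

executor = ToyExecutor()
try:
    with executor.start():
        futs = [executor.submit(time.sleep, 0.2) for _ in range(5)]
        raise RuntimeError("simulated failure mid-run")  # early exit
except RuntimeError:
    pass

# Every task has either finished or been cancelled -- nothing lingers.
print(all(f.done() for f in futs))  # True
```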

Please describe your work and make sure your PR:

  • adds new tests (if appropriate)
  • adds a changelog entry in the changes/ directory (if appropriate)
  • updates docstrings for any new functions or function arguments, including docs/outline.toml for API reference docs (if appropriate)

jcrist added 4 commits July 6, 2020 15:58
Previously, the dask executors could leave work continuing in the
background upon exit from the `start` contextmanager. This could lead to
odd behavior, where tasks from a flow run might still be running after
`flow.run()` has completed.

This PR changes the semantics of the `start` contextmanager so that no
lingering work remains after an executor has exited the `start`
contextmanager.

- For a `DaskExecutor` with a temporary cluster, we shut down the
  cluster, so no lingering work can persist.
- For a `DaskExecutor` with an external cluster (specified with an
  `address`), we stop all pending tasks and wait for all running tasks
  to finish.
- For a `LocalDaskExecutor`, we terminate the backing pool as quickly as
  possible, then wait for the pool to close.
  - For a process pool, this terminates the processes.
  - For a thread pool, we attempt to interrupt the threads (CPython
    specific).
- A `LocalExecutor` runs everything in the main thread, so no background
  tasks can persist.

This is a precursor to getting Cancellation to work in a robust way.
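The CPython-specific thread interruption mentioned above can be approximated with the interpreter's C API via `ctypes`. This is a best-effort sketch, not the executor's exact code: `PyThreadState_SetAsyncExc` only delivers the exception when the target thread next executes Python bytecode, so a thread blocked in C code won't be interrupted until it returns.

```python
import ctypes
import threading
import time

def interrupt_thread(thread, exc_type=KeyboardInterrupt):
    """Asynchronously raise ``exc_type`` in ``thread`` (CPython only).

    Best-effort: relies on the CPython C API, and delivery waits until
    the thread next runs Python bytecode.
    """
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exc_type)
    )
    if res > 1:
        # Affected more than one thread state: undo and bail out.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(
            ctypes.c_ulong(thread.ident), None
        )
        raise SystemError("PyThreadState_SetAsyncExc failed")

interrupted = []

def worker():
    try:
        while True:            # a "task" that never finishes on its own
            time.sleep(0.01)
    except KeyboardInterrupt:
        interrupted.append(True)

t = threading.Thread(target=worker)
t.start()
time.sleep(0.05)
interrupt_thread(t)
t.join(timeout=2)
print(interrupted)             # [True] once the interrupt lands
```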
@codecov bot commented Jul 7, 2020

Codecov Report

Merging #2920 into master will decrease coverage by 0.03%.
The diff coverage is 90.65%.

The current limits are really old.
@cicdw (Member) left a comment

Nice, this feels like a solid enhancement. Left a few minor comments.

@jcrist (Author) commented Jul 8, 2020

I believe all comments have been addressed.

@jcrist jcrist merged commit 30c88b3 into PrefectHQ:master Jul 8, 2020
@jcrist jcrist deleted the cleanup-tasks-executor branch July 8, 2020 02:57