@epwalsh epwalsh commented Sep 14, 2022

This allows users to implement their own subclasses to customize how the BeakerExecutor allocates resources to run each step. Here's an example of a custom implementation that makes jobs preemptible if they're not run on AllenNLP clusters:

https://github.com/allenai/tango-beaker-template/blob/0346e8719388cf8f4cc0c80ac713ae14f570f7e0/scheduler.py
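For readers who don't want to click through, here is a minimal, self-contained sketch of the idea. It does not import Tango: `ResourceAssignment` and `SimpleScheduler` below are hypothetical stand-ins whose field names follow the snippet in this PR, and the `schedule()` signature, the `home_clusters` parameter, and the `"preemptible"` priority value are assumptions made for illustration, not the library's confirmed API.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ResourceAssignment:
    # Stand-in for the object returned in this PR's diff (cluster, resources, priority).
    cluster: str
    resources: dict
    priority: Optional[str] = None


class SimpleScheduler:
    """Hypothetical base scheduler: always picks one cluster at a fixed priority."""

    def __init__(self, cluster: str, priority: str = "normal"):
        self.cluster = cluster
        self.priority = priority

    def schedule(self, step_name: str, task_resources: dict) -> ResourceAssignment:
        return ResourceAssignment(
            cluster=self.cluster, resources=task_resources, priority=self.priority
        )


class PreemptibleElsewhereScheduler(SimpleScheduler):
    """Custom subclass: downgrade to preemptible priority off the home clusters."""

    def __init__(self, cluster: str, home_clusters: Set[str], priority: str = "normal"):
        super().__init__(cluster, priority)
        self.home_clusters = home_clusters

    def schedule(self, step_name: str, task_resources: dict) -> ResourceAssignment:
        assignment = super().schedule(step_name, task_resources)
        if assignment.cluster not in self.home_clusters:
            # Assumed priority label; jobs on borrowed clusters may be preempted.
            assignment.priority = "preemptible"
        return assignment


home = PreemptibleElsewhereScheduler("team/home", home_clusters={"team/home"})
away = PreemptibleElsewhereScheduler("other/shared", home_clusters={"team/home"})
on_home = home.schedule("train", {"gpu_count": 1})
on_away = away.schedule("train", {"gpu_count": 1})
```

The point of the subclass hook is that the executor only needs the returned assignment, so the per-step policy (which cluster, what priority) can live entirely in user code.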

@epwalsh epwalsh requested a review from dirkgr September 14, 2022 21:03
Comment on lines +141 to +143
return ResourceAssignment(
    cluster=cluster_to_use, resources=task_resources, priority=self.priority
)
Member

Is this method allowed to return a value that says "We can't schedule this one right now."? I'm thinking of a scheduler that doesn't queue more than a few jobs ahead of time, and then waits to see which cluster frees up first. Though once again that's a problem that goes away when Beaker fixes https://github.com/allenai/beaker/issues/2544.

Member Author

At the moment, no. But that would be nice. It would take some refactoring though so should probably be a separate PR.

Member Author

Oh! Just kidding, that was easy: 76e5e4d

Member Author

And cb438bd

    )
    steps_left_to_run.discard(step)
elif isinstance(exc, ResourceAssignmentError):
    submitted_steps.discard(step_name)
Member

This doesn't mean they are discarded forever, does it?

Member Author

Nope, they get put back into the queue, or I guess kept in the queue. The important queue here is steps_left_to_run.
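To make that concrete, here is a simplified, synchronous sketch of the bookkeeping described above. It is not the executor's actual loop: `run_all`, `try_submit`, and `max_rounds` are hypothetical names, and `ResourceAssignmentError` is a local stand-in for the exception named in the diff. The only behavior it is meant to show is that a step which cannot be scheduled is dropped from `submitted_steps` but stays in `steps_left_to_run`, so it is retried on a later pass.

```python
class ResourceAssignmentError(Exception):
    """Stand-in for the exception named in the diff."""


def run_all(steps, try_submit, max_rounds=10):
    steps_left_to_run = set(steps)
    submitted_steps = set()
    finished = []
    for _ in range(max_rounds):
        if not steps_left_to_run:
            break
        # sorted() returns a new list, so the set can be modified while looping.
        for step in sorted(steps_left_to_run):
            if step in submitted_steps:
                continue
            submitted_steps.add(step)
            try:
                try_submit(step)
            except ResourceAssignmentError:
                # Not schedulable right now: un-submit it, but keep it queued.
                submitted_steps.discard(step)
            else:
                steps_left_to_run.discard(step)
                finished.append(step)
    return finished, steps_left_to_run


attempts = {}


def flaky_submit(step):
    # Hypothetical submitter: step "b" cannot be placed on its first attempt.
    attempts[step] = attempts.get(step, 0) + 1
    if step == "b" and attempts[step] == 1:
        raise ResourceAssignmentError("no cluster free right now")


finished, left = run_all(["a", "b"], flaky_submit)
```

In this sketch a scheduler can signal "not right now" simply by raising the error, which is the behavior asked about in the first review thread.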

@epwalsh epwalsh merged commit d34fe09 into main Sep 14, 2022
@epwalsh epwalsh deleted the beaker-scheduler branch September 14, 2022 22:42

3 participants