[17.0] Dead jobs are likely not requeued when max_retries is 0 #814

@sersanchus

According to the documentation, setting max_retries to 0 should allow jobs to be retried indefinitely. However, from reviewing the code in runner.py, it appears that dead jobs with max_retries=0 are mistakenly marked as 'JobFoundDead' instead of being requeued.

This seems to be due to the logic in the _query_requeue_dead_jobs method, where the check for exceeding max_retries does not treat the value 0 as "infinite retries." As a result, jobs are marked as failed even though they should be requeued indefinitely.

Expected behavior:
Jobs with max_retries=0 should be requeued without limit when found dead, in line with the documentation.

Actual behavior:
Jobs with max_retries=0 are being marked as 'JobFoundDead' and not requeued.

Code reference:
See runner.py, method _query_requeue_dead_jobs.
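
To make the suspected problem concrete, here is a minimal Python sketch of the decision I would expect for a dead job. It is not the actual code from runner.py: the function names, the "pending"/"failed" state strings, and the strict `retry > max_retries` comparison are all assumptions for illustration only.

```python
def naive_state_for_dead_job(retry: int, max_retries: int) -> str:
    """Hypothetical illustration of the suspected behavior.

    Here 0 is compared like any other limit, so a job with
    max_retries=0 fails as soon as it has been retried once.
    """
    if retry > max_retries:
        return "failed"
    return "pending"


def expected_state_for_dead_job(retry: int, max_retries: int) -> str:
    """Hypothetical illustration of the documented behavior.

    Assumption: max_retries == 0 means "retry without limit",
    so the limit check is skipped entirely in that case.
    """
    if max_retries == 0:
        return "pending"  # infinite retries: always requeue
    if retry > max_retries:
        return "failed"  # limit exceeded: give up on the job
    return "pending"


if __name__ == "__main__":
    # A dead job that has already been retried 5 times, with max_retries=0.
    print(naive_state_for_dead_job(5, 0))
    print(expected_state_for_dead_job(5, 0))
```

If the real check lives in a SQL statement rather than in Python, the same idea would presumably apply: the branch that marks the job as failed needs an extra condition that excludes max_retries = 0.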
