Replace start-period and start-interval with initial-timeout and initial-interval #49633

@MrQubo

Description

My use case for start-period and start-interval is to make the initial transition from the starting health status to the healthy health status faster. From what I have found, this was a motivating use case for implementing them.
Here's an example of how I would use it in docker-compose.yml:

services:
  postgres:
    image: postgres:13.8-bullseye
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready" ]
      interval: 2m
      timeout: 10s
      retries: 3
      start_period: 1m
      start_interval: 1s
    [...]
  app:
    depends_on:
      postgres:
        condition: service_healthy
    [...]

Without start-interval and with an interval of 2m, I would have to wait 2 minutes for the app container to start. With a start-interval of 1s I don't have to wait that long. However, after the postgres service transitions from starting to healthy, running the healthcheck every second is quite wasteful of resources. Unfortunately, that is what happens until start-period has elapsed. For this use case I would prefer a behaviour where the interval becomes 2m after the first successful healthcheck.

After trying to use the current implementation, I have not found a satisfying solution. I would like to suggest an alternative design, replacing start-period and start-interval, that better accommodates the use case of a healthcheck as a condition for starting a dependent container.

  • The new implementation would add two optional flags, --initial-interval=DURATION and --initial-timeout=DURATION.
  • If initial-interval is set, then the first healthcheck runs after initial-interval seconds, otherwise it runs after interval seconds. If the status of the container is starting and the initial-interval is set, then healthcheck will be run again initial-interval seconds after the previous one completes, otherwise it will be run again after interval seconds.
  • On the first successful healthcheck, the health status of container is changed from starting to healthy.
  • If the status of the container is still starting after initial-timeout seconds, then the status of the container is changed from starting to unhealthy.
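
The rules above can be sketched as a small simulation. This is only an illustration of the proposed semantics, not Docker code: `HealthConfig`, `run_starting_phase`, and `probe` are hypothetical names, each healthcheck is assumed to take zero time, and a check scheduled at exactly the initial-timeout deadline is assumed to run before the deadline fires (the proposal does not specify this tie).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class HealthConfig:
    interval: float                           # --interval, in seconds
    initial_interval: Optional[float] = None  # proposed --initial-interval
    initial_timeout: Optional[float] = None   # proposed --initial-timeout; None = no deadline


def run_starting_phase(
    cfg: HealthConfig,
    probe: Callable[[float], bool],
    horizon: float,
) -> Tuple[str, float, List[float]]:
    """Simulate the proposal until the status leaves "starting".

    Returns (status, time of transition, times at which checks ran).
    Assumption: probes are instantaneous, so "after the previous one
    completes" equals the time the previous check started.
    """
    # While starting, use initial_interval if set, otherwise interval.
    delay = cfg.initial_interval if cfg.initial_interval is not None else cfg.interval
    now = 0.0
    checks: List[float] = []
    while True:
        next_check = now + delay
        deadline = cfg.initial_timeout
        # The deadline passes before the next check would run: starting -> unhealthy.
        if deadline is not None and next_check > deadline and deadline <= horizon:
            return "unhealthy", deadline, checks
        if next_check > horizon:
            return "starting", horizon, checks
        now = next_check
        checks.append(now)
        # First successful healthcheck: starting -> healthy.
        if probe(now):
            return "healthy", now, checks


# Values mirror the compose example: interval 2m, 1s while starting, 1m deadline.
cfg = HealthConfig(interval=120.0, initial_interval=1.0, initial_timeout=60.0)

# A service that becomes ready 5 seconds after start (assumed for illustration).
print(run_starting_phase(cfg, lambda t: t >= 5.0, horizon=600.0)[:2])

# A service that never becomes ready is marked unhealthy at the deadline.
print(run_starting_phase(cfg, lambda t: False, horizon=600.0)[:2])
```

Under these assumptions the first call reports healthy at 5 seconds, after which checks would continue at the regular 2m interval, and the second reports unhealthy at exactly 60 seconds.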

In my opinion, such an implementation would be simpler and more intuitive than the current one with start-period and start-interval.

Some additional points to be considered:

  • Instead of adding new options, it's possible to make --start-period and --start-interval behave as --initial-timeout and --initial-interval. It's worth noting how my proposal differs from the current implementation. Under the proposal, the container changes health status to unhealthy exactly after start-period seconds, instead of after start-period plus the time needed for the required number of retries. The healthcheck interval is switched from start-interval to interval after the first successful healthcheck or once start-period seconds have passed, instead of always waiting for start-period seconds to pass.
  • The naming I proposed is a bit confusing. --initial-interval is just --interval used during the starting health status, but --initial-timeout is not related to --timeout in the same way.
  • What should the default value of initial-timeout be? Reasonable possibilities I can think of are: same as timeout, some fixed amount like 30s, infinity (i.e., health status will never switch directly from starting to unhealthy).
