Remove queued duration minimum threshold #2184
// it doesn't necessarily mean there's a queue at all. We assume anything longer than
// this threshold, which should be longer than pauses in most cases, is the result
// of queueing.
private static final long QUEUED_DURATION_MINIMUM_THRESHOLD_NANOS = 250_000_000L;
#1230 originally added this threshold. Per discussions with @carterkozak & @pkoenig10, we explicitly do not add queue metrics for cached executors in tritium clients (support for this was added in #1012).
Released 0.100.0
Before this PR
In our internal authentication service, we have a single threaded executor where, at any given time, there is at most one executing task and one queued task. The code looks something like:
Metrics seem to indicate that the queued duration p99 is longer than the duration p99. Here are metrics from our internal test environment.
This should be impossible, given how this executor is used.
But it happens because `TaggedMetricsExecutorService` is simply dropping any samples below the threshold. This artificially inflates the queued duration metrics, especially for executors that typically have short queue durations.
It's confusing for measurements to be dropped in this way, and it makes the resulting metrics misleading.
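To illustrate the inflation, here is a minimal, self-contained sketch. It does not use tritium's code: the class name, the sample data, and the nearest-rank percentile are all assumptions made for the example. Only the 250 ms value mirrors the removed constant, and the strict "longer than" comparison is inferred from the code comment.

```java
import java.util.Arrays;

public class ThresholdInflationDemo {
    // Same value as the removed QUEUED_DURATION_MINIMUM_THRESHOLD_NANOS constant (250 ms).
    static final long THRESHOLD_NANOS = 250_000_000L;

    // Nearest-rank percentile; illustrative only, a real metrics histogram works differently.
    static long percentile(long[] samples, double quantile) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(quantile * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    // Mimics the old behaviour (assumed): only samples longer than the threshold are kept.
    static long[] dropBelowThreshold(long[] samples) {
        return Arrays.stream(samples).filter(nanos -> nanos > THRESHOLD_NANOS).toArray();
    }

    // Hypothetical data: 998 tasks queued for 1 ms and 2 tasks queued for 300 ms.
    static long[] sampleData() {
        long[] queued = new long[1000];
        Arrays.fill(queued, 1_000_000L);
        queued[0] = 300_000_000L;
        queued[1] = 300_000_000L;
        return queued;
    }

    public static void main(String[] args) {
        long[] queued = sampleData();
        System.out.println("p99 over all samples (ns):      " + percentile(queued, 0.99));
        System.out.println("p99 over filtered samples (ns): "
                + percentile(dropBelowThreshold(queued), 0.99));
    }
}
```

With this made-up data, the p99 over all samples is 1 ms, while the p99 over the filtered samples is 300 ms, because only the two slow outliers survive the filter.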
After this PR
`TaggedMetricsExecutorService` no longer excludes measurements from the queued duration metric. This metric now accurately captures the time between submission and execution for all submitted tasks.
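As a rough sketch of what "time between submission and execution for all submitted tasks" means, the example below wraps each task and reports the queued duration unconditionally. This is not the `TaggedMetricsExecutorService` implementation: `QueuedDurationSketch`, `timed`, and the `LongConsumer` recorder are hypothetical names standing in for the real metric plumbing.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.LongConsumer;

public class QueuedDurationSketch {
    // Wraps a task so the time between wrapping (submission) and the start of run()
    // is always reported, with no minimum threshold applied.
    static Runnable timed(Runnable task, LongConsumer queuedNanosRecorder) {
        long submittedAt = System.nanoTime();
        return () -> {
            queuedNanosRecorder.accept(System.nanoTime() - submittedAt);
            task.run();
        };
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Stand-in for a metrics timer; a real service would update a histogram instead.
        List<Long> recorded = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < 5; i++) {
            executor.execute(timed(() -> {}, nanos -> recorded.add(nanos)));
        }
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(recorded.size() + " queued-duration samples recorded");
    }
}
```

Every submitted task contributes one sample, so short queue times pull the percentiles down as they should.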