-
Notifications
You must be signed in to change notification settings - Fork 3.1k
exporter/prometheusremotewrite: Improve logging of outgoing requests #41856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exporter/prometheusremotewrite: Improve logging of outgoing requests #41856
Conversation
case <-ticker.C: | ||
// In normal state, wIndex and rIndex will differ by one. To avoid having -1 as a final value, we set it to 0 as minimum. | ||
lag := max(0, int64(prweWAL.wWALIndex.Load()-prweWAL.rWALIndex.Load())) | ||
logger.Info("recording lag log", zap.Int64("wIndex", int64(prweWAL.wWALIndex.Load())), zap.Int64("rIndex", int64(prweWAL.wWALIndex.Load())), zap.Int64("lag", lag)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This log line was being spammed. I don't see any value in a log line that tells us we're updating a certain metric.
} | ||
|
||
if err != nil { | ||
return consumererror.NewPermanent(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I'm missing something, but all these NewPermanent
errors are only adding "New Permanent" to the log message, and nothing more. I removed all of those calls because having "New Permanent" in our logs doesn't seem relevant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I quickly learned that this is needed for retries :). It's unfortunate that the side effect is a very polluted log experience
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so with that in mind are you sure about removing the consumererror.NewPermanent
in the other places like
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/41856/files#diff-2631d8a8855fdc08cf077c0ab99544147ca73eec572bc871d8fe717e826eda38L359 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, good catch. I was trying to get rid of permanent errors from WAL and ended up doing a bit more by accident when doing my replace. Good catch!
Signed-off-by: Arthur Silva Sens <[email protected]>
632903e
to
08fdc08
Compare
Not sure why the changelog CI is failing 🤔 |
Signed-off-by: Arthur Silva Sens <[email protected]>
Signed-off-by: Arthur Silva Sens <[email protected]>
Signed-off-by: Arthur Silva Sens <[email protected]>
2f1e9d3
to
11d96fc
Compare
Alright, I think the PR is ready for review but I honestly don't understand why changelog CI is failing 😭 |
// Reference for different behavior according to status code: | ||
// https://github.com/prometheus/prometheus/pull/2552/files#diff-ae8db9d16d8057358e49d694522e7186 | ||
if resp.StatusCode >= 200 && resp.StatusCode < 300 { | ||
prwe.settings.Logger.Info("remote write request successful", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this should probably be debug-level logging? This is essentially request logging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah fair point, I was not sure which level we should put this. I was just trying to understand if requests were being made at all.
I'll update this to debug
Signed-off-by: Arthur Silva Sens <[email protected]>
Rebase on latest mais solved the problem |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left one small nit, otherwise LGTM.
Signed-off-by: Arthur Silva Sens <[email protected]>
Based on #41992 (comment), I'm marking this PR as 'ready to merge' |
Description
We've received a bunch of indirect and direct feedback that it's hard to understand what is going on when the exporter fails to send Remote Write requests.
I'm addressing that by improving our logging around our outgoing requests.
Fixes #39703