[exporter/prometheusremotewrite] Fix WAL deadlock #37630
Merged
Signed-off-by: Arthur Silva Sens <[email protected]>
28b8e79
to
df49125
Compare
dashpole
approved these changes
Feb 3, 2025
Signed-off-by: Arthur Silva Sens <[email protected]>
dashpole
approved these changes
Feb 3, 2025
tombrk
approved these changes
Feb 4, 2025
I was taking a look over #20875 and hoping to finish it.
Fixes #19363
Fixes #24399
Fixes #15277
As mentioned in #24399 (comment), I used a library to help me understand how the deadlock was happening (1st commit). It showed that `persistToWal` was trying to acquire the lock while `readPrompbFromWal` held it forever.
I changed the strategy here: instead of using fsnotify and all the complicated logic around it, we're just using a pub/sub strategy between the writer and reader goroutines.
The reader goroutine, once it finds an empty WAL, now releases the lock immediately and waits for a notification from the writer. Previously it would hold the lock while waiting for a write that would never happen.
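For readers who want to see the shape of the change, here is a minimal sketch of the release-then-wait pattern described above. It is not the exporter's actual code: the `wal` type, its in-memory `entries` slice, the `notify` channel, and the `persist`/`read` methods are hypothetical names chosen for illustration, standing in for the real WAL and for `persistToWal`/`readPrompbFromWal`.

```go
package main

import (
	"fmt"
	"sync"
)

// wal is a hypothetical in-memory stand-in for the exporter's write-ahead log.
type wal struct {
	mu      sync.Mutex
	entries []string
	notify  chan struct{} // capacity 1: carries "something was written" from writer to reader
}

func newWAL() *wal {
	return &wal{notify: make(chan struct{}, 1)}
}

// persist appends an entry under the lock, then signals the reader.
// The send is non-blocking: if a signal is already pending, it is enough.
func (w *wal) persist(entry string) {
	w.mu.Lock()
	w.entries = append(w.entries, entry)
	w.mu.Unlock()

	select {
	case w.notify <- struct{}{}:
	default: // a notification is already queued; the reader will re-check the WAL
	}
}

// read returns the next entry. When the WAL is empty it releases the lock
// BEFORE waiting, so persist can always acquire it and make progress.
func (w *wal) read() string {
	for {
		w.mu.Lock()
		if len(w.entries) > 0 {
			entry := w.entries[0]
			w.entries = w.entries[1:]
			w.mu.Unlock()
			return entry
		}
		w.mu.Unlock() // release first, then wait for the writer
		<-w.notify
	}
}

func main() {
	w := newWAL()
	got := make(chan string)

	// The reader starts against an empty WAL and must not block the writer.
	go func() { got <- w.read() }()

	w.persist("sample-1")
	fmt.Println(<-got) // prints "sample-1"
}
```

The buffered channel of capacity 1 is one possible way to avoid a lost wakeup: a write that lands between the reader's unlock and its receive still leaves a pending signal, and the reader always re-checks the entries under the lock after waking. Holding the lock across the wait, as in the behavior described before this change, would leave `persist` blocked on `Lock` and the reader waiting for a write that could never arrive.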