Trace ID aware load-balancing exporter #1231
Conversation
Each commit in this PR is a step towards the final solution and can be reviewed individually.
Force-pushed from a319e1e to 156554f
Added a few comments. Thank you for this work @jpkrohling. Very nice.
that either splits the incoming batches into multiple batches, one per trace
Does this exist? might be better just to mention groupbytrace if it's the only existing option.
It will, once #1235 is addressed :-) I think groupbytrace isn't suitable for all cases, although I would certainly recommend it for most cases that I can think of.
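To illustrate the idea, here is a minimal sketch only: the Span struct and splitByTraceID function below are hypothetical and not the collector's pdata API. Splitting a mixed batch into one batch per trace is essentially a group-by on the trace ID:

```go
// Hypothetical, simplified types: the real exporter works on the
// collector's pdata model, not on these structs.
type Span struct {
	TraceID [16]byte
	Name    string
}

// splitByTraceID turns one mixed batch into one batch per trace ID, so a
// trace-ID-aware exporter can route each resulting batch to a single backend.
func splitByTraceID(batch []Span) [][]Span {
	byTrace := map[[16]byte][]Span{}
	for _, s := range batch {
		byTrace[s.TraceID] = append(byTrace[s.TraceID], s)
	}
	out := make([][]Span, 0, len(byTrace))
	for _, spans := range byTrace {
		out = append(out, spans)
	}
	return out
}
```

Whether this is done by groupbytrace or by a future "splitbatch" processor, the result is the same from the exporter's point of view: each batch it receives carries exactly one trace ID and can be routed to a single backend.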
should the example show groupbytrace since it's basically required?
I thought about it, but I think I would prefer to have the future "splitbatch" there instead.
this example is cracking me up. I love how flexible the collector is.
perhaps use a jumphash? I'm pretty sure it will work in this use case, and it's significantly less code.
https://github.com/dgryski/go-jump
We present jump consistent hash, a fast, minimal memory, consistent hash algorithm that can be expressed in about 5 lines of code. In comparison to the algorithm of Karger et al., jump consistent hash requires no storage, is faster, and does a better job of evenly dividing the key space among the buckets and of evenly dividing the workload when the number of buckets changes. Its main limitation is that the buckets must be numbered sequentially, which makes it more suitable for data storage applications than for distributed web caching.
I'm no expert, but I think this will work if you just order the IPs you're load balancing over in increasing order.
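For reference, a minimal Go sketch of that suggestion, assuming the backend addresses are kept in increasing order as described above and the first 8 bytes of the trace ID are used as the key. JumpHash follows the reference algorithm from the paper; pickBackend and the addresses are hypothetical:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

// JumpHash follows the reference algorithm from the Lamping & Veach paper:
// it maps key onto one of numBuckets buckets numbered 0..numBuckets-1.
func JumpHash(key uint64, numBuckets int) int32 {
	var b, j int64 = -1, 0
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1
		j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
	}
	return int32(b)
}

// pickBackend is a hypothetical helper: it orders the backend addresses
// (increasing order, as suggested above) and keys the hash on the first
// 8 bytes of the trace ID.
func pickBackend(traceID [16]byte, backends []string) string {
	sorted := append([]string(nil), backends...)
	sort.Strings(sorted)
	key := binary.BigEndian.Uint64(traceID[:8])
	return sorted[JumpHash(key, len(sorted))]
}

func main() {
	backends := []string{"10.0.0.3:55680", "10.0.0.1:55680", "10.0.0.2:55680"}
	traceID := [16]byte{0xde, 0xad, 0xbe, 0xef}
	fmt.Println(pickBackend(traceID, backends))
}
```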
I saw it the first time in @annanay25's branch and found it highly interesting, especially because it's implemented in 5 lines written by a PerlMonk. I decided to read the paper, and this statement in the abstract intrigued me:
Its main limitation is that the buckets must be numbered sequentially, which makes it more suitable for data storage applications than for distributed web caching
Reading further, I realized that jumphash might not be a good choice for our use case at all:
This ability to add or remove buckets in any order can be valuable for cache servers where the servers are purely a performance improvement. But for data storage applications, where each bucket represents a different shard of the data, it is not acceptable for shards to simply disappear, because that shard is only place where the corresponding data is stored. Typically this is handled by either making the shards redundant (with several replicas), or being able to quickly recover a replacement, or accepting lower availability for some data. Server death thus does not cause reallocation of data; only capacity changes do. This means that shards can be assigned numerical ids in increasing order as they are added, so that the active bucket ids always fill the range [0, num_buckets)
Basically, we can scale up, but can't scale down.
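A small sketch of why, under the assumptions above (backend addresses kept in sorted order, jump hash over the index range): removing the last backend only remaps the keys that lived on it, but losing a backend in the middle shifts every later address down one slot and remaps far more traces. The addresses and helper below are hypothetical; JumpHash is the same reference algorithm as in the earlier snippet:

```go
package main

import (
	"fmt"
	"math/rand"
)

// JumpHash is the reference jump consistent hash algorithm.
func JumpHash(key uint64, numBuckets int) int32 {
	var b, j int64 = -1, 0
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1
		j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
	}
	return int32(b)
}

// movedKeys counts how many sampled keys land on a different backend
// address after the backend list changes.
func movedKeys(before, after []string, samples int) int {
	r := rand.New(rand.NewSource(1))
	moved := 0
	for i := 0; i < samples; i++ {
		key := r.Uint64()
		if before[JumpHash(key, len(before))] != after[JumpHash(key, len(after))] {
			moved++
		}
	}
	return moved
}

func main() {
	backends := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"}

	// Removing the *last* backend only remaps the keys that lived on it (~20%).
	withoutLast := backends[:4]
	fmt.Println("last removed:  ", movedKeys(backends, withoutLast, 100000))

	// Losing a backend in the *middle* shifts every later address down one
	// slot, so far more keys end up on a different machine (over half here).
	withoutMiddle := append(append([]string(nil), backends[:2]...), backends[3:]...)
	fmt.Println("middle removed:", movedKeys(backends, withoutMiddle, 100000))
}
```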
Ah, thank you. I knew we used jumphash with memcached in Cortex so I thought it might be useful here, but you are correct. We use a statefulset to force identity/order so that new shards are always appended to the end.
Replace double negative (never false) with positive (always true)
Force-pushed from c661a0e to 531ff5d
@bogdandrutu would you like me to split this PR? It would be roughly 8 PRs, if we divide the almost 4k lines of diff by 500. Or would you like to try to review commit by commit, as suggested in open-telemetry/opentelemetry-collector#1952?
Build failure seems to be unrelated:
Force-pushed from 531ff5d to 61f3be2
Looking again at this PR, it isn't that big after all: half of it is changes to
Tests are almost an exact match:
@jpkrohling one of the reasons we ask for a small PR first is to understand whether we want it in the repo. We don't want contributors to spend time on something that may potentially be rejected (not saying this one will be rejected, I did not review it). A first PR with config and README is a great way to get this initial agreement.
To be honest, I just wanted to write a PoC to keep the discussion in open-telemetry/opentelemetry-collector#1724 going. This PR shouldn't be reviewed before we have an agreement there. If this is indeed the direction we want to go, I'll split this PR in this way:
I haven't heard anything on this yet, which I take to mean that there are no objections to the design. I'll start splitting this into 4 PRs; hence, I'm closing this one.
Description: Added a Trace ID aware load-balancing exporter.
Link to tracking Issue: open-telemetry/opentelemetry-collector#1724
Testing: some manual tests (missing Kubernetes e2e tests for DNS resolver) + unit tests
Documentation: readme file