
Conversation

@jpkrohling (Member)

Description: Added a Trace ID aware load-balancing exporter.

Link to tracking Issue: open-telemetry/opentelemetry-collector#1724

Testing: some manual tests (missing Kubernetes e2e tests for DNS resolver) + unit tests

Documentation: readme file

@jpkrohling (Member Author)

Each commit in this PR is a step towards the final solution and can be reviewed individually.

@jpkrohling force-pushed the jpkrohling/1724-loadbalancing-processor branch 2 times, most recently from a319e1e to 156554f on October 9, 2020
@joe-elliott (Contributor) left a comment

Added a few comments. Thank you for this work @jpkrohling. Very nice.

Contributor:

> that either splits the incoming batches into multiple batches, one per trace

Does this exist? might be better just to mention groupbytrace if it's the only existing option.

@jpkrohling (Member Author):

It will, once #1235 is addressed :-) I think groupbytrace isn't suitable for all cases, although I would certainly recommend it for most cases that I can think of.

Contributor:

should the example show groupbytrace since it's basically required?

@jpkrohling (Member Author):

I thought about it, but I think I would prefer to have the future "splitbatch" there instead.

Contributor:

this example is cracking me up. I love how flexible the collector is.

Contributor:

perhaps use a jumphash? I'm pretty sure it will work in this use case and it's significantly less code.

https://github.com/dgryski/go-jump

> We present jump consistent hash, a fast, minimal memory, consistent hash algorithm that can be expressed in about 5 lines of code. In comparison to the algorithm of Karger et al., jump consistent hash requires no storage, is faster, and does a better job of evenly dividing the key space among the buckets and of evenly dividing the workload when the number of buckets changes. Its main limitation is that the buckets must be numbered sequentially, which makes it more suitable for data storage applications than for distributed web caching.

I'm no expert, but I think this will work if you just order the IPs you're load balancing over in increasing order.
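For reference, a minimal sketch of what that could look like. Assumptions: the jump function below is transcribed from the ~5-line algorithm in the Lamping & Veach paper (not taken from this PR), and the backend list and trace ID are hypothetical stand-ins for whatever the resolver returns.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

// jumpHash is the ~5-line jump consistent hash from the Lamping & Veach paper:
// it maps a 64-bit key to a bucket in [0, numBuckets).
func jumpHash(key uint64, numBuckets int32) int32 {
	var b int64 = -1
	var j int64
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1
		j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
	}
	return int32(b)
}

func main() {
	// Hypothetical resolved backends, kept in increasing order so that the
	// bucket numbering stays stable across resolutions.
	backends := []string{"10.0.0.12:4317", "10.0.0.7:4317", "10.0.0.9:4317"}
	sort.Strings(backends)

	// A trace ID is 16 bytes; use its first 8 bytes as the 64-bit key.
	traceID := [16]byte{0xde, 0xad, 0xbe, 0xef, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
	key := binary.BigEndian.Uint64(traceID[:8])

	backend := backends[jumpHash(key, int32(len(backends)))]
	fmt.Println("trace routed to", backend)
}
```

Since jump hash only returns a bucket index, the mapping from index to backend must itself be stable, which is why the slice is kept sorted.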

@jpkrohling (Member Author):

I saw it for the first time in @annanay25's branch and found it highly interesting, especially because it's implemented in 5 lines written by a PerlMonks user. I decided to read the paper, and this statement in the abstract intrigued me:

> Its main limitation is that the buckets must be numbered sequentially, which makes it more suitable for data storage applications than for distributed web caching.

Reading further, I realized that jumphash might not be a good choice for our use case at all:

> This ability to add or remove buckets in any order can be valuable for cache servers where the servers are purely a performance improvement. But for data storage applications, where each bucket represents a different shard of the data, it is not acceptable for shards to simply disappear, because that shard is the only place where the corresponding data is stored. Typically this is handled by either making the shards redundant (with several replicas), or being able to quickly recover a replacement, or accepting lower availability for some data. Server death thus does not cause reallocation of data; only capacity changes do. This means that shards can be assigned numerical ids in increasing order as they are added, so that the active bucket ids always fill the range [0, num_buckets).

Basically, we can scale up, but can't scale down.
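By contrast, with a consistent hash ring each backend owns explicitly placed points on the ring, so backends can be added or removed in any order and only the keys that land on the affected arcs get remapped. A rough sketch of that idea follows; this is not the implementation in this PR, and the backend names and virtual-node count are made up.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

// ringEntry is one virtual node: a position on the ring plus the backend that owns it.
type ringEntry struct {
	pos     uint32
	backend string
}

// buildRing places a number of virtual nodes per backend on a 32-bit ring.
func buildRing(backends []string, virtualNodes int) []ringEntry {
	var ring []ringEntry
	for _, b := range backends {
		for i := 0; i < virtualNodes; i++ {
			pos := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s-%d", b, i)))
			ring = append(ring, ringEntry{pos: pos, backend: b})
		}
	}
	sort.Slice(ring, func(i, j int) bool { return ring[i].pos < ring[j].pos })
	return ring
}

// lookup finds the first virtual node at or after the key's position, wrapping around.
func lookup(ring []ringEntry, key []byte) string {
	pos := crc32.ChecksumIEEE(key)
	i := sort.Search(len(ring), func(i int) bool { return ring[i].pos >= pos })
	if i == len(ring) {
		i = 0
	}
	return ring[i].backend
}

func main() {
	ring := buildRing([]string{"collector-0:4317", "collector-1:4317", "collector-2:4317"}, 100)
	traceID := []byte{0xde, 0xad, 0xbe, 0xef, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
	fmt.Println("trace routed to", lookup(ring, traceID))
}
```

Removing any backend, in any position, only remaps the trace IDs whose positions fell on that backend's arcs, which is exactly the scale-down property jump hash gives up.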

Contributor:

Ah, thank you. I knew we used jumphash with memcached in Cortex so I thought it might be useful here, but you are correct. We use a statefulset to force identity/order so that new shards are always appended to the end.

@jpkrohling (Member Author):

Replace double negative (never false) with positive (always true)

@jpkrohling force-pushed the jpkrohling/1724-loadbalancing-processor branch from c661a0e to 531ff5d on October 14, 2020
@jpkrohling marked this pull request as ready for review on October 14, 2020
@jpkrohling requested a review from a team on October 14, 2020
@jpkrohling (Member Author) commented Oct 14, 2020

@bogdandrutu would you like me to split this PR? It would be roughly 8 PRs if we divide the almost 4k lines of diff by 500. Or would you like to try to review it commit by commit, as suggested in open-telemetry/opentelemetry-collector#1952?

@jpkrohling (Member Author)

Build failure seems to be unrelated:

ok  	github.com/open-telemetry/opentelemetry-collector-contrib/extension/jmxmetricsextension	0.054s	coverage: 100.0% of statements
--- FAIL: TestWithEnvVarsIntegration (5.00s)
    integration_test.go:150: 
        	Error Trace:	integration_test.go:150
        	Error:      	Condition never satisfied
        	Test:       	TestWithEnvVarsIntegration
FAIL
coverage: 98.3% of statements
FAIL	github.com/open-telemetry/opentelemetry-collector-contrib/extension/jmxmetricsextension/subprocess	5.029s
FAIL
make[2]: *** [../../Makefile.Common:58: do-unit-tests-with-cover] Error 1

@jpkrohling force-pushed the jpkrohling/1724-loadbalancing-processor branch from 531ff5d to 61f3be2 on October 15, 2020
@jpkrohling (Member Author)

Looking again at this PR, it isn't that big after all: half of it is changes to go.sum. Then, quite a big chunk is for tests. The readme has yet another ~150 lines. In the end, the actual business code is under 400 lines:

$ find . -name "*.go" -a -not -name "*_test.go" | xargs cat | grep -v "^\/\/" | sort | uniq | wc -l
397

Tests are almost an exact match:

$ find . -name "*_test.go" | xargs cat | grep -v "^\/\/" | sort | uniq | wc -l
398

@tigrannajaryan (Member)

@jpkrohling one of the reasons we ask for a small PR first is to understand if we want it in the repo. We don't want contributors to spend time on something that may potentially be rejected (not saying this one will be rejected, I did not review it). A first PR with config and README is a great way to get this initial agreement.
I still highly recommend splitting it that way. No need for 8 PRs; your line count analysis shows that it can be 2 or 3 PRs.

@jpkrohling (Member Author)

To be honest, I just wanted to write a PoC to keep the discussion in open-telemetry/opentelemetry-collector#1724 going. This PR shouldn't be reviewed before we have an agreement there. If this is indeed the direction we want to go, I'll split this PR in this way:

  1. skeleton (and readme)
  2. consistent hash ring
  3. resolvers (interface, DNS and static)
  4. processor implementation

@jpkrohling (Member Author) commented Oct 22, 2020

I haven't heard anything on this yet, which I take to mean that there are no objections to the design. I'll start splitting this into 4 PRs; hence, I'm closing this one.

@jpkrohling closed this Oct 22, 2020
dyladan referenced this pull request in dynatrace-oss-contrib/opentelemetry-collector-contrib Jan 29, 2021
ljmsc referenced this pull request in ljmsc/opentelemetry-collector-contrib Feb 21, 2022