Skip to content

Conversation

Erog38
Copy link
Contributor

@Erog38 Erog38 commented May 7, 2025

Description

Adds a new timeBox object which has the sole purpose of generating unique attributes per second. This allows for telemetrygen to not override it's own values when sending to backends.

Link to tracking issue

Fixes #39933

Testing

Additional test added to cover new functionality

Documentation

New Flag added along with help criteria

@Erog38 Erog38 requested review from a team, codeboten and mx-psi as code owners May 7, 2025 19:31
@github-actions github-actions bot added the cmd/telemetrygen telemetrygen command label May 7, 2025
@Erog38 Erog38 changed the title Fix/overriding timeseries [cmd/telemetrygen] Add ability to not ensure uniqueness of timeseries within a one second window May 7, 2025
@Erog38 Erog38 changed the title [cmd/telemetrygen] Add ability to not ensure uniqueness of timeseries within a one second window [cmd/telemetrygen] Add ability to ensure uniqueness of timeseries within a one second window May 7, 2025
Copy link
Member

@mx-psi mx-psi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this via rate.Limit instead of adding a new thing? And can we make this configurable? OTLP supports up to nanosecond precision

numMetrics int // how many metrics the worker has to generate (only when duration==0)
enforceUnique bool // if true, the worker will generate unique timeseries
totalDuration time.Duration // how long to run the test for (overrides `numMetrics`)
limitPerSecond rate.Limit // how many metrics per second to generate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we just have a limit of 1 here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The limit wouldn't allow for load-test levels of telemetry to be generated for backend systems without an astronomical number of binaries running. This creates new timeseries to ensure we're sending unique values within the provided time limit.

@Erog38
Copy link
Contributor Author

Erog38 commented May 8, 2025

Can we do this via rate.Limit instead of adding a new thing?

Somewhat answered this from your other comment, but the difference here is that we still want to be able to generate close to the same amount of load configurable by rate, but will enforce that the timeseries generated are unique within a set amount of time.

And can we make this configurable? OTLP supports up to nanosecond precision

Totally, updates incoming.

@Erog38
Copy link
Contributor Author

Erog38 commented May 8, 2025

I may have misunderstood the idea of rate.Limit being used here after looking back at it. The main issue is that the uniqueness should be guaranteed across all workers, by default we're creating a rate.Limiter with each worker currently and I'd love to not refactor that for the possible implications of what a global rate Limiter would entail 😅

@Erog38
Copy link
Contributor Author

Erog38 commented May 8, 2025

I was looking at go's rate.Limiter functionality to see if it made sense, and I don't think it does since we aren't actually wanting to limit generation of timeseries, just to ensure that within X configurable amount of time, we're using a unique timeseries.

This pattern seems to hold up well with my tests against a prom remote write backend ( telemetrygen -> collector -> prometheus compatible backend ) at 1s and 15s intervals now that the flag is configurable.

This change does mean that telemetrygen just became a cardinality generator as well though, as setting a high rate with --unique-timeseries-duration set to a value above the --duration will only generate new timeseries during the entirety of the test.

Possibly dangerous, but a good testing tool to have in our belt imo.

@Erog38
Copy link
Contributor Author

Erog38 commented May 20, 2025

@mx-psi Any additional feedback on this guy?

@atoulme
Copy link
Contributor

atoulme commented Jun 3, 2025

@mx-psi @codeboten please review as codeowners.

@Erog38 Erog38 changed the title [cmd/telemetrygen] Add ability to ensure uniqueness of timeseries within a one second window [cmd/telemetrygen] Add ability to ensure uniqueness of timeseries within a time period Jun 9, 2025
@Erog38
Copy link
Contributor Author

Erog38 commented Jun 9, 2025

Hey all, I'm still here and open to additional feedback on this one.

@mx-psi
Copy link
Member

mx-psi commented Jun 10, 2025

Sorry, I am a bit under the water currently but will take a look at this eventually

Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jun 25, 2025
@Erog38
Copy link
Contributor Author

Erog38 commented Jun 26, 2025

Just checking in so this doesn't end up completely disappearing 😅

@Erog38
Copy link
Contributor Author

Erog38 commented Jun 26, 2025

/remove-label stale

Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jul 11, 2025
@Erog38
Copy link
Contributor Author

Erog38 commented Jul 11, 2025

/remove-label stale

@github-actions github-actions bot removed the Stale label Jul 12, 2025
@Erog38
Copy link
Contributor Author

Erog38 commented Jul 14, 2025

@mx-psi I know you've been overwhelmed, if there's anything I can take off your plate to help out with this one please let me know. I've been using this against our systems since this PR opened and it's been a killer help to ensure I don't get errors from the prometheus backend when sending via an otel collector.

@atoulme
Copy link
Contributor

atoulme commented Jul 20, 2025

@codeboten would you please review?

Copy link
Member

@mx-psi mx-psi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for taking this long to reply. This looks good to me to merge as is if CI passes

@mx-psi mx-psi merged commit 9c4554f into open-telemetry:main Jul 21, 2025
177 checks passed
Dylan-M pushed a commit to Dylan-M/opentelemetry-collector-contrib that referenced this pull request Aug 5, 2025
…hin a time period (open-telemetry#39935)

#### Description

Adds a new `timeBox` object which has the sole purpose of generating
unique attributes per second. This allows for telemetrygen to not
override it's own values when sending to backends.

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes open-telemetry#39933

<!--Describe what testing was performed and which tests were added.-->
#### Testing

Additional test added to cover new functionality

#### Documentation

New Flag added along with help criteria

---------

Co-authored-by: Pablo Baeyens <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Telemetrygen creates more timeseries per second than can be ingested

4 participants