Rename feature of the metrics transform processor #347
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #347      +/-   ##
==========================================
+ Coverage   83.48%   83.53%   +0.04%
==========================================
  Files         171      169       -2
  Lines        9261     9037     -224
==========================================
- Hits         7732     7549     -183
+ Misses       1199     1161      -38
+ Partials      330      327       -3
Continue to review full report at Codecov.
Is this just for renaming metrics (not labels)? If so, seems similar to spanprocessor -- should this be called metricprocessor? Also, why is this in contrib? This belongs in core.
Hi @flands we chatted a fair bit with @bogdandrutu and @jrcamp about this (mostly in an offline meeting, but part of the conversation is captured in the associated issue, #332). We talked about the value of having the rename processor live in core, but in tension with that is the value of co-locating all of the processing/renaming/aggregating for a single metric in one place. Our consensus coming out of that was that initially this would all live in contrib as a single processor that allows metric, label, and label value renaming as well as basic aggregations (dropping labels, combining label values down to a single value). In the future there is the potential to have a rename-only metric processor in core that could share underlying code with the rename functionality of this processor. (As an aside, we have been using the Draft feature of PRs as a way to get preliminary review from us as intern hosts before sending it out to the community for official review - hope that's OK.)
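For readers skimming the thread, here is a rough sketch of the configuration shape being discussed. This is only an illustration inferred from the Transform, Action, MetricName and NewName identifiers that appear in the review comments below, not the processor's actual exported API:

// Hypothetical sketch of the per-metric transform configuration; names are
// illustrative and may not match the final implementation.
type Action string

const (
	Update Action = "update" // rename/modify the matched metric in place
	Insert Action = "insert" // insert a renamed copy of the matched metric
)

// Transform describes one ordered operation applied to a matched metric.
type Transform struct {
	MetricName string // name of the metric to operate on
	Action     Action // update in place, or insert a copy
	NewName    string // new metric name, when renaming
}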
Left some comments from an initial scan through. Overall structure looks good.
Note you'll also need to make the processor package a Go module since it's in contrib.
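For reference, making the package its own Go module essentially means adding a go.mod next to it. A minimal sketch, with the module path inferred from the file paths in this PR and the Go version and dependency left as placeholders:

module github.com/open-telemetry/opentelemetry-collector-contrib/processor/metricstransformprocessor

go 1.14

// add a require directive for the collector core module here, pinned to the
// same version the rest of contrib uses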
@asuresh4 Please let me know if the changes here look good. If so, I will go ahead and resolve the two conflicting files. I have resolved the conflicting files before, but whenever another PR is merged, these files conflict again. Therefore, I will wait for your approval on the rest of the changes this time, and only resolve the conflicts once right before this merges. Thanks! :)
}

// validNewName determines if the new name is a valid one. An invalid one is one that already exists.
func (mtp *metricsTransformProcessor) validNewName(transform Transform, nameToMetricMapping map[string]*metricspb.Metric) bool {
I still think this function would be more accurately called something like metricNameExists(...) bool.
Also, I didn't catch this before - what was the main justification for keeping this validation? I'm still 50/50 on whether it's needed at all. While it would imply an error to have two metrics with the same name, I don't think there's any other code validating that there aren't already duplicates in this list (which, if there were, could result in unexpected behaviour in your code), and I'm not sure it's worth paying the cost of allocating this extra map, having more complex code, etc.
It's likely an error will be thrown at export time if this situation occurs anyway.
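In other words, a sketch of the renamed helper (the map argument mirrors the existing nameToMetricMapping parameter; the body only illustrates the intended semantics, not the PR's actual implementation):

// metricNameExists reports whether a metric with the given name is already
// present in the current batch. Sketch only.
func (mtp *metricsTransformProcessor) metricNameExists(name string, nameToMetricMapping map[string]*metricspb.Metric) bool {
	_, exists := nameToMetricMapping[name]
	return exists
}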
If you do agree with me about removing this validation, you can remove a huge amount of the code in the core functions of this file, i.e. the transform function can be simplified to:
for _, metric := range data.Metrics {
	transform := nameToTransformMapping[metric.MetricDescriptor.Name]
	if transform.Action == Insert {
		metric = mtp.createCopy(metric)
		mds[i].Metrics = append(mds[i].Metrics, metric)
	}
	mtp.update(metric, transform)
}
The nameToTransformMapping can be made once on initialization.
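A sketch of what building that mapping once at construction could look like, assuming a Config type that exposes the ordered Transforms slice and a struct field to hold the map (names are illustrative, not the PR's actual code):

// newMetricsTransformProcessor builds the name -> transform lookup once so it
// does not have to be rebuilt on every batch. Sketch only.
func newMetricsTransformProcessor(next consumer.MetricsConsumer, cfg *Config) *metricsTransformProcessor {
	nameToTransformMapping := make(map[string]Transform, len(cfg.Transforms))
	for _, t := range cfg.Transforms {
		nameToTransformMapping[t.MetricName] = t
	}
	return &metricsTransformProcessor{
		next:                   next,
		nameToTransformMapping: nameToTransformMapping,
	}
}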
So this validNewName and the other validation functions came out of discussions with Dave and Quentin. Basically, when these unexpected behaviors occur, not only do they show up in the output data, but an error message is also logged, so that the errors are more visible to users.
Also, regarding simplifying the code this way: since we want the transformations to be performed in the order given in the list, I would think we need to iterate through the transforms instead of the metrics, which then involves keeping the metrics mapping up to date as we go through the transformations.
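A sketch of that transform-ordered shape, assuming the OpenCensus-style consumerdata.MetricsData used elsewhere in this thread, an ordered transforms slice on the processor, and the update helper referenced above (the Insert path is elided to keep the illustration short; this is not the PR's actual implementation):

// transform applies the configured transforms to one batch in configuration
// order, keeping the name -> metric map in sync as renames happen. Sketch only.
func (mtp *metricsTransformProcessor) transform(data *consumerdata.MetricsData) {
	nameToMetricMapping := make(map[string]*metricspb.Metric, len(data.Metrics))
	for _, metric := range data.Metrics {
		nameToMetricMapping[metric.MetricDescriptor.Name] = metric
	}
	for _, t := range mtp.transforms {
		metric, ok := nameToMetricMapping[t.MetricName]
		if !ok {
			continue
		}
		mtp.update(metric, t)
		if t.NewName != "" {
			// later transforms must be able to match the renamed metric
			nameToMetricMapping[t.NewName] = metric
			delete(nameToMetricMapping, t.MetricName)
		}
	}
}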
Yea okay, that's true. I was hoping to find a solution that would avoid having to allocate an extra map on each call to ConsumeMetrics, but if we want to guarantee the order of operations that is probably not possible.
I am also slightly concerned that if we ever extend this processor to support filtering beyond just "name", then it won't make sense to have a simple map anymore, but we can figure out how to handle that later.
}
// if name is updated, the map has to be updated
nameToMetricMapping[transform.NewName] = nameToMetricMapping[transform.MetricName]
delete(nameToMetricMapping, transform.MetricName)
This creates a bit of weird behaviour where the order of the actions may matter in terms of determining whether an error will occur. This is another reason to just remove this logic entirely imo (see comment on L182)
I thought the order of transformations here should matter: when I discussed this with Dave and Quentin, the idea was that the transforms list is an ordered list of actions that are applied to the metrics one by one. Do you think that shouldn't be the case?
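As a concrete illustration of why the order matters (metric names and values here are made up, using the hypothetical Transform shape sketched earlier):

// The second transform only matches because the first one has already renamed
// the metric; reversing the list would change the result.
transforms := []Transform{
	{MetricName: "system.cpu.usage", Action: Update, NewName: "cpu.usage"},
	{MetricName: "cpu.usage", Action: Insert, NewName: "cpu.usage.copy"},
}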
	delete(nameToMetricMapping, transform.MetricName)
} else if transform.Action == Insert {
	var newMetric *metricspb.Metric
	mds[i].Metrics, newMetric = mtp.insert(metric, mds[i].Metrics, transform)
Nit: you can put mds[i] into a variable so you don't need to dereference it on each iteration.
I actually tried to use data directly, which is mds[i] from the for loop, but that way the original metric is not updated correctly, since mds is a slice of values, not pointers.
Oh interesting. I guess you could set data := &mds[i] instead.
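For reference, the underlying Go behaviour in a self-contained toy example (not the processor's actual types):

package main

import "fmt"

type metricsData struct{ name string }

func main() {
	mds := []metricsData{{name: "old"}}

	// data is a copy of the slice element, so mutating it does not change mds.
	data := mds[0]
	data.name = "new"
	fmt.Println(mds[0].name) // prints "old"

	// taking a pointer into the slice updates the original element.
	ptr := &mds[0]
	ptr.name = "new"
	fmt.Println(mds[0].name) // prints "new"
}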
}

// validNewLabel determines if the new label is a valid one. An invalid one is one that already exists.
func (mtp *metricsTransformProcessor) validNewLabel(labelKeys []*metricspb.LabelKey, newLabel string) bool {
This code is relatively minimal and straightforward, but even so, similar to the above, I'm a fan of doing little or no validation here and keeping this processor lean and fast. If the user screws up, they will export bad data, or more likely get an error at export time.
Happy to be convinced otherwise on this one though.
This would be for the same reasoning as the other validations. (This one guards against renaming a label to a name that already exists, because that should be done through an aggregation.) These are all there to keep operations within the boundaries of the expected behaviors, so that the processor notices when an operation steps out of bounds. However, after thinking about the single responsibility of the processor, I do see how these validations could be removed: the processor should only worry about transforming the data as the user specifies, even if the user's specification might imply a possible error. I also don't see any other processors that validate their input this way, so from a convention standpoint, removing the validations seems reasonable too. This does mean, however, that users' data may be transformed in ways that result in bad output metric data. If that is acceptable for processors, then with your approval I would love to remove the validations, because it also makes the code a lot more straightforward.
My opinion is let's start without it and keep the code very simple. We can add it in later if people want it. But I'm okay with leaving it there if you feel strongly about it.
No problem! The current plan is to remove the validations for the sake of starting simple, but I will keep this code saved in another branch. If we later decide to add it back in, I will also need to modify the logging process, so before I do any additional work on that, I want to make sure the work is needed.
Description: Rename functionality, including renaming metric names and labels
Link to tracking Issue: #332
Testing: unit tests, 100% coverage within module
Documentation: Detailed comments in code describing the responsibility of each function