Skip to content

Conversation

pmatyjasek-sumo
Copy link
Contributor

Description:
Add ability to associate metadata tags in k8s processor using pod UID rather than just IP

Link to tracking Issue: #1146

Testing:
Unit tests

Documentation:
In-code comments

@pmatyjasek-sumo pmatyjasek-sumo requested a review from a team January 27, 2021 14:23
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jan 27, 2021

CLA Signed

The committers are authorized under a signed CLA.

  • ✅ Patryk Matyjasek (d90e3af2238249273579227b6f9c46a52ee15ae9)

Base automatically changed from master to main January 28, 2021 00:57
@pmatyjasek-sumo pmatyjasek-sumo force-pushed the pm_k8s_pod_uid_labels_impl branch from d90e3af to 44c8f1e Compare January 28, 2021 12:45
@codecov
Copy link

codecov bot commented Jan 28, 2021

Codecov Report

Merging #2199 (9721178) into main (63b2f33) will increase coverage by 0.00%.
The diff coverage is 99.06%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #2199   +/-   ##
=======================================
  Coverage   90.45%   90.46%           
=======================================
  Files         397      397           
  Lines       19563    19614   +51     
=======================================
+ Hits        17695    17743   +48     
- Misses       1406     1408    +2     
- Partials      462      463    +1     
Flag Coverage Δ
integration 69.37% <ø> (+0.06%) ⬆️
unit 89.25% <99.06%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...cessor/k8sprocessor/observability/observability.go 91.66% <50.00%> (-8.34%) ⬇️
processor/k8sprocessor/factory.go 97.36% <100.00%> (+0.03%) ⬆️
processor/k8sprocessor/kube/client.go 100.00% <100.00%> (ø)
processor/k8sprocessor/options.go 100.00% <100.00%> (ø)
processor/k8sprocessor/pod_association.go 100.00% <100.00%> (ø)
processor/k8sprocessor/processor.go 100.00% <100.00%> (ø)
receiver/k8sclusterreceiver/watcher.go 95.29% <0.00%> (-2.36%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 63b2f33...9721178. Read the comment docs.

@pmatyjasek-sumo pmatyjasek-sumo force-pushed the pm_k8s_pod_uid_labels_impl branch from 44c8f1e to 9725609 Compare January 28, 2021 13:07
Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really nice!

return
}
}
c.Pods[pod.Status.PodIP] = newPod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I get it right that the same pod will exist twice in the list, once keyed by the IP and once by the ID?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually discussed this with @pmatyjasek-sumo while he was working on the extension :) Having a single map and two references to the pod just makes the things easier, since this are different domains that have no risk of conflicts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpkrohling as @pmm-sumo mentioned we had some discussion on that and we decided that it should be ok if 2 keys from map will point to the same reference. With this we avoid to use two loops to iterate through two lists of Pods (one by IP and second by UID)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that's fine, but you might want to add the current size of this map in a metric somewhere: this is the kind of data that can easily explode in terms of number of elements. It's not unusual to have tens of thousands of pods in a given cluster, so, having this info would help people running the processor to identify any possible memory issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this done? If you don't to do it in this PR, create an issue to track this.

pmatyjasek-sumo and others added 2 commits January 29, 2021 11:18
Co-authored-by: Juraci Paixão Kröhling <[email protected]>
- Add more tests for added code
- Refactor Association table
- Remove unnecessary annonymous function
dyladan referenced this pull request in dynatrace-oss-contrib/opentelemetry-collector-contrib Jan 29, 2021
@github-actions
Copy link
Contributor

github-actions bot commented Feb 6, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Feb 6, 2021
@jpkrohling
Copy link
Member

What's the current state of this PR? Is this ready to be reviews again?

@pmatyjasek-sumo
Copy link
Contributor Author

@jpkrohling yes it is ready for review. I've addressed previous comments in latest commit.

@github-actions github-actions bot removed the Stale label Feb 9, 2021
@jpkrohling
Copy link
Member

I have only one concern about the size the map might get, as it grows twice the rate of the number of pods being used. I believe the size of this map should be part of the set of metrics for the collector, as it would help correlate a possible increase in memory usage with the number of items in this map.

@pmatyjasek-sumo
Copy link
Contributor Author

@jpkrohling that is a good idea, I'll put this metric soon

otelsvc/k8s/pod_table_size

Signed-off-by: Patryk Matyjasek <[email protected]>
@pmatyjasek-sumo
Copy link
Contributor Author

I've updated PR

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a minor comment about the metrics. If you don't want to work on them here, let me know and I'll approve the PR once the tracking issue is created.

mPodsUpdated = stats.Int64("otelsvc/k8s/pod_updated", "Number of pod update events received", "1")
mPodsAdded = stats.Int64("otelsvc/k8s/pod_added", "Number of pod add events received", "1")
mPodsDeleted = stats.Int64("otelsvc/k8s/pod_deleted", "Number of pod delete events received", "1")
mPodTableSize = stats.Int64("otelsvc/k8s/pod_table_size", "Size of table containing pod info", "1")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but is there a list of those metrics somewhere, telling users of this processor what they actually mean? Also, it would be a good opportunity to:

  • s/otelsvc/otelcol
  • replace the slashes by underscores, as it seems to be the standard at other metrics

If you don't want to do them here, feel free to open a tracking issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpkrohling I've created an issue and also I'll start work on it #2322

@pmm-sumo
Copy link
Contributor

@tigrannajaryan this is the PR we discussed during Log SIG yesterday. It will allow us to tag logs with K8s metadata

Copy link
Contributor

@owais owais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a couple of minor comments.

My understanding is that if a user does not update config to specify the new pod association rules, the processor will continue to work with connection/attribute IP, right?

id: string(pod.UID),
name: pod.Name,
ts: time.Now(),
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: looks very repetitive. could be extracted into a function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch


// If IP is still not available by this point, nothing can be tagged here. Return.
if podIP == "" {
podAttributes := k8sPodAssociationFromAttributes(ctx, resource.Attributes(), kp.podAssociations)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have another attrsToAdd := kp.getAttributesForPod() below and this one is named podAttributes as well. This could be a bit confusing to follow. IIRC, podAttributes here is actually a map of pod identifiers, such as IP, UID, etc, right? If so, let's name it podIdentifiers or something similar.

@pmatyjasek-sumo
Copy link
Contributor Author

LGTM with a couple of minor comments.

My understanding is that if a user does not update config to specify the new pod association rules, the processor will continue to work with connection/attribute IP, right?

Yes it will work as previous

// If a match is found, the cached metadata is added to the data as resource attributes.
// extracted metadata to the relevant spans, metrics and logs. The processor uses the kubernetes API to discover all pods
// running in a cluster, keeps a record of their IP addresses, pod UIDs and interesting metadata.
// The rules for associating the source record with specific Pod Metadata are configured via "pod_association" key.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"source record" is not an entirely clear term. Does this refer to the telemetry in pdata that is passing through the processor?

// It represents a list of rules that are executed in order until the first one is able to do the match.
// Each rule is specified as a pair of from (representing the rule type) and name (representing the extracted key name).
// Following rule types are available:
// from: "labels" - takes the attribute from resource attributes (the value can contain either IP address or Pod UID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "resource" here refer to pdata.Resource or to the pod? Is this pdata.Resource attributes or pod labels? I am confused by mixed usage of "labels" and "attributes" here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'll clarify that

* Add more descriptive in-code comments for PodAssociation,
* Simplify PodAssociation logic
* Update tests

Signed-off-by: Patryk Matyjasek <[email protected]>
@pmatyjasek-sumo
Copy link
Contributor Author

@tigrannajaryan I've addressed Your comments. Please check if it is clean now.

Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @pmatyjasek-sumo
I think I understand the code a bit bitter, but would like a bit more clarifications if possible.

)

// k8sPodAssociationFromAttributes extracts IP and pod UID from attributes
func k8sPodAssociationFromAttributes(ctx context.Context, attrs pdata.AttributeMap, associations []kube.Association) map[string]string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Or alternatively remove that case from this function and handle it separately.

// Allowed values are connection and labels
From string `mapstructure:"from"`

// Name represents the name of the association.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the answer below you wrote:

name (representing the extracted key name).

If "name" is extracted key name (which I think means the attribute name if I understand correctly) then what is "name of association"?

pmatyjasek-sumo and others added 2 commits February 23, 2021 08:54
* add custom type PodIdentifier
* fix tags adding behavior
* update comments
* refactor pod association extracting

Signed-off-by: Patryk Matyjasek <[email protected]>
@pmatyjasek-sumo
Copy link
Contributor Author

@tigrannajaryan I've provided some refactor to IDs extraction and association metadata tags. I hope it is clean now

// - from: connection
// name: ip
// - from: labels
// name: pod_uid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be wrong but if I understand correctly we cannot expect to see "pod_uid" as a Resource attribute in normal use cases. Let's use an attribute name that is realistic, even if it is just an example.


// If IP is still not available by this point, nothing can be tagged here. Return.
if podIP == "" {
podIdentifierLabel, podIdentifierValue := extractPodIds(ctx, resource.Attributes(), kp.podAssociations)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is best not to use the term "label" in this context. OpenTelemetry's vocabulary for Resources is "attribute name" or "attribute key". I think the following would be clear:

Suggested change
podIdentifierLabel, podIdentifierValue := extractPodIds(ctx, resource.Attributes(), kp.podAssociations)
podIdentifierKey, podIdentifierValue := extractPodIds(ctx, resource.Attributes(), kp.podAssociations)

@tigrannajaryan
Copy link
Member

@tigrannajaryan I've provided some refactor to IDs extraction and association metadata tags. I hope it is clean now

@pmatyjasek-sumo yes, it looks much better, thanks. Please address the remaining comments.

* Additional comments,
* Rename methods,
* fix misspells

Signed-off-by: Patryk Matyjasek <[email protected]>
@pmatyjasek-sumo
Copy link
Contributor Author

@tigrannajaryan please take a look

Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Thank you @pmatyjasek-sumo

@tigrannajaryan tigrannajaryan merged commit bee7cb1 into open-telemetry:main Feb 25, 2021
This was referenced Mar 15, 2021
pmatyjasek-sumo referenced this pull request in pmatyjasek-sumo/opentelemetry-collector-contrib Apr 28, 2021
…ather than just IP (#2199)

Add ability to associate metadata tags in k8s processor using pod UID rather than just IP

**Link to tracking Issue:** #1146 

**Testing:** 
Unit tests

**Documentation:** 
In-code comments
ljmsc referenced this pull request in ljmsc/opentelemetry-collector-contrib Feb 21, 2022
…rnal/tools (#2199)

* Bump github.com/golangci/golangci-lint in /internal/tools

Bumps [github.com/golangci/golangci-lint](https://github.com/golangci/golangci-lint) from 1.41.1 to 1.42.0.
- [Release notes](https://github.com/golangci/golangci-lint/releases)
- [Changelog](https://github.com/golangci/golangci-lint/blob/master/CHANGELOG.md)
- [Commits](golangci/golangci-lint@v1.41.1...v1.42.0)

---
updated-dependencies:
- dependency-name: github.com/golangci/golangci-lint
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Auto-fix go.sum changes in dependent modules

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: MrAlias <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants