Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/k8sattributesprocessor] support regex capture groups in tag_name #9525

Merged
merged 17 commits into from
May 4, 2022

Conversation

newly12
Copy link
Contributor

@newly12 newly12 commented Apr 26, 2022

Description:
support configure key prefix in k8sattributesprocessor

Link to tracking Issue:
#8844

Testing:

Documentation:

@newly12 newly12 requested a review from a team April 26, 2022 10:04
@newly12 newly12 changed the title [k8sattrprocessor] support regex capture groups in tag_name [processor/k8sattributesprocessor] support regex capture groups in tag_name Apr 26, 2022
@newly12
Copy link
Contributor Author

newly12 commented Apr 27, 2022

@pmm-sumo @dmitryax could you have a look at this PR?

Copy link
Member

@dmitryax dmitryax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also please add documentation to the new capabilities

processor/k8sattributesprocessor/options.go Outdated Show resolved Hide resolved
if r.KeyRegex != nil {
for k, v := range metadata {
var name string
if r.HasKeyRegexReference {
Copy link
Member

@dmitryax dmitryax Apr 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have formatter as a string with regexp caption e.g. k8s.namespace.annotations.$0, we can drop this condition along with the else block, right?

Copy link
Contributor Author

@newly12 newly12 Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to fully leverage $0 it has certain requirements for the regexp, which might lead to some breaking changes, so far basically the difference I found is how it behaviors between * and .*, e.g. https://go.dev/play/p/yUbTV_SKWCj

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't understand your point. Can you elaborate please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$0 doesn't always represent the entire key, i.e. in https://go.dev/play/p/yUbTV_SKWCj, given the key is annotation1, name is annotations.$0,

  • given the key_regex is an*, $0 would be ann and a(both matches the regexp), after 2 iterations(the for loop in the sample code), we got the result annotations.annannotations.a (annotations.ann + annotations.a).
  • given the key_regex is an.*, $0 would be annotation1 as the matched string is the entire key, thus just one iteration and we got the expected result annotations.annotation1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an incorrect expansion logic. the result should not be composition of several matches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wrapping a regexp supplied by user with ^ and $should do the trick

would it be tricky to wrap it with ^ and $, given that user may have ^ or $ in their regexp, ^ and $ probably has to be appended during validation phase, where we need to handle cases like

  • ^an.*
  • ^an.*$
  • (^an.*)
  • (^an.*$)

Copy link
Member

@dmitryax dmitryax May 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be tricky to wrap it with ^ and $, given that user may have ^ or $ in their regexp, ^ and $ probably has to be appended during validation phase, where we need to handle cases like

Yes, this would be the right approach. But let's not do it for now and keep this change as clean enhancement without breaking behavior

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmitryax yeah the reason I noticed that an* is while making this change and noticed failures from current UT, which is expecting k8s.namespace.annotations.annotation1.

{
name: "all-annotations",
rules: ExtractionRules{
Annotations: []FieldExtractionRule{{
KeyRegex: regexp.MustCompile("an*"),
From: MetadataFromNamespace,
},
},
},
attributes: map[string]string{
"k8s.namespace.annotations.annotation1": "av1",
},
},
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I still believe that we should do strict match and make an* do not match annotation1, only annnnnn. @pmm-sumo if you agree on this behavior, I believe we can introduce this breaking change as another PR.

Yeah, I have the same sentiment @dmitryax @newly12

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a ticket #9716. @newly12 feel free to pick it up if you are interested

@newly12
Copy link
Contributor Author

newly12 commented Apr 29, 2022

@dmitryax updated doc.

@dmitryax
Copy link
Member

Please address other comments

Copy link
Member

@dmitryax dmitryax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bogdandrutu bogdandrutu merged commit e7fb865 into open-telemetry:main May 4, 2022
@newly12 newly12 deleted the k8s_attr_tagname_regex branch May 5, 2022 02:02
djaglowski pushed a commit to djaglowski/opentelemetry-collector-contrib that referenced this pull request May 10, 2022
…g_name (open-telemetry#9525)

* refactor

* support captured group in TagName

* update CHANGELOG

* add UT for entire matched string

* fix lint error

* add doc for capturing groups

* remove capturing group validation

* Update CHANGELOG.md

* update doc

* update substitution logical

* remove extractField method from WatchClient

* move changelog entry to current release

Co-authored-by: Alex Boten <aboten@lightstep.com>
Co-authored-by: Dmitrii Anoshin <anoshindx@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready to merge Code review completed; ready to merge by maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants