support simplifying regex to expression #982

Merged: 2 commits into Netflix:main on Aug 2, 2022

brharrington

Adds method to `PatternMatcher` that attempts to rewrite
the pattern to a set of simple pattern matches that can
be combined with AND, OR, and NOT to have the same matching
behavior as the original regex pattern. This can be useful
when working with data stores that have more restricted
pattern matching support such as RE2.
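
To illustrate the idea, here is a minimal sketch of a regex being replaced by simple patterns combined with AND, OR, and NOT. It is not the output of the new `PatternMatcher` method: the `Expr` type, the helper names, and the rewrite itself are hypothetical and written by hand. The original pattern uses a negative lookahead, which RE2 does not support, while each piece of the rewritten form is a plain anchored pattern.

```java
import java.util.regex.Pattern;

public class SimplePatternSketch {

  /** Hypothetical expression node; not Spectator's actual type. */
  public interface Expr {
    boolean matches(String s);
  }

  /** Leaf node: a simple pattern checked with java.util.regex. */
  public static Expr regex(String re) {
    Pattern p = Pattern.compile(re);
    return s -> p.matcher(s).find();
  }

  public static Expr and(Expr a, Expr b) {
    return s -> a.matches(s) && b.matches(s);
  }

  public static Expr or(Expr a, Expr b) {
    return s -> a.matches(s) || b.matches(s);
  }

  public static Expr not(Expr a) {
    return s -> !a.matches(s);
  }

  // Original pattern: ends with "bar" and does not start with "foo".
  // The negative lookahead is not supported by RE2.
  public static final Expr ORIGINAL = regex("^(?!foo).*bar$");

  // Hand-written equivalent built only from simple patterns (an assumption
  // for illustration, not what the library necessarily produces).
  public static final Expr REWRITTEN = and(not(regex("^foo")), regex("bar$"));

  public static void main(String[] args) {
    for (String s : new String[] {"bar", "xbar", "foobar", "barx", "foo"}) {
      System.out.println(s + ": " + ORIGINAL.matches(s) + " / " + REWRITTEN.matches(s));
    }
  }
}
```

For the sample inputs in `main`, both forms should agree on every string, which is the property the rewrite needs to preserve.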
brharrington added this to the 1.3.6 milestone on Aug 1, 2022

dmuino commented Aug 1, 2022

This is very useful! We will be able to use this rewrite to properly handle regex queries that need to be sent to ClickHouse.

manolama left a comment

LGTM.

}
}

private static class Re2Encoder implements PatternExpr.Encoder {

Would this eventually move out into a plugin for RE2-based stores?

brharrington (Contributor, Author)

The encoder implementation would be specific to a datastore and potentially how you are using it (column names, indexes, etc). I don't know that there is a need to have a common shared implementation here.
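
As a rough sketch of why an encoder tends to be datastore-specific, the example below turns a small AND/OR/NOT expression into a SQL-like filter string. Everything here is an assumption for illustration: the `Encoder` interface is a simplified stand-in and may not match the real `PatternExpr.Encoder`, the column name `name` is made up, and the `match(column, pattern)` call is only one way a store might expose RE2 matching.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class EncoderSketch {

  /** Hypothetical encoder interface; the real PatternExpr.Encoder may differ. */
  public interface Encoder {
    String regex(String pattern);
    String and(List<String> parts);
    String or(List<String> parts);
    String not(String part);
  }

  /** Hypothetical expression node that knows how to encode itself. */
  public interface Expr {
    String encode(Encoder e);
  }

  public static Expr regex(String p) {
    return e -> e.regex(p);
  }

  public static Expr and(Expr... xs) {
    return e -> e.and(encodeAll(xs, e));
  }

  public static Expr or(Expr... xs) {
    return e -> e.or(encodeAll(xs, e));
  }

  public static Expr not(Expr x) {
    return e -> e.not(x.encode(e));
  }

  private static List<String> encodeAll(Expr[] xs, Encoder e) {
    return Arrays.stream(xs).map(x -> x.encode(e)).collect(Collectors.toList());
  }

  /** Encoder bound to one column; the column name is supplied by the caller. */
  public static final class ColumnEncoder implements Encoder {
    private final String column;

    public ColumnEncoder(String column) {
      this.column = column;
    }

    @Override public String regex(String pattern) {
      // Naive quoting for illustration only; real code should use the
      // datastore's own escaping rules or bind parameters.
      return "match(" + column + ", '" + pattern.replace("'", "''") + "')";
    }

    @Override public String and(List<String> parts) {
      return "(" + String.join(" AND ", parts) + ")";
    }

    @Override public String or(List<String> parts) {
      return "(" + String.join(" OR ", parts) + ")";
    }

    @Override public String not(String part) {
      return "NOT " + part;
    }
  }

  public static void main(String[] args) {
    Expr expr = and(not(regex("^foo")), regex("bar$"));
    // prints: (NOT match(name, '^foo') AND match(name, 'bar$'))
    System.out.println(expr.encode(new ColumnEncoder("name")));
  }
}
```

The column name, quoting rules, and choice of matching function all live in the encoder, which is consistent with the point above that a shared implementation may not be worthwhile.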

brharrington merged commit edb23a4 into Netflix:main on Aug 2, 2022
brharrington deleted the patternexpr branch on August 2, 2022 at 14:17