Fix grok processors #35276
This reverts commit f5ace09.
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
Elasticsearch changed the behavior of the grok processor (PR elastic/elasticsearch#92586): when a field matches more than once, the processor now returns a list of values. This change was introduced in 8.7. Some of the Filebeat pipelines were built on the assumption that only the first match would be returned.

In the case of IIS, `source.address` is repeated in the second part of the pattern, where only the first match was ever used, so it is safe to remove the second `source.address`.

In the case of MySQL, according to their documentation:
- The schema is always defined at the beginning of the log, right after the "Thread_id:", so it is safe to remove the second `mysql.schema`.
- `thread_id` is trickier because it can be matched from different places. The potential matches are stored in 3 temporary fields, and a new `script` processor then selects the correct temporary field and removes it.
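As a rough illustration of the two ideas above, the following Python sketch emulates (outside Elasticsearch) what "keep only the first match" means and how a script step could collapse several temporary fields into one. The temporary field names, the priority order among them, and the flat dotted-key document are assumptions made for the example; the actual pipeline uses a Painless `script` processor whose details are not shown here.

```python
from typing import Any, Dict, Optional

# Hypothetical temporary field names; the real pipeline may use different ones.
TEMP_FIELDS = ("_tmp.thread_id_1", "_tmp.thread_id_2", "_tmp.thread_id_3")


def first_match(value: Any) -> Optional[Any]:
    """Emulate the pre-8.7 result: keep only the first value of a list."""
    if isinstance(value, list):
        return value[0] if value else None
    return value


def resolve_thread_id(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Copy the first populated temporary field into mysql.thread_id,
    then drop every temporary field.

    Assumption: earlier entries in TEMP_FIELDS take priority.
    """
    out = dict(doc)  # work on a copy so the input is left untouched
    for name in TEMP_FIELDS:
        value = out.get(name)
        if value is not None and "mysql.thread_id" not in out:
            out["mysql.thread_id"] = int(value)
    for name in TEMP_FIELDS:
        out.pop(name, None)
    return out


if __name__ == "__main__":
    print(first_match(["10.0.0.1", "10.0.0.1"]))
    print(resolve_thread_id({"message": "slow query", "_tmp.thread_id_2": "42"}))
```

Capturing into distinct temporary fields avoids relying on how grok merges duplicate captures, so the pipeline behaves the same before and after 8.7.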
💔 Tests Failed
This pull request is now in conflicts. Could you fix it? 🙏