MYSQL/IIS log file issue in Integrations #6016
Comments
Thanks @ishleenk17 for creating this follow up issue.
From my understanding the issue is still present in Beats: we have only fixed the tests, but values are still duplicated. Some fields in the IIS module are also gone, which is more concerning (see
Yes, I think that unexpectedly duplicated values need to be deduplicated. Something like what has been done here. Hopefully this will also fix the issue with the missing fields in IIS. Ideally, these changes should be applied to the Filebeat modules too.
Do you mean to revert elastic/elasticsearch#92586? Maybe this can be discussed, but it looks like this change was desired, to align the Logstash and Elasticsearch implementations of the grok processor. Also, the change has already been released, so I am afraid that we have to live with it. ccing @ruflin in case he thinks something can be done along these lines.
It shouldn't be needed; the change in the pipelines should work both when values are duplicated and when they are not.
@ishleenk17 I'm wondering about your take on this, putting aside the potential breaking changes and issues: Is the Elasticsearch change an improvement over the previous behaviour? Does Elasticsearch have any functionality to simplify the removal of duplicates?
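To illustrate the point about a change that works in both cases, here is a minimal Python sketch of the normalization logic. It is not the actual ingest pipeline change. The function name `dedupe_field` is hypothetical, and treating `16` and `'16'` as the same capture is an assumption based on the `[16, '16']` value reported in this issue.

```python
from typing import Any


def dedupe_field(value: Any) -> Any:
    """Collapse duplicated captures; leave non-list values untouched."""
    if not isinstance(value, list):
        # Pre-8.7.0 shape: a single scalar, nothing to do.
        return value
    seen = set()
    unique = []
    for item in value:
        # Assumption: compare by string form so 16 and '16' count as duplicates.
        key = str(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    # A list with one distinct value goes back to a scalar.
    return unique[0] if len(unique) == 1 else unique
```

Applied to a value that is already a plain string, the function returns it unchanged, so the same step is safe on both older and newer Elasticsearch versions.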
@ruflin : Yes, I think it is an improvement as it addresses the issue of storing all values in case there are multiple values. But that has also led to duplication of values, which doesn't look to be right.
I suppose we can add a check in ES so that a value that already exists in the array is not added again.
That's right, we should remove the duplicated field values, and I suppose we should handle it in ES rather than in integrations. Details here. That will automatically cover both Beats and Integrations.
Can you take this up with the Elasticsearch team?
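The check suggested here could look roughly like the following Python sketch. It only models the idea and is not Elasticsearch's grok processor code. The helper `add_capture` and the flat `doc` dictionary are hypothetical.

```python
from typing import Any, Dict


def add_capture(doc: Dict[str, Any], field: str, value: Any) -> None:
    """Store a capture, skipping values the field already holds."""
    if field not in doc:
        # First capture for this field: keep it as a scalar.
        doc[field] = value
        return
    existing = doc[field]
    if isinstance(existing, list):
        if value not in existing:
            existing.append(value)
    elif existing != value:
        # A second, different value turns the field into a list.
        doc[field] = [existing, value]
```

Note that a strict equality check like this would still keep both entries for a pair such as `16` and `'16'`, because the two values differ in type.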
Some recent failures with IIS/MYSQL have gone unnoticed after 8.7.0 (since CI/CD is broken).
These issues came up in Beats, and the same can be observed in Integrations.
The issue appears because ES changed its grok processing (8.7.0 onwards), which now generates output that differs from the expected JSON files.
Fields that used to hold a single string value now hold a list with duplicated values.
For MYSQL:
Maria DB Log
Issue: {"root['mysql.slowlog.schema']": {'old_type': <class 'str'>, 'new_type': <class 'list'>, 'old_value': 'employees-test', 'new_value': ['employees-test', 'employees-test']}}}
MYSQL Ubuntu Logs
Issue: {"root['mysql.thread_id']": {'old_type': <class 'str'>, 'new_type': <class 'list'>, 'old_value': '16', 'new_value': [16, '16']}}}
Percona Ubuntu Logs
Issue: {"root['mysql.slowlog.schema']": {'old_type': <class 'str'>, 'new_type': <class 'list'>, 'old_value': 'employees', 'new_value': ['employees', 'employees']}}}
For IIS:
IIS Logs 7.5
Issue: {"root['destination.address']": {'old_type': <class 'str'>, 'new_type': <class 'list'>, 'old_value': '10.100.220.70', 'new_value': ['10.100.220.70', '10.100.220.70']}
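A comparison like the ones above can be approximated with a few lines of plain Python. This sketch is not the test harness that produced these reports. The function `find_str_to_list_changes` is hypothetical, and it assumes both documents are flat dictionaries keyed by field name.

```python
from typing import Any, Dict


def find_str_to_list_changes(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Report fields that were a string before and are a list now."""
    changes = {}
    for field, old_value in old.items():
        new_value = new.get(field)
        if isinstance(old_value, str) and isinstance(new_value, list):
            changes[field] = {"old_value": old_value, "new_value": new_value}
    return changes
```

Running it over an expected file and a freshly generated one would list exactly the fields affected by this kind of change, for example `destination.address` in the IIS case.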
The issue has been fixed in beats.
But since integrations don't follow a release cycle, we need to figure out a way to fix this in integrations.
The issue raised in beats: elastic/beats#35133