Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mysql: Add script processor to fix grok behaviour of ES 8.7.x+ #6531

Merged
merged 2 commits into from
Jun 28, 2023

Conversation

shmsr
Copy link
Member

@shmsr shmsr commented Jun 12, 2023

  • Bug

What does this PR do?

Changes deduplicate the list and only keep the unique elements. After this, if only 1 element remains, then it's added to the result as a single element (a string) and if not, the list is kept in the result with unique elements. This handling is done to manage the new behaviour of the grok processor in 8.7.x+. Read the related issue for more details.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Related issues

@shmsr shmsr requested a review from a team as a code owner June 12, 2023 05:25
@shmsr shmsr requested a review from ishleenk17 June 12, 2023 05:37
@elasticmachine
Copy link

elasticmachine commented Jun 12, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-06-19T15:32:32.584+0000

  • Duration: 26 min 57 sec

Test stats 🧪

Test Results
Failed 0
Passed 63
Skipped 0
Total 63

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine
Copy link

elasticmachine commented Jun 12, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (2/2) 💚
Files 100.0% (2/2) 💚 2.89
Classes 100.0% (2/2) 💚 2.89
Methods 92.857% (26/28) 👍 0.643
Lines 94.34% (150/159) 👍 2.079
Conditionals 100.0% (0/0) 💚

@shmsr shmsr self-assigned this Jun 17, 2023
@shmsr shmsr added the bug Something isn't working, use only for issues label Jun 18, 2023
@shmsr shmsr changed the title [MYSQL/IIS]: Add script processor to fix grok behaviour of ES 8.7.x+ [MYSQL]: Add script processor to fix grok behaviour of ES 8.7.x+ Jun 19, 2023
@shmsr
Copy link
Member Author

shmsr commented Jun 19, 2023

cc: @HiDAl I've seen you work on similar issues in the past. Could you please also take a look?

@shmsr shmsr requested a review from HiDAl June 19, 2023 09:00
@shmsr shmsr force-pushed the 6016-fix-grok-es-related-issue branch from 0b3b8e1 to 574dd97 Compare June 19, 2023 09:40
@elasticmachine
Copy link

💚 Build Succeeded

History

cc @shmsr

@shmsr shmsr changed the title [MYSQL]: Add script processor to fix grok behaviour of ES 8.7.x+ mysql: Add script processor to fix grok behaviour of ES 8.7.x+ Jun 19, 2023
@shmsr shmsr force-pushed the 6016-fix-grok-es-related-issue branch from 574dd97 to edff03b Compare June 19, 2023 12:00
@shmsr shmsr closed this Jun 19, 2023
@shmsr shmsr reopened this Jun 19, 2023
@HiDAl
Copy link

HiDAl commented Jun 20, 2023

Hey @shmsr, just to be clear: I haven't worked previously on fixing this kind of issue, I just changed the behavior of the Grok processor elastic/elasticsearch#92586.

Now, I don't know the whole business logic (Beats side, since I'm not part of the team), but looking at the code your code does not have a consistent output. In case thread_id is different at any point in the log lines, the processor will return a list of different thread_ids, which is not tested (I don't know whether is possible to have distinct thread_id per log block, though). With your code change, you've just made the current test pass. In case the thread_id might be different within a log block, I think this change is better: https://github.com/elastic/beats/pull/35276/files

@@ -32,6 +32,30 @@
"ignore_missing": true
}
},
{
"script": {
Copy link
Member

@andrewkroh andrewkroh Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a description and a tag to the script processors. This helps in debugging.

The description field documents what the script does which helps other developers and it is displayed in the Ingest Pipeline editor in Kibana. The tag field is included in the error message metadata and ingest node stats to help distinguish between script processors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure @andrewkroh!

@ishleenk17
Copy link
Contributor

@shmsr : You can add the tests for this change and update the expected json file.

@ishleenk17
Copy link
Contributor

Hey @shmsr, just to be clear: I haven't worked previously on fixing this kind of issue, I just changed the behavior of the Grok processor elastic/elasticsearch#92586.

Now, I don't know the whole business logic (Beats side, since I'm not part of the team), but looking at the code your code does not have a consistent output. In case thread_id is different at any point in the log lines, the processor will return a list of different thread_ids, which is not tested (I don't know whether is possible to have distinct thread_id per log block, though). With your code change, you've just made the current test pass. In case the thread_id might be different within a log block, I think this change is better: https://github.com/elastic/beats/pull/35276/files

In the PR shared in this comment, for any new entry of thread_id in the grok, a change is needed in the painless script.
I suppose the change made here should be fine. It returns list of distinct values when thread ids are different and a single value if thread id is same.

@agithomas agithomas mentioned this pull request Jun 26, 2023
19 tasks
@shmsr
Copy link
Member Author

shmsr commented Jun 27, 2023

@shmsr : You can add the tests for this change and update the expected json file.

@ishleenk17 I did add them but had to remove them because of unavailability of setting kibana version for specific tests. Here: 3a656c9

Copy link
Contributor

@ishleenk17 ishleenk17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@shmsr shmsr merged commit 588f75c into elastic:main Jun 28, 2023
1 check passed
@shmsr shmsr deleted the 6016-fix-grok-es-related-issue branch June 28, 2023 06:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working, use only for issues Integration:mysql MySQL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MYSQL/IIS log file issue in Integrations
5 participants