Report remote output health #3116

Closed
juliaElastic opened this issue Nov 21, 2023 · 4 comments · Fixed by #3127
Labels: QA:Validated (Validated by the QA Team), Team:Fleet (Label for the Fleet team)

Comments

@juliaElastic
Contributor

juliaElastic commented Nov 21, 2023

Related to elastic/kibana#104986
Follow up after #3051 (comment)

In the original plan for remote ES output, we intended to mark Fleet Server as degraded if a remote ES output is not accessible. However, this has too large an impact, so we decided to do output health reporting separately.

Implementation tasks:

  • Report periodic health updates per remote output to logs-fleet_server.output_health-default data stream or metrics-*
    • Check whether the fleet-server service account needs privileges in ES to write to these data streams, and kibana_system to read them (the metrics privilege already exists for kibana_system)
    • Add ILM/DLM
  • From Fleet UI, read this data stream and show error status on Output UI
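
A minimal sketch of the first task, written in Python for brevity (Fleet Server itself is written in Go). Only the data stream name comes from the task list above. The field names `output`, `state`, and `message`, the state values, and the helpers `build_health_doc` and `report_health` are assumptions for illustration, since this issue does not define the document schema.

```python
from datetime import datetime, timezone

# Data stream named in the task list above.
OUTPUT_HEALTH_DATA_STREAM = "logs-fleet_server.output_health-default"


def build_health_doc(output_id, error=None, now=None):
    """Build one health document for a remote output.

    All field names and state values are hypothetical.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "@timestamp": now.isoformat(),
        "output": output_id,
        "state": "DEGRADED" if error is not None else "HEALTHY",
        "message": str(error) if error is not None else "",
    }


def report_health(outputs, check, index_doc, now=None):
    """Run one reporting pass over all remote outputs.

    `check(output_id)` is expected to raise when the output is unreachable.
    `index_doc(stream, doc)` is whatever writes to Elasticsearch.
    Both are injected so the pass can be scheduled periodically and
    exercised without a cluster.
    """
    for output_id in outputs:
        try:
            check(output_id)
            error = None
        except Exception as exc:  # any failure marks the output as degraded
            error = exc
        doc = build_health_doc(output_id, error, now)
        index_doc(OUTPUT_HEALTH_DATA_STREAM, doc)
```

Calling `report_health` on a timer would give the periodic, per-output updates described in the task; the interval is left open here because the issue does not state one.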
@juliaElastic juliaElastic changed the title from "Report remote output healthy" to "Report remote output health" Nov 21, 2023
@jlind23 jlind23 added the Team:Fleet label Nov 21, 2023
@juliaElastic juliaElastic self-assigned this Nov 29, 2023
@juliaElastic
Contributor Author

I picked this up as I think this is required to roll out remote ES: #3051 (comment)

@juliaElastic
Contributor Author

juliaElastic commented Nov 29, 2023

Raised a few PRs. I have the base functionality working, with some notes/suggestions here, as the design doc was not very detailed regarding the UX.
I'll continue to add tests; feel free to comment.

@juliaElastic
Contributor Author

@amolnater-qasource This enhancement is related to Remote ES output and adds health reporting for remote outputs.
See screenshots and verification steps here: elastic/kibana#172181

juliaElastic added a commit to elastic/kibana that referenced this issue Dec 5, 2023
## Summary

Relates elastic/fleet-server#3116

Relates #104986

Reading latest output health state from
`logs-fleet_server.output_health-default` data stream by output id, and
displaying error state on UI - Edit Output flyout.
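
A rough sketch of how "latest state by output id" could be expressed as an Elasticsearch search body, shown as Python dicts that map directly to the JSON request. The field names `output`, `state`, and `message` are assumptions, as are the helper names; only the data stream name is taken from the summary above.

```python
OUTPUT_HEALTH_DATA_STREAM = "logs-fleet_server.output_health-default"


def latest_health_query(output_id):
    """Search body that returns only the newest health document for one output."""
    return {
        "size": 1,
        # Assumes the output id is stored in a keyword field named "output".
        "query": {"bool": {"filter": [{"term": {"output": output_id}}]}},
        "sort": [{"@timestamp": {"order": "desc"}}],
    }


def parse_latest_health(search_response):
    """Pull state, message and timestamp out of a search response.

    Returns None when no health document has been reported yet.
    """
    hits = search_response.get("hits", {}).get("hits", [])
    if not hits:
        return None
    source = hits[0]["_source"]
    return {
        "state": source.get("state"),
        "message": source.get("message", ""),
        "timestamp": source.get("@timestamp"),
    }
```

Returning `None` for an empty result keeps "no data yet" distinct from a healthy or degraded state, which matters because health is only reported after an agent is enrolled.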

Steps to verify:
- enable feature flag `remoteESOutput`
- add `remote_elasticsearch` output, can be a non-existent host for this
test
- add the output as monitoring output of an agent policy
- run fleet-server with the changes
[here](elastic/fleet-server#3116)
- enroll an agent
- wait until fleet-server starts reporting degraded state in the output
health data stream
- open edit output flyout on UI and verify that the error state is
visible
- when the connection is restored (the host is updated to a valid one, or
the remote ES was only temporarily down), the error state goes away

<img width="568" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/46d0cf95-6aa4-4f7c-8608-4362ada4eb6c">

The UI was suggested in the design doc:
https://docs.google.com/document/d/19D0bX7oURf0yms4qemfqDyisw_IYB-OVw4oU-t4lf18/edit#bookmark=id.595r8l91kaq8

### Notes/suggestions:

- We might want to add the output state to the output list as well
(maybe as badges like agent health?) as it's not too visible in the
flyout (have to scroll down).
- Also, the error state is first reported only when an agent is
enrolled and fleet-server can't create an API key, so not immediately when
the output is added. It would be good to show the time of the last state
(e.g. the way we display the last check-in on agents as "x minutes ago")
- I think it would be beneficial to display the healthy state too.
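
The "last state x minutes ago" suggestion above can be sketched as a small formatting helper. This is only an illustration in Python, not the Kibana UI code: the function name `last_reported_label` and the exact wording of the labels are assumptions, and the timestamp is assumed to be ISO 8601 with an explicit offset such as `+00:00`.

```python
from datetime import datetime, timezone


def last_reported_label(timestamp_iso, now=None):
    """Turn a health document timestamp into a relative label."""
    now = now or datetime.now(timezone.utc)
    reported = datetime.fromisoformat(timestamp_iso)
    if reported.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        reported = reported.replace(tzinfo=timezone.utc)
    # Clamp to zero so small clock skew never yields a negative age.
    minutes = max(0, int((now - reported).total_seconds() // 60))
    if minutes < 1:
        return "less than a minute ago"
    if minutes == 1:
        return "1 minute ago"
    if minutes < 60:
        return f"{minutes} minutes ago"
    hours = minutes // 60
    return "1 hour ago" if hours == 1 else f"{hours} hours ago"
```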

Added badges to output list:
<img width="1233" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/07ff06ec-b778-4420-975b-b46a0a18c7cc">

Added healthy state UI to Edit output:
<img width="627" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/4222d849-c957-41d7-9606-b58493264115">


### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
@amolnater-qasource
Collaborator

Hi @juliaElastic

Thank you for the update.

We have revalidated this issue on the latest 8.12.0 SNAPSHOT Kibana cloud environment and found it working fine.

Observations:

  • Unhealthy badge is visible with invalid Remote Elasticsearch output.
  • Appropriate error message is displayed while editing invalid Remote Elasticsearch output.
  • Healthy badge is visible with valid Remote Elasticsearch output.
  • Healthy message is also displayed while editing valid Remote Elasticsearch output.

Screenshots: [four images]

Build details:
VERSION: 8.12.0 SNAPSHOT
BUILD: 69787
COMMIT: ff04f68ab0e6d82b91b30dea13810225efc6c606

Hence we are marking this as QA:Validated. We will also create test cases for these changes under elastic/kibana#104986.

Thanks!

@amolnater-qasource amolnater-qasource added the QA:Validated label and removed the QA:Needs Validation label Dec 8, 2023

3 participants