Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus alerts #40

Merged
merged 1 commit into from
Oct 16, 2017
Merged

Prometheus alerts #40

merged 1 commit into from
Oct 16, 2017

Conversation

moolitayer
Copy link

@moolitayer moolitayer commented Jun 14, 2017

Issue: ManageIQ/manageiq#14238
See issue for setup

@moolitayer moolitayer changed the title Prometheus alerts [WIP] Prometheus alerts Jun 14, 2017
@moolitayer moolitayer mentioned this pull request Jun 14, 2017
9 tasks
@miq-bot miq-bot added the wip label Jun 14, 2017
@moolitayer moolitayer force-pushed the prometheus_alerts branch 2 times, most recently from e12b8c4 to 497cedf Compare June 15, 2017 14:37
@moolitayer moolitayer changed the title [WIP] Prometheus alerts Prometheus alerts Jul 3, 2017
@miq-bot miq-bot removed the wip label Jul 3, 2017
@moolitayer moolitayer force-pushed the prometheus_alerts branch 5 times, most recently from c12f52f to 566740d Compare July 6, 2017 10:40
@miq-bot
Copy link
Member

miq-bot commented Jul 24, 2017

This pull request is not mergeable. Please rebase and repush.

@moolitayer
Copy link
Author

@cben @zgalor please review

@moolitayer
Copy link
Author

@enoodle please review

Copy link
Contributor

@zgalor zgalor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor naming and privacy nits

target = find_target(labels)
{
:ems_id => @cfg[:ems_id],
:source => 'DATAWAREHOUSE',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be MONITORING ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately that would not work as DATAWAREHOUSE is our current hook into the event system. I have a PR renaming these two fields to MONITORING/monitoring_alert. I update here one that has been resolved

:ems_id => @cfg[:ems_id],
:source => 'DATAWAREHOUSE',
:timestamp => timestamp,
:event_type => 'datawarehouse_alert',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment from 2 lines above

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see previous response

FactoryGirl.create(
:endpoint,
:role => 'prometheus_alerts',
:hostname => 'alerts-prometheus.10.35.48.141.nip.io',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe replace (here and in recordings) to fake hostname and token

FactoryGirl.create(
:authentication,
:authtype => 'prometheus_alerts',
:auth_key => 'eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJtYW5hZ2VtZW50LWluZnJhIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6Im1hbmFnZW1lbnQtYWRtaW4tdG9rZW4tZzRscDciLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoibWFuYWdlbWVudC1hZG1pbiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImQxOWUzMDk4LTZmOGUtMTFlNy04YTQ1LTAwMWE0YTE2MjY4YyIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDptYW5hZ2VtZW50LWluZnJhOm1hbmFnZW1lbnQtYWRtaW4ifQ.m4ckxIoTj__rCFA25V8lurlapNgL6doCpP51Z0JA5VgiU9mRxdk8KGyCe6w_mhrh5xPO9UuiK_pYIIMPQ-lAijRaQrKITZ-smRMEcsqKU0uP10UliJIj7ICGxn7c2zDYQuxMOQwI4ELWPv4xBHVXkF0hBKhVCPWRmdnIwADIfQi4RqoN_1gQf8S2O8C_eDS-A1f5SUf1Bvhexs997NWFBVyKEUQKxdqAlHf59V3Zt2VaE2HUme_ppYvp5L9QSa-CEZU_XxmUwNeCzcgbzts8OMujuHB_8lwKcLy9H0k3_oh-EcOn4CvzVz6pTfh8HPDoFhqpnBHfLB4pnsRkoHbDrA',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see previous comment

FactoryGirl.create(
:endpoint,
:role => 'prometheus_alerts',
:hostname => 'alerts-prometheus.10.35.48.141.nip.io',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe replace (here and in recordings) to fake hostname and token

@@ -14,7 +14,7 @@
let(:prometheus_authentication) do
FactoryGirl.create(
:authentication,
:authtype => :prometheus_alerts,
:authtype => 'prometheus_alerts',
:auth_key => 'eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJtYW5hZ2VtZW50LWluZnJhIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6Im1hbmFnZW1lbnQtYWRtaW4tdG9rZW4tMnBzZjMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoibWFuYWdlbWVudC1hZG1pbiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjJiMjg1MmIyLTZjNzAtMTFlNy1hYTlmLTAwMWE0YTE2MjYxOSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDptYW5hZ2VtZW50LWluZnJhOm1hbmFnZW1lbnQtYWRtaW4ifQ.OSh7pgdRXAIUh8hPipfj_me-T3cwI_DsqQUSV1yYo1qvGEd1Aa-oVaBAeKsbwjCDhcNw2nNTWkdeYoy4-MyoTjleZbvdsbtSgs84LCPdLfaVd7NXBiXtNj6o4a2tbcE_GUTBiOEyZfzFh5pwM1n9BbU6qwAvf5B-uiAuGI1VZXyiUWtlspNAEPOy3awFNt9vausNahfMRox9MLB62BYzj2NV36inpNY_UWMNV0X0Q1VnZWO6-v28JAkWhuqaRHgSOUKV1FKJDY6R4rCGxt5BnVS6_81au80vouZmv0oR6kvDPWZo6IVPG8JCIpxK0liJW65pxKIBf7oOkEzhi3wy9w',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe replace (here and in recordings) to fake token

@moolitayer
Copy link
Author

@ilackarms please review
(BTW since we are blocked on watch this is still using pulling)

@moolitayer
Copy link
Author

moolitayer commented Aug 2, 2017

code climate asks to use safe navigation operator (&.) available since ruby 2.3.
According to travis.yml here and in the main repo, we do not support 2.2 (which is in security maintenance phase) so that should be fine. I do not see any current usages of safe navigation although I would like to start using it.
cc @jrafanie @cben

@moolitayer
Copy link
Author

@cben @zgalor @enoodle @simon3z needs your review please

event = event.dup

annotations, labels = event["annotations"], event["labels"]
severity, event[:url] = annotations["severity"], annotations["url"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO these are paired randomly and could read better as separate assignments.

  • event[:url] write is hiding here in middle of line.
  • labels and its uses far below could be closer together.


event[:severity] = parse_severity(severity)
event[:resolved] = event["status"] == 'resolved'
event[:ems_ref] = incident_identifier(event["startsAt"], labels)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

event will contain a mixture of string and symbol keys. intentional?
all the symbol keys are added only so that they appear in full_data, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all the symbol keys are added only so that they appear in full_data, right?

Right

event will contain a mixture of string and symbol keys. intentional?

to be taken into account these should be symbols[1], if we care about all the keys having the same type we can call deep_symbolize_keys (or have the client return symbol keyed objects). WDYT @cben?

[1] https://github.com/moolitayer/manageiq/blob/408cc00831ae9fc669a79f6be002f8e6effc264f/app/models/ems_event.rb#L177-L183

def incident_identifier(incident_start, labels)
# When event b resolves event a, they both have the same startAt.
# Labels are added to avoid having two incidents starting at the same time.
"#{incident_start}_#{labels.hash}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.hash is the ruby method returning an integer, right?
It's not stable between ruby runs (randomized seed against DOS)! And obviously not between ruby versions.
You're persisting it, which sounds bad idea.

Can you use some canonic json / yaml serialization (or its SHA if length is a concern)?

@jrafanie
Copy link
Member

jrafanie commented Aug 2, 2017

code climate asks to use safe navigation operator (&.) available since ruby 2.3.
According to travis.yml here and in the main repo, we do not support 2.2 (which is in security maintenance phase) so that should be fine. I do not see any current usages of safe navigation although I would like to start using it.
cc @jrafanie @cben

@moolitayer we can't yet because we haven't officially dropped 2.2.

I haven't gotten getting this ruby 2.4 issue done to completion. Once that is complete, we'd drop support for 2.2 and only support 2.3 and 2.4. If you have feedback on which gems are complete from that list, or ones that are missing, I could use help making sure our gem ecosystem works on 2.4.

@ilackarms
Copy link

tested working based on 4.6 deployment efforts

@miq-bot
Copy link
Member

miq-bot commented Aug 30, 2017

This pull request is not mergeable. Please rebase and repush.

@simon3z
Copy link
Contributor

simon3z commented Aug 31, 2017

tested working based on 4.6 deployment efforts

@ilackarms IIUC this still requires some manual activation?

@ilackarms
Copy link

@simon3z yes, need to manually create alert policy in Control dashboard

@moolitayer moolitayer force-pushed the prometheus_alerts branch 3 times, most recently from 8abe758 to 5b90a58 Compare September 4, 2017 10:43
@miq-bot
Copy link
Member

miq-bot commented Sep 5, 2017

This pull request is not mergeable. Please rebase and repush.

@miq-bot
Copy link
Member

miq-bot commented Sep 27, 2017

Checked commit moolitayer@4cd2b3e with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
11 files checked, 0 offenses detected
Everything looks fine. 🏆

event[:resolved] = event["status"] == "resolved"
timestamp = event["timestamp"]

target = find_target(labels)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@moolitayer all the target references below (e.g. target.id, etc.) are crashing when target is nil.
cc @ilackarms @joelddiaz

end

def stop_event_monitor
@event_monitor_handle.stop unless @event_monitor_handle.nil?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@moolitayer there's a codeclimate comment to remember here for future fixes/refactorings:

Use safe navigation (&.) instead of checking if an object exists before calling the method. 

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See this comment: #40 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@moolitayer why can't you use try meanwhile?

@simon3z simon3z merged commit 446495e into ManageIQ:master Oct 16, 2017
@agrare agrare added this to the Sprint 71 Ending Oct 16, 2017 milestone Oct 16, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants