Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Event mgmt #7813

Open
wants to merge 21 commits into
base: master
Choose a base branch
from
Open

Event mgmt #7813

wants to merge 21 commits into from

Conversation

spenugondaa
Copy link

@spenugondaa spenugondaa commented Jun 7, 2021

Why I did it

This PR contains code changes for providing Event and Alarm Framework as specified in the HLD

How I did it

Followed design specified in HLD.

  • A new container - eventd - is added to provide the framework. (buildimage/src/sonic-eventd/)
  • Applications need to link libeventnotify library to raise events or alarms. (buildimage/src/sonic-eventd/lib/)
  • API is defined at buildimage/src/sonic-eventd/lib/include/eventnotify.h
  • A new DB - EVENT_DB - to store event/alarm/event-stats/alarm-stats tables.
  • Default event profile is maintained at buildimage/src/sonic-eventd/var/evprofile and gets installed on the device at /etc/evprofile/default.json
  • Manifest file to control the size of event table is maintained at buildimage/src/sonic-eventd/etc/eventd.json and gets installed on the device at /etc/eventd.json

How to verify it

A python script is developed to raise events, raise/clear/ack/unack alarms. The script verifies various counters from EVENT_STATS and ALARM_STATS tables.

Ran tests/event_unit_test.py

root@sonic:/# python3 event_unittest.py

TEST: raise event
test_event success

TEST: raise alarm
test_raise_alarm success

TEST: raise and clear alarm
test_clear_alarm success

TEST: ack alarm
('42006',)
raise_id is 42006
test_ack_alarm success

TEST: unack alarm
('42009',)
raise_id is 42009
test_unack_alarm success
root@sonic:/#

Dependent PRs

sonic-mgmt-framework
sonic-mgmt-common
sonic-swss-common

spenugondaa added 13 commits June 3, 2021 01:15
Define libeventnotify for applications to use LOG_EVENT to raise events/alarms.
Eventd to receive and store them in the DB.

Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
…isord;

Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
spenugondaa added 7 commits June 8, 2021 01:14
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
…tency.

Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
…gnore it.

throttle-timeout support to specify time duration to keep throttling the event from an applicaiton.

Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Signed-off-by: spenugondaa <srinadh.penugondaa@dell.com>
Copy link
Contributor

@rajendra-dendukuri rajendra-dendukuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Please consider removing blank lines at at the end of newly added files
  • At what point are Event tables saved from redisdb to the corresponding dump.rdb file? If so is there a usecase where a thermal shutdown happens and all the event history is lost. I agree that syslogs will have information of all the event notifications.

@@ -11,6 +11,12 @@
"port": 6380,
"unix_socket_path": "/var/run/redis-chassis/redis_chassis.sock",
"persistence_for_warm_boot" : "yes"
},
"redis1":{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for having a dedicated redis instance for event mgmt is because they are supposed to persist across reboots. If that is the intention, can we name this instance as redisp instead of redis1. Other applications can also use it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During the HLD review, we were asked to keep event/alarm tables in separate instance so that they wont grow too big that impacts other tables.

Description=Eventd container
Requires=database.service
After=database.service
Before=ntp-config.service
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for the before ntp-config service dependency?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At that time, I thought there might be a scenario ntpd may want to raise an event/alarm.


* Initial release.

-- Eventd <support@dell.com> Thu, 9 Apr 2020 12:00:00 -0800
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you confirm the date? 2020 or 2021?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change


* Initial release.

-- Eventd <support@dell.com> Thu, 9 Apr 2020 12:00:00 -0800
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check date

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will change

@@ -0,0 +1,2 @@
#! /usr/bin/dh-exec
/usr/lib/x86_64-linux-gnu/libeventnotify.so.0 /usr/lib/x86_64-linux-gnu/libeventnotify.so
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about arm64 platforms?

EventInfo tmp = (EventInfo) (it->second);
// discard the event as event_static_map shows enable is false for this event
if (tmp.enable == EVENT_ENABLE_FALSE_STR) {
SWSS_LOG_NOTICE("Discarding event <%s> as it is set to disabled", ev_id.c_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be SWSS_LOG_INFO? These are disabled events, may not be of great interest.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

m_alarmTable.set(ev_src, ack_vec);
m_eventTable.set(ev_src, ack_vec);
} else {
SWSS_LOG_ERROR(" %s/%s is already un-acknowledged", ev_id.c_str(), ev_src.c_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be NOTICE instead of ERROR. There is no error in eventd that has happened. It is an invalid input.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

m_alarmTable.set(ev_src, ack_vec);
m_eventTable.set(ev_src, ack_vec);
} else {
SWSS_LOG_ERROR(" %s/%s is not in raised state", ev_id.c_str(), ev_src.c_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this NOTICE

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -0,0 +1,2 @@
/etc/eventd.json
/var/evprofile/default.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this to /var/lib/evprofile/default.json

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I know the reason?
The default.json is placed under sonic-eventd/var/evprofile/default.json so, the .install is using that path.
Should we rather put default.json in sonic-eventd/var/lib/evprofile/ ?

But why?


sonic-db-cli EVENT_DB config set notify-keyspace-events AKE

cp /var/evprofile/default.json /etc/evprofile/default.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cp /var/lib/evprofile/default.json /etc/evprofile/default.json

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous comment.

Signed-off-by: Srinadh Penugonda <srinadh.penugondaa@dell.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants