[voq][chassis][dhcp_relay] swss.sh try to start the dhcp_relay service although it is masked #18829
@kellyyeh please review this at the earliest, thanks.
would like to keep an eye on this PR.
we have seen similar issue where sup has bgp disabled but not masked, and swss startup also starts bgp which caused an issue. there is #15734 to fix that.
this PR LGTM as a more general fix.
LGTM, the only case is if the service is delayed and hostcfg sets the FEATURE state to enable/disable later -- which seems not the case with dhcp_relay service.
This change is generally in the right direction. I am surprised that it only affected dhcp relay. @qiluo-msft, @saiarcot895 do you have other concerns for the change?
we have seen similar issue where sup has bgp disabled but not masked, and swss startup also starts bgp which caused an issue. there was #15734 to fix that.
For swss, there's also the |
Force-pushed from 2c21ab5 to cd9ddcb.
This particular issue only affects the "DEPENDENT" variable. docker-wait-any uses the "MULTI_INST_DEPENDENT" variable, so no change is needed for docker-wait-any.
LGTM
Force-pushed from cd9ddcb to 5f12fdf.
@prabhataravind would you be able to review/approve, so that this PR could be merged? thanks.
…e althoug it is masked Signed-off-by: mlok <marty.lok@nokia.com>
Force-pushed from 5f12fdf to 91b0986.
The latest change looks good to me.
@rlhui Reviewed and approved the latest diff.
/bin/systemctl start ${dep}
if [[ $dep == "dhcp_relay" ]]; then
    state=$(is_feature_enabled $dep)
    if [[ $state == "true" ]]; then
After starting the service, should we enable the feature flag for this service?
Hi @yxieca are we good to merge this PR?
…e although it is masked (sonic-net#18829) * [voq][chassis][dhcp_relay] swss.sh try to start the dhcp_relay service althoug it is masked Signed-off-by: mlok <marty.lok@nokia.com>
Cherry-pick PR to 202405: #20182
Why I did it
On the master branch, dhcp_relay is not supported on VOQ chassis and is disabled in the FEATURE table. However, because of the service dependency, swss.sh always calls "systemctl start" on it even though its service file has been masked/disabled. The following error is logged in syslog, which causes loganalyzer to fail on some of the OC tests. Fixes issue #18822.
How I did it
Added code to swss.sh to check whether a service is disabled. If it is disabled, the service is not started even though it is in the DEPENDENT list. This keeps the ERROR log from showing up in the syslog file.
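The idea can be sketched in a few lines of shell. This is only an illustration, not the merged change: `get_feature_state` is a hypothetical stand-in for a FEATURE table lookup, the `radv` entry and the `enabled`/`always_enabled` state names are assumptions, and `SYSTEMCTL` defaults to a dry-run `echo` so the sketch never touches real services. Only `DEPENDENT` and `is_feature_enabled` are names taken from the discussion above.

```shell
#!/bin/sh
# Sketch only: start dependent services, skipping any whose feature is disabled.

# Dependent services (illustrative contents).
DEPENDENT="radv dhcp_relay"

# Dry-run by default; on a real system this would be /bin/systemctl.
SYSTEMCTL="${SYSTEMCTL:-echo systemctl}"

# Hypothetical stand-in for reading the feature state from the FEATURE table.
get_feature_state() {
    case "$1" in
        radv)       echo "enabled" ;;
        dhcp_relay) echo "disabled" ;;
        *)          echo "disabled" ;;
    esac
}

# Prints "true" when the feature is enabled, "false" otherwise.
is_feature_enabled() {
    case "$(get_feature_state "$1")" in
        enabled|always_enabled) echo "true" ;;
        *)                      echo "false" ;;
    esac
}

# Start each dependent service only if its feature is enabled.
start_dependent_services() {
    for dep in ${DEPENDENT}; do
        if [ "$(is_feature_enabled "$dep")" = "true" ]; then
            ${SYSTEMCTL} start "${dep}"
        else
            echo "skipping ${dep}: feature is disabled"
        fi
    done
}

start_dependent_services
```

With the defaults above, the dry run reports a start for `radv` and a skip for `dhcp_relay`, so no "systemctl start" is ever issued against the disabled service.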
How to verify it
Which release branch to backport (provide reason below if selected)
This issue only exists on the master branch, which uses the newer kernel version.
Tested branch (Please provide the tested image version)
Tested on the master branch.