gbsyncd fails to do PHY device cleanup OR graceful deinit during docker restart and config reload. #16608

Closed
jaganbal-a opened this issue Sep 20, 2023 · 3 comments · Fixed by #16812

Comments

@jaganbal-a
Contributor

Problem description:

  1. gbsyncd fails to perform PHY device cleanup or a graceful deinit during docker restart and config reload.

  2. Restarting the ‘syncd0’ docker invokes /usr/bin/syncd_request_shutdown from syncd.sh. gbsyncd0 also receives this request and performs the PHY cleanup, which is incorrect.

Findings:

  1. Since “stopplatform1” is not implemented in files/scripts/gbsyncd.sh, gbsyncd does not get a notification for graceful shutdown.
  2. /usr/bin/docker exec -i /usr/bin/syncd_request_shutdown --cold, which syncd.sh triggers when the syncd docker shuts down or restarts, results in a shutdown request to the syncd process in both the gbsyncd and syncd dockers.
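To make finding 1 concrete, here is a minimal sketch of what a stopplatform1 hook in gbsyncd.sh might look like. It is an illustration only, not the change that was eventually merged: the SERVICE, DEV and DOCKER variables are assumptions introduced here, and only the syncd_request_shutdown path and the --cold flag come from the report. The sketch addresses only the missing notification; it does not by itself stop a request issued in one docker from being seen by the syncd process in the other.

```shell
#!/bin/sh
# Sketch only: a possible stopplatform1 hook for gbsyncd.sh.
# SERVICE and DEV are assumed names that combine into the container
# name seen in the logs (gbsyncd0). DOCKER is overridable so the
# function can be dry-run on a machine without docker.

SERVICE="${SERVICE:-gbsyncd}"
DEV="${DEV:-0}"
DOCKER="${DOCKER:-/usr/bin/docker}"

# Ask the syncd process inside this service's own container to shut down.
# The first argument selects the shutdown type and defaults to "cold".
stopplatform1() {
    type="${1:-cold}"
    ${DOCKER} exec -i "${SERVICE}${DEV}" /usr/bin/syncd_request_shutdown "--${type}"
}
```

Setting DOCKER=echo before calling stopplatform1 prints the command that would be run instead of executing it, which is a convenient way to check the container name.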

So how can the syncd_request_shutdown command be made to notify only the syncd process of the respective docker?

root@sonic:/home/cisco# /usr/bin/docker exec -i syncd0 /usr/bin/syncd_request_shutdown --cold
requested COLD shutdown

Sep 20 16:00:52.631030 sonic NOTICE syncd0#syncd_request_shutdown: :- loadFromFile: no context config specified, will load default context config
Sep 20 16:00:52.631030 sonic NOTICE syncd0#syncd_request_shutdown: :- insert: added switch: idx 0, hwinfo ''
Sep 20 16:00:52.631227 sonic NOTICE syncd0#syncd_request_shutdown: :- send: requested COLD shutdown
Sep 20 16:00:52.631447 sonic NOTICE syncd0#syncd: :- run: is asic queue empty: 1
Sep 20 16:00:52.631447 sonic NOTICE syncd0#syncd: :- run: drained queue
Sep 20 16:00:52.631467 sonic NOTICE gbsyncd0#GBSAI[14]: :- run: is asic queue empty: 1
Sep 20 16:00:52.631499 sonic NOTICE gbsyncd0#GBSAI[14]: :- run: drained queue
Sep 20 16:00:52.631499 sonic NOTICE syncd0#syncd: :- handleRestartQuery: received COLD switch shutdown event
Sep 20 16:00:52.631541 sonic NOTICE gbsyncd0#GBSAI[14]: :- handleRestartQuery: received COLD switch shutdown event
Sep 20 16:00:52.632546 sonic NOTICE gbsyncd0#GBSAI[14]: :- removeAllSwitches: Removing all switches
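
When reproducing this, it can help to list which containers actually handled the request. The helper below is a rough sketch that assumes the syslog line format shown above (a container name followed by `#` after the NOTICE level). The function name shutdown_receivers is hypothetical, and the log file location is left to the caller.

```shell
#!/bin/sh
# Sketch only: read syslog lines on stdin and print the unique container
# names that logged a handleRestartQuery shutdown event.
# Assumes lines shaped like:
#   <timestamp> <host> NOTICE <container>#<process>: :- handleRestartQuery: received ...

shutdown_receivers() {
    grep 'handleRestartQuery: received' \
        | sed -E 's/.* NOTICE ([^#]+)#.*/\1/' \
        | sort -u
}
```

Feeding it the log excerpt above should list both gbsyncd0 and syncd0, which is the symptom described in this issue; if each request reached only its own docker, a single name would be expected.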


Steps to reproduce the issue:

  1. gbsyncd restart
  2. config reload
  3. execute the command /usr/bin/docker exec -i syncd0 /usr/bin/syncd_request_shutdown --cold

Describe the results you received:

  1. With config reload, gbsyncd never receives a notification to gracefully deinit the PHY, because stopplatform1 is not implemented in gbsyncd.sh.
    Since the gbsyncd docker is terminated ~12 sec before the syncd docker, the syncd_request_shutdown issued in the syncd docker does not reach gbsyncd, which has already terminated.

  2. A syncd docker restart makes gbsyncd clean up the PHY.

Describe the results you expected:

gbsyncd should be notified of a graceful de-init/shutdown during config reload.
A syncd docker restart should clean up only the device controlled by that docker and should not send a notification to gbsyncd.

Output of show version:

Output of show techsupport:

@judyjoseph
Contributor

@jaganbal-a Could you share more info on which platform, which branch and build.

@jaganbal-a
Contributor Author

> @jaganbal-a Could you share more info on which platform, which branch and build.

Platform: Multi-ASIC Platform with PHY device: I see this issue with Cisco 88-LC36-FH-MO line card.
Branch: azure/cisco/msft/202205, which is in sync with 202205 branch commit 7c68be0.
Build: Cisco internal build.

@judyjoseph
Contributor

@abdosi FYI

lguohan pushed a commit that referenced this issue Dec 7, 2023
…6812)

Fix #16608. Need to gracefully shutdown syncd/gbsyncd individually.
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this issue Dec 15, 2023
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this issue Dec 16, 2023
jimmyzhai added a commit to jimmyzhai/sonic-buildimage that referenced this issue Dec 19, 2023
yxieca pushed a commit that referenced this issue Dec 20, 2023
…6812) (#17563)

Fix #16608. Need to gracefully shutdown syncd/gbsyncd individually.