Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove PMON and XCVRD delay on system boot for modules controlled by SW #18295

Conversation

vadymhlushko-mlnx
Copy link
Contributor

@vadymhlushko-mlnx vadymhlushko-mlnx commented Mar 7, 2024

Why I did it

To support the fastboot feature alongside module controlled by sw feature.
Because if the feature is enabled we need the XCVRD to start as soon as possible to identify if the port is SW or FW controlled and bring the port to the UP state.

Work item tracking
  • Microsoft ADO (number only):

How I did it

Remove the delays for the PMON docker and XCVRD daemon on the SONiC image start.

How to verify it

Run the sonic-mgmt/tests/platform_tests/test_advanced_reboot.py - test_fast_reboot

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305
  • 202311

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@liat-grozovik liat-grozovik changed the title [IM] Remove the PMON and XCVRD delay on system boot Remove PMON and XCVRD delay on system boot for modules controlled by SW Mar 11, 2024
@liat-grozovik
Copy link
Collaborator

@vadymhlushko-mlnx is this cleanly cherry picked to 202311 or we need a dedicate PR? if such is required we should have the label set properly and relate to the new PR against 202311.

@vadymhlushko-mlnx
Copy link
Contributor Author

@vadymhlushko-mlnx is this cleanly cherry picked to 202311 or we need a dedicate PR? if such is required we should have the label set properly and relate to the new PR against 202311.

I've already created a separate PR for 202311 - [202311][IM] Remove the PMON and XCVRD delay on system boot #18296

@vadymhlushko-mlnx
Copy link
Contributor Author

@prgeor could you please help to merge

/bin/systemctl start pmon
debug "Started pmon service"
fi
debug "Starting pmon service..."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vadymhlushko-mlnx Why will Systemd not start the PMON early? Why not remove "After=Syncd" for your platform?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vadymhlushko-mlnx would you please check the comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prgeor XCVRD should be started after we create_switch and initialize all the ports, that is why we have a dependency for the syncd.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vadymhlushko-mlnx There is already a dependency between Xcvrd and Swss. Xcvrd waits for PortInitDone and untill portsyncd sets its after all host interfaces are created: https://github.com/sonic-net/sonic-swss/blob/19410232523283d99eaa211979cc1e58b6cb262f/portsyncd/portsyncd.cpp#L134

We are holding complete PMON init just for Xcvrd which doesn't seems right. Can we use PortInitDone as a synchronization?

Copy link
Contributor Author

@vadymhlushko-mlnx vadymhlushko-mlnx Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prgeor
This PR goal is to start the XCVRD CMIS manager task as soon as possible to link UP the SW-controlled ports as quickly as possible during the fastboot.
The XCVRD CMISMangerTaks already using the PortInitDone sync mechanism.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vadymhlushko-mlnx synchronization via systemd service file is fine. Why do we need to start pmon from synch.sh?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vadymhlushko-mlnx let systemd do the synchronization between syncd and pmon service

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prgeor Some services are started by shell scripts, some via systemd. Both approaches work in SONiC. Starting from shell script gives more flexibility

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prgeor I've removed unused code from the syncd.sh as you requested

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vadymhlushko-mlnx OK. is the PR still in draft or ready for review?

@vadymhlushko-mlnx
Copy link
Contributor Author

@vadymhlushko-mlnx vadymhlushko-mlnx force-pushed the master-pmon-remove-delay branch 2 times, most recently from 665dd46 to eed6e6c Compare May 2, 2024 09:18
@dprital
Copy link
Collaborator

dprital commented May 6, 2024

@prgeor , Can you please review as your comment was addressed ?

@yuazhe yuazhe mentioned this pull request May 8, 2024
11 tasks
Copy link
Contributor

@prgeor prgeor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vadymhlushko-mlnx please use systemd for synchronization between PMON and syncd

Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
Signed-off-by: vadymhlushko-mlnx <vadymh@nvidia.com>
@vadymhlushko-mlnx vadymhlushko-mlnx marked this pull request as draft May 20, 2024 14:53
@vadymhlushko-mlnx
Copy link
Contributor Author

New PR will be opened by @stepanblyschak

@stepanblyschak
Copy link
Collaborator

@prgeor Please have a look at #19190

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants