Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sonic_y_cable] add abstract class YCableBase required for Y-cable API support for multiple vendors #186

Merged
merged 28 commits into from
Jul 29, 2021

Conversation

vdahiya12
Copy link
Contributor

@vdahiya12 vdahiya12 commented May 5, 2021

Description

For multiple Y-Cable vendor support once we do have a mapping from vendor/part number to the appropriate Y-Cable module to load, we need to map appropriate port to a module as well. This PR adds definition for base abstract class.

Motivation and Context

  • Basically, the key idea is once we have a port identified as being of a certain vendor and it has been identified to load a certain module, it is then needed to call the correct module/API on each port each time we call the API on the port

  • it is required to maintain this mapping in memory since xcvrd does not want to read/parse this y_cable_vendor_mapping.py file again and again each time we call the Y-Cable API

  • Also note that the module loaded might change during xcvrd running lifetime since cable/transceiver can be changed from one vendor to another. So we need to take this into consideration as well

Proposed Solution for this
Each module of the Y-Cable vendor can be a class (of each transceiver type) and all we need to do is instantiate the objects of these classes as class instances and these objects will provide the interface of calling the API's for the appropriate vendor Y-Cable.

This instantiation will be done inside xcvrd, when xcvrd starts

  • These objects basically emulate Y-Cable instances and whatever action/interaction needs to be done with the Y-Cable the methods of these objects would provide that

  • each vendor in their implementation can inherit from a base class where there will be definitions for all the supported capabilities of the Y-Cable.

  • for vendors the recommended approach in case their subclass implementation does not support a method, is to set the method equal to None. This differentiates it from a method they forgot to implement. Then, the calling code should first check if the method is None before attempting to call it.

Design document for the support is here

Signed-off-by: vaibhav-dahiya vdahiya@microsoft.com

How Has This Been Tested?

Not tested, since this PR just defines the abstract base class implementation.

Additional Information (Optional)

…I support for multiple vendors

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
Comment on lines 58 to 59
TOR A. This means if the Y cable is actively routing, the "check_active_linked_tor_side"
API will now return Tor A. It also implies that if the link is actively routing on this port, Y cable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe change 'routing' to 'sending', since 'routing' tends to imply some decision-making capability which is not the case for the cable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it still says 'routing'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 76 to 77
TOR B. This means if the Y cable is actively routing, the "check_active_linked_tor_side"
API will now return Tor B. It also implies that if the link is actively routing on this port, Y cable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


def check_if_link_is_active_for_NIC(self, port):
"""
This API specifically checks if NIC side of the Y cable's link is active
Copy link
Contributor

@theasianpianist theasianpianist May 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the definition we are using for 'active'? Does that mean the cable is plugged in, link is set to admin up, traffic is flowing through this link, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes basically link is up means cable is up and configured correctly, show int status should show "oper" up

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this definition commonly known? It might be worth including this info in the docstring.

Copy link
Contributor Author

@vdahiya12 vdahiya12 May 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm maybe I can elaborate a little. I think "oper" up on Cli show interfaces status might confuse the vendors, but objectively this API it's just their capability to check if all the parameters for transceiver initialization happened correctly, and the link is active. Actually when the lanes come up they send some packets internally on Serdes to check if lanes are functioning properly and a bunch of other things to check if link is healthy. While this generally translates to link is up, they cant tell if the link will be up on SONiC side. So I think we can make this definition more concrete but maybe we should not put show int status output a cross check for them. Thoughts ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as long as the vendors know what this means, it's fine as is. I'll defer to your judgement here.

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
"""
self.port = port

def toggle_mux_to_torA(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest renaming to toggle_mux_to_tor_a to make snake case consistent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


raise NotImplementedError

def enable_loopback_mode(self, target, mode=NEAR_END_LOOPBACK, lane_mask):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def enable_loopback_mode(self, target, mode=NEAR_END_LOOPBACK, lane_mask):
def enable_loopback_mode(self, target, lane_mask, mode=LOOPBACK_MODE_NEAR_END):

Suggest to swap the order of lane_mask and mode to fix 'non-default argument follows default argument' syntax error and correct the variable name with 'LOOPBACK_MODE_NEAR_END'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

raise NotImplementedError

def get_switch_count_tor_a(self, clear_on_read=False):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both get_switch_count_tor_a() and get_switch_count_tor_b() are not supported in current MCU firmware. The cable can only return the total auto/manual switch count by reading the upper page 0x5, byte 141 to 148. If this information is useful for diagnostics, I will update the API document and MCU FW to support these two APIs from next firmware release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, we defined a total manual switching count in upper page 0x4, byte 141-144. In order to support these two APIs

  • get_switch_count_tor_a()
  • get_switch_count_tor_b()

I'd like to change the definition as below

  • upper page 0x4, byte 141-144: number of manual switching count initiated from TOR A
  • upper page 0x4, byte 225-229: number of manual switching count initiated from TOR B

and the get_switch_count_total() will return the sum of TOR A (byte 141-144) and TOR B (byte 225 - 228).

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>

#############################################################################################
### Debug Functionality ###
#############################################################################################
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a general question of the debug functionality. We observed that those diagnostic functions (get_ber_inf(), enable_loopback_mode()...etc.) can not work well when AN/LT enabled. Both sides receiver will jump to wait for AN (auto-negotiation) state after we enabled the debug mode (ex. set prbs generation, set loopback mode...). The AN/LT can no longer be finished because the TX were constantly transmit PRBS data instead of responding the AN signal. User cannot get correct BER info due to the receiver is not ready.

To solve the deadlock between AN/LT and debug function, the alternative solution is to re-configure both sides (SONiC and y cable) to disable AN/LT first, then run the debugging functions (PRBS, BER, Loopback)
For the next new firmware, the y_cable will support a mode configuration register in upper page 0x4, byte 209. User can configure the speed (50G, 100G) and ANLT on/off via this register.
For SONiC OS, is there any simple way to enable/disable the AN/LT without reloading the Config_db?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think there is any other way for now, other than to reload the confg_db.json

raise NotImplementedError

def create_port(self, speed, fec_mode_tor_a=FEC_MODE_NONE, fec_mode_tor_b=FEC_MODE_NONE,
fec_mode_nic=FEC_MODE_NONE, anlt_tor_a=False, anlt_tor_b= False, anlt_nic=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it make sense to change the input parameter as suggestion?
For current y_cable firmware implementation, both TOR A and TOR B must be operated in the same mode. it is not allowed to configure AN/LT, FEC on TOR A and B individually.

Suggested change
fec_mode_nic=FEC_MODE_NONE, anlt_tor_a=False, anlt_tor_b= False, anlt_nic=False):
fec_mode_nic=FEC_MODE_NONE, anlt_tor=False, anlt_nic=False):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, can you please check


raise NotImplementedError

def create_port(self, speed, fec_mode_tor_a=FEC_MODE_NONE, fec_mode_tor_b=FEC_MODE_NONE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def create_port(self, speed, fec_mode_tor_a=FEC_MODE_NONE, fec_mode_tor_b=FEC_MODE_NONE,
def create_port(self, speed, fec_mode_tor=FEC_MODE_NONE,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, can you please check

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
sonic_y_cable/y_cable_base.py Outdated Show resolved Hide resolved
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>

def log_warning(self, msg):
self.main_logger.log_warning("y_cable_port {} {}".format(self.port, msg))
self._logger.log_warning("y_cable_port {} {}".format(self.port, msg))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to add something like a colon as a delimiter. E.g, y_cable_port {}: {}. Do this for all log functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

vdahiya12 and others added 2 commits July 1, 2021 14:45
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
@vdahiya12 vdahiya12 requested a review from prgeor July 19, 2021 23:25
FEC_MODE_FC = 2

# definitions of the modes to be run for loopback mode
# on the port/cable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we add LOOPBACK_MODE_NONE if the loopback mode is not enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update the PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

mode_value:
One of the following predefined constants, the mode to be run for loopback:
LOOPBACK_MODE_NEAR_END
LOOPBACK_MODE_FAR_END
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we return LOOPBACK_MODE_NONE if this feature is not enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Signed-off-by: xinyu <xinyu0123@gmail.com>

raise NotImplementedError

def get_pcs_stats(self, target):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def get_pcs_stats(self, target):
def get_pcs_stats(self, target, clear_on_read=False):

Is it necessary to add clear_on_read to reset the pcs statistics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the feature cannot be supported, you can return an error when the caller tries to enable clear_on_read= True;
I believe other vendors might be able to support this, hence it is kept as an argument.


raise NotImplementedError

def get_fec_stats(self, target):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def get_fec_stats(self, target):
def get_fec_stats(self, target, clear_on_read=False):

Is it necessary to add clear_on_read to reset the fec statistics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the feature cannot be supported, you can return an error when the caller tries to enable clear_on_read= True;
I believe other vendors might be able to support this, hence it is kept as an argument.

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Copy link
Contributor

@lguohan lguohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:


raise NotImplementedError

def get_target_cursor_values(self, lane, target):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of cursor, can we rename to get_target_equalization() as that would be more generic? pre-tap, post-tap, pre-emphasis, pre-cursor, post-cursor all fall under equalization settings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me make a note for this, all vendors will also need to be notfied for this. We can put this in a future PR.


raise NotImplementedError

def set_target_cursor_values(self, lane, cursor_values, target):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of cursor, can we rename to set_target_equalization() as that would be more generic? pre-tap, post-tap, pre-emphasis, pre-cursor, post-cursor all fall under equalization settings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me make a note for this, all vendors will also need to be notfied for this. We can put this in a future PR

a boolean, False, Non-hitless upgrade: it will update the firmware regardless
the current status, a link flip can be observed during the upgrade.
Returns:
One of the following predefined constants:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a vendor doesn't support hitless upgrade, then we should have a return value indicating the failure.

Copy link
Contributor Author

@vdahiya12 vdahiya12 Jul 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently all vendors need to support hitless by default. The firmware error codes are going to be in a future PR as they are under discussion by vendors. Currently the recommendation is if there is an error which is not defined, vendor can return a generic error EEPROM_ERROR, which the API is called for.


raise NotImplementedError

def get_firmware_version(self, target):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can Y-cable function with different targets running different firmware versions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, depends on vendor. Some vendors have a unique version on all MCU's whereas some could have different versions.


raise NotImplementedError

def get_eye_heights(self, target):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NRZ signal has single eye, whereas PAM4 signal has 3 eyes. So we should return something like.

{lane#: eye1, eye2, eye3} for PAM4 where eye1 is for upper eye, eye2 is for middle eye and eye3 for lower eye heights respectively.
{lane#: eye} for NRZ signal

PAM4 is used in QSFP-DD, OSFP

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently all vendors are on NRZ and that's why we have a single eye value for each lane.


raise NotImplementedError

def get_switch_count_target(self, switch_count_type, target, clear_on_read=False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API will fail for target=TARGET_NIC but the return type don't indicate the failure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noted for a further enhancement PR

raise NotImplementedError

def get_switching_mode(self):
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will this api return if set_switching_mode() is not called? or in other words what is the default mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by default the mode has to be manual. There can never be a no mode value. it is either auto or manual


raise NotImplementedError

def get_alive_status(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this API return true if all the API is_link_active() returns true for ALL targets?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No link active is different from get_alive_status. get_alive_status just checks whether all sides of the cable are plugged in or not.


raise NotImplementedError

def get_autoswitch_hysteresis_timer(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this return 0 if hystersis timer is not configured or return the default value?

Copy link
Contributor Author

@vdahiya12 vdahiya12 Jul 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should return the value corresponding to the eeprom offset where the definition is. All vendors could have different hysteresis_timer values but default is 5 s


raise NotImplementedError

def reset(self, target):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please specify if this API affect the ongoing traffic on the link?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if target is NIC this will reset.

@vdahiya12 vdahiya12 merged commit c43dc17 into sonic-net:master Jul 29, 2021
@vdahiya12 vdahiya12 deleted the abstract_class branch July 29, 2021 06:24
qiluo-msft pushed a commit that referenced this pull request Aug 3, 2021
…I support for multiple vendors (#186)

Description
For multiple Y-Cable vendor support once we do have a mapping from vendor/part number to the appropriate Y-Cable module to load, we need to map appropriate port to a module as well. This PR adds definition for base abstract class.

Motivation and Context
Basically, the key idea is once we have a port identified as being of a certain vendor and it has been identified to load a certain module, it is then needed to call the correct module/API on each port each time we call the API on the port

it is required to maintain this mapping in memory since xcvrd does not want to read/parse this y_cable_vendor_mapping.py file again and again each time we call the Y-Cable API

Also note that the module loaded might change during xcvrd running lifetime since cable/transceiver can be changed from one vendor to another. So we need to take this into consideration as well

Proposed Solution for this
Each module of the Y-Cable vendor can be a class (of each transceiver type) and all we need to do is instantiate the objects of these classes as class instances and these objects will provide the interface of calling the API's for the appropriate vendor Y-Cable.

This instantiation will be done inside xcvrd, when xcvrd starts

These objects basically emulate Y-Cable instances and whatever action/interaction needs to be done with the Y-Cable the methods of these objects would provide that

each vendor in their implementation can inherit from a base class where there will be definitions for all the supported capabilities of the Y-Cable.

for vendors the recommended approach in case their subclass implementation does not support a method, is to set the method equal to None. This differentiates it from a method they forgot to implement. Then, the calling code should first check if the method is None before attempting to call it.

Design document for the support is sonic-net/SONiC#757

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants