Missing FEC port statistics on management port cause port rate script to fail #10850

Closed
lukasstockner opened this issue May 16, 2022 · 3 comments · Fixed by sonic-net/sonic-sairedis#1052
Labels
BRCM, Triaged

Comments

@lukasstockner
Contributor

Description

When running a recent 202111 image on a Celestica Seastone2, the syslog gets spammed with errors related to FEC counters and Redis failures, and show interface counters doesn't show port data rates on ports past Ethernet63.

Steps to reproduce the issue:

  1. Install recent 202111 image on Celestica Seastone2
  2. Check syslog

Describe the results you received:

When SONiC tries to query the port counters on the front-panel 10G management port, it fails to get FEC counters (presumably the mgmt port does not support FEC) and then fails to collect any info.
Therefore, the relevant DB entry stays mostly empty:

  "COUNTERS:oid:0x1000000000042": {
    "expireat": 1652720954.8880002,
    "ttl": -0.001,
    "type": "hash",
    "value": {
      "SAI_PORT_STAT_IN_DROPPED_PKTS": "0",
      "SAI_PORT_STAT_OUT_DROPPED_PKTS": "0"
    }
  },

This in turn causes port_rates.lua to fail, since it expects the entry to contain various counters.
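The failure mode described above can be checked offline against a dump of the COUNTERS entries. The following is a minimal sketch, not the logic of port_rates.lua itself: the set of required counters is an assumption chosen for illustration, and `missing_counters` is a hypothetical helper. Inside a Redis Lua script a missing hash field is returned as `false`, which would be consistent with the "a boolean value" wording in the log below.

```python
# Assumed set of counters a rate script would need. The real
# port_rates.lua may read a different set.
REQUIRED_COUNTERS = (
    "SAI_PORT_STAT_IF_IN_OCTETS",
    "SAI_PORT_STAT_IF_OUT_OCTETS",
    "SAI_PORT_STAT_IF_IN_UCAST_PKTS",
    "SAI_PORT_STAT_IF_OUT_UCAST_PKTS",
)


def missing_counters(entry, required=REQUIRED_COUNTERS):
    """Return the required counter names that are absent from a COUNTERS hash."""
    return [name for name in required if name not in entry]


# The "value" hash from the dump above: only the two drop counters exist.
mgmt_port = {
    "SAI_PORT_STAT_IN_DROPPED_PKTS": "0",
    "SAI_PORT_STAT_OUT_DROPPED_PKTS": "0",
}

print(missing_counters(mgmt_port))
```

For the management port entry shown above, every counter in the assumed required set is reported as missing.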

Relevant logs:

INFO syncd#syncd: [none] SAI_API_PORT:brcm_sai_get_port_stats:4948 Get Port Phy Control FEC Symbol Error Count failed with error Feature unavailable (0xfffffff0).
ERR syncd#syncd: :- collectPortCounters: Failed to get stats of port 0x100000042: -2
ERR syncd#syncd: :- guard: RedisReply catches system_error: command: *85#015#012$7#015#012EVALSHA#015#012$40#015#012db030d61747eb633f38f61ddeaed7f27739807d0#015#012$2#015#01278#015#012$19#015#
012oid:0x1000000000002#015#012$19#015#012oid:0x1000000000003#015#012$19#015#012oid:0x1000000000004#015#012$19#015#012oid:0x1000000000005#015#012$19#015#012oid:0x1000000000006#015#012$19#015#012oid:0x1000000000007#015#012$19#015#012oid:0x1
000000000008#015#012$19#015#012oid:0x1000000000009#015#012$19#015#012oid:0x100000000000c#015#012$19#015#012oid:0x100000000000d#015#012$19#015#012oid:0x100000000000e#015#012$19#015#012oid:0x100000000000f#015#012$19#015#012oid:0x10000000000
10#015#012$19#015#012oid:0x1000000000011#015#012$19#015#012oid:0x1000000000012#015#012$19#015#012oid:0x1000000000013#015#012$19#015#012oid:0x1000000000014#015#012$19#015#012oid:0x1000000000015#015#012$19#015#012oid:0x1000000000016#015#012
$19#015#012oid:0x1000000000017#015#012$19#015#012oid:0x1000000000018#015#012$19#015#012oid:0x1000000000019#015#012$19#015#012oid:0x100000000001a#015#012$19#015#012oid:0x100000000001b#015#012$19#015#012oid:0x100000000001c#015#012$19#015#01
2oid:0x100000000001d#015#012$19#015#012oid:0x100000000001e#015#012$19#015#012oid:0x100000000001f#015#012$19#015#012oid:0x1000000000020#015#012$19#015#012oid:0x1000000000021#015#012$19#015#012oid:0x1000000000022#015#012$19#015#012oid:0x100
0000000023#015#012$19#015#012oid:0x1000000000024#015#012$19#015#012oid:0x1000000000025#015#012$19#015#012oid:0x1000000000026#015#012$19#015#012oid:0x1000000000027#015#012$19#015#012oid:0x1000000000028#015#012$19#015#012oid:0x1000000000029#015#012$19#015#012oid:0x100000000002a#015#012$19#015#012oid:0x100000000002b#015#012$19#015#012oid:0x100000000002c#015#012$19#015#012oid:0x100000000002d#015#012$19#015#012oid:0x100000000002e#015#012$19#015#012oid:0x100000000002f#015#012$19#015#012oid:0x1000000000030#015#012$19#015#012oid:0x1000000000031#015#012$19#015#012oid:0x1000000000032#015#012$19#015#012oid:0x1000000000033#015#012$19#015#012oid:0x1000000000034#015#012$19#015#012oid:0x1000000000035#015#012$19#015#012oid:0x1000000000036#015#012$19#015#012oid:0x1000000000037#015#012$19#015#012oid:0x1000000000038#015#012$19#015#012oid:0x1000000000039#015#012$19#015#012oid:0x100000000003a#015#012$19#015#012oid:0x100000000003b#015#012$19#015#012oid:0x100000000003c#015#012$19#015#012oid:0x100000000003d#015#012$19#015#012oid:0x100000000003e#015#012$19#015#012oid:0x100000000003f#015#012$19#015#012oid:0x1000000000040#015#012$19#015#012oid:0x1000000000041#015#012$19#015#012oid:0x1000000000042#015#012$19#015#012oid:0x1000000000043#015#012$19#015#012oid:0x1000000000044#015#012$19#015#012oid:0x1000000000045#015#012$19#015#012oid:0x1000000000046#015#012$19#015#012oid:0x1000000000047#015#012$19#015#012oid:0x1000000000048#015#012$19#015#012oid:0x1000000000049#015#012$19#015#012oid:0x100000000004a#015#012$19#015#012oid:0x1000000000052#015#012$19#015#012oid:0x1000000000053#015#012$19#015#012oid:0x1000000000054#015#012$19#015#012oid:0x1000000000055#015#012$19#015#012oid:0x1000000000056#015#012$19#015#012oid:0x1000000000057#015#012$19#015#012oid:0x1000000000058#015#012$1#015#0122#015#012$8#015#012COUNTERS#015#012$4#015#0121000#015#012$2#015#012''#015#012, reason: ERR Error running script (call to f_db030d61747eb633f38f61ddeaed7f27739807d0): @user_script:86: @user_script: 86: Lua redis() command 
arguments must be strings or integers : Input/output error
...
ERR syncd#syncd: :- guard: RedisReply catches system_error: command: *94#015#012$7#015#012EVALSHA#015#012$40#015#012db030d61747eb633f38f61ddeaed7f27739807d0#015#012$2#015#01287#015#012$19#015#
012oid:0x1000000000002#015#012$19#015#012oid:0x1000000000003#015#012$19#015#012oid:0x1000000000004#015#012$19#015#012oid:0x1000000000005#015#012$19#015#012oid:0x1000000000006#015#012$19#015#012oid:0x1000000000007#015#012$19#015#012oid:0x1
000000000008#015#012$19#015#012oid:0x1000000000009#015#012$19#015#012oid:0x100000000000a#015#012$19#015#012oid:0x100000000000b#015#012$19#015#012oid:0x100000000000c#015#012$19#015#012oid:0x100000000000d#015#012$19#015#012oid:0x10000000000
0e#015#012$19#015#012oid:0x100000000000f#015#012$19#015#012oid:0x1000000000010#015#012$19#015#012oid:0x1000000000011#015#012$19#015#012oid:0x1000000000012#015#012$19#015#012oid:0x1000000000013#015#012$19#015#012oid:0x1000000000014#015#012$19#015#012oid:0x1000000000015#015#012$19#015#012oid:0x1000000000016#015#012$19#015#012oid:0x1000000000017#015#012$19#015#012oid:0x1000000000018#015#012$19#015#012oid:0x1000000000019#015#012$19#015#012oid:0x100000000001a#015#012$19#015#012oid:0x100000000001b#015#012$19#015#012oid:0x100000000001c#015#012$19#015#012oid:0x100000000001d#015#012$19#015#012oid:0x100000000001e#015#012$19#015#012oid:0x100000000001f#015#012$19#015#012oid:0x1000000000020#015#012$19#015#012oid:0x1000000000021#015#012$19#015#012oid:0x1000000000022#015#012$19#015#012oid:0x1000000000023#015#012$19#015#012oid:0x1000000000024#015#012$19#015#012oid:0x1000000000025#015#012$19#015#012oid:0x1000000000026#015#012$19#015#012oid:0x1000000000027#015#012$19#015#012oid:0x1000000000028#015#012$19#015#012oid:0x1000000000029#015#012$19#015#012oid:0x100000000002a#015#012$19#015#012oid:0x100000000002b#015#012$19#015#012oid:0x100000000002c#015#012$19#015#012oid:0x100000000002d#015#012$19#015#012oid:0x100000000002e#015#012$19#015#012oid:0x100000000002f#015#012$19#015#012oid:0x1000000000030#015#012$19#015#012oid:0x1000000000031#015#012$19#015#012oid:0x1000000000032#015#012$19#015#012oid:0x1000000000033#015#012$19#015#012oid:0x1000000000034#015#012$19#015#012oid:0x1000000000035#015#012$19#015#012oid:0x1000000000036#015#012$19#015#012oid:0x1000000000037#015#012$19#015#012oid:0x1000000000038#015#012$19#015#012oid:0x1000000000039#015#012$19#015#012oid:0x100000000003a#015#012$19#015#012oid:0x100000000003b#015#012$19#015#012oid:0x100000000003c#015#012$19#015#012oid:0x100000000003d#015#012$19#015#012oid:0x100000000003e#015#012$19#015#012oid:0x100000000003f#015#012$19#015#012oid:0x1000000000040#015#012$19#015#012oid:0x1000000000041#015#012$19#015#012oid:0x1000000000042#015#012$19#015#012oid
:0x1000000000043#015#012$19#015#012oid:0x1000000000044#015#012$19#015#012oid:0x1000000000045#015#012$19#015#012oid:0x1000000000046#015#012$19#015#012oid:0x1000000000047#015#012$19#015#012oid:0x1000000000048#015#012$19#015#012oid:0x1000000000049#015#012$19#015#012oid:0x100000000004a#015#012$19#015#012oid:0x100000000004b#015#012$19#015#012oid:0x100000000004c#015#012$19#015#012oid:0x100000000004d#015#012$19#015#012oid:0x100000000004e#015#012$19#015#012oid:0x100000000004f#015#012$19#015#012oid:0x1000000000050#015#012$19#015#012oid:0x1000000000051#015#012$19#015#012oid:0x1000000000052#015#012$19#015#012oid:0x1000000000053#015#012$19#015#012oid:0x1000000000054#015#012$19#015#012oid:0x1000000000055#015#012$19#015#012oid:0x1000000000056#015#012$19#015#012oid:0x1000000000057#015#012$19#015#012oid:0x1000000000058#015#012$1#015#0122#015#012$8#015#012COUNTERS#015#012$4#015#0121000#015#012$2#015#012''#015#012, reason: ERR Error running script (call to f_db030d61747eb633f38f61ddeaed7f27739807d0): @user_script:56: user_script:56: attempt to perform arithmetic on local 'in_octets' (a boolean value) : Input/output error
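The `#015#012` sequences in the log above appear to be syslog's octal escaping of CR and LF, which would make the logged command a Redis protocol (RESP) array of bulk strings. The following is a minimal sketch for turning such a line into a readable argument list, assuming the wrapped log lines have first been joined back into one string. Both helper names are hypothetical, and the parser assumes no argument contains an embedded line break.

```python
import re


def unescape_syslog(text):
    """Turn '#ooo' octal escapes (for example '#015#012') back into raw characters."""
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), text)


def parse_resp_array(raw):
    """Parse a RESP array of bulk strings: '*N', then N pairs of '$len' and payload."""
    parts = raw.split("\r\n")
    if not parts[0].startswith("*"):
        raise ValueError("not a RESP array")
    count = int(parts[0][1:])
    args = []
    for i in range(count):
        header, payload = parts[1 + 2 * i], parts[2 + 2 * i]
        # Each bulk string announces its length; reject anything inconsistent.
        if not header.startswith("$") or int(header[1:]) != len(payload):
            raise ValueError(f"bad bulk string at argument {i}")
        args.append(payload)
    return args


# A shortened command in the same shape as the log lines above.
sample = (
    "*3#015#012$7#015#012EVALSHA#015#012$2#015#01278"
    "#015#012$19#015#012oid:0x1000000000002#015#012"
)
print(parse_resp_array(unescape_syslog(sample)))
```

Read this way, the third field of each logged command is the EVALSHA key count: 78 in the first command and 87 in the second.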

Describe the results you expected:

SONiC should just ignore the unsupported FEC counters, or at least skip processing of port rates on the port with missing info instead of failing for all ports.

Output of show version:

f71c57e581b1c3610a83b29892b60e1b23ed251f

Output of show techsupport:

Additional information you deem important (e.g. issue happens only occasionally):

@zhangyanzhao
Collaborator

Known issue with BRCM SAI.

zhangyanzhao added the Triaged and BRCM labels May 25, 2022
@zhangyanzhao
Collaborator

@gechiang

@gechiang
Collaborator

We have already opened a case for this with the BRCM SAI team. Their initial reply is that this is not supported on TH3 devices, but it looks like it may be a generic issue for all 10G ports to which FEC does not apply. Will update once this is confirmed and we have planned a course of action.

jimmyzhai added a commit to sonic-net/sonic-swss that referenced this issue Jun 3, 2022
What I did
Partially fix issue sonic-net/sonic-buildimage#10850 by adding a sanity check in port_rates.lua. If the required counters of a port cannot be retrieved, skip its rate computation.

Why I did it
It prevents port_rates.lua from exiting abnormally.
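The sanity check described in this commit message can be illustrated as follows. This is a simplified sketch in Python, not the Lua code that was merged: the function name, the choice of required counters, and the plain delta-over-interval rate are all assumptions made for illustration.

```python
# Assumed minimum set of counters needed to compute a byte rate.
REQUIRED = ("SAI_PORT_STAT_IF_IN_OCTETS", "SAI_PORT_STAT_IF_OUT_OCTETS")


def compute_rates(current, previous, interval_s, required=REQUIRED):
    """Compute per-port rates, skipping ports whose required counters are missing.

    current and previous map a port OID to its counter hash (string values).
    Returns (rates, skipped) so one bad port does not abort the others.
    """
    rates, skipped = {}, []
    for port, counters in current.items():
        prev = previous.get(port, {})
        if any(name not in counters or name not in prev for name in required):
            skipped.append(port)  # incomplete data: skip this port only
            continue
        rates[port] = {
            name: (int(counters[name]) - int(prev[name])) / interval_s
            for name in required
        }
    return rates, skipped


previous = {
    "oid:0x1000000000041": {REQUIRED[0]: "1000", REQUIRED[1]: "500"},
    "oid:0x1000000000042": {"SAI_PORT_STAT_IN_DROPPED_PKTS": "0"},
}
current = {
    "oid:0x1000000000041": {REQUIRED[0]: "3000", REQUIRED[1]: "1500"},
    "oid:0x1000000000042": {"SAI_PORT_STAT_IN_DROPPED_PKTS": "0"},
}
rates, skipped = compute_rates(current, previous, interval_s=1.0)
print(rates, skipped)
```

Returning the skipped ports separately keeps the failure visible to the caller while the remaining ports still get their rates.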
jimmyzhai added a commit to sonic-net/sonic-sairedis that referenced this issue Jun 9, 2022
Fix sonic-net/sonic-buildimage#10850.

An ASIC may have different port types, and therefore different port stats capabilities.
Instead of using one cached list of supported port counter IDs for all ports, this change
retrieves the supported port counter list per port.
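The difference between one shared capability list and a per-port list can be sketched as below. This is an illustration only and does not reproduce the sairedis change: `PerPortCapabilities` and the `try_read` callback are hypothetical stand-ins for whatever query the SAI layer provides, and the counter names are used as examples.

```python
class PerPortCapabilities:
    """Cache the supported counter list separately for each port."""

    def __init__(self, wanted, try_read):
        self._wanted = list(wanted)
        self._try_read = try_read  # hypothetical: returns True if the port can report the counter
        self._cache = {}

    def for_port(self, port):
        # Probe once per port instead of reusing the first port's answer everywhere.
        if port not in self._cache:
            self._cache[port] = [c for c in self._wanted if self._try_read(port, c)]
        return self._cache[port]


WANTED = ["SAI_PORT_STAT_IF_IN_OCTETS", "SAI_PORT_STAT_IF_IN_FEC_SYMBOL_ERRORS"]


def fake_try_read(port, counter):
    """Stand-in for the hardware query: the management port has no FEC counters."""
    return not (port == "mgmt" and "FEC" in counter)


caps = PerPortCapabilities(WANTED, fake_try_read)
print(caps.for_port("Ethernet0"))
print(caps.for_port("mgmt"))
```

With a single shared list, the FEC counter would be requested from the management port as well, and the whole read for that port could fail as in the log above. With a per-port list, only the counters that port reports are requested.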
yxieca pushed a commit to sonic-net/sonic-sairedis that referenced this issue Jun 9, 2022
lukasstockner added a commit to genesiscloud/sonic-buildimage that referenced this issue Jun 13, 2022
yxieca pushed a commit to sonic-net/sonic-swss that referenced this issue Jun 15, 2022
preetham-singh pushed a commit to preetham-singh/sonic-swss that referenced this issue Aug 6, 2022
pettershao-ragilenetworks pushed a commit to pettershao-ragilenetworks/sonic-sairedis that referenced this issue Nov 18, 2022
skbarista pushed a commit to skbarista/sonic-sairedis that referenced this issue Dec 2, 2022
lukasstockner pushed a commit to genesiscloud/sonic-swss that referenced this issue Mar 31, 2023
lukasstockner pushed a commit to genesiscloud/sonic-sairedis that referenced this issue Mar 31, 2023
lukasstockner added a commit to genesiscloud/sonic-buildimage that referenced this issue Mar 31, 2023
lukasstockner pushed a commit to genesiscloud/sonic-sairedis that referenced this issue Mar 31, 2023