[flex counter] Flex counter threads consume too much CPU resources. #9202

Open
StormLiangMS opened this issue Nov 9, 2021 · 0 comments · Fixed by sonic-net/sonic-swss#2031 or sonic-net/sonic-utilities#1925
Assignees
Labels
Triaged this issue has been triaged

Comments

@StormLiangMS
Contributor

Description

syncd starts multiple threads to collect counters, which consume a lot of CPU resources on some platforms. How severe the impact is depends on the number of ports and queues, and on the CPU.

Steps to reproduce the issue:

  1. Enable the flex counter.
  2. Check the CPU usage with the top command.
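
Step 2 can also be scripted when watching top by hand is not convenient. The following is a minimal Python sketch, not part of the original report, that estimates a process's CPU usage from two samples of /proc/<pid>/stat. The helper names are hypothetical, and it assumes the standard Linux procfs layout, where utime and stime are fields 14 and 15.

```python
import os
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so split on the last ')' before splitting the rest on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before: int, ticks_after: int,
                elapsed_s: float, ticks_per_s: int) -> float:
    """CPU usage over the sampling window, where 100.0 means one full core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * elapsed_s)


def sample_pid(pid: int, window_s: float = 5.0) -> float:
    """Sample a live process twice and return its CPU usage in percent."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = cpu_ticks(f.read())
    start = time.monotonic()
    time.sleep(window_s)
    with open(path) as f:
        after = cpu_ticks(f.read())
    return cpu_percent(before, after, time.monotonic() - start, ticks_per_s)
```

Calling sample_pid() with the syncd process ID before and after enabling the counters gives a rough comparison. The per-thread files under /proc/<pid>/task/<tid>/stat use the same layout, so cpu_ticks() should apply to them as well.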

Describe the results you received:

Describe the results you expected:

Output of show version:


SONiC Software Version: SONiC.20201231.41
Distribution: Debian 10.11
Kernel: 4.19.0-12-2-amd64
Build commit: 84eefd6578
Build date: Sat Oct 30 12:01:42 UTC 2021
Built by: cloudtest@3cfd51cec000000

Platform: x86_64-arista_7260cx3_64
HwSKU: Arista-7260CX3-D108C8
ASIC: broadcom
ASIC Count: 1
Serial Number: JPE20222159
Uptime: 02:03:32 up 10:09,  1 user,  load average: 1.54, 1.78, 1.87

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

@StormLiangMS StormLiangMS self-assigned this Nov 9, 2021
@zhangyanzhao zhangyanzhao added the Triaged this issue has been triaged label Nov 10, 2021
stephenxs added a commit to stephenxs/sonic-buildimage that referenced this issue Nov 23, 2021
bb0733a [aclorch] Add ACL_TABLE_TYPE configuration  (sonic-net#1982)
59cab5d Support for setting switch level DSCP to TC QoS map (sonic-net#2023)
da21172 [aclorch] add generic AclOrch::updateAclRule() method (sonic-net#1993)
4f6cb05 [Reclaiming buffer] Support reclaiming buffer in traditional model (sonic-net#2011)
32d7a69 [Reclaiming buffer] Common code update (sonic-net#1996)
b91d8ba [swss] L2 Forwarding Enhancements (sonic-net#1716)
797dab4 [muxorch] Bind all ports to drop ACL table (sonic-net#2027)
99929cd [lgtm.yml] add libgmock-dev (sonic-net#2035)
8727ae5 [flex counter] Flex counter threads consume too much CPU resources sonic-net#9202 (sonic-net#2031)
103fdf0 Remove redundant calls to get child scheduler group during initialization (sonic-net#1965)
18ea840 [macsec]: MACsec statistics support (sonic-net#1867)
0c46242 [orchagent] Flush pipeline every 1 second, not only when select will timeout (sonic-net#2003)
339101c [cbf] Add class-based forwarding support (sonic-net#1963)
24a615b Fix issue: accumulative headroom can exceed limit in rare scenario (sonic-net#2020)
708e232 Test divide by zero processing path (sonic-net#2028)
8f1d035 [macsecmgr]: Wait for port up before enabling macsec (sonic-net#2032)
4912a77 Remove buffer drop counter when port is removed (sonic-net#1860)
f9462c4 [Dynamic buffer] [Mellanox] Calculate the peer response time according to the speed (sonic-net#1930)
8b5a401 Routed subinterface enhancements (sonic-net#2017)
cdea5e9 Fix next hop compilation (sonic-net#2025)
37c197d [SRV6] Sonic-swss changes for SRV6 (sonic-net#1964)
f502c32 [vnetorch] Add ECMP support for vnet tunnel routes (sonic-net#1960)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
vivekrnv added a commit to vivekrnv/sonic-buildimage that referenced this issue Nov 23, 2021
a0bff26 [acl-loader] modify acl-loader with change in STATE DB ACL capability table (sonic-net#1896)
a395e28 [debug dump util] Changes for EVPN and VxLAN dump module (sonic-net#1892)
02a98ef [debug dump util] Route Module added (sonic-net#1913)
ac8382f [generic-config-updater] Logging change just before applying it (sonic-net#1934)
9ab6c51 [flex counter] Flex counter threads consume too much CPU resources. sonic-net#9202 (sonic-net#1925)
2ec47a5 [generic-config-updater] Handling empty tables while sorting a patch (sonic-net#1923)
fdedcbf [fdbshow]: Handle FDB cleanup gracefully. (sonic-net#1926)
e7535ae [sonic-cli-gen] first phase implementation of the SONiC CLI Auto-generation tool (sonic-net#1644)

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
liat-grozovik pushed a commit that referenced this issue Nov 24, 2021
bb0733a [aclorch] Add ACL_TABLE_TYPE configuration  (#1982)
59cab5d Support for setting switch level DSCP to TC QoS map (#2023)
da21172 [aclorch] add generic AclOrch::updateAclRule() method (#1993)
4f6cb05 [Reclaiming buffer] Support reclaiming buffer in traditional model (#2011)
32d7a69 [Reclaiming buffer] Common code update (#1996)
b91d8ba [swss] L2 Forwarding Enhancements (#1716)
797dab4 [muxorch] Bind all ports to drop ACL table (#2027)
99929cd [lgtm.yml] add libgmock-dev (#2035)
8727ae5 [flex counter] Flex counter threads consume too much CPU resources #9202 (#2031)
103fdf0 Remove redundant calls to get child scheduler group during initialization (#1965)
18ea840 [macsec]: MACsec statistics support (#1867)
0c46242 [orchagent] Flush pipeline every 1 second, not only when select will timeout (#2003)
339101c [cbf] Add class-based forwarding support (#1963)
24a615b Fix issue: accumulative headroom can exceed limit in rare scenario (#2020)
708e232 Test divide by zero processing path (#2028)
8f1d035 [macsecmgr]: Wait for port up before enabling macsec (#2032)
4912a77 Remove buffer drop counter when port is removed (#1860)
f9462c4 [Dynamic buffer] [Mellanox] Calculate the peer response time according to the speed (#1930)
8b5a401 Routed subinterface enhancements (#2017)
cdea5e9 Fix next hop compilation (#2025)
37c197d [SRV6] Sonic-swss changes for SRV6 (#1964)
f502c32 [vnetorch] Add ECMP support for vnet tunnel routes (#1960)

Signed-off-by: Stephen Sun <stephens@nvidia.com>
lguohan pushed a commit that referenced this issue Nov 24, 2021
a0bff26 [acl-loader] modify acl-loader with change in STATE DB ACL capability table (#1896)
a395e28 [debug dump util] Changes for EVPN and VxLAN dump module (#1892)
02a98ef [debug dump util] Route Module added (#1913)
ac8382f [generic-config-updater] Logging change just before applying it (#1934)
9ab6c51 [flex counter] Flex counter threads consume too much CPU resources. #9202 (#1925)
2ec47a5 [generic-config-updater] Handling empty tables while sorting a patch (#1923)
fdedcbf [fdbshow]: Handle FDB cleanup gracefully. (#1926)
e7535ae [sonic-cli-gen] first phase implementation of the SONiC CLI Auto-generation tool (#1644)

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
theasianpianist pushed a commit to theasianpianist/sonic-buildimage that referenced this issue Feb 5, 2022
…nic-net#9202 (sonic-net#2031)

* [flex counter] Flex counter threads consume too much CPU resources sonic-net#9202

1. The watermark flex counter threads consume a lot of CPU resources on some platforms.
Change the default interval for those counters from 10 seconds to 60 seconds to
work around this issue. The performance issue of flex counter reads needs to be addressed
separately; this is just a workaround.
If finer granularity is needed, the interval can be adjusted with the counterpoll
CLI in the SONiC shell.

* Increase the buffer pool watermark interval from 10 seconds to 60 seconds
taras-keryk pushed a commit to taras-keryk/sonic-buildimage that referenced this issue Apr 28, 2022
…onic-net#9202 (sonic-net#1925)

* [flex counter] Flex counter threads consume too much CPU resources. sonic-net#9202

Increase the interval upper limit from 30 seconds to 60 seconds

* only modify the interval of watermark counter group

* fix merge conflict

* fix commit issue
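
The commit messages above name counterpoll as the way to tune the polling interval and raise its upper limit to 60 seconds. As a rough illustration only, the Python sketch below builds such a command line and checks the interval against that limit. It assumes the command takes the form "counterpoll <group> interval <milliseconds>"; the function name and the 1-second lower bound are hypothetical and are not taken from this issue.

```python
import shlex

# Upper limit after the change described above (it was 30 seconds before).
MAX_INTERVAL_MS = 60_000
# Assumed lower bound for illustration; this issue does not state one.
MIN_INTERVAL_MS = 1_000


def counterpoll_interval_cmd(group: str, interval_ms: int) -> str:
    """Build a counterpoll command line that sets a group's polling interval."""
    if not MIN_INTERVAL_MS <= interval_ms <= MAX_INTERVAL_MS:
        raise ValueError(
            f"interval must be between {MIN_INTERVAL_MS} and "
            f"{MAX_INTERVAL_MS} ms, got {interval_ms}"
        )
    # Quote the group name so the result is safe to paste into a shell.
    return f"counterpoll {shlex.quote(group)} interval {interval_ms}"


if __name__ == "__main__":
    # Print the command for a 60-second watermark interval.
    print(counterpoll_interval_cmd("watermark", 60_000))
```

The sketch only prints the command rather than running it, so it can be checked off the switch before being applied.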
Projects
None yet
2 participants