[lldpd]: Use kernel autoprobe for netlink socket .nl_pid portion of the address #2164

Merged: 1 commit, Oct 18, 2018


pavel-shirshov
Contributor

- What I did
I added a patch to fix the lldpd error "lldpd[24]: unable to bind netlink socket: Address already in use".

- How I did it
Netlink socket addresses are not isolated by PID namespace: containers that share a network namespace also share one netlink address space.
Traditionally, when a netlink socket is created, the current process's pid is used as the nl_pid portion of the socket address. In a Docker environment, many processes can have the same pid number in different containers. When two such processes each bind a netlink socket with getpid() as its address, the second bind fails with "Address already in use". To avoid this, it is better to set .nl_pid = 0, which lets the kernel autoprobe a free address.
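
The effect of .nl_pid = 0 can be illustrated with a minimal sketch. It uses Python's socket module rather than lldpd's C code, assumes a Linux host, and the helper name open_netlink_autobind is hypothetical. In Python a netlink address is the tuple (nl_pid, nl_groups).

```python
import socket


def open_netlink_autobind():
    """Bind a NETLINK_ROUTE socket and let the kernel choose nl_pid.

    Hypothetical helper for illustration; this is not lldpd code.
    """
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    # nl_pid = 0 asks the kernel to pick a free port ID; nl_groups = 0 means
    # no multicast group subscriptions.
    sock.bind((0, 0))
    return sock


if __name__ == "__main__":
    first = open_netlink_autobind()
    second = open_netlink_autobind()
    # getsockname() reports the port ID the kernel actually assigned.
    print("first  nl_pid:", first.getsockname()[0])
    print("second nl_pid:", second.getsockname()[0])
    first.close()
    second.close()
```

Both sockets bind successfully and receive different port IDs, because the kernel never hands out an ID that is already in use.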

- How to verify it
Build lldpd with the patch. Start one lldpd process in one Docker container and another lldpd process in a second container. Make sure that both processes have the same pid.
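
The collision itself can also be reproduced without containers, as a rough sketch: bind one netlink socket, then try to bind a second one explicitly to the same nl_pid. This assumes a Linux host; bind_netlink and reproduce_collision are hypothetical helper names, and the sketch stands in for (but does not replace) the two-container check described above.

```python
import errno
import socket


def bind_netlink(nl_pid):
    """Try to bind a NETLINK_ROUTE socket to nl_pid; return (socket, errno)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    try:
        sock.bind((nl_pid, 0))
    except OSError as exc:
        sock.close()
        return None, exc.errno
    return sock, 0


def reproduce_collision():
    """Return the errno from binding a second socket to an nl_pid already in use."""
    first, _ = bind_netlink(0)          # kernel picks a free port ID
    taken = first.getsockname()[0]      # the port ID that is now occupied
    second, err = bind_netlink(taken)   # explicit bind to the same ID
    first.close()
    if second is not None:
        second.close()
    return err


if __name__ == "__main__":
    err = reproduce_collision()
    print("second bind failed with:", errno.errorcode.get(err, err))
```

This mirrors what happens when two lldpd processes with the same pid both pass getpid() as nl_pid.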

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

@lguohan lguohan merged commit 891e256 into master Oct 18, 2018
lguohan pushed a commit that referenced this pull request Oct 18, 2018
@stcheng stcheng deleted the pavelsh/lldp_pid branch October 19, 2018 07:35
judyjoseph added a commit that referenced this pull request Mar 20, 2022
6a6b711 (HEAD -> 202111, origin/202111) Fix issue: sometimes PFC WD unable to create zero buffer pool (#2164)
459aee0 Use abort instead of exit in case calling SAI API failure (#2170)
e767137 Fix issue config qos reload causing orchagent aborted via tracking dependencies among QoS tables (#2116)
liorghub added a commit to liorghub/sonic-buildimage that referenced this pull request Mar 23, 2022
Update sonic-swss submodule to include below commits:
d80094b [aclorch] Do not fail ACL rule remove flow if rule already deleted (sonic-net#2183)
bea0b70 [gcov]: Change coverage.xml file references (sonic-net#2120)
829b219 [tunnelmgrd]: Warm boot support (sonic-net#2166)
ad65b0a Fix issue: sometimes PFC WD unable to create zero buffer pool (sonic-net#2164)
608acc3 [doc] Moving Configuration.md from swss to yang sub-folder (sonic-net#2177)
0294376 [orchagent] NVGRE Tunnel orchestration agent implementation (sonic-net#1953)
ce88696 [ci] Update default sonic image downloading build ID. (sonic-net#2175)
liat-grozovik pushed a commit that referenced this pull request Mar 29, 2022
Update sonic-swss submodule to include below commits:
d80094b [aclorch] Do not fail ACL rule remove flow if rule already deleted (#2183)
bea0b70 [gcov]: Change coverage.xml file references (#2120)
829b219 [tunnelmgrd]: Warm boot support (#2166)
ad65b0a Fix issue: sometimes PFC WD unable to create zero buffer pool (#2164)
608acc3 [doc] Moving Configuration.md from swss to yang sub-folder (#2177)
0294376 [orchagent] NVGRE Tunnel orchestration agent implementation (#1953)
ce88696 [ci] Update default sonic image downloading build ID. (#2175)

Co-authored-by: liora <liora@nvidia.com>
Ndancejic pushed a commit to Ndancejic/sonic-buildimage that referenced this pull request May 3, 2022
…net#2164)

What I did
Fix issue: sometimes PFC WD is unable to create zero buffer pool.
On some platforms, an ingress/egress zero buffer profile will be applied on the PG and queue which are under PFC storm. The zero buffer profile is created based on zero buffer pool. However, sometimes it fails to create zero buffer pool due to too many buffer pools existing in the system.
Sometimes, there is a zero buffer pool existing on the system for reclaiming buffer. In that case, we can leverage it to create zero buffer profile for PFC WD.

Why I did it
Fix the issue by sharing the zero buffer pool between PFC WD and buffer orchagent.

How I verified it
Manually test
Run PFC WD test and PFC WD warm reboot test
Run unit test

Details if related
The detailed flow is like this:

PFC Storm detected:
- If there is a zero pool in PFC WD's cache, just create the zero buffer profile based on it.
- Otherwise, fetch the zero pool from buffer orchagent.
  - If there is one, create the zero buffer profile based on it.
  - Otherwise:
    - create a zero buffer pool
    - notify buffer orch about the zero buffer pool
  - In both cases, PFC WD should notify buffer orch to increase the reference number of the zero buffer pool.

Buffer orchagent:
- When creating the zero buffer pool:
  - check whether there is one already. If yes, skip the SAI API create_buffer_pool
  - increase the reference number.
- Before removing the zero buffer pool, decrease and check the reference number. If it is still nonzero after the decrement, skip the SAI API destroy_buffer_pool.
- When PFC WD decreases the reference number: remove the zero buffer pool if the reference number becomes zero.
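
The reference counting described in this flow can be sketched roughly as follows. This is an illustration only, not the orchagent implementation: the class SharedZeroPool and the callbacks create_pool and destroy_pool are hypothetical stand-ins for the buffer orch state and the SAI create/destroy calls.

```python
class SharedZeroPool:
    """Reference-counted zero buffer pool shared by two users (hypothetical)."""

    def __init__(self, create_pool, destroy_pool):
        self._create_pool = create_pool      # stand-in for the SAI create call
        self._destroy_pool = destroy_pool    # stand-in for the SAI destroy call
        self._pool_id = None
        self.ref_count = 0

    def acquire(self):
        """Create the pool only if none exists, then take a reference."""
        if self._pool_id is None:
            self._pool_id = self._create_pool()
        self.ref_count += 1
        return self._pool_id

    def release(self):
        """Drop a reference; destroy the pool only when the count reaches zero."""
        if self.ref_count == 0:
            raise RuntimeError("release() without a matching acquire()")
        self.ref_count -= 1
        if self.ref_count == 0:
            self._destroy_pool(self._pool_id)
            self._pool_id = None


if __name__ == "__main__":
    log = []
    pool = SharedZeroPool(
        create_pool=lambda: log.append("create") or "zero_pool",
        destroy_pool=lambda pool_id: log.append("destroy"),
    )
    pool.acquire()   # buffer orch creates the pool
    pool.acquire()   # PFC WD reuses it, no second create
    pool.release()   # buffer orch lets go, pool survives
    pool.release()   # PFC WD lets go, pool is destroyed
    print(log)
```

Whichever user releases last triggers the destroy, so the order in which PFC WD and buffer orch let go does not matter.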

Notes
We do not leverage the object_reference_map infrastructure to track the dependency because:
- It assumes the dependency will eventually be removed if an object is removed. That is NOT true in this scenario, because the PFC storm can last for a relatively long time and even span a warm reboot.
- The interfaces differ.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
SuvarnaMeenakshi added a commit that referenced this pull request Jul 6, 2022
Update sonic-utilities submodule to include below commits:
7070794 Fix DBConfig not initialize issue in pfcwd (#2238)
b5d6659 [config/load_mgmt_config] Support load IPv6 mgmt IP (#2206)
3274b0e Added bf_drivers.log to zipped dump after execution of "show techsupport" (#2164)
8dee36c [portstat] Update portstat to use CounterTable API (#2207)
7d9faf3 Added support for Sonic cross-compilation build. (#2233)
c3620fc [GCU] Moving UniqueLanes from only validating moves, to be a supplemental YANG validator (#2234)
Signed-off-by: Suvarna Meenakshi <sumeenak@microsoft.com>
vivekrnv added a commit to vivekrnv/sonic-buildimage that referenced this pull request Aug 26, 2022
…storm is detected (sonic-net#2304)

What I did
Avoid dropping traffic that is ingressing the port/PG that is in storm. The code changes in this PR avoid creating the ingress zero pool and profile, and do not attach any zero profile to the ingress PG when pfcwd is triggered.

Revert changes related to sonic-net#1480, where the retry mechanism was added to BufferOrch, which caches task retries while the PG is locked by PfcWdZeroBufferHandler.

Revert changes related to sonic-net#2164 in PfcWdZeroBufferHandler & ZeroBufferProfile & BufferOrch.

Updated UT's accordingly

How I verified it
UT's.
Ran the sonic-mgmt tests with the changes from sonic-net/sonic-mgmt#5665 and verified that they passed.

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>