Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[vnetorch] [vxlanorch] fix a set of memory usage issues #2352

Merged
merged 2 commits into from
Jun 29, 2022

Conversation

Yakiv-Huryk
Copy link
Contributor

@Yakiv-Huryk Yakiv-Huryk commented Jun 24, 2022

What I did
Fixed the following memory usage issues:

  • [vnetorch] removeBfdSession(): the monitor_addr is invalid after the "nexthop_info_[vnet].erase(endpoint_addr)"
  • [vxlanorch] usage of an invalid iterator in updateRemoteEndPointIpRef()

ASAN reports:

vnetorch

==446==ERROR: AddressSanitizer: heap-use-after-free on address 0x6070003b2d38 at pc 0x7f5c8324b8dd bp 0x7ffe2cfc9660 sp 0x7ffe2cfc8e10
READ of size 4 at 0x6070003b2d38 thread T0
    #0 0x7f5c8324b8dc  (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x408dc)
    #1 0x7f5c82c6475a in swss::IpAddress::to_string[abi:cxx11]() const (/usr/lib/x86_64-linux-gnu/libswsscommon.so.0+0x2f075a)
    #2 0x55699e6c2d9d in VNetRouteOrch::removeBfdSession(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, NextHopKey const&, swss::IpAddress const&) orchagent/vnetorch.cpp:1570
    #3 0x55699e6c48a5 in VNetRouteOrch::delEndpointMonitor(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, NextHopGroupKey&) orchagent/vnetorch.cpp:1605
    #4 0x55699e6c7f5b in bool VNetRouteOrch::doRouteTask<VNetVrfObject>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, swss::IpPrefix&, NextHopGroupKey&, std::__cxx11::basic_string<char, std::char_traits<char>, std:
:allocator<char> >&, std::map<NextHopKey, swss::IpAddress, std::less<NextHopKey>, std::allocator<std::pair<NextHopKey const, swss::IpAddress> > > const&) orchagent/vnetorch.cpp:1060
    #5 0x55699e6cc12e in VNetRouteOrch::handleTunnel(Request const&) orchagent/vnetorch.cpp:1964
    #6 0x55699e695123 in VNetRouteOrch::delOperation(Request const&) orchagent/vnetorch.cpp:2007
    #7 0x55699dfd2bcf in Orch2::doTask(Consumer&) orchagent/orch.cpp:1067
    #8 0x55699dfda236 in Consumer::execute() orchagent/orch.cpp:235
    #9 0x55699dfa9d93 in OrchDaemon::start() orchagent/orchdaemon.cpp:716
    #10 0x55699de50fc0 in main orchagent/main.cpp:734
    #11 0x7f5c8232f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #12 0x55699df4db19  (/usr/bin/orchagent+0x339b19)

0x6070003b2d38 is located 56 bytes inside of 80-byte region [0x6070003b2d00,0x6070003b2d50)
freed by thread T0 here:
    #0 0x7f5c832f6aa0 in operator delete(void*) (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xebaa0)
    #1 0x55699e6fa23e in __gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<swss::IpAddress const, VNetNextHopInfo> > >::deallocate(std::_Rb_tree_node<std::pair<swss::IpAddress const, VNetNextHopInfo> >*, unsigned long) /usr/include/c++/8/ext/new_alloca
tor.h:125
    #2 0x55699e6fa23e in std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<swss::IpAddress const, VNetNextHopInfo> > > >::deallocate(std::allocator<std::_Rb_tree_node<std::pair<swss::IpAddress const, VNetNextHopInfo> > >&, std::_Rb_tree_nod
e<std::pair<swss::IpAddress const, VNetNextHopInfo> >*, unsigned long) /usr/include/c++/8/bits/alloc_traits.h:462
    #3 0x55699e6fa23e in std::_Rb_tree<swss::IpAddress, std::pair<swss::IpAddress const, VNetNextHopInfo>, std::_Select1st<std::pair<swss::IpAddress const, VNetNextHopInfo> >, std::less<swss::IpAddress>, std::allocator<std::pair<swss::IpAddress const, VNet
NextHopInfo> > >::_M_put_node(std::_Rb_tree_node<std::pair<swss::IpAddress const, VNetNextHopInfo> >*) /usr/include/c++/8/bits/stl_tree.h:603
    #4 0x55699e6fa23e in std::_Rb_tree<swss::IpAddress, std::pair<swss::IpAddress const, VNetNextHopInfo>, std::_Select1st<std::pair<swss::IpAddress const, VNetNextHopInfo> >, std::less<swss::IpAddress>, std::allocator<std::pair<swss::IpAddress const, VNet
NextHopInfo> > >::_M_drop_node(std::_Rb_tree_node<std::pair<swss::IpAddress const, VNetNextHopInfo> >*) /usr/include/c++/8/bits/stl_tree.h:670
    #5 0x55699e6fa23e in std::_Rb_tree<swss::IpAddress, std::pair<swss::IpAddress const, VNetNextHopInfo>, std::_Select1st<std::pair<swss::IpAddress const, VNetNextHopInfo> >, std::less<swss::IpAddress>, std::allocator<std::pair<swss::IpAddress const, VNet
NextHopInfo> > >::_M_erase_aux(std::_Rb_tree_const_iterator<std::pair<swss::IpAddress const, VNetNextHopInfo> >) /usr/include/c++/8/bits/stl_tree.h:2493
    #6 0x55699e6fa23e in std::_Rb_tree<swss::IpAddress, std::pair<swss::IpAddress const, VNetNextHopInfo>, std::_Select1st<std::pair<swss::IpAddress const, VNetNextHopInfo> >, std::less<swss::IpAddress>, std::allocator<std::pair<swss::IpAddress const, VNet
NextHopInfo> > >::_M_erase_aux(std::_Rb_tree_const_iterator<std::pair<swss::IpAddress const, VNetNextHopInfo> >, std::_Rb_tree_const_iterator<std::pair<swss::IpAddress const, VNetNextHopInfo> >) /usr/include/c++/8/bits/stl_tree.h:2507
    #7 0x55699e6fa23e in std::_Rb_tree<swss::IpAddress, std::pair<swss::IpAddress const, VNetNextHopInfo>, std::_Select1st<std::pair<swss::IpAddress const, VNetNextHopInfo> >, std::less<swss::IpAddress>, std::allocator<std::pair<swss::IpAddress const, VNet
NextHopInfo> > >::erase(swss::IpAddress const&) /usr/include/c++/8/bits/stl_tree.h:2518
    #8 0x55699e6c2d30 in std::map<swss::IpAddress, VNetNextHopInfo, std::less<swss::IpAddress>, std::allocator<std::pair<swss::IpAddress const, VNetNextHopInfo> > >::erase(swss::IpAddress const&) /usr/include/c++/8/bits/stl_map.h:1068
    #9 0x55699e6c2d30 in VNetRouteOrch::removeBfdSession(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, NextHopKey const&, swss::IpAddress const&) orchagent/vnetorch.cpp:1568
    #10 0x55699e6c48a5 in VNetRouteOrch::delEndpointMonitor(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, NextHopGroupKey&) orchagent/vnetorch.cpp:1605
    #11 0x55699e6c7f5b in bool VNetRouteOrch::doRouteTask<VNetVrfObject>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, swss::IpPrefix&, NextHopGroupKey&, std::__cxx11::basic_string<char, std::char_traits<char>, std
::allocator<char> >&, std::map<NextHopKey, swss::IpAddress, std::less<NextHopKey>, std::allocator<std::pair<NextHopKey const, swss::IpAddress> > > const&) orchagent/vnetorch.cpp:1060
    #12 0x55699e6cc12e in VNetRouteOrch::handleTunnel(Request const&) orchagent/vnetorch.cpp:1964
    #13 0x55699e695123 in VNetRouteOrch::delOperation(Request const&) orchagent/vnetorch.cpp:2007
    #14 0x55699dfd2bcf in Orch2::doTask(Consumer&) orchagent/orch.cpp:1067
    #15 0x55699dfda236 in Consumer::execute() orchagent/orch.cpp:235
    #16 0x55699dfa9d93 in OrchDaemon::start() orchagent/orchdaemon.cpp:716
    #17 0x55699de50fc0 in main orchagent/main.cpp:734
    #18 0x7f5c8232f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    
previously allocated by thread T0 here:
    #0 0x7f5c832f5d30 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xead30)
    #1 0x55699e69ee2f in __gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<swss::IpAddress const, VNetNextHopInfo> > >::allocate(unsigned long, void const*) /usr/include/c++/8/ext/new_allocator.h:111
    #2 0x55699e69ee2f in std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<swss::IpAddress const, VNetNextHopInfo> > > >::allocate(std::allocator<std::_Rb_tree_node<std::pair<swss::IpAddress const, VNetNextHopInfo> > >&, unsigned long) /usr
/include/c++/8/bits/alloc_traits.h:436
    #3 0x55699e69ee2f in std::_Rb_tree<swss::IpAddress, std::pair<swss::IpAddress const, VNetNextHopInfo>, std::_Select1st<std::pair<swss::IpAddress const, VNetNextHopInfo> >, std::less<swss::IpAddress>, std::allocator<std::pair<swss::IpAddress const, VNet
NextHopInfo> > >::_M_get_node() /usr/include/c++/8/bits/stl_tree.h:599
    #4 0x55699e69ee2f in std::_Rb_tree_node<std::pair<swss::IpAddress const, VNetNextHopInfo> >* std::_Rb_tree<swss::IpAddress, std::pair<swss::IpAddress const, VNetNextHopInfo>, std::_Select1st<std::pair<swss::IpAddress const, VNetNextHopInfo> >, std::les
s<swss::IpAddress>, std::allocator<std::pair<swss::IpAddress const, VNetNextHopInfo> > >::_M_create_node<std::piecewise_construct_t const&, std::tuple<swss::IpAddress const&>, std::tuple<> >(std::piecewise_construct_t const&, std::tuple<swss::IpAddress con
st&>&&, std::tuple<>&&) /usr/include/c++/8/bits/stl_tree.h:653
    #5 0x55699e69ee2f in std::_Rb_tree_iterator<std::pair<swss::IpAddress const, VNetNextHopInfo> > std::_Rb_tree<swss::IpAddress, std::pair<swss::IpAddress const, VNetNextHopInfo>, std::_Select1st<std::pair<swss::IpAddress const, VNetNextHopInfo> >, std::
less<swss::IpAddress>, std::allocator<std::pair<swss::IpAddress const, VNetNextHopInfo> > >::_M_emplace_hint_unique<std::piecewise_construct_t const&, std::tuple<swss::IpAddress const&>, std::tuple<> >(std::_Rb_tree_const_iterator<std::pair<swss::IpAddress
 const, VNetNextHopInfo> >, std::piecewise_construct_t const&, std::tuple<swss::IpAddress const&>&&, std::tuple<>&&) /usr/include/c++/8/bits/stl_tree.h:2414
    #6 0x55699e6f12e2 in std::map<swss::IpAddress, VNetNextHopInfo, std::less<swss::IpAddress>, std::allocator<std::pair<swss::IpAddress const, VNetNextHopInfo> > >::operator[](swss::IpAddress const&) /usr/include/c++/8/bits/stl_map.h:499
    #7 0x55699e6bfbff in VNetRouteOrch::createBfdSession(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, NextHopKey const&, swss::IpAddress const&) orchagent/vnetorch.cpp:1556
    #8 0x55699e6c2580 in VNetRouteOrch::setEndpointMonitor(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<NextHopKey, swss::IpAddress, std::less<NextHopKey>, std::allocator<std::pair<NextHopKey const, swss:
:IpAddress> > > const&, NextHopGroupKey&) orchagent/vnetorch.cpp:1587
    #9 0x55699e6c5f39 in bool VNetRouteOrch::doRouteTask<VNetVrfObject>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, swss::IpPrefix&, NextHopGroupKey&, std::__cxx11::basic_string<char, std::char_traits<char>, std:
:allocator<char> >&, std::map<NextHopKey, swss::IpAddress, std::less<NextHopKey>, std::allocator<std::pair<NextHopKey const, swss::IpAddress> > > const&) orchagent/vnetorch.cpp:916
    #10 0x55699e6cc12e in VNetRouteOrch::handleTunnel(Request const&) orchagent/vnetorch.cpp:1964
    #11 0x55699e695a03 in VNetRouteOrch::addOperation(Request const&) orchagent/vnetorch.cpp:1983
    #12 0x55699dfd2bcf in Orch2::doTask(Consumer&) orchagent/orch.cpp:1067
    #13 0x55699dfda236 in Consumer::execute() orchagent/orch.cpp:235
    #14 0x55699dfa9d93 in OrchDaemon::start() orchagent/orchdaemon.cpp:716
    #15 0x55699de50fc0 in main orchagent/main.cpp:734
    #16 0x7f5c8232f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

,

vxlanorch

==452==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6130000681e8 at pc 0x563b9f1f4311 bp 0x7ffdbfd062c0 sp 0x7ffdbfd062b8
READ of size 4 at 0x6130000681e8 thread T0
    #0 0x563b9f1f4310 in VxlanTunnel::updateRemoteEndPointIpRef(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool) orchagent/vxlanorch.cpp:1112
    #1 0x563b9f215407 in VxlanTunnelOrch::addTunnelUser(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned int, unsigned int, tunnel_user_t, unsigned long) orchagent/vxlanorch.cpp:1651
    #2 0x563b9ec18dc7 in RouteOrch::createRemoteVtep(unsigned long, NextHopKey const&) orchagent/routeorch.cpp:2456
    #3 0x563b9ec2a557 in RouteOrch::addRoute(RouteBulkContext&, NextHopGroupKey const&) orchagent/routeorch.cpp:1736
    #4 0x563b9ec40b99 in RouteOrch::doTask(Consumer&) orchagent/routeorch.cpp:856
    #5 0x563b9eb86236 in Consumer::execute() orchagent/orch.cpp:235
    #6 0x563b9eb55d93 in OrchDaemon::start() orchagent/orchdaemon.cpp:716
    #7 0x563b9e9fcfc0 in main orchagent/main.cpp:734
    #8 0x7f88f5dcd09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #9 0x563b9eaf9b19  (/usr/bin/orchagent+0x339b19)

0x6130000681e8 is located 16 bytes to the right of 344-byte region [0x613000068080,0x6130000681d8)
allocated by thread T0 here:
    #0 0x7f88f6d93d30 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xead30)
    #1 0x563b9f213122 in VxlanTunnelOrch::addOperation(Request const&) orchagent/vxlanorch.cpp:1590
    #2 0x563b9eb7ebcf in Orch2::doTask(Consumer&) orchagent/orch.cpp:1067
    #3 0x563b9eb86236 in Consumer::execute() orchagent/orch.cpp:235
    #4 0x563b9eb55d93 in OrchDaemon::start() orchagent/orchdaemon.cpp:716
    #5 0x563b9e9fcfc0 in main orchagent/main.cpp:734
    #6 0x7f88f5dcd09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

SUMMARY: AddressSanitizer: heap-buffer-overflow orchagent/vxlanorch.cpp:1112 in VxlanTunnel::updateRemoteEndPointIpRef(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool)
Shadow bytes around the buggy address:
  0x0c2680004fe0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2680004ff0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2680005000: fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa fa
  0x0c2680005010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c2680005020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c2680005030: 00 00 00 00 00 00 00 00 00 00 00 fa fa[fa]fa fa
  0x0c2680005040: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c2680005050: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2680005060: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2680005070: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  0x0c2680005080: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==452==ABORTING

Why I did it
To fix memory usage issues

How I verified it
Run the tests that were used to find the issues and checked the ASAN report

Details if related

* using a copy of monitor ip instead of a reference since the reference gets
invalidated after the endpoint is erased

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
* remove a usage of an invalid (end(tnl_users_)) iterator

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
@liat-grozovik
Copy link
Collaborator

@prsunny please review this change. Should you consider it as required for 202012?

@liat-grozovik
Copy link
Collaborator

/azp run Azure.sonic-swss

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@prsunny prsunny merged commit a8e238a into sonic-net:master Jun 29, 2022
yxieca pushed a commit that referenced this pull request Jun 30, 2022
* [vnetorch] fix use-after-free in removeBfdSession()
* using a copy of monitor ip instead of a reference since the reference gets invalidated after the endpoint is erased

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
@prsunny
Copy link
Collaborator

prsunny commented Jul 1, 2022

@Yakiv-Huryk , could you please raise a PR again 202012. It may have conflicts

@Yakiv-Huryk
Copy link
Contributor Author

@Yakiv-Huryk , could you please raise a PR again 202012. It may have conflicts

Sure: #2366

dprital added a commit to dprital/sonic-buildimage that referenced this pull request Jul 7, 2022
Update sonic-swss submodule pointer to include the following:
* Port configuration incremental update support ([sonic-net#2305](sonic-net/sonic-swss#2305))
* [VS Test] Skip failing subport tests ([sonic-net#2370](sonic-net/sonic-swss#2370))
* [teammgr]: Waiting MACsec ready before doLagMemberTask ([sonic-net#2286](sonic-net/sonic-swss#2286))
* [vnetorch] [vxlanorch] fix a set of memory usage issues ([sonic-net#2352](sonic-net/sonic-swss#2352))
* [tests] [asan] add graceful stop flag ([sonic-net#2347](sonic-net/sonic-swss#2347))
* [asan] suppress the static variable leaks ([sonic-net#2354](sonic-net/sonic-swss#2354))
* Add support for IP interface loopback action ([sonic-net#2307](sonic-net/sonic-swss#2307))
* [orchagent]: srv6orch support for uSID ([sonic-net#2335](sonic-net/sonic-swss#2335))

Signed-off-by: dprital <drorp@nvidia.com>
lolyu pushed a commit to lolyu/sonic-swss that referenced this pull request Jul 9, 2022
* [vnetorch] fix use-after-free in removeBfdSession()
* using a copy of monitor ip instead of a reference since the reference gets invalidated after the endpoint is erased

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
liat-grozovik pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Jul 14, 2022
Update sonic-swss submodule pointer to include the following:
* Port configuration incremental update support ([#2305](sonic-net/sonic-swss#2305))
* [VS Test] Skip failing subport tests ([#2370](sonic-net/sonic-swss#2370))
* [teammgr]: Waiting MACsec ready before doLagMemberTask ([#2286](sonic-net/sonic-swss#2286))
* [vnetorch] [vxlanorch] fix a set of memory usage issues ([#2352](sonic-net/sonic-swss#2352))
* [tests] [asan] add graceful stop flag ([#2347](sonic-net/sonic-swss#2347))
* [asan] suppress the static variable leaks ([#2354](sonic-net/sonic-swss#2354))
* Add support for IP interface loopback action ([#2307](sonic-net/sonic-swss#2307))
* [orchagent]: srv6orch support for uSID ([#2335](sonic-net/sonic-swss#2335))

Signed-off-by: dprital <drorp@nvidia.com>
preetham-singh pushed a commit to preetham-singh/sonic-swss that referenced this pull request Aug 6, 2022
* [vnetorch] fix use-after-free in removeBfdSession()
* using a copy of monitor ip instead of a reference since the reference gets invalidated after the endpoint is erased

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
lukasstockner pushed a commit to genesiscloud/sonic-swss that referenced this pull request Mar 31, 2023
* [vnetorch] fix use-after-free in removeBfdSession()
* using a copy of monitor ip instead of a reference since the reference gets invalidated after the endpoint is erased

Signed-off-by: Yakiv Huryk <yhuryk@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants