[minigraph] set admin_status to down if port not in minigraph #14

Open
wants to merge 77 commits into
base: master

Conversation

dmytroxshevchuk
Owner

Why I did it

How I did it

How to verify it

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012

Description for the changelog

A picture of a cute animal (not mandatory but encouraged)

judyjoseph and others added 30 commits February 12, 2021 10:56
- Introduced TS common file in docker as well and moved common functions.
- TSA/B/C scripts run only in BGP instances for front end ASICs.
       In addition skip enforcing it on route maps used between internal BGP sessions.

admin@str--acs-1:~$ sudo /usr/bin/TSA
System Mode: Normal -> Maintenance

and in case of Multi-ASIC
admin@str--acs-1:~$ sudo /usr/bin/TSA
BGP0 : System Mode: Normal -> Maintenance
BGP1 : System Mode: Normal -> Maintenance
BGP2 : System Mode: Normal -> Maintenance
The coppmgrd process does not need to be auto-restarted if it exits unexpectedly.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
…onic-net#6780)

Fix marvell-armhf build break

The azure-storage package depends on the cryptography package. Newer
versions of cryptography require the rust compiler, the correct version
for which is not readily available in buster. Hence we pre-install an
older version here to satisfy the azure-storage dependency.
Note: This is not a problem for other architectures as pre-built versions
of cryptography are available for those. This sequence can be removed
after upgrading to debian bullseye.
sonic-net/sonic-utilities#1431 changes the path to the udevprefix.conf file. The file previously inappropriately resided in the <platform>/plugins/ directory. That directory is reserved for now-deprecated Python platform plugins, and will be removed in the near future.
Commits include:

* src/sonic-utilities c7e46c9...42cab68 (3):
  > [consutil] Look for udevprefix.conf file under platform dir, not plugins (sonic-net#1431)
  > [ci]: download from sonic-buildimage.vs artifact (sonic-net#1428)
  > [storyteller] sort output by time and improve lag support (sonic-net#1430)
* Add *MUX_CABLE_TABLE* to set of tables to clear on SWSS start, which
will clear HW_MUX_CABLE_TABLE and MUX_CABLE_TABLE
* Order swss to start before pmon to ensure that DBs are cleared before
xcvrd (running inside pmon) starts and re-populates the tables
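
As a rough illustration of why a single `*MUX_CABLE_TABLE*` pattern covers both tables, here is a minimal Python sketch using glob matching. The helper name `keys_to_clear` and the sample keys are hypothetical and are not taken from the SWSS start script.

```python
from fnmatch import fnmatchcase

# Hypothetical glob patterns for tables cleared on SWSS start.
PATTERNS_TO_CLEAR = ("*MUX_CABLE_TABLE*",)


def keys_to_clear(keys, patterns=PATTERNS_TO_CLEAR):
    """Return the keys that match at least one glob pattern."""
    return [k for k in keys if any(fnmatchcase(k, p) for p in patterns)]


# Made-up sample keys, for illustration only.
sample_keys = [
    "HW_MUX_CABLE_TABLE|Ethernet0",
    "MUX_CABLE_TABLE|Ethernet0",
    "PORT_TABLE|Ethernet0",
]
print(keys_to_clear(sample_keys))
```

The leading `*` is what lets the pattern match the `HW_` prefixed table as well as the plain one.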

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
- What I did
All SWSS-dependent services should stop before the SWSS service to avoid possible future issues.
For example, the 'teamd' service will stop first to allow the driver to unload the netdev gracefully.
This stops all LAGs before the syncd service is restarted when running the 'config reload' command.

- How I did it
Change the order of dependent services of SWSS.

- How to verify it
Run 'config reload' command.
Previously the operation failed when a large number of PortChannels were configured on the system.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
- Why I did it
Support shared headroom pool

Signed-off-by: Stephen Sun stephens@nvidia.com

- How I did it
Port configurations for SKUs based on 2700/3800 platform from 201911
For SN3800 platform:
C64: 32 100G down links and 32 100G up links.
D112C8: 112 50G down links and 8 100G up links.
D24C52: 24 50G down links, 20 100G down links, and 32 100G up links.
D28C50: 28 50G down links, 18 100G down links, and 32 100G up links.
For SN2700 platform:
D48C8: 48 50G down links and 8 100G up links
C32: 16 100G downlinks and 16 100G uplinks
Add configuration for Mellanox-SN4600C-D112C8
112 50G down links and 8 100G up links.

- How to verify it
Run regression test.
To have the following fixes:
* All | Port status remains down after warm boot and flapping the port on peer side
* All | LAG HASH | IPv6 SRC_IP is not accounted for in LAG hashing
* All | ASIC driver | Kernel crash observed when driver reload is initiated before it fully loaded
* Spectrum-3 | Buffer | In lossless configuration, headroom is evicted only when the shared buffer is free
* All | prevent FW access during ISSU

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
…ault (sonic-net#6444)

Signed-off-by: Andriy Yurkiv <ayurkiv@nvidia.com>
We are currently in the process of removing Python 2 from SONiC; this update helps ensure a seamless transition to Python 3.
This PR updates the following commits
c6b642b [ci]: download from sonic-buildimage.vs artifact (sonic-net#168)
e76ecc6 [sonic_y_cable] add support for retrieving firmware info for Y cable, internal and nic temperature and voltage (sonic-net#162)
f9cf8c9 [GitHub] Add pull request template (sonic-net#167)
c31636e [ci] Call pip2/3 using sudo (sonic-net#166)
5521f67 [ci] Test and build packages using Azure Pipelines (sonic-net#164)
faca35c [ci]: Set up CI with Azure Pipelines

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
)

Add noTLS mode for debugging purposes.
Remove config-set for port 8080. In noTLS mode, telemetry fails to start if the docker restarts, because it expects the log_level config to be present as well.
Update FRR 7.5 head. The following is a list of new commits.

```
e2f17ae47ad047e66923c2ff1e84c9ba10d4ad38 Merge pull request sonic-net#8096 from idryzhov/7.5-backports-2021-02-16
380341362ced8e317c18b7395acb012de1f23acd ospf6d: Don't send hellos on loopback interface
7fa78b659f8e720466e0df62689327ea4b9ff867 bgpd: send correct BMP down message when nht fails
385faf6c079a41def1e6eb882cbfd50047559644 [filter]: change return code for errors
d9a0e9a2934f2f75c64496fe4c724a18aa581fcb bfdd: fix session lookup
08afa0a75311a4e8cb2a18116384b603f7f2d751 ospf6d : fix issue in ecmp inter area  route
2299afa1a9128d87d5169742b993c0ada575eb83 ospfd:  Prevent duplicate packet read in certain vrf situations
ff42a28af659ee61c0efb877b10738a5812f4bc2 vrf: use wrappers to change VRF_CONFIGURED flag
2bdc59ca21da2d67b77ec70a2fadffbca60690cd vrf: mark vrf as configured when entering vrf node
b9611f65a71adc0b8fa14a5a4d1a8f44e04dcd85 ospf6d: Fix LSA formatting out-of-bounds access
610ebf56913fa56167b0a2a127b07afe020a1efe bfdd: Prevent use after free ( again )
35b0cd5d753dda9aa70ea1c06db61a8d4b8671e3 *: Fix usage of bfd_adj_event
95b8915d0f4de3eae5438632ecd0827061ef48e8 ospf6d: Fix LSA formatting inconsistent retvals
49d73d8be84dbd23d767697474019165e511786c pimd: SGRpt prune received during prune didn't override holdtime
1d0d19afa9bb7cd4bc476d00c887876bc04eee95 eigrpd: Correctly set the mtu for eigrp packets sent
bbb08db69f8eb554d23b4920c1c1e3982d8d2a91 zebra: Prevent sending of unininted data
0813d650a8120458ab7d9317061f3864dbc6f2f7 ospf6d: prevent use after free
2f2e981d967b36b240fca82fea8a961d927ef43c lib: Prevent unininted usage of data
6171becdb391ea5b88916a3a28b04b555e1fc518 bfdd: Prevent storage of ifp pointer that has been deleted
9ebb41cf4bb51e0872796530bf8c7a4d819053db bfdd: Prevent unininited data transmittal
72e16db6fea3629111537f9eb10c86f2d275adcb eigrpd: Prevent uninitialized value from being used
72b61a5bb09d59c3cc0d1d401d51de96949dff52 zebra: disallow resolution to duplicate nexthops
1083bae40b00c0ed2c9f3521ae1ab9675a87202e bgpd: Initialize bgp_notify.raw_data before passing to bgp_notify_receive()
31df7314310416f10c133dcfe9c4586edadf3fbb doc: ebgp-requires-policy requires manuall session clearing
ecc8ec678d2d8a1c3d1d50a22732f9fc4bad689c watchfrr: fix SA warning
9d9365d161979a031de817c1fbcab6508dfee013 watchfrr: fix crash on missing optional argument
907e600d63c1c5b6bda40b0a08344a72533b1787 pimd: Prevent use after free
b47374f0e95d99c93bfe2d14afe55219a9fda455 doc: Update bgp doc for more rfc-8212 talk
4fbeef60cc8dc5362ff84fc91d1a4e343e4e32c7 docker: centos 7, 8 yang bump and repo fixes
808e6d731f330df4a91fdfd6df6a3c8dce1651a6 docker: prefer alpine:latest for building
91b3c471f1c48818370a0f218add917f0d46aa47 Merge pull request sonic-net#8092 from donaldsharp/7.5_track
60be43c0bf63c16ca42008fa802d0a2050f3fce2 Merge pull request sonic-net#8090 from ton31337/fix/static_network_vrf_7.5
1f6785aa60cc57a5c8d5de98c9c09a344a0c9262 ospf6d: Track wait_timer and disable when needed
c89e326be91312bed066eb2447ea8944e25a225e bgpd: Check for peer->su_remote if not NULL when handling IPv6 nexthop
15e070f6448870c98c030b6b5013ad8750d8918b Merge pull request sonic-net#8047 from pguibert6WIND/nhrp_shortcut_routes_75
912994efec94082ae7d8c5e014c410964bea19f4 Merge pull request sonic-net#8034 from qlyoung/fix-gnu-readline-bracketed-paste-7.5.1
9f50536993f1eb900fbfbe98d21b8c072bbd9c15 nhrpd: replace nhrp route nexthop with onlink route when prefix=nh
8c185008246db31c34574d7b79358001ac411f84 nhrpd: shortcut routes installed with nexthop.
c46c87d19758040bc3f3902ab8e4a0f1bb908721 vtysh: disable bracketed paste in readline
20b35e4c3386de798f3b0cb9f2a7e6b04d995485 Merge pull request sonic-net#8018 from ton31337/fix/drop_aggregate_as_attribute_if_malformed_7.5
fa25d7327fd64613cc7530aba2edfcde038da074 bgpd: Unset only aggregator flag when AGGREGATOR_AS is 0
3ee9a3726fe1a526d946c1978487a4509fe98f29 bgpd: Drop aggregator_as attribute if malformed in case of BGP_AS_ZERO
be88595c6a2011f0e882bfa663baa61c86ede14e Merge pull request sonic-net#8005 from opensourcerouting/snap-libyang1-fix-75
fd840ad37f2e836b210c6e60fc6325a4c3e495ce snapcraft: Update rtrlib to 0.7.0
3d00552fa9aedb96acd7ea773bc14fd2b77e7e0f snapcraft: Fix passthrough path for Libyang 1.x
```
This PR includes the following commit in sonic-platform-daemons

068bccc [xcvrd] Store mux_cable telemetry data in State DB (sonic-net#148)
93cac0a [ci]: download from sonic-buildimage.vs artifact (sonic-net#152)
d651e9b [GitHub] Add pull request template (sonic-net#151)
bd7830b [pcied] Remove unnecessary message and move the configuration path (sonic-net#144)
9080fda [ci] Call pip2/3 using sudo (sonic-net#150)
de60784 [ci] Test and build packages using Azure Pipelines (sonic-net#149)
8bf0fd1 [ledd] Refactor to allow for more thorough unit testing; Increase unit test coverage (sonic-net#147)
26bdc9e Set up CI with Azure Pipelines
1fcaa57 [pcied] Add PCIe AER stats collection (sonic-net#100)

Signed-off-by: vaibhav-dahiya <vdahiya@microsoft.com>
Open ACL Outer VLAN ID for egress for ports part of VLAN RIF

- Why I did it
Open ACL Outer VLAN ID for egress for ports part of VLAN RIF

- How I did it
Updated SAI submodule pointer

- How to verify it
Build an image, deploy and check all is up and running.
Verify ACL sonic-mgmt test is passing

Signed-off-by: Volodymyr Samotiy <volodymyrs@nvidia.com>
- Why I did it
Mellanox-SN4600C-D112C8 SKU is not configured properly.
It should have 112 50G interfaces and 8 100G interfaces as described on this PR.

- How I did it
Modify sai_profile, port_config.ini and hwsku.json for DPB.

- How to verify it
Apply this HwSKU to a MSN4600C Mellanox platform.

Signed-off-by: Shlomi Bitton <shlomibi@nvidia.com>
Change in this update:
    b75aab7 [swss-common] Add LINKMGR CFG and MUX LINKMGR state table names (sonic-net#421)
    4a77d1c [ci]: add vstest (sonic-net#459)
    07258a6 [ci]: use build template (sonic-net#457)
    ddcae3e runRedisScript api to process integer returned by script run in the redis (sonic-net#447)
    33d89c7 [systemlag] Schema defs for system lag (sonic-net#448)
    af01f37 spell check fixes (sonic-net#456)
    7afd43d Update to make getNamespaces() API at par with the get_ns_list() swssdk-py API. (sonic-net#455)

signed-off-by: Tamer Ahmed <tamer.ahmed@microsoft.com>
…raph (sonic-net#6219)

Update topology script to retrieve hwsku from minigraph
if hwsku information is not available in config_db.
Fix clean up of interfaces in msft_multi_asic_vs hwsku
topology script.
- Why I did it
When bringing up multi-asic VS switch, topology service is started during boot up.
Topology service starts a shell script which runs the topology script present in /usr/share/sonic/device// directory. To invoke hwsku specific script, the topology script tries to retrieve hwsku information from config_db.
During initial boot up config_db might not be populated. In order to start topology service before config_db is updated,
update topology script to get hwsku information from minigraph.xml if it is available.
This will be helpful to bring up multi-asic VS testbed by loading minigraph and starting topology service.
- How I did it
Update topology.sh script to retrieve hwsku information from minigraph.xml.
Fix the clean-up function in the msft_multi_asic_vs topology script.
- How to verify it
single-asic VS - no change; topology service is only enabled for multi-asic VS.
multi-asic VS - Bring up multi-asic VS image, copy minigraph to vs image, start topology service. Topology service should be successful.
To test the clean-up function fix, start the topology service and make sure interfaces are created and moved to the right namespaces.
Then stop the topology service and make sure the namespaces do not have any interfaces and all front-end interfaces are present in the default namespace.
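
The fallback described above can be sketched as follows. The actual change lives in the shell script topology.sh; this Python version is only an illustration, and the function names, the sample XML, and the assumption that the minigraph carries an element whose local name is `HwSku` are not taken from the commit.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def hwsku_from_minigraph(xml_text: str) -> Optional[str]:
    """Return the text of the first element whose local name is HwSku."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Strip any "{namespace}" prefix that ElementTree puts on tags.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "HwSku" and elem.text and elem.text.strip():
            return elem.text.strip()
    return None


def resolve_hwsku(config_db_hwsku: Optional[str],
                  minigraph_xml: Optional[str]) -> Optional[str]:
    """Prefer the config_db value; fall back to the minigraph when it is empty."""
    if config_db_hwsku:
        return config_db_hwsku
    if minigraph_xml:
        return hwsku_from_minigraph(minigraph_xml)
    return None


# Made-up, heavily trimmed minigraph used only for this example.
SAMPLE = """<DeviceMiniGraph xmlns="urn:example">
  <Hostname>vs-switch</Hostname>
  <HwSku>msft_multi_asic_vs</HwSku>
</DeviceMiniGraph>"""

print(resolve_hwsku(None, SAMPLE))
```

Keeping config_db as the first choice means behavior is unchanged on systems where config_db is already populated.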
 - Add support for `DCS-7050SX3-48YC8` and `DCS-7050SX3-48C8` platform
 - Add support for more variants of `DCS-7280CR3-32[PD]4`
 - Add Supervisor to Linecard consutil support
 - Complete Watchdog platform API support
 - Fix some PSU behavior on `DCS-7050QX-32` and `DCS-7060CX-32S`
 - Fix SEU management on `DCS-7060CX-32S`
 - Allow kernel modules to build up to linux 5.10
 - Rename led color `orange` to `amber`
 - Miscellaneous fixes
…onic-net#6475)

- Why I did it
The PCIe configuration file is located under the plugin directory rather than under the platform directory.
sonic-net#6437

- How I did it

Move all pcie.yaml configuration files from the plugin directory to the platform directory.
Remove unnecessary timer to start pcie-check.service
Move pcie-check.service to sonic-host-services
- How to verify it
Verify on the device
sonic-net#6833)

It is possible that one interface attaches multiple vlans. The VlanInterface should be in tagged mode.

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
Set correct MTU size of internal interface for Newport platform
Needed support for platform system health in Dell platforms
#### Why I did it
Fix runtime issues caused by SONiC update

#### How I did it
- new attribute SAI_ACL_ENTRY_ATTR_FIELD_ACL_IP_TYPE  supported 
- new attribute SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY supported

Signed-off-by: Roman Savchuk <romanx.savchuk@intel.com>
jleveque and others added 7 commits February 25, 2021 11:20
…lizer (sonic-net#6853)

In preparation for the merging of sonic-net/sonic-platform-common#173, which properly defines class and instance members in the Platform API base classes.

It is proper object-oriented methodology to call the base class initializer, even if it is only the default initializer. This also future-proofs the potential addition of custom initializers in the base classes down the road.
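
A minimal sketch of the pattern, assuming a simplified stand-in for a Platform API base class. The class bodies and the `_fan_list` member here are illustrative, not copied from sonic-platform-common.

```python
class ChassisBase:
    """Simplified stand-in for a Platform API base class (hypothetical body)."""

    def __init__(self):
        # Instance members defined by the base class.
        self._fan_list = []


class Chassis(ChassisBase):
    """Vendor-specific chassis."""

    def __init__(self):
        # Call the base initializer first so the base members exist.
        super().__init__()
        self._fan_list.append("fan0")


chassis = Chassis()
print(chassis._fan_list)
```

If `Chassis.__init__` skipped the `super().__init__()` call, `self._fan_list` would not exist and the `append` would raise `AttributeError` once the base class defines its members in its own initializer.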
…ializer (sonic-net#6852)

In preparation for the merging of sonic-net/sonic-platform-common#173, which properly defines class and instance members in the Platform API base classes.

It is proper object-oriented methodology to call the base class initializer, even if it is only the default initializer. This also future-proofs the potential addition of custom initializers in the base classes down the road.
… from Supervisord. (sonic-net#6849)

Signed-off-by: Yong Zhao yozhao@microsoft.com

Why I did it
In the configuration of rsyslog, duplicate messages will be suppressed and reported in the format of message repeated n times.
Due to this behavior, if a critical process in a container exited unexpectedly, the alerting message will be written into syslog once
and not be written into syslog anymore until the second critical process exited. This PR aims to differentiate these alerting messages such that they will not be suppressed by rsyslogd and can appear in the syslog periodically.

How I did it
This PR adds a counter into the alerting message and shows how many minutes a critical process was not running.

How to verify it
I verified and tested this implementation on a physical DUT.
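
To illustrate the idea, here is a small sketch under stated assumptions: the message wording and both helper names are hypothetical, and `collapse_repeats` only mimics duplicate suppression by collapsing consecutive identical lines. It is not rsyslog's implementation.

```python
def alert_message(process, container, minutes_down=None):
    """Build an alert line; the optional counter makes each line unique."""
    base = "Process '{}' is not running in container '{}'".format(process, container)
    if minutes_down is None:
        return base + "."
    return "{} ({} minutes).".format(base, minutes_down)


def collapse_repeats(messages):
    """Collapse consecutive identical messages into [message, count] pairs."""
    collapsed = []
    for msg in messages:
        if collapsed and collapsed[-1][0] == msg:
            collapsed[-1][1] += 1
        else:
            collapsed.append([msg, 1])
    return collapsed


without_counter = [alert_message("orchagent", "swss") for _ in range(3)]
with_counter = [alert_message("orchagent", "swss", m) for m in (1, 2, 3)]

print(len(collapse_repeats(without_counter)))  # identical lines collapse
print(len(collapse_repeats(with_counter)))     # distinct lines survive
```

Because the counter changes every minute, no two consecutive alerts are identical, so each one is logged.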
…onic-net#6848)

Enable BBR config allowas-in 1 for internal peers

Why I did:
To advertise BBR routes learnt via e-BGP peer in one asic/namespace to another iBGP asic/namespace via Route Reflector.
…pport a… (sonic-net#6831)

To fix the [DPB| wrong aliases for interfaces](sonic-net#6024) issue, implemented flexible alias support [design doc](sonic-net/SONiC#749)

> [[dpb|config] Fix the validation logic of breakout mode](sonic-net/sonic-utilities#1440) depends on this

#### How I did it

1. Removed `"alias_at_lanes"` from the port-configuration file (i.e. platform.json)
2. Added dictionary to "breakout_modes" values. This defines the breakout modes available on the platform for this parent port, and it maps to the alias list. The alias list presents the alias names for individual ports in order under this breakout mode.
```
{
    "interfaces": {
        "Ethernet0": {
            "index": "1,1,1,1",
            "lanes": "0,1,2,3",
            "breakout_modes": {
                "1x100G[40G]": ["Eth1"],
                "2x50G": ["Eth1/1", "Eth1/2"],
                "4x25G[10G]": ["Eth1/1", "Eth1/2", "Eth1/3", "Eth1/4"],
                "2x25G(2)+1x50G(2)": ["Eth1/1", "Eth1/2", "Eth1/3"],
                "1x50G(2)+2x25G(2)": ["Eth1/1", "Eth1/2", "Eth1/3"]
            }
        }
    }
}
```
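
A short sketch of how a consumer might read the new structure, assuming it has been loaded from platform.json into a dict. The helper name `aliases_for` is hypothetical, and the embedded JSON is a trimmed copy of the example above.

```python
import json

# Trimmed copy of the example platform.json content.
PLATFORM_JSON = """
{
    "interfaces": {
        "Ethernet0": {
            "index": "1,1,1,1",
            "lanes": "0,1,2,3",
            "breakout_modes": {
                "1x100G[40G]": ["Eth1"],
                "2x50G": ["Eth1/1", "Eth1/2"],
                "4x25G[10G]": ["Eth1/1", "Eth1/2", "Eth1/3", "Eth1/4"]
            }
        }
    }
}
"""


def aliases_for(platform, port, mode):
    """Return the ordered alias list for a parent port in a breakout mode."""
    modes = platform["interfaces"][port]["breakout_modes"]
    if mode not in modes:
        raise ValueError("mode {!r} is not supported on {}".format(mode, port))
    return modes[mode]


platform = json.loads(PLATFORM_JSON)
print(aliases_for(platform, "Ethernet0", "2x50G"))
```

Since the mode names are now dictionary keys, checking whether a breakout mode is supported and fetching its aliases become the same lookup.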
#### How to verify it
`config interface breakout`

Signed-off-by: Sangita Maity <samaity@linkedin.com>
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
@dmytroxshevchuk dmytroxshevchuk force-pushed the admin_status_down_if_port_not_in_mini branch from ec23e2b to 6cb5849 Compare February 26, 2021 13:02
qiluo-msft and others added 11 commits February 26, 2021 10:41
…mbers (sonic-net#6895)

#### Why I did it
Some platforms have difficulty attaching an egress ACL to a VLAN.

#### How I did it
For egress ACL attaching to vlan, break them into vlan members.

#### How to verify it
Unit test
Tested in DUT
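
The expansion can be sketched roughly as below. The function name, argument shapes, and sample port names are assumptions made for illustration and do not mirror the code in the commit.

```python
def expand_egress_bindings(ports, stage, vlan_members):
    """For egress tables, replace each VLAN with its member ports.

    ports        -- names the ACL table is attached to (ports or VLANs)
    stage        -- "ingress" or "egress"
    vlan_members -- mapping of VLAN name to a list of member port names
    """
    if stage != "egress":
        return list(ports)
    expanded = []
    for name in ports:
        # A name that is not a known VLAN is kept as-is.
        for member in vlan_members.get(name, [name]):
            if member not in expanded:
                expanded.append(member)
    return expanded


members = {"Vlan1000": ["Ethernet0", "Ethernet4"]}
print(expand_egress_bindings(["Vlan1000", "Ethernet8"], "egress", members))
```

Ingress bindings are returned unchanged, so only the egress case that some platforms struggle with is affected.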
1. Made the `next-hop-self force` command applicable only to back-end ASIC BGP. This is done so that the BGPL iBGP session running on the back end can send the eBGP-learned next hop. Back-end ASIC FRR is able to recursively resolve the eBGP next hop in its routing table since it knows about all the connected routes advertised from the front-end ASICs.

2. Made all front-end ASIC BGP instances use the global loopback IP (Loopback0) as the router ID, and back-end ASIC BGP instances use Loopback4096 as the router ID and as the originator ID for the Route Reflector. This is done so that routes learnt by an external peer do not show Loopback4096 as the router ID in `show ip bgp <route-prefix>` output.

3. To handle the above change, Loopback4096 needs to be passed from the BGP manager for jinja2 template generation. This was missing, and this fix is also needed for https://github.com/Azure/sonic-buildimage/blob/master/dockers/docker-fpm-frr/frr/bgpd/templates/dynamic/instance.conf.j2#L27

4. Enhancement to add multi-asic specific bgpd template generation unit test cases.
…nic-net#6896)

- Add support for VLAN ID match
- Add support for ICMP type/code match

Signed-off-by: Danny Allen <daall@microsoft.com>
1. [dni_dps460] Add attributes to retrieve PMBus status command codes sonic-net/sonic-linux-kernel@65fccd7
2. [mellanox]: Backport new kernel patches sonic-net/sonic-linux-kernel@2fcd4e3
3. [ci]: build amd64/armhf/arm64 for CI build sonic-net/sonic-linux-kernel@7c57fef
4. Fix read and write failure to ‘fan1_target’ attribute of ‘dni_dps460’ sonic-net/sonic-linux-kernel@fc74b1c
5. [backport]: i2c mux pca954x allow management of device idle state sonic-net/sonic-linux-kernel@173ebe7
6. README: Fix typo in *difficult* sonic-net/sonic-linux-kernel@7778c99
As of the merging of PR sonic-net#6799, we are now installing a newer version of scapy via pip, therefore there is no longer a need to install the older Debian package.
…-net#6692)

- Why I did it
   Bug fixes
   - In rare cases when thermal algorithm is reactivated after FAN/PSU insertion, FAN remains at high rpm
   - When hw-management is stopped, an error is written to the log instead of exit code '0'.
   - In SPC1, i2c sometimes collides with the chip reset coming from the SDK
   - Remove the raw eeprom data link when working with PSUs that don't have an eeprom, for "msn274x", "msn24xx" and "msn27xx" systems
   - Fix memory leak on mlxsw_core_bus_device module removal

- How I did it
Update the hw-mgmt version number in the make file
Update the hw-mgmt repo pointer

- How to verify it
Run platform-related test cases on all Mellanox platforms.

Signed-off-by: Kebo Liu <kebol@nvidia.com>
…ms (sonic-net#6913)

- Why I did it
To fix PCIEd errors in log.

- How I did it
Update pcie.yaml with the right PCI addresses.

- How to verify it
Check logs, operation occurs each minute.

Signed-off-by: liora <liora@nvidia.com>
…t#6887)

- Why I did it
To add support for the dynamic breakout on Mellanox platform x86_64-mlnx_msn4600

- How I did it
Add the relevant files describing Mellanox platform x86_64-mlnx_msn4600 breakout modes to a new device folder.

- How to verify it
System bringup is completed, all interfaces are up.
All platform test suites pass.
- Made python2 to python3 changes
- Removed ord() calls, as indexing bytes in python3 returns an int instead of a str
- Had to change chr(..) to bytes([..]) function while using ctypes class methods
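
The two Python 3 behaviors behind these changes can be shown in a few lines. This is a generic illustration, not code from the commit; the byte values are arbitrary.

```python
import ctypes

data = b"\x01\x02\xff"

# Python 3: indexing a bytes object already yields an int,
# so wrapping it in ord() is no longer needed (and would raise TypeError).
first = data[0]
print(first)

# chr(0xFF) returns a str, which ctypes byte-oriented types reject in
# Python 3; bytes([0xFF]) builds the one-byte bytes object they expect.
one_byte = bytes([0xFF])
c = ctypes.c_char(one_byte)
buf = ctypes.create_string_buffer(bytes([0xFF, 0x10]))

print(c.value, buf.raw[:2])

# Passing a str where bytes are expected fails.
try:
    ctypes.c_char(chr(0x41))
    str_accepted = True
except TypeError:
    str_accepted = False
print(str_accepted)
```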
- Why I did it
Fix the build and fix the SN4600 DPB support

- How I did it
Fix port configuration file for SN4600 based on recent changes

- How to verify it
System bringup is completed, all interfaces are up.
All platform test suites pass.
@dmytroxshevchuk dmytroxshevchuk force-pushed the admin_status_down_if_port_not_in_mini branch 10 times, most recently from bee186d to 49484a7 Compare March 4, 2021 17:04