[AclOrch] aclOrch enhancement to handle LAG port not configured case #494

Merged
merged 13 commits into from
May 25, 2018

Conversation

Collaborator

@keboliu keboliu commented May 4, 2018

What I did

  1. Add a STATE_DB connector to OrchDaemon so that AclOrch can use it.
  2. AclOrch subscribes to STATE_DB and handles the LAG add/del notifications from STATE_DB.
  3. When AclOrch detects that a LAG port in its port list is not configured yet, it continues to process the remaining ports in the list and adds this LAG port to the pending port list.
  4. When AclOrch receives a LAG-created notification and this LAG is in its pending list, it binds the ACL table to this LAG.
  5. When a LAG is deleted, every ACL table that contains this LAG adds it back to its pending port list.
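
The five steps above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: `AclTableSketch`, `boundPorts`, `processPortsSketch`, `onLagCreated` and `onLagDeleted` are hypothetical names, and the "bind" is modelled as a plain set insert rather than a SAI call.

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for AclTable: only the alias bookkeeping.
struct AclTableSketch {
    std::set<std::string> portSet;         // every alias listed for the table
    std::set<std::string> pendingPortSet;  // aliases that are not configured yet
    std::set<std::string> boundPorts;      // aliases currently bound (stand-in for the SAI bind)
};

// Step 3: a missing port no longer fails the whole table; it is parked as pending.
inline void processPortsSketch(AclTableSketch &table,
                               const std::vector<std::string> &aliases,
                               const std::function<bool(const std::string &)> &portExists)
{
    table.portSet = std::set<std::string>(aliases.begin(), aliases.end());
    for (const auto &alias : aliases) {
        if (portExists(alias)) {
            table.boundPorts.insert(alias);
        } else {
            table.pendingPortSet.insert(alias);
        }
    }
}

// Step 4: a LAG-created notification binds every table that was waiting for it.
inline void onLagCreated(std::map<std::string, AclTableSketch> &tables,
                         const std::string &alias)
{
    for (auto &entry : tables) {  // iterate by reference so the update persists
        AclTableSketch &table = entry.second;
        if (table.pendingPortSet.erase(alias) > 0) {
            table.boundPorts.insert(alias);
        }
    }
}

// Step 5: a LAG-deleted notification moves the alias back to the pending set.
inline void onLagDeleted(std::map<std::string, AclTableSketch> &tables,
                         const std::string &alias)
{
    for (auto &entry : tables) {
        AclTableSketch &table = entry.second;
        if (table.portSet.count(alias) > 0) {
            table.boundPorts.erase(alias);
            table.pendingPortSet.insert(alias);
        }
    }
}
```

In this sketch a table listing `Ethernet0` and `PortChannel0001` is created even if only `Ethernet0` exists; `PortChannel0001` stays in `pendingPortSet` until `onLagCreated` is called for it.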

Why I did it
It is possible that a LAG port is configured later, or is not configured yet, when the ACL table is created. In this case we can add the LAG to the pending list and wait for it to be created, instead of failing the whole ACL table creation.

How I verified it
Manual test: ran the ACL and Everflow test cases on the T1 and T1-LAG topologies.

Details if related

@lguohan lguohan requested a review from prsunny May 5, 2018 00:15
@@ -249,8 +249,9 @@ int main(int argc, char **argv)
/* Initialize orchestration components */
DBConnector *appl_db = new DBConnector(APPL_DB, DBConnector::DEFAULT_UNIXSOCKET, 0);
DBConnector *config_db = new DBConnector(CONFIG_DB, DBConnector::DEFAULT_UNIXSOCKET, 0);
Contributor

@qiluo-msft qiluo-msft May 5, 2018

new [](start = 29, length = 3)

There is a memory leak risk in extreme low-memory situations. #Closed

Collaborator Author

@keboliu keboliu May 8, 2018

@qiluo-msft Hi Qi, could you elaborate on the concern here? #Closed

Contributor

There are several 'new' statements here. If memory runs out partway through, the allocations made before that point will leak.
The bug has probably been here for a long time, and your added code makes it slightly worse.


In reply to: 186248077 [](ancestors = 186248077)
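
One common way to address this kind of concern is to hold each allocation in a `std::unique_ptr`, so that anything already allocated is released if a later `new` throws. The sketch below is an illustration only and is not taken from this PR: `DbConnectorSketch`, `ConnectorsSketch` and `makeConnectorsSketch` are hypothetical stand-ins, and the database ids are placeholders.

```cpp
#include <memory>

// Hypothetical stand-in for a DB connector; the real class is not reproduced here.
struct DbConnectorSketch {
    explicit DbConnectorSketch(int dbId) : id(dbId) {}
    int id;
};

// Owning handles for the three connectors.
struct ConnectorsSketch {
    std::unique_ptr<DbConnectorSketch> applDb;
    std::unique_ptr<DbConnectorSketch> configDb;
    std::unique_ptr<DbConnectorSketch> stateDb;
};

// If the second or third allocation throws std::bad_alloc, stack unwinding
// destroys `c`, and the unique_ptr members free whatever was already allocated.
inline ConnectorsSketch makeConnectorsSketch()
{
    ConnectorsSketch c;
    c.applDb.reset(new DbConnectorSketch(0));    // placeholder id
    c.configDb.reset(new DbConnectorSketch(1));  // placeholder id
    c.stateDb.reset(new DbConnectorSketch(2));   // placeholder id
    return c;
}
```

With C++14 or later, `std::make_unique` can replace the `reset(new ...)` calls. Code that still expects raw pointers can be given `c.applDb.get()` while ownership stays with the `unique_ptr`.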

@@ -248,6 +248,10 @@ class AclTable {
std::map<sai_object_id_t, sai_object_id_t> ports;
// Map rule name to rule data
map<string, shared_ptr<AclRule>> rules;
// Set to store the ACL table port alias
set<string> portListsSet;
// Set tje store the not cofigured ACL table port alias
Contributor

@qiluo-msft qiluo-msft May 5, 2018

typo #Closed

}
else
{
it = consumer.m_toSync.erase(it);
Contributor

@qiluo-msft qiluo-msft May 5, 2018

it = consumer.m_toSync.erase(it); [](start = 12, length = 33)

erase is common for all cases, so you can do it outside the if-else. #Closed
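
The suggestion can be illustrated with a small loop in which the erase is done once, after the branches, rather than in each of them. This is a sketch under assumptions: `drainSketch` is a hypothetical name, and a plain `std::map` of key to operation string stands in for the consumer's `m_toSync` container.

```cpp
#include <map>
#include <string>
#include <vector>

// Processes every queued entry and returns a log of what was handled.
// The map (key -> operation) is a simplified stand-in for m_toSync.
inline std::vector<std::string> drainSketch(std::map<std::string, std::string> &toSync)
{
    std::vector<std::string> handled;
    auto it = toSync.begin();
    while (it != toSync.end()) {
        const std::string &op = it->second;
        if (op == "SET") {
            handled.push_back("set:" + it->first);
        } else if (op == "DEL") {
            handled.push_back("del:" + it->first);
        }
        // Unknown operations fall through and are simply dropped.

        // Single erase shared by all branches; erase() returns the next iterator.
        it = toSync.erase(it);
    }
    return handled;
}
```

Hoisting the erase only works when every branch really does consume the entry; a branch that needs to retry later would have to advance the iterator without erasing.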

@@ -1740,6 +1812,7 @@ bool AclOrch::processPorts(string portsList, std::function<void (sai_object_id_t
split(portsList, strList, ',');

set<string> strSet(strList.begin(), strList.end());
aclTable.portListsSet = strSet;
Contributor

@qiluo-msft qiluo-msft May 5, 2018

portListsSet [](start = 13, length = 12)

Call it portSet? #Closed


return true;
}

bool AclOrch::processAclTableType(string type, acl_table_type_t &table_type)
Contributor

@qiluo-msft qiluo-msft May 5, 2018

processAclTableType [](start = 14, length = 19)

Lots of repeated code, you can call one from the other. #Closed

Contributor

Sorry, I mean processPorts() and processPendingPort() share some code. Is it possible to call one from the other?


In reply to: 186252087 [](ancestors = 186252087)

Collaborator Author

@keboliu keboliu May 9, 2018

@qiluo-msft They do share some code, and I did consider calling processPendingPort (previously processSinglePort) from processPorts. But looking at the whole procedure, processPorts only "links" ports to tables; the action that "binds" the table to the ports is not included.
processPendingPort is dedicated to receiving the notification and handling the pending port, so in that one function both "link" and "bind" are performed. To keep the code straightforward and readable, I decided to keep the two functions separate, though this leaves some slightly redundant code.

Contributor

You can move aclTable.bind out of the processPendingPort() function so that the code can be reused.

You can do aclTable.bind in doAclTablePortUpdateTask() instead.


In reply to: 186911317 [](ancestors = 186911317)

{
auto table = itmap.second;

if (table.pendingPortListSet.find(port_alias) != table.pendingPortListSet.end())
Contributor

@qiluo-msft qiluo-msft May 5, 2018

pendingPortListSet [](start = 26, length = 18)

Call it pendingPortSet? #Closed

Contributor

@qiluo-msft qiluo-msft left a comment

As comments.

Collaborator Author

keboliu commented May 8, 2018

@qiluo-msft all comments handled, please check the latest commit. #Closed

Contributor

@qiluo-msft qiluo-msft left a comment

As comments.


OrchDaemon *orchDaemon = new OrchDaemon(appl_db, config_db);
Collaborator

I don't think we need to allocate orchDaemon on the stack. It could be left as it is (allocated via new), so that we don't have to worry if the class holds more data in the future.

Collaborator Author

revised.


if (table.pendingPortSet.find(port_alias) != table.pendingPortSet.end())
{
SWSS_LOG_NOTICE("found the port: %s in ACL table: %s pending port list, bind it to ACL table.", port_alias.c_str(), table.description.c_str());
Collaborator

You may want to log this as INFO instead of NOTICE?

Collaborator Author

revised.

for (auto itmap : m_AclTables)
{
auto table = itmap.second;

Collaborator

Can you remove the extra lines? There are a few other redundant lines in the PR as well.

Collaborator Author

deleted.

@@ -1898,18 +2024,17 @@ sai_status_t AclOrch::bindAclTable(sai_object_id_t table_oid, AclTable &aclTable
sai_status_t status = SAI_STATUS_SUCCESS;

SWSS_LOG_INFO("%s table %s to ports", bind ? "Bind" : "Unbind", aclTable.id.c_str());

if (aclTable.portSet.empty())
Collaborator

I don't think this condition will ever be true in this flow. Can you check?

Collaborator Author

you are correct, I removed this flow.

{
return SAI_STATUS_SUCCESS;
}
SWSS_LOG_WARN("Binding port list is empty for %s table", aclTable.id.c_str());
Collaborator

Deleting an aclTable that is not yet bound to any port list can be a normal flow. We wouldn't want to log a warning if it is in the delete flow.

Collaborator Author

condition check added.

Contributor

@qiluo-msft qiluo-msft left a comment

Please also get Prince's approval.

Collaborator

@prsunny prsunny left a comment

Changes look good. Could you please add/update VS test cases for ACL to verify these changes?

@liatgrozovik

As for VS, test cases should be added, but for now we have the testbed to cover this flow.
Let's please approve the change, and we will add new VS cases for port channel.
The VS tests currently have no reference to port channel, although it was supported before.

Contributor

@lguohan lguohan left a comment

need vs switch test.

Collaborator Author

keboliu commented May 22, 2018

3 new VS test cases added

  1. negative test: bind acl table to lag member port
  2. bind acl table to lag
  3. bind acl table to non-existent lag and then configure this lag.
root@arc-host74-006:/sonic-swss/tests# pytest -v --dvsname=vs test_acl.py
=================== test session starts =====================
platform linux2 -- Python 2.7.12, pytest-3.3.0, py-1.5.2, pluggy-0.6.0 -- /usr/bin/python
cachedir: .cache
rootdir: /.autodirect/rdmzsysgwork/kebol/sonic-2/sonic-buildimage/src/sonic-swss/tests, inifile:
collected 19 items

test_acl.py::TestAcl::test_AclTableCreation PASSED [  5%]
test_acl.py::TestAcl::test_AclRuleL4SrcPort PASSED [ 10%]
test_acl.py::TestAcl::test_AclTableDeletion PASSED [ 15%]
test_acl.py::TestAcl::test_V6AclTableCreation PASSED [ 21%]
test_acl.py::TestAcl::test_V6AclRuleIPv6Any PASSED [ 26%]
test_acl.py::TestAcl::test_V6AclRuleIPv6AnyDrop PASSED [ 31%]
test_acl.py::TestAcl::test_V6AclRuleIpProtocol PASSED [ 36%]
test_acl.py::TestAcl::test_V6AclRuleSrcIPv6 PASSED [ 42%]
test_acl.py::TestAcl::test_V6AclRuleDstIPv6 PASSED [ 47%]
test_acl.py::TestAcl::test_V6AclRuleL4SrcPort PASSED [ 52%]
test_acl.py::TestAcl::test_V6AclRuleL4DstPort PASSED [ 57%]
test_acl.py::TestAcl::test_V6AclRuleTCPFlags PASSED [ 63%]
test_acl.py::TestAcl::test_V6AclRuleL4SrcPortRange PASSED [ 68%]
test_acl.py::TestAcl::test_V6AclRuleL4DstPortRange PASSED [ 73%]
test_acl.py::TestAcl::test_V6AclTableDeletion PASSED [ 78%]
test_acl.py::TestAcl::test_InsertAclRuleBetweenPriorities PASSED [ 84%]
test_acl.py::TestAcl::test_AclTableCreationOnLAGMember PASSED [ 89%]
test_acl.py::TestAcl::test_AclTableCreationOnLAG PASSED [ 94%]
test_acl.py::TestAcl::test_AclTableCreationBeforeLAG PASSED

{
SWSS_LOG_ENTER();

sai_object_id_t port_id;
Contributor

sai_object_id_t port_id = SAI_NULL_OBJECT_ID; do it here. Then you do not need to assign it later in the function.

if (port.m_lag_member_id != SAI_NULL_OBJECT_ID)
{
SWSS_LOG_ERROR("Failed to process port. Bind table to LAG member %s is not allowed", alias.c_str());
port_id = SAI_NULL_OBJECT_ID;
Contributor

port_id = SAI_NULL_OBJECT_ID; [](start = 12, length = 29)

remove this.
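
The two comments above amount to initialising the result once at its declaration and letting the error paths fall through without reassigning it. A minimal sketch of that shape, using hypothetical stand-ins (`ObjectId`, `kNullObjectId`, `PortSketch`, `getBindPortIdSketch`) instead of the real SAI and Port types:

```cpp
#include <cstdint>

using ObjectId = std::uint64_t;        // stand-in for sai_object_id_t
constexpr ObjectId kNullObjectId = 0;  // stand-in for SAI_NULL_OBJECT_ID

enum class PortType { Phy, Lag, Other };

// Hypothetical, reduced view of a port: only what the lookup needs.
struct PortSketch {
    PortType type = PortType::Other;
    ObjectId portId = kNullObjectId;
    ObjectId lagId = kNullObjectId;
    ObjectId lagMemberId = kNullObjectId;  // non-null when the port is a LAG member
};

// Returns the object id an ACL table would bind to, or the null id if the
// port cannot be bound (LAG member or unsupported type).
inline ObjectId getBindPortIdSketch(const PortSketch &port)
{
    ObjectId bindId = kNullObjectId;  // initialised once; error paths leave it untouched
    switch (port.type) {
    case PortType::Phy:
        if (port.lagMemberId == kNullObjectId) {
            bindId = port.portId;  // plain physical port
        }
        // A LAG member is rejected: bindId stays null, no reassignment needed.
        break;
    case PortType::Lag:
        bindId = port.lagId;
        break;
    default:
        break;  // unsupported type: bindId stays null
    }
    return bindId;
}
```

Callers can then treat a null return value as "do not bind", which keeps the error handling in one place.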

if (table.portSet.find(port_alias) != table.portSet.end())
{
table.pendingPortSet.emplace(port_alias);
SWSS_LOG_WARN("Add deleted port: %s to the pending list of ACL table: %s", port_alias.c_str(), table.description.c_str());
Contributor

why is this warning level?

table.pendingPortSet.emplace(port_alias);
SWSS_LOG_WARN("Add deleted port: %s to the pending list of ACL table: %s", port_alias.c_str(), table.description.c_str());
}
}
Contributor

@lguohan lguohan May 23, 2018

what is the scenario here? when a port is removed, then to update the acl table? In this case, you need to unbind the port from acl table, and then put the port to pending list. You cannot simply put the port into the pending list.

Collaborator Author

This one may need to be marked as a TODO here. If a port or LAG is deleted, do we still have the chance to unbind it? I mean, some DB entries may already be destroyed, and at the SDK level all the configuration should already be cleared?

string port_alias = key.substr(0, found);
string op = kfvOp(t);

SWSS_LOG_INFO("doAclTablePortUpdateTask: OP: %s, port_alias: %s", op.c_str(), port_alias.c_str());
Contributor

SWSS_LOG_INFO("doAclTablePortUpda [](start = 7, length = 34)

this is debug level message.

SWSS_LOG_ENTER();

sai_object_id_t port_id;

vector<string> strList;

SWSS_LOG_INFO("Processing ACL table port list %s", portsList.c_str());
Contributor

SWSS_LOG_INFO(" [](start = 4, length = 15)

this is debug level message.

@@ -1754,30 +1856,50 @@ bool AclOrch::processPorts(string portsList, std::function<void (sai_object_id_t
Port port;
if (!gPortsOrch->getPort(alias, port))
{
SWSS_LOG_ERROR("Failed to process port. Port %s doesn't exist", alias.c_str());
return false;
SWSS_LOG_WARN("Port %s not configured yet, add it to ACL table %s pending list", alias.c_str(), aclTable.description.c_str());
Contributor

SWSS_LOG_WARN [](start = 12, length = 13)

Why warning level? This is a common situation that your code handles, so it should be info level, not warning level.

{
SWSS_LOG_ENTER();

SWSS_LOG_INFO("Processing ACL table port %s", portAlias.c_str());
Contributor

SWSS_LOG_INFO [](start = 4, length = 13)

debug level.

Port port;
if (!gPortsOrch->getPort(portAlias, port))
{
SWSS_LOG_WARN("Port %s not configured yet, add it to ACL table %s pending list", portAlias.c_str(), aclTable.description.c_str());
Contributor

SWSS_LOG_WARN [](start = 8, length = 13)

info level.

self.verify_acl_group_member(adb, 0, test_acl_table_id)

# check port binding
self.verify_acl_lag_binding(adb, 0)
Contributor

move this after port channel member creation, before state db creation.

keys = atbl.getKeys()
assert len(keys) == 0

def verify_acl_group(self, adb, expt):
Contributor

expt [](start = 36, length = 4)

rename to "acl_group_num" for clarity.

Contributor

rename to "acl_group_num" for clarity.


In reply to: 190217928 [](ancestors = 190217928)

Collaborator Author

already changed?

self.verify_acl_group(adb, 2)

# check acl table group member
self.verify_acl_group_member(adb, 2, test_acl_table_id)
Contributor

verify_acl_group_member [](start = 13, length = 23)

this function signature should be

verify_acl_group_member(adb, [acl_group_ids], acl_table_id)

it should verify the acl_groups have corresponding acl_group_member which has the acl_table_id.

assert set(port_groups) == set(acl_table_groups)


def verify_acl_lag_binding(self, adb, expt):
Contributor

verify_acl_lag_binding(self, adb, expt): [](start = 8, length = 40)

As with verifying the port binding, we need to give the LAG name as input.

Collaborator Author

I couldn't find out how a LAG port name is mapped to a SAI_OBJECT_TYPE_LAG_MEMBER in ASIC DB, unless you create one LAG and then know that the only SAI_OBJECT_TYPE_LAG_MEMBER record in ASIC DB is yours?

Contributor

use lag_object_ids here instead of lag name.


In reply to: 190546287 [](ancestors = 190546287)


# check port binding

def verify_acl_port_binding(self, dvs, adb, expt, bind_ports):
Contributor

expt [](start = 48, length = 4)

You do not need to pass expt here; just use len(bind_ports).

assert set(port_groups) == set(acl_table_groups)


def verify_acl_lag_binding(self, adb, expt):
Contributor

expt [](start = 42, length = 4)

change to bind_ports as well, do not need expt here.

Collaborator Author

@keboliu keboliu May 24, 2018

The LAG case is different from the port case: len(bind_ports) may not equal the number of bound LAGs if some LAG is not configured yet, so I think we still need an expt?


# check acl table group member

def verify_acl_group_member(self, adb, expt, test_acl_table_id):
Contributor

expt [](start = 43, length = 4)

change to acl_group_id list.

bool validateAclTable(AclTable &aclTable);
sai_object_id_t getValidPortId(string alias, Port port);
Contributor

getValidPortId [](start = 20, length = 14)

to reflect the true meaning, it is better to rename it to getBindPortId. Besides, it makes more sense to move this into portsyncd since it is retrieving a port property.

Collaborator Author

Renamed and moved it to portOrch.

break;
default:
SWSS_LOG_ERROR("Failed to process port. Incorrect port %s type %d", alias.c_str(), port.m_type);
return false;
Contributor

@lguohan lguohan May 25, 2018

align spaces

self.verify_acl_group_member(adb, acl_group_ids, test_acl_table_id)

# check port binding
self.verify_acl_lag_binding(adb, 1)
Contributor

  1. [](start = 40, length = 3)

use lag id

Collaborator Author

revised.

Contributor

@lguohan lguohan left a comment

:shipit:

@qiluo-msft qiluo-msft merged commit 4df9c28 into sonic-net:master May 25, 2018
qiluo-msft pushed a commit to sonic-net/sonic-buildimage that referenced this pull request May 26, 2018
…ed ACL table configuration (#1712)

* Fix minigraph parser issue when handling LAG related ACL table configuration
* rephrase the warning message.
* pick up swss change in sonic-net/sonic-swss#494
qiluo-msft pushed a commit that referenced this pull request May 27, 2018
…494)

* enhance acl orch to handle the LAG port not configured case
* rename variable, function; redunce redundant code; fix memory issue
* revise according to comments
* one more blank line
* add new VS test cases for acl on LAG
* update with PR comments fix
* rename getValidPortId and move it to portOrch
* fix indent
* fix test case comments
qiluo-msft pushed a commit to qiluo-msft/sonic-buildimage that referenced this pull request May 27, 2018
…ed ACL table configuration (sonic-net#1712)

* Fix minigraph parser issue when handling LAG related ACL table configuration
* rephrase the warning message.
* pick up swss change in sonic-net/sonic-swss#494
lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request May 30, 2018
…ed ACL table configuration (#1712)

* Fix minigraph parser issue when handling LAG related ACL table configuration
* rephrase the warning message.
* pick up swss change in sonic-net/sonic-swss#494
@keboliu keboliu deleted the acl-sequence branch August 22, 2018 10:28
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
Signed-off-by: Wenda Ni <wenni@microsoft.com>