Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a PTF test CI pipeline for p4c-dpdk on the DPDK SoftNIC #4072

Merged
merged 1 commit into from
Aug 10, 2023

Conversation

Hoooao
Copy link
Contributor

@Hoooao Hoooao commented Jul 18, 2023

Summary

This PR sets up a PTF test CI pipeline for p4c-dpdk and the dpdk-target.

  • PTF(Packet Testing Framework) is a Python based dataplane test framework.
  • p4c-dpdk is the p4 dpdk compiler backend that translates the P4-16 programs to DPDK API to configure the DPDK software switch (SWX) pipeline.
  • DPDK-target implements the P4 pipeline,driven by artifacts generated by the P4 compiler.

Background

A PTF test CI pipeline for p4c-dpdk on the DPDK SWX is currently missing. Adding such tests is beneficial for both the development of the compiler backend and DPDK SWX pipeline.

Demo

  • Make sure infrap4d and p4sde/dpdk-target are properly setup.
  • Define $P4C_DIR as the root directory for p4c.
  • Run sudo $P4C_DIR/backends/dpdk/run-dpdk-ptf-test.py $P4C_DIR $P4C_DIR/testdata/p4_16_samples/pna-dpdk-add_on_miss0.p4 --ipdk-recipe $IPDK_RECIPE --sde-install $SDE_INSTALL --ld-library-path $LD_LIBRARY_PATH --testfile $P4C_DIR/testdata/p4_16_samples/pna-dpdk-add_on_miss0.py -ll DEBUG

Output:

Updated Environment Variables ...
// Skip

1024
1024
INFO: ---------------------- Build conf file ----------------------
INFO: Executing command: python3 /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/p4c/backends/dpdk/edit-conf-file.py -i /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/p4c/backends/dpdk/conf_template.conf -o /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/add_on_miss0.conf -s test_dir temp_prog -r  /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst add_on_miss0.p4
INFO: Executing command: mkdir /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/pipe
INFO: ---------------------- Compile with p4c-dpdk ----------------------
INFO: Executing command: p4c-dpdk --arch pna --target dpdk             --p4runtime-files /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/p4Info.txt             --bf-rt-schema /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/bf-rt.json             --context /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/pipe/context.json             -o /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/pipe/add_on_miss0.spec /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/p4c/testdata/p4_16_samples/add_on_miss0.p4
WARNING: [--Wwarn=mismatch] warning: Mismatched header/metadata struct for key elements in table ct_tcp_table. Copying all match fields to metadata
INFO: ---------------------- Start infrap4d ----------------------
INFO: Writing / h o m e / r u n n e r / w o r k / p 4 c - d p d k - c i - p i p e l i n e / p 4 c - d p d k - c i - p i p e l i n e / i p d k . r e c i p e / i n s t a l l / s b i n / i n f r a p 4 d   - g r p c _ o p e n _ i n s e c u r e _ m o d e = T r u e
INFO: ---------------------- Creating TAPs ----------------------
INFO: Executing command: /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/ipdk.recipe/install/bin/gnmi-ctl set device:virtual-device,name:TAP0,pipeline-name:pipe,mempool-name:MEMPOOL0,mtu:1500,port-type:TAP -grpc_use_insecure_mode=True
INFO: setting vhost_dev = true.Set request, successful...!!!
WARNING: I20230714 19:52:04.519886 106752 gnmi_ctl.cc:103] Client context cancelled.
INFO: Executing command: ifconfig TAP0 up
INFO: Executing command: /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/ipdk.recipe/install/bin/gnmi-ctl set device:virtual-device,name:TAP1,pipeline-name:pipe,mempool-name:MEMPOOL0,mtu:1500,port-type:TAP -grpc_use_insecure_mode=True
INFO: setting vhost_dev = true.Set request, successful...!!!
WARNING: I20230714 19:52:04.740332 106773 gnmi_ctl.cc:103] Client context cancelled.
INFO: Executing command: ifconfig TAP1 up
INFO: ---------------------- Build and Load Pipleline ----------------------
INFO: Executing command: /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/ipdk.recipe/install/bin/tdi_pipeline_builder --p4c_conf_file=/home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/add_on_miss0.conf --bf_pipeline_config_binary_file=/home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/add_on_miss0.pb.bin
WARNING: I20230714 19:52:04.801210 106788 tdi_pipeline_builder.cc:116] Found P4 program: add_on_miss0.p4
WARNING: I20230714 19:52:04.80[18](https://github.com/Hoooao/p4c-dpdk-ci-pipeline/actions/runs/5556956035/job/15053507547#step:9:19)88 106788 tdi_pipeline_builder.cc:123] 	Found pipeline: pipe
INFO: Executing command: /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/ipdk.recipe/install/bin/p4rt-ctl set-pipe br0 /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/add_on_miss0.pb.bin /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/p4Info.txt 
INFO: *** Warning: Unable to locate TLS certificates ***
INFO: Attempting P4RT communication over insecure channel on port localhost:9559...
INFO: ---------------------- Run PTF test ----------------------
INFO: Executing command: ptf --pypath /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/p4c/backends/dpdk --test-dir /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst --list
WARNING: /usr/local/bin/ptf:[19](https://github.com/Hoooao/p4c-dpdk-ci-pipeline/actions/runs/5556956035/job/15053507547#step:9:20): DeprecationWarning: the imp module is deprecated in favour of importlib and slated for removal in Python 3.12; see the module's documentation for alternative uses
WARNING:   import imp
INFO: Using packet manipulation module: ptf.packet_scapy
INFO: Tests are shown grouped by module. If a test is in any groups beyond "standard"
INFO: and its module's group then they are shown in parentheses.
INFO: 
INFO: Tests marked with '!' are disabled because they are experimental, special-purpose,
INFO: or are too long to be run normally. These are not part of the "standard" test
INFO: group or their module's test group.
INFO: 
INFO: Test List:
INFO:   Module add_on_miss0:  No description
INFO:     OneEntryTest:     
INFO: 
INFO: 1 modules shown with a total of 1 tests
INFO: 
INFO: Test groups: add_on_miss0, all, standard
INFO: Executing command: ptf --pypath /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/p4c/backends/dpdk  -i 0@TAP0 -i 1@TAP1 --log-file /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst/ptf.log --test-params=grpcaddr='0.0.0.0:9559' --test-dir /home/runner/work/p4c-dpdk-ci-pipeline/p4c-dpdk-ci-pipeline/tmpjdvgvxst
WARNING: /usr/local/bin/ptf:19: DeprecationWarning: the imp module is deprecated in favour of importlib and slated for removal in Python 3.12; see the module's documentation for alternative uses
WARNING:   import imp
WARNING: 2023-07-14 19:52:08,105 - root - INFO - port map: {(0, 0): 'TAP0', (0, 1): 'TAP1'}
WARNING: 2023-07-14 19:52:08,106 - root - INFO - Autogen random seed: 78534030
WARNING: 2023-07-14 19:52:08,119 - root - INFO - *** TEST RUN START: Fri Jul 14 19:52:08 2023
WARNING: add_on_miss0.OneEntryTest ... 2023-07-14 19:52:08,119 - root - INFO - AbstractIdleTimeoutTest.setUp()
WARNING: 2023-07-14 19:52:08,148 - root - INFO - Attempting to delete all entries in ipv4_host
WARNING: 2023-07-14 19:52:08,152 - root - INFO - Attempting to add entries to ipv4_host
WARNING: 2023-07-14 19:52:08,330 - root - INFO - Now ipv4_host contains 2 entries
WARNING: 2023-07-14 19:52:08,330 - root - INFO - Sending Packet ...
WARNING: 2023-07-14 19:52:08,334 - root - INFO - Sending packet #1
WARNING: 2023-07-14 19:52:08,335 - root - INFO - Verifying Packet ...
WARNING: 2023-07-14 19:52:08,539 - root - INFO -     packet experienced a miss in ct_tcp_table as expected
WARNING: 2023-07-14 19:52:08,539 - root - INFO - IdleTimeoutTest.tearDown()
WARNING: ok
WARNING: 
WARNING: ----------------------------------------------------------------------
WARNING: Ran 1 test in 0.425s

Current status and further plans

  • There is only one test added to the PTF test suite, but we will be working on the P4Testgen for it.
  • Currently, the dependency building for infrap4d takes too long. We will be working on a Docker image to try to accelerate the building process.

@fruffy
Copy link
Collaborator

fruffy commented Jul 18, 2023

Can you squash these commits into one?

@fruffy fruffy added the dpdk Topics related to the DPDK back end label Jul 18, 2023
@fruffy fruffy changed the title [WiP] Add a PTF test CI pipeline for DPDK [WiP] Add a PTF test CI pipeline for p4c-dpdk on the DPDK SoftNIC Jul 18, 2023
Copy link
Collaborator

@fruffy fruffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An initial round of comments.

testdata/p4_16_samples/add_on_miss0.p4 Outdated Show resolved Hide resolved
tools/ci-dpdk-ptf-test/ci-build.sh Outdated Show resolved Hide resolved
tools/ci-dpdk-ptf-test/ci-build.sh Outdated Show resolved Hide resolved
.github/workflows/ci-ptf-tests.yml Outdated Show resolved Hide resolved
backends/dpdk/run-dpdk-ptf-test.py Outdated Show resolved Hide resolved
.github/workflows/ci-ptf-tests.yml Outdated Show resolved Hide resolved
.github/workflows/ci-ptf-tests.yml Outdated Show resolved Hide resolved
.github/workflows/ci-ptf-tests.yml Outdated Show resolved Hide resolved
backends/dpdk/run-dpdk-ptf-test.py Outdated Show resolved Hide resolved
@jafingerhut
Copy link
Contributor

Is it intentional to delete so many files with this PR?

@fruffy
Copy link
Collaborator

fruffy commented Jul 19, 2023

Is it intentional to delete so many files with this PR?

Yes, the deleted files are the Github Action runs. The intention is to save CI runs while we get the pipeline right.

tools/dpdk-ci-build.sh Show resolved Hide resolved
backends/dpdk/CMakeLists.txt Outdated Show resolved Hide resolved
backends/dpdk/run-dpdk-ptf-test.py Outdated Show resolved Hide resolved
@fruffy fruffy requested a review from jafingerhut August 1, 2023 08:44
@fruffy fruffy changed the title [WiP] Add a PTF test CI pipeline for p4c-dpdk on the DPDK SoftNIC Add a PTF test CI pipeline for p4c-dpdk on the DPDK SoftNIC Aug 1, 2023
@fruffy
Copy link
Collaborator

fruffy commented Aug 1, 2023

@qobilidop Do you have any comments/suggestions on this PR? It's the first step. We are slowly building out a testing pipeline for the SoftNIC.

@qobilidop
Copy link
Member

This is awesome! I really like how clean and flexible it is to write a test using the PTF framework. And it seems to me all the existing PTF tests can be reused for the DPDK SoftNIC? If true, that's a big plus. It's probably time to remove the e2e-test in p4-dpdk-target and add more PTF tests in this repo instead 😂

I have some high level questions to clarify:

  1. As multiple repos are involved, what's the high level strategy to decide which versions of each repo to test? Is it always testing the latest?
  2. Could an update in a different repo (e.g. p4-dpdk-target) trigger these tests? Is it or the lack of it a concern?
  3. How does these tests influence the development workflow of tested components? For example, a bug might need work on both p4c backend side, and DPDK pipeline side to fix. What is the expected workflow for fixing that bug, regarding the use of these tests?
  4. In terms of table entry IO interface, is it: DPDK <--[TDI]--> infrap4d <--[P4Runtime]--> PTF?
  5. What is the relationship between infrap4d and bf_switchd?

@fruffy
Copy link
Collaborator

fruffy commented Aug 2, 2023

This is awesome! I really like how clean and flexible it is to write a test using the PTF framework. And it seems to me all the existing PTF tests can be reused for the DPDK SoftNIC? If true, that's a big plus. It's probably time to remove the e2e-test in p4-dpdk-target and add more PTF tests in this repo instead joy

Yes, happy to get this done! Ultimately, we want to use P4Testgen on all the available sample programs and so it makes sense to have this pipeline as part of the compiler.

We could also add any additional PTF tests to the repository to ensure the compiler is working correctly.

1. As multiple repos are involved, what's the high level strategy to decide which versions of each repo to test? Is it always testing the latest?

It's a good question. Currently, we are always testing the latest to exercise the entire toolchain. We could also pin some of the dependencies to keep things stable and then periodically update. Or have a meta-repository that tests the latest components. In my experience, it has been beneficial to have tests running on each PR to expose pain points early.

2. Could an update in a different repo (e.g. p4-dpdk-target) trigger these tests? Is it or the lack of it a concern?

No, that is not possible yet. We could also run these kinds of workflows on both P4C and the p4-dpdk-target. The only question is where to place tests.

3. How does these tests influence the development workflow of tested components? For example, a bug might need work on both p4c backend side, and DPDK pipeline side to fix. What is the expected workflow for fixing that bug, regarding the use of these tests?

In the past, for example with BMv2, we timed these. We first merged the fix in BMv2 then merge the corresponding PR in P4C or vice versa.

4. In terms of table entry IO interface, is it: DPDK <--[TDI]--> infrap4d <--[P4Runtime]--> PTF?

Yep, the plan is to cut this down, too. Ideally, we want to have a rather minimal (or simple) installation. At least for the compiler.

5. What is the relationship between infrap4d and bf_switchd?

Not quite sure. I believe bf_switchd are Stratum remains which were tied to the Tofino bf_switchd version. Ultimately, that daemon should be replaced or renamed.

backends/dpdk/run-dpdk-ptf-test.py Outdated Show resolved Hide resolved
@Hoooao Hoooao force-pushed the main branch 2 times, most recently from 2b500b7 to 4025423 Compare August 4, 2023 12:45
tu.send_packet(self, ig_port, pkt_in)
first_pkt_time = time.time()

def verifyPackets(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extra structure to tests that you have added with verifyPackets and sendPackets methods is fine, for tests where that pattern is sufficient.

In general, the fact that PTF tests allow you to write more general flows of tests, e.g. add some entries, send and verify some packets, add/modify/delete some table entries, send and verify more packets, etc. I consider a super-power of PTF tests over STF tests, and we should not discourage people from using it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I will approve this PR regardless of this comment. I am not suggesting any changes to this PR. I just don't want people going forward thinking that they have to restrict themselves to writing tests with this pattern.

Copy link
Contributor

@jafingerhut jafingerhut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jafingerhut
Copy link
Contributor

I believe that right now in the public repos for bf_switchd and infrap4d, a very important distinction is that bf_switchd ONLY allows you to do table add/modify/delete calls via an interactive Python shell that you can reach by typing the tdi_python command. bf_switchd does NOT include a P4Runtime API server, so running it does not enable you to write tests with a P4Runtime API client.

infrap4d DOES include a P4Runtime API server (and some other things, e.g. a gNMI and I believe also gNOI server).

Thus at this time, if you want to do table add/delete/modify operations from a separate process (e.g. a Python program as in this PR), you must use infrap4d with P4-DPDK.

Ccache Added

Moved PTF tests into CMake; Shell scrips cleaned

Fix CMake

Fix CI

Add sudo to cmd

No Hugepage test

Rearrange compilation

Cleaned

Ccache Added

Moved PTF tests into CMake; Shell scrips cleaned

Fix CMake

Fix CI

Add sudo to cmd

No Hugepage test

Rearrange compilatin

Main branch update

Renamed test with ptf prefix

Test final

added outputs

IPDK recipe updated

Renamed test; ipdk_recipe repo changed

Deleted old test and outputs

Added test outputs

IPDK update

Test ld_lib

Cleaned code

CI to main

Changed comments

Minor fixes

Ccache Added

Moved PTF tests into CMake; Shell scrips cleaned

Fix CMake

Fix CI

Add sudo to cmd

No Hugepage test

Rearrange compilation

Cleaned

Moved PTF tests into CMake; Shell scrips cleaned

Fix CMake

Fix CI

Add sudo to cmd

No Hugepage test

Rearrange compilatin

Main branch update

Renamed test with ptf prefix

Test final

added outputs

IPDK recipe updated

Renamed test; ipdk_recipe repo changed

Deleted old test and outputs

Added test outputs

IPDK update

Test ld_lib

Cleaned code

Changed comments

Fix redundence

Minor mods

Rearrange p4c and recipe

Test protoc version

Deps updated

Fix

test

Changed Dep_install

Changed Dep_install to /usr/local

Wrap up

Delete residual comment

check

N

Added loop to check if infrap4d is on; Disabled IPv6

Switch to main
@fruffy
Copy link
Collaborator

fruffy commented Aug 8, 2023

@jafingerhut Did you notice any nondeterminism running tests using infrap4d? We see tests sporadically failing but it is unclear why.

@jafingerhut
Copy link
Contributor

I did notice the following, which you might be seeing:

The P4Runtime API specification says that a P4Runtime API server should not send back a response to a WriteRequest until after all successfully performed parts of the WriteRequest message have been committed to the data plane (and will thus affect all future packet processing).

In my testing, infrap4d fairly often returns such a response message before the WriteRequest commands are committed to the data plane.

Thus if the PTF test program immediately sends a packet into the data plane after receiving the response, there is a race to see whether the WriteRequest will be committed to the data plane first, or the packet will reach the data plane first and see the old state before the WriteRequest.

In the mean tme, as a workaround, I have put in a call "waitForControlPlaneModifications()" after control plane operations, before sending in data packets, where all that function does is time.sleep() for an amount of time that lets infrap4d win the race. I found 1 second works well in my testing so far.

I have notified the infrap4d authors of this, but should file an official bug to track it.

@Hoooao
Copy link
Contributor Author

Hoooao commented Aug 8, 2023

@jafingerhut
Thanks for the reply.
Just to make sure I understand it correctly and we observed the same issue, do you mean by the test should not start immediately after p4rt-ctl is used to load the pipeline?
I am curious that why p4rt-ctl would crash like in the github action without sending a packet: should not the race condition cause an unexpected testing result in PTF? So in the case of the github action, I am not quite sure waiting then sending the packet can solve it.

@jafingerhut
Copy link
Contributor

@Hoooao I am not aware of why p4rt-ctl command would be crashing. I have not used p4rt-ctl for any purposes other than the set-pipe command to load a P4 program executable into the P4-DPDK software switch. If it is failing for some reason, that sounds like a different problem than the one I described in my previous comment above.

@jafingerhut
Copy link
Contributor

@Hoooao I did skim through a recent test failure log and saw this message:

*** Warning: Unable to locate TLS certificates ***

That looks like you might be trying to run p4rt-ctl, or a Python program trying to make a P4Runtime API connection to infrap4d process, and it is expecting the client to use the proper TLS certificates required by the infrap4d P4Runtime API server.

@jafingerhut
Copy link
Contributor

@fruffy @Hoooao I have been doing some more asking around and realize that I have not personally confirmed that this race issue I mentioned above actually exists with P4-DPDK. I saw it when testing with a different system that also uses infrap4d, but the code responsible for that race might be outside of infrap4d.

Anyway, short summary is that there might be a similar race in P4DPDK when using infrap4d, but I am not 100% sure there is.

@Hoooao
Copy link
Contributor Author

Hoooao commented Aug 8, 2023

@jafingerhut Thanks for the follow-up.
The warning is intentional since we are using unsecured mode for both the infrap4d and p4rt-ctl for simplicity of configuration.
I have found a problem in dpdk-target that is at least partially attributed to the unexpected issue and made a PR here.

@jafingerhut
Copy link
Contributor

2 approvals and all CI tests passing. Merging.

@jafingerhut jafingerhut merged commit 9499498 into p4lang:main Aug 10, 2023
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dpdk Topics related to the DPDK back end
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants