Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add platform test for kdump #10047

Merged
merged 6 commits into from
Dec 29, 2023
Merged

Conversation

amnsinghal
Copy link
Contributor

Summary:
Add test_kdump.py to platform_tests.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205

Approach

What is the motivation for this PR?

Add automation test for kdump functionality.

How did you do it?

Induce kernel panic thru sysrq-trigger.

How did you verify/test it?

For platform with kdump enabled, trigger kernel panic, and check for reboot-cause on subsequent bootup.
Validated on Cisco-8000 platforms thru internal sonic-mgmt test.

image

Any platform specific information?

N/A

Supported testbed topology if it's a new test case?

Any platform with kdump enabled, else test will be skipped.

Signed-off-by: Aman Singhal <amans@cisco.com>
@amnsinghal
Copy link
Contributor Author

@prgeor @lguohan @abdosi Pls help review this new testcase for kdump, thanks.

@amnsinghal
Copy link
Contributor Author

@prgeor Can you pls help check this Elastictest failure, doesn't seem related to my change, thanks.

@amnsinghal
Copy link
Contributor Author

@prgeor Can you pls help review and check this Elastictest failure, doesn't seem related to my change, thanks.

@mssonicbld
Copy link
Collaborator

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/platform_tests/test_kdump.py:21:1: E302 expected 2 blank lines, found 1

flake8...............................................(no files to check)Skipped
check conditional mark sort..........................(no files to check)Skipped

To run the pre-commit checks locally, you can follow below steps:

  1. Ensure that default python is python3. In sonic-mgmt docker container, default python is python2. You can run
    the check by activating the python3 virtual environment in sonic-mgmt docker container or outside of sonic-mgmt
    docker container.
  2. Ensure that the pre-commit package is installed:
sudo pip install pre-commit
  1. Go to repository root folder
  2. Install the pre-commit hooks:
pre-commit install
  1. Use pre-commit to check staged file:
pre-commit
  1. Alternatively, you can check committed files using:
pre-commit run --from-ref <commit_id> --to-ref <commit_id>

@bpar9 bpar9 assigned bpar9 and unassigned bpar9 Oct 3, 2023
@bpar9 bpar9 self-requested a review October 3, 2023 00:14
tests/platform_tests/test_kdump.py Show resolved Hide resolved
tests/platform_tests/test_kdump.py Outdated Show resolved Hide resolved
@bpar9
Copy link
Contributor

bpar9 commented Oct 3, 2023

@yxieca there is an old PR which has a superset of TCs then what is present in this PR. That PR has been reviewed but not merged for past couple of years- #3562 . Is there some reason why community has not merged the old PR

@yxieca
Copy link
Collaborator

yxieca commented Oct 3, 2023

@yxieca there is an old PR which has a superset of TCs then what is present in this PR. That PR has been reviewed but not merged for past couple of years- #3562 . Is there some reason why community has not merged the old PR

Good question. Thanks for looking up the history. I believe the old PR #3562 has unresolve review comments. That was the reason for leaving it open.

@bpar9
Copy link
Contributor

bpar9 commented Oct 10, 2023

@yxieca there is an old PR which has a superset of TCs then what is present in this PR. That PR has been reviewed but not merged for past couple of years- #3562 . Is there some reason why community has not merged the old PR

Good question. Thanks for looking up the history. I believe the old PR #3562 has unresolve review comments. That was the reason for leaving it open.

thanks @yxieca for the reply. So what is your guidance on this PR-10047? it has a subset of the TCs of #3562 which are currently applicable for our platform. So do we wait for 3562 to be merged or can we go ahead and push this PR in. Please let us know, so that we can proceed accordingly. Thanks

@yxieca
Copy link
Collaborator

yxieca commented Oct 10, 2023

@yxieca there is an old PR which has a superset of TCs then what is present in this PR. That PR has been reviewed but not merged for past couple of years- #3562 . Is there some reason why community has not merged the old PR

Good question. Thanks for looking up the history. I believe the old PR #3562 has unresolve review comments. That was the reason for leaving it open.

thanks @yxieca for the reply. So what is your guidance on this PR-10047? it has a subset of the TCs of #3562 which are currently applicable for our platform. So do we wait for 3562 to be merged or can we go ahead and push this PR in. Please let us know, so that we can proceed accordingly. Thanks

We can work on this PR and get it merged. At this point. this PR contains blocking comments. Please resolve.

@amnsinghal amnsinghal requested a review from bpar9 October 10, 2023 17:37
@bpar9
Copy link
Contributor

bpar9 commented Oct 10, 2023

@yxieca there is an old PR which has a superset of TCs then what is present in this PR. That PR has been reviewed but not merged for past couple of years- #3562 . Is there some reason why community has not merged the old PR

Good question. Thanks for looking up the history. I believe the old PR #3562 has unresolve review comments. That was the reason for leaving it open.

thanks @yxieca for the reply. So what is your guidance on this PR-10047? it has a subset of the TCs of #3562 which are currently applicable for our platform. So do we wait for 3562 to be merged or can we go ahead and push this PR in. Please let us know, so that we can proceed accordingly. Thanks

We can work on this PR and get it merged. At this point. this PR contains blocking comments. Please resolve.

thanks @yxieca , Aman will work on addressing the comments and proceed.

@amnsinghal
Copy link
Contributor Author

@yxieca there is an old PR which has a superset of TCs then what is present in this PR. That PR has been reviewed but not merged for past couple of years- #3562 . Is there some reason why community has not merged the old PR

Good question. Thanks for looking up the history. I believe the old PR #3562 has unresolve review comments. That was the reason for leaving it open.

thanks @yxieca for the reply. So what is your guidance on this PR-10047? it has a subset of the TCs of #3562 which are currently applicable for our platform. So do we wait for 3562 to be merged or can we go ahead and push this PR in. Please let us know, so that we can proceed accordingly. Thanks

We can work on this PR and get it merged. At this point. this PR contains blocking comments. Please resolve.

thanks @yxieca , Aman will work on addressing the comments and proceed.

Thanks for confirming @yxieca and @bpar9, I have resolved the comments and update the PR, pls review and approve/merge.

@amnsinghal
Copy link
Contributor Author

@yxieca All checks have passed, except Semgrep is still waiting for status to be reported, can you pls check this. Thanks!

@mssonicbld
Copy link
Collaborator

The pre-commit check detected issues in the files touched by this pull request.
The pre-commit check is a mandatory check, please fix detected issues.

Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

tests/platform_tests/test_kdump.py:61:9: F841 local variable 'dut_ip' is assigned to but never used
tests/platform_tests/test_kdump.py:63:9: F841 local variable 'dut_datetime' is assigned to but never used

flake8...............................................(no files to check)Skipped
check conditional mark sort..........................(no files to check)Skipped

To run the pre-commit checks locally, you can follow below steps:

  1. Ensure that default python is python3. In sonic-mgmt docker container, default python is python2. You can run
    the check by activating the python3 virtual environment in sonic-mgmt docker container or outside of sonic-mgmt
    docker container.
  2. Ensure that the pre-commit package is installed:
sudo pip install pre-commit
  1. Go to repository root folder
  2. Install the pre-commit hooks:
pre-commit install
  1. Use pre-commit to check staged file:
pre-commit
  1. Alternatively, you can check committed files using:
pre-commit run --from-ref <commit_id> --to-ref <commit_id>

@yxieca
Copy link
Collaborator

yxieca commented Oct 12, 2023

@prgeor can you review this test?

@amnsinghal
Copy link
Contributor Author

@prgeor Can you pls review and approve/merge, if no further suggestions, thanks!

"""
This test case is used to verify that DUT will load kdump crashkernel on kernel panic.
"""
def wait_lc_healthy_if_sup(self, duthost, duthosts, localhost, conn_graph_facts, xcvr_skip_list):
Copy link
Contributor

@prgeor prgeor Oct 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amnsinghal is this test supposed to run only on supervisor? Why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the test runs on both LCs and RPs. For RPs, there is an additional check to ensure that LCs come up healthy after RP goes down and come back up. (In SONiC distributed chassis, LCs go down as well if RP goes down - Ref: sonic-net/SONiC#945 (comment))

if "Enabled" not in out["stdout"]:
pytest.skip('DUT {}: Skip test since kdump is not enabled'.format(hostname))

reboot(duthost, localhost, reboot_type=REBOOT_TYPE_KERNEL_PANIC)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amnsinghal why not trigger panic via echo c > /proc/sysrq-trigger

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is indeed what is being done with "REBOOT_TYPE_KERNEL_PANIC". See changes in reboot.py above.

@amulyan7
Copy link
Contributor

@prgeor if no further comments, can you please approve/merge, thanks!

@bpar9
Copy link
Contributor

bpar9 commented Dec 21, 2023

@prgeor , @yxieca , as we addressed all the comments, can we go ahead and merge this? thanks

@shivuv
Copy link

shivuv commented Dec 21, 2023

@prgeor @yxieca @gechiang @abdosi : Please help to merge this PR into master and cherry pick this into 202205, 202305 and 202311 branches.

@abdosi abdosi merged commit 7b1d43a into sonic-net:master Dec 29, 2023
16 checks passed
@abdosi
Copy link
Contributor

abdosi commented Dec 29, 2023

@wangxin please help with cherry-pick into 202205/202305.

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Feb 1, 2024
What is the motivation for this PR?
Add automation test for kdump functionality.

How did you do it?
Induce kernel panic thru sysrq-trigger.

How did you verify/test it?
For platform with kdump enabled, trigger kernel panic, and check for reboot-cause on subsequent bootup.
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Feb 1, 2024
What is the motivation for this PR?
Add automation test for kdump functionality.

How did you do it?
Induce kernel panic thru sysrq-trigger.

How did you verify/test it?
For platform with kdump enabled, trigger kernel panic, and check for reboot-cause on subsequent bootup.
@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202205: #11485

@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202305: #11486

mssonicbld pushed a commit that referenced this pull request Feb 1, 2024
What is the motivation for this PR?
Add automation test for kdump functionality.

How did you do it?
Induce kernel panic thru sysrq-trigger.

How did you verify/test it?
For platform with kdump enabled, trigger kernel panic, and check for reboot-cause on subsequent bootup.
mssonicbld pushed a commit that referenced this pull request Feb 1, 2024
What is the motivation for this PR?
Add automation test for kdump functionality.

How did you do it?
Induce kernel panic thru sysrq-trigger.

How did you verify/test it?
For platform with kdump enabled, trigger kernel panic, and check for reboot-cause on subsequent bootup.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants