Sonobuoy conformance test fails on Centos/RHEL #1960

Closed
ShylajaDevadiga opened this issue Jun 26, 2020 · 6 comments

@ShylajaDevadiga
Contributor

Version:
k3s v1.18.4+k3s1
CentOS 7.6
RHEL 7.8

Describe the bug
Conformance tests fail with SELinux both enabled and disabled.

RHEL 7.8 SELinux Enabled

Plugin: e2e
Status: failed
Total: 4992
Passed: 263
Failed: 14
Skipped: 4715

Failed tests:

[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should deny crd creation [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should mutate custom resource [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should mutate pod and apply defaults after mutation [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should be able to deny attaching pod [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] listing mutating webhooks should work [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] patching/updating a mutating webhook should work [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should be able to deny custom resource creation, update and deletion [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should honor timeout [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] listing validating webhooks should work [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should be able to deny pod and configmap creation [Conformance]
[sig-api-machinery] CustomResourceConversionWebhook [Privileged:ClusterAdmin] should be able to convert from CR v1 to CR v2 [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should unconditionally reject operations on fail closed webhook [Conformance]
[sig-api-machinery] CustomResourceConversionWebhook [Privileged:ClusterAdmin] should be able to convert a non homogeneous list of CRs [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should mutate custom resource with pruning [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should not be able to mutate or prevent deletion of webhook configuration objects [Conformance]

RHEL 7.8 SELinux disabled

Plugin: e2e
Status: failed
Total: 4992
Passed: 263
Failed: 14
Skipped: 4715

Failed tests:
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should mutate custom resource [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should mutate pod and apply defaults after mutation [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should be able to deny attaching pod [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] listing mutating webhooks should work [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] patching/updating a mutating webhook should work [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should be able to deny custom resource creation, update and deletion [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should honor timeout [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] listing validating webhooks should work [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should be able to deny pod and configmap creation [Conformance]
[sig-api-machinery] CustomResourceConversionWebhook [Privileged:ClusterAdmin] should be able to convert from CR v1 to CR v2 [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should unconditionally reject operations on fail closed webhook [Conformance]
[sig-api-machinery] CustomResourceConversionWebhook [Privileged:ClusterAdmin] should be able to convert a non homogeneous list of CRs [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should mutate custom resource with pruning [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should not be able to mutate or prevent deletion of webhook configuration objects [Conformance]

With SELinux enabled, these two tests fail consistently:

should honor timeout [Conformance]
should deny crd creation [Conformance]
davidnuzik added this to the v1.19 - September milestone on Jun 26, 2020
davidnuzik added the [zube]: Next Up and kind/task (Work not related to bug fixes or new functionality) labels on Jun 26, 2020
@brandond
Member

brandond commented Jul 9, 2020

This appears to be caused by the kernel vxlan checksum bug that has been causing all kinds of issues. According to flannel-io/flannel#1282 (comment) this has been fixed upstream by kubernetes/kubernetes#92035 which is being cherry-picked back to 1.17 and 1.18.

In the meantime, we can switch the flannel backend to something other than vxlan (I personally usually use host-gw) or use ethtool to disable checksum offload on the vxlan interface. I have confirmed that either workaround results in a clean test run. Long term, this should be fixed in the next releases of upstream k8s.

tl;dr do one of:

  • Start k3s server with --flannel-backend=host-gw (or anything else other than vxlan)
  • After k3s startup, run the following on all nodes (servers AND agents): ethtool --offload flannel.1 rx off tx off. Note that this must be repeated after every reboot.
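
As a rough illustration of the second workaround, the ethtool call can be wrapped in a small shell function so it is easy to rerun on each node. This is only a sketch: the function name `disable_vxlan_offload` and the `DRY_RUN` switch are hypothetical additions for illustration, and the interface name defaults to `flannel.1` as in the comment above.

```shell
#!/bin/sh
# Sketch: disable rx/tx checksum offload on the flannel VXLAN interface.
# disable_vxlan_offload and DRY_RUN are illustrative names, not part of k3s.

disable_vxlan_offload() {
    # Interface to modify; defaults to flannel.1 as named in the workaround.
    iface="${1:-flannel.1}"
    cmd="ethtool --offload $iface rx off tx off"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "$cmd"
        return 0
    fi

    # The interface only exists once k3s (flannel) has created it.
    if [ ! -d "/sys/class/net/$iface" ]; then
        echo "interface $iface not found; is k3s running?" >&2
        return 1
    fi

    # Run the command (requires root).
    $cmd
}

# Example (run as root on every server and agent, after k3s has started):
#   disable_vxlan_offload
```

Because the setting does not survive a reboot, something like this would need to be run again after each boot, once k3s has recreated the interface.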

@brandond
Member

Will track underlying issue in #2013

@davidnuzik
Contributor

@ShylajaDevadiga you can test this after the patch releases. Once the 7/15 patch releases are out, you can test with them again on CentOS or RHEL, and the issue should be resolved.

@davidnuzik
Contributor

Moving this to Next Up. QA is going to stand by until the patch releases are done; once they are available, this can be moved back to test.

@davidnuzik
Contributor

Ready for testing

@ShylajaDevadiga
Contributor Author

k3s version: v1.18.6+k3s1
With the post-patch release, conformance tests passed on CentOS 7.8 and RHEL 7.8.
CentOS 7.8:

cat /etc/redhat-release 
CentOS Linux release 7.8.2003 (Core)
sonobuoy results $res
Plugin: e2e
Status: passed
Total: 4992
Passed: 277
Failed: 0
Skipped: 4715
Plugin: systemd-logs
Status: passed
Total: 2
Passed: 2
Failed: 0
Skipped: 0

RHEL 7.8:

cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.8 (Maipo)
sonobuoy results $res
Plugin: e2e
Status: passed
Total: 4992
Passed: 277
Failed: 0
Skipped: 4715
Plugin: systemd-logs
Status: passed
Total: 2
Passed: 2
Failed: 0
Skipped: 0
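
For anyone rechecking summaries like the ones above, here is a minimal sketch of a shell helper that reads `sonobuoy results` summary text on stdin and flags any plugin with failures or with counts that do not add up to the total. The function name `check_summary` is hypothetical, and the parsing assumes the unindented `Key: value` layout shown in this thread, with `Skipped` as the last field of each plugin.

```shell
#!/bin/sh
# Sketch: sanity-check a `sonobuoy results` summary read from stdin.
# check_summary is an illustrative name; the field layout is assumed to
# match the output pasted in this issue.

check_summary() {
    awk -F': ' '
        BEGIN           { bad = 0 }
        $1 == "Plugin"  { plugin = $2 }
        $1 == "Total"   { total = $2 }
        $1 == "Passed"  { passed = $2 }
        $1 == "Failed"  { failed = $2 }
        $1 == "Skipped" {
            skipped = $2
            # A plugin is ok when nothing failed and the counts sum to Total.
            ok = (failed == 0 && passed + failed + skipped == total)
            printf "%s: %s\n", plugin, (ok ? "ok" : "FAIL")
            if (!ok) bad = 1
        }
        END { exit bad }
    '
}

# Example (assumes $res holds the path to a retrieved results tarball):
#   sonobuoy results "$res" | check_summary
```

The function exits non-zero if any plugin is reported as FAIL, so it can be used as a gate in a test script.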
