Failure in k8s-operator endpoint-template (PostStart hook) #7396
This is misleading. The volume of logs generated by kuttl from the unstable k8s tests hides the true problem in the main test suite. Taken from the raw output:

The events in the kuttl output don't show the root cause of pod-0 not being scheduled.

The kubectl get pod output shows the following problem:

For whatever reason, the postStart script failed.
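One way to surface such hook failures from the cluster itself is sketched below (standard kubectl; the pod name and namespace are taken from this test run, and `FailedPostStartHook` is the kubelet's event reason for a failed PostStart hook):

```bash
# Show the pod's event stream, which includes FailedPostStartHook events.
kubectl describe pod endpoint-template-0 --namespace kuttl-test-composed-jackass

# Or filter cluster events directly for PostStart hook failures.
kubectl get events --namespace kuttl-test-composed-jackass \
  --field-selector reason=FailedPostStartHook
```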
Second time, the same problem: https://buildkite.com/redpanda/redpanda/builds/18895#01849d6c-9736-4275-b872-dd1116132d33

Third time, the same problem: https://buildkite.com/redpanda/redpanda/builds/18875#01849b55-2a33-426c-9267-ae7b4410d220. From this test run, one pod named
Due to the failing PostStart hook, it seems that one of the Redpanda brokers is in a degraded state and the Admin API is not responsive. The Pod log collector does not work if a pod is in `PodInitializing` state. To find more evidence of what is happening, a kind export logs command was added; its output should be collected by the Buildkite agent and served for later investigation. REF redpanda-data#7396
k8s: Gather k8s events from failed test
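For reference, a minimal sketch of the added log export (`kind export logs` is a standard kind subcommand; the output directory and cluster name here are placeholders, not the actual CI values):

```bash
# Dump logs from every kind node (kubelet, containerd, pod logs) into a
# directory that the Buildkite agent can pick up as an artifact.
kind export logs ./kind-logs --name kind
```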
The script that tries to remove maintenance mode:

```bash
#!/usr/bin/env bash
set -e

# Poll the Admin API until this broker reports its node_id.
until NODE_ID=$(curl --silent --fail http://${POD_NAME}.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local:9644/v1/node_config | grep -o '\"node_id\":[^,}]*' | grep -o '[^: ]*$'); do
  sleep 0.5
done

echo "Clearing maintenance mode on node ${NODE_ID}"

# Retry the DELETE until the Admin API answers 200 (cleared) or 400
# (rejected, e.g. maintenance mode was never enabled on this broker).
until [ "${status:-}" = "200" ] || [ "${status:-}" = "400" ]; do
  status=$(curl -X DELETE --silent -o /dev/null -w "%{http_code}" http://${POD_NAME}.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local:9644/v1/brokers/${NODE_ID}/maintenance)
  sleep 0.5
done
```
Redpanda config:

```yaml
redpanda:
  data_directory: /var/lib/redpanda/data
  empty_seed_starts_cluster: false
  seed_servers:
    - host:
        address: endpoint-template-0.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
        port: 33145
    - host:
        address: endpoint-template-1.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
        port: 33145
    - host:
        address: endpoint-template-2.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
        port: 33145
  rpc_server:
    address: 0.0.0.0
    port: 33145
  kafka_api:
    - address: 0.0.0.0
      port: 9092
      name: kafka
      authentication_method: none
    - address: 0.0.0.0
      port: 9093
      name: kafka-external
      authentication_method: none
  kafka_api_tls:
    - name: kafka-external
      key_file: /etc/tls/certs/tls.key
      cert_file: /etc/tls/certs/tls.crt
      truststore_file: /etc/tls/certs/ca/ca.crt
      enabled: true
      require_client_auth: true
  admin:
    - address: 0.0.0.0
      port: 9644
      name: admin
  advertised_rpc_api:
    address: endpoint-template-0.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
    port: 33145
  advertised_kafka_api:
    - address: endpoint-template-0.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
      port: 9092
      name: kafka
    - address: 0-5c2a3f8f4c-kafka.example.com
      port: 30804
      name: kafka-external
  developer_mode: true
  auto_create_topics_enabled: true
  cloud_storage_segment_max_upload_interval_sec: 1800
  default_topic_replications: 3
  enable_rack_awareness: true
  fetch_reads_debounce_timeout: 10
  group_topic_partitions: 3
  id_allocator_replication: 3
  log_segment_size: 536870912
  storage_min_free_bytes: 10485760
  topic_partitions_per_shard: 1000
  transaction_coordinator_replication: 3
rpk:
  tune_network: true
  tune_disk_scheduler: true
  tune_disk_nomerges: true
  tune_disk_write_cache: true
  tune_disk_irq: true
  tune_cpu: true
  tune_aio_events: true
  tune_clocksource: true
  tune_swappiness: true
  coredump_dir: /var/lib/redpanda/coredump
  tune_ballast_file: true
  overprovisioned: true
pandaproxy:
  pandaproxy_api:
    - address: 0.0.0.0
      port: 8082
      name: proxy
    - address: 0.0.0.0
      port: 8083
      name: proxy-external
  pandaproxy_api_tls:
    - name: proxy-external
      key_file: /etc/tls/certs/pandaproxy/tls.key
      cert_file: /etc/tls/certs/pandaproxy/tls.crt
      truststore_file: /etc/tls/certs/pandaproxy/ca/ca.crt
      enabled: true
      require_client_auth: true
  advertised_pandaproxy_api:
    - address: endpoint-template-0.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
      port: 8082
      name: proxy
    - address: 0-5c2a3f8f4c-pandaproxy.example.com
      port: 32576
      name: proxy-external
  pandaproxy_client:
    brokers:
      - address: endpoint-template-0.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
        port: 9092
      - address: endpoint-template-1.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
        port: 9092
      - address: endpoint-template-2.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
        port: 9092
schema_registry: {}
```
From Redpanda's perspective, a couple of observations:

- Just curious if that aligns with what we think the test is doing; is it adding a brand-new node to a cluster? For a brand-new node, is it possible to skip disabling maintenance mode? More broadly, if the issue is that we can't send any RPCs while initializing, that seems problematic, and maybe we need to expose a different readiness endpoint for these hooks to use.
- It looks like the node is having trouble sending RPCs to any of the seed nodes. Is that something we should expect when initializing a pod?
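If a dedicated readiness endpoint existed, the PostStart hook could gate on it before issuing any maintenance-mode calls. A minimal sketch, assuming a hypothetical Admin API path `/v1/status/ready` and a `NAMESPACE` variable (neither is confirmed by this thread):

```bash
#!/usr/bin/env bash
# Sketch: block until the broker reports ready, then proceed.
# The /v1/status/ready path and NAMESPACE variable are assumptions.
ADMIN_URL="http://${POD_NAME}.endpoint-template.${NAMESPACE}.svc.cluster.local:9644"
until curl --silent --fail "${ADMIN_URL}/v1/status/ready" >/dev/null; do
  sleep 0.5
done
echo "Broker ready; safe to issue maintenance-mode requests"
```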
Once more in a feature branch: https://buildkite.com/redpanda/redpanda/builds/18931#0184a033-7ea9-4375-a819-226a353770d7
Another thing I'm confused about is why this happens only sometimes, if the issue truly is that DNS setup cannot proceed until the node exposes its admin endpoint.
Just for posterity,
Thanks @andrwng, I will drill down into the DNS (CoreDNS) pod to see if it was causing the problem.
This test is related to setting up the external advertised Kafka API DNS using a template mechanism. It happens that this particular test creates 3 Redpanda clusters, while the others create only 1.
It's not that easy. We would need to implement some hacks around when a node is considered formed and when it is not. If we cannot use the Admin API reliably, we might end up with some kind of workaround.
We are not using a readiness probe to take the Redpanda broker out of maintenance mode; it's a PostStart hook: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
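For context, this is roughly how such a hook hangs off a container spec (the field names are standard Kubernetes; the image tag and script path are placeholders, not the operator's actual values):

```yaml
# Sketch of a PostStart lifecycle hook; it runs right after the container
# starts, with no ordering guarantee relative to the entrypoint.
containers:
  - name: redpanda
    image: redpanda:placeholder   # placeholder image
    lifecycle:
      postStart:
        exec:
          command: ["/bin/bash", "-c", "/var/lifecycle/postStart.sh"]
```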
The DNS containers don't have any useful logs. I don't want to blame the kind cluster or the Buildkite environment, but I have run out of options. Maybe we have too many kind nodes, but that's a stretch of a hypothesis.
We need to collect previous logs from pods. All we can see is that pod/endpoint-template-1's initContainer was running at the time we collected logs.
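kubectl supports this directly via the `--previous` flag; a sketch, assuming the container is named `redpanda` (the pod and namespace names come from this test run):

```bash
# Fetch logs from the prior instance of the container, which survives
# a container restart but not a pod deletion.
kubectl logs pod/endpoint-template-1 --namespace kuttl-test-composed-jackass \
  --container redpanda --previous
```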
Can that be done automatically and added to the Buildkite artifacts so that we can access them?
That will be the next step when someone is next working on this, yes.
Based on 1 week of failure reports, there were no failures in endpoint-template. I will close this issue; if it reappears I will re-open.
https://buildkite.com/redpanda/redpanda/builds/18840#01849373-8425-495a-95e6-7c462a832601