Problem Description
SYSTEMIC ISSUE: Both StatefulSet and Service configurations across multiple Jupyter image types have label/selector mismatches that cause critical deployment and networking issues.
StatefulSet Issues
StatefulSet configurations have empty spec.selector and spec.template.metadata.labels fields, causing:
- Kubernetes rejecting manifests with: .spec.selector: Invalid value: {}: field is immutable and must be specified
- Pods created without labels
- Rolling update operations failing
- Future label updates blocked due to selector immutability constraints
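A minimal sketch of the manifest shape described above, for illustration only. The resource name, serviceName, container name, and image are hypothetical placeholders, and the exact form of the empty fields in the affected files may differ (for example `selector: {}` rather than an empty `matchLabels`).

```yaml
# Illustrative only: not copied from the repository.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: notebook               # hypothetical name
spec:
  replicas: 1
  serviceName: notebook        # hypothetical
  selector:
    matchLabels: {}            # empty, as reported
  template:
    metadata:
      labels: {}               # empty, as reported
    spec:
      containers:
        - name: notebook       # hypothetical
          image: example.invalid/notebook:latest   # placeholder image
```

Kubernetes requires a StatefulSet's spec.selector to match spec.template.metadata.labels, so a manifest like this relies entirely on whatever labels are injected at render time.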
Service Issues
Service manifests have hardcoded app: notebook labels in both metadata.labels and spec.selector, while kustomization adds additional app: jupyter-{component}-ubi9-python-3-{version} labels with includeSelectors: true. This causes:
- Service selectors demanding both labels after kustomize render
- Pod templates only receiving the generated label (no app: notebook)
- Empty Endpoints lists and 503 errors when accessing notebook services
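The two inputs described above can be sketched as follows. This is an illustration under assumptions, not the repository's actual files: the Service name, the port, and the concrete label value (one instance of the jupyter-{component}-ubi9-python-3-{version} pattern) are hypothetical.

```yaml
# service.yaml (base): labels hardcoded, as described above
apiVersion: v1
kind: Service
metadata:
  name: notebook          # hypothetical
  labels:
    app: notebook
spec:
  selector:
    app: notebook
  ports:
    - port: 8888          # hypothetical port
      targetPort: 8888
---
# kustomization.yaml: generated label also applied to selectors
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - service.yaml
  - statefulset.yaml
labels:
  - includeSelectors: true
    pairs:
      app: jupyter-minimal-ubi9-python-3-12   # illustrative instance of the pattern
```

Running `kustomize build` on the base directory and comparing the Service's spec.selector with the pod template labels in the output is one way to confirm which labels each side ends up with.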
Affected Files
StatefulSet configurations (13 files with empty selectors/labels):
- jupyter/datascience/ubi9-python-3.11/kustomize/base/statefulset.yaml
- jupyter/datascience/ubi9-python-3.12/kustomize/base/statefulset.yaml
- jupyter/minimal/ubi9-python-3.11/kustomize/base/statefulset.yaml
- jupyter/minimal/ubi9-python-3.12/kustomize/base/statefulset.yaml
- jupyter/pytorch/ubi9-python-3.11/kustomize/base/statefulset.yaml
- jupyter/pytorch/ubi9-python-3.12/kustomize/base/statefulset.yaml
- jupyter/rocm/pytorch/ubi9-python-3.11/kustomize/base/statefulset.yaml
- jupyter/rocm/pytorch/ubi9-python-3.12/kustomize/base/statefulset.yaml
- jupyter/rocm/tensorflow/ubi9-python-3.11/kustomize/base/statefulset.yaml
- jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml
- jupyter/tensorflow/ubi9-python-3.11/kustomize/base/statefulset.yaml
- jupyter/tensorflow/ubi9-python-3.12/kustomize/base/statefulset.yaml
- jupyter/trustyai/ubi9-python-3.11/kustomize/base/statefulset.yaml
Service configurations (all corresponding service.yaml files with hardcoded app: notebook)
Impact
This affects the networking, deployment, and update functionality of ALL Jupyter notebook deployments across all image types and Python versions in the repository.
Suggested Fix
Option 1: Let kustomize handle all labeling
Remove the hardcoded labels and selectors from the base manifests and let kustomize inject them consistently.
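For the Service side, Option 1 might look roughly like the sketch below. It assumes the kustomization's `labels` entry with `includeSelectors: true` stays in place as the single source of the app label; the name and port are hypothetical, and the rendered output should be checked to confirm the selector is populated as intended.

```yaml
# service.yaml (base) under Option 1: no hardcoded app label or selector
apiVersion: v1
kind: Service
metadata:
  name: notebook          # hypothetical
spec:
  ports:
    - port: 8888          # hypothetical port
      targetPort: 8888
```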
Option 2: Use distinct label keys
Modify the kustomization to inject app.kubernetes.io/instance instead of app.
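A hedged sketch of what Option 2 could look like in a kustomization. The label value and the resource file names are assumptions chosen for illustration; only the `labels`, `includeSelectors`, and `pairs` fields are standard kustomize.

```yaml
# kustomization.yaml under Option 2 (illustrative values)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - service.yaml
  - statefulset.yaml
labels:
  - includeSelectors: true
    pairs:
      # injected under a distinct key, so it no longer shares the app key
      app.kubernetes.io/instance: jupyter-minimal-ubi9-python-3-12
```

With this approach the hardcoded app: notebook in the Service selector would presumably also need to appear in the pod template labels for the Service to resolve endpoints.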
Context
This systemic issue was identified during code reviews across multiple PRs and consolidated to track comprehensively.
References:
- Original StatefulSet PR: RHOAIENG-28511: merge the python-3.12 branch to opendatahub-io/notebooks#main #1230
- ROCm TensorFlow PR: RHOAIENG-27434: Create Rocm Tensorflow Python 3.12 Image #1259
- ROCm PyTorch PR: RHOAIENG-27435: Create Rocm Pytorch Python 3.12 Image #1249
- Consolidated from:
  - StatefulSet selector and template labels configuration issue in jupyter/rocm/tensorflow #1264 (StatefulSet/ROCm TensorFlow)
  - Service selector conflicts with LabelTransformer in ROCm TensorFlow kustomize manifests #1265 (Service/ROCm TensorFlow)
  - Service/Pod label mismatch in ROCm PyTorch kustomize manifests causes connectivity issues #1251 (Service/ROCm PyTorch)
Reported by: @jiridanek