Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate CEL validation rules not to enforce required fields when ObserveOnly #166

Merged
merged 3 commits into from
Apr 11, 2023
Merged

Generate CEL validation rules not to enforce required fields when ObserveOnly #166

merged 3 commits into from
Apr 11, 2023

Conversation

turkenh
Copy link
Member

@turkenh turkenh commented Mar 1, 2023

Description of your changes

This PR generates CEL validation rules not to enforce all the required fields when Observe Only as proposed in the design.

This is achieved by adding the following CEL rule for non-identifier (e.g. region) top level parameters (i.e. spec.forProvider.<param>) after removing the // +kubebuilder:validation:Required marker:

// +kubebuilder:validation:XValidation:rule="self.managementPolicy == 'ObserveOnly' || has(self.forProvider.<parameter>)",message="<parameter> is a required parameter"

I have:

  • Read and followed Crossplane's contribution process.
  • Run make reviewable to ensure this PR is ready for review.
  • Added backport release-x.y labels to auto-backport this PR if necessary.

How has this code been tested

Tested by re-generating schema for upbound/provider-aws, check this PR to see it in action.

@turkenh turkenh changed the title Generate CEL validation rules not to require field when ObserveOnly Generate CEL validation rules not to enforce required fields when ObserveOnly Mar 1, 2023
@turkenh turkenh marked this pull request as ready for review March 1, 2023 09:26
@turkenh turkenh mentioned this pull request Mar 2, 2023
3 tasks
@turkenh turkenh requested a review from lsviben March 2, 2023 09:05
Copy link
Member

@sergenyalcin sergenyalcin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @turkenh I left a comment

var identifierFields []string
for _, node := range t.Root.Nodes {
if node.Type() == parse.NodeAction {
match := parameterPattern.FindStringSubmatch(node.String())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, we check the {{ .parameters.<field_name> }} for catching the root level required fields. This will cover some external name configurations if we use this convention. However, as you know, we also have another external name configurations, for example IdentifierFromProvider. For this case, we rely to the API schema of resource for the required fields while constructing the ID by terraform. What do you think about this type of cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of IdentifierFromProvider, provider assigns an id to the resource and we don't need (or at least know) parameters that will be included in that identifier. Noticed some helper methods in provider-aws like FormattedIdentifierFromProvider which should be updated to set relevant parameters as identifier fields.

In upjet, noticed that we need to mark the parameter as identifier in ParameterAsIdentifier method and updated accordingly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not also be able to extract all the argument names from an arbitrary templating expression here (although the logic implemented should cover the most common case of config.TemplatedStringAsIdentifier, i.e., {{ .parameters.<field_name> }}).

And I also agree we cannot cover all the config.ExternalName instances as @sergenyalcin mentioned above (there will even be anonymous ones, which will necessitate manual configuration of the newly introduced config.ExternalName.IdentifierFields as discussed below).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Responded in the other thread: #166 (comment)

@turkenh
Copy link
Member Author

turkenh commented Apr 10, 2023

Rebased on latest main.

Copy link
Collaborator

@ulucinar ulucinar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @turkenh, left some comments for discussion. I believe we need to discuss a little bit more if we can reliably figure out Terraform ID components.

@@ -127,6 +131,11 @@ func (g *Builder) AddToBuilder(typeNames *TypeNames, r *resource) (*types.Named,
obsType := types.NewNamed(typeNames.ObservationTypeName, types.NewStruct(r.obsFields, r.obsTags), nil)
g.genTypes = append(g.genTypes, obsType)

for _, p := range r.topLevelRequiredParams {
g.validationRules += "\n"
g.validationRules += fmt.Sprintf(`// +kubebuilder:validation:XValidation:rule="self.managementPolicy == 'ObserveOnly' || has(self.forProvider.%s)",message="%s is a required parameter"`, p, p)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to bump the crossplane-runtime dependency so that the spec.managementPolicy is available for the MRs. One question is what happens if we have this rule generated for the MRs but the spec.managementPolicy field is missing? I assume it will be safe because self.managementPolicy == 'ObserveOnly' condition will evaluate to false and has(self.forProvider.<argument>) will act just like the previous required marker, thus keeping backward compatibility. I assume we do not error in that case because of an invalid path. Is this really the case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tried this out and we get the following error during CRD creation:

The CustomResourceDefinition "instances.rds.aws.upbound.io" is invalid: spec.validation.openAPIV3Schema.properties[spec].x-kubernetes-validations[0].rule: Invalid value: apiextensions.ValidationRule{Rule:"self.managementPolicy == 'ObserveOnly' || has(self.forProvider.instanceClass)", Message:"instanceClass is a required parameter"}: compilation failed: ERROR: <input>:1:5: undefined field 'managementPolicy'
 | self.managementPolicy == 'ObserveOnly' || has(self.forProvider.instanceClass)
 | ....^

We will need to ensure that we are updating crossplane-runtime and crossplane-tools dependency with management policy, as part of rolling out this feature to the providers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for giving this a try @turkenh.

r.paramTags = append(r.paramTags, fmt.Sprintf(`json:"%s" tf:"%s"`, f.JSONTag, f.TFTag))
} else {
// Required fields should not have omitempty tag in json tag.
// TODO(muvaf): This overrides user intent if they provided custom
// JSON tag.
r.paramTags = append(r.paramTags, fmt.Sprintf(`json:"%s" tf:"%s"`, strings.TrimSuffix(f.JSONTag, ",omitempty"), f.TFTag))
}
req := !f.Schema.Optional
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that CRD validation rules have graduated to beta with Kubernetes 1.25. We may not be enforcing required fields at the Kubernetes API level for the previous Kubernetes versions if the corresponding alpha feature is not explicitly enabled.

Also, though practically less important because we will making previously required fields optional, we will also be introducing breaking API changes for such clusters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that CRD validation rules have graduated to beta with Kubernetes 1.25. We may not be enforcing required fields at the Kubernetes API level for the previous Kubernetes versions if the corresponding alpha feature is not explicitly enabled.

Correct. This was already discussed during the design here, and there was an agreement that this is reasonable given the reconciler will report the error from cloud API indicating that a required field is missing after the first reconciliation.

pkg/types/field.go Show resolved Hide resolved
pkg/types/builder.go Outdated Show resolved Hide resolved
// - req => required
// - !f.Identifier => not identifiers - i.e. region, zone, etc.
// - len(f.CanonicalPaths) == 1 => top level, i.e. not a nested field
if req && !f.Identifier && len(f.CanonicalPaths) == 1 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Let's assume a.b is a required argument that also happens to be a component in the Terraform ID. Let's also assume that a.c is a required argument that's not a component of the resource's Terraform ID. In this case, although we should be generating a CRD validation rule and not the unconditional +kubebuilder:validation:Required marker for a.c, the check on the length of the canonical path will prevent this. Not sure how much a problem this is because I don't recall any examples where a nested field is an identifier field, e.g., a composite type's field happens to be a path component. Please also see the relevant discussion below for types.NewField.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you noticed, we don't support nested identifier fields right now and I added a todo for the future.

For now, I believe this should be ok due to the following two reasons:

  1. There is no known real case for this.
  2. Even if there were, users would have to specify a.c when they want to observe (or import) that specific resource which is more like a degraded UX than a blocker. There will be no difference for default operation (i.e. FullControl), they have to specify a.c.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. As discussed offline, our plan is to make manual external-name configurations via config.ExternalName.IdentifierFields as needed.

@@ -47,6 +48,8 @@ var (
GetIDFn: ExternalNameAsID,
DisableNameInitializer: true,
}

parameterPattern = regexp.MustCompile(`{{\s*\.parameters\.([^\s}]+)\s*}}`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Edge cases may arise when we use arbitrary templating expressions apart from {{ .parameters.<argument name> }} as a parameter to config.TemplatedStringAsIdentifier. This is not common but is theoretically possible. Please also see the comment for config.ExternalName.IdentifierFields, which is also related to the discussion here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I am getting this. Could you elaborate with some examples?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have already discussed these offline but just taking a note here for future reference. Some examples we've found with @sergenyalcin are:

(We have already agreed these are not blockers. Just taking a note here for future reference).

var identifierFields []string
for _, node := range t.Root.Nodes {
if node.Type() == parse.NodeAction {
match := parameterPattern.FindStringSubmatch(node.String())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not also be able to extract all the argument names from an arbitrary templating expression here (although the logic implemented should cover the most common case of config.TemplatedStringAsIdentifier, i.e., {{ .parameters.<field_name> }}).

And I also agree we cannot cover all the config.ExternalName instances as @sergenyalcin mentioned above (there will even be anonymous ones, which will necessitate manual configuration of the newly introduced config.ExternalName.IdentifierFields as discussed below).

// resource identifier. We need to know these fields no matter what the
// management policy is including the Observe Only, different from other
// (required) fields.
IdentifierFields []string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit worried that we introduce a new external name configuration field without a robust defaulting for it so that we will have to do manual configurations to get proper required checks in place. We have covered some defaulting for this in config.TemplatedStringAsIdentifier but there are even anonymous config.ExternalName instances which will not be covered.

If this is not properly configured (i.e., if an identifier parameter is not correctly marked as so), we would expect the Terraform validation to catch it but this would still be a compromise in our API quality. Is there a way to extract this information regarding which Terraform configuration arguments are part of the Terraform ID from the native schema? What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but there are even anonymous config.ExternalName instances which will not be covered.

Could you point to some examples? Can we configure them as part of rolling the feature out to the providers?

If this is not properly configured (i.e., if an identifier parameter is not correctly marked as so), we would expect the Terraform validation to catch it, but this would still be a compromise in our API quality.

To be clear, if an identifier parameter is not marked correctly, we will replace required annotation with CEL validation rule which marks it as required. So, the impact is:

Default (not Observe Only):

  • 1.25+ => No difference, the user will get warned during resource creation.
  • 1.24- => Resource will be created and we get an error after the first reconciliation

Observe Only:

  • User may create the resource without providing an identifier field but should get an error after the first reconciliation.

I'm a little bit worried that we introduce a new external name configuration field without a robust defaulting for it.

Is there a way to extract this information regarding which Terraform configuration arguments are part of the Terraform ID from the native schema? What do you think?

I agree with your concerns and have already investigated ways to figure this out before introducing a new configuration; however, unfortunately, couldn't find one. I believe this is a similar problem to external name configuration. The information could be inferred from the import documentation, but there is no robust way of automating this without human intervention. So, the best I can find is to leverage that existing configuration to figure the individual parameters out. Happy to try if you have some other ideas.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean with anonymous config.ExternalName instances are those which we anonymously define instead of a reusable definition (such as functions like config.TemplatedStringAsIdentifier or variables like the config.IdentifierFromProvider). An example is here.

As we have discussed offline, we may start getting reports on cases where we cannot enforce required fields at the Kubernetes API level as you mentioned above and although the remedy is to supply a value for the required field at runtime, we will also want to do a manual configuration on config.ExternalName.IdentifierFields as a permanent fix. This looks like to be the best option unless we extend the discussion to XRM or related stuff, so I believe we are good to go.

Copy link
Collaborator

@ulucinar ulucinar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @turkenh, lgtm.

Signed-off-by: Hasan Turken <turkenh@gmail.com>
Signed-off-by: Hasan Turken <turkenh@gmail.com>
Signed-off-by: Hasan Turken <turkenh@gmail.com>
@turkenh turkenh merged commit 56bd8e0 into crossplane:main Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants