Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sanity: fix nil pointer error for IDGen when using Ginkgo suite #220

Merged
merged 1 commit into from
Sep 27, 2019

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Sep 23, 2019

What type of PR is this?

/kind bug

What this PR does / why we need it:

When adding IDGen, only one of the two code paths that trigger testing
was updated such that it sets a default ID generator when the config
doesn't already have one. GinkgoTest was not updated.

As a result, existing E2E suites which use GinkgoTest and don't set the
new field crash at runtime with a nil pointer error when updated to
csi-test 2.2.0.

Updating the instance provided by the caller is also a bit
questionable because it is an undocumented side effect. It's cleaner
to treat the struct as read-only and implement the fallback in an
accessor function.

Special notes for your reviewer:

We should create a 2.2.1 with this fix...

Does this PR introduce a user-facing change?:

fix nil pointer error when Config.IDGen is not set

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 23, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pohly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 23, 2019
@pohly
Copy link
Contributor Author

pohly commented Sep 23, 2019

/assign @davidz627

@pohly
Copy link
Contributor Author

pohly commented Sep 23, 2019

/cc @okartau

Found by Olev when updating PMEM-CSI to csi-test 2.2.0.

@k8s-ci-robot
Copy link
Contributor

@pohly: GitHub didn't allow me to request PR reviews from the following users: okartau.

Note that only kubernetes-csi members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @okartau

Found by Olev when updating PMEM-CSI to csi-test 2.2.0.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -44,7 +44,7 @@ type IDGenerator interface {
GenerateInvalidNodeID() string
}

var _ IDGenerator = &DefaultIDGenerator{}
var defaultIDGen IDGenerator = &DefaultIDGenerator{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid global variables, consider creating a "NewDefaultIDGenerator" function instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just return &DefaultIDGenerator{} - but that becomes mute if we don't use the accessor at all.

if c.IDGen != nil {
return c.IDGen
}
return defaultIDGen
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defaulting in an accessor is a code smell. This will surprise someone eventually. Could you just provide a default ID generator in the GinkgoTest? If you want to fix it once and for all you could also make the creation of a Config into a NewConfig function which defaults IDGen if not specified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A NewConfig function is useful and would simplify the addition of new entries were the null entry isn't a sensible default. But depending on it being used to construct the Config is an API break.

For now I'll go for a deep-copy function instead which takes the user supplied Config and sanitizes it, like you did in Test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copying Config doesn't work. That the tests use the user-supplied instance is part of the API, too. In PMEM-CSI we use that to fill in some fields that depend on the current test in a BeforeEach.

I guess we have to continue with updating the user-supplied instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pohly@8ac7c59 contains a version which updates IDGen in both code paths.

But I am not convinced that this is actually the better solution compared to the current one with the accessor. It creates a special case just for IDGen, everything else is checked at test runtime.

@davidz627 perhaps I can convince you that the accessor isn't so bad after all? 😐
I updated the defaulting to not use a global variable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still slightly confused as to why callers (or a constructor) can't just set the IDGen to a default when creating the Config struct? If the Config struct must have an IDGen field set to function (it does) it should enforce that at creation time.

Another issue with defaulting in the accessor is now you've created an "API break" in that you MUST use the accessor. It's easy to imagine a scenario where someone adds code that doesn't use the accessor and grabs the IDGen directly and breaks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think defaulting IDGen either or creation or near creation of the Config object is the correct way to do this. There's no API break and theres no unexpected behavior around accessing it.

The comment about IDGen being optional is probably incorrect as we can see it's being assumed + used in several places in the tests. It is actually required and should be defaulted or set by a caller.

If for some reason this doesn't work I would prefer to rev the major version by requiring a constructor (honestly requiring a constructor should never be the case - we should be able to default elsewhere where the config is "initialized") with the defaulting rather than force usage of an accessor that has implicit defaulting behavior.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

introducing one [constructor] now is an API break

Adding a new function is generally not considered an API break. The accessor you created is also a new function, and I don't consider that to be an API break either. Both approaches fix the bug in a new API, and users will have to switch to that API to get the bug fix.

// IDGen is an optional interface for callers to provide a generator for
// valid Volume and Node IDs. Defaults to DefaultIDGenerator which generates
// generic string IDs

My interpretation is close to yours, @pohly. To me, this says "Users don't need to supply this value and anyone reading it should expect the default behavior provided by DefaultIDGenerator". Since users are expected to create this struct directly, that means the nil value should be a useful default. If users were using a constructor, I would assume the constructor to handle defaulting.

The fact that the intention was to have this default, but no defaulting was provided in the library is a bug. However, I feel the bug lies with the API design and not its implementation. i.e. Asking users to initialize a struct directly means that the nil value should be useful, but that is not possible for interface types.

If the impact on users is comparable (minor code change to use new API) then we should focus on which API is preferable. The options I see here are,

  1. Add an accessor with defaulting behavior
  2. Add a constructor with defaulting behavior
  3. Document that users must explicitly provide DefaultIDGenerator or another generator.

Option 1 is the worst fit with the existing API. The Config type has no methods or accessors today. Defining one for a single field is likely to be missed by someone when they access every other field directly.

Option 2 provides the most forward compatibility, since a constructor function can usually adopt a functional options pattern.

Option 3 is updating the comments to describe the API implemented rather than the one intended. This may be reasonable, given how few users this library has.

Regarding DeepCopy,

That the tests use the user-supplied instance is part of the API, too.

If a behavior is observable, that does not mean it must be treated like an API. (see also, Hyrum's Law) While it is good to avoid breaking users of this library, it is also good to avoid accumulating tech debt.

Just as a side comment, the godoc for this package does not make it easy to figure out how to use this package. Is there documentation elsewhere that users should be reading? Should we link to it from the godoc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

introducing one [constructor] now is an API break

Adding a new function is generally not considered an API break.

Yes, as long as calling it is optional. But the idea was to make calling
the constructor mandatory because there new non-null values (like the
interface implementation) would be set. That would be an API break,
because current users of the API don't call it and then fail.

The fact that the intention was to have this default, but no defaulting was provided in the library is a bug. However, I feel the bug lies with the API design and not its implementation. i.e. Asking users to initialize a struct directly means that the nil value should be useful, but that is not possible for interface types.

I don't think there has been much thought spent on API design in this
package... I suppose we now try to reconstruct what the API really is
based on the exported types and how they get used in reality.

If the impact on users is comparable (minor code change to use new API) then we should focus on which API is preferable. The options I see here are,

  1. Add an accessor with defaulting behavior
  2. Add a constructor with defaulting behavior
  3. Document that users must explicitly provide DefaultIDGenerator or another generator.

Option 1 is the worst fit with the existing API. The Config type has
no methods or accessors today. Defining one for a single field is
likely to be missed by someone when they access every other field
directly.

Option 2 provides the most forward compatibility, since a constructor
function can usually adopt a functional
options

pattern.

That would also be my preferred solution. But let's postpone that until
we need to break the API for some other reason.

As a short-term solution let's fix the missing initialization, which is
the alternative solution that I mentioned in #220 (comment)

I've updated this PR with that code.

Option 3 is updating the comments to describe the API implemented
rather than the one intended. This may be reasonable, given how few
users this library has.

Regarding DeepCopy,

That the tests use the user-supplied instance is part of the API, too.

If a behavior is observable, that does not mean it must be treated like an API. (see also, Hyrum's Law) While it is good to avoid breaking users of this library, it is also good to avoid accumulating tech debt.

True. But this aspect of the package is useful, so it should be part of
the API.

Just as a side comment, the
godoc
for this package does not make it easy to figure out how to use this
package. Is there documentation elsewhere that users should be
reading?

Unfortunately not. New users probably discover it based on usage in
other projects, which then also serves as example for how to use it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick reply.

introducing one [constructor] now is an API break

Adding a new function is generally not considered an API break.

Yes, as long as calling it is optional. But the idea was to make calling
the constructor mandatory because there new non-null values (like the
interface implementation) would be set. That would be an API break,
because current users of the API don't call it and then fail.

My point was that adding an accessor is also a new "mandatory" function. However, I think "mandatory" is not the only way to view such a change. Any client that continues to use the existing API will experience the exact same behavior, bug and all. From that perspective, this is not a breaking API change.

The fact that the intention was to have this default, but no defaulting was provided in the library is a bug. However, I feel the bug lies with the API design and not its implementation. i.e. Asking users to initialize a struct directly means that the nil value should be useful, but that is not possible for interface types.

I don't think there has been much thought spent on API design in this
package... I suppose we now try to reconstruct what the API really is
based on the exported types and how they get used in reality.

That's a fair assessment, but I do not believe we need to be absolutely rigid with regard to the existing API. The cost of moving all callers on to a new and improved API may be worth it for said API.

If the impact on users is comparable (minor code change to use new API) then we should focus on which API is preferable. The options I see here are,

  1. Add an accessor with defaulting behavior
  2. Add a constructor with defaulting behavior
  3. Document that users must explicitly provide DefaultIDGenerator or another generator.

Option 1 is the worst fit with the existing API. The Config type has
no methods or accessors today. Defining one for a single field is
likely to be missed by someone when they access every other field
directly.
Option 2 provides the most forward compatibility, since a constructor
function can usually adopt a functional
options

pattern.

That would also be my preferred solution. But let's postpone that until
we need to break the API for some other reason.

As a short-term solution let's fix the missing initialization, which is
the alternative solution that I mentioned in #220 (comment)

I've updated this PR with that code.

Option 3 is updating the comments to describe the API implemented
rather than the one intended. This may be reasonable, given how few
users this library has.
Regarding DeepCopy,

That the tests use the user-supplied instance is part of the API, too.

If a behavior is observable, that does not mean it must be treated like an API. (see also, Hyrum's Law) While it is good to avoid breaking users of this library, it is also good to avoid accumulating tech debt.

True. But this aspect of the package is useful, so it should be part of
the API.

Are you referring to the nil panic fix, or the fact that test break if you DeepCopy a Config? I don't see how the latter is useful.

Just as a side comment, the
godoc
for this package does not make it easy to figure out how to use this
package. Is there documentation elsewhere that users should be
reading?

Unfortunately not. New users probably discover it based on usage in
other projects, which then also serves as example for how to use it.

It can always be improved in the future! May also be a good-first-issue task for a learning contributor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to the nil panic fix, or the fact that test break if you DeepCopy a Config? I don't see how the latter is useful.

Shared ownership of Config with the ability to modify it later is useful in a Ginkgo suite due to the way how tests are run: there's registration of tests at init time, then code from the caller is executed (custom BeforeEach), then the code of each test runs. But that's a discussion for another time. For now let's simply ensure that code using 2.1.0 can update to 2.3.0 without having to change how the package is used (no API change).

@davidz627 can you lgtm to get this merged?

When adding IDGen, only one of the two code paths that trigger testing
was updated such that it sets a default ID generator when the config
doesn't already have one. GinkgoTest was not updated.

As a result, existing E2E suites which use GinkgoTest and don't set the
new field crash at runtime with a nil pointer error when updated to
csi-test 2.2.0.
Copy link

@misterikkit misterikkit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot
Copy link
Contributor

@misterikkit: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@davidz627
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 27, 2019
@k8s-ci-robot k8s-ci-robot merged commit 477c7e9 into kubernetes-csi:master Sep 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants