Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

using validators with $ref #814

Open
CagtayFabry opened this issue Jun 24, 2020 · 2 comments
Open

using validators with $ref #814

CagtayFabry opened this issue Jun 24, 2020 · 2 comments

Comments

@CagtayFabry
Copy link
Contributor

CagtayFabry commented Jun 24, 2020

What is the recommended approach to adding validation on properties that are only defined as $ref ?

To give an example:
Let's say I have a property that is supposed to be a quantity:

type: object
properties:
  quantity_prop:
    description: |
      a simple quantity property
    $ref: tag:stsci.edu:asdf/unit/quantity-1.1.0

Now I have written and registered a simple custom validator called my_validator that runs some kind of check (details not important). I want to "trigger" this validation on the above quantity property.

My initial approach was rather naive:

type: object
properties:
  quantity_prop:
    description: |
      a simple quantity with custom validator
    $ref: tag:stsci.edu:asdf/unit/quantity-1.1.0
    my_validator: true

Unfortunately the above does no trigger execution of my_validator. Does this have something to do with how $ref is handled internally? Do other entries get dropped when $ref is "resolved"?

The workaround I ended up using was using allOf with a custom "empty" object that only triggers the validation. So this works:

type: object
properties:
  quantity_prop:
    description: |
      a simple quantity with custom validator using allOf
    allOf:
      - $ref: tag:stsci.edu:asdf/unit/quantity-1.1.0
      - type: object
        my_validator: true

However I find the first approach more flexible and concise.

@eslavich
Copy link
Contributor

This looks like an asdf bug. We "resolve" references to other schemas by loading them up and replacing the node containing the $ref with the external object. This means that any other property alongside that $ref is overwritten. Previous JSON schema drafts were vague, but the latest draft states explicitly that we should not be doing this:

   Attempting to remove all references and produce a single schema
   document does not, in all cases, produce a schema with identical
   behavior to the original form.

   Since "$ref" is now treated like any other keyword, with other
   keywords allowed in the same schema objects, fully supporting non-
   recursive "$ref" removal in all cases can require relatively complex
   schema manipulations.  It is beyond the scope of this specification
   to determine or provide a set of safe "$ref" removal transformations,
   as they depend not only on the schema structure but also on the
   intended usage.

We're planning to make a 2.7 release sometime within the next couple of weeks. I'll see about getting this fixed in time to include in that release.

Incidentally, we implement a custom validator for YAML tags that offers another way to sidestep this issue:

type: object
properties:
  quantity_prop:
    description: |
      a simple quantity with custom validator
    tag: tag:stsci.edu:asdf/unit/quantity-1.1.0
    my_validator: true

Instead of including all the quantity schema material in your schema, this checks the tag of the object only, which confirms the object's type without descending into the details. Structuring the schema this way actually saves the quantity object from being validated twice, since it will always be validated against its own schema when asdf recognizes its tag.

We don't make use of this technique in the core schemas yet, but lately I've been thinking we should, since validation is the slowest part of opening a file and it would improve performance to not repeat it.

@eslavich
Copy link
Contributor

I was confused about what is going on here. The asdf library only "resolves" $refs in schemas provided via the custom_schema option. For schemas associated with tags, the $ref material is descended into by the normal jsonschema validation mechanism.

... and the jsonschema code ignores other schema node properties when $ref is present. That happens here:

https://github.com/Julian/jsonschema/blob/v3.2.0/jsonschema/validators.py#L318

Going back to the language from the latest JSON schema draft:

Since "$ref" is now treated like any other keyword, with other
   keywords allowed in the same schema objects...

That seems to imply that previous drafts did not allow other keywords in the same schema objects alongside $ref. I can't at the moment find language in the draft we're using (Draft 4) that specifies this one way or another.

So it seems that this is a Draft 4 behavior that we can't do anything about. @CagtayFabry do you agree with that assessment? This is motivation to bump up to Draft 8 / 2019-09 with ASDF Standard 2.0.0. The only issue is that software support doesn't yet exist for the latest draft, but maybe it will by the time we need it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants