Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add options to control caching of schemas #1018

Merged
merged 1 commit into from
Apr 16, 2024

Conversation

justin-tay
Copy link
Contributor

Closes #1016

This adds additional configuration options to control the caching / preloading schema behavior.

  • Option to not cache schema $ref. This means that after resolving a $ref during evaluation the schema loaded is discarded and won't be used for subsequent evaluation runs. This is set to a default of true to cache by default. This may be needed if there is a large permutation of evaluation paths due to multiple nested applicators like allOf, oneOf and anyOf which makes caching all evaluation paths take up too much memory.
  • Option to limit the max nesting depth when preloading $ref. This is set to a reasonable default of 40.

@stevehu stevehu merged commit ffec188 into networknt:master Apr 16, 2024
0 of 4 checks passed
@txshtkckr
Copy link

txshtkckr commented Jun 10, 2024

Any chance we could get a release with this option in it? I'm encountering this issue in production use: https://bitbucket.org/atlassian/adf-builder-java/issues/105

On a related note, looking at CachedSupplier, it holds on to its delegate indefinitely. I have not dug into the library enough to be sure if there are thread-safety concerns around making the delegate mutable and null-ing it out after using it, but that would allow it to release all the references captured in the lambda at the call point, which could greatly reduce the memory pressure that this is causing.

In a heap dump that I analyzed, the retained set from the RefValidator was 450M, and even a very sprawled eval path tree shouldn't have this kind of impact. But the supplier captures the validation context, among possibly other things, and the heap I'm looking at shows it having captured 90,000 of those. The impact of that is almost definitely not trivial.

@txshtkckr
Copy link

txshtkckr commented Jun 11, 2024

If thread-safety is an issue, an easy workaround would be to change the delegate field to volatile and write back a constant supplier after performing the calculation:

    @Override
    public T get() {
        if (cache == null) {
            T value = delegate.get();
            cache = value;
            delegate = () -> value;
        }
        return cache;
    }

This would release whatever garbage the supplier is holding while still providing the guarantee that the delegate is never null.

@justin-tay justin-tay deleted the gh1016 branch June 12, 2024 02:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Recursive calls leading to an OOM when parsing a valid schema
3 participants