Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop on Node should not be recursive #229

Open
apoelstra opened this issue Jul 1, 2024 · 4 comments
Open

Drop on Node should not be recursive #229

apoelstra opened this issue Jul 1, 2024 · 4 comments

Comments

@apoelstra
Copy link
Collaborator

We've done a great job eliminating recursion throughout the library, aside from in the typechecker. One subtle place is in the default Drop impl on Node. This should be manually implemented as something that uses a post-order iterator to replace each node with Inner::Unit, iteratively dropping the original.

I took a crack at this and couldn't figure it out after ten minutes but it's definitely possible.

@uncomputable
Copy link
Collaborator

Does that mean that dropping a deeply nested node can result in a stack overflow, as of now?

@apoelstra
Copy link
Collaborator Author

Yep. It's not too hard to write a test case -- you can construct a take(take(take(...(unit)...))) iteratively then do nothing with it but drop, and it'll blow out the stack inside the drop glue for Arc.

@uncomputable
Copy link
Collaborator

Ok, wow. I thought that Drop cannot stack-overflow because it is implemented in an iterative manner. Then I thought about that algorithm and I realized that it cannot be done in valid Rust. Every type has a distinct implementation of Drop so the generic algorithm that works for all types has to be recursive, it seems. We can avoid recursion by manually implementing Drop for a specific type. Is that correct?

@apoelstra
Copy link
Collaborator Author

Yes. To avoid recursion you need to manually implement Drop in a way that you iterative drop its children leaving the original object "disabled" in some way (e.g. replacing it with a unit node using mem::replace) that the recursive drop logic will then be trivial.

uncomputable added a commit that referenced this issue Jul 4, 2024
c22dbaa types: drop BoundMutex and instead use references into the type context slab (Andrew Poelstra)
2e24c49 types: pull unify and bind into inference context (Andrew Poelstra)
a26cf7a types: remove set and get methods from BoundRef (Andrew Poelstra)
33f58fa types: introduce BoundRef type, use in place of Arc<BoundMutex> in union-bound (Andrew Poelstra)
021316c types: abstract pointer type in union-bound algorithm (Andrew Poelstra)
eccc332 types: add &Context to recursive type constructors (Andrew Poelstra)
65b35a9 types: add &Context to type constructors (Andrew Poelstra)
8e08900 types: make `bind` and `unify` go through Context (Andrew Poelstra)
8eeab8f types: introduce inference context object, thread it through the API (Andrew Poelstra)
9b0790e cmr: pull Constructible impl on Cmr into an impl on an auxiliary type (Andrew Poelstra)

Pull request description:

  Our existing type inference engine assumes a "global" set of type bounds, which has two bad consequences: one is that if you are constructing multiple programs, there is no way to "firewall" their type bounds so that you cannot accidentally combine type variables from one program with type variables from another. You just need to be careful. The other consequence is that if you construct infinitely sized types, which are represented as a reference cycle, the existing inference engine will leak memory.

  To fix this, we need to stop allocating type bounds using untethered `Arc`s and instead use a slab allocator, which allows all bounds to be dropped at once, regardless of their circularity. This should also improve memory locality and our speed, as well as reducing the total amount of locking and potential mutex contention if type inference is done in a multithreaded context.

  This is a 2000-line diff but the vast majority of the changes are "API-only" stuff where I was moving types around and threading new parameters through dozens or hundreds of call sites. I did my best to break everything up into commits such that the big-diff commits don't do much of anything and the real changes happen in the small-diff ones to make review easier.

  By itself, this PR does **not** fix the issue of reference cycles, because it includes an `Arc<Context>` inside the recursive `Type` type itself. Future PRs will:

  * Take a single mutex lock during calls to the top-level `bind` and `unify` calls, so that these all happen atomically, including all recursive calls.
  * Add another intermediate type under `Type` which eliminates the `Arc<Context>` and its potential for circular references. Along the way, make the `Bound` type private, which is not really used outside of the types module anyway.
  * Do "checkpointing" during type inference that makes node construction atomic; this is #226 which is **not** fixed by this PR.
  * (Maybe) move node allocation into the type inference context so that nodes can be slab-allocated as well, which will address #229 "for free" without us figuring out a non-recursive `Drop` impl for `Arc<Node<N>>`.

ACKs for top commit:
  uncomputable:
    ACK c22dbaa

Tree-SHA512: 0fd2fdd9fe3634068d67279d517573df04fafa60b70e432f59417880982ad22e893822362973f946f1deb6279080aec1efdd942dfd8adad81bbddc7d55077336
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants