Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fasttraversal package; mutable path objects #45

Open
warpfork opened this issue Feb 14, 2020 · 6 comments
Open

fasttraversal package; mutable path objects #45

warpfork opened this issue Feb 14, 2020 · 6 comments

Comments

@warpfork
Copy link
Collaborator

warpfork commented Feb 14, 2020

It's not that the traversal package isn't fast. But it does do some things in the name of 'safety' that sort of prevent it from being really fast.

Namely, it copies traversal.Progress by value many times (might not be a huge issue; I think they all end up on the stack, or at least I sure hope they do; a benchmark to check this is warranted); and, it uses ipld.Path.Join heavily, which most definitely involves a new allocation every single time we traverse over a node. This latter one is pretty severe.

We could amortize a ton of these allocations used for creating Path values if we had a mutable implementation of Path, and just pre-allocated an overly large slice for it, and kept changing that one value in-place rather than creating a whole new Path for every single step we take in a traversal.

This is a big "if", and comes with big caveats, though. It becomes very very easy to misuse such an API where values have limited lifetimes in which they're correct, and yet at the same time we can provide no compile-time checks to ensure references to those values are never held outside of those lifetimes.

Making a fasttraversal package might be a good compromise. Code can be written against the traversal package until the user notices this is a performance issue; then the changeover should be easy. We could keep it with the exact same user-facing interface as the traversal package -- so you could change the import statement from "go-ipld-prime/traversal" to traversal "go-ipld-prime/fasttraversal" and make no other changes! -- but with the caveats that if you hold onto any part of the fasttraversal.Progress (and most importantly, its contained ipld.Path values) outside of your VisitFn, then your application will have incorrect/undefined behavior.

There is plenty of prior art for this. For a particular example, many git implementations I've seen (in a variety of languages!) have iterators which do exactly this thing with the mutable paths and lifetime caveat; it's a known issue that this will become a performance thing when you're flying over a lot of data like this quickly. So, although I find such mutability-heavy APIs a little scary for confident correctness reasons, there's definitely established references out there suggesting it's worth it.

This could be a bit tricky to implement. The ipld.Path implementation would also have to change to accomodate some of this, which is a change in the core package and thus touchy. I'd like to have the type system indicate if a path is going to be mutable, but that isn't possible without getting different visitor function interfaces, which might be undesirable. And having a fasttraversal.Progress type is also unfortunate for the same reasons of changing the visitor function interface defn; maybe this can be handled with type aliases, but I'm not actually sure if that's allowed or not.

@warpfork
Copy link
Collaborator Author

warpfork commented Mar 5, 2020

pprof-allocs-traversing

Here's some evidence for the cost of Path manipulation. It's definitely the biggest thing on the chart.

This comes after and includes performance improvements from #49 (if it didn't, you'd also see a lot of allocations in the use of iterators from boxing allocations during handling keys) -- once those fixes are implied, it's pretty much Path that's left as a major cost.

Iterators could also potentially be improved. See 7c595d6 for that; that commit didn't land, but has some useful notes; it's definitely doable.

I haven't taken a look at the selector.NewSegmentIterator part yet. Might be possible to improve that too.

@rvagg
Copy link
Member

rvagg commented Mar 6, 2020

Screenshot 2020-03-06 13 09 00

How does one read this, who owns that 10923032?

@warpfork
Copy link
Collaborator Author

warpfork commented Mar 6, 2020

Good question. Neither, I think -- that's the number that pass through that area. Note how there's a "0 of x" inside the boxes those arrows are attached to.

4456516+8367117=12823633. Everything's coming from the functions called below these.

And apparently 12823633-10923032=1900601 is the number of allocs owned by the last layer of recursions. (Probably not a particularly actionable detail, though.)

@warpfork
Copy link
Collaborator Author

warpfork commented Mar 6, 2020

There's a bunch of different lenses you can use to view this data, btw; I just posted a screenshot of one of the more visual all-in-one summaries. If someone was gonna drive at addressing this, they'd want to just take this as a broad starter cue, and then generate and inspect their own pprof data.

(I have some sparse writeups about how-to-pprof I'm slowly accumulating in a gist, in case it's helpful to anyone out there: https://gist.github.com/warpfork/b13d4e0afcfc571cbec8cb4373f61510 )

@rvagg
Copy link
Member

rvagg commented Mar 9, 2020

Ahh, Path/AppendSegment takes up 8M, selector/NewSegmentIterator takes up 2M and basic/MapIterator takes up 2M, accounting for the 12M coming through the top of the funnel.

👍

@Stebalien
Copy link

So, this is still showing up as a pretty big CPU sink in graphsync.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants