Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a 'skip' parameter to writev1 so that the beginning of a car can … #291

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

willscott
Copy link
Member

@willscott willscott commented Jan 26, 2022

Remaining:

  • cleaner/more efficient teeing implementation (remove un-needed buffer copies)
  • Add test demonstrating skipping arbitrary prefixes of the car

v2/selective.go Outdated Show resolved Hide resolved
v2/selective.go Outdated Show resolved Hide resolved
@willscott
Copy link
Member Author

The remaining potential work here is allow skip to be an arbitrary byte offset in the car file. currently it's limited to block boundaries, and because of the API it's difficult to know which block boundary nearest to the requested skip offset has been used.
If it's acceptable for the caller to skip only to block boundaries, this could be sufficient as-is, otherwise the wrapper for the individual reader needs to be updated to also partially skip.

@willscott
Copy link
Member Author

I cleaned up the tee-ing writer and added a test that seems to verify that we can use this to advance to arbitrary byte offsets in a car.

@rvagg maybe can provide a first round of review

type counter struct {
totalRead uint64
// Counter tracks how much data has been read.
type Counter struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we use a struct for this? is it something to do with needing a pointer to it? why isn't a *uint64 good enough for this purpose?


// TraverseResumer allows resuming a progress from a previously encountered path in the selector.
type TraverseResumer interface {
RewindToPath(from datamodel.Path) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

am I right in assuming we don't have a use-case for this at the moment, that we're just supporting it because it's not a difficult extension of doing it to an offset?

got thoughts on what a use-case for might me?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the most useful thing to have pointed out. I think the more useful method to expose here will be a RewindToOffset in addition to the current RewindToPath. I'll work on adding that

v2/selective.go Outdated Show resolved Hide resolved
var _ io.ReadSeeker = (*SkipWriterReaderSeeker)(nil)

// NewSkipWriterReaderSeeker creates an io.ReadSeeker around a ReWriter.
func NewSkipWriterReaderSeeker(ctx context.Context, size uint64, cons ReWriter) *SkipWriterReaderSeeker {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we have a better name than cons? it's not obvious to me what this name is supposed to be (consume?), it's puzzling when we get to c.cons() and it seems like it should be more obvious what it's doing - should it be read as "start consuming now"?

v2/selective.go Show resolved Hide resolved
v2/selective.go Show resolved Hide resolved
@rvagg
Copy link
Member

rvagg commented Aug 10, 2022

This is some super-impressive code @willscott, once I got my head around it. The scope is large and it's doing some really complex stuff and I can't really fault it from a review. Most of my suggested changes are just comments I think might be helpful for anyone else following the breadcrumbs in future.

I'm now toying with integrating it into Boost and will report back.

v2/selective.go Outdated Show resolved Hide resolved
@rvagg
Copy link
Member

rvagg commented Aug 12, 2022

(FYI I messed this branch up a tiny bit so ended up rebasing it on master, removing the merge commits in the process, so the commits have changed but the resulting code is the same)

@willscott
Copy link
Member Author

@rvagg are we able to merge this - was there something else we were waiting for on it?

@rvagg
Copy link
Member

rvagg commented Sep 27, 2022

Yeah, it needs #327 to get closer to working as advertised, but even then it's still not quite right. I stepped back from working on this as the complexity increased and I started questioning the utility since we mainly want this (for now) for the explore-all case and it's the selectors that make this difficult to properly do.

As it is, it can't replace Boost's impl because of caching enabled by explore-all shortcuts that we're not allowing ourselves to touch here.

I'd be fine merging in #327 to here, but I'm not sure we should merge this in since it's not going to give people what it says it does.

@rvagg
Copy link
Member

rvagg commented Sep 27, 2022

I forgot to add that I'm not intending to suggest this is a lost cause. My thinking, the last time I touched this, was that we should try to implement an efficient explore-all case here, and then iterate on that for the more complex cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🏃‍♀️ In Progress
Development

Successfully merging this pull request may close these issues.

5 participants