This repository has been archived by the owner on Jun 20, 2023. It is now read-only.

IPLD Prime In IPFS: Target Merge Branch #36

Merged: 16 commits into master on Aug 12, 2021

Conversation

hannahhoward
Contributor

Official merge branch for IPLD Prime In IPFS effort.

Switches usage over to go-fetcher

See: #34

Contributor

@aschmahmann aschmahmann left a comment

Still working my way through the libraries and some of the issues I raise might come lower down the stack, but wanted to get a few thoughts out there while I'm still reviewing.

DAG ipld.NodeGetter

ResolveOnce ResolveOnce
FetchConfig fetcher.FetcherConfig
Contributor

What's the advantage of exposing this as a fetcher.FetcherConfig struct as opposed to some interface?

Perhaps this question is really about go-fetcher, but it looks like in the non-test code here we really just need something that exposes NewSession(ctx) fetcher.Fetcher
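
For illustration, the narrow interface being suggested might look something like this (a hypothetical sketch; the interface name is invented here, not part of go-fetcher):

```go
import (
	"context"

	"github.com/ipfs/go-fetcher"
)

// Hypothetical: the one capability the non-test code appears to need.
type SessionFactory interface {
	NewSession(ctx context.Context) fetcher.Fetcher
}
```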

Member

I'd vote weakly (both weak vote and weakly held position) that that might be premature abstraction. If there's probably only that one shape of config fields in the fetcher component, we might as well admit it, because it'll be more discoverable and easier to autocomplete. But, I don't have strong feelings about this, not having a super complete grasp of how diverse fetcher config might be planned to become.

Contributor Author

it ended up being an interface anyway! (due to various other changes happening in go-fetcher) :)

Comment on lines 51 to 55
func NewBasicResolver(ds ipld.DAGService) *Resolver {
func NewBasicResolver(bs blockservice.BlockService) *Resolver {
fc := fetcher.NewFetcherConfig(bs)
Contributor

We previously took in a DagService here which, on the read end, is being replaced by the Fetcher. Shouldn't we pass in a Fetcher interface of some sort instead of a blockservice which then gets wrapped?

Perhaps it's the same outcome either way, but IIUC part of the "ipld-prime" way here is to be dealing with things on a DAG/node rather than block level which means that the thing we want is something that cares about DAGs rather than blocks.

WDYT about having something like a FetcherSession (or better name) interface that just implements NewSession(context.Context) Fetcher? It might make writing tests simpler and allow us to more easily make changes later (i.e. if we need to extend the input interface to do more than either the structs we're using already support then new functionality and the change is easy, or they don't and the change would have been hard either way).
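
Concretely, the FetcherSession idea might look like the following (same imports as the sketch above; the constructor body is invented for illustration and assumes the Resolver's field type changes accordingly):

```go
// FetcherSession is the narrow dependency the resolver would consume.
type FetcherSession interface {
	NewSession(ctx context.Context) fetcher.Fetcher
}

// NewBasicResolver accepts the interface instead of wrapping a
// blockservice.BlockService itself, which eases testing with fakes.
func NewBasicResolver(fs FetcherSession) *Resolver {
	return &Resolver{FetchConfig: fs}
}
```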

Contributor Author

this all ended up happening!

break
}
// resolve node before last path segment
nodes, lastCid, depth, err := r.resolveNodes(ctx, c, pathSelector)
Contributor

I'm likely missing something here, but why can't I just use resolveNodes for the full path selector/why do we need to stop and make the last path segment special?

Contributor Author

IPLD selectors will follow a link transparently. If we're going to match the current behavior of the function, which returns the CID of the block pointed to by the last segment of the path without loading the block itself, we need to stop the selector traversal before the block gets loaded. If we traverse the full path with a selector, the block will get loaded. Perhaps @warpfork can clarify

Member

@warpfork warpfork Aug 3, 2021

Yep, this is like the difference between stat and lstat in linux syscalls. Most of IPLD functions are behaviorally like stat; they'll traverse links implicitly. In particular, yeah, as currently written, traversal.Walk loads links before inspecting them to see if the link node is matched itself.

I'd be open to having more lstat-like functionality in go-ipld-prime. (If someone wants to do that, 👍 But assume that won't make this PR easier to land today, so, 🤷 )

Contributor

without loading the block itself

To make sure I understand, are you saying the issue is that if rootCID/foo/bar is a CID bafybar then we'll end up loading the block instead of just returning the CID? In the case where rootCID/foo/bar is an integer but rootCID/foo is a CID we'll have to load the rootCID/foo block anyway.

If so @warpfork any thoughts on if this is the way to resolve this? I'd think that basically all of the logic from this function would live in a selector traversal.

Contributor Author

@aschmahmann your summary is correct.

In practice, most pathing in IPFS is over protobuf nodes, where each segment really is a whole new node. We're trying not to incur any more actual block loads (each of which is a network fetch) than before, and I believe the code as written mirrors the current behavior in terms of network fetches.

In the long term, as @warpfork points out, this repository is highly transitional at this point, so there may not be elegant solutions to everything. We are shooting for as much backward compatibility as possible.
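
To make the mechanism concrete: a path-to-selector helper in the spirit of the pathLeafSelector referenced later in this review might look like the following. This is a sketch against the go-ipld-prime selector builder, not the PR's actual code; import paths and details are assumptions.

```go
import (
	"github.com/ipld/go-ipld-prime"
	"github.com/ipld/go-ipld-prime/node/basicnode"
	"github.com/ipld/go-ipld-prime/traversal/selector/builder"
)

// pathLeafSelector builds a selector that explores each path segment
// but only matches the node at the final segment, so the traversal can
// be stopped before the block behind the last link is loaded.
func pathLeafSelector(segments []string) ipld.Node {
	ssb := builder.NewSelectorSpecBuilder(basicnode.Prototype.Any)
	spec := ssb.Matcher()
	for i := len(segments) - 1; i >= 0; i-- {
		segment, inner := segments[i], spec
		spec = ssb.ExploreFields(func(efsb builder.ExploreFieldsSpecBuilder) {
			efsb.Insert(segment, inner)
		})
	}
	return spec.Node()
}
```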

Comment on lines 71 to 93
resolver.FetchConfig.NodeReifier = unixfsnode.Reify
node, lnk, err := resolver.ResolvePath(ctx, p)
if err != nil {
t.Fatal(err)
}

uNode, ok := node.(unixfsnode.PathedPBNode)
Contributor

Is UnixFS really required here, aren't we just traversing DagPb nodes?

Member

iiuc this go-path library is really all for unixfsv1 semantics. If we just wanted to traverse over dagpb nodes directly, we'd... traversal.* over ipld.Node, and there'd be nothing here.

I'm hoping in the longer run, this whole library will also just disappear into being a "normal" traversal.* over these things from the unixfsnode package, which are actually ADLs that are making the unixfsv1 semantics legible as normal Node step-wise semantics. IIUC, where we're at now, with this PR, is that we're eroding go-path into becoming this (but not getting rid of go-path yet, because incrementalism/scope-avoidance/you-know).

What this is testing is that the NodeReifier hook system is actually producing one of those ADLs.

Contributor Author

Yeah, what we're trying to do is maintain current behavior -- which is pathing by string -- which necessitated the unixfsnode library.

Contributor Author

Even with regular protonodes -- not unixfs files/directories -- you need the ADL to do pathing by string.
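
A sketch of what that looks like in practice (hedged: the call shape follows go-ipld-prime's NodeReifier hook signature, which unixfsnode.Reify satisfies per the test diff above; the helper and link names are invented):

```go
// lookupByName wraps a plain dag-pb node in the PathedPBNode ADL, which
// presents the node's links as map entries keyed by link name, then
// resolves one segment by string -- even though the node is not UnixFS.
func lookupByName(dagpbNode ipld.Node, lsys *ipld.LinkSystem, name string) (ipld.Node, error) {
	adlNode, err := unixfsnode.Reify(ipld.LinkContext{}, dagpbNode, lsys)
	if err != nil {
		return nil, err
	}
	return adlNode.LookupByString(name)
}
```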

go.mod Outdated
Comment on lines 6 to 20
github.com/ipfs/go-blockservice v0.1.4
github.com/ipfs/go-cid v0.0.7
github.com/ipfs/go-datastore v0.4.5
github.com/ipfs/go-fetcher v1.2.0
github.com/ipfs/go-ipfs-blockstore v0.1.4
github.com/ipfs/go-ipfs-exchange-offline v0.0.1
github.com/ipfs/go-ipld-cbor v0.0.3
github.com/ipfs/go-ipld-format v0.2.0
github.com/ipfs/go-log v1.0.4
github.com/ipfs/go-merkledag v0.3.2
github.com/ipfs/go-unixfsnode v1.1.1
github.com/ipld/go-ipld-prime v0.9.1-0.20210402181957-7406578571d1
github.com/stretchr/testify v1.7.0
Contributor

It kind of bothers me that we've ended up with so many extra dependencies here, even though they're mostly just used for testing. I feel like @warpfork may have some thoughts/suggestions here.

Member

@warpfork warpfork Apr 19, 2021

Yeah... there's a mixture of things going on here though...

  • I bet a couple of these are new because the go mod tidy tool is listing more things lately (and they're not actually new dependencies).
  • Yeah, I'd like this list to be less...
  • ... but testing is a predominantly worthy cause, if that's what's causing them to appear.
  • It's maybe not worth spending a ton of time on it in this repo, if we hope to erode this repo into the abyss entirely. (But let's watchdog intensely on the things that replace it, yeah.)
  • The real problem, no matter which way we slice it, is... if you really wanna be sad, go look at the lock file. Somehow we've got transitive dependencies sprawling all the way to etcd (??) and btcd (?!) and gopherjs (??!). I don't know which of these direct dependencies has those wildly out of control transitive sprawls, but that's probably the thing I'd worry about first.

Contributor Author

There are a few fewer now. Some are just new dependencies (the fetcher). A few are, yes, for testing. While we could drop a bunch of these by switching away from protonodes, I'd rather not -- the main consumer of this functionality is UnixFS, so it feels important that we ship v0.10.0 with the main exercised code path tested.

acruikshank and others added 7 commits July 22, 2021 11:22
* first pass

* Update resolver/resolver.go

Co-authored-by: Eric Myhre <hash@exultant.us>

* update dependencies to tagged versions

* correctly handles nested nodes within blocks

* return link from resolve path so we can fetch container block

* return expected NoSuchLink error

* more accurate errors

* feat(resolver): remove resolve once

remove ResolveOnce as it's no longer used and is just confusing

Co-authored-by: acruikshank <acruikshank@example.com>
Co-authored-by: Eric Myhre <hash@exultant.us>
Co-authored-by: hannahhoward <hannah@hannahhoward.net>
resolve go vet and staticcheck issues. note we had to ignore two lines that use deprecated
behavior, where replacing it could have unintended effects
@hannahhoward
Contributor Author

Note: I had to update go-log because of dependencies.

This module uses EventBegin in logging, and I've for the moment told the linter to ignore the Deprecation warnings, as switching away from EventBegin feels out of scope for this ticket. (also I don't know what we tend to replace it with)
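
For reference, the kind of suppression being described might look like this (a sketch; SA1019 is staticcheck's deprecated-usage check, and the call site here is invented, not quoted from this module):

```go
//lint:ignore SA1019 switching away from the deprecated EventBegin is out of scope here
evt := log.EventBegin(ctx, "resolvePath")
defer evt.Done()
```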

@warpfork
Member

warpfork commented Aug 4, 2021

Fairly large comment coming. It will be non-blocking for this PR, but is triggered by my journey trying to understand the PR. You can skip it, unless you're thinking about where we're going with all this in the future.

TOC:

  • We have a lot of repos
  • Discovery log
  • Summary

We have a lot of repos

In reviewing this, I got the feeling I'm implicitly reviewing things across a number of other repos.

And indeed: I drew a graph of go modules that reach between go-ipfs and go-ipld-prime as of today's tip of PR7976, and the current versions any of them reference amongst themselves:

[image: module graph reaching between go-ipfs and go-ipld-prime as of PR7976]

It's pretty hairy.

Some of these (especially some of the sheer versions within each module) should go away by the time we're done with the mergeparty. Some of the complexity will stick around. And some of these repos are new (which frankly terrifies me, by default, unless they're outweighed by the number of repos disappearing).

I want to understand:

  • which things were legacy, but are already removed by the end of this mergeparty; vs
  • which things were legacy, but will still stick around at the end of this mergeparty, in a deprecated/transitional state; vs
  • which things are new but only intended as a transitional state; vs
  • which things, either new or old, we intend to have stick around.

More things in those middle two states means we've got more work to do in the future, but also means there's a broader range of what we frame as acceptable and successful for the current work cycle.

☁️ ☁️ ☁️ ☁️ ☁️

discovery log

This section is going to be high detail and high noise. Consider skipping it and going straight to the Summary at the end.

The following notes are stream of consciousness.

And I'm going to brain-OOM at least once in the middle of these notes. I'm confessing that very intentionally: I think other people will probably also brain-OOM when trying to chart this, because it's legitimately a maze. We'll need to keep this in mind when allocating time for work (and for planning work!) in the future. If we don't make good roadmaps, and document intended scope for repos... we'll get stuck doing re-discovery like this, and it's both time-consuming, and ultimately reduces the odds of success in reaching a simpler state by the end of renovations. Roadmaps and scope docs are good. Not time-cheap to make. But worth it.

Okay! Here we go:

  • go-path (this repo) is now depending on go-fetcher.
    • This mostly replaces dependencies it previously had on go-ipld-format.DAGService. We consider this a good thing, and a first class goal. (go-ipld-format.DAGService depends on the go-ipld-format.Node interface, which, broadly speaking, we're trying to replace with go-ipld-prime.Node.) Okay. 👍
  • go-fetcher is a bunch of glue code, and in particular, wraps a blockservice.BlockService from https://github.com/ipfs/go-blockservice into something that can be communicated with using go-ipld-prime.Node.
    • (This repo may have had more aspirations than this when created, but I don't know if that remains true. I'm calling what I see.)
    • I'm going to assume this accomplishes connecting new code to important parts of existing codebases we still need to plug together. Okay. 👍 But let's recurse into that and try to check that assumption...
  • Is go-blockservice a temporarily convenient place to shore up the renovation wave, or something we want to continue to route energy through for a long time to come?
    • the go-blockservice readme is mostly a TODO mentioning it would be nice if "blockservice" and "blockstore" were more the same thing.
      • I'm going to stop recursing into understanding that right now. I'm starting to think this is some part of the legacy world that we should be trying to continue moving away from in the future, somehow (in large part due to this lack of clarity).
    • The one thing I do know is that a lot of existing critical code for actual data-on-disk implements ... wait. "datastore"? Things like go-ds-flatfs implement "datastore"? What's... that? I am now juggling the concepts of "datastore", "blockstore", and "blockservice". How do I get from a "datastore" to a "blockservice"? Are there supposed to be this many layers here? Do the number of layers provide value that's worth their complexity? Are they even layers? (A sketch of the conventional layering follows after this list.)
    • Brain OOM, stack reset. 😵
    • I'm going to assume that someone looked at this, and determined that go-fetcher can get work done by wrapping this blockservice thing, and that has proved sufficient for now, and this at least an acceptable temporary convenient place to shore up any renovation waves. I will assume I currently know nothing else about a timescale on which things below this may be revisited. 🤕 Okay. Stack pop, back up to the top.
      • This makes me struggle to analyze the degree to which go-fetcher should be seen as a permanent, intentional API nexus, or if it's a transitional one, and could be eventually bundled back into some other repo for reduced friction and repo fragmentation.
  • go-path (this repo) is now depending on go-unixfsnode.
    • go-unixfsnode is probably our pride and joy out of this whole raft of changes and repos. We're gathering together all the unixfs and pathing logic there, in one place, and cleaning it up to be an ADL at the same time, which considerably clarifies the logic, standardizes the APIs, etc, and even gets us closer to new features like "selectors over unixfs". Moving things to depend on this is 👍 👍 👍
    • So what is go-path left with? It's all facades to minimize change and reduce the amount of refactoring for things downstream of go-path.
  • go-path is also, still, depending on go-ipld-format. (The thing we're most pointedly trying to move off of.)
    • Some of go-path's public interfaces are returning things like go-ipld-format.NodeGetter, go-ipld-format.Node, and go-ipld-format.Link (which, n.b., is the struct with things like size fields in it, not just a CID, which is what we usually mean when we say "link" in modern conversations). We can't get rid of these yet because we're trying to minimize change.
  • Recheck: why are we trying to minimize change here in go-path? Who are the downstreams, and what makes this repo's public APIs a good place to try to hold?
    • The two things that depend on go-path that I can see are: go-ipfs and go-mfs.
    • If it was just go-ipfs, I'd say: "let's just move this into the go-ipfs repo; it'll make future work have less friction".
    • I guess we're probably unwilling to recurse into renovating go-mfs. I have heard it's complex over there for some reason.
    • Okay. ✅
  • Is the go-path/resolver.Resolver type at all sensible anymore?
    • Probably not, frankly. Questioned previously in another review.
    • But again, apparently the choice made in this PR is that it's sticking around because we're trying to avoid having to change more downstreams today.
  • Note that currently go-unixfsnode does not depend on go-fetcher, nor vice versa.
    • That seems good and correct. 👍 👍
    • go-path is drawing them together.
    • If go-path is renovated out of existence in the future, the glue joining go-unixfsnode to something that does IO like go-fetcher will have to move... somewhere. That is a decision for down the road. Shouldn't be a problem. 🆒
  • go-path (this repo) depends on go-merkledag.
    • I think we're mostly trying to consider go-merkledag legacy and try to move away from it, too. (It's got icky stuff like weirdly protobuf-specific APIs all over the place, etc.)
    • but since go-path as a whole is a transitionary state thing we're going to try to move off of... we won't waste time worrying about it depending on other transitionary-move-away-plz stuff.
  • Not seen in this repo, but in the neighborhood:
    • go-ipfs, interface-go-ipfs-core, and go-mfs all still also depend on go-unixfs (as opposed to go-unixfsnode, which is more modern). This is probably all "future work". Okay. ✅
    • go-mfs is referring to a pretty old version tag of go-path? It's probably already tested to be not a problem with the new version, though, because Minimum Version Selection from go-ipfs should've already caused it to be tested together.
      • But it may be worth bumping go-mfs's version references anyway, just to reduce the range of things we have to consider in the future?

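On the layering question flagged in the list above: the conventional chain goes datastore → blockstore → blockservice → fetcher. A minimal sketch, assuming the APIs of go-datastore, go-ipfs-blockstore, go-blockservice, go-ipfs-exchange-offline, and go-fetcher as pinned in this PR's go.mod:

```go
import (
	"github.com/ipfs/go-blockservice"
	datastore "github.com/ipfs/go-datastore"
	dssync "github.com/ipfs/go-datastore/sync"
	fetcher "github.com/ipfs/go-fetcher"
	blockstore "github.com/ipfs/go-ipfs-blockstore"
	offline "github.com/ipfs/go-ipfs-exchange-offline"
)

func newOfflineFetcherConfig() fetcher.FetcherConfig {
	// datastore: raw key/value persistence (go-ds-flatfs on disk; a map here)
	ds := dssync.MutexWrap(datastore.NewMapDatastore())
	// blockstore: content-addressed blocks (CID -> bytes) over a datastore
	bs := blockstore.NewBlockstore(ds)
	// blockservice: a blockstore plus an exchange (offline: no network fetch)
	bsvc := blockservice.New(bs, offline.Exchange(bs))
	// fetcher: wraps the blockservice so callers work with go-ipld-prime Nodes
	return fetcher.NewFetcherConfig(bsvc)
}
```
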
☁️ ☁️ ☁️ ☁️ ☁️

summary

Let's answer those original questions by color-coding these repos:

  • ⚫ = legacy, successfully removed by the end of this mergeparty.
  • 🟤 = legacy, but failed to remove yet: will still stick around after the end of this mergeparty, and still considered in a deprecated/transitional state.
  • 🟠 = new but only intended as a transitional state.
  • 🟢 = either new or old, we intend to have stick around.

Green and black are good. Brown and orange mean there's still work to do in future cycles.

Repos:

  • go-path: 🟤
  • go-fetcher: 🟠 (... I think? unclear, see fulltext above)
  • go-unixfsnode: 🟢
  • go-unixfs: 🟤
  • go-merkledag: 🟤
  • go-ipld-format: 🟤
  • (and there are a couple other new 🟠 too, but not in the reach beneath this repo, so I'll elide them from this post.)
  • go-ipld-format.DAGService (not at repo scale): ⚫ (yay! got one!)

Okay. Ouch. That's a lot of brown circles. We didn't get a lot of successful removes. That means we're definitely not done renovating. And it also means we didn't score a lot of wins for overall complexity reduction yet (-> we might not actually be scoring dev velocity improvements by the end of this PR & mergeparty, either).

I think we still consider this a good outcome. A lot of work was accomplished. (A lot of it is just at higher granularity than this repo/module-sized field-of-view can easily see.) There's one green circle that's a new thing that we're really happy about, and this PR here in go-path is bringing that into the fold and making it load-bearing. Win. We got a bunch of go-ipld-prime interfaces showing up higher and higher into this dependency graph. Win. Even if we do have a lot of these brown circles, they're increasingly becoming facades. Win.

And now we're at the end of a time-box for working on this.

Hopefully we're in a better position for the next round of work. And hopefully in that next round of work, we'll take a bunch more of these brown circles and move them to black.


Sorry for the verbosity. I just really needed to refresh on what the goalposts are. This PR is easier to review (and a lot easier to be happy about!) when one remembers that yeah, a lot of it is gonna smell janky because it's still constrained by legacy -- it's transitional code and we know it -- and the goal isn't to polish this, it's to move on quickly and try to replace most of this with other new APIs that are better, in the next round of work.

Knowing and agreeing that we're not looking for polish / sanitization / complexity-reduction out of this patchset makes it much easier to approve.

Member

@warpfork warpfork left a comment

I think I'm basically good with this.

The reached state is not attractive, but the reasoning about where to park scope for right now seems to hold.

go.mod Outdated
Comment on lines 13 to 14
github.com/ipld/go-codec-dagpb v1.2.1-0.20210330082435-8ec6b0fbad18
github.com/ipld/go-ipld-prime v0.9.1-0.20210402181957-7406578571d1
Contributor

TODO: When we're ready to merge let's use tagged versions

Contributor Author

yes that is understood.

// Resolver provides path resolution to IPFS
// It has a pointer to a DAGService, which is uses to resolve nodes.
// It reference to a FetcherFactory, which is uses to resolve nodes.
Contributor

Perhaps "It references a" or "It has a reference to a"


Comment on lines 139 to 141
if len(nodes) < 1 {
return nil, nil, fmt.Errorf("path %v did not resolve to a node", fpath)
}
Contributor

Is this allowed to not equal 1? Since we're using a pathLeafSelector shouldn't we only match the last element in the path?

Contributor Author

I don't see any instance in which it could be greater than 1, based on the way selectors work.

It should equal exactly 1, assuming the entire path was resolved -- if it wasn't, then it will have 0 nodes, indicating a failure.

}

// ResolveSingle simply resolves one hop of a path through a graph with no
// extra context (does not opaquely resolve through sharded nodes)
func ResolveSingle(ctx context.Context, ds ipld.NodeGetter, nd ipld.Node, names []string) (*ipld.Link, []string, error) {
// Deprecated: fetch node as ipld-prime or convert it and then use a selector to traverse through it.
func ResolveSingle(ctx context.Context, ds format.NodeGetter, nd format.Node, names []string) (*format.Link, []string, error) {
return nd.ResolveLink(names)
}

// ResolvePathComponents fetches the nodes for each segment of the given path.
// It uses the first path component as a hash (key) of the first node, then
// resolves all other components walking the links, with ResolveLinks.
Contributor

Does the code or the comment here need to change? Why aren't we using ResolveLinks as before, do the functions not quite match up anymore, is it an efficiency thing, etc.?

Contributor Author

It should change and I will make said change.

As for why they don't quite operate the same any more... the functions just don't line up in quite the same way, and it ends up being easier to separate them.

Comment on lines +88 to +90
uNode, ok := node.(unixfsnode.PathedPBNode)
require.True(t, ok)
fd := uNode.FieldData()
Contributor

Is this correct? randNode() generates a DagPB node that is likely not valid UnixFS.

Should we be checking the "Data" field in the IPLD node instead, or am I missing the point here?

Contributor Author

@hannahhoward hannahhoward Aug 6, 2021

The go-unixfsnode repo should probably be renamed something along the lines of "ADLs for DagPB", because it's not entirely just UnixFS.

"PathedPBNode" is an ADL that's not quite UnixFS but also not just DagPB with no additional logic.

Basically, it turns a non-unixfs DagPB node into a map where the keys are the Link names.

We end up needing this for path traversal even when we're not dealing with unixFS.

Perhaps this particular ADL should not live in the go-unixfsnode repo. I don't particularly feel like it's pressing to move it right this second though.

Contributor

IIUC this is interesting. It seems like we currently (i.e. before the IPLD prime PRs) support pathing through DagPB (not UnixFS) data by name even when ignoring the UnixFS resolver. This also seems to preclude any ability to address by path the Data fields in DagPB nodes or the Size fields in their links.

AFAICT this seems like a mistake since it prevents some of the DagPB data from being addressable by pathing. This is likely something we'll need to resolve as part of ipfs/kubo#7976 (comment).

fix(resolver): LookupBySegment to handle list indexes as well as map fields (#42)

* fix(resolver): LookupBySegment to handle list indexes as well as map fields

* Add test for /mixed/path/segment/types/1/2/3
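
The fix described by that commit plausibly takes a shape like the following (a hedged sketch, not the actual diff; kind names follow go-ipld-prime, and the function name and error text are invented):

```go
import (
	"fmt"
	"strconv"

	"github.com/ipld/go-ipld-prime"
)

// lookupBySegment resolves one path segment against a node, treating the
// segment as a map key for maps and as a numeric index for lists.
func lookupBySegment(nd ipld.Node, seg string) (ipld.Node, error) {
	switch nd.Kind() {
	case ipld.Kind_Map:
		return nd.LookupByString(seg)
	case ipld.Kind_List:
		idx, err := strconv.ParseInt(seg, 10, 64)
		if err != nil {
			return nil, fmt.Errorf("segment %q is not a valid list index", seg)
		}
		return nd.LookupByIndex(idx)
	default:
		return nil, fmt.Errorf("cannot path into node of kind %s", nd.Kind())
	}
}
```
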
@hannahhoward
Contributor Author

hannahhoward commented Aug 6, 2021

@warpfork regarding your larger analysis, I largely agree with your conclusions.

I would say the goals of this work were twofold:

  1. Unlock a couple key stakeholder use cases, namely using DAG-JOSE
  2. Learn more about what is needed to fully replace our old ipld libraries in IPFS

I would say overall these two goals have been achieved. With the caveat that I think we learned more about some things than others.

go-unixfsnode, while somewhat inadvertent, turned out to be the most fruitful exploration here, teaching us a lot about what ADLs can do, and how to actually get rid of go-unixfs eventually.

go-fetcher in my mind was originally intended to eventually replace go-merkledag and go-blockservice, but that would require a bunch more work. Moreover, I'm not sure we've proved out the interfaces by any means. I suspect the integration with Filecoin will help drive clarity on what our top level "go get me some stuff, either from disk or the internet" interface should ultimately look like. That's why I'm personally happy it did not make it on to the IPFS Core API. I think we learned something, but not as much as I'd hoped. As you say, right now it's mostly glue code and utility functions on top of go-ipld-prime & go-blockservice.

While we haven't "won" yet in the sense of making the code simpler, ultimately, almost every single node now used in IPFS has an ipld-prime implementation underneath (mostly due, for better or worse, to our go-ipld-legacy shim layer). This was always part 1 of many, and I really would like to get it merged soon so it's not hanging out there forever.

@hannahhoward
Contributor Author

@aschmahmann I've addressed the various PR comments and I hope we're at the point of approval?

@BigLep

BigLep commented Aug 10, 2021

@aschmahmann is going to leave a comment on some weirdness he sees in the test.

Contributor

@aschmahmann aschmahmann left a comment

Left a nit, but otherwise seems good to go here.

@hannahhoward hannahhoward merged commit ea3a116 into master Aug 12, 2021