IPLD Pathing Redux #97

Closed
jbenet opened this issue Apr 30, 2016 · 7 comments
@jbenet
Member

jbenet commented Apr 30, 2016

A week or so ago, @nicola @dignifiedquire @Stebalien and I discussed lots of IPLD things in detail.

Among them, we discussed pathing again. One result here was that we revisited the "two pathing delimiters" and link-properties problem. @Stebalien and I got to something in the end, and many approve (including @dignifiedquire and @diasdavid). @mildred i would be very interested in your thoughts. And I believe this is more in line with your original thinking too, or at least things you and I considered.

In the meeting we recalled many observations from previous conversations, including:

  • we want the format to be as absolutely simple as possible, requiring very little conceptual overhead
  • pathing with two different delimiters gets complicated for users
  • the // shorthand for /@link/ increases conceptual complexity
  • breaking apart into layers (Layers 2, 3, 4) helps address graph shaping problems
  • the {"@link": "hash"} construction is getting around the problem that JSON does not allow representing a different value type natively (eg a link value). Other formats (CBOR, YAML) do not have this problem.

Right now we have:

{
  "mode": 0444,
} // hash = Qmfoo

{
  "foo": {
    "@link": "Qmfoo",
    "mode": 0755,
  }
}

> /foo
{
  "@link": "Qmfoo",
  "mode": 0755,
}

> /foo/mode
0755

> /foo/@link/mode
0444

> /foo//mode
0444
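
To make the two kinds of traversal concrete, here is a minimal sketch of a resolver for the current scheme. It is only an illustration, not the actual implementation: the in-memory `BLOCKS` store, the `Qmroot` hash for the second object, and the `resolve_current` name are all assumptions, and the modes are written as Python octal literals.

```python
# Hypothetical in-memory block store: hash -> decoded object.
# "Qmroot" is a made-up hash for the object that contains "foo".
BLOCKS = {
    "Qmfoo": {"mode": 0o444},
    "Qmroot": {"foo": {"@link": "Qmfoo", "mode": 0o755}},
}


def resolve_current(root_hash, path):
    """Resolve a path under the current scheme ('@link' objects, '//' shorthand)."""
    # Expand the '//' shorthand into an explicit '/@link/' step.
    path = path.replace("//", "/@link/")
    node = BLOCKS[root_hash]
    for segment in filter(None, path.split("/")):
        if segment == "@link":
            # Explicit link traversal: jump to the linked object.
            node = BLOCKS[node["@link"]]
        else:
            # Local property lookup, even on an object that carries "@link".
            node = node[segment]
    return node


print(oct(resolve_current("Qmroot", "/foo/mode")))
print(oct(resolve_current("Qmroot", "/foo/@link/mode")))
print(oct(resolve_current("Qmroot", "/foo//mode")))
```

Note that the caller has to know, at every step, whether they mean the local property or the linked object.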

Recalled and new observations were:

  • The "link properties" problem is introduced by using an object to represent links.
  • We use an object to represent links because other JSON formats (like EJSON) use objects like {"$date": "2016-01-01"} to represent other Value Types.
  • The "link objects" (such as a directory entry with a mode in a directory) are different from "link values". The current way to do things mixes these two.
  • We'll already have to re-path in layer 4 or layer 3 to address things like /foo/bar/baz representing a unixfs directory traversal. (ie. the shape of the data, and without // or /@link/ traversals)
  • Users will have to use /@link/ a lot all over the place.
  • There is no way to address the Link Value on its own.

We found a very nice middle ground between #1 in https://github.com/ipfs/ipld-examples/ and the current way to do things.

The proposal is to move to something like:

{
  "mode": 0444,
} // hash = Qmfoo

{
  "foo": {
    "entry": {"&": "Qmfoo"},
    "mode": 0755,
  }
}

> /foo/entry
{"&": "Qmfoo"}

> /foo/mode
0755

> /foo/entry/mode
0444

  • Instead of using {"@link": "<hash>"} to represent a Link Value, use {"&": "<hash>"}
  • Link Values are not objects, and SHOULD NOT include other properties.
  • Users can still create "link objects" to add properties to links
  • / is used for all traversal (local properties or link resolution)
  • No need for a // shorthand
  • Users are free to use link or whatever other property name they wish. This is not a reserved keyword.
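
A rough sketch of how resolution could work under the proposal, with / as the only delimiter. As before, the `BLOCKS` store, the `Qmroot` hash, and the function names are assumptions made for illustration, and treating "an object whose only key is &" as the test for a Link Value is one possible reading of the rules above.

```python
LINK_KEY = "&"

# Hypothetical in-memory block store: hash -> decoded object.
# "Qmroot" is a made-up hash for the object that contains "foo".
BLOCKS = {
    "Qmfoo": {"mode": 0o444},
    "Qmroot": {"foo": {"entry": {"&": "Qmfoo"}, "mode": 0o755}},
}


def is_link_value(node):
    """A Link Value is assumed to be an object whose only key is '&'."""
    return isinstance(node, dict) and set(node) == {LINK_KEY}


def resolve(root_hash, path):
    """Resolve a path where '/' covers both local properties and links."""
    node = BLOCKS[root_hash]
    for segment in filter(None, path.split("/")):
        # Stepping *through* a Link Value follows it transparently.
        while is_link_value(node):
            node = BLOCKS[node[LINK_KEY]]
        node = node[segment]
    return node


print(resolve("Qmroot", "/foo/entry"))
print(oct(resolve("Qmroot", "/foo/mode")))
print(oct(resolve("Qmroot", "/foo/entry/mode")))
```

A path that ends on a Link Value returns the Link Value itself, which is what makes it addressable on its own.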

To be clear, this is what #1 proposed all along, which has been championed by many of us along the way. It took a while for us to discover all the subtle, intricate problems that emerged otherwise.

Why @ instead of @link:

Why @? We did not want to use @link because we do not want to have things like:

{
  "link": {"@link": "Qmfoo"},
  "mode": 0755
}

Well then, maybe use &?

We can use any symbol for the Link Value. & was proposed (and seems to be winning favor) because:

  • @ is just a hold-over from JSON-LD.
  • & is commonly used across programming languages to describe "references"
  • $ is overused and could create problems.

However & may have problems:

  • It has meaning in URL query strings. (not sure this is a problem, i don't think it shows up in paths anywhere, but it might...)

CBOR formatting simplification

@dignifiedquire mentioned that this significantly simplifies the CBOR formatting implementation. I'll let him describe more as I don't recall the comments very well.

@jbenet
Member Author

jbenet commented Apr 30, 2016

Overall, this yields a much, much simpler pathing system, and the shape of the data is almost exactly the same as what we currently have.

I'd hate to change the format yet again, but this is very much a "now or never" type of thing (before lots of data exists with this), and many people are asking to do this.

@Stebalien
Member

I like & because it's the C-like reference operator but, just for completeness, what about /? Were we planning on allowing / in field names?

{
  "link": {"/": "Qmfoo"},
  "mode": 0755
}

@mildred
Contributor

mildred commented May 4, 2016

I quite like the whole idea of dropping extra link properties and have just plain links without other properties. It makes for a simpler design and is much more simple to use. I would compare it to XML which has node properties and node content <node prop="foo">value</node>.

I'm not sure about the & however. Perhaps we should allow for escaping it, or perhaps we can just live with the fact that we might get extra links when we meant none. I don't think that would be a problem.

@nicola
Member

nicola commented May 4, 2016

In other words, this is the "direct-link" that I originally thought IPLD had? And there are no link properties, there are just properties and the higher layer will know which ones are link properties?

@jbenet
Member Author

jbenet commented May 10, 2016

@Stebalien said

what about /?

i'm ok with {"/": "Qmfoo"} as well. helps, because just / should never be a name.

Were we planning on allowing / in field names?

there may be representations that would merit compressing: eg compress { a: { b: { c: { d: 1 } } } } into { a/b/c/d: 1 }.

this is easy to achieve by stating:

  • we consider keys in the sorted order
  • { a: { b: 1 } } = { a/b: 1 }
  • given { a: { b: 1}, a/b: 2 }, resolve "a/b" == 1
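
One way to read these three rules is sketched below. This is an assumption-laden illustration, not spec text: `resolve_compressed` and `_walk` are hypothetical names, and "consider keys in the sorted order" is interpreted as "try candidate keys in sorted order and take the first one that resolves".

```python
_MISSING = object()  # sentinel, so that None can be a legitimate value


def _walk(node, segments):
    if not segments:
        return node
    if not isinstance(node, dict):
        return _MISSING
    # Keys are considered in sorted order, so "a" is tried before "a/b".
    for key in sorted(node):
        parts = key.split("/")
        if segments[:len(parts)] == parts:
            found = _walk(node[key], segments[len(parts):])
            if found is not _MISSING:
                return found
    return _MISSING


def resolve_compressed(node, path):
    """Resolve a path against an object that may use compressed 'a/b' keys."""
    found = _walk(node, [s for s in path.split("/") if s])
    if found is _MISSING:
        raise KeyError(path)
    return found


nested = {"a": {"b": 1}}
compressed = {"a/b": 1}
both = {"a": {"b": 1}, "a/b": 2}
print(resolve_compressed(nested, "a/b"), resolve_compressed(compressed, "a/b"))
print(resolve_compressed(both, "a/b"))
```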

i think that for simplicity, any / should be escaped with url encoding, and / in a property name is just invalid. may be way simpler. (that's how we have it now i believe)
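
A small sketch of what the escaping option might look like, using Python's standard `urllib.parse`. The helper names are hypothetical, and whether the escaped form lives in the stored key or only in the path is not settled in this thread; the sketch only shows the encoding round trip.

```python
from urllib.parse import quote, unquote


def escape_key(key):
    """Percent-encode a property name so it contains no literal '/'."""
    # safe="" makes quote() encode '/' too (it is left alone by default).
    return quote(key, safe="")


def build_path(keys):
    """Join raw property names into a path with '/' as the only delimiter."""
    return "/" + "/".join(escape_key(k) for k in keys)


def split_path(path):
    """Split on '/' first, then decode each segment back to the raw name."""
    return [unquote(s) for s in path.split("/") if s]


path = build_path(["files", "a/b", "mode"])
print(path)
print(split_path(path))
```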


@mildred

I quite like the whole idea of dropping extra link properties and have just plain links without other properties. It makes for a simpler design and is much more simple to use.

thanks, and thanks for pointing it out months ago.


@nicola said

In other words, this is the "direct-link" that I originally thought IPLD had? And there are no link properties, there are just properties and the higher layer will know which ones are link properties?

Yes, sort of. In essence, yes. but the difference is that in an object like:

{
  files: {
    foo.jpg: {
      link: {/: Qmfoojpghash},
      mode: 755,
      owner: jbenet,
    }
  }
}

i would consider

    {
      link: {/: Qmfoojpghash},
      mode: 755,
      owner: jbenet,
    }

the link object (not special at all, just a loose abstraction for the higher layer), and

{/: Qmfoojpghash}

the link value.
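
As a rough illustration of how a higher layer might tell the two apart, here is a sketch using the {"/": hash} form. The function names are hypothetical, and the mode is written as a Python octal literal on the assumption that 755 above is meant as octal.

```python
def is_link_value(node):
    """A link value is assumed to be an object whose only key is '/'."""
    return isinstance(node, dict) and list(node) == ["/"]


def split_link_object(obj):
    """Split a 'link object' into its link targets and its plain metadata.

    The property holding the link can have any name; nothing is reserved.
    """
    targets = {k: v["/"] for k, v in obj.items() if is_link_value(v)}
    metadata = {k: v for k, v in obj.items() if not is_link_value(v)}
    return targets, metadata


entry = {
    "link": {"/": "Qmfoojpghash"},
    "mode": 0o755,
    "owner": "jbenet",
}
targets, metadata = split_link_object(entry)
print(targets)
print(metadata)
```

IPLD itself only needs `is_link_value`; the split is something a layer like unixfs would do when it wants the metadata next to the link.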

Thus, we can guide people to create structures like that, when they want metadata. and, given we have the higher layers (which we all like and pointed out independently), we can still make things like this work:

> unixfs cat $hash/dir1/dir2/foo
bar
> unixfs stat $hash/dir1/dir2/foo.jpg
16777220 3316382 -rw-r--r-- 1 jbenet staff 0 65 "Apr 23 00:53:49 2016" "Apr  7 23:27:47 2016" "Apr  7 23:27:47 2016" "Apr  7 23:27:47 2016" 4096 8 0 foo

or with FUSE or a kernel fs

> ipfs mount /ipfs
> cat /ipfs/$hash/dir1/dir2/foo
bar
> stat /ipfs/$hash/dir1/dir2/foo.jpg
16777220 3316382 -rw-r--r-- 1 jbenet staff 0 65 "Apr 23 00:53:49 2016" "Apr  7 23:27:47 2016" "Apr  7 23:27:47 2016" "Apr  7 23:27:47 2016" 4096 8 0 foo

in short, yes, everyone who wanted no link properties was right all along (we all were). and everyone who wanted properties in links (we all did) can still have them. (we can have our cake and eat it too).

@jbenet
Member Author

jbenet commented May 10, 2016

btw, the link object doesn't have to show up in the IPLD spec as a construct at all, this is more of a documentation / best practice thing for higher abstractions.

@dignifiedquire
Member

Closing as these changes have been merged into the spec

@nicola nicola mentioned this issue Jun 13, 2016
dignifiedquire added a commit that referenced this issue Jul 1, 2016
Remove link properties from CBOR tags as per #97

5 participants