RFC: unite the files APIs #552

alanshaw · 2019-11-05T09:13:29Z

Not finished yet!

License: MIT Signed-off-by: Alan Shaw <alan.shaw@protocol.ai>

Stebalien

I really like these changes!

Stebalien · 2019-11-05T11:55:38Z

SPEC/FILESv2.md

+
+#### 1.2 Changes to returned values
+
+1. Importing a single file will now yield two entries, one for the imported file and one for the containing directory. Note this change can be considered almost backwards compatible; in the current API you'd receive an array of one value which you would access like `files[0]`. If you collect the entries in the new API you'd still access it like that.


Note: in terms of backwards compat, I'd like to just switch to a new API version and create a shim to maintain support for v0.
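A rough sketch of what such a shim could look like, assuming the new API is exposed as an async iterable. Every name here (`EntryV2`, `EntryV0`, `importV2`, `addV0`) is hypothetical and the CID values are placeholders; this is not taken from the spec or from any existing implementation.

```typescript
// Hypothetical shapes: neither interface is taken from the spec under review.
interface EntryV2 {
  path: string;
  cid: { toString(): string }; // stand-in for a CID instance
}

interface EntryV0 {
  path: string;
  hash: string;
}

// Stand-in for the new streaming import: yields the file, then its containing directory.
async function* importV2(name: string): AsyncGenerator<EntryV2> {
  yield { path: name, cid: { toString: () => "cid-of-file" } };
  yield { path: "", cid: { toString: () => "cid-of-dir" } };
}

// Shim: drain the stream into an array and restore the old `hash` string property.
async function addV0(source: AsyncIterable<EntryV2>): Promise<EntryV0[]> {
  const entries: EntryV0[] = [];
  for await (const entry of source) {
    entries.push({ path: entry.path, hash: entry.cid.toString() });
  }
  return entries;
}
```

With something like this, v0 callers would keep reading `files[0].hash`, while new callers consume the stream directly.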

Stebalien · 2019-11-05T11:58:19Z

SPEC/FILESv2.md

+#### 1.2 Changes to returned values
+
+1. Importing a single file will now yield two entries, one for the imported file and one for the containing directory. Note this change can be considered almost backwards compatible; in the current API you'd receive an array of one value which you would access like `files[0]`. If you collect the entries in the new API you'd still access it like that.
+2. Instead of a `hash` property, entries will instead have a `cid` property. In entries yielded from core it will be a CID instance, not a string (as agreed in [ipfs/interface-js-ipfs-core#394](https://github.com/ipfs/interface-js-ipfs-core/issues/394)). In the HTTP API/CLI it will necessarily be a string, encoded in base32 by default or whatever `?cid-base`/`--cid-base` option value was requested.


I'd like to consider returning namespaced paths instead of CIDs:

- In theory, it allows us to import into other filesystem formats (e.g., swarm, git?, etc.).
- If we support embedding small files in directory entries, not all files will have CIDs (unixfs 2.0 should, ideally, support that).

Stebalien · 2019-11-05T12:04:15Z

SPEC/FILESv2.md

+| `ipfs stat` | ✅ | ✅ |
+| `ipfs write` | ❌ | ✅ |
+
+The `/ipfs` directory in MFS problem can simply be avoided by either assuming IPFS path (the current solution) or by denying writes to a directory of this name.


If we're going to join MFS with everything else, I'd like to namespace MFS under something like `/local` or `/files`. That way, we can continue adding new namespaces without clobbering user-defined directories.

- `/ipfs`
- `/ipns`
- `/ipld`?
- `/p2p`?? (might be nice to expose this as sockets when mounted with FUSE)
- `/git/Cid/...` ???

I would also like clean namespacing, so we don't have to worry about the edge cases and vulnerabilities that come with overlapping names.

I'd go with `/local` because:

- no cognitive overhead
- it emphasizes data being "local to the node"
- things named `localhost` and `.local` already exist in other systems, and familiarity reduces cognitive overhead
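To make the namespacing idea concrete, here is a minimal sketch of path resolution under the assumption that MFS lives under `/local` and that unknown top-level names are rejected. The namespace set and the `parseNamespacedPath` helper are illustrative only; nothing in this thread fixes either.

```typescript
// Namespaces floated in the comments above; the exact set is an assumption.
const NAMESPACES = ["local", "ipfs", "ipns"] as const;
type Namespace = (typeof NAMESPACES)[number];

interface NamespacedPath {
  namespace: Namespace;
  rest: string; // path within the namespace, always starting with "/"
}

// Hypothetical helper: split an absolute path into its namespace and remainder.
function parseNamespacedPath(path: string): NamespacedPath {
  const match = /^\/([^/]+)(\/.*)?$/.exec(path);
  if (match === null) {
    throw new Error(`not an absolute namespaced path: ${path}`);
  }
  const namespace = NAMESPACES.find((ns) => ns === match[1]);
  if (namespace === undefined) {
    throw new Error(`unknown namespace: /${match[1]}`);
  }
  return { namespace, rest: match[2] ?? "/" };
}
```

Under this scheme a user directory named `ipfs` becomes `/local/ipfs/...`, so it can no longer be confused with the `/ipfs` namespace.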

Stebalien · 2019-11-05T12:05:06Z

SPEC/FILESv2.md

+
+The file system APIs will be streaming by default. Due to the way we store and retrieve data it makes sense for our API methods to stream content when retrieving it locally or over the network. Buffering APIs can cause OOM issues, give no feedback to the user on progress and they can be trivially wrapped to collect all items in order to achieve the same effect as a buffering API.
+
+Streaming APIs will use a language native / standard library feature that is supported in all runtimes that IPFS is actively targeting. This prevents bloat and by only supporting one streaming mechanism it reduces API surface area.


Note: Let's also stop streaming bytes. If we switch to only streaming objects, we can stream metadata (progress, etc.).
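As an illustration of streaming objects rather than raw bytes, the sketch below interleaves data and progress events in a single async iterable. The `ReadEvent` shape and both function names are assumptions made for the example; the proposal only mentions "objects" and "metadata (progress, etc.)".

```typescript
// Hypothetical event shapes, not defined anywhere in the spec under review.
type ReadEvent =
  | { type: "data"; chunk: Uint8Array }
  | { type: "progress"; bytesRead: number; totalBytes: number };

// Stand-in producer: emits each chunk followed by a progress update.
async function* readAsObjects(chunks: Uint8Array[]): AsyncGenerator<ReadEvent> {
  const totalBytes = chunks.reduce((sum, c) => sum + c.length, 0);
  let bytesRead = 0;
  for (const chunk of chunks) {
    yield { type: "data", chunk };
    bytesRead += chunk.length;
    yield { type: "progress", bytesRead, totalBytes };
  }
}

// Consumer: rebuild the content while reporting progress from the same stream.
async function collect(
  source: AsyncIterable<ReadEvent>,
  onProgress: (fraction: number) => void
): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  let size = 0;
  for await (const event of source) {
    if (event.type === "data") {
      parts.push(event.chunk);
      size += event.chunk.length;
    } else {
      onProgress(event.totalBytes === 0 ? 1 : event.bytesRead / event.totalBytes);
    }
  }
  // Concatenate the buffered chunks into one contiguous array.
  const out = new Uint8Array(size);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}
```

A byte-only stream has no room for the progress events; with tagged objects, callers who only want the content can filter on `type === "data"`.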

lidel · 2019-11-05T14:23:34Z

SPEC/FILESv2.md

+| `ipfs files read` | `ipfs read` |
+| `ipfs files rm` | `ipfs rm` |
+| `ipfs files stat` | `ipfs stat` |
+| `ipfs files write` | `ipfs write` |


❤️ this proposal, and have only one concern:

While I like the symmetry of `import` / `export`, I have a problem with `write` if it uses a different DAG builder than `import`, because people will continue to be confused about why a file added with `import` has a different CID than the same file added with `write`.

Currently `ipfs add` creates a balanced DAG, while `ipfs files write` creates a trickle DAG that supports adding data at the end, so it is more like an append operation.

Loose idea: does it make sense to have a `write` that creates the same DAG as `import`, plus an additional `append` command that is a special version of `write` which only adds data at the end and builds a trickle DAG?

The JS `files.write` implementation takes a `strategy` parameter that can be `balanced` or `trickle` (because `ipfs.add` and `ipfs.files.write` both use the same unixfs-importer, which receives the parameter).

If you `files.write` a file, it'll get converted into a trickle DAG, unless you pass `strategy=balanced`, in which case it'll be converted into a balanced DAG (rebalancing as necessary).

I think we could standardise on one and allow the user to override as they see fit.

If we could then extend the UnixFS v1.5 metadata proposal to include the originally selected DAG strategy we'd be able to optimise for appending to trickle DAGs.
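One way to picture "standardise on one and allow the user to override" is a small precedence rule: explicit option first, then a strategy recorded in metadata, then the default. The sketch below is hypothetical throughout. `chooseStrategy` and `WriteOptions` are made-up names, the default of `balanced` is an arbitrary pick since the thread has not settled on one, and the recorded strategy assumes the metadata extension suggested above.

```typescript
type DagStrategy = "balanced" | "trickle";

// Assumption: which strategy becomes the standard is undecided in the thread.
const DEFAULT_STRATEGY: DagStrategy = "balanced";

interface WriteOptions {
  strategy?: DagStrategy; // explicit user override
}

// Hypothetical: `recorded` is the strategy stored in file metadata, if any.
function chooseStrategy(options: WriteOptions, recorded?: DagStrategy): DagStrategy {
  return options.strategy ?? recorded ?? DEFAULT_STRATEGY;
}
```

Appending to a file whose metadata records `trickle` would then stay on the trickle path unless the caller explicitly asks for a rebalance.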

daviddias · 2019-11-05T15:39:36Z

YES! ❤️ Will review more carefully soon. Prev work for context to others at ipfs/specs#98 && https://github.com/ipfs/interface-js-ipfs-core/issues/284

achingbrain · 2019-11-07T15:33:14Z

SPEC/FILESv2.md

+
+Adding files to MFS removes any performance overhead of creating/maintaining pinset DAG nodes, unburdens the user from understanding pinning (for the most part), improves visibility of added files and makes it significantly easier to find or remove files that were previously imported.
+
+The `ipfs import` command _optionally_ takes a MFS path option `--dest`, a directory into which imported files are placed. Note the destination directory is automatically created (but not any parents). If the destination directory exists already then an error is thrown, unless the `--overwrite` flag is set. This causes any existing files with the same name as the imported files to be overwritten.


Small point, but maybe --force instead of --overwrite? Feels a bit more unixy.

alanshaw added 9 commits October 29, 2019 18:05

feat: background, pathing and one problem

e2e1e2c

License: MIT Signed-off-by: Alan Shaw <alan.shaw@protocol.ai>

feat: more problems

3752306

feat: more words

7792664

feat: add toc and some tweaks

2773cb6

feat: allow both paths

3d7645d

fix: table

739df1d

feat: more tweaking

fa2ead7

feat: more tweaking

45310b2

feat: fix wordo

5a625fd

Stebalien reviewed Nov 5, 2019

lidel reviewed Nov 5, 2019

achingbrain reviewed Nov 7, 2019

lidel mentioned this pull request Nov 25, 2019

Language: replace "upload" with "import" ipfs/ipfs-companion#817

Closed
