Skip to content
This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

feat: preload content #1464

Merged
merged 21 commits into from
Jul 27, 2018
Merged

feat: preload content #1464

merged 21 commits into from
Jul 27, 2018

Conversation

alanshaw
Copy link
Member

@alanshaw alanshaw commented Jul 24, 2018

refs #1459

This PR adds a new config property preload:

new IPFS({
  preload: {
    enabled: false,
    addresses: ['/multiaddr/api/address']
  }
})
  • preload.enabled (default false) enables/disabled preloading - should the default be false?
  • preload.addresses array of node API addresses to preload content on. This are the addresses we make a /api/v0/refs?arg=QmHash request to, to initiate the preload

This PR upgrades the following APIs to preload content. After adding content with ipfs.files.add (for example), we make a request to the first preload gateway addresses (providing preload.enabled is true), and will fall back to the second etc.

  • dag.put
  • block.put
  • object.new
  • object.put
  • object.patch.*
  • mfs.*

MFS preloading is slightly different - we periodically submit your MFS root to the preload nodes when it changes.

NOTE: this PR adds an option to dag, block and object APIs allowing users to opt out of preloading by specifying preload: false in their options object.

TODO

  • This PR will make a head request to the root hash of uploaded content, I need to double check that this will preload that hash as well as all of it's descendants or if we need to send a preload request for EVERY hash we add Now uses /api/v0/refs?arg=QmHash instead
  • Add the new bootstrap and gateway addresses for the preload nodes when provided by @lgierth

@ghost ghost assigned alanshaw Jul 24, 2018
@ghost ghost added the status/in-progress In progress label Jul 24, 2018
@alanshaw alanshaw mentioned this pull request Jul 24, 2018
23 tasks
@alanshaw alanshaw changed the title [WIP] feat: preload content feat: preload content Jul 25, 2018
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • preload.enabled (default false) enables/disabled preloading - should the default be false?

IIUC the reason of endeavor is to have it enabled by default in browser context to restore magic of uploaded content being immediately seen by public gateways.

So it should be enabled by default in browsers, can be disabled by default in node.

// Tools like IPFS Companion redirect requests to IPFS gateways to your local
// gateway. This is a hint to those tools that they shouldn't redirect these
// requests as they will effectively disable the preloading.
const redirectOptOutHint = 'x-ipfs-preload'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to note is that browser Companion expects x-ipfs-companion-no-redirect.
If it does not see it in URL, it will redirect to local gateway, which we don't want (local node on a laptop may go offline).

We can add another URL hint to Companion, but is it necessary?
To limit the number of moving pieces before Dweb Summit I would just use x-ipfs-companion-no-redirect for now.

@daviddias
Copy link
Member

enables/disabled preloading - should the default be false?

I propose that we default to true. It improve user experience a ton.

@lidel
Copy link
Member

lidel commented Jul 25, 2018

This PR will make a head request to the root hash of uploaded content, I need to double check that this will preload that hash as well as all of it's descendants or if we need to send a preload request for EVERY hash we add

@alanshaw check Lars' comment at #1459 (comment)

IIUC with /api/v0/refs?r=true&arg=<cid> when adding N files you only need 1 preload request (for root) instead of N+1 (one per file + wrapping dir).

@alanshaw
Copy link
Member Author

Preload on MFS - do we? It’ll mean that every cp, mv, write etc. will need a preload. It's potentially a lot of churn and a lot of preload requests.

Half of me feels like MFS is your own storage and you don’t really deal with CIDs. Right now if I wanted to share something with someone via IPFS I'd use ipfs.add and share the hash, not copy to MFS, stat path, and then share the hash.

If we don't actively preload MFS content, it'll still be accessible (provided the sharer stays online) due to adding the preload nodes to our bootstrap list, but it won't be as readily available.

On the other hand, I think it's confusing to have some content preloaded and other content not. Data added to IPFS should always be readily available and preloading everything will ensure that.

I am leaning towards preload MFS also.

@diasdavid @achingbrain @olizilla can have opinions?

@daviddias
Copy link
Member

@alanshaw you can set a rule that when MFS gets used, you set a flag that gets checked every 30s (or so) and if it is true, it will preload the current root. This way we don't bombard the preload gateways for every node update.

@alanshaw
Copy link
Member Author

alanshaw commented Jul 26, 2018

I think that could work but we'd have to switch to requesting /api/v0/refs?r=true&arg=QmFoo (#1459 (comment)) instead of the HEAD request otherwise we'd have to recurse over everything in MFS & send (potentially) hundreds of requests.

@achingbrain
Copy link
Member

achingbrain commented Jul 26, 2018

it's confusing to have some content preloaded and other content not

This is probably why it's worth doing. There will be a lot of network activity and extra storage taken up on the preload peers, and the MFS abstraction becomes really leaky when you have to share new CIDs for a given path every time you update something but it's probably better than not being able to get the content someone else has shared with you.

set a flag that gets checked every 30s (or so) and if it is true, preload the current root.

If you wrap ipfs.dag.put with something that does the preloading, every MFS operation will get preloaded automagically as that's what it uses internally. No setInterval required.

@alanshaw alanshaw changed the title feat: preload content [WIP] feat: preload content Jul 26, 2018
@alanshaw alanshaw changed the title [WIP] feat: preload content feat: preload content Jul 26, 2018
@alanshaw
Copy link
Member Author

Ok this should be good for another look now - preload is enabled by default and it now uses the /api/v0/refs API for preloading. It will preload at the first address configured and fallback to the subsequent ones on failure.

Preload API requests are now done asynchronously so they don't effect the time it takes to add content. The preload module keeps track of in flight requests and cancels any that are still flying when stop is called on the node.

@@ -231,6 +231,9 @@ Creates and returns an instance of an IPFS node. Use the `options` argument to s
- `enabled` (boolean): Make this node a relay (other nodes can connect *through* it). (Default: `false`)
- `active` (boolean): Make this an *active* relay node. Active relay nodes will attempt to dial a destination peer even if that peer is not yet connected to the relay. (Default: `false`)

- `preload` (object): Configure external nodes that will preload content added to this node
- `enabled` (boolean): Enable content preloading (Default: `true`)
- `addresses` (array): Multiaddr API addresses of nodes that should preload content. NOTE: nodes specified here should also be added to your node's bootstrap address list at `config.Boostrap`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will later need to over all the config values to the Config object like we did for js-libp2p.

@@ -26,4 +26,4 @@ exports.dht = require('./dht')
exports.dns = require('./dns')
exports.key = require('./key')
exports.stats = require('./stats')
exports.mfs = require('ipfs-mfs/core')
exports.mfs = require('./mfs')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

})

return mfs(mfsSelf, mfsSelf._options)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@daviddias daviddias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This work is solid. Great job with covering all the cases and adding tests for them.

@@ -8,6 +8,10 @@ const schema = Joi.object().keys({
Joi.string()
).allow(null),
repoOwner: Joi.boolean().default(true),
preload: Joi.object().keys({
enabled: Joi.boolean().default(true),
addresses: Joi.array().items(Joi.multiaddr().options({ convert: false }))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we require at least one entry here?

return log.error('failed to stat MFS root for preload', err)
}

if (rootCid !== stats.hash) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something, if the root cid has not changed, it looks like we stop checking for changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes good catch! If only there was a test for this...

License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
License: MIT
Signed-off-by: Alan Shaw <alan@tableflip.io>
@alanshaw alanshaw merged commit bffe080 into master Jul 27, 2018
@ghost ghost removed the status/in-progress In progress label Jul 27, 2018
@alanshaw alanshaw deleted the feat/preload-content branch July 27, 2018 14:04
@olizilla
Copy link
Member

🚀

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants