This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

IPLD Data Importing - Set of Importers #41

Closed
daviddias opened this issue Dec 13, 2015 · 4 comments

Comments

@daviddias
Member

One of the core pieces needed to get jsipfs working is an IPLD importer (file -> chunker -> MerkleDAG), so that we can create MerkleDAG structures that can be moved around with bitswap.

The goals of the importers are described here - https://github.com/ipfs/specs/blob/master/overviews/implement-ipfs.md#ipld-data-importing

It is probably possible to reuse some of the work @mafintosh did on his hyperdrive module. Although it focuses on rabin chunking, it should be easy to detach that part and use another chunking algorithm.

This takes me to my next point. It would be great to have a set of primitives to digest, chunk and parse files, so that different chunkers can be replaced/added without any trouble. Similar to the efforts we've been doing for libp2p interfaces.

@mafintosh

I'm actually working on decoupling the chunker so any chunking strategy can be used (a chunker is just a through stream).
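To illustrate the "a chunker is just a through stream" idea, here is a minimal sketch that models a chunker as a function from one sequence of byte buffers to another. It is not the hyperdrive or js-ipfs API: the `Chunker` type and `fixedSizeChunker` are hypothetical names, and a plain generator stands in for a real Node.js through stream to keep the example dependency-free.

```typescript
// Hypothetical names throughout; this is not the hyperdrive or js-ipfs API.
// A chunker consumes a sequence of byte buffers and produces another one.
type Chunker = (source: Iterable<Uint8Array>) => Iterable<Uint8Array>;

// Build a chunker that re-slices any incoming byte sequence into blocks of
// exactly `size` bytes (only the final block may be shorter).
function fixedSizeChunker(size: number): Chunker {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  return function* (source: Iterable<Uint8Array>): Generator<Uint8Array> {
    let pending = new Uint8Array(0);
    for (const piece of source) {
      // Append the new bytes to whatever was left over from earlier pieces.
      const joined = new Uint8Array(pending.length + piece.length);
      joined.set(pending, 0);
      joined.set(piece, pending.length);
      // Emit as many full-size blocks as the joined buffer allows.
      let offset = 0;
      while (joined.length - offset >= size) {
        yield joined.slice(offset, offset + size);
        offset += size;
      }
      pending = joined.slice(offset);
    }
    // Flush the remainder, if any, as a final short block.
    if (pending.length > 0) {
      yield pending;
    }
  };
}
```

Because every chunker shares the same shape, a content-defined strategy such as rabin could be swapped in without touching the code that consumes the chunks. Output boundaries depend only on the bytes, not on how the input happened to be split.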

@jbenet
Member

jbenet commented Dec 14, 2015

> It would be great to have a set of primitives to digest, chunk and parse files, so that different chunkers can be replaced/added without any trouble. Similar to the efforts we've been doing for libp2p interfaces.

Agree fully. I started with Go but vendoring crap... time to pick it back up!

So far, we have this model in go-ipfs:

  • chunkers or splitters: algorithms that read a stream and produce a series of chunks. for our purposes should be deterministic on the stream. divided into:
    • universal chunkers: work on any streams given to them (eg size, rabin, etc). should work roughly equally well across inputs.
    • specific chunkers: work on specific types of files (tar splitter, mp4 splitter, etc). special purpose but super useful for big files and special types of data.
  • layouts or topologies: graph topologies (eg balanced vs trickledag vs ext4, ... etc)
  • importer: a process that reads in some data (single file, set of files, archive, db, etc), and outputs a dag. may use many chunkers. may use many layouts.
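The three roles in the model above could be sketched roughly as follows. This is an illustration only, not the go-ipfs or js-ipfs API: `DagNode`, `Chunker`, `Layout`, `sizeChunker`, `balancedLayout` and `importBytes` are all hypothetical names, the node format is made up, and `toyHash` is a non-cryptographic stand-in for the real content hash.

```typescript
// All names are hypothetical; none of this is the go-ipfs or js-ipfs API.
interface DagNode {
  id: string;        // stand-in for a content hash
  data: Uint8Array;  // leaf payload (empty for interior nodes)
  links: string[];   // ids of child nodes, in order
}

// Chunker: deterministic split of the input into a series of chunks.
type Chunker = (input: Uint8Array) => Uint8Array[];
// Layout: arranges leaf nodes into a graph and returns its root.
type Layout = (leaves: DagNode[], emit: (node: DagNode) => void) => DagNode;

// Toy 32-bit FNV-1a, used only so the sketch runs without dependencies.
// ASSUMPTION: a real importer would use a cryptographic hash instead.
function toyHash(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

function makeNode(data: Uint8Array, links: string[]): DagNode {
  const id = toyHash(Array.from(data).join(",") + "|" + links.join(","));
  return { id, data, links };
}

// A "universal" chunker: fixed-size blocks, works on any input.
function sizeChunker(size: number): Chunker {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  return (input) => {
    const chunks: Uint8Array[] = [];
    for (let i = 0; i < input.length; i += size) {
      chunks.push(input.slice(i, i + size));
    }
    return chunks;
  };
}

// A balanced layout: group nodes under parents, level by level, until one
// root remains. Each parent holds at most `maxLinks` links.
function balancedLayout(maxLinks: number): Layout {
  if (!Number.isInteger(maxLinks) || maxLinks < 2) {
    throw new RangeError("maxLinks must be an integer >= 2");
  }
  return (leaves, emit) => {
    if (leaves.length === 0) throw new Error("layout needs at least one leaf");
    let level = leaves;
    while (level.length > 1) {
      const parents: DagNode[] = [];
      for (let i = 0; i < level.length; i += maxLinks) {
        const children = level.slice(i, i + maxLinks);
        const parent = makeNode(new Uint8Array(0), children.map((c) => c.id));
        emit(parent);
        parents.push(parent);
      }
      level = parents;
    }
    return level[0];
  };
}

// Importer: reads data, applies a chunker and a layout, outputs a dag.
function importBytes(
  input: Uint8Array,
  chunker: Chunker,
  layout: Layout
): { root: DagNode; nodes: Map<string, DagNode> } {
  const nodes = new Map<string, DagNode>();
  const emit = (node: DagNode): void => {
    nodes.set(node.id, node);
  };
  const chunks = chunker(input);
  if (chunks.length === 0) chunks.push(new Uint8Array(0)); // empty input
  const leaves = chunks.map((chunk) => makeNode(chunk, []));
  leaves.forEach(emit);
  return { root: layout(leaves, emit), nodes };
}
```

Keeping the chunker and the layout as separate parameters is the point of the sketch: the importer only wires them together, so a rabin chunker or a trickle-style layout could replace either piece independently.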

Do we have any difference here? let's put this into a spec, break up the relevant go-ipfs packages, and start on the js stuff. 😄

@mafintosh yay! collab! \o/

@jbenet
Member

jbenet commented Dec 14, 2015

@whyrusleeping see the model above.

@daviddias
Member Author

We've made unixfs-engine to import and export files and directories to MerkleDAG structs, the same way that go-ipfs does.

We will come back to the idea of a layout-agnostic importer. @mafintosh, how is your stuff going? Did you end up decoupling the chunker part?

I'm also going to move ahead and close this issue, so that we can aggregate the discussion on ipfs/specs#57
