Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ipfs/go-bitswap] Reduce Broadcasting #81

Open
Stebalien opened this issue Oct 5, 2021 · 5 comments
Open

[ipfs/go-bitswap] Reduce Broadcasting #81

Stebalien opened this issue Oct 5, 2021 · 5 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature status/deferred Conscious decision to pause or backlog

Comments

@Stebalien
Copy link
Member

Currently, when we can't find some content, we add it to the "broadcast" list and ask every connected peer. Unfortunately, this means that bitswap nodes under heavy load will:

  1. Take a while to find blocks (heavy load).
  2. Believe these blocks "can't be found" and broadcast requests.
  3. Slow down even more.

Furthermore, these massive broadcast lists end creating a lot of unnecessary traffic.

Insight: Sessions tend to be highly correlated (by design) so asking for more than one CID in a session is unlikely to give us any extra information. At the very least, broadcasting a request for a new CID B won't get us any closer to finding some previously broadcast CID A.

Given this insight, we can trim down the broadcast list to one want per session, which is much more manageable. It also doesn't really matter which want, as long as it's relevant.

Proposal:

  1. When a session has no peers and/or is "stuck", broadcast a single want request. This want should probably be the oldest still outstanding want in the session.
  2. When broadcasting, rank peers by "class" and iteratively widen the broadcast. Some heavily loaded nodes like public gateways may simply ignore some classes and/or "batch" requests.
    1. Peers who previously gave us blocks but got kicked out of the session (IIRC, this class exists but I'm not sure).
    2. Peers we've never seen before (never sent a broadcast to).
    3. Peers we've previously sent broadcasts to.
@dirkmc
Copy link
Contributor

dirkmc commented Oct 6, 2021

I agree with the general idea 👍 We don't need to broadcast so many CIDs.

Two considerations:

  • We may become connected to new peers during a long-running transfer. They are most likely to have the root CID. So probably it makes sense to always include the first CID in the broadcast.
  • Nodes are most likely to have the top branches in the DAG. As the transfer progresses, the blocks are more likely to be sparsely distributed (Node A has some, Node B has some blocks overlapping with Node A, some unique to Node B, etc). So towards the end of the transfer it may be better to include more CIDs in the broadcast.
  • It may make sense to iteratively increase the number of CIDs in the broadcast.

@Stebalien
Copy link
Member Author

So, my point here is that including more than one CID doesn't actually help us in the end. See:

Insight: Sessions tend to be highly correlated (by design) so asking for more than one CID in a session is unlikely to give us any extra information. At the very least, broadcasting a request for a new CID B won't get us any closer to finding some previously broadcast CID A.

Basically, sure, adding additional CIDs may get us those blocks. But it won't get us any closer to finding the CIDs that are already in our broadcast set.

We may become connected to new peers during a long-running transfer. They are most likely to have the root CID. So probably it makes sense to always include the first CID in the broadcast.

Yes, at that point we've already moved on. I'm proposing announcing the last block requested to find the "most specific" peers. "general" peers (e.g., ones that have the root only) may still be able to help "share the load" so to speak, but I'm not sure if it's worth broadcasting to find those peers.

@dirkmc
Copy link
Contributor

dirkmc commented Oct 8, 2021

it won't get us any closer to finding the CIDs that are already in our broadcast set

Ok makes sense 👍

I'm proposing announcing the last block requested to find the "most specific" peers

I think we may want to broadcast want-haves for several CIDs at once.
Consider the following scenario:

First we broadcast a request for cid1, and get back Peer A:

Peer A: cid 1, cid 2, cid 3, cid 4, cid 5

Peer B, C & D connect to us.

Peer B: cid 1, cid 6
Peer C: cid 1, cid 7
Peer D: cid 1, cid 8

We broadcast a want-have for cid 6 and get a HAVE from Peer B.
We broadcast a want-have for cid 7 and get a HAVE from Peer C.
We broadcast a want-have for cid 8 and get a HAVE from Peer D.

Instead of broadcasting one want-have at a time it would be more efficient to broadcast all of them together.

@Stebalien
Copy link
Member Author

It's a performance trade-off, but I don't think this scenario is all that common. The basic assumption of sessions is that most blocks being requested through the session are correlated in some way. If different peers have individual blocks, the session is pretty useless.

But I guess the best solution is to adapt:

  1. If we find we're having to pause to broadcast frequently, we increase the number of CIDs (in the session) that we broadcast.
  2. If we find that we're not having to frequently broadcast, we decrease the number of CIDs.

NOTE: we'd only increase the number of CIDs if the broadcast was actually successful. That is, it found a new peer that wasn't previously in the session that had the block.

@lidel lidel added kind/enhancement A net-new feature or improvement to an existing feature status/deferred Conscious decision to pause or backlog and removed need/triage Needs initial labeling and prioritization labels Apr 15, 2022
@RubenKelevra
Copy link
Contributor

@Stebalien how about using probabilities via feedback from previous attempts to determine which nodes should get a "broadcast" request?

We do number_of_supplies / number_of_requests for each peer – so for example, we get 0.03 for a 3% chance of a successful reply.

Then we feed this into an 'exponential moving average' with an alpha of 0.005. If we don't do any requests to peers, their value should just be frozen. This avoids dropping of the value by supplying zeros and also avoids that we have to do calculations.

To make sure new connections get some chance of getting requests, we should send a value of 1 into the 'exponential moving average' to get it started.

Now we use the result as the probability that we send a given CID-broadcast to a node we know (maybe not even be connected to). We might want to redial some known "useful" nodes in the past, to check if they got useful data for us now.

So if we have nodes which don't supply any useful data for us in the recent past, we slowly phase them out from being asked and just focus on the nodes which do useful response to us.

@Jorropo Jorropo changed the title Reduce Broadcasting [ipfs/go-bitswap] Reduce Broadcasting Jan 27, 2023
@Jorropo Jorropo transferred this issue from ipfs/go-bitswap Jan 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement A net-new feature or improvement to an existing feature status/deferred Conscious decision to pause or backlog
Projects
None yet
Development

No branches or pull requests

4 participants