Skip to content
This repository has been archived by the owner on Dec 7, 2019. It is now read-only.

Deduplicate the DialTimeout variable #32

Closed
Stebalien opened this issue Jun 7, 2018 · 7 comments
Closed

Deduplicate the DialTimeout variable #32

Stebalien opened this issue Jun 7, 2018 · 7 comments

Comments

@Stebalien
Copy link
Member

Currently, DialTimeout is defined both here and in go-libp2p-swarm. However, only the version in go-libp2p-swarm is used.

The tricky part is that this package also defines AcceptTimeout, which is used (by transports). Basically, the swarm enforces the maximum dial timeout (because it can) while transports have to enforce accept timeouts (because the swarm has no control over them).

One solution would be to enforce the dial timeout in the transport. However, this feels wrong.

Given that this package depends on go-libp2p-net anyways, we may just want to move both the DialTimeout and the AcceptTimeout to that package.

Thoughts @whyrusleeping?

@whyrusleeping
Copy link
Contributor

@Stebalien I think the difference here (that I think was already pointed out somewhere) is that the timeout in swarm is the timeout for the entire dial process, where the timeout in the transport is the timeout for a single dial operation. Those values should definitely be separate and different.

@whyrusleeping
Copy link
Contributor

to be a little more clear, the timeout in swarm includes secio and other negotiations, while the timeout in the transport should probably just be the 'connect' call (in the tcp case).

@Stebalien
Copy link
Member Author

To get on the same page, currently:

  1. The DialTimeout from go-libp2p-transport is unused.
  2. The DialTimeout in go-libp2p-swarm covers the entire connect process (spanning multiple dials, address lookup, etc).
  3. The AcceptTimeout in go-libp2p-transport covers the entire connection negotiation.

(1) and (2) are bugs.


However,

to be a little more clear, the timeout in swarm includes secio and other negotiations, while the timeout in the transport should probably just be the 'connect' call (in the tcp case).

It's that "in the tcp case" part that bugs me. This won't generalize well to other transports. What if we just made this an option on the TCP transport? That is: libp2p/go-tcp-transport#24

Eventually, I'd like to abstract out a second layer of "raw" transports. That would allow every higher level transport that uses TCP to use this underlying transport. However, we punted on this to simplify things.

@magik6k
Copy link

magik6k commented Jun 8, 2018

cc myself

also for discoverability the pr removing the variable from go-libp2p-swarm - libp2p/go-libp2p-swarm#71

I like the idea of having timeouts on each layer, we should make sure to have some reasonably consistent way of setting them, otherwise we may end-up with way too many hardcoded numbers.

@whyrusleeping
Copy link
Contributor

we should make sure to have some reasonably consistent way of setting them, otherwise we may end-up with way too many hardcoded numbers.

👍

@Stebalien
Copy link
Member Author

Stebalien commented Jun 11, 2018

we should make sure to have some reasonably consistent way of setting them, otherwise we may end-up with way too many hardcoded numbers.

Unfortunately, we have a bunch of transports that all behave differently. What I don't want is a single variable that we recursively define as "the dial timeout is the dial timeout, take that to mean what you will".

With libp2p/go-tcp-transport#24 and libp2p/go-libp2p-swarm#71, we have:

       swarm          -- swarm.DialTimeout (complete dial to a peer)
        |
    transports        -- transport.DialTimeout (per dial to a peer, including secio)
        | 
  +---+-+-+------+
  |   |   |      |
 TCP QUIC Relay Onion  -- transport options (not constants) for partial dial timeouts.

Each of these transports need different timeouts:

  • For the TCP transport, we'd want to specify:
    • Expected network RTT
    • Expected secio time
  • For the QUIC transport, we can ask for finer-grained timeouts but the transport currently doesn't support them.
  • For the Relay transport, the timeout would have to be 2x transport.DialTimeout
  • For an Onion transport...

For now, I've introduced (libp2p/go-tcp-transport#24) a single TCP option (not constant) for setting the initial RTT for the TCP transport (defaulting to 5s). In the future, we can either:

  1. Default this to some general network RTT.
  2. Configure this in the swarm to some general network RTT.

@Stebalien
Copy link
Member Author

Conclusion: We're leaving the timeouts in this package and making them apply per-dial (for the entire dial). We may need more timeouts later but we can discuss that in libp2p/go-libp2p#1547

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants