Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot advertise WebTransport with public IP addresses discovered by identify (e.g. port forwarding behind a NAT with a dynamic public IP) #2233

Closed
MarcoPolo opened this issue Mar 29, 2023 · 18 comments · Fixed by #2239
Assignees

Comments

@MarcoPolo
Copy link
Collaborator

I haven't repro'd this myself yet, but we should investigate and add a test case for this.

@marten-seemann
Copy link
Contributor

We don't do address discovery via AutoNAT any more (#2148). But you're raising an interesting point, especially for #2229: How should AutoNAT treat WebTransport addresses when certificates are rotated?

One could argue that AutoNAT should only care about the part before the certificate hash. This would make sure that the WebTransport address doesn't change its NAT status when we rotate certificates. However, it would be nice if AutoNAT would discover if there was a problem with the certificate hashes.

Alternatively, we could write a bit of extra logic for WebTransport addresses: When certificates are rotated, we could just "transfer" the NAT status from the old to the new address, so that there's no downtime for WebTransport addresses. We'd then retroactively confirm that the new address actually works.

@aschmahmann
Copy link
Collaborator

aschmahmann commented Mar 30, 2023

@marten-seemann perhaps a different case, but @MarcoPolo was effectively filing a placeholder issue here for something I was reporting (thanks Marco 🙏).

User Problem

If I enable port forwarding on my router say 4001 -> mymachine:4001 and leave everything default (except UPnP disabled) then eventually my node discovers its public addresses via identify and then I can start advertising public addresses and become reachable 🎉.

However, this does not work with (ipv4) WebTransport addresses.

Technical issue

I think where things are currently falling apart is here

local := conn.LocalMultiaddr()
if !ma.Contains(ifaceaddrs, local) && !ma.Contains(oas.host.Network().ListenAddresses(), local) {

Unfortunately, doing this comparison fails because conn.LocalMultiaddr() for WebTransport gives the multiaddr without the certhashes while oas.host.Network().ListenAddresses() gives multiaddrs with the certhashes leading to a comparison failure.

Fix

I'm not sure which fix is most correct. Obviously we could just filter out the certhashes from the listen addresses before doing the comparison, but in order to run into fewer problems here I suspect (in lieu of having types that enforce this) we need at least:

  1. To be clear about what functions like conn.Local/RemoteMultiaddr return? Do they return the full multiaddr without the peerID component, the full (non-security) connection information, something else?
  2. Helper functions to help deal with comparing/converting multiaddr types (i.e. dealing with when a /p2p/... is included vs not, dealing with when certhash/... is included vs not)

Any ideas what's correct here (e.g. should conn.Local/RemoteMultiaddr be fixed, the ObservedAddrManager code, or something else)?

Will put up a PR with the ObservedAddrManager fix to test/demonstrate. This got more complicated, so wanted to have some conversations before doing more here.

@aschmahmann
Copy link
Collaborator

Had a chat with @MarcoPolo today about this and it seems like a reasonable strategy here would be something like:

  1. Define an interface to be used by implementers of Network that exposes how to get a transport given a multiaddr (or the transport ID's code). It is meant to operate on the last transport component of the multiaddr (i.e. the one closest to full resolution)
  2. Define an interface to be used by implementers of Transport that exposes a function that will take a multiaddr that's missing information from the transport and fills it in, as well as one that does the inverse
  3. Have the ObservedAddrManager's Addrs and AddrsFor return more fully resolved multiaddrs (e.g. with certhashes)
    • Perhaps something like doing the equality checks above after "upgrading" the local multiaddr with more information. Could also do the comparisons "downgraded" and then upgrade the local address afterwards.

There are some tradeoffs here that seem fine, including:

  1. Exposing more information via the passed interfaces
  2. Coupling some of this resolution to the transport implementation or creating more dependencies across packages -> we already have this present (e.g. the ObservedAddressManager has a host and the Swarm already has per-code transports)
  3. conn.Local/RemoteMultiaddr are still confusing here (as are the calls to the connection gater using those addresses) -> yep, that's probably a separate issue to tackle

@marten-seemann
Copy link
Contributor

I'm not convinced that 1. and 2. is the right approach. I'd like us to avoid introducing more complexity here before implementing #2229. Address management is already a mess, and we really need to clean it up. That's why #2229 is a P0.

@aschmahmann
Copy link
Collaborator

I'm not convinced that 1. and 2. is the right approach.

Maybe not, I think @MarcoPolo favored this approach of allowing the transports to do more of the transport-specific work. This seems pretty reasonable, it definitely felt pretty bad in the two PRs linked above to be putting in knowledge specifically about webtransport into the host/contructor code when it feels like it should just be another transport.

I'd like us to avoid introducing more complexity here before implementing #2229.

Yeah, I get it. When you have a larger refactor in mind the last thing you want are more moving pieces.

That's why #2229 is a P0.

It does make me feel better that it's a P0, but it seems like it's going to be a lot of work/time to land especially given upcoming events, etc. I'm really excited about all the work libp2p has done with WebTransport 🎉 so trying to find ways we can get at it sooner.

Do you have any ideas around how this could be implemented in a way that's more compatible with how you see the refactor playing out?

@marten-seemann
Copy link
Contributor

Yes, realistically speaking this won’t happen before IPFS Thing. I think I’d prefer to just merge your PR, as it’s the most succinct solution of the problem. As you’ve said, it’s not the cleanest way to solve it, but at least it doesn’t introduce too many new moving parts.

@MarcoPolo
Copy link
Collaborator Author

MarcoPolo commented Mar 31, 2023

We could do the most minimal thing here, which is a bit hacky:

  1. Add a helper func RemoveCerthashes.
  2. ObservedAddrManager Calls RemoveCerthashes before comparing things.
  3. basic_host will copy cerrthashes to webtransport addrs before returning multiaddrs from Addrs. This way conusmers of an addr get the correct and up to date webtransport addrs.

It's hacky because 1 & 3 involve these components having special knowledge of the whole system. But it would probably be the fastest and smallest short term fix. And we could write a test for the behavior to protect against a future regression.

@marten-seemann
Copy link
Contributor

marten-seemann commented Mar 31, 2023

That sounds reasonable. There won’t be any way of having an address pipeline that doesn’t have special knowledge of certhashes if we

  1. want AutoNAT to test the certhash and
  2. want this to work smoothly across certificate rotations.

I think that’s fine, as long as this doesn’t leak out of the pipeline (see what I did there ;)).

@aschmahmann
Copy link
Collaborator

I think I’d prefer to just merge your PR, as it’s the most succinct solution of the problem. As you’ve said, it’s not the cleanest way to solve it, but at least it doesn’t introduce too many new moving parts.

I think this might be clear already, but just making sure:

Unfortunately there are two issues and mine/Marco's existing PRs don't fix this problem

@MarcoPolo
Copy link
Collaborator Author

There won’t be any way of having an address pipeline to have special knowledge of certhashes if we

Is there a typo there? Did you mean

There won’t be any way of not having an address pipeline to have special knowledge of certhashes if we

Because I think there will be, but we can cross that bridge when we get there.

Or am I misunderstanding?

@SgtPooki
Copy link
Member

SgtPooki commented Apr 2, 2023

Have a method that outputs these that are broken in the wild:

with js-libp2p@0.43.3 and the connectionGater config of

connectionGater: {
    denyDialMultiaddr: (peerId, multiaddr) => {
      const isWebTransport: boolean = multiaddr.toString().includes('/webtransport')
      if (isWebTransport) {
        const hasCerthash: boolean = multiaddr.toString().includes('/certhash')
        if (!hasCerthash) {
          console.log('denyDialMultiaddr: ', peerId.toString(), multiaddr.toString())
          return true
        }
      }
      return false
    }

  },

I get the following output after running for a while

denyDialMultiaddr:  12D3KooWScrMbdX4eCmfjT2BM2iyX1Nf8rrUZUYNjvHNKtwxfBzD /ip4/192.0.2.0/udp/1234/quic/webtransport
src_sw_ts.bundle.js:1105 denyDialMultiaddr:  12D3KooWQPZ6tvxnzd9tZ2SZwpxWUEhKcHEqkMCZ927trfYmTAjC /ip4/127.0.0.1/udp/4001/quic-v1/webtransport
src_sw_ts.bundle.js:1105 denyDialMultiaddr:  12D3KooWQPZ6tvxnzd9tZ2SZwpxWUEhKcHEqkMCZ927trfYmTAjC /ip4/192.168.90.220/udp/4001/quic-v1/webtransport
src_sw_ts.bundle.js:1105 denyDialMultiaddr:  12D3KooWQPZ6tvxnzd9tZ2SZwpxWUEhKcHEqkMCZ927trfYmTAjC /ip4/127.0.0.1/udp/4001/quic-v1/webtransport
src_sw_ts.bundle.js:1105 denyDialMultiaddr:  12D3KooWQPZ6tvxnzd9tZ2SZwpxWUEhKcHEqkMCZ927trfYmTAjC /ip4/192.168.90.220/udp/4001/quic-v1/webtransport
src_sw_ts.bundle.js:1105 denyDialMultiaddr:  12D3KooWMJE47ftCn8H4HudbYhVMGRMdGbXkggGj76YdGJP3mbcp /ip4/54.173.143.87/udp/4001/quic-v1/webtransport
src_sw_ts.bundle.js:1105 denyDialMultiaddr:  12D3KooWScrMbdX4eCmfjT2BM2iyX1Nf8rrUZUYNjvHNKtwxfBzD /ip4/192.0.2.0/udp/1234/quic/webtransport
src_sw_ts.bundle.js:1105 denyDialMultiaddr:  12D3KooWQPZ6tvxnzd9tZ2SZwpxWUEhKcHEqkMCZ927trfYmTAjC /ip4/127.0.0.1/udp/4001/quic-v1/webtransport
src_sw_ts.bundle.js:1105 denyDialMultiaddr:  12D3KooWQPZ6tvxnzd9tZ2SZwpxWUEhKcHEqkMCZ927trfYmTAjC /ip4/192.168.90.220/udp/4001/quic-v1/webtransport
src_sw_ts.bundle.js:1105 denyDialMultiaddr:  12D3KooWNRsVgwCE4jsWS7ZSGjRfkiiSdy5rmtyWvGnCENpevfoZ /ip4/141.95.66.35/udp/4001/quic-v1/webtransport
src_sw_ts.bundle.js:1105 denyDialMultiaddr:  12D3KooWH2m5yQJtQeTbJ3Q4pSuXKgKkrwn282cJrefM5p9bMFEW /ip4/15.204.183.214/udp/4001/quic-v1/webtransport
getDhtLibp2pConfig.ts:35 denyDialMultiaddr:  12D3KooWScrMbdX4eCmfjT2BM2iyX1Nf8rrUZUYNjvHNKtwxfBzD /ip4/192.0.2.0/udp/1234/quic/webtransport
getDhtLibp2pConfig.ts:35 denyDialMultiaddr:  12D3KooWMJE47ftCn8H4HudbYhVMGRMdGbXkggGj76YdGJP3mbcp /ip4/54.173.143.87/udp/4001/quic-v1/webtransport
getDhtLibp2pConfig.ts:35 denyDialMultiaddr:  12D3KooWQPZ6tvxnzd9tZ2SZwpxWUEhKcHEqkMCZ927trfYmTAjC /ip4/127.0.0.1/udp/4001/quic-v1/webtransport
getDhtLibp2pConfig.ts:35 denyDialMultiaddr:  12D3KooWQPZ6tvxnzd9tZ2SZwpxWUEhKcHEqkMCZ927trfYmTAjC /ip4/192.168.90.220/udp/4001/quic-v1/webtransport
getDhtLibp2pConfig.ts:35 denyDialMultiaddr:  12D3KooWH2m5yQJtQeTbJ3Q4pSuXKgKkrwn282cJrefM5p9bMFEW /ip4/15.204.183.214/udp/4001/quic-v1/webtransport
getDhtLibp2pConfig.ts:35 denyDialMultiaddr:  12D3KooWNRsVgwCE4jsWS7ZSGjRfkiiSdy5rmtyWvGnCENpevfoZ /ip4/141.95.66.35/udp/4001/quic-v1/webtransport

And adding the peerIds I see to a set and doing some numbers:

% total peers with webtransport multiaddrs: 180 / 480 = 37.5%
Of webtransport peers, % that have broken webtransport multiaddrs: 5 / 180 = 2.7777777777777777%
% total peers with broken webtransport multiaddrs: 5 / 480 = 1.0416666666666665%

@aschmahmann
Copy link
Collaborator

I think there are a few different types of broken here. One is if some advertised addresses are missing certhashes, another is if some addresses aren't advertised at all (e.g. they have private IPs for quic-v1 and webtransport, but only public IPs for quic-v1). Getting an understanding of both sets would be helpful, my guess is the latter is a bigger set than the former but would need to measure to gain more confidence.

I'm also not sure if collecting numbers in a browser context (where you have fewer transports) is going to skew the numbers for your collection. It might not, but not sure.

@marten-seemann
Copy link
Contributor

@MarcoPolo Indeed, that was a very unfortunate typo in my post. I corrected it. Sorry for the confusion caused.

Agreed that we need a solution now. We can’t wait for the address pipeline to come into existence. No matter how quickly we manage to write the code for it, it still depends on AutoNAT v2, which requires a gradual rollout. So we’re likely talking about months anyway.

I suggest to start thinking about this from an API perspective. Assume that we have the address pipeline, how would the user tell their libp2p stack that they think that their node has a specific public address? Realistically, there are two options:

  1. A libp2p constructor option: AddressHints(… ma.Multiaddr)
  2. A function on the host that allows passing in of address hints at runtime, e.g. Host.ProcessAddressHint(ma.Multiaddr)

This would act somewhat similar to the AppendAnnounces in Kubo, but it could perform some address transformations, such as:

  • Append certhashes to WebTransport addresses.
  • If no WebTransport address is passed in, generate one, assuming that QUIC and WebTransport are sharing a port.

Kubo could then stop wrapping the Host.Addrs function (which I consider a hack anyway), and instead just pass the addresses into libp2p as an address hint.

Once we have the address pipeline, we can start confirming these addresses via AutoNAT v2.
Until that fully works, we can just trust that these addresses work, without any AutoNAT confirmations. This is effectively the same behavior that users get from AppendAnnounces in Kubo today (there’s also no verification step since it’s bypassing libp2p entirely).

@aschmahmann @MarcoPolo I believe this resolves both problems. Does this make sense to you? Any preferences on the API? I’m leaning towards a constructor option, since I don’t think that users will dynamically discover addresses outside of the libp2p stack (of course, libp2p will do address discovery internally).

@aschmahmann
Copy link
Collaborator

aschmahmann commented Apr 2, 2023

@marten-seemann I'm not sure how your proposal solves the issue with port-forwarding behind a dynamic IP. AFAICT the only thing that would be resolved via giving the host address hints for this case (as opposed to the externally-known IP/DNS address case) is if someone had go-libp2p auto-generated certs for some WebTransport addresses and external CA signed ones for others. However, this isn't possible today in go-libp2p anyway so isn't our current concern.

So for dealing with port-forwarding behind a dynamic IP (I'm going to rename this issue to that to avoid confusion with #2223) it seems like the main thing to figure out is where the plumbing goes to deal with attaching the autogenerated WebTransport certhashes to the dynamically discovered via identify public IP addresses.

So far there are two options proposed:

  1. Put some custom WebTransport code in the (basic)host and observed address manager (Cannot advertise WebTransport with public IP addresses discovered by identify (e.g. port forwarding behind a NAT with a dynamic public IP) #2233 (comment))
  2. Put some custom WebTransport code in WebTransport and then have the (basic)host and observed address manager call those functions on given multiaddrs after figuring out which transport to consult for processing a given multiaddr (Cannot advertise WebTransport with public IP addresses discovered by identify (e.g. port forwarding behind a NAT with a dynamic public IP) #2233 (comment))

I think @MarcoPolo and I are both fine doing option 1 (more hard coding) for now if there's concern about adding extraneous interfaces/public exports and then figuring out how to deal with the abstractions more cleanly (e.g. if we run into a similar issue with WebRTC) later.


The proposal around address hints seems better suited for resolving #2223.


Does this seem about right to both of you or am I missing something? (Note: I'm changing the issue name, but feel free to change it to something else if you have something better in mind)

@aschmahmann aschmahmann changed the title WebTransport addresses discovered via AutoNat are missing certhashes Cannot advertise WebTransport with public IP addresses discovered by identify (e.g. port forwarding behind a NAT with a dynamic public IP) Apr 2, 2023
@marten-seemann
Copy link
Contributor

The proposal around address hints seems better suited for resolving #2223.

Sorry, posted on the wrong issue 🤦🏻‍♂️

That said, I've convinced myself that the correct solution is to advertise the address including the certificate hash in Identify. To do that, that's just a super easy fix inside the WebTransport transport, to add the certhashes to the address returned by RemoteMultiaddr.

Why is this the correct solution? Assume a node behind a node that complete scrambles ports is listening on two WebTransport addresses, with different certificate configurations. In that case, the node would need to know which address is which one, and without the certhash, it won't be able to do so.

The downside is that this fix will take a while to propagate through the network. Not sure if we can live with that. On the other hand, this would justify implementing the hackiest / easiest solution, since we'll be able remove that code anyway right after the next release.

@MarcoPolo
Copy link
Collaborator Author

Assume a node behind a node that complete scrambles ports is listening on two WebTransport addresses, with different certificate configurations. In that case, the node would need to know which address is which one, and without the certhash, it won't be able to do so.

Certhashes are generated per host by their key. A host wont have more than one set of certhashes.

Also certhashes expire/rollover. We want to keep that in mind when comparing certhashes.

@MarcoPolo
Copy link
Collaborator Author

Because this is preventing the widespread use of webtransport, I'm going to this hacky fix to unblock things: #2233 (comment). @marten-seemann is there something specific with that proposed change you disagree with?

@marten-seemann
Copy link
Contributor

Assume a node behind a node that complete scrambles ports is listening on two WebTransport addresses, with different certificate configurations. In that case, the node would need to know which address is which one, and without the certhash, it won't be able to do so.

Certhashes are generated per host by their key. A host wont have more than one set of certhashes.

That's an implementation detail. It would be nice to not hard-code that into any components outside of the WebTransport transport.
That said, I'm happy with any hack that solves the immediate problem, as long as we keep this in mind when we do the address pipeline refactoring.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants