Questions on libp2p fundamentals

These questions were raised on the public Slack recently, but I think they should be answered publicly over here. (@raul, @vyzo, @stebalien)

  1. Who initiates the identify protocol during connection establishment (dialer, listener, or both)?

  2. Why is the logic behind storing observed addresses so complex?

  3. Is there something like multihash, but for signatures?
  • I linked to this multisig issue, but it’s several years old: https://github.com/multiformats/multiformats/issues/23
  • If Soramitsu were to implement multisig themselves:
    • is there any prior work that’s not reflected in that issue?
    • would a PR to the multicodec table for e.g. ed25519 signatures be well received?

  1. Who initiates the identify protocol during connection establishment (dialer, listener, or both)?

In JS we do both, and from a quick look at the code, Go appears to as well. This lets both peers learn each other’s metadata (protocols, listening addresses, public key), since the protocol is currently request-only (not counting the push protocol). Each peer also gets its observed address from the other side, which is useful to know on both ends of the connection.

  2. Why is the logic behind storing observed addresses so complex?

I’m not sure how the heuristics were determined, but the complexity comes from observed addresses being unreliable: they can differ across peers and connections. For example, on a NATed network, I will have different ports across my connections to external peers because of the network port mapping. I may even have different public IPs. If we simply accept any observed address a peer sends us, the likelihood that another peer can use that address is extremely small. We need to make a concerted effort to validate observed addresses before we announce them to the network, otherwise we make it harder for peers to actually dial us. Adding addresses should be a very intentional action.

Doing this over time is also important, because our networks can change, especially for mobile devices and laptops, and that has a large impact on observed addresses.
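To make the idea concrete, here is a minimal sketch of one way such validation could work: only treat an observed address as usable once several distinct peers have reported it within a recent time window. This is not the actual heuristic used by go-libp2p or js-libp2p; the class name `ObservedAddrTracker`, the `min_observers` threshold, and the `ttl_seconds` window are all hypothetical and chosen purely for illustration.

```python
import time
from collections import defaultdict


class ObservedAddrTracker:
    """Hypothetical tracker (not a libp2p API): an observed address is
    only considered confirmed once enough distinct peers have reported
    it recently."""

    def __init__(self, min_observers=4, ttl_seconds=600.0, clock=time.monotonic):
        # Both thresholds are illustrative assumptions, not libp2p defaults.
        self.min_observers = min_observers
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # addr -> {observer_id: timestamp of that observer's latest report}
        self._seen = defaultdict(dict)

    def record(self, addr, observer_id):
        """Note that `observer_id` told us it sees us at `addr`."""
        self._seen[addr][observer_id] = self.clock()

    def confirmed(self):
        """Return addresses reported by enough distinct, recent observers."""
        now = self.clock()
        result = []
        for addr, observers in list(self._seen.items()):
            # Drop reports older than the window, so stale addresses
            # (e.g. after a network change) stop being advertised.
            fresh = {o: t for o, t in observers.items()
                     if now - t <= self.ttl_seconds}
            if fresh:
                self._seen[addr] = fresh
            else:
                del self._seen[addr]
            if len(fresh) >= self.min_observers:
                result.append(addr)
        return sorted(result)
```

Repeated reports from the same peer count once, so a single peer cannot confirm an address on its own, and the expiry window covers the case where a laptop or phone moves to a different network.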

  3. Is there something like multihash, but for signatures?

I’m not sure about this one.

Thanks for posting this here @mike, this does seem like a good venue. @Warchant hopefully this helps clear up some of the things we were discussing on Friday. I’m not sure what everyone else’s handle is, so feel free to share with the team :smile:

Hello! This topic is mostly about Identify, so I decided to ask a small question here.

@yusef, @jacobheun, as far as I can see, in this line the Protobuf message is prepended with a varint giving its length? But the documentation doesn’t mention it at all. So, is there such a varint, and if not, why is a length field read from the connection before the Identify message itself?

Thanks.

The prepended varint isn’t specific to Identify; it applies to all Protobuf messages. The varint is added so that the reader knows how many bytes to hand to the protobuf decoder, because a serialized Protobuf message does not mark its own end. A buffer can contain any number of messages, so we need the correct segment of the buffer to decode the right thing, in this case the Identify message.
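As a rough illustration of the framing described above, here is a small self-contained sketch of unsigned-varint length prefixing. The helper names (`encode_uvarint`, `decode_uvarint`, `frame`, `split_frames`) are made up for this example and are not libp2p APIs, and the payloads are opaque bytes standing in for serialized Protobuf messages.

```python
def encode_uvarint(n):
    """Encode a non-negative int as an unsigned varint (7 bits per byte,
    least significant group first, high bit set on all but the last byte)."""
    if n < 0:
        raise ValueError("unsigned varint cannot encode a negative number")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def decode_uvarint(buf, offset=0):
    """Decode a varint starting at `offset`; return (value, next_offset)."""
    value = 0
    shift = 0
    pos = offset
    while True:
        if pos >= len(buf):
            raise ValueError("buffer ended in the middle of a varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def frame(payload):
    """Prefix a serialized message with its length as a varint."""
    return encode_uvarint(len(payload)) + payload


def split_frames(buf):
    """Split a buffer of back-to-back length-prefixed messages."""
    frames = []
    pos = 0
    while pos < len(buf):
        length, pos = decode_uvarint(buf, pos)
        end = pos + length
        if end > len(buf):
            raise ValueError("buffer ended in the middle of a message")
        frames.append(buf[pos:end])
        pos = end
    return frames
```

For example, `split_frames(frame(b"abc") + frame(b"hello"))` recovers the two payloads separately, and each one could then be handed to the Protobuf decoder on its own.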