When performing a Noise XX handshake does the Identity Key in the Handshake Payload always have to match that of the PeerID in the Multiaddr dialed?
For example:
When performing a random walk on the DHT we usually receive a list of K closer peers to the target peer. This list looks something like this…
Then, when it comes time to dial these peers, I encapsulate the PeerID onto a dialable address from the list to get a final Multiaddr similar to /ip4/xxx.xxx.xxx.xx/tcp/4001/p2p/QmfT8...Dnt. This lets the Noise Protocol Handler know the PeerID of the peer we expect to reach. Once the Handshake Payload comes through, the handler checks that the Remote Peer’s Identity Key matches the PeerID encapsulated onto the multiaddr. If this check fails, it drops the connection immediately.
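To make the check concrete, here is a minimal sketch of it in Python. It is not taken from any libp2p implementation: the function names `peer_id_from_pubkey` and `matches_expected_peer` are hypothetical, and it assumes the rule from the libp2p peer ID spec that a serialized public key of at most 42 bytes is wrapped in an "identity" multihash, while longer keys are hashed with SHA-256. Both arguments are raw bytes (the PeerID after base58 decoding, the key as the serialized PublicKey protobuf from the handshake payload).

```python
import hashlib

MAX_INLINE_KEY_LENGTH = 42  # threshold from the libp2p peer ID spec


def peer_id_from_pubkey(serialized_pubkey: bytes) -> bytes:
    """Derive the PeerID multihash bytes from a serialized PublicKey protobuf."""
    if len(serialized_pubkey) <= MAX_INLINE_KEY_LENGTH:
        # "identity" multihash: code 0x00, one-byte length, then the key itself
        return bytes([0x00, len(serialized_pubkey)]) + serialized_pubkey
    # sha2-256 multihash: code 0x12, length 0x20, then the 32-byte digest
    return bytes([0x12, 0x20]) + hashlib.sha256(serialized_pubkey).digest()


def matches_expected_peer(expected_peer_id: bytes, identity_key: bytes) -> bool:
    """True if the handshake's identity key hashes to the PeerID we dialed."""
    return peer_id_from_pubkey(identity_key) == expected_peer_id
```

A handler built this way would drop the connection whenever `matches_expected_peer` returns `False`.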
Is this the correct behavior? The reason I ask is because the majority of peers returned during the DHT FIND_NODE walks end up failing this check (something like 2/3rds).
I was assuming there’s something wrong with my XX handshake implementation, but the Bootstrap node list seems to work correctly (each node’s p2p PeerID matches its Identity Key in the Noise Handshake) and I can connect and interact with them as expected.
What are some possible explanations for the PeerID / PubKey mismatches that I’m encountering?
In the end it’s up to you; maybe you don’t care about security.
Current implementations return an error if that happens.
Is this the correct behavior? The reason I ask is because the majority of peers returned during the DHT FIND_NODE walks end up failing this check (something like 2/3rds).
Aren’t you just trying to dial localhost?
Many nodes listen on localhost:4001, and yours likely does too, so in the past you used to dial yourself over and over because other nodes would advertise the same IP as you. (Now I believe go-libp2p filters out localhost addresses and only tries them if they were found on the LAN DHT or via mDNS.)
The DHT is a wild world with a very high churn rate: many nodes just pop up reusing the same address as a previous one and die after a few minutes. It wouldn’t surprise me if the DHT keeps old records cached for some time.
Or in other words: if you cannot reproduce this in a simple 1-to-1 case, where you know the peer ID should match but it doesn’t, I wouldn’t worry about it.
Okay so it sounds like enforcing this check is necessary.
I know I included a couple of internal addresses in the example list, but I’m ensuring I only dial external addresses during the FIND_NODE lookups (filtering out local listening addresses, as you mentioned).
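For anyone wanting to do the same filtering, a rough sketch follows. It only uses Python's standard `ipaddress` module; the helper name `is_public_ip4_multiaddr` is made up for illustration, it works on the string form of a multiaddr, and for brevity it handles only `/ip4/...` addresses (everything else is treated as not dialable).

```python
import ipaddress


def is_public_ip4_multiaddr(addr: str) -> bool:
    """True if addr is an /ip4/... multiaddr with a globally routable IP."""
    parts = addr.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "ip4":
        return False  # not an ip4 multiaddr (dns4, ip6, ... are skipped here)
    try:
        ip = ipaddress.IPv4Address(parts[1])
    except ValueError:
        return False  # malformed IP component
    # is_global excludes loopback, RFC 1918 private ranges, link-local, etc.
    return ip.is_global
```

For example, `/ip4/127.0.0.1/tcp/4001` and `/ip4/192.168.1.5/tcp/4001` are rejected, while an address on a public IP is kept.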
So there’s a good chance that these records are just outdated? Maybe a node at that IP address rebooted with a different PeerID?
In case someone else comes across this thread with a similar issue…
It turns out I wasn’t handling embedded Ed25519 pub keys properly, so they were failing the Noise Handshake IdentityKey == PeerID check described above.
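For context on what "embedded" means here, a small sketch, assuming the libp2p peer ID spec: Ed25519 keys serialize to 36 bytes, which is short enough that the PeerID is an "identity" multihash containing the key itself rather than a SHA-256 digest of it. The helper name `embedded_ed25519_key` is hypothetical, the input is the PeerID as raw multihash bytes (after base58 decoding), and the fixed 4-byte prefix assumes the deterministic protobuf encoding the spec requires (Type = Ed25519, then a 32-byte Data field).

```python
from typing import Optional

# Serialized PublicKey protobuf header for Ed25519:
# 0x08 0x01 -> field 1 (Type) = 1 (Ed25519); 0x12 0x20 -> field 2 (Data), 32 bytes
ED25519_PROTOBUF_PREFIX = bytes([0x08, 0x01, 0x12, 0x20])


def embedded_ed25519_key(peer_id: bytes) -> Optional[bytes]:
    """Return the raw 32-byte Ed25519 key inlined in peer_id, or None."""
    if len(peer_id) < 2 or peer_id[0] != 0x00:
        return None  # not an identity multihash: the key is hashed, not embedded
    length, payload = peer_id[1], peer_id[2:]
    if length != len(payload):
        return None  # declared length does not match the actual payload
    if len(payload) != 36 or not payload.startswith(ED25519_PROTOBUF_PREFIX):
        return None  # embedded key is not an Ed25519 PublicKey protobuf
    return payload[len(ED25519_PROTOBUF_PREFIX):]
```

A PeerID that starts with the SHA-256 multihash header (`0x12 0x20`) yields `None`, which is the case where the key can only be compared by hashing the handshake's identity key.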