Background
- I’ve written permissionless blockchain software using rust-libp2p that uses Kademlia and Gossipsub and these seem to work fine.
- I have an adequate theoretical understanding of the Kademlia and Gossipsub algorithms.
So my questions are not about how to use libp2p, and not about wanting a high-level overview of what Kademlia and Gossipsub are as algorithms.
Instead, my questions are about how the different libp2p protocols work with and relate to each other under the umbrella of libp2p. E.g., do they sit in a layered architecture? Do they interact through events? Do they directly call each other? How much of one protocol’s internal state is visible to another protocol?
I have two specific questions (below), but the general concern I’m expressing in this post is that libp2p really needs an architecture and a set of docs that help non-maintainers form an intuitive and accurate mental model of what libp2p is.
I hope someone with knowledge of libp2p internals can respond to my specific questions. Pointers to further reading, or an info-dump about the architecture, would also be appreciated.
Specific questions
How does Kademlia depend on Identify?
The rust-libp2p docs page for Kademlia says that add_address must be called on receiving an Identify message to add the sender to the receiver’s routing table.
A specific question is then: how does Kademlia in the sender trigger the sending of the Identify message in the first place?
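For concreteness, here is how I picture the receiving side of that docs statement. This is a toy model using only the standard library, not rust-libp2p code: ToyKademlia, IdentifyReceived and on_identify are hypothetical names, and peer IDs and addresses are plain strings. The only thing borrowed from the docs is the idea that something outside Kademlia calls add_address when an Identify message arrives.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the information carried by a received Identify message.
struct IdentifyReceived {
    peer_id: String,
    listen_addrs: Vec<String>,
}

/// Hypothetical stand-in for Kademlia's routing table (peer -> known addresses).
#[derive(Default)]
struct ToyKademlia {
    routing_table: HashMap<String, Vec<String>>,
}

impl ToyKademlia {
    /// Plays the role of add_address: remember an address for a peer, ignoring duplicates.
    fn add_address(&mut self, peer: &str, addr: String) {
        let addrs = self.routing_table.entry(peer.to_string()).or_default();
        if !addrs.contains(&addr) {
            addrs.push(addr);
        }
    }
}

/// Glue that lives outside both protocols (assumption: in the application).
/// It forwards every address reported by Identify into the Kademlia stand-in.
fn on_identify(kad: &mut ToyKademlia, event: IdentifyReceived) {
    for addr in event.listen_addrs {
        kad.add_address(&event.peer_id, addr);
    }
}

fn main() {
    let mut kad = ToyKademlia::default();
    let event = IdentifyReceived {
        peer_id: "peer-a".to_string(),
        listen_addrs: vec!["/ip4/10.0.0.1/tcp/4001".to_string()],
    };
    on_identify(&mut kad, event);
    println!("{:?}", kad.routing_table);
}
```

Nothing in this sketch explains the sending side, which is exactly the part my question is about.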
The libp2p Kademlia specification contains only a single, brief mention of Identify, and none of the descriptions of DHT operations reference Identify or call for sending an Identify message as a step. Also, the RPC protobuf Message definition contains a nested “Peer” message whose fields mostly duplicate what an Identify message contains, which makes sending an Identify message for Kademlia peering seem redundant.
Additionally, the fact that Kademlia can ostensibly trigger Identify messages confuses the mental model I’ve been trying to form of libp2p as a “layered architecture”. I’m tempted to imagine Kademlia and Identify as two protocols in the same layer, and Kademlia depending on Identify in this manner feels like a violation of this relationship.
How does Gossipsub use Kademlia for peer discovery?
Multiple articles in the docs state that Kademlia can be used for peer discovery, and the article on pubsub explicitly states that Gossipsub needs a separate protocol for peer discovery, with DHTs mentioned as an option.
Say I choose Kademlia for peer discovery. An obvious thing I’ll look for is some kind of interface that I can use to “plug in” Kademlia into Gossipsub. I would initially expect to find some kind of method that’s generic over a peer discovery mechanism and which I can “pass” Kademlia into.
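To show the kind of interface I was expecting, here is a purely hypothetical sketch. None of these names (PeerDiscovery, KademliaDiscovery, PluggableGossipsub) exist in rust-libp2p as far as I know; they only illustrate a pubsub type that is generic over a discovery mechanism.

```rust
/// Hypothetical trait for anything that can suggest peers for a topic.
trait PeerDiscovery {
    fn discover(&mut self, topic: &str) -> Vec<String>;
}

/// Hypothetical Kademlia-backed discovery that just returns a fixed peer list.
struct KademliaDiscovery {
    known_peers: Vec<String>,
}

impl PeerDiscovery for KademliaDiscovery {
    fn discover(&mut self, _topic: &str) -> Vec<String> {
        self.known_peers.clone()
    }
}

/// Hypothetical pubsub type that is generic over its discovery mechanism.
struct PluggableGossipsub<D: PeerDiscovery> {
    discovery: D,
    mesh: Vec<String>,
}

impl<D: PeerDiscovery> PluggableGossipsub<D> {
    /// The "plug in" point: the discovery mechanism is passed in at construction.
    fn new(discovery: D) -> Self {
        Self { discovery, mesh: Vec::new() }
    }

    /// On subscribe, ask the discovery mechanism for peers and add new ones to the mesh.
    fn subscribe(&mut self, topic: &str) {
        for peer in self.discovery.discover(topic) {
            if !self.mesh.contains(&peer) {
                self.mesh.push(peer);
            }
        }
    }
}

fn main() {
    let discovery = KademliaDiscovery {
        known_peers: vec!["peer-a".to_string(), "peer-b".to_string()],
    };
    let mut pubsub = PluggableGossipsub::new(discovery);
    pubsub.subscribe("blocks");
    println!("{:?}", pubsub.mesh);
}
```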
However, searching through the rust-libp2p Gossipsub codebase, there are no non-comment matches for “discovery”.
My best hunch about how Gossipsub uses Kademlia, after reading and reflecting on libp2p’s written materials, is that every time Kademlia (or any other protocol) opens a connection to a new peer, Gossipsub somehow finds out. When it finds out, it decides whether to also open a substream to the new peer for itself. But how and where exactly this happens, I don’t know.
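To make the hunch concrete, here is a toy model of it, again standard library only. It is a sketch of my guess, not of how rust-libp2p is actually implemented: ConnectionListener, KadListener, GossipListener and ToySwarm are hypothetical names, and the "substream" is just an entry in a list.

```rust
/// Hypothetical trait: each protocol is told about every new connection.
trait ConnectionListener {
    fn on_connection_established(&mut self, peer: &str);
}

/// Hypothetical Kademlia stand-in that only records which peers it has seen.
#[derive(Default)]
struct KadListener {
    seen: Vec<String>,
}

/// Hypothetical Gossipsub stand-in that "opens a substream" to every new peer.
#[derive(Default)]
struct GossipListener {
    substreams: Vec<String>,
}

impl ConnectionListener for KadListener {
    fn on_connection_established(&mut self, peer: &str) {
        self.seen.push(peer.to_string());
    }
}

impl ConnectionListener for GossipListener {
    fn on_connection_established(&mut self, peer: &str) {
        // Assumption: the decision is made locally, without asking Kademlia.
        if !self.substreams.iter().any(|p| p == peer) {
            self.substreams.push(peer.to_string());
        }
    }
}

/// Hypothetical shared layer that owns the connections and notifies every protocol.
#[derive(Default)]
struct ToySwarm {
    kademlia: KadListener,
    gossipsub: GossipListener,
}

impl ToySwarm {
    /// Whoever caused the connection, both protocols hear about it.
    fn connection_established(&mut self, peer: &str) {
        self.kademlia.on_connection_established(peer);
        self.gossipsub.on_connection_established(peer);
    }
}

fn main() {
    let mut swarm = ToySwarm::default();
    swarm.connection_established("peer-a");
    println!("kademlia saw: {:?}", swarm.kademlia.seen);
    println!("gossipsub substreams: {:?}", swarm.gossipsub.substreams);
}
```

In this model the two protocols never reference each other. They only share the layer that reports connections, which is the relationship I would like to see confirmed or corrected.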