Kad-DHT Protobuf Messages, Peer Routing and Content Providers

brandon · February 21, 2022, 7:47pm

Hi, I’m in the process of implementing the Kad-DHT protocol for a new LibP2P implementation.

I was hoping to get some more info / clarity around the various message types and how Content Providing actually works on the DHT.

I’ve read through a bunch of documentation (and the Go and JS implementations) and can’t seem to fully wrap my head around the formatting of DHT Protobuf Messages (specifically PUT_VALUE, GET_VALUE and ADD_PROVIDER).

Regarding Content Providing and Fetching

What content is the DHT actually storing? Is it primarily Public Keys?

Regarding PUT_VALUE & GET_VALUE Messages

Is the key prefixed with a namespace depending on what kind of Record it’s pointing to?
- /pk/ for public keys, /ipns/ for ipns records?
- are there other namespaces?
Does the Key in the DHT Record always match that of the Key in the PUT/GET_VALUE Message?
Is the value in the DHT Record always just a public key? Can you store other information in it? If so, are there any limitations apart from size?

If the DHT is used for storing Public Keys

Why is the DHT used for public key sharing when the Identify Protocol supports the transferring of PubKeys?
- Is it so we can verify signed content without having to establish a connection to the individual?
- If so, what’s an example of signed content we would be verifying? (I noticed that the Signature & Author fields of the DHT Protobuf message have been removed.)
Ex: If we’re trying to find the public key of a PeerID (b58String or cidString) is the key in GET_VALUE the utf8 encoding of “/<namespace>/<cid>” → “/pk/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN”
Ex: Should I be able to query the bootstrap node at /dnsaddr/bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN for their public key using GET_VALUE with the key set to SHA256("/pk/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN")?

Providing Content

Can I Provide content while in Client mode, or does content providing require you to be in Server mode? Should I be providing my public key every time I join the DHT network?

Known Key Value Records for Testing

Are there any lists of keys with known values that I can test against?

If this community can help me figure this out, I’d be happy to compile & contribute what I’ve learned so far to the Content Routing documentation issue on github at /libp2p/docs/issues/23

Thanks!

Jorropo · February 21, 2022, 8:00pm

In go-libp2p, the DHT doesn’t even know about prefixes; it just receives a /pk/... string, hashes it, and uses the hash.

It is earlier in go-namesys, or go-bitswap that prefixing is done.

I think this is a design you can use too, it is not DHT’s job to deal with prefixing, it is the job of the DHT consumers.
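A rough sketch of that split, in Python for brevity. The function names `namespaced_key` and `kad_key` are hypothetical, and the identifier bytes are a placeholder; whether the identifier is raw peer ID bytes or some text encoding is decided by the DHT consumer, not by the DHT. The only part assumed from the libp2p Kademlia spec is that the keyspace is the SHA-256 digest of the key.

```python
import hashlib


def namespaced_key(namespace: str, identifier: bytes) -> bytes:
    """Consumer side: build the record key as /<namespace>/<identifier>."""
    return b"/" + namespace.encode("utf-8") + b"/" + identifier


def kad_key(record_key: bytes) -> bytes:
    """DHT side: treat the key as opaque bytes and hash it into the keyspace."""
    return hashlib.sha256(record_key).digest()


# Placeholder identifier, not a real peer ID.
key = namespaced_key("pk", b"example-peer-id-bytes")
digest = kad_key(key)  # 32 bytes, used only for routing and distance
```

The DHT side never inspects the prefix here; it only sees bytes in and a 32-byte digest out.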

IPNS and history.

This comment explains it: https://github.com/ipfs/go-namesys/blob/56a567d4fae3f1f7f4358ef18985f591d33b56ce/publisher.go#L237-L239

Maybe it works idk.

However, for new work, using the identify protocol sounds far better to me.

Yes.

Client = creating PUT / GET requests.

Server = serving PUT / GET requests (in other words, helping other nodes).
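To make the distinction concrete, here is a minimal sketch, assuming the behaviour described in the client and server mode section of the libp2p Kademlia spec: a server-mode node advertises the DHT protocol ID through identify, while a client-mode node does not and only sends requests. `Mode` and `advertised_protocols` are hypothetical names, not part of any libp2p API.

```python
from enum import Enum
from typing import List

KAD_PROTOCOL_ID = "/ipfs/kad/1.0.0"


class Mode(Enum):
    CLIENT = "client"  # only issues PUT / GET requests
    SERVER = "server"  # also answers requests from other nodes


def advertised_protocols(mode: Mode) -> List[str]:
    """Protocol IDs this node would announce via identify (assumed behaviour)."""
    return [KAD_PROTOCOL_ID] if mode is Mode.SERVER else []
```

Either mode can still open outbound streams to servers, which is why a client can issue requests without being added to other nodes' routing tables.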

Jorropo · February 21, 2022, 8:01pm

No, it is provider records.

mxinden · February 22, 2022, 10:19am

I’m in the process of implementing the Kad-DHT protocol for a new LibP2P implementation.

Curious to hear more! What language are you using? Are you targeting the IPFS DHT specifically or any libp2p-based DHT?

Great responses by @Jorropo above

Also see the recent additional paragraph on client mode: https://github.com/libp2p/specs/blob/master/kad-dht/README.md#client-and-server-mode

For FIND_NODE you can start by finding one boot node while only being connected to another. You can find the list of bootstrap nodes via this command.

You can compare your results with those of ipfs or libp2p-lookup.

For GET_VALUE and GET_PROVIDERS, @lidel might know of a list of CIDs always present on the network?

brandon · February 24, 2022, 6:41am

Thanks for the quick responses @Jorropo & @mxinden!

I think I was conflating the generic Kad DHT protocol with the default / standard IPFS implementation at /ipfs/kad/1.0.0

If I understand it correctly, the IPFS version of the Kad DHT is restricted to storing PublicKey Records and IPFS/IPNS Records. Therefore, we can’t just ask nodes running /ipfs/kad/1.0.0 to PUT a record like Record(key: “hello”, value: “world”).

While if we implemented a custom Kad DHT with a protocol prefix of /custom/kad/1.0.0 that supports Records of any content/type (aka no namespaces or validators in place) then we could PUT a record like Record(key: “hello”, value: “world”).

Is the above correct?

Are these provider records any Record that conforms to the Record Protobuf and in the case of /ipfs/kad/1.0.0 are they the IPNS Records defined here?

I’ve been working on implementing LibP2P in Swift. I’ve managed to hack together all of the various Multiformats and Crypto/PeerID libraries, along with TCP, UDP, WS, MSS, Noise, MPLEX, mDNS and I’m working on getting Kad DHT, Floodsub and Gossipsub set up at the moment. It’s all very much a proof of concept though, the code is pretty gnarly.

Planning on making the GitHub repo public once I can get all of the basics working

Jorropo · February 24, 2022, 6:11pm

Yes. It is.
However, in case that’s not obvious, note that this will create a second DHT alongside IPFS’s one, which only your nodes would support and host content on.
(That doesn’t exclude you from IPFS’s DHT; you can run both at the same time.)

I would try to not use that.
It would be trivial to DOS and abuse. I think such DHT would be very unreliable (even more than IPFS’s one).
I would try, in order:

1. Use pubsub (which uses provider records).
2. Use provider records to find neighbors, then implement a custom node-to-node protocol.
3. Create a custom DHT with a custom validator.
4. Use a broadcast-type network (blockchain).

Creating a custom DHT without a validator isn’t even on that list.
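For illustration, a custom validator keyed by namespace could look roughly like the sketch below. This is an assumption about the general shape, loosely modelled on the idea of namespaced validators, and not the API of any libp2p implementation; `split_namespace`, `NamespacedValidator`, and the `demo` namespace are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# A validator takes (key, value) and returns True if the record is acceptable.
Validator = Callable[[bytes, bytes], bool]


def split_namespace(key: bytes) -> Tuple[bytes, bytes]:
    """Split b"/ns/rest" into (b"ns", b"rest"); raise ValueError if malformed."""
    if not key.startswith(b"/"):
        raise ValueError("key has no namespace")
    namespace, sep, rest = key[1:].partition(b"/")
    if not sep or not namespace:
        raise ValueError("key has no namespace")
    return namespace, rest


class NamespacedValidator:
    """Reject any record whose namespace has no registered validator."""

    def __init__(self, validators: Dict[bytes, Validator]) -> None:
        self.validators = validators

    def validate(self, key: bytes, value: bytes) -> bool:
        try:
            namespace, _ = split_namespace(key)
        except ValueError:
            return False
        validator = self.validators.get(namespace)
        return validator is not None and validator(key, value)


# Hypothetical rule for a made-up namespace: values of at most 64 bytes.
dht_validator = NamespacedValidator({b"demo": lambda key, value: len(value) <= 64})
```

With this setup a bare record such as key `hello`, value `world` is rejected because it has no namespace, while `/demo/hello` passes the size rule. A real validator would check something harder to forge, such as a signature.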

brandon · February 24, 2022, 6:46pm

Awesome. Yeah I don’t plan on creating the hypothetical DHT noted above, I was just trying to understand the limitations/restrictions on PUTing values to the standard/default /ipfs/kad/1.0.0 DHT.

Thanks again for the help @Jorropo, I really appreciate it!

brandon · February 24, 2022, 7:46pm

Also a quick clarification on DHT Message formatting…

// Record represents a dht record that contains a value
// for a key value pair
message Record {
    // The key that references this record
    bytes key = 1;

    // The actual value this record is storing
    bytes value = 2;

   ...

    // Time the record was received, set by receiver
    // Formatted according to https://datatracker.ietf.org/doc/html/rfc3339
    string timeReceived = 5;
};
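As a small illustration of the timeReceived field, the sketch below mirrors the three fields shown above in a Python dataclass and formats the receive time as an RFC 3339 string. The `Record` class, `rfc3339`, and `stamp_received` are hypothetical helpers, and the elided fields are left out rather than guessed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Record:
    key: bytes
    value: bytes
    time_received: str = ""  # set by the receiver, not the sender


def rfc3339(ts: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return ts.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


def stamp_received(record: Record, now: Optional[datetime] = None) -> Record:
    """Receiver side: overwrite time_received with the local receive time."""
    record.time_received = rfc3339(now or datetime.now(timezone.utc))
    return record
```

Because the receiver sets this field, a sender-supplied value would simply be overwritten in this sketch.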

With respect to the /ipfs/kad/1.0.0 DHT

For IPNS/IPFS Records
- Is the value field just the serialized IPNS Record?
For Public Key Records
- Is the value field just the raw public key bytes? or a marshaled version of the Public key?

And in both cases is the key just the UTF8 encoding of the <namespace><cid>?

And when the DHT is attempting to store this record it uses the SHA2_256(<namespace><cid>) to find the k closest peers before calling PUT_VALUE on each peer?
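The selection step described here can be sketched as follows, assuming (per the libp2p Kademlia spec) that both record keys and peer IDs are hashed with SHA-256, that distance is the XOR of the two digests, and that k is 20. A real node finds these peers through an iterative lookup over the network; sorting a known list, as below, only illustrates the distance metric. `kad_id` and `closest_peers` are hypothetical names.

```python
import hashlib
from typing import Iterable, List

K = 20  # replication parameter, assumed from the libp2p Kademlia spec


def kad_id(raw: bytes) -> int:
    """Hash raw key or peer ID bytes into the 256-bit keyspace as an integer."""
    return int.from_bytes(hashlib.sha256(raw).digest(), "big")


def closest_peers(record_key: bytes, peer_ids: Iterable[bytes], k: int = K) -> List[bytes]:
    """Return up to k peer IDs ordered by XOR distance to the hashed record key."""
    target = kad_id(record_key)
    return sorted(peer_ids, key=lambda peer: kad_id(peer) ^ target)[:k]
```

The PUT_VALUE message itself would still carry the unhashed record key; the digest is only used to decide which peers receive it.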

Separation of Libp2p and IPFS with respect to the DHT

If we are to draw the line between Libp2p and IPFS, does Libp2p only use the default /ipfs/kad/1.0.0 DHT for peer discovery? A bare bones libp2p instance with Kad DHT enabled for peer discovery will automatically connect to the /ipfs/kad/1.0.0 DHT, but it doesn’t necessarily implement the logic to handle IPNS/IPFS Records and therefore can’t be used to PUT / GET / PROVIDE values. Is it then IPFS’s job to extend the /ipfs/kad/1.0.0 protocol handlers with the logic necessary to handle these Records and actually utilize the DHT for content routing?

Apologies if these questions are ignorant, I’m trying to read as much documentation/code as I can find before posting here.

mxinden · February 27, 2022, 6:51pm

Yes. E.g. see rust-libp2p:

github.com/libp2p/rust-libp2p/blob/b1859464c956824c45cb9767486af1334fab2a13/protocols/kad/src/protocol.rs#L43-L44

    /// The protocol name used for negotiating with multistream-select.
    pub const DEFAULT_PROTO_NAME: &[u8] = b"/ipfs/kad/1.0.0";

Not sure how this is handled in go-libp2p. In rust-libp2p, even though it is unaware of IPNS/IPFS, it still offers getting and putting (provider) records.

That is great. Please keep us updated.

No apologies needed! Impressed by your research work thus far. Also happy to jump on a call in case that is helpful, just send me a mail. In addition, would be cool to have you join the community calls sometimes. Libp2p Community Calls

elielmathe · October 9, 2024, 6:47am

I would like to add this question to the thread: Can DHT work in ‘server’ mode in a browser?

guissou · October 15, 2024, 1:08pm

Yes, as long as it is publicly reachable
