Share.ipfs.io peer and content discovery

Hey everyone,

I’m cross-posting my question from the GitHub issue “share.ipfs.io - pcp interoperability” (Issue #127, ipfs-shipyard/ipfs-share-files) to this platform.

I’m trying to make pcp interoperate with share.ipfs.io.

I saw that share.ipfs.io uses the libp2p/js-libp2p-webrtc-star transport, so I jumped in and started my own implementation of a go-libp2p-webrtc-star transport based on mtojek/go-libp2p-webrtc-star (which is outdated) and libp2p/go-libp2p-webrtc-direct.

I still have trouble fully grasping the peer and content discovery flow, though. Please correct me if I’m wrong:

  • js-libp2p-webrtc-star is not only used as a transport but also for peer discovery via its discovery property (next to the bootstrap discovery mechanism).
  • When a user opens share.ipfs.io, the web app opens a WebSocket connection to the signalling star and registers itself there (while simultaneously connecting to the set of bootstrap nodes).
  • The signalling star then tells the app about all currently connected peers.
  • Since autoDial is enabled, the app optimistically connects to all peers by exchanging ICE candidates through the signalling star.
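
To make the flow above concrete, here is a minimal toy simulation in Go. It is not the webrtc-star protocol: `Star`, `Peer`, `Join` and `onDiscovered` are hypothetical names, there is no WebSocket or ICE exchange, and the sketch assumes the star announces peers in both directions when someone joins. It only illustrates how "register, get told about peers, autoDial" leads to a full mesh.

```go
package main

import (
	"fmt"
	"sort"
)

// Star is a toy stand-in for the signalling server (hypothetical type).
// It only tracks registered peers and announces them to each other.
type Star struct {
	peers map[string]*Peer
}

// Peer is a toy browser node. Dialed records the peers it dialed itself.
type Peer struct {
	ID       string
	AutoDial bool
	Dialed   map[string]bool
}

func NewStar() *Star { return &Star{peers: map[string]*Peer{}} }

func NewPeer(id string, autoDial bool) *Peer {
	return &Peer{ID: id, AutoDial: autoDial, Dialed: map[string]bool{}}
}

// Join registers p at the star. Assumption: the star tells the newcomer
// about every registered peer and tells every registered peer about the
// newcomer.
func (s *Star) Join(p *Peer) {
	for _, other := range s.peers {
		p.onDiscovered(other)
		other.onDiscovered(p)
	}
	s.peers[p.ID] = p
}

// onDiscovered is the discovery callback. With AutoDial set, the peer
// dials optimistically; a real node would exchange ICE candidates through
// the star at this point.
func (p *Peer) onDiscovered(other *Peer) {
	if p.AutoDial {
		p.Dialed[other.ID] = true
	}
}

// Connections returns the IDs this peer dialed, sorted for stable output.
func (p *Peer) Connections() []string {
	ids := make([]string, 0, len(p.Dialed))
	for id := range p.Dialed {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	star := NewStar()
	a, b, c := NewPeer("A", true), NewPeer("B", true), NewPeer("C", false)
	star.Join(a)
	star.Join(b)
	star.Join(c)
	fmt.Println(a.Connections(), b.Connections(), c.Connections())
}
```

Peer C has autoDial switched off here, so it never dials anyone itself, although A and B still dial it once the star announces it.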

Now I’m wondering how content is discovered if users access the app with a CID URL fragment. If my understanding above is correct I could imagine that:

  • the app connects to more and more peers and optimistically exchanges bitswap messages with newly connected peers until it finds the right peer.
  • the app somehow leverages pubsub. I saw that pubsub with Gossipsub is enabled but I can’t see any use of it.
  • something else?

Other questions:

  • In js-libp2p-webrtc-star the transport has a double responsibility, since it also does peer discovery. Is there a similar inter-module communication mechanism for this in go-libp2p?
  • Just to tighten my understanding: is the Bootstrap discovery mechanism strictly necessary, given autoDial and the peers discovered through the signalling server?

Cheers,
Dennis

Yes, but this is an approach that we would like to opt out of at some point, once a node can better identify what types of connections it needs. This is tracked in “Connection Manager Overhaul” (Issue #744, libp2p/js-libp2p).

Content discovery is a different module than Peer Discovery. js-ipfs by default uses libp2p/js-libp2p-delegated-content-routing ("Leverage other peers in the network to perform Content Routing calls"), where it leverages go-ipfs nodes to perform DHT queries on its behalf. This means that nodes are able to do provide(cid) and then findProviders(cid) for content routing purposes. However, it is important to point out that to actually get the content, you will need to connect to the providers (i.e. you have to share a transport with them, and in the browser, if they use WebSockets, the other peer needs to have a WSS address).
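
As a rough illustration of what delegating findProviders(cid) means on the wire, the sketch below builds the request URL for the go-ipfs HTTP API command `dht/findprovs`. The helper name `FindProvsURL` and the local API address are assumptions for illustration; this is not the code of the delegated content routing module, and the actual request would be an HTTP POST to the resulting URL.

```go
package main

import (
	"fmt"
	"net/url"
)

// FindProvsURL (hypothetical helper) returns the go-ipfs HTTP API URL that
// asks a delegate node to look up providers for cid in the DHT.
func FindProvsURL(apiBase, cid string) (string, error) {
	u, err := url.Parse(apiBase)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v0/dht/findprovs"
	q := url.Values{}
	q.Set("arg", cid) // the CID to find providers for
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// Assumption: a go-ipfs node exposing its API on the default local port.
	u, err := FindProvsURL("http://127.0.0.1:5001", "bafybeidy2k2bqhwploovluay7ntdf66t5qnodskkqhxfljxqlqfhrwarde")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

The browser node never joins the DHT itself in this model; it only needs to be able to dial whatever providers the delegate returns.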

The bootstrap nodes are particularly important for bootstrapping your node’s network topology (if the DHT is enabled), as the bootstrap list should contain DHT nodes that will help the node discover other peers. So, for webrtc-star connectivity they are not needed if you just want a mesh of all the peers in a given dapp that use the star server.

Awesome, thanks for the information @vasco-santos! That makes it clearer to me.

However, I still don’t quite understand how the DHT is used in the share.ipfs.io case. I can see that share.ipfs.io uses ipfs-provider which is configured to spawn an embedded js-ipfs node with a custom libp2p config:

The referenced libp2p config says that the DHT is disabled:

I’m not quite sure what this means though.

If I understand it correctly the ipfs-provider could have been configured to use an HTTP API to facilitate the DHT query/provide delegation you’ve mentioned above. But it doesn’t seem to be configured like that.


Maybe the above settings don’t affect the default behaviour of js-ipfs using the delegated content routing module. So I experimented by querying the DHT for providers of the CID that share.ipfs.io shows me when I’m sharing a file. E.g.:

share.ipfs.io:

  • https://share.ipfs.io/#/bafybeidy2k2bqhwploovluay7ntdf66t5qnodskkqhxfljxqlqfhrwarde

My hosted IPFS node:

  • ipfs dht findprovs bafybeidy2k2bqhwploovluay7ntdf66t5qnodskkqhxfljxqlqfhrwarde

This doesn’t yield any results. Shouldn’t this yield the peer ID of my browser node?

Thanks again

The native DHT is disabled in js-libp2p at share.ipfs.io because it has experimental status in js-ipfs and has limited utility in a browser context:

  • as a client, even if you discover peers, those will have /tcp addrs that cannot be dialed from a browser, and browser nodes are ephemeral in nature
  • as a provider, a browser node is pretty ephemeral and won’t be dialable after the user closes the tab (for an ephemeral node like this, webrtc-star is a better-suited solution)

Publishing to the DHT is limited at share.ipfs.io, as the app mostly leverages webrtc-star plus light delegated routing by means of preload nodes: most clients are connected to preload nodes, and uploading a file will pre-cache the CID on some go-ipfs node that has a real DHT and will provide it for some time as a proxy. You can see this as XHR calls to the /refs endpoint.
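
A minimal sketch of what such a preload call could look like, written in Go for illustration. The helper name `PreloadURL`, the preload host and the recursive flag `r=true` are assumptions, not taken from the share.ipfs.io source; the only thing stated above is that the app issues XHR calls to a /refs endpoint.

```go
package main

import (
	"fmt"
	"net/url"
)

// PreloadURL (hypothetical helper) builds a refs request that makes a
// preload node walk the DAG behind cid, which pulls the blocks into that
// node's cache as a side effect.
func PreloadURL(preloadBase, cid string) (string, error) {
	u, err := url.Parse(preloadBase)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v0/refs"
	q := url.Values{}
	q.Set("arg", cid) // root CID of the uploaded file
	q.Set("r", "true") // assumption: recursive, so child blocks are fetched too
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// Assumption: the host below is a placeholder for a real preload node.
	u, err := PreloadURL("https://preload.example.org", "bafybeidy2k2bqhwploovluay7ntdf66t5qnodskkqhxfljxqlqfhrwarde")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```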

Using preload nodes as a jump point provides an alternative discovery method to webrtc-star: it works even when direct p2p is not possible, and it provides an ephemeral cache for some time even after the original uploader’s tab is closed.

(And the other way around: if preload nodes are down, users can still use the app over webrtc-star.)