Since the pubsub implementation does a very good job of maintaining an up-to-date list of peers (along with the sending and receiving goroutines), I was thinking: why not use this system for a direct send to a connected peer, without flooding the network with useless messages? This is particularly useful when you require authenticated data and do not want to be flooded by the whole network with n response messages that all contain the same bytes in the Data field. A real example would be a node in a blockchain project that requires signed transactions from its peers to get synchronized. It might not even want to “ask” more than 2-3 peers for the data.
It’s not clear how such a feature would be implemented API-wise.
I was thinking of a new field in the Message struct, which might be called Destination. If Destination is not set, the message follows the normal pubsub path. Otherwise, if the node has a direct connection to the destination peer (and the destination peer is subscribed to the topic), the message is sent only to that peer. The message is dropped if the sender does not have a connection to the destination peer.
I’m trying to implement it in our forked repo, just to see if it can be done.
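To make the proposed rule concrete, here is a minimal sketch of the recipient selection it describes. It is not the fork's code and not the go-libp2p-pubsub API: `PeerID`, `Message`, `Router` and `Recipients` are hypothetical stand-ins, and the router state is reduced to a map of connected peers and their topics.

```go
package main

import "fmt"

// PeerID is a simplified stand-in for a libp2p peer identifier (assumption).
type PeerID string

// Message keeps only the fields relevant to the proposal.
// Destination is the proposed new field; empty means "broadcast as usual".
type Message struct {
	Topic       string
	Data        []byte
	Destination PeerID
}

// Router is a toy view of pubsub state: each directly connected peer
// mapped to the set of topics it is subscribed to.
type Router struct {
	peers map[PeerID]map[string]bool
}

// Recipients returns the peers the message would be sent to.
func (r *Router) Recipients(m Message) []PeerID {
	if m.Destination == "" {
		// Normal pubsub path: every connected peer subscribed to the topic.
		var out []PeerID
		for p, topics := range r.peers {
			if topics[m.Topic] {
				out = append(out, p)
			}
		}
		return out
	}
	// Direct send: only if the destination is connected and subscribed.
	if topics, ok := r.peers[m.Destination]; ok && topics[m.Topic] {
		return []PeerID{m.Destination}
	}
	// No connection (or no subscription): the message is dropped.
	return nil
}

func main() {
	r := &Router{peers: map[PeerID]map[string]bool{
		"peerA": {"tx": true},
		"peerB": {"tx": true},
		"peerC": {"blocks": true},
	}}
	fmt.Println(len(r.Recipients(Message{Topic: "tx"})))                  // broadcast to 2 peers
	fmt.Println(r.Recipients(Message{Topic: "tx", Destination: "peerB"})) // only peerB
	fmt.Println(len(r.Recipients(Message{Topic: "tx", Destination: "peerZ"}))) // 0, dropped
}
```

The sketch treats "connected but not subscribed to the topic" the same as "not connected" and drops the message; the original proposal only spells out the second case, so that part is an assumption.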
Wouldn’t the receiving peer then forward the message as normal, thus re-entering the pubsub broadcast?
It kind of defeats the purpose.
That can be avoided: the receiving peer “sees” that it is itself the destination and does not forward the message.
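A rough sketch of that receive-side check, under the same assumptions as the proposal (a hypothetical `Destination` field; `PeerID`, `Message` and `Handle` are illustrative names, not part of any libp2p package):

```go
package main

import "fmt"

// PeerID is a simplified stand-in for a libp2p peer identifier (assumption).
type PeerID string

// Message carries the hypothetical Destination field from the proposal.
type Message struct {
	Data        []byte
	Destination PeerID
}

// Handle decides what the node identified by self does with an incoming
// message: hand it to local subscribers (deliver) and/or re-broadcast it
// to other peers (forward).
func Handle(self PeerID, m Message) (deliver, forward bool) {
	switch m.Destination {
	case "":
		// Ordinary pubsub message: deliver locally and keep propagating.
		return true, true
	case self:
		// Direct message for this node: deliver, but do not re-enter the broadcast.
		return true, false
	default:
		// Direct message for someone else: ignoring it is an assumption here.
		return false, false
	}
}

func main() {
	d, f := Handle("me", Message{Destination: "me"})
	fmt.Println(d, f) // true false
	d, f = Handle("me", Message{})
	fmt.Println(d, f) // true true
}
```

What a node should do with a direct message addressed to another peer is not covered in the discussion; the sketch simply ignores it.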
Managed to implement direct send functionality in our fork:
I added some integration tests that show the message reaching only the destination peer (if it exists).
I’m missing something… why use pubsub when you can Connect directly to the peer you want and pass whatever data you need to that peer? Just because pubsub is there, it should not become a replacement for libp2p streams.
Agreed, but when exchanging messages between connected peers you need at least one goroutine per peer for receiving data (so another goroutine for each peer, when pubsub already has one for that peer). Also, sending an uncontrolled flow of data (from a very high number of goroutines) through a stream might cause stream reset errors. Pubsub solves all of those issues, which makes it a good candidate for this feature.
It would be better to report those errors rather than work around them using pubsub (I’m not sure the stream writers are race-safe; perhaps you need to wrap them in something that is).
Pubsub will serialize and wrap the data, possibly enqueue the message, and make things significantly worse for 1-to-1 transfer of data. If you have a high number of goroutines producing an uncontrolled flow of data, pubsub is going to block in worse ways than the stream writer would.
If you really want pubsub for this, you don’t have to implement anything: you could have a specific topic to which only the destination peer subscribes (for example, using its peer ID as the topic name) and write to that one.
I did some testing with this new “feature” and indeed, the direct send is slower than our first iteration of a specialized direct sender component that used only streams. We are still investigating whether pubsub can be tweaked for a performance boost. This is the reason I started the discussion here rather than just opening an issue/PR.
Regarding the second option, we considered it, but subscribing to topics that are each used by only one peer would have introduced some overhead (announcements, more memory used, etc.).
Also, a second iteration of the direct sender component has been made in which the output stream of data is serialized (so stream errors no longer appear because of the many goroutines involved in writing). This serialization is done by a single sending goroutine (with n receiving goroutines). We are already thinking of a third iteration, in which we might mimic pubsub’s behavior by having a dedicated sending goroutine for each connected peer.
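For readers unfamiliar with the pattern, the "one sending goroutine" idea from the second iteration can be sketched roughly as below. This is not the actual component: `SerialSender`, `NewSerialSender` and `Run` are hypothetical names, and a `bytes.Buffer` stands in for a libp2p stream.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// SerialSender funnels messages from many goroutines through a single
// sending goroutine, so only one goroutine ever writes to the stream.
type SerialSender struct {
	queue chan []byte
	done  chan struct{}
	err   error // first write error; read only after Close returns
}

// NewSerialSender starts the single sending goroutine for w.
func NewSerialSender(w io.Writer, buffer int) *SerialSender {
	s := &SerialSender{queue: make(chan []byte, buffer), done: make(chan struct{})}
	go func() {
		defer close(s.done)
		for msg := range s.queue {
			if s.err != nil {
				continue // keep draining so producers never block on a dead stream
			}
			if _, err := w.Write(msg); err != nil {
				s.err = err
			}
		}
	}()
	return s
}

// Send enqueues a message; it blocks while the queue is full (backpressure).
func (s *SerialSender) Send(msg []byte) { s.queue <- msg }

// Close stops the sender once every producer has finished and returns
// the first write error, if any.
func (s *SerialSender) Close() error {
	close(s.queue)
	<-s.done
	return s.err
}

// Run starts `producers` goroutines that each send `perProducer` copies of a
// 4-byte message, and returns how many bytes reached the writer.
func Run(producers, perProducer int) (int, error) {
	var stream bytes.Buffer // stands in for a libp2p stream
	s := NewSerialSender(&stream, 16)
	var wg sync.WaitGroup
	for i := 0; i < producers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perProducer; j++ {
				s.Send([]byte("msg\n"))
			}
		}()
	}
	wg.Wait()
	err := s.Close()
	return stream.Len(), err
}

func main() {
	n, err := Run(8, 100)
	fmt.Println(n, err) // 3200 <nil>
}
```

The third iteration mentioned above would presumably create one such sender per connected peer, which keeps a slow peer from holding up writes to the others.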