I did some testing with this new “feature” and, indeed, direct send is slower than the first iteration of our specialized direct sender component, which used only streams. We are still investigating whether pubsub can be tweaked for a performance boost. That is why I started the discussion here rather than just opening an issue/PR.
Regarding the second option, we did consider it, but subscribing to topics that are each used for only one peer would have introduced some overhead (announcements, more memory used, etc.).
Also, we have made a second iteration of the direct sender component in which writes to the output stream are serialized (so the stream errors caused by many goroutines writing at once no longer appear). The serialization is done by a single sending goroutine (alongside n receiving goroutines). We are already thinking of a third iteration, in which we might mimic pubsub’s behavior by having a dedicated sending goroutine for each connected peer.
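To make the single-sender idea concrete, here is a minimal sketch of the pattern, not our actual component. All names (serializedSender, recordingWriter, demo) are hypothetical, and the recordingWriter only stands in for a real network stream. Any number of goroutines queue messages on a channel, and one goroutine is the only one that ever writes to the stream.

```go
package main

import (
	"fmt"
	"io"
	"sync"
)

// serializedSender (hypothetical name) funnels writes from many goroutines
// through one sending goroutine, so the stream never sees concurrent writes.
type serializedSender struct {
	out  chan []byte
	done chan struct{}
	err  error // set only by the sending goroutine; read after done is closed
}

func newSerializedSender(w io.Writer, queueSize int) *serializedSender {
	s := &serializedSender{
		out:  make(chan []byte, queueSize),
		done: make(chan struct{}),
	}
	go func() {
		defer close(s.done)
		for msg := range s.out {
			if s.err != nil {
				continue // keep draining after a failure so producers never block
			}
			if _, err := w.Write(msg); err != nil {
				s.err = err
			}
		}
	}()
	return s
}

// Send queues one message. It must not be called after Close.
func (s *serializedSender) Send(msg []byte) { s.out <- msg }

// Close waits for the queue to drain and returns the first write error, if any.
func (s *serializedSender) Close() error {
	close(s.out)
	<-s.done
	return s.err
}

// recordingWriter is a stand-in for a network stream (an assumption for this
// sketch). It is deliberately not safe for concurrent use.
type recordingWriter struct{ chunks []string }

func (r *recordingWriter) Write(p []byte) (int, error) {
	r.chunks = append(r.chunks, string(p))
	return len(p), nil
}

// demo starts several producer goroutines that share one sender and returns
// every chunk the writer received, one entry per Write call.
func demo(producers, perProducer int) []string {
	w := &recordingWriter{}
	s := newSerializedSender(w, 16)
	var wg sync.WaitGroup
	for p := 0; p < producers; p++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < perProducer; i++ {
				s.Send([]byte(fmt.Sprintf("peer-%d msg-%d\n", id, i)))
			}
		}(p)
	}
	wg.Wait()
	if err := s.Close(); err != nil {
		panic(err)
	}
	return w.chunks
}

func main() {
	chunks := demo(4, 25)
	fmt.Println("messages written:", len(chunks))
}
```

Under the same assumptions, the third iteration would roughly amount to creating one such sender per connected peer stream, so that a slow peer only backs up its own queue instead of the shared one.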