Rough stress metrics for Gossipsub

Hi! I’m exploring Gossipsub as I plan to use it in a new blockchain implementation. I’m curious if there are any stress metrics available for Gossipsub, excluding blockchain-specific aspects. I’ve come across a simulator and some metrics reports, but they seem to apply to specific configurations with a fixed number of messages per second.

By stress tests, I’m referring to scenarios similar to the tens of thousands of nodes found in Ethereum and Bitcoin. For example, if we were to simulate around 30,000 nodes, what could we expect the throughput to be, even as a rough estimate? I’m looking for a simplified answer to get a basic understanding. To clarify, my question focuses solely on the P2P + Gossip aspect, without considering consensus mechanisms or other blockchain-related elements.

Hi @magnetseven,

Let me see if I understand your question. If we have a network of 30,000 peers you’re asking for a rough estimate of throughput? Is your question related to how many messages have to be passed to ensure that all peers see a given message? Is your question related to how many messages per second each peer can process/forward? Or is your question related to how many messages per second the entire network can process?

The rest of my answer assumes you’re asking about estimating how many messages per second an entire network can process.

There are lots of different attributes that change the answer to all of the above questions. All of the answers are tied to the “connectedness” of the network. Typically gossip networks aren’t “fully connected”, meaning that every peer is connected to every other peer. There is almost always an overlay network of some sort. Ethereum uses what they call “plumtree” overlays, which use a heuristic to spontaneously organize a p2p network into a minimum spanning tree.

Minimum spanning trees connect all peers with the minimum number of links. However, in very large networks, some of those links become “trunks” that connect very large subgraphs of peers. Those trunks pass many more messages than links between peers closer to the “edge” of the graph. In fact, the volume of traffic on the links in a minimum spanning tree typically follows a Pareto distribution, meaning that a few links carry much more traffic than all the others, with a power-law “long tail” in the amount of traffic carried by the remaining links.

That said, if a “trunk” carries too much traffic, the “cost” of that link goes up and peers will seek other links to the other side of the graph. If a better link is found, messages will preferentially flow through it, and the structure of the minimum spanning tree changes.

My point in saying this is that the “throughput” in terms of messages per second for the entire p2p network is bounded by the throughput of the trunk links in a plumtree-built minimum spanning tree. So you can estimate your message size and, given your simulated network, estimate the number of messages per second that can pass through the “trunk” links to get the maximum for the whole network. For instance, if you look at the graph on the plumtree page, the links I’ve highlighted in red are going to be your “trunks”, and their maximum throughput will limit the overall throughput of the whole overlay network.
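To make that bound concrete, here’s a back-of-envelope sketch in Python. The trunk bandwidth, message size, and overhead factor are all made-up assumptions for illustration, not measurements:

```python
# Back-of-envelope: the whole network's message rate is capped by the
# slowest "trunk" link, since nearly all cross-network traffic crosses it.
# Every figure below is an illustrative assumption.

TRUNK_BANDWIDTH_BYTES_PER_SEC = 12_500_000   # assume a 100 Mbit/s trunk link
MESSAGE_SIZE_BYTES = 500                     # assume small ~500-byte messages
PROTOCOL_OVERHEAD = 1.5                      # assume 50% framing/duplicate overhead

max_msgs_per_sec = TRUNK_BANDWIDTH_BYTES_PER_SEC / (MESSAGE_SIZE_BYTES * PROTOCOL_OVERHEAD)
print(f"rough trunk-limited ceiling: {max_msgs_per_sec:,.0f} msgs/s")
# → rough trunk-limited ceiling: 16,667 msgs/s
```

Swap in whatever trunk bandwidth and overhead your simulation suggests; the shape of the estimate stays the same.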

If you do run a simulation, the way you can find the “trunk” links is to find the peers with the lowest average number of hops between themselves and the other peers in the network. Those peers are in the “middle” of the minimum spanning tree, and the links between them and their immediate peers are likely the trunks. You can sort the links by the number of peers reachable through each link to find the links carrying the highest traffic.
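As a sketch of that procedure, here’s how you might find the central peer and rank links by far-side reachability on a toy spanning tree (the topology is invented purely for illustration):

```python
from collections import deque

# Toy spanning tree as an adjacency list (node -> neighbours).
# The topology is made up for illustration only.
tree = {
    0: [1, 2], 1: [0, 3, 4], 2: [0, 5],
    3: [1], 4: [1, 6, 7], 5: [2], 6: [4], 7: [4],
}

def avg_hops(start):
    """Average BFS hop count from `start` to every other node."""
    dist, q = {start: 0}, deque([start])
    while q:
        u = q.popleft()
        for v in tree[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return sum(dist.values()) / (len(dist) - 1)

# "Central" peers have the lowest average hop count to everyone else.
centre = min(tree, key=avg_hops)

def reachable_through(u, v):
    """Peers reachable from u only via the link u-v (the far-side subtree)."""
    seen, q = {u, v}, deque([v])
    while q:
        x = q.popleft()
        for y in tree[x]:
            if y not in seen:
                seen.add(y)
                q.append(y)
    return len(seen) - 1  # exclude u itself

# Rank links by the size of the smaller side they connect; big-both-sides
# links are the likely trunks.
edges = {tuple(sorted((u, v))) for u in tree for v in tree[u]}
ranked = sorted(
    edges,
    key=lambda e: min(reachable_through(*e), reachable_through(e[1], e[0])),
    reverse=True,
)
print("central peer:", centre, "| likely trunks:", ranked[:2])
```

In this toy tree the central peer comes out as node 1, and the two links touching it that split the graph most evenly rank as the trunks.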

I’m not aware of any simulation tool designed specifically to test gossipsub overlay networks. I know people are using tools like Shadow for large-scale network tests, but I’m unaware of one specifically targeting gossipsub. You could build a fairly simple simulation where you plug in numbers for average message size, link throughput, the number of peers, and the initial set of links. Then add the plumtree heuristic for adjusting links so that it self-organizes into a minimum spanning tree, and measure the resulting throughput. Obviously this would be fully synthetic and not really grounded in reality, because it’s missing the noise of internet “weather” (e.g. packet loss, retransmission, etc.), but it could be useful for rough estimates on networks with very large numbers of peers.
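A minimal synthetic version of the traffic-skew part of such a simulation might look like this: build a random tree, route messages between random peer pairs along tree paths, and measure per-link traffic. All parameters are arbitrary assumptions, there is no plumtree rewiring, and no real-world effects (loss, latency, churn) are modelled:

```python
import random
from collections import Counter, deque

random.seed(7)  # deterministic toy run

# Random tree over N peers: each new peer attaches to a random earlier one.
N = 500
adj = {i: [] for i in range(N)}
for i in range(1, N):
    p = random.randrange(i)
    adj[i].append(p)
    adj[p].append(i)

def path(src, dst):
    """Links on the (unique) tree path from src to dst, found via BFS."""
    prev, q = {src: None}, deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                q.append(v)
    links, u = [], dst
    while prev[u] is not None:
        links.append(tuple(sorted((u, prev[u]))))
        u = prev[u]
    return links

# Route 2000 messages between random peer pairs and count per-link traffic.
traffic = Counter()
for _ in range(2000):
    a, b = random.sample(range(N), 2)
    for link in path(a, b):
        traffic[link] += 1

busiest = traffic.most_common(5)
total = sum(traffic.values())
print(f"busiest 5 of {len(traffic)} links carry "
      f"{100 * sum(c for _, c in busiest) / total:.1f}% of all transmissions")
```

Running this shows the central “trunk” links carrying a disproportionate share of traffic, which is the Pareto-like skew described above, even in a purely synthetic setting.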

I did find this fairly recent interesting paper on the resiliency of gossipsub networks. That’s not exactly what you’re asking about but maybe there’s some related information in there.

If you provide more information on what exactly you’re asking, I’m happy to dive into it with you.

Cheers! :beers:

Hi @dhuseby thank you for your prompt response and sorry for my late reply.

I was discussing with my team how to put together a clear reply. We are considering running the test ourselves in the near future and publishing it for reference, but we still think a rough estimate from you would be beneficial.

What we would like to know, in simple terms, is how many messages per second we could expect in this network, more specifically on the trunk.

Can the throughput of the trunk be considered a hard limit on the full network throughput? If the ratio of trunks to participants is low, does overcrowding happen? How many messages per second should we expect? Answering your questions:

You are right: we are looking to estimate how many messages per second the entire network can process. We are thinking of small messages (e.g. ~500 bytes) for the sake of simplicity. I understand there are different ways to give the appearance of broadcasting, but we would love to have an idea of scale: is it a network that can tolerate 50k tps (messages per second in this case)? 100k? 200k?

A basic calculation, for example: 100k messages of ~500 bytes each would be about 50 MB/s (~48 MiB/s) of raw payload. Obviously, we don’t expect this to happen all at once for all nodes, but within a time window (say ~5-10 secs) it could be acceptable.
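For reference, that payload calculation works out as follows (raw payload only; gossip duplication and protocol framing would add on top):

```python
# Sanity-check the payload bandwidth for the figures above:
# 100k msgs/s of ~500-byte messages, raw payload only.
msgs_per_sec = 100_000
msg_bytes = 500

payload = msgs_per_sec * msg_bytes  # bytes per second
print(f"{payload / 1e6:.0f} MB/s ({payload / 2**20:.1f} MiB/s)")
# → 50 MB/s (47.7 MiB/s)
```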

Thank you very much!

Hi @magnetseven,

Can you add me to your team? I could help with the libp2p library.

What you’re asking for is difficult to estimate, simply because the topology of the network has a lot to do with the total number of messages that get passed across the whole network. You also have to realize that gossiping means the same message gets retransmitted over and over to make sure all of the peers receive it within some tolerance window, and that transmitting mostly small messages is the least efficient way to move data, which will further limit throughput.
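To put a rough number on that retransmission overhead: in a mesh-based overlay like gossipsub, each message crosses roughly every mesh link once, so with N peers at mesh degree D you get about N·D/2 transmissions per message, versus the N−1 that a perfect spanning tree would need. A sketch, assuming gossipsub’s default mesh degree of 6 (treat all numbers as assumptions):

```python
# Rough duplicate-amplification estimate for a mesh-based gossip overlay.
# Assumes each message traverses every mesh link about once.
N = 30_000   # peers (the network size from the question)
D = 6        # assumed mesh degree (gossipsub's default)

mesh_links = N * D // 2   # each link is shared by two peers
minimum = N - 1           # transmissions a perfect spanning tree would need
print(f"~{mesh_links:,} transmissions per message "
      f"(~{mesh_links / minimum:.1f}x the spanning-tree minimum)")
# → ~90,000 transmissions per message (~3.0x the spanning-tree minimum)
```

That amplification factor is one reason the raw per-message bandwidth figure understates what the network actually carries.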

I don’t think I can give you an estimate, simply because there are too many variables. You’re better off simulating the network you want to build under circumstances you think are a good approximation of the real thing.

Cheers! :beers: