Is libp2p secure against memory DoS attacks by malicious peers?

Hi, I'm not sure whether this has been discussed before, but I could not find much info, so I'm asking here.
Is libp2p (specifically the Rust implementation, which I'm considering using) resilient against memory DoS attacks?
I mean, are there any data structures that can grow indefinitely? Are there any means by which they could be bounded (e.g., by guaranteeing that no more than X messages may be in flight simultaneously)?
Are there any ways malicious peers could cause other peers to run out of memory? Even in a theoretical edge case…

Any pointer to existing information would also be useful.

Thank you!

Yes. Nothing should grow unbounded. See DoS Mitigation - libp2p for general thoughts on the matter.

If you do find something to the contrary, please reach out directly to either me or Max Inden (rust-libp2p maintainer).


Thanks for reaching out directly. Important topic.

Adding to Marco’s comment above: since the rust-libp2p CVE we have released a series of patches hardening a rust-libp2p node against memory DoS attacks.

Since then we have not been able, either in practice or in theory, to bring down a node with a reasonable amount of attack resources.

Most communication across components is backpressured, thus ensuring a bounded number of e.g. “messages” in flight. See Backpressure between components · Issue #3078 · libp2p/rust-libp2p · GitHub for outstanding work.
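
To illustrate the pattern with a simplified sketch (not rust-libp2p internals): with a bounded channel, a producer can never have more than the channel’s capacity of messages in flight, because `send().await` suspends once the channel is full.

```rust
// Bounded-channel backpressure in miniature: at most 8 messages are
// ever queued, so memory held by in-flight messages stays bounded.
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel::<u64>(8);

    tokio::spawn(async move {
        for i in 0..1_000u64 {
            // Suspends whenever 8 messages are already queued; a slow
            // consumer automatically paces this producer.
            tx.send(i).await.expect("receiver alive");
        }
    });

    while let Some(i) = rx.recv().await {
        let _ = i; // pretend to do slow per-message work here
    }
}
```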

As of today, rust-libp2p deployments depend on setting reasonable connection limits. Going forward, limits will also be enforceable dynamically, e.g. based on available memory.
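
For example, static limits look roughly like this (a sketch; where the limits are installed has moved between rust-libp2p releases, from a `SwarmBuilder` option to the dedicated `libp2p-connection-limits` behaviour, so check the docs of the version you use):

```rust
// Sketch of static connection limits via libp2p-swarm's
// `ConnectionLimits`. Where these are plugged in depends on the
// rust-libp2p release.
use libp2p::swarm::ConnectionLimits;

fn main() {
    let limits = ConnectionLimits::default()
        // Cap half-open inbound connections (cheap for an attacker).
        .with_max_pending_incoming(Some(16))
        // Cap established inbound connections overall ...
        .with_max_established_incoming(Some(64))
        // ... and per peer, so one peer cannot exhaust the budget.
        .with_max_established_per_peer(Some(2));
    let _ = limits;
}
```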

@yotam let us know if this is of some help. Happy to answer any questions here or in the next community call. As Marco said above, in case you do find a vulnerability, please disclose it via a private medium.

Also worth mentioning are our rust-libp2p coding guidelines, containing multiple entries on the topic of backpressure (memory DoS) and local work prioritization (CPU DoS).

Thanks a lot for the answers.

I have another question w.r.t. backpressure then: if libp2p blocks and applies backpressure, then a single (malicious) peer could potentially slow a (benign) sender down. Are there any mechanisms to protect against such behavior? Where can I read about this blocking behavior?

Thank you very much!

We use one channel per connection. Thus a single connection (e.g. a malicious peer) cannot disproportionately slow down all other connections.

Note that we spawn one task per connection and “block” (as in preempt) per connection task, not for the entire rust-libp2p process. (rust-libp2p is written using Rust’s async/await and Future::poll, thus the word “blocking” is misleading here.)
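
To make that concrete, here is a hypothetical sketch (again not rust-libp2p internals) of the one-task-and-one-bounded-channel-per-connection shape; a stalled peer only ever fills its own channel:

```rust
// Hypothetical "one task + one bounded channel per connection" shape.
use std::time::Duration;
use tokio::sync::mpsc;

async fn connection_task(id: usize, mut rx: mpsc::Receiver<Vec<u8>>) {
    while let Some(frame) = rx.recv().await {
        // Write `frame` to this connection's socket (elided). If this
        // peer stalls, only this task and this channel back up.
        println!("conn {id}: {} bytes", frame.len());
    }
}

#[tokio::main]
async fn main() {
    let mut connections = Vec::new();
    for id in 0..3 {
        let (tx, rx) = mpsc::channel::<Vec<u8>>(32);
        tokio::spawn(connection_task(id, rx));
        connections.push(tx);
    }

    // The main state machine fans out without awaiting any single
    // peer: a full channel (slow or malicious peer) is skipped rather
    // than allowed to stall the loop.
    for tx in &connections {
        let _ = tx.try_send(b"ping".to_vec());
    }

    // Give the connection tasks a moment to drain their channels.
    tokio::time::sleep(Duration::from_millis(100)).await;
}
```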

@mxinden thank you for the replies,

I have a few follow-up questions.

> Note that we spawn one task per connection and “block” (as in preempt) per connection task, not for the entire rust-libp2p process.

I would assume that the publish call in GossipSub can be I/O bound. I guess what @yotam was asking is: how can an adversarial peer impact the publish call?

Another question I have is about the protocol bandwidth overhead. If we publish, say, 1 MB/s of data, how much bandwidth capacity will be required per peer so that the 1 MB reaches all peers? Assume we have a central place where we know all peers and there are no adversarial peers. Let’s assume the best possible configuration as well.

I have looked through the [Gossipsub-v1.1 Evaluation Report](https://research.protocol.ai/publications/gossipsub-v1.1-evaluation-report/vyzovitis2020.pdf)

And please correct me if I am wrong in my reasoning.

Looking at page 10, Table 2, row 4: if I understand correctly, the network overall carries 2 tx/sec and each tx is 2 KB in size. This more or less means that at any given second we have a single publisher which publishes 2 messages, each 2 KB in size. And each peer is connected to 24 other peers.
So 248.8 GB/month is equal to ~96 KB/s, for an effective publish rate of 4 KB/s.
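
Spelling out that conversion (assuming a 30-day month):

$$\frac{248.8 \times 10^9\ \text{B}}{30 \times 24 \times 3600\ \text{s}} \approx 96{,}000\ \text{B/s} \approx 96\ \text{KB/s}$$

That is about 24× the effective 4 KB/s publish rate, which lines up with each peer being connected to 24 others.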

Thank you,
Rosti

You can ignore my second question about the overhead. Reading more carefully through the docs, it is written that the redundancy is proportional to the degree of the network.

The publish call on the main state machine (NetworkBehaviour & Swarm) dispatches to the individual connection tasks. An adversary can slow down a connection task but should not be able to slow down the main state machine.
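
As a usage-level sketch (assuming an already-constructed `swarm` whose behaviour is gossipsub; type and method names vary a bit across libp2p-gossipsub releases):

```rust
// Sketch: publishing via the gossipsub behaviour of an existing
// `swarm` (construction elided). `publish` runs on the main state
// machine: it queues the message towards the per-connection tasks and
// returns without waiting on any peer's socket, so a slow peer cannot
// stall this call.
use libp2p::gossipsub::IdentTopic;

let topic = IdentTopic::new("example-topic");
match swarm.behaviour_mut().publish(topic, b"payload".to_vec()) {
    Ok(message_id) => println!("published {message_id:?}"),
    Err(e) => eprintln!("publish failed: {e:?}"),
}
```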

For others, I am referring to the gossipsub paper here: https://arxiv.org/pdf/2007.02754.pdf

Let me know in case you have more questions @yotam and @rumenov. Great job digging into the details here. Also happy to pair-program on a first proof-of-concept in case that would help you hit the ground running.

Sharing some additional data points/information for completeness:

Ref: a blog post that has links to the first two bulleted items

Thanks a lot for the new pointers @pshahi !

@mxinden, I have another question w.r.t. the protocol which I could not find an answer to.

What happens when a subscriber falls off the network? More specifically, imagine we have 1 subscriber in a datacenter, the datacenter has an outage, and there is no internet connectivity for a prolonged period of time. The subscriber did not leave the topic. Are there any message guarantees for this subscriber? I personally assume that there is some TTL past which messages just won’t be delivered to this subscriber once it comes back up. If this is the case, is there a way to detect when a message is TTLed without being delivered to a subscriber, or is there really no good way to know this?

@rumenov jumping in late here. There isn’t a way to deliver a message to a subscriber that has disappeared for a prolonged period of time. What happens here is that peers will try to gossip messages for three consecutive heartbeats (i.e., 3 seconds). If the peer has come back online by that point and connected to some peers in the network, then it has some chance of getting the message through gossip; not sure what value I would attach to “some chance” here though 🙂
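
For reference, those windows are configurable. A sketch using libp2p-gossipsub’s config builder (the builder was called `GossipsubConfigBuilder` in older releases and `ConfigBuilder` in newer ones; method names may differ slightly):

```rust
// Sketch of tuning the gossip window discussed above, using the
// libp2p-gossipsub config builder. Defaults shown: 1 s heartbeat,
// messages advertised via gossip for 3 heartbeats and kept in the
// cache for 5.
use std::time::Duration;
use libp2p::gossipsub::ConfigBuilder;

fn main() {
    let config = ConfigBuilder::default()
        .heartbeat_interval(Duration::from_secs(1))
        .history_length(5) // heartbeats a message stays cached
        .history_gossip(3) // heartbeats it is advertised to peers
        .build()
        .expect("valid gossipsub config");
    let _ = config;
}
```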

The assumption here is that it is a P2P network so peers might come and go and there is no good way to keep track at the protocol level. In a blockchain scenario that would mean that the node has fallen out of sync, and would therefore need to re-sync and get caught up. This is outside the protocol logic and would have to be implemented on top.

cc’ing @vyzo as I might have forgotten some details by now 🙂