Problem in AWS EC2 deployment

RicMash · September 6, 2023, 12:14pm

I want to deploy several Wasp nodes (Wasp is a program built for the IOTA blockchain) on different AWS EC2 machines. Everything works when the network consists of a small number of machines. However, when the number of machines is around one hundred (or even fewer), some of the nodes are unable to connect to each other. Every node has the same firewall policies, so it can’t be a reachability problem. Wasp uses go-libp2p for its connections, and I’m getting the following errors:

WARN	Peering.peer:15.228.161.235:4000	Failed to send outgoing message, unable to allocate stream, reason=failed to dial 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU:
  * [/ip4/15.228.161.235/tcp/4000] failed to negotiate security protocol: EOF
WARN	Peering.peer:15.228.161.235:4000	Failed to send outgoing message, unable to allocate stream, reason=failed to dial 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU:
  * [/ip4/15.228.161.235/tcp/4000] dial backoff
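
To see which failure reason dominates once logs from many nodes are collected, warnings like the ones above can be tallied per address and reason. This is only a minimal sketch using the Go standard library, assuming the log format shown above; `parseDialFailures` and `countByReason` are hypothetical helper names, not part of Wasp or go-libp2p.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// dialFailure is one "  * [addr] reason" line from a dial error message.
type dialFailure struct {
	Addr   string
	Reason string
}

// parseDialFailures extracts the per-address failure lines from log text.
// Lines that do not look like "* [addr] reason" are ignored.
func parseDialFailures(log string) []dialFailure {
	var out []dialFailure
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "* [") {
			continue
		}
		rest := strings.TrimPrefix(line, "* [")
		addr, reason, ok := strings.Cut(rest, "] ")
		if !ok {
			continue
		}
		out = append(out, dialFailure{Addr: addr, Reason: reason})
	}
	return out
}

// countByReason tallies how often each (address, reason) pair occurs.
func countByReason(fs []dialFailure) map[dialFailure]int {
	counts := make(map[dialFailure]int)
	for _, f := range fs {
		counts[f]++
	}
	return counts
}

func main() {
	// Sample input copied from the warnings above (header lines shortened).
	sample := `WARN	Peering.peer:15.228.161.235:4000	Failed to send outgoing message
  * [/ip4/15.228.161.235/tcp/4000] failed to negotiate security protocol: EOF
WARN	Peering.peer:15.228.161.235:4000	Failed to send outgoing message
  * [/ip4/15.228.161.235/tcp/4000] dial backoff
`
	for f, n := range countByReason(parseDialFailures(sample)) {
		fmt.Printf("%s\t%s\t%d\n", f.Addr, f.Reason, n)
	}
}
```

A peer whose first failure is `failed to negotiate security protocol: EOF` and whose later failures are all `dial backoff` suggests that the backoff entries are a consequence of the first error rather than a separate problem.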

I have been trying to solve this problem for a long time now, so I also checked what the swarm logs at debug level, and found the following:

2023-09-06T10:47:33.429Z	DEBUG	basichost	basic/basic_host.go:739	host 12D3KooWDaPpyUadtWoQpr2Kw2v8axCsPjA7LaJoq3nVtrVT2i59 dialing 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU
2023-09-06T10:47:33.429Z	DEBUG	swarm2	swarm/swarm_dial.go:243	dialing peer	{"from": "12D3KooWDaPpyUadtWoQpr2Kw2v8axCsPjA7LaJoq3nVtrVT2i59", "to": "12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU"}
2023-09-06T10:47:33.429Z	DEBUG	swarm2	swarm/limiter.go:193	[limiter] adding a dial job through limiter: /ip4/15.228.161.235/tcp/4000
2023-09-06T10:47:33.429Z	DEBUG	swarm2	swarm/limiter.go:161	[limiter] taking FD token: peer: 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU; addr: /ip4/15.228.161.235/tcp/4000; prev consuming: 27
2023-09-06T10:47:33.429Z	DEBUG	swarm2	swarm/limiter.go:167	[limiter] executing dial; peer: 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU; addr: /ip4/15.228.161.235/tcp/4000; FD consuming: 28; waiting: 0
2023-09-06T10:47:33.429Z	DEBUG	swarm2	swarm/swarm_dial.go:490	12D3KooWDaPpyUadtWoQpr2Kw2v8axCsPjA7LaJoq3nVtrVT2i59 swarm dialing 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU /ip4/15.228.161.235/tcp/4000


2023-09-06T10:47:33.465Z	DEBUG	swarm2	swarm/limiter.go:73	[limiter] freeing FD token; waiting: 0; consuming: 92
2023-09-06T10:47:33.465Z	DEBUG	swarm2	swarm/limiter.go:100	[limiter] freeing peer token; peer 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU; addr: /ip4/15.228.161.235/tcp/4000; active for peer: 1; waiting on peer limit: 0
2023-09-06T10:47:33.465Z	DEBUG	swarm2	swarm/swarm_dial.go:281	network for 12D3KooWDaPpyUadtWoQpr2Kw2v8axCsPjA7LaJoq3nVtrVT2i59 finished dialing 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU
2023-09-06T10:47:33.465Z	DEBUG	swarm2	swarm/limiter.go:201	[limiter] clearing all peer dials: 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU


2023-09-06T10:47:33.582Z	DEBUG	basichost	basic/basic_host.go:739	host 12D3KooWDaPpyUadtWoQpr2Kw2v8axCsPjA7LaJoq3nVtrVT2i59 dialing 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU
2023-09-06T10:47:33.582Z	DEBUG	swarm2	swarm/swarm_dial.go:243	dialing peer	{"from": "12D3KooWDaPpyUadtWoQpr2Kw2v8axCsPjA7LaJoq3nVtrVT2i59", "to": "12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU"}
2023-09-06T10:47:33.582Z	DEBUG	swarm2	swarm/swarm_dial.go:281	network for 12D3KooWDaPpyUadtWoQpr2Kw2v8axCsPjA7LaJoq3nVtrVT2i59 finished dialing 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU
2023-09-06T10:47:33.582Z	DEBUG	swarm2	swarm/limiter.go:201	[limiter] clearing all peer dials: 12D3KooWJj52NW8UbZ2pXK88CQ7FHzvfVE3TVvi8KCyHHakP35ZU
...

Since this problem only happens with a high number of machines deployed, I was wondering whether it could be caused by timeouts during dials, or whether there is a limit on the number of connections. What steps could I take to try to solve this issue, which is important for me?

pshahi · September 12, 2023, 7:15pm

Can you cross-post this to the libp2p/go-libp2p Q&A section on GitHub Discussions, please?
The go-libp2p devs are trying to move off of this discussion forum in favor of GitHub.

RicMash · September 12, 2023, 8:08pm

Sure, thanks for the feedback!
