The JS Peerstore (PeerBook) is pretty basic at the moment. Go has TTLs for various address types in its Peerstore. For peers themselves, are there any pruning mechanisms in Go for removing stale peers?
Yeah, both the memory peerstore and the datastore-backed peerstore are garbage collected. go-libp2p uses TTLs (*) to set expiries for individual address book entries.
Here's a description of the algorithm we use for the datastore-backed peerstore. We recently transitioned away from strict database-enforced TTLs, as they amplified entry count and made range queries inefficient.
- Rationale: It is wasteful to traverse the entire KV store on every GC cycle.
- Proposal: keep a lookahead window of peers that need to be visited shortly.
- Purges operate over the lookahead window only, thus avoiding full DB range scans.
- Default lookahead period: 12 hours.
- This is not a sliding window, but rather a jumping window. It jumps every 12 hours instead of sliding smoothly.
- The lookahead window lives in a dedicated namespace in the KV store. Entries have nil value and all data is contained in the key:
/peers/gc/addrs/<ts>/<peer id in b32>
<ts> is the Unix timestamp when the record needs to be visited next. This pattern makes time-slice range scans very efficient.
- Algorithmically, the GC runs over two cycles:
- Lookahead cycles populate the lookahead window by performing a full DB scan and adding the entries that need to be visited in this period to the window.
- Purge cycles perform a range scan over the /peers/gc/addrs namespace until they hit a key with a timestamp > now; then they stop. Each entry is refreshed, re-persisted, and removed from the window. It is re-added if the new "next visit" timestamp happens to fall within the current period.

I think this algorithm makes efficient use of resources and minimises read amplification in the context of GC; a sketch of the two cycles follows below.
(extracted from https://github.com/libp2p/go-libp2p-peerstore/pull/47#issuecomment-439509360).
(*) TTLs are the wrong metric. We should be modelling address confidence, observations and decay.
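To make the two cycles concrete, here is a minimal JavaScript sketch of the idea described above (not the actual go-libp2p code). The `db` shape, the `nextVisit`/`expiry` field names and the helper names are illustrative assumptions; only the key layout and the lookahead/purge split come from the description.

```js
// Illustrative sketch of the lookahead/purge GC described above, not go-libp2p code.
// Assumed shape: db = { addressBook: Map<peerIdB32, record>, gc: Map<key, null> },
// where record = { addrs: [{ multiaddr, expiry }], nextVisit }.

const LOOKAHEAD_PERIOD_MS = 12 * 60 * 60 * 1000 // 12-hour jumping window

// All data lives in the key; the value is nil.
const gcKey = (nextVisitTs, peerIdB32) => `/peers/gc/addrs/${nextVisitTs}/${peerIdB32}`

// Lookahead cycle: full DB scan, adding every entry due within the upcoming window.
function lookaheadCycle (db, now) {
  const windowEnd = now + LOOKAHEAD_PERIOD_MS
  for (const [peerIdB32, record] of db.addressBook) {
    if (record.nextVisit <= windowEnd) {
      db.gc.set(gcKey(record.nextVisit, peerIdB32), null)
    }
  }
}

// Purge cycle: range-scan the GC namespace in timestamp order and stop at the
// first key that is not yet due. Visited entries are refreshed, re-persisted and
// removed from the window; they are re-added only if still due within this window.
function purgeCycle (db, now) {
  const keys = [...db.gc.keys()]
    .sort((a, b) => Number(a.split('/')[4]) - Number(b.split('/')[4]))
  for (const key of keys) {
    const [, , , , ts, peerIdB32] = key.split('/')
    if (Number(ts) > now) break // everything after this point is not due yet
    db.gc.delete(key)
    const record = db.addressBook.get(peerIdB32)
    if (!record) continue
    record.addrs = record.addrs.filter(a => a.expiry > now) // drop expired addresses
    if (record.addrs.length === 0) {
      db.addressBook.delete(peerIdB32) // nothing left for this peer
      continue
    }
    record.nextVisit = Math.min(...record.addrs.map(a => a.expiry))
    if (record.nextVisit <= now + LOOKAHEAD_PERIOD_MS) {
      db.gc.set(gcKey(record.nextVisit, peerIdB32), null) // still due within this window
    }
  }
}
```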
Hello, is there any mechanism for pruning stale peers in the js implementation of libp2p?
I can see here (js-libp2p/README.md at 707bd7843c5b05a70916055015e3f483cc385759 · libp2p/js-libp2p · GitHub) a note about adding it in the future, so I am wondering whether it has been added yet (and maybe the readme wasn't updated)? Looking at the code I didn't see it, but perhaps I missed something.
I noticed that we have a lot of stale peers in the address book (e.g. 5 active peers but 65 in the address book), probably because we have one constantly running node which collects all of them.
@vasco-santos, do you have any thoughts on what the best solution would be here?
I work with EmiM on this, and we're seeing a huge performance cost because libp2p will endlessly retry offline peers. We're using Tor, so each of those attempts is a little costly and it adds up.
All of our peer addresses are persistent, since they're Tor onion URLs, and we already have a store of them all, so we don't really need this peer discovery piece at all. (Or at least, where it would be helpful would be for gossiping about peers that are currently online, so we only connect to those ones!)
Is there a way we can disable redialing peers entirely? Like, drop them from the peerstore as soon as we fail to connect?
Hey folks,
Unfortunately the js PeerStore still does not have TTLs or any peer scoring in place that would enable peerstore pruning. We did not implement TTLs like Go, as we would like to use address confidence as a better indicator of the real value of these addresses. Some issues tracking this that you can watch:
- PeerStore cleannup strategies · Issue #639 · libp2p/js-libp2p · GitHub
- PeerStore improvements · Issue #582 · libp2p/js-libp2p · GitHub
In the meantime, and taking your use case into consideration, I think the best solution is to disable autoDial (js-libp2p/PEER_DISCOVERY.md at 707bd7843c5b05a70916055015e3f483cc385759 · libp2p/js-libp2p · GitHub). With this, you can iterate the PeerStore at the application level and attempt to dial each peer according to your needs. As you mentioned, if a dial fails I would recommend removing the peer from the AddressBook (see the sketch below).
Once we get address confidence in place, as well as address garbage collection, you can move to that.
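For what it's worth, here is a minimal sketch of that suggestion, assuming the 0.29-era js-libp2p API that the links in this thread point at; the module choices are placeholders for your own Tor/WebSockets transport stack.

```js
const Libp2p = require('libp2p')
const Websockets = require('libp2p-websockets') // placeholder: swap in your Tor-aware transport
const Mplex = require('libp2p-mplex')
const { NOISE } = require('libp2p-noise')

async function start () {
  const node = await Libp2p.create({
    modules: {
      transport: [Websockets],
      streamMuxer: [Mplex],
      connEncryption: [NOISE]
    },
    config: {
      peerDiscovery: {
        autoDial: false // libp2p no longer dials discovered/known peers on its own
      }
    }
  })
  await node.start()

  // Application-level dial loop: try every peer we know about and
  // forget the ones we cannot reach.
  for (const peer of node.peerStore.peers.values()) {
    try {
      await node.dial(peer.id)
    } catch (err) {
      node.peerStore.delete(peer.id) // failed dial: drop the peer and its addresses
    }
  }
}

start()
```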
Thanks, this is helpful!
@vasco-santos thank you for the hints.
I turned off autoDial. I wasn't sure what the best place to iterate over the address book is (because at the beginning the address book is empty), so instead I attached to the 'peer:discovered' event and for each discovered peer I try to dial it; when dialing fails the peer is removed from the books (libp2p.peerStore.delete).
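For reference, the handler described above could look roughly like this, assuming the same 0.29-era API (the event js-libp2p emits is 'peer:discovery', and `node` is the started libp2p instance):

```js
// Sketch of the described approach: dial each discovered peer and forget it on failure.
node.on('peer:discovery', async (peerId) => {
  try {
    await node.dial(peerId)
  } catch (err) {
    node.peerStore.delete(peerId) // unreachable: remove it from the books
  }
})

// Also forget peers when they disconnect, as mentioned above.
node.connectionManager.on('peer:disconnect', (connection) => {
  node.peerStore.delete(connection.remotePeer)
})
```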
This is the test I performed:
I created a small network consisting of a few peers. I let them connect to each other and then I disconnect one of them. The thing is that the peer is still discovered after being deleted, and the loop of 'discovering and deleting' continues. I guess this is because at the same time at least one peer still has information about the deleted peer (because the cleaning process happens at a different time for each peer)?
I even added deleting the peer on peer:disconnect just to be sure, but it didn't help.
Example - disconnected node is discovered over and over again:
peer_1 | Discovered QmRbkBkhTt2DbLMF8kAaf1oxpfKQuEfLKFzVCDzQhabwkw: 40
peer_1 | addressbook before deleting [Map Iterator] { 'QmRbkBkhTt2DbLMF8kAaf1oxpfKQuEfLKFzVCDzQhabwkw' }
peer_1 | Aborting /dns4/<address>/tcp/7788/ws/p2p/QmRbkBkhTt2DbLMF8kAaf1oxpfKQuEfLKFzVCDzQhabwkw: 40
The 'Aborting (...)' log comes from the ._connect method of our WebSockets transport module. In production we have lots of noise here because it tries to dial even old peers, and does it many times per peer. 40 is how many times the code reached this line for a single peer.
Do you have any idea what I'm doing wrong? Maybe I should be using a persistent datastore?
NOTES:
- The peer I disconnected manually was a bootstrap node, so every peer had information about it from the beginning. Perhaps that's why other peers still knew about it and kept discovering it again and again?
- If I remove (disconnect) a regular peer that is known by only one or two peers, it's not discovered anymore.
UPDATE: It's working now. A regular peer deleted from the address book is no longer seen in the network, but a deleted bootstrap node keeps being discovered.
Thanks for reporting this @EmiM
So, the bootstrap module continuously emits the discovery event (js-libp2p-bootstrap/index.js at master · libp2p/js-libp2p-bootstrap · GitHub), which ends up causing what you're experiencing. It's probably worth forking bootstrap to emit only once, or just doing the dial manually instead of using it.
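A rough sketch of the "dial manually" option, skipping libp2p-bootstrap entirely; `bootstrapList` and the multiaddr are placeholders for your own persistent onion addresses:

```js
// Dial the bootstrap multiaddrs once at startup instead of using libp2p-bootstrap,
// so no discovery event is re-emitted on an interval.
const bootstrapList = [
  '/dns4/example.onion/tcp/7788/ws/p2p/QmBootstrapPeerId' // placeholder address
]

async function dialBootstrappers (node) {
  for (const addr of bootstrapList) {
    try {
      await node.dial(addr)
    } catch (err) {
      // A failed bootstrapper is simply skipped here, so it never
      // re-enters the discover-and-delete loop described above.
    }
  }
}
```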
Thank you for the answer.
- Does the transport module (in our case WebSockets + Tor) depend on the address book, or only on discovered peers? Maybe deleting inactive peers from the peer book is not a good approach?
  In _maybeConnect (src/index.js) I added a check on whether connecting to a peer fails or succeeds by parsing the error message (HostUnreachable). Do you think this would be a good place to remove a peer from the book?
- What consequences would there be from emitting peer:discovery only one time? I suppose the interval was added there for a reason.
Update: I parse the error caught in _maybeConnect, but somehow I stopped getting 'HostUnreachable' (even though it's definitely thrown by the WebSockets transport) and the only errors I get are the aggregated 'The operation was aborted' and/or 'already aborted', so I am not sure whether I can assume that an aborted dial == a failed dial.