Am I reading correctly that the answer is Yes due to https://github.com/libp2p/go-libp2p-kad-dht/blob/master/routing.go#L577-L579 ?
No, right now we don’t. The lines you reference add the addresses of a discovered provider to the peerstore, but they do not record the CID => PeerID association in the ProviderManager. This is a non-trivial issue. As soon as you start replicating provider records in an unstructured manner throughout the network (in a pull-based fashion, as a consequence of local lookups), a number of issues can emerge, related to staleness, consistency, and security/squatting. Coral’s sloppy hashing uses the XOR distance function to determine ownership/maintenance (nodes closest to the target are responsible for storing provider records), enhanced with backoff logic that naturally expands the radius within which records can be stored, but always with the target as the centre. If, however, we decided that any node in the network can arbitrarily store and serve any provider record, the number of challenges that arise would be significant IMO.
Thank you for the clarification. I am not sure I understand the full argument that having the records replicated in more locations can lead to issues if (and only if) the act of replicating a record is voluntary (i.e. someone would have to call .ReplicateRecord). As for the individual points:
- Staleness - The replicas will have the same timestamp as the original, and the “replicator” can fetch fresh copies.
- Consistency - Agreed. There might be records that have X providers and records that have X+Y, where Y are the new providers added since the last replica was made. Note: given that we don’t have active load balancing on node join, this can (and probably does) already happen when new nodes join and take over responsibility for parts of the address space, leaving the previous owners with an old copy of the record.
- Security/Squatting - Mind expanding a bit on what you are thinking here?
> If, however, we decided that any node in the network can arbitrarily store and serve any provider record, the number of challenges that arise would be significant IMO.
Would you say this still holds true if a node decides to voluntarily replicate a record?