ScudDB: a distributed database in Rust using libp2p for its metadata sharing

For my master’s thesis I’ve created a distributed database which aims to provide strongly consistent data storage for edge compute applications. It achieves this by giving specific nodes authority over a certain prefix. These nodes can then service reads and writes, and create new sub-prefixes, without having to communicate with any other nodes. For example, one node might have authority over “/”, another over “/students/” and another over “/professors/”. These nodes oversee the creation of new sub-prefixes on other nodes, such as “/students/Bob/” and “/professors/John/” respectively.
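
To make the lookup concrete, here’s a minimal sketch (not ScudDB’s actual code) of resolving a key to its authoritative node via longest-prefix match; the `NodeId` alias and the table contents are purely illustrative:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for ScudDB's real node identifier type.
type NodeId = String;

/// Resolve a key to its authoritative node: among all prefixes that cover
/// the key, the longest (most specific) one wins.
fn authority_for<'a>(table: &'a HashMap<String, NodeId>, key: &str) -> Option<&'a NodeId> {
    table
        .iter()
        // Keep only the prefixes that cover this key...
        .filter(|(prefix, _)| key.starts_with(prefix.as_str()))
        // ...and pick the most specific (longest) one.
        .max_by_key(|(prefix, _)| prefix.len())
        .map(|(_, node)| node)
}

fn main() {
    let mut table = HashMap::new();
    table.insert("/".to_string(), "node-a".to_string());
    table.insert("/students/".to_string(), "node-b".to_string());
    table.insert("/professors/".to_string(), "node-c".to_string());

    // A key under "/students/" is governed by that prefix's authority.
    assert_eq!(authority_for(&table, "/students/Bob/grades"), Some(&"node-b".to_string()));
    // A key under no more specific prefix falls back to the root authority.
    assert_eq!(authority_for(&table, "/staff/Ann"), Some(&"node-a".to_string()));
}
```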

This model also has attractive failure tolerance properties since there’s no centralized entity responsible for prefix distribution.

Clients can query the entire database through a single database node, because requests are forwarded to the appropriate node. However, forwarding a request by traveling up the entire prefix hierarchy and back down to the correct node is slow.
To solve this problem, I used libp2p’s gossipsub to spread metadata about which prefixes are stored where. Compared to using a classic pubsub system for this purpose, libp2p lets me simplify the architecture of the entire system, without introducing another point of failure.
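
For a rough idea of what this looks like, here’s a minimal sketch of announcing prefix metadata over gossipsub. It assumes a recent rust-libp2p where the gossipsub behaviour type is `gossipsub::Behaviour` (older releases call it `Gossipsub`); the topic name and the `prefix=node` wire format are made up for illustration, so see the linked metacom module below for the real thing.

```rust
use libp2p::gossipsub::{self, IdentTopic, MessageAuthenticity};
use libp2p::identity::Keypair;

// Build a gossipsub behaviour subscribed to a (hypothetical) metadata topic.
fn metadata_gossip() -> Result<gossipsub::Behaviour, Box<dyn std::error::Error>> {
    let keypair = Keypair::generate_ed25519();
    let config = gossipsub::ConfigBuilder::default().build()?;
    // Sign messages so peers can attribute each metadata record to its publisher.
    let mut behaviour =
        gossipsub::Behaviour::new(MessageAuthenticity::Signed(keypair), config)?;
    behaviour.subscribe(&IdentTopic::new("prefix-metadata"))?;
    Ok(behaviour)
}

// Broadcast that `node` now has authority over `prefix`.
fn announce_prefix(
    gossip: &mut gossipsub::Behaviour,
    prefix: &str,
    node: &str,
) -> Result<(), gossipsub::PublishError> {
    // Illustrative wire format; ScudDB's real encoding differs.
    let record = format!("{prefix}={node}");
    gossip.publish(IdentTopic::new("prefix-metadata"), record.into_bytes())?;
    Ok(())
}
```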

The (undocumented) implementation can be found here.

The libp2p part responsible for metadata sharing, using gossipsub with Kademlia and Identify for peer discovery, can be found here: server/src/metacom/mod.rs · master · Frederik Baetens / scudDB · GitLab

I think this is an interesting use case because, unlike most(?) users of libp2p, I don’t need Byzantine fault tolerance, yet I still benefit from libp2p’s failure tolerance and relative simplicity.

If anyone has some questions, suggestions, or knows of others building similar systems, I’d love to hear about it! So far I know about Cloudflare workers KV & Durable Objects and some academic databases.

Hi Frederik,

Thanks for sharing your project! Cool to see this use-case of libp2p.

I would be especially interested in the performance characteristics of this architecture. Did you do some benchmarking or simulations?

If anyone has some questions, suggestions, or knows of others building similar systems, I’d love to hear about it! So far I know about Cloudflare workers KV & Durable Objects and some academic databases.

@yiannisbot and Dietrich have a great overview of the space. What do the two of you think?

Another DB project that comes to my mind, building on top of IPFS and libp2p, is OrbitDB · GitHub.

Side note, cool name!

ScudDB is named after a low-to-the-ground cloud.

So the forwarding itself, once the metadata is present, is quite fast. I maintain a lazy connection pool (connections are only opened once they’re needed, then kept around), so forwarding a request adds just a single round trip per hop, and usually there’s just one hop because the metadata is almost always present. Since forwarding usually involves just the node the client is connected to and the node with data ownership, there’s no central node or cluster to bottleneck request forwarding.
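
As a rough sketch of what I mean by a lazy pool (simplified and synchronous; `Connection` and `dial` are stand-ins for the real networking types):

```rust
use std::collections::HashMap;

// Placeholder: the real `dial` would open a network connection to the peer.
struct Connection;

fn dial(_node: &str) -> Connection {
    Connection
}

struct Pool {
    connections: HashMap<String, Connection>,
}

impl Pool {
    // The first forward to a node pays the dial cost; later forwards reuse
    // the cached connection, so each hop adds a single round trip.
    fn get(&mut self, node: &str) -> &mut Connection {
        self.connections
            .entry(node.to_string())
            .or_insert_with(|| dial(node))
    }
}

fn main() {
    let mut pool = Pool { connections: HashMap::new() };
    let _conn = pool.get("node-b"); // dialed lazily on first use
    let _conn = pool.get("node-b"); // reused afterwards
}
```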

I haven’t benchmarked the prefix creation and associated metadata dissemination though. Do you know what kind of throughput one can expect from libp2p gossipsub & what variables affect it?

I’ll try to run a quick prefix creation benchmark on a few VMs, but if there are potential gossipsub bottlenecks caused by large network sizes, I don’t really know how to test for those. How do you run libp2p tests when testing 50+ nodes? Or do you just not run those kinds of tests? Do you use IaC tools like Pulumi or something like Ansible?

You can use Testground. The SDK in Rust is still WIP.

You can run simulations; on my computer I can run 40+ instances.

Hi,
Very interesting project.

I have some questions (I am new to p2p, so sorry if my questions are dumb):

Is it possible to read your thesis somewhere?

How do nodes get authority over a prefix? How do you prevent two nodes from having the same prefix, for example /students?

If I am a node and create the sub-prefix /students/Bob, will the student Bob’s data live on only one node? What if the node with Bob’s data goes offline; is it still possible to get the data?

If I want to get all /students from the database, is there a way to know whether I got all students or just part of them? For example, let’s say there are 10 students (assuming the database is in a strongly consistent state for the 10 students), but a node went down and I only got 7; is there a way to know that my result is incomplete? If yes, how did you solve it?

Can you give some use cases for the database? How would it be used in a real university, for example?

Thanks.

Is it possible to read your thesis somewhere?

If you speak Dutch :). Unfortunately I am somewhat obligated to write my thesis in Dutch because of language policies at my university.

How do nodes get authority over a prefix? How do you prevent two nodes from having the same prefix, for example /students?

Strong consistency and atomicity in prefix creation are enforced by only allowing the node with authority over a parent prefix to give out authority over new sub-prefixes.

For example: let’s say node A has authority over the prefix “/”, and nodes B and C both simultaneously want to create the new prefix “/students”. Both B and C send a request to node A asking for permission to create “/students”. Node A handles these requests as atomic transactions, meaning one request is handled strictly before the other. Let’s say B’s request is handled first: B gets permission to create the “/students” prefix, and C’s request is denied.

In the end, only B will have authority over the “/students” prefix; C will not have been able to acquire it.
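
In Rust-flavoured pseudocode, node A’s job boils down to an atomic check-and-insert. This is only a sketch of the rule, with a local mutex standing in for the real transaction machinery:

```rust
use std::collections::HashSet;
use std::sync::Mutex;

// Sketch only: a mutex-guarded set stands in for ScudDB's real transaction
// handling at the parent-authority node.
struct ParentAuthority {
    children: Mutex<HashSet<String>>,
}

impl ParentAuthority {
    // Returns true iff the caller was granted authority over `prefix`.
    fn request_create(&self, prefix: &str) -> bool {
        // The lock makes the check-and-insert atomic: of two concurrent
        // requests for "/students", exactly one sees the name as absent.
        self.children.lock().unwrap().insert(prefix.to_string())
    }
}

fn main() {
    let root = ParentAuthority { children: Mutex::new(HashSet::new()) };
    assert!(root.request_create("/students"));  // B's request: granted
    assert!(!root.request_create("/students")); // C's request: denied
}
```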

If I want to get all /students from the database, is there a way to know whether I got all students or just part of them?

Yes, this is possible. As explained in the previous answer, a parent prefix needs to be informed about the creation of all its sub-prefixes. This means the node with authority over the “/students” prefix knows all sub-prefixes of “/students”, so it can tell you about all students.

If a node with authority over one of the students is down, you will get an error because that node would be unreachable, so you would know that you are missing some students.
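
Here’s a sketch of that all-or-error behaviour, with a made-up `fetch` standing in for the real RPC to a child node:

```rust
// `fetch` is a hypothetical stand-in for the RPC that reads one student's
// data from the node owning that sub-prefix.
fn fetch(prefix: &str) -> Result<String, String> {
    // Pretend one student's node is offline.
    if prefix == "/students/Eve" {
        Err(format!("{prefix}: node unreachable"))
    } else {
        Ok(prefix.to_string())
    }
}

// Because the "/students" authority knows every sub-prefix, a listing can
// fail loudly instead of silently dropping students whose node is down.
fn list_all(known_sub_prefixes: &[&str]) -> Result<Vec<String>, String> {
    // Collecting an iterator of Results stops at the first Err, so a
    // partial result is never returned as if it were complete.
    known_sub_prefixes.iter().map(|p| fetch(p)).collect()
}

fn main() {
    assert!(list_all(&["/students/Bob", "/students/Alice"]).is_ok());
    assert!(list_all(&["/students/Bob", "/students/Eve"]).is_err());
}
```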

Can you give some use cases for the database? How would it be used in a real university, for example?

I don’t know if it’s useful to use this kind of database for a university, because universities usually aren’t spread all over the world, meaning that they wouldn’t really benefit from distributing their data globally the way ScudDB allows.

A simple use case is a chatroom application, where people mainly talk to other people in the same region, meaning that you could locate the chatroom and its messages in the same region as the majority of the users in that chatroom.

I hope that clears up your questions. If anything is still unclear, feel free to ask :)

Hi,

If you speak Dutch :). Unfortunately I am somewhat obligated to write my thesis in Dutch because of language policies at my university.

I can read pictures lol, yeah I can’t read Dutch…

Thank you for your answers, very interesting concept. One of the use cases that comes to mind would be a discussion board like this one.

Hi,

I find your project very interesting because I am building a database myself (GitHub - fsvieira/beastdb: A specialized database for state search space problems). What I found most interesting in your idea is the write authority being handled in a distributed way, but I have some questions about availability and redundancy.

What I understand is that each node has authority over a prefix, which means that each node is also responsible for keeping the data and serving data access:

  • So if a node goes down: this prefix and its data will be unavailable?
  • If a node is destroyed: the data is lost forever (unless the node has backups)?

I am considering some of these ideas in the beastdb design, which I would like to become a distributed database. The write authority is especially interesting because it’s a way to coordinate which computing nodes can write to the database; however, there are still the questions about availability and strong consistency.

So are there any solutions to these kinds of problems that you have found for ScudDB? Did you ever consider, for example, giving authority to clusters instead of single nodes, so that if one node went down another would take its place?

What about read authority and integrity? Even if only one node has write access, would it be possible to have many read nodes? Would it be possible to be sure that the data is correct?

Thanks.

Hi, thanks for your continued interest in ScudDB!

  • So if a node goes down: this prefix and its data will be unavailable?
    Yes, exactly.
  • If a node is destroyed: the data is lost forever (unless the node has backups)?
    Also yes. These are both direct consequences of a single node having sole authority over data.

So to remediate these problems, you would turn one logical node into a localized cluster of nodes that share authority over a single key. Since each of these clusters would be constrained to a single region, the latency impact would still be low, while single-node failures are tolerated and strong consistency is preserved.

Having localized clusters of nodes also allows you to spread the read load over multiple nodes (depending on what read/write quorum or other consensus algorithm you use).

  • What about read authority and integrity? Even if only one node has write access, would it be possible to have many read nodes? Would it be possible to be sure that the data is correct?

You do need to be careful with this, though. As I said earlier, localized clusters with an appropriate consensus algorithm would work here. You can’t, however, just asynchronously replicate data to some read-only nodes, because the following scenario would break strong consistency:

  1. The write node receives a request to update a key.
  2. A read node receives and responds to a read request before the write node’s update has reached it.
  3. You’ve now read an old value after a new one was written, thereby breaking strong consistency.

Avoiding that scenario is basically what consensus algorithms and read/write quorums let you do.
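
For intuition, this is the quorum overlap rule those algorithms rely on: with N replicas, a write quorum W and a read quorum R, every read set intersects the latest write set whenever R + W > N. A tiny sketch:

```rust
// Quorums overlap iff R + W > N: any R replicas must share at least one
// node with the W replicas that acknowledged the latest write.
fn quorums_overlap(n: u32, w: u32, r: u32) -> bool {
    r + w > n
}

fn main() {
    // 3 replicas, write to 2, read from 2: every read set shares a node
    // with the last write set, so the new value is always seen.
    assert!(quorums_overlap(3, 2, 2));
    // Asynchronous replication is effectively W = 1; reading from a single
    // replica (R = 1) no longer overlaps, which is exactly the race above.
    assert!(!quorums_overlap(3, 1, 1));
}
```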

Hi,

Thank you very much for answering my questions. You are absolutely right that read nodes can’t possibly maintain strong consistency; I had thought of read nodes as a “cache” where write/read race conditions are not a problem.

Moving from single nodes to clusters would pose new challenges, but I am sure these could be solved in different ways (e.g. with a consensus protocol).

My questions are all answered. I’m not sure if you are going to continue with ScudDB, but I wish you the best of luck with your present and future projects.

Thanks.