Open Grant: Peerbit #260

@marcus-pousette

Open Grant Proposal: Peerbit

Name of Project: Peerbit

Proposal Category: devtools-libraries

Proposer: marcus-pousette, allberg

(Optional) Technical Sponsor: -

Do you agree to open source all work you do on behalf of this RFP and dual-license under MIT, APACHE2, or GPL licenses?: Yes

Project Description

We are building a P2P database framework on top of the IPFS stack so that developers can build and maintain a distributed, private, searchable state across devices end to end.

We are solving two problems. First, we are bringing privacy and decentralization into the P2P database framework space by building a framework where encryption and distribution (sharding) are core features. Second, we aim to reduce infrastructure costs for all types of organizations by providing a framework that lets services utilize consumer hardware efficiently through smart auto-sharding that respects network dynamics and device capabilities. In addition, Peerbit can cut development time, since you no longer have to think about a separate "backend" and "frontend", just the peer client.

We are confident that we can solve the problems outlined above, since we have already spent months developing a working prototype that has the core functionality, such as sharding, encryption and distributed search. In addition, we have been deep diving into this space and know the benefits and shortfalls of the alternative solutions that exist today, like ThreadDB and OrbitDB. The key problems we have identified with existing technologies are that they do not provide sufficient privacy, scalability or a good enough developer experience to compete with traditional tech stacks.

You can read more in depth about what Peerbit is today in the repository.

Value

The benefit for the ecosystem of getting this project right is that developers can unlock a large user base that would create and store data with a distributed mindset. Developers can choose either to be part of the replication process in a network and store content on a local IPFS node, or, for example, to use Filecoin to put the responsibility for storage on someone else.

The risk of not getting this project "right", apart from wasted time and money, is the risk that comes with any distributed storage project: what happens if illegal or unwanted content gets distributed with this technology?

Technical risks include unforeseen weaknesses in the protocol that would lead to loss of data.

There are many technical challenges with this project, mainly how to write a framework with the right amount of abstraction so that the developer experience is good: fast onboarding, while still allowing high configurability for the users that demand it. In addition, some technologies, such as the WebRTC transport, are quite new and might lead to unexpected challenges that are hard to foresee.

Deliverables

When all milestones have been achieved Peerbit will be a framework for building:

  • P2P distributed state that persists in a network where nodes join and leave.
  • Automatic sharding that respects device capacity, such as storage, RAM, disk, CPU, battery life and predicted availability.
  • Chain-agnostic and cross-device identities. It does not matter if you are using MetaMask (Eth), Phantom (Solana) or any other wallet to authorize yourself, since the protocol allows transactions to be signed with multiple types of signature algorithms. In addition, the protocol supports linking identities across devices so that you can modify an access-controlled state from multiple devices seamlessly.
  • Read and write access controlled states
  • Example apps, support forum and rich documentation to help developers get started.
  • An easy method of updating the programs that nodes are running.
  • Privacy through E2EE, with a roadmap for how to implement forward secrecy and zero-knowledge access controllers.

Milestones outlined below are targeting the main tasks we have to complete to achieve this deliverable.

Development Roadmap

1. Automatic updates

In the Peerbit world replicator nodes can help networks by replicating content and providing search indices. They perform their job by simply subscribing to particular PubSub topics. Messages in these topics can instruct the node to open a particular database from a manifest and start replicating some content and build an index for search capabilities. At some point in time, there will be a need to update the software that helps the replicator node to interpret and handle different kinds of manifests. If I were a node provider, providing thousands of concurrently running nodes in different networks, it would be an overwhelmingly cumbersome job to maintain all nodes manually. Instead, it would make sense that the protocol itself can instruct the nodes to consider updates just as the protocol can instruct nodes to open a particular database.
Hence, this milestone delivers the following:

  • Updates can be suggested by peers for a network/cluster
  • Nodes can approve or reject updates depending on the identity of the suggesting party
  • Safe, gradual rollouts are supported so that no data is lost during the upgrade process
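
To make the approve/reject step concrete, here is a minimal sketch of how a node could decide on a suggested update. Everything in it is an assumption made for illustration: the `UpdateProposal` shape, the `decide` function and the idea of a locally configured set of trusted identities are hypothetical and are not taken from the Peerbit codebase.

```typescript
// Hypothetical types for illustration only; not part of the Peerbit API.
interface UpdateProposal {
  programId: string;   // which program the update targets
  fromVersion: number; // version the proposer expects the node to be running
  toVersion: number;   // version being suggested
  proposer: string;    // identity (e.g. a public key) of the suggesting peer
}

type Decision = "approve" | "reject";

// Decide on a proposal using only local state: the version the node is
// running and the identities the node operator has chosen to trust.
// A real implementation would also verify a signature over the proposal
// before trusting the `proposer` field.
function decide(
  proposal: UpdateProposal,
  runningVersion: number,
  trustedProposers: ReadonlySet<string>
): Decision {
  // Reject suggestions from identities this node does not trust.
  if (!trustedProposers.has(proposal.proposer)) return "reject";
  // Reject stale proposals that were written against another version.
  if (proposal.fromVersion !== runningVersion) return "reject";
  // Only ever move forward, never sideways or backwards.
  if (proposal.toVersion <= proposal.fromVersion) return "reject";
  return "approve";
}
```

Requiring `fromVersion` to match the running version is one simple way to keep upgrades stepwise, which is a precondition for the gradual rollouts listed above; it is a design choice of this sketch, not a stated requirement of the protocol.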

Subtotal: 160 hours (120 hours of research and implementation work, and 40 hours additional feature specific maintenance after rollout) * 80 USD/h = 12800 USD

ETA 4 weeks

Assignees: Marcus Pousette, developer

2. Improved sharding: Distribution that is conditioned on device capacity

With this milestone we improve the sharding algorithm by considering peer capacity. Some instances will be more powerful than others, hence the distribution of content and building the search index should be done accordingly.

  • Research and develop a deterministic algorithm that incorporates device capacity into the leader/replication election routine. It should respect device capacity, such as storage, RAM, disk, CPU, battery life and predicted availability (it might not be possible to integrate every capacity feature, but as many as feasible will be included)
  • Analyze and build safeguards to make sure this feature does not introduce attack vectors for DDoS or privacy breaches.

Subtotal: 160 hours (120 hours of research and implementation work, and 40 hours additional feature specific maintenance after rollout) * 80 USD/h = 12800 USD
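
One known technique that fits the "deterministic and capacity-aware" requirement is weighted rendezvous hashing (highest random weight). The sketch below is an illustration of that technique only, not the algorithm this milestone will necessarily produce. The `Peer` type, the single `capacity` number and the FNV-1a hash are assumptions; a real implementation would need a stronger hash and a capacity value that peers cannot inflate freely.

```typescript
// Hypothetical peer description; `capacity` is an assumed, non-negative
// weight that summarizes storage, CPU, availability and so on.
interface Peer {
  id: string;
  capacity: number;
}

// FNV-1a (32 bit), mapped into the open interval (0, 1).
// Chosen only because it needs no imports; it is not cryptographic.
function hashToUnit(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return ((h >>> 0) + 0.5) / 4294967296;
}

// Weighted rendezvous score: larger capacity gives a larger expected score.
function score(key: string, peer: Peer): number {
  if (peer.capacity <= 0) return 0; // peers without capacity sort last
  return -peer.capacity / Math.log(hashToUnit(`${key}:${peer.id}`));
}

// Every node that knows the same peer list computes the same replicators
// for a given key, without exchanging any messages.
function selectReplicators(key: string, peers: readonly Peer[], count: number): string[] {
  return peers
    .map((p) => ({ id: p.id, s: score(key, p) }))
    // Sort by score (descending), then by id so ties resolve identically everywhere.
    .sort((a, b) => b.s - a.s || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0))
    .slice(0, count)
    .map((x) => x.id);
}
```

Calling `selectReplicators("some-entry-hash", peers, 2)` on two nodes with the same peer list yields the same two ids regardless of the order in which the peers are listed. Because a self-reported capacity directly influences the outcome, this is exactly the kind of input the safeguards bullet above would have to cover.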

ETA 4 weeks

Assignees: Marcus Pousette, developer

3. Improved developer experience

With this milestone we make onboarding easy for developers with different levels of programming experience by providing a large collection of examples that resemble different kinds of use cases, and by providing tools for easily setting up and maintaining nodes.

  • Build a library of example projects (at least 3 examples):

    Ideas:

    • Distributed file storage
    • Chess game
    • Paint together
    • E2EE chat
  • Create an easy-to-use CLI for deploying a replicator node behind a domain with an SSL certificate

  • Set up a developer support forum and chat so that minor questions can be answered quickly by the maintainers or the community.

Subtotal: 80 hours * 80 USD/h = 6400 USD

ETA 2 weeks

Assignees: Marcus Pousette, developer. Erik Allberg, product

4. Performant indexing for document stores

Right now, the computational complexity of querying for a particular state locally in a Document store is linear in the number of documents that exist (one has to go through every document to see if it matches the query). This can and has to be improved greatly. Performant and reliable query capabilities have to exist, fundamentally, if this framework is ever going to be considered a go-to way of building distributed applications. With this milestone we make this improvement by integrating a high-performance search index engine that allows peers to make content searchable to a greater extent.

  • Integrate Tantivy or ProblySearch (or any other feasible library) into the Document store data type
  • Allow nodes to choose what kind of indexing capability they want to provide for the network

This integration is non-trivial, as no implementation yet exists that can be compiled to WASM and has all the indexing capabilities needed for this project. It might require some search engine implementation work.
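
To illustrate the difference between the current linear scan and an indexed lookup, here is a minimal sketch of an inverted index over a set of documents. It is a toy written for this explanation: the `Doc` type, the tokenizer and the `InvertedIndex` class are assumptions and do not reflect how Tantivy, any other search library, or the Peerbit Document store actually works.

```typescript
// Hypothetical document shape for illustration.
type Doc = { id: string; text: string };

// Lowercase and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Current situation: every query visits every document.
function linearSearch(docs: readonly Doc[], term: string): string[] {
  const t = term.toLowerCase();
  return docs.filter((d) => tokenize(d.text).indexOf(t) !== -1).map((d) => d.id);
}

// Indexed alternative: pay the tokenization cost once when a document is
// added, then answer single-term queries with one map lookup.
class InvertedIndex {
  private postings = new Map<string, Set<string>>();

  add(doc: Doc): void {
    for (const token of tokenize(doc.text)) {
      let ids = this.postings.get(token);
      if (!ids) {
        ids = new Set<string>();
        this.postings.set(token, ids);
      }
      ids.add(doc.id);
    }
  }

  search(term: string): string[] {
    const ids = this.postings.get(term.toLowerCase());
    return ids ? Array.from(ids) : [];
  }
}
```

Both functions return the same ids for a single-term query; the index trades extra memory and write-time work for query time that no longer grows with the total number of documents. A production search engine adds ranking, phrase and range queries, and persistence on top of this basic structure.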

Subtotal: 200 hours (160 hours of research and implementation work, and 40 hours additional feature specific maintenance after rollout) * 80 USD/h = 16000 USD

ETA 5 weeks

Assignees: Marcus Pousette, developer.

5. E2EE ZK-ACL and forward secrecy

With this milestone we do thorough research and create a roadmap for how we can improve the security of the protocol. This is a pure research milestone to determine whether it is possible to incorporate some powerful security and privacy measures.

  • Research how zero-knowledge proofs can let peers create access controllers that allow them to gate-keep logs without knowing the identities of the participants. Currently, if you want a relay that replicates a database with an identity-based access controller, this relay has to know the identities behind the commits in order to approve or reject them.
  • Research how forward secrecy can be implemented for Peerbit. Create a definite roadmap for how (if applicable) to integrate forward secrecy without unwanted side effects (e.g. technical constraints imposed on other features). Forward secrecy could perhaps be an optional feature for clients that are willing to pay the cost of the side effects in exchange for more security.

Subtotal: 80 hours * 80 USD/h = 6400 USD

ETA 2 weeks

Assignees: Marcus Pousette, developer. Erik Allberg, product.

Total Budget Requested

680 hours * 80 USD/h = 54400 USD

Maintenance and Upgrade Plans

We are committed to maintaining the code, since we intend to build and maintain a social collaboration protocol on top of this framework. That will require us to both maintain and improve the framework in the future so that it can meet the challenging demands this will impose. In addition, we have developed Peerbit with the mindset that the codebase should be easy to understand, even though it is packed with features, so that anyone who wants to contribute can learn and understand it in a short amount of time.

Team

Team Members

Marcus Pousette
Background in Applied mathematics and Engineering Physics.
10 years of developer experience in total. 1 year of work dedicated to compiler technology. 2 years of work related to search engine technologies.
1 year of experience with the IPFS stack. Proficient in Rust and TypeScript.

Erik Allberg
Co-founded market.xyz. Early core contributor to Logseq. Background as founder of e-commerce startups. Has been doing full-time R&D on the Global Giant Graph for 2 ½ years.

Team Member LinkedIn Profiles

https://www.linkedin.com/in/marcus-pousette-06092b102/

https://www.linkedin.com/in/allberg/

Team Website

https://dao.xyz/

Relevant Experience

We have separately spent years working on scalable data applications, compiler technology, search engine technologies, web3 and the IPFS stack, including implementing scalable applications for the mass market.

Team code repositories

Peerbit: https://github.com/dao-xyz/peerbit

Other related repositories: https://github.com/dao-xyz

Additional Information

How did you learn about the Open Grants Program?
Heard about it at the IPFS Camp in Lisbon.

Please provide the best email address for discussing the grant agreement and general next steps.
marcus@dao.xyz
