Announcing Project E3DB: The End-to-End Encrypted Database

Update: TozStore is out of preview and into production! We are looking for early adopter developers to try it out and give us feedback.

Today, Project E3DB is a tool for programmers who want to build an end-to-end encrypted database with sharing into their projects. We are providing a command-line client for you to play with and a Java SDK to prototype with. Over the next few weeks, we’ll be getting feedback from you, adding features, and making it ready for production.

[Update: When this blog post was first written, E3DB was still in Beta. It’s now in production as TozStore, so please use it!]

Why did we build this thing:

We are passionate about security and privacy. We think crypto is usually way too hard for users and way too hard for programmers. We want to make it easy. A few years back, we released a popular AES library for Android that helped developers Do The Right Thing (since Android makes it very easy to do the wrong thing). That was a great experience, and Project E3DB is a natural extension of that, if massively more ambitious.

What it does:

  • Stores JSON data blobs with a REST API: Well, that’s not too exciting, but it’s the baseline. Need to read & write data into the cloud from your mobile or web app? Project E3DB does it.
  • Encrypts that data on the client: Now we’re talking! We give you code to embed into your web or mobile app that encrypts the data before sending and storing it in the cloud. We’re not just talking about using HTTPS. Our cloud server can’t see the plain text; only your client can.
  • Lets you add trusted readers: By default, only your mobile or web client can read the data you put into Project E3DB, but if you want, you can add other mobile or web clients, and we’ll let them download the encrypted data too. But of course, it’s encrypted, so…
  • Adds keys for trusted readers: Your trusted readers need to be able to decrypt the data, so you need to add their key to the list of trusted readers. Our SDK makes that easy while keeping you in control.
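
To make the "encrypts that data on the client" step concrete, here is a minimal sketch of the idea. It is not the E3DB SDK: the class and method names are made up for illustration, and it uses the JDK's built-in AES-GCM as a stand-in cipher. The point is only that the JSON is sealed before it leaves your app, so the server would only ever receive the opaque blob.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical illustration only; this is NOT the E3DB client API.
public class ClientSideEncryptionSketch {
    private static final int NONCE_BYTES = 12; // standard GCM nonce size
    private static final int TAG_BITS = 128;   // GCM authentication tag size
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a 256-bit AES key that stays on the client. */
    public static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts a JSON string and returns nonce || ciphertext || tag. */
    public static byte[] encrypt(SecretKey key, String json) {
        try {
            byte[] nonce = new byte[NONCE_BYTES];
            RANDOM.nextBytes(nonce);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, nonce));
            byte[] sealed = cipher.doFinal(json.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[nonce.length + sealed.length];
            System.arraycopy(nonce, 0, out, 0, nonce.length);
            System.arraycopy(sealed, 0, out, nonce.length, sealed.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt(); fails if the blob was tampered with. */
    public static String decrypt(SecretKey key, byte[] blob) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, blob, 0, NONCE_BYTES));
            byte[] plain = cipher.doFinal(blob, NONCE_BYTES, blob.length - NONCE_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String json = "{\"name\":\"Alice\",\"phone\":\"555-0100\"}";
        byte[] blob = encrypt(key, json);
        // Only this opaque value would be sent over HTTPS and stored.
        System.out.println("upload body: " + Base64.getEncoder().encodeToString(blob));
        System.out.println("decrypted:   " + decrypt(key, blob));
    }
}
```

HTTPS still protects the blob in transit, but it is the client-held key that keeps the plain text away from the server.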

[Figure: Dataflow overview for E3DB]

What you can use it for:

When we do a real release, you can use Project E3DB for any type of data storage.

  • Do you collect data about your users? You should be storing it encrypted at rest so the bad guys can’t get it. You really should already be doing this.
  • Do you share data with third parties? Instead of sending them your data in the clear, make them read it encrypted at the source. It lets you stay in control and doesn’t encourage them to store their own plain text copy.
  • Do you want your users to communicate end-to-end encrypted? Each party can have their own key and send JSON blobs to each other without you or us being able to decrypt them. Sweet.
  • Do you want to segregate data among your servers? Ever heard of the Principle of Least Privilege? Project E3DB lets you create a JSON object store where only the services that need access to the data will get the keys to the data. That means that if a bad guy breaks into one part of your system, they don’t get everything!

What’s available today:

We’ve got a few things that you can grab from GitHub and start playing with:

  • An SDK (jar file) for integrating into your Java projects: This is a Project E3DB client that does the encrypting, decrypting, encoding, sharing, and networking.
  • An open source command-line client that shows how to use even more of the features. This is really just to illustrate using E3DB without having to write any code. You could even layer test code on top of this by shelling out to it if your project isn’t in Java. That’s what we did with the feedback script.

What we need from you:

  1. Get Project E3DB from GitHub. There’s a zip file with the command-line client, or you can clone the repo to build and mess around with the Java code.
  2. Try it out. The command-line client takes a few minutes to get going; integrating it into an app or Java project probably takes a few hours.
  3. We’d like your feedback! There’s a feedback command you can try out that shares your feedback with Isaac via E3DB. Alternatively, you can email us your thoughts: feedback@tozny.com. You can say whatever you want, but here is what would be most useful to us:
    • Do you feel like you get what Project E3DB does?
    • Do you see value in it for you or your users? If so, can you tell us what’s valuable about it?
    • Was it easy, normal, or hard to use?
    • If it weren’t for Project E3DB, how would you accomplish this same task? Unencrypted database? Some similar project? Rolling it yourself?
    • Suggestions? Other stuff you want to say?

Where is Project E3DB going:

We’re building a Personal Data Service (PDS), and Project E3DB is the cryptographic core of our PDS. Think of a PDS as a single place for users to manage their privacy settings, giving organizations access to the information that users are willing to share, but limiting access to information users themselves consider more sensitive. PDSs provide security, user-controlled sharing, and a robust access control system so that only authorized third parties have access to the data.
Read more from our CEO about privacy in the cloud.

Upcoming features:

  • Per-record sharing: The current implementation lets you securely share data by its content type (e.g. “please share all of my contacts with this other user”). We’re going to make sharing much more fine-grained so that you can select specific records, or maybe even specific fields.
  • User portal: Since E3DB will eventually become a personal data service, we will provide a graphical interface for data producers and users to view their data and manage sharing, as well as log in with e.g. OIDC.
  • Fine-grained queries: The current system lets you list all of your data, but the near-term plan is to be able to query by content type and record fields. This is why the content type and record fields are not currently encrypted. We want you to be able to query on them. This may be parametrizable later. See more details below.
  • Richer policy language: The current policy allows you to share data with other users, but nothing more complex. Longer term, we will provide richer rules for how data is handled.
  • Parametrizable cryptography: We will increase the default key sizes and make the key sizes, ciphers, and modes parametrizable. We use JSON Web Encryption under the hood, which is cipher and key-size agnostic, and we will be exposing more of this functionality via the Java client soon.
  • Key rotation: We will implement a built-in method for rotating the symmetric keys that encrypt data.
  • More language support: Our current library is implemented in Java, and so is suitable for JVM languages such as Scala, as well as for Android. We also plan to support a low-level library (probably in C) so that other languages can implement bindings using a foreign function interface.
  • Performance and other miscellaneous improvements.

Appendix: More on the process & the Crypto

The fun thing about E3DB in both the command line and in the Java code is that it’s encrypting JSON blobs and storing / retrieving / sharing them without you having to really worry about the crypto. The downside of this is that it’s not completely obvious what’s going on!

We will be publishing more details about the crypto soon, but we wanted to provide a brief overview here.

What gets encrypted:

It’s important to understand that certain metadata are not encrypted with the data keys as outlined below. Metadata is important from a security perspective; it can reveal who you are communicating with, and what you are communicating about. In particular, the content type, the field names, and the users who have access to the data are protected with role-based access control, but not with your keys.

  • Pros: The advantage of this approach is that it supports querying; you can (eventually) query for the metadata type and the field names, making the database significantly more useful.
  • Cons: Assuming that you use meaningful names for things (and to a lesser extent, even if you don’t), the structure of the data can be available to an attacker who has achieved privileged access to E3DB. That is, the attacker could tell that something is a “user record” and a “social security number”, but not what that social security number is.
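
Here is a rough sketch of that trade-off. Everything in it is an assumption made for illustration: the class, the method names, and the record layout are hypothetical, and JDK AES-GCM stands in for the real cipher. It encrypts each field value but leaves the content type and field names readable, which is roughly what someone with privileged access to the server would see.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical record layout; not the actual E3DB wire format.
public class RecordEnvelopeSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    public static SecretKey newDataKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts one value; returns base64(nonce || ciphertext || tag). */
    public static String encryptValue(SecretKey key, String value) {
        try {
            byte[] nonce = new byte[12];
            RANDOM.nextBytes(nonce);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, nonce));
            byte[] sealed = cipher.doFinal(value.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[nonce.length + sealed.length];
            System.arraycopy(nonce, 0, out, 0, nonce.length);
            System.arraycopy(sealed, 0, out, nonce.length, sealed.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decrypts a value produced by encryptValue(). */
    public static String decryptValue(SecretKey key, String encoded) {
        try {
            byte[] blob = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, blob, 0, 12));
            byte[] plain = cipher.doFinal(blob, 12, blob.length - 12);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts every value but leaves the field names as they are. */
    public static Map<String, String> encryptValues(SecretKey key, Map<String, String> fields) {
        Map<String, String> stored = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            stored.put(field.getKey(), encryptValue(key, field.getValue()));
        }
        return stored;
    }

    public static void main(String[] args) {
        SecretKey dataKey = newDataKey();
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "Alice Example");
        fields.put("social_security_number", "000-00-0000");

        Map<String, String> stored = encryptValues(dataKey, fields);
        // The content type and field names are visible; the values are not.
        System.out.println("content type: user_record");
        stored.forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```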

We’d like your feedback on this approach. We may make this optional in the future so that the user can choose what’s encrypted, while accepting a limited ability to query data.

How do keys work:

NOTE: This section has been updated since original publication to reflect the current crypto approach.

In short, Tozny uses libSodium for cross-platform strong crypto. When you register with Project E3DB, you generate some keys, including:

  • Data Key: A symmetric key used to encrypt the data itself.
  • Authorization (AuthZ) Key: A symmetric key used to encrypt the data key. Authorized parties get access to this key.
  • Client Key: A public / private key pair that is used to encrypt an authorization (authz) key. E3DB maintains the list of public keys and provides them on request for sharing. Longer-term, we want to provide an out-of-band mechanism to help with sharing or verifying public keys.

Then when you share the data, that same authz key is encrypted with the other party’s public key. We call the set of encrypted authz keys the “Cryptographic Authorization Block” (CAB).
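
The key layering above can be sketched in a few lines of Java. Treat this as an illustration under stated assumptions rather than our implementation: libSodium is not part of the JDK, so the sketch substitutes the JDK's AES key wrap and RSA-OAEP, and the class name, method names, client IDs, and the map used for the CAB are all hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the data key / authz key / client key layering.
// Uses JDK AES key wrap and RSA-OAEP as stand-ins for libSodium primitives.
public class KeyHierarchySketch {
    private static final String RSA_OAEP = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    public static SecretKey newSymmetricKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static KeyPair newClientKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts the data key under the symmetric authz key. */
    public static byte[] wrapDataKey(SecretKey authzKey, SecretKey dataKey) {
        try {
            Cipher cipher = Cipher.getInstance("AESWrap");
            cipher.init(Cipher.WRAP_MODE, authzKey);
            return cipher.wrap(dataKey);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static SecretKey unwrapDataKey(SecretKey authzKey, byte[] wrapped) {
        try {
            Cipher cipher = Cipher.getInstance("AESWrap");
            cipher.init(Cipher.UNWRAP_MODE, authzKey);
            return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts the authz key under one client's public key (one CAB entry). */
    public static byte[] wrapAuthzKey(PublicKey clientPublicKey, SecretKey authzKey) {
        try {
            Cipher cipher = Cipher.getInstance(RSA_OAEP);
            cipher.init(Cipher.WRAP_MODE, clientPublicKey);
            return cipher.wrap(authzKey);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static SecretKey unwrapAuthzKey(PrivateKey clientPrivateKey, byte[] wrapped) {
        try {
            Cipher cipher = Cipher.getInstance(RSA_OAEP);
            cipher.init(Cipher.UNWRAP_MODE, clientPrivateKey);
            return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair writer = newClientKeyPair();
        KeyPair reader = newClientKeyPair();
        SecretKey dataKey = newSymmetricKey();
        SecretKey authzKey = newSymmetricKey();
        byte[] wrappedDataKey = wrapDataKey(authzKey, dataKey);

        // The CAB: the same authz key, encrypted once per authorized client.
        Map<String, byte[]> cab = new HashMap<>();
        cab.put("writer", wrapAuthzKey(writer.getPublic(), authzKey));
        cab.put("reader", wrapAuthzKey(reader.getPublic(), authzKey)); // sharing

        // The reader unwraps the authz key, then the data key.
        SecretKey readerAuthz = unwrapAuthzKey(reader.getPrivate(), cab.get("reader"));
        SecretKey readerData = unwrapDataKey(readerAuthz, wrappedDataKey);
        System.out.println("reader recovered the data key: "
                + Arrays.equals(readerData.getEncoded(), dataKey.getEncoded()));
    }
}
```

Notice that sharing with a new reader only adds one entry to the CAB; the data itself does not need to be re-encrypted.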

Here is a brief video explainer on the crypto (Note: since creating this video, the ciphers have been changed to use libSodium instead of AES and RSA).

And here’s the flow in picture form:

[Figure: E3DB simple data flow]