I'm the technical lead for decentralization at the Internet Archive and developer of the Dweb.archive site. I've been involved in decentralization on and off since forever, since the 80s, when we built an ISP to take the Internet to the developing world, an organization called the APC. It's still around. And 20 years ago I was CTO of a company that did peer-to-peer video. It didn't go very far; the startup crashed, but I've played with this since then.
I'm doing this talk partly because I showed some of this technology to a number of other developers and they said, oh, we're struggling with the same problem. Our mission is universal access to all knowledge, but what about the bits you don't want to share? Your viewing list, your bookmarks, your Facebook feed that you only want your friends to see, especially if you don't want to share it with Cambridge Analytica. So how do you build that technically? This is a super technical, geeky, look-at-the-data-structures talk, which I've gone through with a couple of people who have been building it into their applications. All this stuff is available open source, as with everything we do, so the library is downloadable, but it would need adapting to different circumstances.
So, you say, how do you provide access to all knowledge except the private bits? How can you hide things when there's no server to store passwords in? Of course, we all know that cryptography is the answer, but implementing it in each scenario can get complex, and if you talk to a crypto geek, you probably need someone else to decrypt the answer. I was interested in a particularly tricky problem that turns out to have some broad applicability, so let's look at my set of constraints.
Let's imagine a personal updates feed, just because it's something you're all familiar with, and let's imagine Alice. Anyone who's read any crypto white paper knows she's always called Alice, for some reason lost in the history of time. Her updates feed has a list of content, which hopefully she's going to expand over time, and she also has a list of people she wants to have access to it, her friends, and we also want that to change over time. Those two constraints, the need to be able to add content to a feed and the need to be able to add people to the feed, without telling everyone you add about every piece of content that was put there or telling every piece of content about each new person you add, eliminate a lot of the crypto solutions that you'll see in other places here, where we're talking about chat applications in which we can share keys in real time. So we need different solutions for different constraints.
Our basic structural assumption is that there's a potentially growing list of articles which refers to an access control list, and that access control list refers to the users who can access it. We'll flesh out some of the details in a bit. I'm going to describe it in terms of the data structures we use in our dweb-objects JavaScript library, but it could just as easily be implemented directly on top of any of the underlying transports like GUN or Yjs; any decentralized data storage system should be able to implement this authentication system. In particular, we wanted to implement authentication in a way that was independent of the platform that's doing it. We want a system that's a property of the data, not a property of whether you get that data via IPFS or GUN or Yjs. So let's say Bob, and he's always called Bob for the same reason that Alice is always called Alice, let's say Bob wants to access some content.
So he starts out with the address of a content list, that is, a list of places where he might be able to get this stuff. We currently support a whole bunch of different transports in our protocols, so in this case we're supporting HTTP, GUN, and Yjs. For any developer out there who's building on top of transports, I would suggest that wherever you can put a URL, you support multiple URLs, an array of URLs, because you don't know which of those is going to work. Especially if you're working in places like Medli does, where things get blocked, you may find that some of these protocols are blocked and you want to be using others. So design your apps around the thought that you may be getting data from multiple places. Everything we've done, apart from implementing it on top of GUN and IPFS, we also implemented on top of HTTP in no-knowledge content servers, so the servers don't know what they're holding. They're just storing stuff.
So Bob can use those URLs to retrieve a list of content, and each piece of content is simply stored encrypted with an AES token, for example, though it could be any kind of symmetric encryption system, together with a reference to an access control list. That content list was immutable. The content itself is immutable, so that could have come via IPFS or come via anywhere. The access control list, as I say, has a list of public URLs in a bunch of different systems, and in that list are entries which include Bob's public key and the token encrypted with Bob's key. In the example we used, the Facebook feed, there's an entry like that in the ACL for each of Alice's friends. What Bob has to do to decrypt that content is follow the link to the access control list, look up his public key in that list, pull back the token, decrypt it with his private key, and use it to access the data.
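The lookup Bob performs can be sketched with Node's built-in `crypto` module. This is only an illustration under stated assumptions, not the dweb-objects API: the function names, the choice of RSA-OAEP to wrap the token, AES-256-GCM for the content, and PEM strings as the lookup key in the ACL are all hypothetical choices made for the example.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: an RSA key pair exported as PEM strings.
function newKeyPair() {
  return crypto.generateKeyPairSync('rsa', {
    modulusLength: 2048,
    publicKeyEncoding: { type: 'spki', format: 'pem' },
    privateKeyEncoding: { type: 'pkcs8', format: 'pem' },
  });
}

// Encrypt a piece of content once, with the symmetric token.
function encryptContent(token, plaintext) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', token, iv);
  const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return { iv, data, tag: cipher.getAuthTag() };
}

function decryptContent(token, { iv, data, tag }) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', token, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString('utf8');
}

// The ACL holds one entry per friend: their public key and the token
// encrypted to that key. Adding a friend never touches the content.
function buildAcl(token, friendPublicPems) {
  return friendPublicPems.map((publicKey) => ({
    publicKey,
    encryptedToken: crypto.publicEncrypt(publicKey, token),
  }));
}

// Bob's side: find his entry, unwrap the token, decrypt the content.
function readViaAcl(acl, myPublicPem, myPrivatePem, encrypted) {
  const entry = acl.find((e) => e.publicKey === myPublicPem);
  if (!entry) throw new Error('not on the access control list');
  const token = crypto.privateDecrypt(myPrivatePem, entry.encryptedToken);
  return decryptContent(token, encrypted);
}
```

Under these assumptions, Alice would call `encryptContent` once per item with a random 32-byte token and `buildAcl` once per change to her friends list; a reader whose key is not in the ACL gets an error rather than the plaintext.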
We've been building that in as a filter on the storage system, so as the storage system retrieves data, if it sees this underscore-ACL field in it, it will automatically decrypt it before handing it to the application, so the application itself doesn't have to think about whether the data is encrypted or not. And the same in the opposite direction. Please, ask questions.
So are you retaining the encryption for all of the friends, and then Bob finds one in that list?
Sorry, I couldn't hear the question. Questions are good, by the way. This is a technical...
Yeah, sorry, my question is, there's a friends list, right, so there are many people who have access, but it's a relatively small, finite list. Is Bob getting content that's tailored to him, so encrypted using his public key, or are all possible encryptions included and Bob finds the encryption that's relevant to him?
Neither of those is a good solution. What you want is a series of content which is encrypted with a key that relates to the content, and then the list is the mapping between the content's key and Bob's key. That means you can have a million pieces of content, all encrypted, and a list with a few hundred people on it, and you don't have to do an N by N, which is the thing you're trying to avoid and the thing that a lot of the solutions out there end up doing. N by N does not scale, obviously. Well, N by N scales; my laptop doesn't. Does that answer your question? Cool.
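The storage filter mentioned above might look something like the following sketch. It is not the dweb-objects implementation: `filteredStore`, `seal`, `unseal`, and the `tokenForAcl` callback are hypothetical names, the raw store is assumed to be a `Map`-like object, and AES-256-GCM is an assumed cipher. Only the `_acl` field name comes from the talk.

```javascript
const crypto = require('node:crypto');

// Encrypt a JSON-serializable object and tag it with the address of its
// access control list in an `_acl` field.
function seal(token, aclUrl, obj) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', token, iv);
  const data = Buffer.concat([cipher.update(JSON.stringify(obj), 'utf8'), cipher.final()]);
  return {
    _acl: aclUrl,
    iv: iv.toString('base64'),
    data: data.toString('base64'),
    tag: cipher.getAuthTag().toString('base64'),
  };
}

function unseal(token, stored) {
  const iv = Buffer.from(stored.iv, 'base64');
  const decipher = crypto.createDecipheriv('aes-256-gcm', token, iv);
  decipher.setAuthTag(Buffer.from(stored.tag, 'base64'));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(stored.data, 'base64')),
    decipher.final(),
  ]);
  return JSON.parse(plain.toString('utf8'));
}

// Wrap a raw store so the application never sees ciphertext.
// `tokenForAcl(aclUrl)` is a hypothetical callback that walks the ACL and
// returns the symmetric token for the current user.
function filteredStore(raw, tokenForAcl) {
  return {
    get(url) {
      const stored = raw.get(url);
      // Objects without an `_acl` field are passed through untouched.
      if (!stored || stored._acl === undefined) return stored;
      return unseal(tokenForAcl(stored._acl), stored);
    },
    put(url, obj, aclUrl) {
      // Encrypt on the way in only when an ACL address is supplied.
      raw.set(url, aclUrl ? seal(tokenForAcl(aclUrl), aclUrl, obj) : obj);
    },
  };
}
```

The design point is that the check lives in one place: application code calls `get` and `put` and receives plain objects in both the encrypted and unencrypted cases.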
And then in our library we now have to look at how you manage those keys. The way we're doing it, and this would work in different ways in different systems, is we use a simple passphrase approach (actually, I should say, I think GUN has a better way of doing this), and we get back a key chain. The reason for a key chain is that Bob does not want to use the same key for everything. Too many people make that assumption. When I hear the term "Bob's public key", I get worried, because that assumes Bob only has one key, and really you want to be using a different key for everything you do. So we're assuming that you manage a key chain. But again, the key chain can't be stored somewhere private; it has to be stored in the decentralized web, because I want to open up my phone and use it, and I want to open up someone else's browser and access my stuff. So in this concept the key chain is out there on the net, it also has URLs, and it's encrypted in the same manner. Bob can use a passphrase or a Bitcoin mnemonic or something to get his master key, use the master key to retrieve the list, and on the list he will find key pairs, each of which is his key for a particular purpose, encrypted with his master key. That becomes a very simple way to manage a large set of keys in a public net without having to put them in any one person's particular technology, because you can store these data structures on IPFS and Yjs and GUN and all these different places.
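A key chain of this shape can be sketched as follows. The talk does not say how the master key is derived from the passphrase, so using scrypt here is an assumption, as are the function names, the per-entry `purpose` label, and AES-256-GCM for wrapping. The secrets are treated as opaque bytes; in practice each would be a serialized private key.

```javascript
const crypto = require('node:crypto');

// Stretch a passphrase into a 32-byte master key (scrypt is an assumption).
// The salt is not secret and can be stored alongside the key chain.
function deriveMasterKey(passphrase, salt) {
  return crypto.scryptSync(passphrase, salt, 32);
}

// Encrypt one purpose-specific secret under the master key.
function wrapKey(masterKey, purpose, secret) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', masterKey, iv);
  const data = Buffer.concat([cipher.update(secret), cipher.final()]);
  return { purpose, iv, data, tag: cipher.getAuthTag() };
}

function unwrapKey(masterKey, entry) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', masterKey, entry.iv);
  decipher.setAuthTag(entry.tag);
  return Buffer.concat([decipher.update(entry.data), decipher.final()]);
}

// The key chain itself is just a list of wrapped entries, so it can live on
// any public store. Only someone who can re-derive the master key can open it.
function keyFor(keychain, masterKey, purpose) {
  const entry = keychain.find((e) => e.purpose === purpose);
  if (!entry) throw new Error(`no key for ${purpose}`);
  return unwrapKey(masterKey, entry);
}
```

On a new device, Bob would fetch the salt and key chain from the net, type his passphrase, and call `keyFor` for whichever purpose he needs; a wrong passphrase fails the GCM authentication check and throws rather than returning garbage.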
There's one more step you need, which is how you publish content, and that goes down one more level of the crypto rabbit hole. Alice, the publisher, is going to have an encrypted private version of the content list, which has the information that she needs to add both people and content, which is basically the token. She needs the key pair associated with the content and an access control list, the private version of which contains the token that she's using, so she can decrypt. That can also be on the public net. So with all these data structures, everything except the passphrase is out there, publicly visible, publicly accessible, no secrets to store, which is the hard part with all this encryption stuff.
I've probably lost everybody there, and I don't have a good demo because we were too busy putting the Dweb.archive together. But the immediate application of this for us is the favorites list. If you want to run a favorites list in a decentralized web, where do you store it so that only the person who created it can view it, and how do you do it in a way that's independent of the device? I mean, I travel a lot, and I don't necessarily have the same device with me all the time, so how do we manage that? That was the immediate use. I'll happily talk about any specifics with people if they want to implement it in their own applications. All the code for this is in the dweb-objects library, which is in the same place as all of our other Dweb repos. Questions? Did I lose everybody in the weeds?
Yeah, my question is, if you remove somebody from the access control list, are you then re-encrypting the data set with a new AES key?
Yes, you need to do that if you want to remove someone. Revocation is very hard in a decentralized web, because you have to assume that any data that's ever been out there is always going to be out there. So if you re-encrypt all the content, that doesn't mean that someone won't be able to find the old version of the content via, say, its IPFS hash. The only way you can effectively remove people from something is to make sure that the code itself implements those rules, because most people will not be savvy enough to go and get the old version. So it works for the 90 percent case. It won't work for the 100 percent case, where you absolutely need to stop someone, and you really can't, because you don't know that they didn't look at it in the past and keep an unencrypted copy on their laptop. It's a different problem, and if you really want that, you need to do something a lot more sophisticated than what I've done. This will handle the 90 percent cases of, you know, I add people to my friends list, I remove them, I don't care if they already made a copy. Is that, yes?
Going back to the 100 percent case, a lot of this is predicated upon the assumption that encryption is not going to be broken at some point. So if I was the U.S. government and I was going to take a list of our spies in other countries, encrypt it, and put it on the blockchain, where it gets replicated out, then people could start trying to find some flaw in the encryption, and it could be ten years later that they find it. Is there any way at all to put data out there and have it be safe for a long period of time?
No. And I'm one of those people who had an argument with the NSA about how long the keys should be, back when export controls existed, an argument with them in public about it. I think it's a totally valid question, and it's one of those 100 percent questions. As for how we do something about that, I'm going to get philosophical about this.
I was involved in the IETF when we were going through the mail encryption issues. The problem was we had PGP, which was unusable, and we had S/MIME, which was unimplementable in those days. And so the perfect was the enemy of the good. Encryption has to be damn easy to use. More importantly, it has to be damn easy for application developers to put into their stuff. And the result of this is: how many people here send the majority of their email encrypted? Not a single hand went up. That has been technically possible for 30 years, right? It's not there because people make it so damn hard to implement that nobody implements it.
Let's have a big hand for Mitra Ardron, out there implementing what you need to implement. Let's go beyond theory.