Gmaxwell bitcoin wiki


You can think about this very pragmatically. If you are using a public ledger, then your competitors know your customers, suppliers, sales volumes, amounts, prices, etc. This causes leverage imbalances in contracts.

So your income goes up, and now your landlord tells you to pay him more. It creates a breakdown of personal harmony when your in-laws and neighbors are looking at your purchases. If a conman knows you just bought something, he can call you up and threaten you. If someone sees that you have a lot of money, then they know to target you.

Another thing that privacy is important for is that it goes hand in hand with fungibility. One bitcoin is the same as every other. If fungibility is broken, then it is unsafe to accept bitcoin from someone. If you have a central blacklist, then why bother with bitcoin anyway?

This has to be traded off with other considerations. You can build transparency into bitcoin. You can layer accountability and transparency on top of a private system. But going in the other direction is much harder or impossible.

You can build provable transparency in a way that you couldn't have done before. Giving people privacy on a system that is not private is much harder. And this goes far beyond practical day-to-day privacy against your neighbors: there are significant civil rights concerns here. To effectively seek political power, you must be able to spend money.

Some political forces stop others from spending money. This is not a radical political view; according to multiple Supreme Court justices, money is important to our ability to speak. Why privacy in Bitcoin Core? I personally view Bitcoin Core as a best-practices implementation. We're not necessarily trying to be the most experimental or risky. We are taking a serious approach to building software that is very high quality. Another aspect is that Bitcoin Core is a full node.

It's a system that autonomously validates all information without trusting anyone. One of the things that occurred, and this surprised me, is that I have been asked by companies and researchers not to fix privacy bugs. I think this is short-sighted for all of the reasons I just gave. In Bitcoin Core, we would not intentionally add a privacy-harming misfeature.

But not fixing a bug is equivalent or even worse, because a missing fix is invisible. Bitcoin Core should only serve the interests of its users. There has been recent public drama about surveillance attacks on the bitcoin network. Passive surveillance has existed since near the beginning; this has been going on for a long time.

People run nodes that serve to deanonymize users by trying to trace transactions in the network. Recently there have been farms of "sybil nodes" spun up, where nodes try to gather more traffic so that they can analyze it better. Bitcoin Core has basic protections: it won't connect to the same netgroup multiple times. But those protections are only basic. Any weakness in this domain encourages wasting network resources; when people try to connect to everyone, that wastes resources.
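Bitcoin Core's actual logic lives in its C++ networking code; as a rough Python sketch of the idea (hypothetical helper names, with a /16 prefix standing in for a netgroup), limiting outbound peers to one per netgroup looks something like this:

```python
import ipaddress

def netgroup(ip: str) -> str:
    """Approximate a 'netgroup' as the /16 prefix of an IPv4 address."""
    addr = ipaddress.ip_address(ip)
    return str(ipaddress.ip_network(f"{addr}/16", strict=False))

def pick_outbound_peers(candidates, limit=8):
    """Pick up to `limit` peers, at most one per netgroup, so a sybil
    farm concentrated in one data center can't claim every slot."""
    chosen, used_groups = [], set()
    for ip in candidates:
        group = netgroup(ip)
        if group not in used_groups:
            used_groups.add(group)
            chosen.append(ip)
        if len(chosen) == limit:
            break
    return chosen

peers = ["203.0.113.5", "203.0.113.77", "198.51.100.3", "192.0.2.9"]
print(pick_outbound_peers(peers, limit=3))
# ['203.0.113.5', '198.51.100.3', '192.0.2.9']
```

The second 203.0.113.x address is skipped because its netgroup is already taken, which is exactly the "don't connect to the same netgroup multiple times" protection described above.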

If we make the system more private, there's no incentive to connect to everyone and waste capacity. So one long-standing piece of advice is to run Bitcoin Core over Tor. Tor is great; it's not perfect, because it's weak against state actors and it's vulnerable to denial-of-service.

It's usually strictly better than not using Tor. There has been some discussion that some of the sybil attacks are unlawful, because they violate the Computer Fraud and Abuse Act or various e-privacy directives. And that may be true; I have no opinion on that. But in the space of bitcoin, we have to defend against people who don't care about their public image and don't care about what the law says.

And so since we have to do that, our most powerful tool to defend against that is technology. So if people want to pursue other avenues to make people behave, that's fine, but my approach is to ask how can we improve the technology.

So one of the areas we have been looking to improve is with respect to "eclipse attacks". This is where people try to get your node to connect only, or mostly, to them by sybil attacking you. Bitcoin Core keeps a table of peers and nodes you can connect to, and it is hard to gain extra space in that table. There have been some recent simulation results revealing that it wasn't quite as good as we expected.

There were some places where randomization in the algorithm actually allowed an attacker to gain a greater concentration. There is a paper recently published on this, and prior to its publication the authors worked with us and we implemented a half-dozen fixes: we expanded the tables, stopped promiscuously taking up extra advertisements, and so on. These fixes will all be available in an upcoming release. In general, people trying to connect to a lot of the network has been a long-term problem.

There are some things discussed and in the works to dissuade this activity. We might have a proof-of-work system where you can get prioritized connections to the network by doing proof-of-work. So it's more expensive to use up capacity on the network.

I proposed a proof-of-work scheme like this. Sergio has another similar scheme. They are all kind of complicated and don't completely solve the problem; if one clearly solved the problem, we would deploy it right away, but the combination is messy and it's still in the research phase. I have been working on a tool that you can run alongside Bitcoin Core where you have peers that you trust (your friends, whatever), and they can compare notes without telling each other who they are connected to, but you would be able to determine who is connecting too much to the network.
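Neither my scheme nor Sergio's is specified here, but the hashcash-style core that any such proposal shares (spend computation to earn a prioritized connection slot) can be sketched as follows; the function names and challenge format are made up for illustration:

```python
import hashlib
import itertools

def pow_solve(challenge: bytes, difficulty_bits: int) -> int:
    """Find a nonce whose hash with the challenge has `difficulty_bits`
    leading zero bits. A node could hand out challenges and prioritize
    inbound peers that present a valid solution, making it expensive to
    hold connections to everyone."""
    target = 1 << (256 - difficulty_bits)
    for nonce in itertools.count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def pow_verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

nonce = pow_solve(b"peer-challenge", 16)          # cheap demo difficulty
print(pow_verify(b"peer-challenge", nonce, 16))   # True
```

Verification is one hash while solving takes thousands of attempts on average, which is the asymmetry that makes capacity-wasting connections expensive for the attacker but cheap for honest nodes to check.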

At the moment people are doing this manually to determine who to ban. There are a number of other network-level improvements that impact privacy. There was an issue in Bitcoin Core where your connections out through Tor would go through the same circuits as your browser going through Tor, which is not good.

There is an information leak involving the addr messages sent to you and their timestamps, and this allows an observer to learn more about the network topology and run partition attacks. They can also see the graph path of how a transaction propagated through the network.

We have also been working on batched transaction relay as one of the ways to increase privacy. Right now with Bitcoin Core, receiving transactions is completely private; you don't send anything to identify that. When you send, someone could use that to identify what you're doing or what your transactions are. If your transaction doesn't immediately get into the blockchain, you re-broadcast it periodically.

I think that some people have used this to trace users in the network. There is a new flag in an upcoming release that turns off this automatic broadcasting. Manually sending transactions is not very useful by itself, but it means that someone can create a program beside Bitcoin Core that uses another method to relay the transaction. So it could use a high-latency network like Mixmaster or BitMessage. And the cool thing is that it can be separate: you don't have to learn about developing Bitcoin Core, you can just write it yourself and run it alongside Bitcoin Core.

We might pull useful contributions there into Bitcoin Core. Another area we are working on has privacy implications: we have been making it easier for users to run full nodes. Full nodes have fundamental privacy advantages. SPV clients like Electrum are fundamentally weak from a privacy perspective.

The bloom filters uniquely identify the wallet. It is fundamentally weak: you might think you're private, but you're not. Electrum sends the server a list of addresses, and anyone can run an Electrum server, so you can easily spy on the Electrum network. It's necessary that we have light nodes; it's the only way you're going to run bitcoin on a cell phone right now.

If we make it easier for more people to run full nodes, then the world will be better for it. There are also some behaviors in Bitcoin Core around bandwidth bursting that might not play well with consumer routers.

Sometimes there's bufferbloat and then the router will have lots of latency. Another improvement here is pruning, which allows you to run a full node that is fully private but does not store the full blockchain; you can get by with just a gigabyte of space. We also have some plans for automatic bandwidth ratelimiting. The main question I have here is whether there's a way to make this self-tuning.
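As a sketch of what plain, non-self-tuning ratelimiting looks like (this is a generic token bucket, not Bitcoin Core's actual code), consider:

```python
class TokenBucket:
    """Classic token bucket: allows short bursts up to `burst` bytes
    while holding long-run usage to `rate` bytes per second."""
    def __init__(self, rate: float, burst: float):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = burst, 0.0

    def allow(self, nbytes: float, now: float) -> bool:
        # Refill tokens for the time elapsed since the last call.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

bucket = TokenBucket(rate=1000, burst=5000)   # 1 KB/s, 5 KB bursts
print(bucket.allow(4000, now=0.0))  # True: within the burst allowance
print(bucket.allow(4000, now=0.1))  # False: bucket nearly drained
print(bucket.allow(4000, now=5.0))  # True: refilled over ~5 seconds
```

The open question mentioned above is exactly how to pick `rate` and `burst` automatically for the user's connection rather than asking them to configure it.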

I'm not sure if I am going to be able to do that. There are still more things that we would like to see on the privacy front. The wallet and its coin selection algorithm are fundamentally bad for privacy.

Privacy is incompatible with address reuse: if you reuse addresses at all, you blow up your privacy. The wallet just needs to try to avoid linking addresses. We have code in the codebase that traces the links, but we don't have enough testing for the wallet infrastructure right now, and it's hard for developers to be confident that they aren't breaking everything.

There has been a ton of development around coinjoin, a practical privacy improvement for transactions that I described a year and a half ago. There's no implementation of coinjoin in Bitcoin Core yet, but there are many out on the network. There are research results showing that a significant number of coinjoin transactions are happening, which I am happy to see.
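The core mechanic of coinjoin is simply that one bitcoin transaction can merge inputs and outputs from unrelated parties; here is a toy sketch (invented data shapes, with no signing or amount-negotiation protocol):

```python
import random

def coinjoin(party_txs, seed=42):
    """Merge each party's inputs and outputs into one transaction and
    shuffle them. With equal-valued outputs, an outside observer cannot
    tell which input paid which output."""
    rng = random.Random(seed)
    inputs = [i for tx in party_txs for i in tx["inputs"]]
    outputs = [o for tx in party_txs for o in tx["outputs"]]
    rng.shuffle(inputs)
    rng.shuffle(outputs)
    return {"inputs": inputs, "outputs": outputs}

alice = {"inputs": [("alice_utxo", 1.0)], "outputs": [("addr_a", 1.0)]}
bob = {"inputs": [("bob_utxo", 1.0)], "outputs": [("addr_b", 1.0)]}
joined = coinjoin([alice, bob])
print(len(joined["inputs"]), len(joined["outputs"]))  # 2 2
```

Because both outputs carry the same value, the 1:1 mapping between inputs and outputs is hidden; in a real coinjoin each participant signs only their own inputs, so no one can steal anyone else's coins.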

The design for coinjoin systems is still being fleshed out by some people, and it's not yet as mature as I would like to see before including an implementation in Bitcoin Core. There's also this thing called "stealth addresses". I really hate the name; it's intentionally "edgy" and it's really doing something quite pedestrian. It has been promoted by the Dark Wallet folks. I like to call it "reusable addresses". It's a thing we have been talking about for years; we were previously calling it "ECDH addresses".

The notion of these stealth addresses is that you give someone an address, and every time they pay that address, a different, effectively randomly generated address appears in the blockchain, so those transactions are not linked together. There's an existing proposal for this, but it's basically unmaintained and gets a bunch of things wrong, and the basic design right now makes the SPV privacy problems even worse.
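The underlying ECDH trick can be demonstrated in a toy discrete-log group (a tiny multiplicative group standing in for an elliptic curve; wildly insecure parameters, illustration only, and not the layout of any specific proposal):

```python
import hashlib

# Toy group: subgroup of prime order Q inside Z_P^* (insecure demo sizes).
P, Q, G = 2039, 1019, 4

def h(x: int) -> int:
    return int.from_bytes(hashlib.sha256(str(x).encode()).digest(), "big") % Q

# Receiver publishes one reusable address: a scan key and a spend key.
d, b = 123, 456                      # secret scan / spend keys
D, B = pow(G, d, P), pow(G, b, P)    # the published "reusable address"

# Payer: fresh ephemeral key per payment. Only R and the one-time key
# appear in the chain, so repeat payments are unlinkable.
e = 789
R = pow(G, e, P)
t_payer = h(pow(D, e, P))            # ECDH-style shared secret
one_time_pub = (B * pow(G, t_payer, P)) % P

# Receiver: scans the chain with d, recomputing the secret from R.
t_recv = h(pow(R, d, P))
assert t_recv == t_payer
assert (B * pow(G, t_recv, P)) % P == one_time_pub
one_time_priv = (b + t_recv) % Q     # only the receiver can compute this
assert pow(G, one_time_priv, P) == one_time_pub
print("payment detected and spendable")
```

The scanning step is why the SPV story gets worse: detecting payments requires trying the ECDH computation against candidate transactions, which a light client can't do without handing its scan key (and hence its privacy) to a server.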

So it's very difficult to deploy a new bitcoin address style on the bitcoin network. We created P2SH early on, and it took years before wallets started to support it. So we want to act very intentionally here, to make sure that we implement the right spec and that we don't have people gyrating between different approaches.

Alright, so, that's some of the things going on with privacy in Bitcoin Core. Now I am going to switch gears and talk a little more about forward-facing technologies.

So when we deploy new technology for the bitcoin network, we have to think far into the future for a number of reasons. For one, it takes a lot of work to deploy a new system or new tool to a big distributed network. Also, we need to make sure that our changes do not interfere with other future changes. So I have been giving a lot of thought, with a number of other people, to what kinds of cool things we could do to make multisignature more powerful in the future.

We have come up with some criteria for what is good to have in multisignature schemes. I think everyone here is already familiar with multisignature. It's really solving a fundamental problem: in bitcoin there is no recourse other than the network. If someone steals your bitcoins, your bitcoins are stolen. You can't go get a clawback.

And also, computer security is a joke. There are no trustworthy devices. Everything is remotely controllable by someone. The idea with multisignature is: hey, maybe if we use multiple devices they won't all be compromised at once, and we could get some actual security.

You can define an access policy: these coins can be spent only if these two parties agree, or if any two out of three designated parties agree, and so on. Multisig has been in bitcoin since the very first day. We added some features to make it more accessible with P2SH.

That took years to get deployed; it's important to think about this and get the pipeline going. One of the problems with multisig today is that it's costly. If you use 2-of-3 today, it increases your transaction sizes by a factor of two, and that means roughly twice the fees. It also has a direct impact on the decentralization of the network: the more expensive it is to run nodes on the system, the fewer people will run them, and the more centralized the system becomes.

So we want to have a good handle on this. The bigger your multisig, the higher your cost. And so there's a tension where your security says to use a bigger multisig policy, but practicality says no, you're not going to do that. It would be nice to improve that. And we can improve it.

So I want to talk about some cryptosystem background to help you understand how we can improve this. There's a signature scheme called Schnorr signatures. Schnorr is older than ECDSA; it's simpler and more straightforward, it has robust security proofs, and it's a little bit faster as well.

But Schnorr was patented, and as we have seen in the history of cryptography, patenting is poison. Any patented cryptosystem sees virtually no deployment at all. People would rather be insecure than deploy a patented cryptosystem. And of course, patenting is actively incompatible with decentralization because the patent holder owns the technology.

So in any case, the NSA came up with a nice workaround to the Schnorr patent: ECDSA is very similar to it, but not algebraically equivalent. One of the cool things about Schnorr is that it can do multisignature in a very straightforward and scalable way. Schnorr multisignature works the way that an idiot would tell you it works, even if they knew nothing about cryptography. If you want a 2-of-2 signature between two parties, you add together the two pubkeys to get the 2-of-2 pubkey, and you add together their signatures to get the 2-of-2 signature. That's a 2-of-2 signature in Schnorr.
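A toy demonstration, using a small multiplicative group in place of an elliptic curve (so "adding" keys and nonces becomes multiplying group elements; insecure parameters, and without the rogue-key protections a real protocol needs):

```python
import hashlib
import random

# Toy Schnorr over a subgroup of Z_P^* (insecure demo parameters).
P, Q, G = 2039, 1019, 4

def h(*vals) -> int:
    data = "|".join(str(v) for v in vals).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def verify(pub, msg, R, s):
    # Schnorr check: g^s == R * pub^e, with challenge e = H(R, pub, msg).
    e = h(R, pub, msg)
    return pow(G, s, P) == (R * pow(pub, e, P)) % P

rng = random.Random(1)
x1, x2 = rng.randrange(1, Q), rng.randrange(1, Q)   # two secret keys
X1, X2 = pow(G, x1, P), pow(G, x2, P)
X = (X1 * X2) % P                  # 2-of-2 pubkey: just combine the pubkeys

k1, k2 = rng.randrange(1, Q), rng.randrange(1, Q)   # per-signer nonces
R = (pow(G, k1, P) * pow(G, k2, P)) % P             # combined nonce
e = h(R, X, "pay Bob 1 BTC")
s1 = (k1 + e * x1) % Q             # each party signs independently...
s2 = (k2 + e * x2) % Q
s = (s1 + s2) % Q                  # ...and the signature shares just add

print(verify(X, "pay Bob 1 BTC", R, s))  # True: looks like an ordinary 1-of-1
```

The verifier runs the exact same check it would for a single signer; nothing about `X` or `(R, s)` reveals that two parties were involved.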

And there are some details for actually implementing it, but that's the basic idea. And it just works: it gives you a 2-of-2 signature. Not only that, but this Schnorr scheme can be extended to give you an arbitrary threshold, actually an arbitrary cascade of thresholds, an arbitrary monotone linear function. You can get any policy you want. You can't distinguish it from 1-of-1, it's the size of 1-of-1, and it scales like 1-of-1. Pieter and Andrew Poelstra started to implement this.

Pieter started with a Schnorr verifier, and then Andrew went to make a key-sharing tool to do thresholds. We realized that in order to make a threshold Schnorr key, the signers have to collaborate to generate the pubkey. You can't just derive a threshold key; the signers have to interact, and if the threshold is big, they have to interact a whole lot. That's a little problematic. We have seen in the bitcoin ecosystem that there are a lot of cool things you can do with Script, but if using something requires a complex state machine, people don't use it. So that's sort of fundamentally worse than what we have today, even though it's more efficient.

What other criteria should we be thinking about when selecting or creating a signature system? One of them is accountability, or so I thought. This is where in the bitcoin multisig system today, you can see who signed a transaction.

When there's a 2-of-3 signature in bitcoin today, using P2SH, you can see who signed. This is actually kind of important: what if one of those 2-of-3 signatures is applied to a transaction you did not authorize? You want to know. And not only do you want to know, you want to be able to prove to the world that they did it; you might want to sue them, you might want to discredit them. You want to be able to communicate about this.

This is a useful property for a multisignature scheme to have. The Schnorr signature scheme does not have it: you can't tell which of the signers signed, because it looks exactly like a 1-of-1 signature. So this is a criterion that would be useful to have. Another useful property of a signature scheme is usability.

Many of the multisignature schemes require round-trips between the participants. In the bitcoin multisignature scheme today, you can send a transaction to the first signer, they sign it and then send it to the second person to sign it, and they can do this all the way to the end and then put it in the blockchain and you're done.

With n-of-m Schnorr you can basically do that for signing: you need one round-trip to establish the nonces, which you can do in advance. But you have to have lots and lots of round-trips to establish the threshold key. There are some other schemes that require many rounds during signing; you would basically have to go back and forth to the safe during signing to complete your transaction.

Nobody is going to use that. And building the software to support that and teaching people how to use that would be a real barrier. So usability is one of the other constraints we have to worry about. Another one is privacy. I talked before about why privacy is important. In the context of multisignature, privacy is useful because if an attacker knows your policy, he knows what to target.

If he sees that you are 2-of-3, then he has more information about what to look for which may lead to him kidnapping specific necessary people to coerce them into signing or stealing their private keys. Maybe it's 8 people he has to go and kidnap, and that's a different tradeoff. If you are using an odd policy, then people can trace your transactions which has commercial implications as I mentioned earlier.

It seems like privacy may be incompatible with accountability, but that's not true. Accountability means that the participants, and the people they designate, need to know what's going on in the transaction; privacy, on the other hand, is a statement about third parties.

So this Schnorr signature stuff has great privacy: nobody can tell what the policy is, except for the participants. But it has worse accountability. And bitcoin today has great accountability but very poor privacy. There is also work on threshold ECDSA: fancy cryptographic techniques to do the same stuff as Schnorr multisignature, but using the existing bitcoin infrastructure that has already been deployed today.

This scheme has a limitation: it fails on usability. Also, like Schnorr multisignatures, it has no accountability. But it works in theory today, already.

There are no implementations right now that don't require a trusted dealer, but that may be okay for situations where you don't care about those implications. Now, I have to say that the first version of the paper on this said that you could do it without a trusted dealer. I argued with the authors, and they eventually convinced me that yes, we could really do that. And then they retracted their paper and said no, you really need a trusted dealer to generate the keys.

They have since gone back and come up with a scheme that I believe (without them having to convince me this time) will work without a trusted dealer, but no one has implemented it yet. I am not going to talk further about this; it may be interesting, but it's not the long-term interesting stuff. So I want to talk about a couple of schemes that I have been working on that give different mixes of these usage criteria.

And we start with the observation that n-of-n all-sign Schnorr multisignature meets several of the criteria: it's efficient, it doesn't require a bunch of round-trips, and it's completely accountable, because if they all signed, then they all signed.

A larger threshold, like a 2-of-3, can be built on top of this: you can enumerate all of the possible satisfactions of the threshold, and there are M-choose-N of them; you can build a hash tree over them, like we use for SPV proofs in bitcoin, and then in your signature you prove to the network that this pubkey is from this set of pubkeys and provide just the N-of-N signature. Now this is interesting because it scales fairly well, it has improved efficiency (although not the same efficiency as 1-of-1), it's completely accountable, and the parties know who signed.

If you randomize the keys in it, the only thing someone can learn is an upper bound on the size of the threshold, and it's relatively cheap to add one more hash and double the apparent size. So privacy is pretty good too.

The verification efficiency is great; it's basically constant time to verify, just checking a signature and then some hashes. The real problem with this scheme is that if you want a big threshold, like more than 20 participants, the tree becomes so large that you can't compute the pubkey. The network doesn't have to do this, but the participants do, and it becomes impractical quite quickly because of this binomial blowup.
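For a small threshold the construction is easy to sketch. In this toy version a hash stands in for each subset's N-of-N aggregate key, and the Merkle root acts as the public key the network sees (the encoding is invented for illustration):

```python
import hashlib
from itertools import combinations

def H(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def aggregate(subset) -> bytes:
    # Stand-in for the N-of-N aggregate pubkey of this subset of signers.
    return H(b"agg:" + b",".join(sorted(subset)))

def merkle(leaves):
    """Build a hash tree; return the root plus a sibling path per leaf."""
    proofs = [[] for _ in leaves]
    pos, layer = list(range(len(leaves))), list(leaves)
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(layer[-1])          # pad odd layers
        for i, p in enumerate(pos):
            sib = p ^ 1
            proofs[i].append((sib < p, layer[sib]))
        layer = [H(layer[j] + layer[j + 1]) for j in range(0, len(layer), 2)]
        pos = [p // 2 for p in pos]
    return layer[0], proofs

def verify_leaf(root, leaf, proof) -> bool:
    node = leaf
    for sib_is_left, sib in proof:
        node = H(sib + node) if sib_is_left else H(node + sib)
    return node == root

# 2-of-3: enumerate all 3-choose-2 signer subsets and commit to them.
keys = [b"keyA", b"keyB", b"keyC"]
subsets = list(combinations(keys, 2))
leaves = [aggregate(s) for s in subsets]
root, proofs = merkle(leaves)    # the root plays the role of the address

# To spend, A and B reveal only their leaf and its path, plus one signature.
print(verify_leaf(root, aggregate(subsets[0]), proofs[0]))  # True
```

The binomial blowup is visible here: a 2-of-3 has 3 leaves, but a 10-of-20 has 184,756, and larger thresholds quickly become impossible for the participants to enumerate.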

Instead of building this big hash tree where you precompute all of the satisfying combinations, why not have the signer show the network all M pubkeys that are participating, and then have the verifier compute the N-of-N?

So I give the network the pubkeys and say, okay, this subset is signing; the verifier can add up those pubkeys, and then I provide a signature for the sum. This gives good accountability, but it's not private, because the network can see who is signing.

And it works in one pass, so it's usable. The size isn't great: it's always larger than the tree version, even though the tree version has that binomial blowup in it. Verification is fast, and it has an okay set of tradeoffs. And then, taking off from this idea, I thought, well, could we do better?

So I can show how this works concretely. Say we want to do a 3-of-4: three people signing, one not signing. We need to compute a pubkey that leaves out one participant. So we publish our first value: the sum of all the participants' keys. And then we tell the network another public value: participant A's key plus two times participant B's key plus three times participant C's key plus four times participant D's key.

If you know how to sign with participant A's key, then you can sign with 8 times A or any other constant multiple; it's just multiplying by a constant. Then you can say to the network, hey, we want to do this signature and we don't want C to participate. The network would compute a new pubkey (on the slide denoted P sub V) as a combination of the two published values in which C's key cancels out. This can be extended by adding quadratic and cubic terms.

You can cancel out an arbitrary number of values, so you can have a threshold; you need M minus N plus 1 of these values. You can send these values to the network encoded in an unbalanced hash tree, so you only have to reveal as many of them as you're going to cancel. And what that means is that you might have a big threshold, but if all participants are available, you compute it as an N-of-N, reveal only the first term of the polynomial, and provide that N-of-N signature, so your transaction looks like a 1-of-1. And so you get perfect efficiency in the case where all of your cosigners were available.

And perfect privacy in that case because you revealed nothing about your actual policy. If you need to reveal more people because some signers were offline, you can do that and you leak a little bit about your policy.

I mentioned composability in the list of features. Composability is the notion that you should be able to have your own policy and I should be able to have mine, and neither of us should have to care about the details of the other's. I should be able to create a policy out of your policy without knowing its details. You should be able to use 2-of-3, another member should be able to use 3-of-5 or whatever, and I should be able to make a 2-of-2 of the two of you.

And I shouldn't have to care. These schemes themselves do not do anything for composability directly, but we can overlay a higher-level scheme that achieves composability. What we found when exploring this is that if the higher-level scheme is only able to express a monotone boolean function (that is, policies where someone extra signing will never make your signature untrue), then it is quite easy to write software to handle unknown parts.

You can write software that will sign the parts it knows about and not worry about the rest. So we think that if we overlay a scheme like this on top, we should be able to get something more composable, but we haven't really explored this whole space yet.
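A monotone policy tree of the kind described can be evaluated in a few lines; the tuple encoding here is made up for illustration, but it shows why composition is easy: each node only needs to know whether its children are satisfied, not what is inside them:

```python
def satisfied(policy, signers) -> bool:
    """Evaluate a monotone threshold policy tree against a set of signers.
    A leaf is a key name; an inner node is ("threshold", k, [subpolicies]).
    Monotone: adding signers can never turn a satisfied policy unsatisfied."""
    if isinstance(policy, str):
        return policy in signers
    _, k, subs = policy
    return sum(satisfied(s, signers) for s in subs) >= k

# My 2-of-2 over: your 2-of-3, and their 3-of-5, composed without my
# needing to know anything about the inside of either sub-policy.
yours = ("threshold", 2, ["y1", "y2", "y3"])
theirs = ("threshold", 3, ["t1", "t2", "t3", "t4", "t5"])
combined = ("threshold", 2, [yours, theirs])

print(satisfied(combined, {"y1", "y3", "t1", "t2", "t5"}))  # True
print(satisfied(combined, {"y1", "t1", "t2", "t5"}))        # False: your side unmet
```

Because the function is monotone, signing software that recognizes only some branches can safely sign what it knows and ignore the rest, which is exactly the "handle unknown parts" property described above.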

I have a comparison chart here, and you'll notice all of my slides are wordy. To give you an idea of how the schemes scale, just look at this chart. It's not clear how this will develop. Some of these ideas are very complementary and can be merged.

Expect to see some more development on this in the future. The art of selection cryptography: So now I want to talk about a thing that I am calling the art of selection cryptography. And I'm not misusing the word cryptography here; this is the more philosophical section of my talk. Before I can tell you what selection cryptography is, I think I need to redefine cryptography.

The definition that people use today is broken. You go to Wikipedia or any dictionary and they will say that cryptography is secret writing or deciphering messages.

That definition has nothing to do with many of the things we do today, like digital signatures, zero-knowledge proofs, private information retrieval, or hash functions, and it doesn't speak to things like cipher suite negotiation in TLS, which has been a constant source of vulnerabilities.

And bitcoin itself, too. You can build a bitcoin node today with absolutely no cryptography in it, if you go by the dictionary definition; the only such cryptography we use is wallet encryption, and you never send those encrypted messages to anyone else. So to explain my definition, I want to take a step back and give my view on the world. Back in the early 90s, when I was politically coming of age on the internet, I was very excited and involved in the cypherpunks group and the activity around the prosecution of hackers and the export of encryption software.

And there was this politics, or religion, that the internet would change everything. There was this rallying call: "information wants to be free".

And I knew in my bones that this was true. We were going to use computers, which turn everything into information, and we would use networks to hook all of the computers together and we would change the world. We were going to change the power balances, make more people more empowered and everyone would have access to the world's knowledge and they would all fulfill their potential. That's a very political take on something that I now think in fact is better described as a law of nature.

This is not just that I want information to be free; no, information really does want to be free. It is fundamental that information will percolate out into every little nook and cranny, and you can't control it. The result is that often bad things happen because information wants to be free. Sunlight is the ultimate solvent, but solvents corrode.

So my email wants to be read by the NSA. When I try to log in to my server, it can't tell me from you, because you can just replay my login, and now you're logged in as me. When you go and browse the internet, other people learn how your mind works.

They see inside your mind what used to be completely private. When I go to research something, marketers can send out cheap spam, and that spam is just as visible as the information I seek. If I want to build a digital cash system, I can't, because information is perfectly copyable.

And all copies are just as good. Money that you can just copy is not much of a money at all. So you have an environment where there are powerful parties that have more ability to use this fundamental nature of information, and this goal of everyone being more empowered may not come true.

And so, I would like to propose this definition of cryptography that says that cryptography is that art and science with which we try to fight this fundamental nature of information.

We try to bend it to our political will and to serve our moral purposes. And to direct it to human ends against all outcomes and all eventualities that may try to oppose it. This is a sort of broad definition. It encompasses everything that we should properly call cryptography, and a number of things that we haven't yet traditionally called cryptography, such as computer security, or even sometimes the drafting of legislation. I don't really offer this lightly.

I have thought about this for a long time and I think this definition leads to pretty good intuitions about the kinds of things that have cryptographic considerations.

So often, as technologists, we get excited when we have a cryptographic tool to solve a problem. You want to read my email? Bam, encryption. You want to track my stuff? Bam, private information retrieval. Bam, digital signatures. I'm going to solve all problems with some cryptographic tool.

You can fight back against things you don't like in the world with a bunch of math. But sometimes we get caught up in the coolness of that and forget that we are really fighting the fundamental nature of information. It's so hard that it may not be possible to make a secure cryptosystem. They are all predicated on sets of strong assumptions.

We assume that some mathematical problem is intractable, and over time we have seen many cryptosystems broken. Few people believe it to be the case, but it may be fundamentally impossible to build strong cryptography. If you were able to build a provably secure asymmetric cryptosystem with no strong assumptions in it, that would be a proof that P != NP. And even then you could still build insecure cryptosystems: building secure cryptosystems is actually harder than just figuring out whether P is equal to NP.

So don't expect anyone to solve this soon. A really important point here is that attacks on cryptosystems are themselves information that want to be free.

We often underestimate how powerful computers have become, because our software is so bloated and slow and has many layers. You can imagine that a computer is sort of the intellectual equivalent of someone who does arithmetic for you, but a billion times faster. So if you can attack a cryptosystem by applying a lot of brute force, computers are a force multiplier. Everyone who is attacking your cryptosystem, if they have a desktop computer, it's like them having an army of a billion imbeciles.

They might be imbeciles, but there's a billion of them. And that's before they get a botnet. And then they have a hundred thousand times that much computing power. Or the NSA data center. So if someone is able to attack your cryptosystem and reduce it to a state where it is still a huge haystack that they are searching for a needle in, they can then apply a lot of computing power to go further. We can even use the computing power to search for complicated algebraic solutions for the systems as well.

It's not just the number crunching. It actually expands our intellectual capacity to attack the systems. This more strongly favors attackers than it does defenders in general.

In order to build a secure cryptographic system, we have to secure it against any eventuality. And so as a result, virtually everything people propose ends up being broken. This is certainly true for everything I've touched. There's a whole subfield in academia about provable cryptography. People get confused about what provable cryptography means. Provable cryptography is about cryptography that is secure as long as the proof is right and the assumptions hold.

Well why wouldn't it be secure if the assumptions hold? Well it turns out that's actually hard to achieve too. And many things that occur in provable cryptography, there's a pressure to publish a proof. So the easiest way to get a proof is to adopt a stronger assumption. There is a lot of provable cryptography that is broken because they adopted assumptions that were wrong, they sounded plausible but they turned out to not be true, or they proved some vacuous property that did not map to security in a practical sense.

I don't mean to sort of say that cryptography is the only thing that people do that is hard. Civil engineering is a tremendously difficult discipline and there are lives on the line if a building doesn't stand up. But usually in civil engineering you are more worried about a limited set of natural causes and you're not generally worried about the billion army of imbeciles and all of the world's efforts to nearly effortlessly attack you.

If you ask someone to build a building that cannot be taken down through all the force in the world, they would tell you that you're nuts. They would probably ask for a trillion dollars first, and then tell you that you're nuts, but still take the trillion dollars.

We are only able to think about cryptography at all because we can use software. Software is a great building block. We have tremendous tools to write software that is more complicated than anything else. A very complicated piece of mechanical engineering, something like the space shuttle, on the fringe of what we can do as a civilization, has on the order of something like , parts.

It is fundamentally weak. You might think you're private, but you're not. Electrum sends the server a list of your addresses, and anyone can run an Electrum server.

You can easily spy on the electrum network. It's necessary that we have light nodes. It's the only way you're going to run bitcoin on a cell phone right now. If we make it easier for more people to run full nodes, then the world will be better for it.

There are also some behaviors in Bitcoin Core around bandwidth bursting that might not play well with consumer routers. Sometimes there's bufferbloat, and then the router ends up with lots of latency.

Blockchain pruning allows you to run a full node that is fully private but does not store the full blockchain. You can get by with just a gigabyte of space. We have some plans for automatic bandwidth ratelimiting. The main question I have here is whether there's a way to make this self-tuning. I'm not sure if I am going to be able to do that. There are still things that we would like to see in privacy. The wallet and coin selection algorithm is fundamentally bad for privacy.

It is incompatible with address reuse. If you reuse addresses at all, you blow up your privacy. The wallet just needs to try to avoid linking addresses. We have code in the code base that traces the links, but we don't have enough testing for the wallet infrastructure right now and it's hard for developers to be confident that they aren't breaking everything.
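To make the "avoid linking addresses" point concrete, here is a simplistic sketch of privacy-aware coin selection. This is an invented illustration, not Bitcoin Core's actual algorithm: the idea is simply to fund a payment from coins already linked to one address before falling back to mixing coins from several addresses.

```python
# A simplistic sketch of privacy-aware coin selection: prefer to fund a
# payment entirely from coins already linked to one address, so the
# transaction does not link previously unlinked addresses together.
# This is an illustration of the idea, not Bitcoin Core's algorithm.
from collections import defaultdict

def select_coins(utxos, amount):
    """utxos: list of (address, value). Returns a list of coins to spend."""
    by_addr = defaultdict(list)
    for addr, value in utxos:
        by_addr[addr].append((addr, value))

    # First try: a single address group that can cover the amount by itself.
    for group in by_addr.values():
        if sum(v for _, v in group) >= amount:
            group.sort(key=lambda c: c[1], reverse=True)
            picked, total = [], 0
            for coin in group:
                picked.append(coin)
                total += coin[1]
                if total >= amount:
                    return picked

    # Fallback: link addresses (and leak information) only when unavoidable.
    picked, total = [], 0
    for addr, value in sorted(utxos, key=lambda c: c[1], reverse=True):
        picked.append((addr, value))
        total += value
        if total >= amount:
            return picked
    raise ValueError("insufficient funds")

coins = [("addr1", 5), ("addr1", 3), ("addr2", 10), ("addr3", 2)]
print(select_coins(coins, 7))   # funded from addr1 alone: no new linkage
```

A real wallet has to weigh this against fees, change handling, and dust, which is part of why the problem is hard.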

There has been a ton of development around coinjoin, and this is a casual privacy improvement in transactions that I described a year and a half ago. There's no implementation of coinjoin in Bitcoin Core yet, but there are many out on the network. There are some research results showing that a significant number of coinjoin transactions are happening, which I am happy to see.

The design for coinjoin systems is still being fleshed out by some people, but it's still not as mature as I would like to see before including an implementation of that in Bitcoin Core. There's also this thing that's called "stealth addresses". And I really hate the name, it's intentionally "edgy" and it's really doing something that's quite pedestrian.

It has been promoted by the Dark Wallet folks. I like to call it "reusable addresses". It's a thing that we have been talking about for years; we were calling it "ECDH addresses" previously. The notion of these stealth addresses is that you give someone an address, and every time they reuse that address, a sort of randomly generated different address appears in the blockchain, so those transactions are not linked together.

There's an existing proposal for this, it's basically unmaintained, it gets a bunch of things wrong. And the basic design right now makes SPV privacy problems even worse.

So it's very difficult to deploy a new bitcoin address style on the bitcoin network. We created P2SH back in the beginning of 2012 and it took years before wallets started to support it. So we want to act very intentionally with this to make sure that we implement the right spec and make sure we do not have people gyrating on different approaches here. Alright, so, that's some of the things going on with privacy in Bitcoin Core. Now I am going to switch gears and talk a little more about forward-facing technologies.

So when we deploy new technology for the bitcoin network, we have to think far into the future for a number of reasons. For one, it takes a lot of work to deploy a new system or new tool to a big distributed network.

Also we need to make sure that our changes do not interfere with other future changes. So I have been giving this a lot of thought, with a number of other people, about what kinds of cool things we could do to make multisignature more powerful in the future. We have come up with some criteria about things that are good to have in multisignature schemes.

I think everyone here is already familiar with multisignature. It's really solving a fundamental problem: that in bitcoin there is no recourse other than the network.

If someone steals your bitcoins, your bitcoins are stolen. You can't go get a clawback. And also, computer security is a joke. There are no trustworthy devices. Everything is remotely controllable by someone. The idea with multisignature is, hey, maybe if we take multiple devices they won't all be compromised at once, and we could get some actual security. You could define an access policy, such as: these coins can be spent only if A and B agree, or only if any two out of three designated parties agree, and so on.

Multisig has been in bitcoin since the very first day. We added some features to make it more accessible with P2SH. That took years to get deployed. It's important to think about this to get the pipeline going. One of the problems with multisig today is that it's costly. If you use 2-of-3 today, it increases your transaction sizes by a factor of two, and that means roughly twice the fees. And that also has a direct impact on the decentralization of the network, because the more expensive it is to run nodes on the system, the fewer people will run them, and the more centralized the system becomes.

So we want to have a good handle on this. The bigger your multisig, the more your cost is. And so there's a contention where your security says use multisig policy, but practicality says no you're not going to do that. It would be nice to improve that. And we can improve it. So I want to talk about some cryptosystem background stuff to let you understand how we can improve this.

First, Schnorr signatures. Schnorr is older than ECDSA; it's simpler, more straightforward, it has robust security proofs, and it's a little bit faster as well. But Schnorr was patented, and as we have seen in the history of cryptography, patenting is poison. Any patented cryptosystem sees virtually no deployment at all.

People would rather be insecure than deploy a patented cryptosystem. And of course, patenting is actively incompatible with decentralization because the patent holder owns the technology. So in any case, the NSA came up with a nice workaround to the Schnorr patent. ECDSA is very similar to it, but not algebraically equivalent. One of the cool things about Schnorr is that it can do multisignature in a very straightforward and scalable way. It's sort of like..

Schnorr multisignature works the way that an idiot would tell you it works even if they knew nothing about cryptography. So the way it works is: if you want to have a 2-of-2 signature with two parties, you add together the two pubkeys to get the 2-of-2 pubkey, and you add together their signatures to get the 2-of-2 signature. That's a 2-of-2 signature in Schnorr. And there are some details for actually implementing it, but that's the basic idea.

And it just works. And it gives you a 2-of-2 signature. Not only does it give you a 2-of-2 signature, but this Schnorr scheme can be extended to give you an arbitrary threshold.

Actually an arbitrary cascade of thresholds, an arbitrary monotone linear function. You can get any policy you want. You can't distinguish it from 1-of-1, it's the size of 1-of-1, and it scales like 1-of-1. Pieter and Andrew Poelstra started to implement this. Pieter started with a Schnorr verifier and then Andrew went to make a key-sharing tool to do thresholds.
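The add-the-keys, add-the-signatures idea can be sketched concretely. This is a toy Schnorr implementation over a small subgroup of the integers mod p, with deliberately tiny, insecure parameters; a real deployment would use a curve like secp256k1 and be much more careful about nonce and key setup.

```python
# A toy Schnorr signature in a prime-order subgroup of Z_p*, just to show
# how 2-of-2 aggregation works. Parameters are tiny and NOT secure.
import hashlib
import secrets

p, q, g = 2039, 1019, 4   # g generates the order-q subgroup of Z_p*

def H(R, m):
    """Challenge hash, reduced into the scalar field mod q."""
    return int.from_bytes(hashlib.sha256(str(R).encode() + m).digest(), "big") % q

def keygen():
    x = secrets.randbelow(q - 1) + 1
    return x, pow(g, x, p)

def verify(y, m, R, s):
    return pow(g, s, p) == (R * pow(y, H(R, m), p)) % p

m = b"pay 1 BTC"
x1, y1 = keygen()
x2, y2 = keygen()
y12 = (y1 * y2) % p                      # "add" the pubkeys (group operation)

# The signers first agree on a combined nonce point R = R1 * R2; this is
# the one round-trip that can be done in advance.
k1, k2 = (secrets.randbelow(q - 1) + 1 for _ in range(2))
R = (pow(g, k1, p) * pow(g, k2, p)) % p
e = H(R, m)
s1 = (k1 + e * x1) % q                   # each party signs independently...
s2 = (k2 + e * x2) % q
s = (s1 + s2) % q                        # ...and the signatures are added

assert verify(y12, m, R, s)              # verifies like an ordinary 1-of-1
```

The verifier only ever sees one key and one signature, which is exactly the privacy and size property described above.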

We realized that in order to make a threshold Schnorr key, the signers have to collaborate to generate the pubkey. You can't just derive a threshold key; the signers have to interact. If the threshold is big, they actually have to interact a whole lot.

That's a little problematic. We have seen in the bitcoin ecosystem that there are a lot of cool things you can do with Script, but if you need a complex state machine to use them, they don't get used. So that's sort of fundamentally worse than what we have today, even though it's more efficient.

What other criteria should we be thinking about when selecting or creating a signature system? One of them is accountability, or so I thought. This is where in the bitcoin multisig system today, you can see who signed a transaction.

When there's a 2-of-3 signature in bitcoin today, using P2SH, you can see who signed a transaction. This is actually kind of important because what if one of these 2-of-3 signatures is applied to a transaction you did not authorize? You want to know. And not only do you want to know, you want to be able to prove to the world that they did it, you might want to sue them, you might want to discredit them.

You want to communicate about this. This is a useful property for a multisignature scheme to have. The Schnorr signature scheme does not have this: you can't tell which of the signers signed, because it looks exactly like a 1-of-1 signature.

So this is a criterion that would be useful to have. Another useful property of a signature scheme is usability. Many of the multisignature schemes require round-trips between the participants.

In the bitcoin multisignature scheme today, you can send a transaction to the first signer, they sign it and then send it to the second person to sign it, and they can do this all the way to the end and then put it in the blockchain and you're done. With n-of-m Schnorr you can basically do that. You need one round-trip to establish the nonces, which you can do in advance.

But you have to have lots and lots of round-trips to establish the threshold. There are some other schemes that require many rounds during signing. You would have to go to and from the safe during signing, basically, to complete your transaction.

Nobody is going to use that. And building the software to support that and teaching people how to use that would be a real barrier. So usability is one of the other constraints we have to worry about. Another one is privacy. I talked before about why privacy is important. In the context of multisignature, privacy is useful because if an attacker knows your policy, he knows what to target.

If he sees that you are 2-of-3, then he has more information about what to look for which may lead to him kidnapping specific necessary people to coerce them into signing or stealing their private keys.

Maybe it's 8 people he has to go and kidnap, and that's a different tradeoff. If you are using an odd policy, then people can trace your transactions, which has commercial implications as I mentioned earlier. It seems like privacy may be incompatible with accountability, but that's not true. Accountability means that the participants, and the people they designate, need to know what's going on in the transaction; privacy, on the other hand, is a statement about third parties.

So this Schnorr signature stuff has great privacy. Nobody can tell what the policy is, except for the participants. But it has worse accountability. And bitcoin today has great accountability but very poor privacy. There is also threshold ECDSA: fancy cryptographic techniques to do the same stuff as the Schnorr multisignature, but using the existing bitcoin infrastructure that has already been deployed today. This scheme has a limitation: it fails on usability. Also, like Schnorr multisignatures, it has no accountability.

But it works in theory today, already. There are no implementations right now that don't require a trusted dealer, but this may be okay for situations where you don't care about those implications. Now I have to say that the first version of this paper said that you could do this without a trusted dealer, and I argued with the authors and they eventually convinced me that yes, we could really do that.

And then they retracted their paper and said no, you really need a trusted dealer to generate the keys. They have since gone back and come up with a scheme that I believe, without their convincing, will work without a trusted dealer but no one has implemented it yet. I am not going to talk further about this.

It may be interesting, but it's not the long-term interesting stuff. So I want to talk about a couple of schemes that I have been working on that give different mixes of these usage criteria. And we start with the observation that n-of-n all-sign Schnorr multisignature meets the criteria: it's efficient, it doesn't require a bunch of roundtrips, and it's actually completely accountable, because if they all signed, then they all signed.

A larger threshold, like a 2-of-3 signature, can be expressed as a set of smaller n-of-n signatures. So you can enumerate all of the possible satisfactions of the threshold, and there are M-choose-N of them, and you can build a hash tree over them, like we use for SPV proofs in bitcoin, and then in your signature you can prove to the network that this pubkey is from this set of pubkeys, and you provide just the N-of-N signature.

Now this is interesting because it scales fairly well, it has improved efficiency (although not the same efficiency as 1-of-1), it's completely accountable, and the parties know who signed. If you randomize the keys in it, the only thing that someone can learn is an upper bound on the size of the threshold, and it's relatively cheap to double the apparent size by adding one hash.

So privacy is pretty good too. The verification efficiency is great, it's basically constant time to verify, it's basically checking a signature and then some hash verifications. The real problem with this scheme is that if you want to talk about a big threshold, like more than 20 participants, the tree becomes so large that you can't compute the pubkey. Now the network doesn't have to do this, but the participants do.
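The hash-tree-over-satisfactions idea can be sketched like this. Integer stand-ins are used for the aggregate keys and the helper names are invented for illustration; a real system would put actual n-of-n aggregate pubkeys in the leaves.

```python
# Sketch: a hash tree over all the n-of-n key combinations that satisfy an
# m-of-n threshold. The chain stores one root; a spend reveals one leaf
# (an n-of-n aggregate key) plus a log-sized membership path.
import hashlib
from itertools import combinations

def h(b):
    return hashlib.sha256(b).digest()

def leaf(keys):
    # Aggregate key for one satisfying subset (integer sum as a stand-in).
    return h(str(sum(keys)).encode())

def merkle_root(leaves):
    layer = list(leaves)
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(layer[-1])          # duplicate odd leftover
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

def merkle_path(leaves, index):
    path, layer = [], list(leaves)
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(layer[-1])
        path.append(layer[index ^ 1])        # sibling at this level
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return path

def verify_path(leaf_hash, path, index, root):
    node = leaf_hash
    for sib in path:
        node = h(node + sib) if index % 2 == 0 else h(sib + node)
        index //= 2
    return node == root

pubkeys = [11, 22, 33, 44]                       # 4 participants
subsets = list(combinations(pubkeys, 2))         # 2-of-4: C(4,2) = 6 leaves
leaves = [leaf(s) for s in subsets]
root = merkle_root(leaves)                       # all the chain has to see

# To spend, the signers for {11, 33} reveal their leaf and its path:
i = subsets.index((11, 33))
proof = merkle_path(leaves, i)
assert verify_path(leaves[i], proof, i, root)
```

The C(M, N) enumeration in `subsets` is exactly the binomial blowup: the network only checks a short path, but the participants must build the whole tree.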

But this becomes impractical quite quickly because of this binomial blowup. Instead of building this big hash tree where you precompute all of the satisfying combinations, why not have the signer show the network all of the M pubkeys that are participating, and then have the verifier compute the N-of-N key?

So I can give the pubkeys to the verifier on the network and say, okay, this set is signing; the verifier can add up the pubkeys, and then I provide a signature for that sum. This has good accountability, but it's not private because the network can see who is signing. And it works with one pass, so it's usable.

The size isn't great; it's always larger than the tree version, even though that tree version has that binomial blowup in it. Verification is fast, and it has an okay set of tradeoffs. And then taking from this idea, I thought, well, could we do better? So I can show how this works concretely. Say we want to do a 3-of-4. So 3 people signing, 1 not signing.

We need to compute an M-of-N pubkey that leaves out one participant. So we publish two values. The first is the sum of the participants' keys. And then we tell the network another public value, which is the sum of participant A's key, plus two times participant B's key, plus three times participant C's key, plus four times participant D's key. And if you know how to sign with participant A's key, then you can sign with 8 times A or any other constant multiple; it's just multiplying by a constant.

Then you can say to the network: hey, we want to do this signature and we don't want C to participate. The network computes a new pubkey, on the slide denoted P sub V. If you write this out, participant C's key has coefficient zero in this value, so it cancels out. This can be extended by adding quadratic and cubic terms. You can cancel out an arbitrary number of values.
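The cancellation trick is easy to check with scalar stand-ins for the keys (a sketch of the algebra only; the key values and toy group order are made up). To drop participant j, everyone computes j times the first published sum minus the second, which gives participant i's key the coefficient (j - i): zero exactly at i = j.

```python
# Sketch of the cancellation trick with scalar stand-ins for keys.
# Publish S1 = A + B + C + D and S2 = 1*A + 2*B + 3*C + 4*D.
# To leave participant j out, compute P_v = j*S1 - S2: participant j's
# key gets coefficient zero, while each remaining signer still knows a
# constant multiple of their own key and so can still sign.
q = 1019                                   # toy group order
keys = {1: 17, 2: 923, 3: 55, 4: 608}      # participants A..D by index

S1 = sum(keys.values()) % q
S2 = sum(i * k for i, k in keys.items()) % q

j = 3                                      # participant C is offline
P_v = (j * S1 - S2) % q

# The same value, computed only from the *other* signers' keys,
# each scaled by (j - i):
from_others = sum((j - i) * k for i, k in keys.items() if i != j) % q
assert P_v == from_others                  # C's key has dropped out entirely
```

Adding a quadratic published value lets you cancel two participants at once via the polynomial (i - j)(i - k), and so on, which is where the M minus N plus 1 term count comes from.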

You can have a threshold: M minus N plus 1 terms is the scaling. You can send these values to the network encoded in an unbalanced hash tree, so you only have to reveal to the network as many terms as signers you're going to cancel. And what that means is that you might have a threshold with 50 signers, but if all participants are available, you compute the signature as a 50-of-50, you reveal only the first term of the polynomial, you provide that 50-of-50 signature, and your transaction looks like a 1-of-1. And so you get perfect efficiency in the case where all of your cosigners were available.

And perfect privacy in that case because you revealed nothing about your actual policy. If you need to reveal more people because some signers were offline, you can do that and you leak a little bit about your policy.

I mentioned in the list of features: composability. Composability is the notion that it should be possible for me to have my own policy and for you to have your own policy, and neither of us should have to care what the other's policy is. I should be able to create a policy on top of your policy without knowing the details. You should be able to use 2-of-3, the other member should be able to use 3-of-5 or whatever, and I should be able to make a 2-of-2 of the two of you. And I shouldn't have to care.

These schemes themselves do not do anything for composability directly. But we can overlay a higher-level scheme that achieves composability, and what we found when exploring this is that if the higher-level scheme is only able to express a monotone boolean function, that is to say policies where someone extra signing will never make your signature untrue, then it is quite easy to write software to handle unknown parts. You can write software that will sign the parts that it knows, and not worry about the rest.
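The monotone-policy point can be sketched with a hypothetical policy encoding, invented here purely for illustration: nested thresholds are a monotone boolean function, so each branch can be evaluated without knowing the internals of the others, and adding a signer can never un-satisfy a policy.

```python
# Sketch: a policy tree of nested thresholds is a monotone boolean
# function. Each branch evaluates independently, which is what makes
# blind composition of policies workable.

def satisfied(policy, signers):
    """policy is a key name (string) or ("threshold", n, [subpolicies])."""
    if isinstance(policy, str):
        return policy in signers
    _, n, subs = policy
    return sum(satisfied(s, signers) for s in subs) >= n

# A 2-of-2 of (your 2-of-3) and (their 3-of-5), composed without either
# side knowing the other's internals:
yours  = ("threshold", 2, ["A", "B", "C"])
theirs = ("threshold", 3, ["V", "W", "X", "Y", "Z"])
policy = ("threshold", 2, [yours, theirs])

assert satisfied(policy, {"A", "C", "V", "W", "X"})
assert not satisfied(policy, {"A", "C", "V", "W"})    # their side is only 2-of-5
# Monotonicity: an extra signer never makes a satisfied policy unsatisfied.
assert satisfied(policy, {"A", "C", "V", "W", "X", "Y"})
```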

So we think that if we overlay a scheme that does this on top, then we should be able to get something more composable, but we haven't really explored this whole space yet. I have a sort of comparison chart here, and you'll notice all of my slides are wordy. Just to give you an idea of how the schemes scale, look at this chart. It's not clear how this will develop. Some of these ideas are very complementary and can be merged. Expect to see some more development on this in the future.

The art of selection cryptography: So now I want to talk about a thing that I am calling the art of selection cryptography. And I'm not using the word cryptography here lightly; this is the more philosophical section of my talk. Before I can tell you what selection cryptography is, I think I need to redefine cryptography.

The definition that people use today is broken. You go to Wikipedia or any dictionary and they will say that cryptography is secret writing or deciphering messages. That definition has nothing to do with many of the things that we do today, like digital signatures, zero-knowledge proofs, private information retrieval, hash functions; it doesn't speak to things like cipher suite negotiation in TLS, which has been a constant source of vulnerabilities.

And bitcoin itself, too. You can build a bitcoin node today with absolutely no cryptography in it. The only cryptography that we use, if you go by the dictionary definition, is wallet encryption, and you never send those encrypted messages to anyone else. To motivate my definition, I want to take a step back and sort of give my view on the world. Back in the early 90s, when I was politically coming of age on the internet, I was very excited and involved in the cypherpunks group and the activity around the prosecution of hackers and the export controls on encryption software.

There was this politics, or religion, that the internet would change everything. There was this rallying call: "information wants to be free". And I knew in my bones that this was true. We were going to use computers, which turn everything into information, and we would use networks to hook all of the computers together, and we would change the world. We were going to change the power balances, make more people more empowered, and everyone would have access to the world's knowledge and they would all fulfill their potential.

That's a very political take on something that I now think in fact is better described as a law of nature. This is not just that I want information to be free; no, information really does want to be free.

It is fundamental that information will percolate out into every little nook and cranny, and you can't control it. The result is that often bad things happen because information wants to be free. Sunlight is the ultimate solvent, but solvents corrode. So my email wants to be read by the NSA. When I try to log in to my server, it can't tell me from you, because you can just replay my login. And now you're logged in as me. When you go and browse the internet, people learn how your mind works.

They see inside your mind what used to be completely private. When I go to research something, marketers can send out cheap spam, and that spam is just as visible as the information I seek. If I want to build a digital cash system, I can't, because information is perfectly copyable.

And all copies are just as good. Money that you can just copy is not much of a money at all. So you have an environment where there are powerful parties that have more ability to use this fundamental nature of information, and this goal of everyone being more empowered may not come true.

And so, I would like to propose this definition of cryptography that says that cryptography is that art and science with which we try to fight this fundamental nature of information. We try to bend it to our political will and to serve our moral purposes. And to direct it to human ends against all outcomes and all eventualities that may try to oppose it. This is a sort of broad definition. It encompasses everything that we should properly call cryptography, and a number of things that we haven't yet traditionally called cryptography, such as computer security, or even sometimes the drafting of legislation.

I don't really offer this lightly. I have thought about this for a long time and I think this definition leads to pretty good intuitions about the kinds of things that have cryptographic considerations.

So often, as technologists, we get excited when we have a cryptographic tool to solve a problem. You want to read my email? You want to track my stuff? Bam, private information retrieval. Bam, digital signatures. I'm going to solve all problems with some cryptographic tool. You can fight back against things you don't like in the world with a bunch of math.

But sometimes we get caught up in the coolness of that and we forget that we are really fighting the fundamental nature of information. It's so hard that it may not even be possible to make a secure cryptosystem. They are all predicated on a set of strong assumptions. We assume that some mathematical problem is intractable.

Over time we have seen that many cryptosystems have been broken. Few people believe it to be the case, but it may be fundamentally impossible to build strong cryptography. If you were able to build a provably secure asymmetric cryptosystem with no strong assumptions in it, this would be a proof that P is not equal to NP. You could still build insecure cryptosystems; building secure cryptosystems is actually harder than just figuring out whether P is equal to NP. So don't expect anyone to solve this soon.

A really important point here is that attacks on cryptosystems are themselves information that want to be free. We often underestimate how powerful computers have become because our software is so bloated and slow and has many layers.

You can imagine that a computer is sort of an intellectual equivalent of someone who is doing arithmetic for you, but a billion times faster. So if you can attack a cryptosystem by applying a lot of force, computers are a force multiplier. Everyone who is attacking your cryptosystem, if they have a desktop computer, it's like them having an army of a billion imbeciles.

They might be imbeciles, but there's a billion of them. And that's before they get a botnet. And then they have a hundred thousand times that much computing power. Or the NSA data center. So if someone is able to attack your cryptosystem and reduce it to a state where it is still a huge haystack that they are searching for a needle in, they can then apply a lot of computing power to go further. We can even use the computing power to search for complicated algebraic solutions for the systems as well.

It's not just the number crunching. It actually expands our intellectual capacity to attack the systems. This more strongly favors attackers than it does defenders in general. In order to build a secure cryptographic system, we have to secure it against any eventuality. And so as a result, virtually everything people propose ends up being broken. This is certainly true for everything I've touched.

There's a whole subfield in academia about provable cryptography. People get confused about what provable cryptography means. Provable cryptography is about cryptography that is secure as long as the proof is right and the assumptions hold. Well why wouldn't it be secure if the assumptions hold? Well it turns out that's actually hard to achieve too.

And many things that occur in provable cryptography, there's a pressure to publish a proof. So the easiest way to get a proof is to adopt a stronger assumption. There is a lot of provable cryptography that is broken because they adopted assumptions that were wrong, they sounded plausible but they turned out to not be true, or they proved some vacuous property that did not map to security in a practical sense.

I don't mean to sort of say that cryptography is the only thing that people do that is hard. Civil engineering is a tremendously difficult discipline and there are lives on the line if a building doesn't stand up. But usually in civil engineering you are more worried about a limited set of natural causes and you're not generally worried about the billion army of imbeciles and all of the world's efforts to nearly effortlessly attack you.

If you ask someone to build a building that cannot be taken down through all the force in the world, they would tell you that you're nuts. They would probably ask for a trillion dollars first, and then tell you that you're nuts, but still take the trillion dollars.

We are only able to think about cryptography at all because we can use software. Software is a great building block. We have tremendous tools to write software that is more complicated than anything else. A very complicated piece of mechanical engineering, something like the space shuttle, on the fringe of what we can do as a civilization, has on the order of something like 2.5 million parts.

But if you look at a conventional piece of software that you use every day, say Firefox, that's 17 million lines of code. Almost any one of those lines of code could be undermining your security. Typical figures for defect rates for software in the industry are numbers like 15 to 50 bugs per thousand lines of code. The number varies a lot; maybe for software where people care a lot it is more like one bug per thousand lines of code. But even one per thousand, and we're talking about software that has 17 million lines of code?

So software, despite our awesome tools to build it, is very buggy. Making it cryptographically secure is even harder. Ordinary testing is making sure that your software does what it should; security testing is making sure that that is all your software does.

I have hopefully impressed on you that this is a hard area. This is not news to me. There is this adage on the internet that goes like, "Never write your own cryptography". Because people did appreciate that it's hard and everyone gets it wrong. But I think that's bad advice.

I call that the abstinence-only approach to cryptographic education. And one of the results is like the provable security stuff: if you tell people to never write their own cryptography, they will go off and redefine cryptography to be some narrow part, like "well, at least I didn't reimplement AES". Some people have reimplemented AES and had only minor attack problems, but rarely are systems broken by people reimplementing the underlying cryptographic primitives, although there's plenty of potential to do so.

Systems are more often broken by higher-level violations of their assumptions. And even if you follow this "never write your own cryptography" maxim, now you have this other problem which is, now you have to go and select the cryptography that you will be using, and you have to use it with that software's assumptions. So I would like to re-emphasize that if people are counting on a program, to fight this fundamental nature of information, the program as a whole is cryptographic.

That doesn't mean that you can't write it, or that you shouldn't write it, but it means that you need to step up to the plate and recognize the risks. This does come along with some bad news, though. I can't tell you how to write a secure cryptographic program. We don't even know if it is possible.

We do know that some things are unsafe and that you shouldn't do this or you shouldn't do that. But usually this advice is very application-specific. It is not general advice. So in general what I can say is we should face the challenges frankly and understand the risks, we should communicate and learn from our mistakes, and we should advance the art. So in the interest of advancing the art, I wanted to sort of talk a little about a special kind of cryptography that is probably the most common cryptography in the world.

I call it selection cryptography. It is the cryptosystem of picking cryptosystems. And you should think about, when you select a cryptographic tool, or build software that has cryptographic implications, what selections you are making and whether those choices are good from the perspective of a cryptographic adversary. I see that a common norm in the bitcoin space is to build tools out of primitives that were found on github. I don't say that to deride it. There is some fantastic code on github, including my own.

But not all code on github is good. So we can think about things like how can we go about doing a better job at selection? If you are a domain expert in the particular cryptographic tools that you are using, then you can review it as a true reviewer. That's great, and I hope everyone that can do that, does so.

But if you're selecting someone else's code, you probably can't review it. You probably don't understand the underlying parts, and you probably shouldn't necessarily need to. So I propose this 3-step program which is to ask yourself first, is this code broken or malicious?

What can happen if it is? You have to think about this. If you come back and say, "not much" then you are wrong and you should go back and think about it some more. Go back to step one. If you take a piece of software that seems like it can do nothing wrong, but its install script has a root shell backdoor in it, and you run it on your infrastructure, you're completely compromised.

The third step is to identify what the risks are, and then deal with them, and think about what can be done to mitigate those risks. So I wanted to give some concrete examples, and this was really hard for me. For all that I said about how hard this is, I don't think that anyone is bad or incompetent for making mistakes in this area. DJB, one of the most brilliant cryptographers of our time, had a bug in his original Ed25519 code that went unnoticed until someone tried to formally prove it correct and found that it occasionally generated incorrect results.

So everyone makes mistakes. We need to understand the mistakes so that we may learn from them. I have given an example here. On the screen is a piece of javascript for "secure random number generation" that was, at least in the past, incredibly commonly deployed. It has been used on a huge number of websites, including many, many bitcoin websites that generate private keys for wallets and signing.

It wasn't created by someone in the bitcoin ecosystem, but it was widely picked up by it. Now it has a couple of things that are sort of funny about it that a reviewer would pick up, or at least a reviewer with domain expertise.

So one is that there is a check in it, where it checks whether "window.crypto" is available. And if it's not available, it just doesn't use it. It doesn't throw an error. It doesn't do anything. What it does use instead is "Math.random". And in most browsers, "Math.random" is not cryptographically secure: there are only 2^48 possible states for it.
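The failure mode described here, silently falling back from a secure generator to a predictable one, contrasts with a fail-closed design. A minimal Python sketch of both patterns (the function names are illustrative, not from the code on the screen):

```python
import random
import secrets


def bad_random_bytes(n: int) -> bytes:
    # The anti-pattern described above: silently use a predictable,
    # non-cryptographic generator (Math.random there, random here).
    return bytes(random.getrandbits(8) for _ in range(n))


def secure_random_bytes(n: int) -> bytes:
    # Fail-closed: secrets.token_bytes is OS-backed, and if the platform
    # has no secure source, the underlying call raises an exception
    # instead of silently degrading to a weak generator.
    return secrets.token_bytes(n)


key = secure_random_bytes(32)
assert len(key) == 32
```

The point is not the specific library, but that unavailability of the secure source should be a loud error, never a silent substitution.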

And it's also, in most browsers, seeded from the time the browser started. So this value is pretty predictable. And then also, it uses the time that it was running at. That's also pretty predictable. And so if you're in the state where you have not used the secure random number generator, you're using something that has maybe on the order of 50 bits of entropy at most, and probably quite a bit less.
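The scale of those numbers can be sanity-checked with some quick arithmetic; the guess rate below is an assumption for illustration, not a figure from the talk:

```python
# Math.random's state space is at most 2**48. Assume an attacker tests
# 10 million candidate keys per second (a modest, assumed rate).
states = 2 ** 48
rate = 10_000_000  # guesses per second, an assumption for illustration

days = states / rate / 86400
print(f"full 2^48 space: ~{days:.0f} days")  # ~326 days

# If the seed is the browser start time, known to within an hour at
# millisecond resolution, only 3.6 million states remain:
narrowed = 60 * 60 * 1000
print(f"time-narrowed space: ~{narrowed / rate:.2f} seconds")  # ~0.36 s
```

Even the unnarrowed space is within reach of a patient attacker, and time-based seeding collapses it to nothing.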

Now, with modest computing power, an attacker can search this space. It is quite practical to do so, and they can discover private keys as a result. At least the code tries to use the secure random number generator when it's available, so that's something. But I have complained about this code to people using it, because that fallback looks unsafe. Now, what I didn't see at first, and what I have since shown to a dozen people, virtually none of whom spotted it even when told there was another issue in the code, is that navigator.appVersion is a string.

The version check compares that string lexicographically, so on modern browsers the check fails and this code never uses the secure random number generator at all. The same thing happens when you use this pile of javascript from inside webworkers, which actually happened in the bitcoin ecosystem. But you don't even have to have that problem; this is what happens when you don't select things correctly. Another concrete example is that a very popular bitcoin wallet deployed a message encryption function using ECIES.
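Why a version string breaks a numeric-looking check can be shown in any language; Python has the same lexicographic trap. The `appVersion >= "5"` shape below is an assumed reconstruction of the class of bug, not the exact original code:

```python
# A "modern" browser reports a higher version, but as a string.
app_version = "10.0"

# Lexicographic comparison: "1" sorts before "5", so the check that was
# supposed to gate the secure path fails on version 10 and above.
assert not (app_version >= "5")

# The fix: parse the version and compare numerically.
major = int(app_version.split(".")[0])
assert major >= 5
```

A single character class mismatch, string versus number, is enough to disable the secure path on every current browser while passing on old ones.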

So, ECIES is, well, it's not correct to say it's standardized, but it's a well-studied way of doing message encryption with elliptic curve cryptography. And I say it's not standardized because there are no test vectors or things like that; you can implement it on your own and your implementation may not match anyone else's.

But it's well-understood, and if implemented correctly, it's secure. So they implemented ECIES using source code they found on github, and the source code they found on github was widely linked all over the internet; it was mentioned on bitcointalk, and people who knew cryptography were talking to the author about it.

The author gives me the impression of someone who is sort of new to programming. He was really excited. And so this wallet picked it up, and they reviewed it, good for them. They found that it used an insecure random number generator, something like the python mersenne twister stuff; it was some other system that the author had just sort of magicked up.

It had a bunch of other problems. One of them was that if someone was running a decryption oracle, something that would throw error messages back, you could send it 2^16 messages, collect the results, and then take another message to that same destination, one you couldn't otherwise have decrypted, and use those 2^16 results to decrypt the message.
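As a toy illustration of the scale involved, and emphatically not the actual ECIES flaw, here is how an oracle that returns nothing but a success or failure signal gives up a 16-bit secret in at most 2^16 queries:

```python
import hashlib

SECRET = 0xBEEF  # the 16-bit value the oracle implicitly holds


def oracle(guess: int) -> bool:
    # The oracle leaks only a valid/invalid signal, like the error
    # messages described above, never the secret itself.
    return (hashlib.sha256(guess.to_bytes(2, "big")).digest()
            == hashlib.sha256(SECRET.to_bytes(2, "big")).digest())


# Brute-force the 16-bit space using only oracle responses.
recovered = next(k for k in range(2 ** 16) if oracle(k))
print(hex(recovered))  # 0xbeef
```

2^16 queries is nothing: a machine-generated stream of probe messages to a live endpoint, and whatever the oracle protects is gone.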

It's a decryption oracle attack. It also had this issue where, if you sent some messages through it, it would just silently corrupt them on encryption, including the all-one-bits message. So if you sent hex FF bytes through it, the result would be line noise; it was just a silent corruption bug.
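A roundtrip self-test over edge-case messages is the kind of check that would have caught this class of bug. The cipher below is a stand-in XOR construction, purely for illustration of the test pattern:

```python
import os


def encrypt(key: bytes, msg: bytes) -> bytes:
    # Toy stand-in cipher; any real encrypt/decrypt pair slots in here.
    return bytes(m ^ k for m, k in zip(msg, key))


def decrypt(key: bytes, ct: bytes) -> bytes:
    return bytes(c ^ k for c, k in zip(ct, key))


# Roundtrip edge cases: all-zero bits, all-one bits (the case that was
# silently corrupted), and an ordinary padded message.
key = os.urandom(32)
for msg in (b"\x00" * 32, b"\xff" * 32, b"hello world".ljust(32, b"\x00")):
    assert decrypt(key, encrypt(key, msg)) == msg, "silent corruption!"
```

Five lines of self-test, run over deliberately extreme inputs, turns a silent corruption bug into a loud failure before any user data is touched.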

So all of these issues I found in about 10 minutes, because I have domain expertise in this exact kind of cryptosystem. And there are probably more problems with it; I stopped looking at that point. The authors of this wallet software removed it and took that feature out within about a minute of my report. I don't think that they did wrong so much here; I think they are very competent and that they responded in a very responsible way.

Other people that I have worked with have not been so responsible in the past. Other wallet vendors have done similar things.