Bitcoin mining technical explanation
Bitcoin wallets keep a secret piece of data called a private key or seed, which is used to sign transactions, providing a mathematical proof that they have come from the owner of the wallet. The signature also prevents the transaction from being altered by anybody once it has been issued. All transactions are broadcast between users and usually begin to be confirmed by the network in the following 10 minutes, through a process called mining.
Mining is a distributed consensus system that is used to confirm waiting transactions by including them in the block chain. It enforces a chronological order in the block chain, protects the neutrality of the network, and allows different computers to agree on the state of the system. To be confirmed, transactions must be packed in a block that fits very strict cryptographic rules that will be verified by the network.
These rules prevent previous blocks from being modified because doing so would invalidate all following blocks. Mining also creates the equivalent of a competitive lottery that prevents any individual from easily adding new blocks consecutively in the block chain. This way, no individuals can control what is included in the block chain or replace parts of the block chain to roll back their own spends.
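To make the "modifying a previous block invalidates all following blocks" point concrete, here is a minimal sketch of a hash-linked chain. It is a toy under stated assumptions: the block layout, the function names (`block_hash`, `build_chain`, `first_invalid`) and the string payloads are all hypothetical, and real Bitcoin blocks use a binary header and proof-of-work that are not modelled here.

```python
import hashlib

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first toy block


def block_hash(prev_hash: str, payload: str) -> str:
    """Hash a toy block: the previous block's hash followed by this block's data."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Return a list of (prev_hash, payload, hash) tuples linked by hash."""
    chain, prev = [], GENESIS_PREV
    for payload in payloads:
        h = block_hash(prev, payload)
        chain.append((prev, payload, h))
        prev = h
    return chain


def first_invalid(chain):
    """Index of the first block whose links no longer check out, or None."""
    prev = GENESIS_PREV
    for i, (stored_prev, payload, stored_hash) in enumerate(chain):
        if stored_prev != prev or block_hash(stored_prev, payload) != stored_hash:
            return i
        prev = stored_hash
    return None


chain = build_chain(["alice pays bob", "bob pays carol", "carol pays dave"])

# Rewrite block 0's data and recompute its own hash, but leave later blocks alone.
tampered = list(chain)
new_hash = block_hash(tampered[0][0], "alice pays mallory")
tampered[0] = (tampered[0][0], "alice pays mallory", new_hash)

print(first_invalid(chain))     # None
print(first_invalid(tampered))  # 1
```

Even though the rewritten block is internally consistent, the next block still points at the old hash, so an attacker would have to redo every later block as well.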
This is only a very short and concise summary of the system. If you want to get into the details, you can read the original paper that describes the system's design, read the developer documentation, and explore the Bitcoin wiki.
How does Bitcoin work? This is a question that often causes confusion. Here's a quick explanation!
One possible approach is for her to try to validate a block that includes both transactions.
Assuming she has one percent of the computing power, she will occasionally get lucky and validate the block by solving the proof-of-work. Unfortunately for Alice, the double spending will be immediately spotted by other people in the Infocoin network and rejected, despite solving the proof-of-work problem.
A more serious problem occurs if she broadcasts two separate transactions in which she spends the same infocoin with Bob and Charlie, respectively.
She might, for example, broadcast one transaction to a subset of the miners, and the other transaction to another set of miners, hoping to get both transactions validated in this way.
In fact, knowing that this will be the case, there is little reason for Alice to try this in the first place. She will then attempt to fork the chain before the transaction with Charlie, adding a block which includes a transaction in which she pays herself. And unless Alice is able to solve the proof-of-work at least as fast as everyone else in the network combined (roughly, that means controlling more than fifty percent of the computing power), then she will just keep falling further and further behind.
Of course, she might get lucky. We can, for example, imagine a scenario in which Alice controls one percent of the computing power, but happens to get lucky and finds six extra blocks in a row, before the rest of the network has found any extra blocks.
In this case, she might be able to get ahead, and get control of the block chain. But this particular event will occur with probability. Of course, this is not a rigorous security analysis showing that Alice cannot double spend.
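As a rough illustration of how unlikely that lucky streak is, the sketch below works through the arithmetic under a deliberately simple model: it assumes each block is an independent race that Alice wins with probability equal to her share of the computing power. The function name is hypothetical, and this is no more a rigorous security analysis than the argument above.

```python
from fractions import Fraction


def lucky_streak_probability(share: Fraction, blocks: int) -> Fraction:
    """Chance of winning `blocks` consecutive block races with power share `share`.

    Assumes independent races, each won with probability `share`.
    """
    return share ** blocks


# One percent of the computing power, six blocks in a row.
p = lucky_streak_probability(Fraction(1, 100), 6)
print(p)         # 1/1000000000000
print(float(p))  # 1e-12
```

Under this model, the streak is about a one-in-a-trillion event for each attempt.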
The security community is still analysing Bitcoin, and trying to understand possible vulnerabilities. The proof-of-work and mining ideas give rise to many questions. How much reward is enough to persuade people to mine? How does the change in supply of infocoins affect the Infocoin economy? Will Infocoin mining end up concentrated in the hands of a few, or many? These are all great questions, but beyond the scope of this post.
I may come back to the questions in the context of Bitcoin in a future post. To use Bitcoin in practice, you first install a wallet program on your computer.
You can see the Bitcoin balance on the left — 0. What you do is tell your wallet program to generate a Bitcoin address. You then send your Bitcoin address to the person who wants to buy from you.
You could do this in email, or even put the address up publicly on a webpage. This is safe, since the address is merely a hash of your public key, which can safely be known by the world anyway.
The person who is going to pay you then generates a transaction. Line 1 contains the hash of the remainder of the transaction, 7c This is used as an identifier for the transaction. Lines 3 and 4 tell us that the transaction has one input and one output, respectively.
Line 6 tells us the size in bytes of the transaction. Lines 7 through 11 define the input to the transaction. In particular, lines 8 through 10 tell us that the input is to be taken from the output from an earlier transaction, with the given hash, which is expressed in hexadecimal as ae Line 11 contains the signature of the person sending the money. Again, these are both in hexadecimal.
This seems like an inconvenient restriction — like trying to buy bread with a 20 dollar note, and not being able to break the note down. The solution, of course, is to have a mechanism for providing change.
Lines 12 through 14 define the output from the transaction. In particular, line 13 tells us the value of the output, 0. Line 14 is somewhat complicated. The main thing to note is that the string a7db6f You can now see, by the way, how Bitcoin addresses the question I swept under the rug in the last section: In fact, the role of the serial number is played by transaction hashes.
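The idea of a transaction hash acting as a self-generated serial number can be sketched as follows. Bitcoin identifies a transaction by a double SHA-256 hash of its serialized bytes; the byte strings below are made-up stand-ins rather than a real serialization, `toy_txid` is a hypothetical name, and the byte-order reversal that Bitcoin tools apply when displaying the identifier is skipped.

```python
import hashlib


def toy_txid(serialized_tx: bytes) -> str:
    """Double SHA-256 of the serialized bytes, hex-encoded (display order ignored)."""
    return hashlib.sha256(hashlib.sha256(serialized_tx).digest()).hexdigest()


# Two stand-in "transactions" that differ in a single character.
tx_a = b"in: <earlier output>; out: 1 coin to <address X>"
tx_b = b"in: <earlier output>; out: 2 coin to <address X>"

id_a, id_b = toy_txid(tx_a), toy_txid(tx_b)
print(id_a)
print(id_b)
print(id_a != id_b)  # any change to the contents yields a different identifier
```

No central authority is needed to hand out such identifiers: anyone holding the transaction can recompute the same hash.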
In the transaction above, for example, the recipient is receiving 0. There are two clever things about using transaction hashes instead of serial numbers. Second, by operating in this way we remove the need for any central authority issuing serial numbers. Instead, the serial numbers can be self-generated, merely by hashing the transaction. Ultimately, this process must terminate.
This can happen in one of two ways. This is a special transaction, having no inputs, but a 50 Bitcoin output. In other words, this transaction establishes an initial money supply.
You can see the deserialized raw data here, and read about the Genesis block here. With the exception of the Genesis block, every block of transactions in the block chain starts with a special coinbase transaction. This is the transaction rewarding the miner who validated that block of transactions. It uses a similar but not identical format to the transaction above. You can read a little more about coinbase transactions here.
The obvious thing to do is for the payer to sign the whole transaction apart from the transaction hash, which, of course, must be generated later. Currently, this is not what is done — some pieces of the transaction are omitted. This makes some pieces of the transaction malleable , i. I gather that this malleability is under discussion in the Bitcoin developer community, and there are efforts afoot to reduce or eliminate this malleability.
In the last section I described how a transaction with a single input and a single output works. Line 1 contains the hash of the remainder of the transaction. As in the single-input-single-output case this is set to 0, which means the transaction is finalized immediately. Lines 7 through 19 define a list of the inputs to the transaction. Each corresponds to an output from a previous Bitcoin transaction.
Line 11 contains the signature, followed by a space, and then the public key of the person sending the bitcoins. Lines 12 through 15 define the second input, with a similar format to lines 8 through And lines 16 through 19 define the third input.
The first output is defined in lines 21 and Line 21 tells us the value of the output, 0. The main thing to take away here is that the string e8c One apparent oddity in this description is that although each output has a Bitcoin value associated to it, the inputs do not. Of course, the values of the respective inputs can be found by consulting the corresponding outputs in earlier transactions. In a standard Bitcoin transaction, the sum of all the inputs in the transaction must be at least as much as the sum of all the outputs.
The only exceptions to this principle are the Genesis block and coinbase transactions, both of which add to the overall Bitcoin supply. If the inputs sum up to more than the outputs, then the excess is used as a transaction fee.
This is paid to whichever miner successfully validates the block which the current transaction is a part of. One nice application of multiple-input-multiple-output transactions is the idea of change.
Suppose, for example, that I want to send you 0. I can do so by spending money from a previous transaction in which I received 0. The solution is to send you 0. Of course, it differs a little from the change you might receive in a store, since change in this case is what you pay yourself.
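The fee and change rules can be sketched in a few lines. The amounts below are invented for illustration, the function names are hypothetical, and values are held as integer satoshis (1 bitcoin = 100,000,000 satoshis) to avoid floating-point rounding.

```python
def fee(input_values, output_values):
    """Transaction fee in satoshis: sum of inputs minus sum of outputs."""
    total_in, total_out = sum(input_values), sum(output_values)
    if total_in < total_out:
        raise ValueError("outputs exceed inputs; not a valid standard transaction")
    return total_in - total_out


def outputs_with_change(input_values, payment, miner_fee):
    """Build output values: the payment, plus change paid back to the sender."""
    change = sum(input_values) - payment - miner_fee
    if change < 0:
        raise ValueError("inputs do not cover payment plus fee")
    return [payment, change] if change > 0 else [payment]


# Illustrative numbers only: spend a 0.2 BTC output to pay 0.15 BTC with a small fee.
inputs = [20_000_000]
outputs = outputs_with_change(inputs, payment=15_000_000, miner_fee=10_000)
print(outputs)               # [15000000, 4990000]
print(fee(inputs, outputs))  # 10000
```

The second output is the change: it goes to an address controlled by the sender, and whatever is left unassigned goes to the miner.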
But the broad idea is similar. That completes a basic description of the main ideas behind Bitcoin. But I have described the main ideas behind the most common use cases for Bitcoin.
How anonymous is Bitcoin?
Many people claim that Bitcoin can be used anonymously. This claim has led to the formation of marketplaces such as Silk Road and various successors, which specialize in illegal goods.
However, the claim that Bitcoin is anonymous is a myth. The block chain is a marvellous target for these techniques. I will be extremely surprised if the great majority of Bitcoin users are not identified with relatively high confidence and ease in the near future.
Furthermore, identification will be retrospective, meaning that someone who bought drugs on Silk Road in will still be identifiable on the basis of the block chain in, say, These de-anonymization techniques are well known to computer scientists, and, one presumes, therefore to the NSA.
I would not be at all surprised if the NSA and other agencies have already de-anonymized many users. It is, in fact, ironic that Bitcoin is often touted as anonymous. Bitcoin is, instead, perhaps the most open and transparent financial instrument the world has ever seen.
Can you get rich with Bitcoin?
I must admit I find this perplexing. What is, I believe, much more interesting and enjoyable is to think of Bitcoin and other cryptocurrencies as a way of enabling new forms of collective behaviour.
But if money in the bank is your primary concern, then I believe that other strategies are much more likely to succeed. One is a nice space-saving trick used by the protocol, based on a data structure known as a Merkle tree. You can get an overview in the original Bitcoin paper. You can read more about it at some of the links above. But this is only a small part of a much bigger and more interesting story.
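For readers who want a feel for the Merkle tree trick mentioned above, here is a minimal sketch of computing a Merkle root. It follows the general scheme Bitcoin uses (pairs of hashes combined with double SHA-256, the last hash duplicated when a level has an odd count), but the leaves are made-up stand-ins, the names `dsha256` and `merkle_root` are hypothetical, and byte-order details are ignored.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin applies to transactions and tree nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves):
    """Combine leaf hashes pairwise, level by level, until one hash remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Stand-in transaction hashes; real ones would come from serialized transactions.
txids = [dsha256(f"tx {i}".encode("utf-8")) for i in range(5)]
root = merkle_root(txids)
print(root.hex())
```

The space saving comes from the fact that a block header only needs to commit to this single 32-byte root, while a proof that a given transaction is included needs only the sibling hashes along one path through the tree.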
But the scripting language can also be used to express far more complicated transactions. To put it another way, Bitcoin is programmable money. In later posts I will explain the scripting system, and how it is possible to use Bitcoin scripting as a platform to experiment with all sorts of amazing financial instruments.
You can tip me with Bitcoin! You may also enjoy the first chapter of my forthcoming book on neural networks and deep learning, and may wish to follow me on Twitter.
In my legally uninformed opinion digital money may make this issue more complicated. At least naively, it looks more like speech than exchanging copper coins, say.
Thanks, I was always too lazy to look up BTC in detail. Your article cleared up most of my questions. I wanted to know one thing: what if some smart hacker finds a vulnerability in the protocol and uses it to generate new bitcoins for himself? Once that happens, all confidence in bitcoins would be gone and it would lead to chaos.
Your scenario is possible. Just like any other popular piece of open source software there are incentives for finding exploits, but there are a lot of benevolent hackers examining the code to uncover and fix them.
Yes, that solves much of the problem neatly. My broad point about asymmetries is still true, however.
And is vividly demonstrated by the rise of large mining pools. This is in response to your comment below. I must have clicked on the wrong link when I replied. There have been 2 major live flaws in Bitcoin that I know of: Might want to look up the CVEs and the patches. From the sound of them, some validation check was omitted and so bad transactions were allowed. The raw block data that each miner is trying to solve contains a generation transaction.
That transaction is where their coins are sent if they solve that block. Because miners competing against each other want their coins to be sent to different addresses, and those addresses are hashed together with their nonce, it does not matter if everyone starts their nonce from zero. The added randomness from differing generation transaction addresses prevents each miner from working in the same space as others.
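The point about payout addresses can be illustrated with a toy proof-of-work loop. Everything here is a simplification under stated assumptions: the "header" is just the previous hash plus a payout address string, the difficulty is tiny, the names (`mine`, `difficulty_bits`, the miner addresses) are hypothetical, and the way the hash is compared with the target differs from Bitcoin's real block header format.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256 of the given bytes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(prev_hash: bytes, payout_address: str, difficulty_bits: int = 12):
    """Try nonces 0, 1, 2, ... until the hash falls below the toy target."""
    target = 1 << (256 - difficulty_bits)
    prefix = prev_hash + payout_address.encode("utf-8")
    for nonce in range(2**32):  # the nonce is a 32-bit field
        digest = dsha256(prefix + nonce.to_bytes(4, "little"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest
    raise RuntimeError("nonce space exhausted without finding a solution")


prev = bytes(32)  # stand-in for the previous block's hash
nonce_a, digest_a = mine(prev, "address-of-miner-A")
nonce_b, digest_b = mine(prev, "address-of-miner-B")
print(nonce_a, digest_a.hex())
print(nonce_b, digest_b.hex())
```

Both miners count up from zero, yet because the payout address is part of what gets hashed, each nonce produces an unrelated hash for each miner, so they are not duplicating each other's work.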
I had wondered about the same question as the author. Your explanation clears it up for me. Moreover the nonces need not be enumerable. If randomly picked from a large enough pool it is unlikely that the same nonce gets picked twice. Only one thing to add on another post: Bitcoin has 3 methods for finding peers: Did I miss it?
Does it have anything to do with quantum computing? Oops — actually, I had an extended discussion of this question, but deleted it just before I posted. The reason I deleted it is that the discussion was inconclusive. The separation seems to be a fairly arbitrary design decision — there are some minor space and security advantages, but not enough in my opinion to justify making the Bitcoin address the hash rather than the public key.
That reduces the window during which the private key could be derived and used in a double-spend to about 10 minutes. This has significant ramifications for the safe transition to quantum-proof cryptography, if nothing else.
To me, both seem like relatively small points. On the first point, many people reuse addresses, so in practice public keys are often widely known. So it does seem a bit arbitrary. This might make a nice example for my post on Bitcoin scripting. I have read that there is no known algorithm that would allow public keys to be derived from public addresses within a practicable timescale, even with quantum computing. However, the same is not true for deriving private keys from public keys.
Thus addresses that have not been used to spend, have benefits in terms of being more QC proof. I recall Vitalik Buterin writing on this topic. It looks like the protocol version is inside the JSON. What would be the incentive for non-miners to answer your question? Why would you trust the answers or lack thereof? After all, if I understand correctly, when there is no transaction fee set aside, the miners could very well choose to omit transactions from their blocks?
On trusting the answers: The requirement of a signature makes this hard to forge by a malicious naysayer. On your last point, yes, this is a very interesting question. At present this all seems to be working okay, but over the long run I suspect will limit the use of Bitcoin for small transactions. On the last point: I could see the transaction fee being indirectly related to the time required to confirm a transfer.
If you want your transfer confirmed quicker, then you have to pay. Also, could someone with very large resources overwhelm the network with bad data? E.g., if China wanted to use some supercomputers or a botnet to stop Bitcoin from operating by adding all sorts of bad data to the block chain? Denial-of-service attacks are a real problem. On the first question, the answer is, I think: Android had a bug in its random number API that was successfully exploited.
Or maybe someone dies but the next of kin doesn't know the details? Lost bitcoins are just that: gone from the money supply for good, unless someone manages to either (a) recover the keypair or (b) break the underlying crypto. That brings up an interesting scenario: on a long time scale there will have to be some allowance made for replacement of the lost coins, or sub-division of the satoshi.
With Bitcoin, losing the private key for good is more like accidentally dropping your coins out of an airplane over the Pacific Ocean. Looks like we both independently arrived at similar methods of explanation: Thank you so much! I had wanted an understandable primer on Bitcoin for ages and this was a fabulous read!
It looks likely to cause floating point approximation errors. Just wanted to say thanks for a really great essay — the explanation was really clear, and totally fascinating. Can quantum computers mine bitcoin faster?
Does this boil down to how quickly a quantum computer can find a string that has a specified property for SHA? For which we have a quadratic speedup, but probably no more? Thanks for this, while I understood the majority of it, the coding element was very useful, especially highlighting where the script goes in conjunction with the transaction. While a lot of people know about bitcoin, there is such a shortage of good quality technical info.
Why is bitcoin built to be inherently deflationary? This seems to be the go-to argument against why it will ever gain widespread adoption as a currency. Why does the reward for mining bitcoin halve every , blocks?
Could there be a point in the future where this is reversed? I certainly suspect as do you that these may ultimately turn out to be design flaws. Bitcoin is NOT deflationary. It is inflationary with a known and decreasing rate up until around at which point it will stop being inflationary. The only deflation in Bitcoin may happen through coin loss.
The same, by the way, is true for Fiat. The difference is that Fiat can be arbitrarily inflated and with Bitcoin it is not arbitrary. Why is it inflationary at all as in, why not start with a predetermined amount of bitcoins that never change. Bitcoin designers wanted a way to spread bitcoins around without starting with a central authority that has them all and gives them out like, say, ripple.
The bitcoin generating part of mining does exactly that. Bitcoin is only not deflationary if you assume that real wealth production will gradually slow, and eventually stabilize around at the same pace as the drop in Bitcoin production. The more that needs to be paid out in each transaction to cover the fees, the lower prices and actual payments will have to fall to make room for that overhead.
Lower revenue translates to lower ability to afford a given price level, and so on. What actually needs to be demonstrated is that there is any value in allowing any static, nonproductive account to maintain its nominal value, as opposed to using the inherent decline in the value of such accounts provide the baseline motivation to use more productive investments to store anything beyond cash sufficient to meet immediate needs for liquidity.
Trying to store value in money rather than in future production potential is the ultimate perverse incentive, rewarding fraud and financial manipulation far out of proportion to development of real assets. There are excellent reasons for wanting to store value. One obvious one is the desire to save for retirement. JPE V66 6 Dec. Actually bitcoin is inherently deflationary if you believe that the size of the bitcoin economy will grow faster than the money supply.
Although not quite intuitive, it does make sense upon reflection that the money supply reflects the value of the economy it represents.
If the money supply is growing faster than the underlying economy then you get inflation. If the money supply is growing slower than the economy you get deflation. I think all but a few of us expect the bitcoin economy to grow faster than the supply of bitcoins — hence we have a deflationary currency.
The wisdom of that choice is another matter, of course. One could imagine many different scenarios for the amount and timing and conditions of new currency entering the system. Does everyone have their own version of it or do they sync to a master? Does every block chain get updated when validation is completed? But the way the protocol is designed at present there is a sizeable number of people keeping a full copy of the block chain. This is currently quite a manageable size (about 12 gig).
If Bitcoin grows rapidly enough this may eventually become a problem. The conclusion there, which seems to me believable, is that there are many options for scaling Bitcoin at least up to the level at which credit cards are used today, and perhaps further. Regarding the total amount of bitcoins: if I understand correctly, new bitcoins are generated each time a transaction is processed? Does that mean the more exchanges we have, the more bitcoins there are in the market?
How were the first bitcoins created? Is there another way of creating bitcoins than checking transactions? Not per transaction but per block of transactions. Exchanges are a bad example. The transactions within the exchange happen outside the network. There are so many trades going on within an exchange that it happens internally. And since trades need to happen fast, the network is not suited for that. Thanks for the great Bitcoin writeup. What I think is more interesting than the cryptography aspect is the social-motivational aspect of Bitcoin and why it seems to be succeeding.
Scaling this system to support a billion users transacting multiple times per day seems…. Anyway, all very interesting to watch. As usual, I got in late and out early with Bitcoin bought around 5, sold around , seemed like an awesome profit margin at the time… that aspect of Bitcoin is a lot like any other speculative investment, and is certainly fueling interest at this stage.
On scalability, check out https: Like you, though, I wonder about the long-run economics and impact of mining. Thanks for writing this great explanation of Bitcoin. I noticed in the first Bitcoin transaction example, you mention 0.
Thanks for the excellent writeup. I have a question about one item, hopefully you can explain it. It appears the money you send someone is merely chunks of one or more previous transactions. Those previous transactions are the inputs for my transaction to you. How does the transaction message for the 2 bitcoin transaction prove that I was the recipient of those previous transactions when the addresses are all different? The proof is in the digital signature. That signature is generated using a public key which must match when hashed the address from the output to the earlier transaction.
But if I understand correctly the need for every transaction to be publicly verified means that you are tied to all your transactions. Anyone with a copy of the block chain can notice that the flow of money goes from various drug users, to Stringer, to Russell. If you really want to enable money laundering, first create a bank. Such a bank would have more uses than just money laundering. You will use a trusted middleman that does several transactions each day, some with good-guys and some with bad-guys.
The middle-man then transfers out the necessary amounts to intermediate addresses yyy0 … yyyM that he has set up specifically for this transaction period. Because all the incoming money has gone into the xxx address there is no way to separate out subsequently which money went to which receiver. If ALL the yyyy addresses belong to bad guys then you would be guilty by association.
Many bitcoin services perform such mixing by default, based on what I have read. The legal ramifications for the mixing service provider are unclear to me. But such a bank would have to keep its own records — both as a practical necessity and as a legal requirement — and those could be obtained by the authorities.
Whereas cash can be laundered tracelessly, through a cash business like a casino or restaurant, which can perfectly innocently be expected to have lots of cash coming in and no way of knowing where it comes from. Interestingly this is exactly what was done with Silk Road. It basically was a bitcoin bank moving bitcoins around in such a way that the buyer and seller could not be connected.
Nonce starting at zero is not a vulnerability. The nonce is simply 32 bits out of the whole bit coinbase that you are hashing and there is no way to design a target solution to be distributed anywhere within the nonce range of those 32 bits.
Of course this creates an obvious incentive for all participants to try to guess nonces in a different order than everyone else. So it seems reasonable that most client software would use a random sequence of nonce guesses rather than guessing sequentially from 0. But still, if one were to find a vulnerability in the random number generator of a popular client, then it might be possible to design a competing client which would, in practice, almost always find the correct nonce before the targeted client, by virtue of guessing the same sequence a few steps ahead.
That would allow the attacker to successfully validate a share of blocks greater than their actual portion of the collective computational power, at the cost of everyone using the vulnerable client and finding the nonce less often than they should on average. Alex has explained my concern well.
As people make transactions, the public ledger grows. Will it not grow to an unmanageable size at some time? If the block chain forks, do the miners on both sides of the fork keep their rewards? I am puzzled by transactions in blocks. Is it not possible for two miners to be working on different blocks which contain mostly, although not all, the same transactions?
Does the second miner restart by taking his unverified transactions and putting them in a new block? However, over time only one of the forks will become the accepted consensus for confirmed transactions. And so only the miners from one fork will be able to redeem their transactions. What will happen when an owner loses his wallet and restores a backup from a few weeks back? He may have spent some coins, and he may have received some.
Those transactions are no longer in his block chain. How would the block chain get back in sync? On your question-to-yourself about using two phase commit, I think the major issue would be vulnerability to denial-of-service attack.
A malicious user could set up a swarm of identities to act as nay-sayers and therewith deny some or all others from performing transactions. In my experience using the bitcoin client, you are not allowed to do anything on the bitcoin network until your block chain is in sync with the latest transactions.
It somehow recognizes how far behind your block chain is and starts downloading blocks and tells you how old your block chain is and how much left you have to update as it downloads more. BTW, I un-installed the bitcoin client because over the 1 year span that I had it installed, the block chain went from about 2 GB to about 25 GB, and the novelty of having my own copy of the block chain wore off in comparison to its cost.
On the naysayer DDoS attack on two-phase commit: Here is a very entertaining rational explanation http: If we were to decide that the rewards should be different (remaining at 25 indefinitely, for example), what exactly would have to change? Is it the bitcoin mining clients that are hardwired to only validate transactions that award 25 coins to other miners when they validate their blocks, and the date of the validated block indicates that the award should be 25 BTC?
Every , blocks the rate halves. No need to keep track of the date, simply count blocks. As the chain is just a validated list of transactions, how can there be any cap on transactions? What does hardcoded mean practically? You only own as many bitcoins as others agree you own. So, hardcoded here means it is what the original protocol suggested and what all users are supposed to honor. Would it be, in principle, possible for all miners to agree on not lowering the reward at all?
For example to continue to reward 25 per block for all eternity. I was thinking about how the blockchain is managed as more transactions are processed, thanks for the link https: In a way, Bitcoin is replicating a history of money evolution in an accelerated manner. I wonder what will take place in the protocol to allow the peer-to-peer nature to continue while scaling the project to allow the transaction capacity necessary for a true currency.
Yeah, that is very interesting. And you do already see a lot of signs of centralization with the big mining pools. This makes the concept difficult to grasp.
Thanks for such a generous and informative post. There is so much babble on Bitcoin that it often seems to operate socially as more of a Rorschach test on currency than an actual means of exchange. The devil, and the delight, are in the details. Bitcoin has fascinated me recently. I admit to not being able to fully wrap my head around it, but I took what I could and wrote a little here: How does the block chain know that the address sending the coins is correct?
The sender sends their sig to go with it, I assume paired up with the hash of the address allows the various nodes to validate right? They would need to in order to validate. So can a sig only be used once, and if so how is it generated and what prevents it from being faked? Public key cryptography is a remarkable and beautiful thing. Each client using Bitcoin has keypairs — one key in each pair is public, the other private.
The nature of asymmetric cryptographic digital signatures is that I can sign any piece of data using my private key, and anyone else with only my public key can verify that the person who signed that data holds the private key.
In order to benefit they would have to be converted or be re-introduced later on. The situation is complicated further by the possibility of laundering. If you quickly spend some stolen bitcoins, then it becomes very difficult to later recover those bitcoins, since now they may be in possession of honest parties.
Indeed, this is a critical question.