Volume 33 Number 8
Blockchain Fundamentals: Diving into Transaction Hash Chains
By Jonathan Waldman | August 2018
In the first article in this series (msdn.com/magazine/mt845650), I presented foundational concepts required to broadly understand modern-day blockchains, using high-level examples to illustrate the basics. In this article, I’ll revisit some of the topics from the previous article by going into more detail about transaction hash chains, the role of the transaction pool and how a longest blockchain always prevails. This article is best read as a supplement to the previous article and contains introductory material for developers new to blockchain technologies.
Incidentally, while the articles in this series are based on the Bitcoin blockchain, I’m not advocating the adoption of a particular blockchain product or technology. Rather, my goal is to explore the foundation on which popular blockchain technologies are built and to equip you with knowledge you can apply should you decide to leverage existing blockchains or engineer your own. As you study blockchains, you’ll soon realize that implementation details differ dramatically among them. If you decide to specialize in a particular blockchain implementation, you’ll need to keep pace with its fixes and updates in order to maintain expertise. But I warn you that the dynamic nature of these emerging technologies often means that available books, videos, blogs, forums and other documentation resources fall behind, sometimes making it necessary to consult the latest-deployed source code as a particular blockchain implementation’s definitive reference.
Transaction Hash Chain Revisited
My previous article discussed the transaction hash chain data structure, which tracks digital asset ownership. In this article, I’ll delve more deeply into how that hash chain works.
To pay homage to blockchain’s roots, I’ll begin by focusing on Satoshi Nakamoto’s seminal white paper about Bitcoin (bitcoin.org/bitcoin.pdf) published on Oct. 31, 2008—months before the launch of Bitcoin on Jan. 3, 2009. Although Bitcoin implementation details have changed quite a bit since then, the white paper remains a useful reference, in particular the diagram on p. 2 that expresses the original transaction hash chain design concept.
The purpose of that diagram is to convey how a transaction hash chain is constructed and how digital signatures authorize the transfer of ownership sequence. However, it’s highly abstracted and, as such, is a bit confusing. To add clarity, I’ve constructed a more detailed version that depicts how current transaction hash chains actually work (see Figure 1).
Figure 1 Updated Version of Satoshi Nakamoto’s Original Transaction Hash Chain Diagram
My revised diagram shows three transactions (0-based, as in the original document): Transaction0 for Alice, Transaction1 for Bob, and Transaction2 for Charlie. The first transaction establishes Alice the original owner of the digital asset; its second transaction transfers ownership to Bob; and its third transaction transfers ownership to Charlie. Each transaction is composed of these fields (shown with a solid outline): transaction hash, digital asset ID, optional data, public key and signature. Other fields are used but not stored in the transaction (shown with a dashed outline): private key and new transaction hash. The diagram expresses field values as subscripted, mixed-case names—for example, the transaction hash value for Transaction0 is TransactionHash0 and the public key value for Transaction2 is PublicKey2.
Figure 1 is a simplified transaction hash chain because it tracks only a single digital asset (DigitalAssetID0) as it changes ownership (in contrast, cryptocurrency transaction hash chains typically have multiple digital asset inputs and outputs). Also, don’t confuse the transaction hash chain with the blockchain, which aggregates verified transactions into blocks. Finally, the transaction hash chain isn’t typically stored as the single linked-list data structure depicted. Rather, it can be constructed (quickly, with the aid of indexes) from transaction data that’s stored on the blockchain.
As I described in my previous article, the sequence of transactions is preserved because each new owner’s transaction contains a hash value that’s back-linked to the previous owner’s transaction. In Figure 1, back-links are formed when the transaction hash of the previous transaction is stored in the current transaction. For example, Bob’s transaction contains a transaction hash field containing Alice’s TransactionHash0 value; likewise, Charlie’s transaction contains a transaction hash field containing Bob’s TransactionHash1 value, and so on.
Back-links are just one of several data-integrity components of the transaction hash chain. The chain also enforces transfer-of-ownership authorization. To follow an example, imagine that Alice is a purveyor of the world’s finest wines and wants to maintain a ledger that tracks the fate of every bottle she owns. One day, Alice goes to her wine cellar and decides she will register herself on her business’s blockchain as the original owner of every bottle of wine stocked there, effectively seeding transaction hash chains for each of her cherished bottles of wine. To begin, she casually grabs a bottle of Cheval Blanc 1947 Saint-Émilion and tags it with a QR code containing a unique ID. She then scans the QR label into her blockchain client software running as a node on the network. The software translates the scanned code into a digital asset ID (DigitalAssetID0) then adds optional data (OptionalData0) along with Alice’s public key (PublicKey0). As you can see in Figure 1, these fields are in their own outlined rectangle that represents an unsigned transaction. Each transaction also contains a transaction hash back-link and a signature, but because this is the first transaction in the hash chain, those fields are blank (shown by the shaded fields for Transaction0).
Shown atop each transaction is a unique transaction hash value that the client software calculates by SHA-256-hashing together all of the transaction fields (the transaction hash, digital asset ID, optional data, owner’s public key and signature). Again, it’s this transaction hash value that’s used as the next transaction’s back-link for DigitalAssetID0.
When Bob, the manager of Alice’s Manhattan restaurant, wants to acquire Alice’s bottle of Cheval Blanc, he uses his client software to generate a new public-private key pair for the transaction. Bob could skip this step and aggregate all of his digital assets under a single, previously used public key, but that would expose him to unnecessary risk. Instead, he generates a new key pair and gives Alice a public key he’s never used before. That way, if he ever loses the paired private key, he loses access to only a single digital asset.
In response to Bob’s request, Alice launches her client software and browses her digital assets. She selects the transaction ID associated with the Cheval Blanc bottle Bob wants and then initiates the transfer request by providing Bob’s public key, which doubles as a sort of destination address. The node then creates a new transaction (Transaction1) containing the back-link value of the previous transaction hash (TransactionHash0), the value of the digital asset ID (DigitalAssetID0) for the Cheval Blanc bottle (this is the same value as the digital asset ID for Transaction0), the value for any custom fields related to the transaction (OptionalData1), and the value of Bob’s public key (PublicKey1) because Bob is this transaction’s owner.
So far, the node has constructed an unsigned new Transaction1 for Bob. The next step is to sign the transaction using Alice’s private key. This is a critical step: Alice currently owns the digital asset in question so only she can authorize transfer of that digital asset to Bob.
Elliptic Curve Cryptography
In Figure 1, labels 1 and 2 indicate where the transaction is signed and where it’s verified, respectively. In its current version, the Bitcoin blockchain leverages an implementation of public key cryptography (PKC) called elliptic curve cryptography (ECC). ECC provides stronger cryptographic results and shorter keys than the popular RSA/Diffie-Hellman alternative. Blockchain nodes use ECC to generate asymmetric key pairs using a formula that involves randomly selected points on a two-dimensional graph. This scheme allows a lost public key to be regenerated from the private key (but of course doesn’t allow a lost private key to be regenerated from a public key).
Blockchains modeled after Bitcoin also leverage ECC when it comes to digital signatures. Unlike the simplified PKC Rivest-Shamir-Adelman (RSA) examples I illustrated in my previous article, Bitcoin now uses an Elliptic Curve Digital Signature Algorithm (ECDSA) (specifically, SHA256withECDSA) for signing transactions. This algorithm works a little differently from other signing technologies: In ECDSA, you must pass the signer’s private key along with the message to be signed to a function that uses an ECDSA signature-generation algorithm to create a signature (this step is indicated by marker 1 in Figure 1). To later verify that signature, you must pass the signer’s public key, message, and signature to a function that uses an ECDSA verification algorithm to generate a true or false value, indicating whether or not the signature is valid (this step is indicated by marker 2 in Figure 1). Figure 2 summarizes signing and verifying using ECDSA.
Figure 2 Elliptic Curve Digital Signature Algorithm Signature Generation (Top) and Verification Algorithm (Bottom)
When creating a digital signature using PKC RSA, you verify the signature by comparing hash values, as shown in my previous article. For the curious-minded, that signature verification strategy isn’t possible with ECDSA. RSA PKC is a deterministic digital signature algorithm because signing a given message with a given private key generates the same signature every time. ECDSA, on the other hand, is non-deterministic: Each time you pass a message and a private key to the ECDSA signing function you’ll obtain a different signature. To see this in action, go to bit.ly/2MCTuwI.
Continuing the example, Alice is about to sign the transaction that transfers ownership of DigitalAsset0 to Bob. The node software passes her private key (PrivateKey0) along with a message (NewTransactionHash1) to the ECDSA signature-generation algorithm function and obtains a signature as output (Signature1). The node adds this signature value to the signature field of the new transaction. Finally, the node calculates the transaction hash (TransactionHash1) value, which is a SHA-256 hash of all transaction fields, including the signature. At that point, the node successfully produced a signed transaction that can be sent to the transaction pool.
A signed transaction is deemed unverified until it has been validated by a miner node. When a miner node tries to verify Bob’s transaction, it uses the transaction hash back-link to access the previous transaction’s public key, which leads to Alice’s Transaction0. Once the node has access to the previous transaction, it passes that transaction’s public key (PublicKey0) along with the new transaction hash (NewTransactionHash1) and the signature in Bob’s transaction (Signature1) to the ECDSA verification algorithm that returns a true or false value, indicating whether or not the signature is valid.
Incidentally, Alice’s private key (PrivateKey0) and the new transaction hash (NewTransactionHash1) are not stored in the transaction. Private key values should not be stored on a blockchain, and there’s no need to store the new transaction hash value because it can be recomputed whenever needed.
Bob grabs his corkscrew and thinks he’s going to savor the Cheval Blanc when he receives a Skype call from Charlie, the manager of one of Alice’s other restaurants. Charlie wants to offer a special bottle of wine to welcome a newly hired sommelier. Bob regretfully agrees to transfer ownership of the Cheval Blanc to Charlie. He asks for Charlie’s public key, and the same process is again carried out in order to transfer DigitalAsset0 ownership from Bob to Charlie.
There now exist three transactions for DigitalAsset0—one for Alice, one for Bob and one for Charlie. Each transaction was verified and incorporated into the blockchain. After a certain number of additional blocks have been mined on top of the block that contains a particular transaction, that transaction is deemed confirmed (this “certain number” is implementation-specific). Thus, the official owner of a particular digital asset is always the person who has the private key to the most recently confirmed transaction for that digital asset’s transaction hash chain.
The Need for Consensus
As you’ve seen, a transaction hash chain is a data structure that strives to enforce ownership of a digital asset. But remember that these transactions are stored on a distributed, decentralized, asynchronous, public network that’s vulnerable to attacks and exposed to nodes that don’t necessarily honor blockchain protocol rules (so-called “bad actors”). The result is that bad-actor nodes could verify transactions that aren’t actually valid or could collude on the network to undermine the integrity of the blockchain.
The Transaction Pool To prevent these transaction-integrity issues, all transactions go through a verification and confirmation process. Each transaction is created by a single node somewhere on the network. For example, assume that Alice is in Albuquerque and Bob is in Boston. When Alice transfers ownership of her digital asset to Bob, Transaction1 is constructed by a node in Albuquerque then broadcast to other nodes on the network. Concurrently, other nodes actively broadcast the transactions they’ve just created. These broadcasts spread to other nodes on a global network and it takes time to propagate those transactions due to network latency. Regardless of where on the global network a transaction originates, the blockchain protocol places all new transactions in a transaction pool of unverified transactions.
Proof-of-Work and Proof-of-Stake In a blockchain that issues a reward for proof-of-work, miner nodes aggressively select transactions from the transaction pool. It behooves the miner node to verify each transaction while constructing a candidate block because a block containing any bad transactions will be immediately rejected by other nodes—and that would mean the work done by the node was for naught.
Recall from my previous article that each node is in a race to find a nonce for the candidate block it has constructed so it can earn a financial reward and recover energy costs incurred while demonstrating proof-of-work. As of this writing, the current financial reward on the Bitcoin blockchain is 12.5 Bitcoin (BTC), which amounts to roughly $100,000 USD. Sometimes the financial reward is a transaction fee and sometimes it’s a financial reward plus a transaction fee. What’s important to understand about proof-of-work is that nodes must expend energy and incur equipment and infrastructure costs in order to profitably continue mining blocks; for a node to be sustainable, those costs must be offset by revenue.
It’s no wonder, then, that as soon as a miner finds a nonce it immediately broadcasts that block to every other node on the network in the hope that its just-mined block is added to the end of the blockchain. The Bitcoin blockchain calibrates its nonce difficulty so that new nonces are discovered roughly every 10 minutes, so a lag of even a few seconds can mean that another miner might also find a nonce and broadcast its candidate block.
To appreciate the implications of losing the mining race, consider the mining nodes that didn’t find a nonce in time: All the energy they expended was wasted. The miners who didn’t find a nonce have no choice but to stop processing the current block and start all over by grabbing and verifying transactions from the transaction pool. The reason they must stop mining as soon as they learn that another miner found a nonce is that a candidate block has a backlink to the hash of the previous block on the blockchain. When another miner mines a verified block that links to the previous block, the losing miners must create a new block that references the hash for the newly mined block. The losing miners must also discard the transactions they previously selected and choose a new set from the transaction pool because other nodes will reject any new block that contains transactions already included in a previous block.
A node must bear all of the costs required to support its mining equipment. As the Bitcoin blockchain has grown, this has led to another kind of race—a race for the most powerful mining equipment. The more computing power a mining node can access, the more likely it can win each 10-minute race to solve the cryptographic puzzle required to find a nonce.
One common criticism of proof-of-work is that it encourages the construction of ever-more-powerful computing centers and the use of increasing amounts of electrical power. A competitive edge is given to the owner of the most powerful computing equipment on proof-of-work-powered blockchain networks. For example, multimillion-dollar datacenters now work exclusively toward mining bitcoin. According to digiconomist.net, Bitcoin’s annual blockchain energy consumption as of June 2018 is 71.12 TWh, which is similar to Chile’s annual energy consumption (bit.ly/2vAdzdl).
Another widely discussed consensus algorithm is proof-of-stake, which rewards nodes that demonstrate an economic stake in the network. Arguably, proof-of-stake’s greatest appeal is that it’s more energy-efficient. Furthermore, it doesn’t issue a cryptocurrency reward for mining a block, although it does issue transaction fees as a reward. It also doesn’t require a race to find the nonce that solves a cryptographic puzzle. Instead, the network randomly selects a node that has registered itself as a “forger” (analogous to Bitcoin’s “miner”) based on the total value and age of its cryptocurrency units. Various implementation details strive to ensure fairness and randomness in selecting among forgers. For example, once a forger is selected it often can’t participate in another round of forging for at least 30 days. Effectively, high-value forger nodes containing the oldest cryptocurrency coins have an edge over other forger nodes.
Proof-of-stake supporters make the good point that the cost of running a node is much lower, encouraging more participation and a greater level of decentralization. Ironically, however, proof-of-stake systems discourage the use of the cryptocurrency that blockchain is designed to transact because spending will reduce the node’s total value and decrease the chance of being selected as a forger.
One thing to ponder is the point made by blockchain expert Andreas Antonopoulos: “Proof-of-work is also a proof-of-stake but proof-of-stake is not also proof-of-work.” He explains that proof-of-work offers a combination of both consensus algorithms by making the point that while miners participating in a proof-of-work-powered network aren’t selected based on number or age of cryptocurrency units, miner nodes effectively demonstrate their economic investment in the network by funding the energy required to participate. Thus, the “stake” in the proof-of-work scheme, he argues, is the cost of electricity a node is willing to incur in an effort to successfully mine a block (watch Antonopoulos speaking at a Silicon Valley Bitcoin Meetup on Sept. 13, 2016: bit.ly/2MDfkA1).
Longest Chain The blockchain network constantly extends, branches and prunes itself. The entire view of the blockchain is called the block tree; each miner node actively mines against the block that terminates the block tree’s longest chain. You might think that the longest chain is defined by the chain with the greatest number of blocks, but it’s actually defined as the sequence of blocks from the genesis block that produces the greatest amount of work. You can derive total work by summing the “difficulty” of each block—a measurement of how unlikely it is to discover a nonce for a candidate block. The network protocol maintains that value, which the Bitcoin blockchain adjusts every 2,016 blocks so that blocks take roughly 10 minutes of processing time to mine. The difficulty value is stored in each block so that work can be computed by nodes that are trying to identify the longest chain.
Occasionally, it’s inevitable that two nodes, A and B, will demonstrate proof-of-work by mining a new block within seconds or even milliseconds of each other. Because each node adds its new block to the end of what it sees as the longest chain before broadcasting that block to the network, a fork (branch) in the block tree will appear. Depending on where these nodes are located and the bandwidth of connected nodes on the network and other latency considerations, some fraction of the network will first see Block A as the new block and will add that to the end of the chain. The other fraction of the network will see Block B as the new block and will add that to the end of the chain. This leaves some nodes with Block A and others with Block B as the terminating block (see Figure 3).
Figure 3 (Top) A Block Tree Showing a Block Tree Fork and Two Equal-Length Chains (Bottom); A Block Tree Showing a Block Tree Fork and One Longest Chain
When a fork occurs as shown at the top of Figure 3, two chains are on the block tree—they’re equal in length and both are valid. The problem this presents is clear when you consider that mining nodes look for the longest chain before they begin mining because they need the hash for that chain’s terminating block.
If a miner successfully mines Block C and was working on Chain A, it will add Block C to the end of the chain that has Block A as its terminating block (see the bottom block tree in Figure 3). Once it does that, it broadcasts Block C to the network and other nodes will see that Chain A is the longest chain. Nodes working on Chain B will see that Chain A is longer than Chain B and will stop mining their current block so they can begin to mine a new block that extends Block C on Chain A. As soon as this happens, the network releases all of the transactions in Block B back into the transaction pool so that they can be picked up in a new round of mining.
You might wonder what happens to the Bitcoin earned by the miner that created Block B: Its transaction commissions and block rewards are never issued. On the Bitcoin network, these rewards aren’t given to a miner until 100 blocks have been successfully mined on top of the block in question.
In this article, I explored in greater detail some of the topics introduced in my previous article. Together, the two articles cover most of the fundamental concepts you must truly grasp to understand how blockchains work. After reading both, you should understand blockchain’s decentralized, distributed network architecture; SHA-256 hashing; PKC and ECDSA basics; how nodes construct transactions on hash chains and how digital signatures authorize the transfer of digital-asset ownership; how transactions in the transaction pool await selection and verification before getting confirmed into a block; how specialized nodes employ a particular consensus algorithm (such as “miners” using proof-of-work or “forgers” using proof-of-stake) to generate a block; and how nodes on the network add generated blocks to the longest chain. If you want to delve deeper into blockchains, I highly recommend the books and videos available at Safari Books Online (safaribooksonline.com) and those published by Andreas Antonopoulos (antonopoulos.com).
Jonathan Waldman is a Microsoft Certified Professional software engineer, a solutions architect with deep technical exposure to a variety of industries and a specialist in software ergonomics. Waldman is a member of the Pluralsight technical team and currently leads institutional and private-sector software-development projects. He can be reached at firstname.lastname@example.org and followed on Twitter: @jwpulse.
Thanks to the following Microsoft technical expert who reviewed this article: James McCaffrey