How does Ethereum work, anyway?


Introduction

Odds are you’ve heard about the Ethereum blockchain, whether or not you know what it is. It’s been in the news a lot lately, including on the covers of some major magazines, but those articles can read like gibberish if you don’t have a foundation for what exactly Ethereum is. So what is it? In essence, it is a public database that keeps a permanent record of digital transactions. Importantly, this database doesn’t require any central authority to maintain and secure it. Instead, it operates as a “trustless” transactional system: a framework in which individuals can make peer-to-peer transactions without needing to trust a third party OR one another.

Blockchain definition

A blockchain is a “cryptographically secure transactional singleton machine with shared-state.” [1] That’s a mouthful, isn’t it? Let’s break it down.

  • “Transactional singleton machine” means that there’s a single canonical instance of the machine responsible for all the transactions being created in the system. In other words, there’s a single global truth that everyone believes in.
  • “With shared-state” means that the state stored on this machine is shared and open to everyone.

The Ethereum blockchain paradigm explained

The Ethereum blockchain is essentially a transaction-based state machine. In computer science, a state machine refers to something that will read a series of inputs and, based on those inputs, will transition to a new state.
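
To make the idea concrete, here is a minimal Python sketch of a transaction-based state machine. It is a toy, not Ethereum’s actual state-transition function: the Transfer type, the account names, and the balance-only state are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """A hypothetical input: move `amount` from `sender` to `recipient`."""
    sender: str
    recipient: str
    amount: int


def apply_transaction(state: dict, tx: Transfer) -> dict:
    """Return the new state produced by applying one input to the old state."""
    if tx.amount < 0 or state.get(tx.sender, 0) < tx.amount:
        raise ValueError("invalid transaction: insufficient balance")
    new_state = dict(state)  # copy, so the previous state is left untouched
    new_state[tx.sender] = new_state.get(tx.sender, 0) - tx.amount
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.amount
    return new_state


def run(state: dict, txs: list) -> dict:
    """Feed a series of inputs through the machine, one transition at a time."""
    for tx in txs:
        state = apply_transaction(state, tx)
    return state


genesis = {"alice": 10, "bob": 0}
final = run(genesis, [Transfer("alice", "bob", 3), Transfer("bob", "alice", 1)])
print(final)  # {'alice': 8, 'bob': 2}
```

Each call returns a new state rather than modifying the old one, which mirrors the idea that every valid input moves the machine from one well-defined state to the next.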

  • state
  • gas and fees
  • transactions
  • blocks
  • transaction execution
  • mining
  • proof of work

Accounts

The global “shared-state” of Ethereum is made up of many small objects (“accounts”) that are able to interact with one another through a message-passing framework. Each account has a state associated with it and a 20-byte address. An address in Ethereum is a 160-bit (20-byte) identifier that is used to identify any account.

  • Contract accounts, which are controlled by their contract code and have code associated with them.

Externally owned accounts vs. contract accounts

It’s important to understand a fundamental difference between externally owned accounts and contract accounts. An externally owned account can send messages to other externally owned accounts OR to other contract accounts by creating and signing a transaction using its private key. A message between two externally owned accounts is simply a value transfer. But a message from an externally owned account to a contract account activates the contract account’s code, allowing it to perform various actions (e.g. transfer tokens, write to internal storage, mint new tokens, perform some calculation, create new contracts, etc.).

Account state

The account state consists of four components, which are present regardless of the type of account:

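  • nonce: For an externally owned account, the number of transactions sent from this address. For a contract account, the number of contracts created by this account.
  • balance: The number of Wei owned by this address. There are 1e+18 (10^18) Wei per Ether.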
  • storageRoot: A hash of the root node of a Merkle Patricia tree (we’ll explain Merkle trees later on). This tree encodes the hash of the storage contents of this account, and is empty by default.
  • codeHash: The hash of the EVM (Ethereum Virtual Machine — more on this later) code of this account. For contract accounts, this is the code that gets hashed and stored as the codeHash. For externally owned accounts, the codeHash field is the hash of the empty string.
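
As a rough illustration, the account state can be pictured as a small record. The sketch below is an assumption-laden simplification: the class and function names are made up for this example, the two hash fields are left as empty placeholders rather than computed, and a nonce field is included because the account nonce (the count of transactions sent from the account) comes up again in the transaction validity checks. The part worth noticing is that balances are always whole numbers of Wei.

```python
from dataclasses import dataclass
from decimal import Decimal

WEI_PER_ETHER = 10**18  # 1 Ether = 1e18 Wei


@dataclass
class AccountState:
    """Illustrative container for the four account-state components."""
    nonce: int = 0             # count of transactions sent from this account
    balance: int = 0           # always an integer number of Wei
    storage_root: bytes = b""  # placeholder; not a real trie root hash
    code_hash: bytes = b""     # placeholder; not a real code hash


def ether_to_wei(ether: str) -> int:
    """Convert a decimal Ether amount (given as a string) to Wei."""
    return int(Decimal(ether) * WEI_PER_ETHER)


def wei_to_ether(wei: int) -> Decimal:
    """Convert an integer Wei amount to Ether for display."""
    return Decimal(wei) / WEI_PER_ETHER


alice = AccountState(balance=ether_to_wei("1.5"))
print(alice.balance)                # 1500000000000000000
print(wei_to_ether(alice.balance))  # 1.5
```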

World state

Okay, so we know that Ethereum’s global state consists of a mapping between account addresses and the account states. This mapping is stored in a data structure known as a Merkle Patricia tree.

  • a set of intermediate nodes, where each node is the hash of its two child nodes
  • a single root node, also formed from the hash of its two child nodes, representing the top of the tree
  1. Transactions trie
  2. Receipts trie
  1. The root hash of the tree
  2. The “branch” (all of the partner hashes going up along the path from the chunk to the root)
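
The root hash and branch described above can be sketched with a plain binary Merkle tree. This is a simplification: Ethereum uses Merkle Patricia tries with Keccak-256, whereas the sketch below uses SHA-256 from Python’s standard library and duplicates the last hash when a level has an odd number of nodes (one common convention, assumed here). The function names are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Stand-in hash function (SHA-256, not Ethereum's Keccak-256)."""
    return hashlib.sha256(data).digest()


def build_levels(chunks: list) -> list:
    """Return every level of the tree: leaf hashes first, root level last."""
    level = [h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # odd count: pair the last hash with itself
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def branch(levels: list, index: int) -> list:
    """Collect the partner hash at each level on the path from a leaf to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # the other node in this pair
        proof.append((level[sibling], sibling < index))  # (hash, partner_is_left)
        index //= 2
    return proof


def verify(chunk: bytes, proof: list, root: bytes) -> bool:
    """Recompute the path to the root using only the chunk and its branch."""
    node = h(chunk)
    for partner, partner_is_left in proof:
        node = h(partner + node) if partner_is_left else h(node + partner)
    return node == root


chunks = [b"tx-a", b"tx-b", b"tx-c", b"tx-d"]
levels = build_levels(chunks)
root = levels[-1][0]
proof = branch(levels, 2)
print(verify(b"tx-c", proof, root))      # True
print(verify(b"tampered", proof, root))  # False
```

The verifier never sees the other chunks, only the root and one partner hash per level, which is why a light client can check membership without downloading the whole tree.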

Gas and payment

One very important concept in Ethereum is the concept of fees. Every computation that occurs as a result of a transaction on the Ethereum network incurs a fee — there’s no free lunch! This fee is paid in a denomination called “gas.”

There are fees for storage, too

Not only is gas used to pay for computation steps, it is also used to pay for storage usage. The total fee for storage is proportional to the smallest multiple of 32 bytes used.
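
The “smallest multiple of 32 bytes” rule is just a round-up to whole 32-byte words. A minimal sketch, where fee_per_word is a hypothetical rate chosen for illustration and not a real protocol constant:

```python
WORD_SIZE = 32  # bytes per word


def words_used(num_bytes: int) -> int:
    """Smallest number of 32-byte words that can hold num_bytes."""
    return (num_bytes + WORD_SIZE - 1) // WORD_SIZE


def storage_fee(num_bytes: int, fee_per_word: int) -> int:
    """Fee proportional to the number of words, not the exact byte count."""
    return words_used(num_bytes) * fee_per_word


print(words_used(1))   # 1
print(words_used(32))  # 1
print(words_used(33))  # 2
```

So storing 33 bytes is charged the same as storing 64: both occupy two words.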

What’s the purpose of fees?

One important aspect of the way Ethereum works is that every single operation on the network is executed by every full node. As a result, computational steps on the Ethereum Virtual Machine are very expensive. Therefore, Ethereum smart contracts are best used for simple tasks, like running simple business logic or verifying signatures and other cryptographic objects, rather than more complex uses, like file storage, email, or machine learning, which can put a strain on the network. Imposing fees prevents users from overtaxing the network.

Transaction and messages

We noted earlier that Ethereum is a transaction-based state machine. In other words, transactions occurring between different accounts are what move the global state of Ethereum from one state to the next.

  • gasPrice: the number of Wei that the sender is willing to pay per unit of gas required to execute the transaction.
  • gasLimit: the maximum amount of gas that the sender is willing to pay for executing this transaction. This amount is set and paid upfront, before any computation is done.
  • to: the address of the recipient. In a contract-creating transaction, the contract account address does not yet exist, and so an empty value is used.
  • value: the amount of Wei to be transferred from the sender to the recipient. In a contract-creating transaction, this value serves as the starting balance within the newly created contract account.
  • v, r, s: used to generate the signature that identifies the sender of the transaction.
  • init (only exists for contract-creating transactions): An EVM code fragment that is used to initialize the new contract account. init is run only once, and then is discarded. When init is first run, it returns the body of the account code, which is the piece of code that is permanently associated with the contract account.
  • data (optional field that only exists for message calls): the input data (i.e. parameters) of the message call. For example, if a smart contract serves as a domain registration service, a call to that contract might expect input fields such as the domain and IP address.
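
The gas fields above combine in a simple way: the most a sender can be charged in fees is gasLimit × gasPrice, and the sender must also be able to cover the value being transferred. The sketch below models only a subset of the fields; the class name, method names, and the placeholder address are assumptions for illustration, and the signature fields (v, r, s) are omitted.

```python
from dataclasses import dataclass
from typing import Optional

GWEI = 10**9  # 1 Gwei = 1e9 Wei


@dataclass
class Transaction:
    gas_price: int      # Wei the sender pays per unit of gas
    gas_limit: int      # maximum units of gas the sender will pay for
    to: Optional[str]   # None stands in for the empty value of a contract creation
    value: int = 0      # Wei transferred to the recipient
    data: bytes = b""   # input data for a message call

    def max_fee(self) -> int:
        """Upper bound on the fee, in Wei."""
        return self.gas_limit * self.gas_price

    def upfront_cost(self) -> int:
        """What the sender's balance must cover before execution starts."""
        return self.max_fee() + self.value


tx = Transaction(
    gas_price=20 * GWEI,
    gas_limit=50_000,
    to="0x" + "11" * 20,  # hypothetical 20-byte address
    value=10**18,         # 1 Ether
)
print(tx.max_fee())       # 1000000000000000 Wei, i.e. 0.001 Ether
print(tx.upfront_cost())  # 1001000000000000000 Wei
```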

Blocks

All transactions are grouped together into “blocks.” A blockchain contains a series of such blocks that are chained together.

  • information about the set of transactions included in that block
  • a set of other block headers for the current block’s ommers.

Ommers explained

What the heck is an “ommer?” An ommer is a block whose parent is equal to the current block’s parent’s parent. Let’s take a quick dive into what ommers are used for and why a block contains the block headers for ommers.
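
The definition can be written as a small predicate. This is only a sketch of the parent relationship stated above: the Block type and the chain lookup table are assumptions, and the check that excludes the current block’s own parent is added here because that block trivially satisfies the same relationship. Real clients apply further validity rules that are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Block:
    hash: str
    parent: Optional[str]  # hash of the parent block; None for genesis


def is_ommer(candidate: Block, current: Block, chain: dict) -> bool:
    """True if candidate's parent equals the current block's parent's parent."""
    if current.parent is None:
        return False
    parent = chain[current.parent]
    if candidate.hash == parent.hash or candidate.parent is None:
        return False  # the parent itself is not an ommer
    return candidate.parent == parent.parent


genesis = Block("G", None)
a = Block("A", "G")
a_alt = Block("A'", "G")  # competing block at the same height as A
b = Block("B", "A")
chain = {blk.hash: blk for blk in (genesis, a, a_alt, b)}

print(is_ommer(a_alt, b, chain))  # True
print(is_ommer(a, b, chain))      # False
```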

Block header

Let’s get back to blocks for a moment. We mentioned previously that every block has a block “header,” but what exactly is this?

A block header is a portion of the block consisting of:

  • ommersHash: a hash of the current block’s list of ommers
  • beneficiary: the account address that receives the fees for mining this block
  • stateRoot: the hash of the root node of the state trie (recall that this root hash is stored in the header, which makes it easy for light clients to verify anything about the state)
  • transactionsRoot: the hash of the root node of the trie that contains all transactions listed in this block
  • receiptsRoot: the hash of the root node of the trie that contains the receipts of all transactions listed in this block
  • logsBloom: a Bloom filter (data structure) that consists of log information
  • difficulty: the difficulty level of this block
  • number: the count of the current block (the genesis block has a block number of zero; the block number increases by 1 for each subsequent block)
  • gasLimit: the current gas limit per block
  • gasUsed: the sum of the total gas used by transactions in this block
  • timestamp: the unix timestamp of this block’s inception
  • extraData: extra data related to this block
  • mixHash: a hash that, when combined with the nonce, proves that this block has carried out enough computation
  • nonce: a 64-bit value that, when combined with the mixHash, proves that this block has carried out enough computation
  • transactions (transactionsRoot)
  • receipts (receiptsRoot)
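
The logsBloom field in the header list above is a Bloom filter: a compact bit field that can answer “definitely not present” or “possibly present” for a piece of log information. The sketch below shows the general technique with a 2048-bit filter that sets three bits per item. It uses SHA-256 as a stand-in hash, so the bit positions will not match Ethereum’s real filters, and the function names are illustrative.

```python
import hashlib

BLOOM_BITS = 2048


def bit_positions(item: bytes) -> list:
    """Derive three bit indexes from the first three byte pairs of a hash."""
    digest = hashlib.sha256(item).digest()  # stand-in for Keccak-256
    return [int.from_bytes(digest[i:i + 2], "big") % BLOOM_BITS for i in (0, 2, 4)]


def bloom_add(bloom: int, item: bytes) -> int:
    """Return a copy of the filter with the item's bits switched on."""
    for pos in bit_positions(item):
        bloom |= 1 << pos
    return bloom


def bloom_might_contain(bloom: int, item: bytes) -> bool:
    """False means definitely absent; True means possibly present."""
    return all((bloom >> pos) & 1 for pos in bit_positions(item))


bloom = bloom_add(0, b"Transfer")
print(bloom_might_contain(bloom, b"Transfer"))  # True
print(bloom_might_contain(0, b"Transfer"))      # False
```

Because a filter can report false positives but never false negatives, a client can use it to skip blocks that certainly contain no matching logs and inspect only the rest.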

Logs

Ethereum supports logs, which make it possible to track various transactions and messages. A contract can explicitly generate a log by defining “events” that it wants to log.

  • a series of topics that represent various events carried out by this transaction, and
  • any data associated with these events.

Transaction receipt

Logs stored in the header come from the log information contained in the transaction receipt. Just as you receive a receipt when you buy something at a store, Ethereum generates a receipt for every transaction. As you’d expect, each receipt contains certain information about the transaction. This receipt includes items like:

  • block hash
  • transaction hash
  • gas used by the current transaction
  • cumulative gas used in the current block after the current transaction has executed
  • logs created when executing the current transaction
  • …and so on

Block difficulty

The “difficulty” of a block is used to enforce consistency in the time it takes to validate blocks. The genesis block has a difficulty of 131,072, and a special formula is used to calculate the difficulty of every block thereafter. If a certain block is validated more quickly than the previous block, the Ethereum protocol increases that block’s difficulty.

Transaction execution

We’ve come to one of the most complex parts of the Ethereum protocol: the execution of a transaction. Say you send a transaction off into the Ethereum network to be processed. What happens to transition the state of Ethereum to include your transaction?

  • Valid transaction signature.
  • Valid transaction nonce. Recall that the nonce of an account is the count of transactions sent from that account. To be valid, a transaction nonce must be equal to the sender account’s nonce.
  • The transaction’s gas limit must be equal to or greater than the intrinsic gas used by the transaction. The intrinsic gas includes:
  1. a gas fee for data sent with the transaction (4 gas for every byte of data or code that equals zero, and 68 gas for every non-zero byte of data or code)
  2. if the transaction is a contract-creating transaction, an additional 32,000 gas
  • Log series: archived and indexable checkpoints of the virtual machine’s code execution.
  • Refund balance: the amount to be refunded to the sender account after the transaction. Remember how we mentioned that storage in Ethereum costs money, and that a sender is refunded for clearing up storage? Ethereum keeps track of this using a refund counter. The refund counter starts at zero and increments every time the contract deletes something in storage.
  • the gas used by the transaction is added to the block gas counter (which keeps track of the total gas used by all transactions in the block, and is useful when validating a block)
  • all accounts in the self-destruct set (if any) are deleted
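
The nonce and intrinsic-gas checks described above can be sketched as follows. The per-byte costs (4 and 68) and the 32,000 contract-creation cost are the figures quoted in this section. The 21,000 base cost is not listed above; it is taken from the Yellow Paper’s fee schedule and is flagged in the code as an addition. Signature validation is not modeled, and the function names are illustrative.

```python
G_TRANSACTION = 21_000   # base cost per transaction (from the Yellow Paper, not listed above)
G_TX_DATA_ZERO = 4       # per zero byte of data or code
G_TX_DATA_NONZERO = 68   # per non-zero byte of data or code
G_TX_CREATE = 32_000     # extra cost for a contract-creating transaction


def intrinsic_gas(data: bytes, is_contract_creation: bool) -> int:
    """Gas a transaction costs before any code is executed."""
    zero_bytes = data.count(0)
    nonzero_bytes = len(data) - zero_bytes
    gas = G_TRANSACTION
    gas += zero_bytes * G_TX_DATA_ZERO + nonzero_bytes * G_TX_DATA_NONZERO
    if is_contract_creation:
        gas += G_TX_CREATE
    return gas


def passes_basic_checks(tx_nonce: int, account_nonce: int, gas_limit: int,
                        data: bytes, is_contract_creation: bool) -> bool:
    """Nonce must match the sender's nonce; gas limit must cover intrinsic gas."""
    if tx_nonce != account_nonce:
        return False
    return gas_limit >= intrinsic_gas(data, is_contract_creation)


payload = bytes([0, 0, 1, 2])  # two zero bytes, two non-zero bytes
print(intrinsic_gas(payload, False))  # 21144
print(intrinsic_gas(payload, True))   # 53144
print(passes_basic_checks(7, 7, 21_000, payload, False))  # False: gas limit too low
```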

Contract creation

Recall that in Ethereum, there are two types of accounts: contract accounts and externally owned accounts. When we say a transaction is “contract-creating,” we mean that the purpose of the transaction is to create a new contract account.

  • If the sender sent some amount of Ether as value with the transaction, setting the account balance to that value
  • Deducting the value added to this new account’s balance from the sender’s balance
  • Setting the storage as empty
  • Setting the contract’s codeHash as the hash of an empty string

Message calls

The execution of a message call is similar to that of a contract creation, with a few differences.

Execution model

So far, we’ve learned about the series of steps that have to happen for a transaction to execute from start to finish. Now, we’ll look at how the transaction actually executes within the VM.

  • Remaining gas for computation
  • Address of the account that owns the code that is executing
  • Address of the sender of the transaction that originated this execution
  • Address of the account that caused the code to execute (could be different from the original sender)
  • Gas price of the transaction that originated this execution
  • Input data for this execution
  • Value (in Wei) passed to this account as part of the current execution
  • Machine code to be executed
  • Block header of the current block
  • Depth of the present message call or contract creation stack
PC: 0 STACK: [] MEM: [], STORAGE: {}
  • program counter
  • memory contents
  • active number of words in memory
  • stack contents.
  1. The sequence continues to process into the next loop
  2. The machine reaches a controlled halt (the end of the execution process)
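
The loop described above (update the machine state one instruction at a time until the machine halts) can be sketched with a toy stack machine. This is not the EVM: the three opcodes, their gas costs, and the tuple-based program format are assumptions chosen for illustration, and memory and storage are left out. It does show the program counter, the stack, and the remaining gas changing on every cycle.

```python
GAS_COST = {"PUSH": 3, "ADD": 3, "STOP": 0}  # illustrative costs


class OutOfGas(Exception):
    """Raised when the remaining gas cannot pay for the next instruction."""


def run_vm(code: list, gas: int) -> tuple:
    """Execute instructions until STOP or the end of the code; return (stack, gas left)."""
    pc, stack = 0, []
    while pc < len(code):
        op, *args = code[pc]
        cost = GAS_COST[op]
        if gas < cost:
            raise OutOfGas(f"need {cost} gas at pc={pc}, have {gas}")
        gas -= cost  # pay before executing the instruction
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "STOP":
            break  # controlled halt
        pc += 1  # continue into the next loop iteration
    return stack, gas


program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("STOP",)]
print(run_vm(program, gas=10))  # ([5], 1)
```

Running the same program with gas=5 raises OutOfGas on the second PUSH, since only 2 gas remains at that point.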

How a block gets finalized

Finally, let’s look at how a block of many transactions gets finalized.

Mining proof of work

The “Blocks” section briefly addressed the concept of block difficulty. The algorithm that gives meaning to block difficulty is called Proof of Work (PoW).

  • nonce is a 64-bit value that, when combined with the mixHash, proves that this block has carried out enough computation
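
To see how a nonce can prove that computation was carried out, here is a toy proof-of-work search. It is not Ethash, Ethereum’s actual PoW algorithm: it uses SHA-256 and has no mixHash. It only illustrates the general idea that a nonce is valid when the resulting hash falls at or below a target that shrinks as difficulty grows. The function names and the header bytes are made up for the example.

```python
import hashlib


def pow_hash(header: bytes, nonce: int) -> int:
    """Hash the header together with a 64-bit nonce and read it as an integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def target_for(difficulty: int) -> int:
    """Higher difficulty means a smaller target, so fewer hashes qualify."""
    return 2**256 // difficulty


def mine(header: bytes, difficulty: int) -> int:
    """Try nonces in order until one produces a hash at or below the target."""
    target = target_for(difficulty)
    nonce = 0
    while pow_hash(header, nonce) > target:
        nonce += 1
    return nonce


def verify_pow(header: bytes, nonce: int, difficulty: int) -> bool:
    """Checking a nonce takes one hash; finding it takes many."""
    return pow_hash(header, nonce) <= target_for(difficulty)


header = b"example block header"
nonce = mine(header, difficulty=1_000)
print(verify_pow(header, nonce, difficulty=1_000))  # True
```

On average the search needs about as many attempts as the difficulty value, while verification is a single hash. That asymmetry is what lets every node cheaply confirm that a miner did the work.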

Mining as a wealth distribution mechanism

Beyond providing a secure blockchain, PoW is also a way to distribute wealth to those who expend their computation for providing this security. Recall that a miner receives a reward for mining a block, including:

  • the cost of gas expended within the block by the transactions included in the block
  • an extra reward for including ommers as part of the block
  • Reduce the possibility for any single node (or small set) to make a disproportionate amount of profit. A node that can make a disproportionate amount of profit has a large influence on determining the canonical blockchain, which is troublesome because it reduces network security.

Conclusion

…Phew! You made it to the end. I hope?

I do not publish on Medium any longer. You can find my latest writings here: www.preethikasireddy.com