Ethereum Source Code Analysis: The Ethash Consensus Algorithm (Theoretical Introduction)


Introduction

Currently, Ethereum has two implementations of consensus algorithms: clique and ethash. Clique implements Proof-of-Authority (PoA), which we've covered in a previous article. Ethash, the focus of this article, implements Proof-of-Work (PoW) consensus.

Ethash is the consensus algorithm used by Ethereum's mainnet (Homestead version). Beyond basic PoW functionality, ethash also addresses mining fairness. Given the depth of content, we've split our discussion of ethash into two parts. This article focuses on the theory and design philosophy behind ethash's implementation. In the next article, we'll validate these design principles through actual source code.

What is PoW?

PoW stands for Proof of Work. Let's refer to Wikipedia's definition:

Proof of work (PoW) is an economic measure to deter denial of service attacks and other service abuses such as spam by requiring some work from the service requester, usually meaning processing time by a computer. The concept was invented by Cynthia Dwork and Moni Naor in 1993. The term "Proof of Work" was coined by Markus Jakobsson and Ari Juels in 1999. Today, it has become a mainstream consensus mechanism in cryptocurrencies like Bitcoin.

From a blockchain perspective, PoW has two key characteristics:

  1. It provides a way to prove substantial computational effort was expended to earn certain privileges.
  2. Other nodes can easily verify this proof.

Why does blockchain need PoW? Blockchain projects require new blocks to be produced at regular intervals. Since blockchain is decentralized with no central authority coordinating block production, a permissionless validation mechanism is needed where:

PoW is one such consensus mechanism. Unlike PoA (Proof of Authority), which relies on elected validators, PoW lets anyone earn block production rights by being the first to solve a computational puzzle. Neither approach is better in general; each suits different blockchain applications.

PoW's primary value is enabling permissionless participation in block production.

However, earning production rights becomes a competition of "computing power": better hardware finds valid solutions faster, and machines that lose the race have spent their computation on a block that is already stale. Thus, we can summarize PoW as:

PoW validates block production rights in permissionless blockchains. Validation requires solving computational puzzles that are hard to solve but easy to verify.

Basic PoW Implementation

Let's consider how to implement PoW based on these requirements:

  1. Deterministic verification method
  2. Computationally intensive: Hard to solve, preventing easy answers
  3. Easy to verify: Others can quickly validate solutions

Hashing naturally fits these requirements. Hashing is one-way: deriving the input from a hash is practically impossible, while computing a hash is cheap. Thus, a first design might require miners to find an input whose hash equals a prescribed value.

But realistically, strict hash matching would make block production nearly impossible. Instead, we match a characteristic of the hash, such as its value, read as an integer, falling below a threshold.

This approach ensures solutions remain findable while maintaining security.
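To make the idea concrete, here is a minimal sketch of threshold-based PoW. It is an illustration only: SHA-256, the 8-byte nonce encoding, the placeholder header bytes, and the function names are our assumptions, not anything ethash itself uses.

```python
import hashlib
from typing import Optional


def pow_hash(header: bytes, nonce: int) -> int:
    """Hash the header plus an 8-byte nonce and read the digest as an integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, threshold: int, max_tries: int = 1_000_000) -> Optional[int]:
    """Try nonces in order until the hash falls below the threshold."""
    for nonce in range(max_tries):
        if pow_hash(header, nonce) < threshold:
            return nonce
    return None  # gave up; a real miner would keep going or change the header


def verify(header: bytes, nonce: int, threshold: int) -> bool:
    """Verification costs one hash, no matter how long mining took."""
    return pow_hash(header, nonce) < threshold


header = b"example block header"  # placeholder input (assumption)
threshold = 2**256 // 4096         # roughly one hash in 4096 qualifies
nonce = mine(header, threshold)
```

Lowering `threshold` makes a qualifying nonce rarer, so the same loop takes longer on average, while `verify` stays a single hash.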

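Additionally, we need dynamic difficulty adjustment to account for improving hardware. For example, the threshold can be tightened when recent blocks arrive too quickly and relaxed when they arrive too slowly.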
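A rough sketch of one such adjustment rule follows. The 15-second target, the clamping factor of 4, and the function name are made-up values for illustration; this is not ethash's formula, which is covered later in the article.

```python
from typing import Sequence


def adjust_threshold(threshold: int,
                     recent_block_times: Sequence[int],
                     target_seconds: int = 15,
                     max_factor: int = 4) -> int:
    """Scale the threshold by (observed time / expected time), with clamping.

    Blocks arriving too fast shrink the threshold (harder puzzle);
    blocks arriving too slowly grow it (easier puzzle).
    """
    observed = sum(recent_block_times)
    expected = target_seconds * len(recent_block_times)
    # Clamp so that a single window cannot swing the difficulty too far.
    observed = min(max(observed, expected // max_factor), expected * max_factor)
    # Integer arithmetic keeps 256-bit thresholds exact.
    return max(1, threshold * observed // expected)
```

For instance, if the last ten blocks each took 30 seconds against a 15-second target, the threshold doubles, which halves the expected work per block.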

Finally, we need input data for hashing. Block headers work well since:


Thus, a basic PoW implementation might:

  1. Use cryptographic hashing
  2. Validate hash-as-integer against dynamic threshold
  3. Adjust threshold based on recent block times
  4. Hash block headers (with variable Nonce)

Ethash Implementation

While Ethereum's ethash follows similar principles, it differs significantly in where the hashed data comes from, a choice made to resist ASIC mining rigs. Let's examine the key aspects:

Verification Method

Ethereum block headers contain a Difficulty field defining the mining threshold. For validation:

  1. Compute two hash values: mix and result
  2. result (as integer) must be at most 2^256 / Difficulty
  3. mix must exactly match header.MixDigest
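The final comparison can be sketched as below. This is only the last step: we assume `result` and `mix` have already been computed from the header hash, the Nonce, and the dataset, which is not shown here. The function name is ours, and the `<=` comparison reflects our reading of the reference check rather than a quoted implementation.

```python
def check_seal(result: bytes, mix: bytes, mix_digest: bytes, difficulty: int) -> bool:
    """Check an already-computed (mix, result) pair against the header fields.

    result and mix are assumed to be 32-byte values produced elsewhere;
    mix_digest and difficulty come from the block header.
    """
    if difficulty <= 0:
        return False
    target = 2**256 // difficulty  # higher difficulty means a smaller target
    return mix == mix_digest and int.from_bytes(result, "big") <= target
```

Because the target is inversely proportional to Difficulty, doubling Difficulty halves the share of hash values that pass.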

Dynamic Difficulty Adjustment

Ethash adjusts difficulty to maintain consistent block times. Multiple adjustment methods exist across Ethereum versions:

Frontier Version

step = parent_diff // 2048
direction = 1 if block_time < 13s else -1
expAdjust = 2^((block.number//100000) - 2)
Difficulty = parent_diff + step*direction + expAdjust

The "difficulty bomb" (expAdjust) exponentially increases difficulty over time to encourage eventual PoS transition.
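The Frontier rule above can be written as a small runnable sketch. The function and constant names are ours. Two details are assumptions based on our recollection of the reference client rather than on the formula above: the minimum-difficulty floor of 131072, and applying the bomb term only once `block.number // 100000` reaches 2.

```python
DIFFICULTY_BOUND_DIVISOR = 2048
DURATION_LIMIT = 13          # seconds
BOMB_PERIOD = 100_000        # blocks
MINIMUM_DIFFICULTY = 131072  # assumed floor, not part of the formula above


def frontier_difficulty(parent_diff: int, parent_time: int,
                        block_time: int, block_number: int) -> int:
    """Difficulty of a block given its parent, following the Frontier rule."""
    step = parent_diff // DIFFICULTY_BOUND_DIVISOR
    # Fast block: raise difficulty by one step; otherwise lower it by one step.
    direction = 1 if block_time - parent_time < DURATION_LIMIT else -1
    diff = max(parent_diff + step * direction, MINIMUM_DIFFICULTY)
    # Difficulty bomb: doubles every 100,000 blocks once it becomes active.
    period = block_number // BOMB_PERIOD
    if period > 1:
        diff += 2 ** (period - 2)
    return diff
```

With a parent difficulty of 2,048,000, a 10-second block moves the difficulty up by 1,000 and a 13-second block moves it down by the same amount; the bomb term stays negligible until the period count grows large.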

Homestead Version

Modified direction calculation to prevent gaming block timestamps.
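As we understand EIP-2, the Homestead change replaces the fixed +1/-1 direction with a graded factor, `max(1 - block_time // 10, -99)`, so the adjustment depends on how slow a block was rather than on which side of a single cutoff it falls. A hedged sketch, with our own function name and an assumed minimum-difficulty floor of 131072:

```python
def homestead_difficulty(parent_diff: int, parent_time: int,
                         block_time: int, block_number: int) -> int:
    """Difficulty of a block given its parent, following our reading of EIP-2."""
    elapsed = block_time - parent_time
    # +1 for blocks under 10 s, 0 for 10-19 s, -1 for 20-29 s, ... capped at -99.
    adjustment = max(1 - elapsed // 10, -99)
    diff = parent_diff + parent_diff // 2048 * adjustment
    diff = max(diff, 131072)  # assumed minimum-difficulty floor
    # The difficulty bomb term is unchanged from Frontier.
    period = block_number // 100_000
    if period > 1:
        diff += 2 ** (period - 2)
    return diff
```

The integer division by 10 keeps the rule coarse-grained: shaving a second or two off a timestamp usually leaves the adjustment factor unchanged.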

Byzantium Fork

Delayed difficulty bomb by resetting block count offset.

Constantinople Fork

Further delayed difficulty bomb.
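The delay used by the two forks above can be pictured as subtracting a fixed offset from the block number before computing the bomb term. In the sketch below, the offsets of 3,000,000 and 5,000,000 blocks are the values we recall from EIP-649 and EIP-1234; the function and constant names are ours.

```python
EIP_649_DELAY = 3_000_000   # first bomb delay (as we recall it)
EIP_1234_DELAY = 5_000_000  # second bomb delay (as we recall it)


def bomb_term(block_number: int, delay: int = 0) -> int:
    """Exponential difficulty-bomb term, computed on a 'fake' block number."""
    fake_block_number = max(0, block_number - delay)
    period = fake_block_number // 100_000
    return 2 ** (period - 2) if period > 1 else 0
```

At block 4,370,000, for example, the undelayed term would be 2^41, while with a 3,000,000-block offset it drops to 2^11, which shows how far a delay pushes the bomb back.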

Hashing Data Sources

Ethash uses a massive (~1GB) dataset that regenerates every 30,000 blocks. This dataset:

  1. Requires significant memory for mining
  2. Can be regenerated from smaller cache (~16MB) for verification
  3. Follows a "Dagger-Hashimoto" approach:

    • Dagger: Hierarchical dataset generation from seed → cache → full dataset
    • Hashimoto: Uses header hash and Nonce with dataset to produce final hash

The memory-intensive design resists ASIC optimization while allowing efficient verification.
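To give a feel for the numbers, the sketch below computes the cache and dataset sizes for a given block, following the sizing rule as we recall it from the Ethash specification: start from a base size, grow linearly per epoch, then step down until the number of rows is prime. The constant and function names are ours, and only sizes are computed; generating the cache or dataset contents is not shown.

```python
EPOCH_LENGTH = 30_000          # blocks per epoch
HASH_BYTES = 64                # size of one cache row
MIX_BYTES = 128                # size of one dataset row
CACHE_BYTES_INIT = 2**24       # ~16 MB at epoch 0
CACHE_BYTES_GROWTH = 2**17
DATASET_BYTES_INIT = 2**30     # ~1 GB at epoch 0
DATASET_BYTES_GROWTH = 2**23


def is_prime(n: int) -> bool:
    """Trial division; adequate for the row counts involved here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def epoch(block_number: int) -> int:
    return block_number // EPOCH_LENGTH


def cache_size(block_number: int) -> int:
    size = CACHE_BYTES_INIT + CACHE_BYTES_GROWTH * epoch(block_number) - HASH_BYTES
    while not is_prime(size // HASH_BYTES):
        size -= 2 * HASH_BYTES
    return size


def dataset_size(block_number: int) -> int:
    size = DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch(block_number) - MIX_BYTES
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES
    return size
```

For the first epoch this yields sizes just under 2^24 and 2^30 bytes, matching the ~16MB cache and ~1GB dataset mentioned above, and both grow with each new epoch.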

FAQ

Why resist ASIC miners?

ASICs lead to mining centralization where large operators could potentially control network decisions. Ethereum's memory-hard design makes ASICs less cost-effective.

How does the difficulty bomb work?

Exponential difficulty increases encourage transition to PoS by making mining increasingly difficult over time.

Why change difficulty formulas?

Different versions addressed various issues like timestamp manipulation and delayed PoS readiness.


Conclusion

This article analyzed ethash's theoretical foundations, including: