Blockchain Security In Depth: Mechanisms of 51% Attacks, Selfish Mining, and Sybil Attacks

Lesson 925min5,191 chars

Learning Objectives

  • 51% 공격의 실행 과정과 성공 조건을 시뮬레이션으로 입증할 수 있다
  • 가장 긴 체인 규칙을 구현하고 포크(Fork) 발생 시 체인 선택 로직을 코딩할 수 있다
  • 51% 공격·이기적 채굴·시빌 공격의 차이점과 각각의 방어 메커니즘을 비교·설명할 수 있다

Blockchain Security Deep Dive: Mechanisms of 51% Attacks, Selfish Mining, and Sybil Attacks

In August 2020, the Ethereum Classic (ETC) network was overturned three times within 72 hours. The attacker invested $200K and walked away with $5.6M. ROI: 2,700%.

In Lesson 8, we learned how to compress thousands of transactions into a single hash using Merkle trees. Like a "tournament bracket," we recursively combined hash pairs to verify transaction inclusion in O(log n). This structure allows the blockchain to efficiently guarantee data integrity. But integrity alone isn't enough. What if someone replaces the entire chain itself?

Today we view the blockchain through an attacker's eyes. I watched the ETC 51% attacks unfold in real time back in 2020. I was auditing a DeFi project at the time, and watching the chain itself get flipped made me viscerally understand that "smart contract security alone isn't enough." Blockchain security starts at the consensus mechanism level.


Where It Began: Ethereum Classic's Nightmare

The Real Incident — 51% Attack by the Numbers

August 2020, Ethereum Classic network.

ItemValue
Number of attacks (August 2020)3 consecutive
First attack reorganization depth3,693 blocks
Total double-spend damage~$5.6M
Estimated attack cost (hashpower rental)~$200K
Return on investment~2,700%

The attacker rented mining capacity from hashpower marketplaces like NiceHash. Think back to Lesson 5's proof of work (PoW) — the process of spinning a nonce to find a hash below the difficulty target. The attacker rented enough hashpower to perform this "sudoku solving" faster than the entire network combined.

The process went like this:

  1. Deposit 10,000 ETC to an exchange and convert to another coin
  2. Secretly mine an alternate chain without the deposit transaction
  3. Once the alternate chain becomes longer, broadcast it to the network
  4. The longest chain rule implemented in Lesson 6 causes the network to adopt the alternate chain
  5. The original deposit transaction disappears — double spend complete

$200K invested, $5.6M earned. Far more "profitable" than honest mining.

🤔 Think about it: Why doesn't this attack happen on Bitcoin (BTC)? What is the decisive difference between ETC and BTC?

View answer

The difference is the scale of hashrate. As of 2020, Bitcoin's hashrate was roughly 10,000 times that of ETC. A 51% attack on Bitcoin would require millions of dollars per hour in electricity costs and more than half of all ASIC miners worldwide. Economically, it's near impossible. PoW security is ultimately proportional to total hashrate. The smaller the network, the more vulnerable it is.


51% Attack Mechanics: Understanding Through Code

Let's dig into the underlying principles of the ETC incident. Exactly how much hashpower does an attacker need to succeed?

Attack Success Probability — Nakamoto's Formula

Satoshi Nakamoto calculated the probability that an attacker can catch up to the honest chain in the Bitcoin whitepaper. The key is the Random Walk problem. It's similar to a drunk person staggering toward a cliff — being one step ahead doesn't mean you're safe. Even if you're a few steps behind, you can catch up if you're lucky.

The code below calculates the probability of overtaking based on the attacker's hashpower ratio and number of confirmations. It directly implements the Poisson distribution approximation from Nakamoto's whitepaper.

# 51_percent_probability.py
# Calculate overtaking probability based on attacker's hashpower ratio

def attack_success_probability(q, z):
    """
    q: Attacker's hashpower ratio (0.0 ~ 1.0)
    z: Number of confirmations (blocks the honest chain is ahead)
    Returns: Probability that the attacker catches up to the honest chain
    """
    p = 1.0 - q  # Honest miners' hashpower ratio
    if q >= p:
        return 1.0  # Always succeeds with majority

    # Poisson distribution approximation from Nakamoto's whitepaper
    lam = z * (q / p)  # λ of the Poisson distribution
    total = 1.0
    poisson = 1.0
    for k in range(z + 1):
        if k > 0:
            poisson *= lam / k
        prob_poisson = poisson * (2.718281828 ** (-lam))
        # Probability of overtaking the remaining blocks after catching up to k
        remaining = (q / p) ** (z - k) if z > k else 1.0
        total -= prob_poisson * (1.0 - remaining)
    return max(0.0, total)

# Test various scenarios
print("=== Attack Success Probability by Confirmation Count ===")
print(f"{'Hashpower':>9} | {'1 conf':>8} | {'3 conf':>8} | {'6 conf':>8}")
print("-" * 45)
for q in [0.1, 0.2, 0.3, 0.4, 0.45, 0.51]:
    p1 = attack_success_probability(q, 1)
    p3 = attack_success_probability(q, 3)
    p6 = attack_success_probability(q, 6)
    print(f"{q*100:>8.0f}% | {p1:>7.2%} | {p3:>7.2%} | {p6:>7.2%}")
# Expected output:
=== Attack Success Probability by Confirmation Count ===
Hashpower |   1 conf |   3 conf |   6 conf
---------------------------------------------
      10% |  12.56% |   1.85% |   0.03%
      20% |  27.44% |  10.17% |   1.18%
      30% |  45.07% |  26.13% |   8.04%
      40% |  66.09% |  50.56% |  29.66%
      45% |  77.29% |  67.03% |  50.23%
      51% | 100.00% | 100.00% | 100.00%

Look at this table carefully. With 51% or more hashpower, success is guaranteed regardless of confirmation count. But the truly scary rows are below that. With just 30%, there's an 8% chance of success at 6 confirmations. The name "51% attack" makes it easy to assume a majority is required, but in reality it's a probability game.

🤔 Think about it: Why does Bitcoin use exactly 6 confirmations as its standard? Find the answer in the table above.

View answer

For an attacker with 10% hashpower, the success probability at 6 confirmations is 0.03% — essentially zero. Since acquiring even 10% of Bitcoin's hashpower in practice costs an astronomical sum, 6 confirmations (about 1 hour) is a practical compromise between cost and safety. This is also why different exchanges require different confirmation counts — the lower a coin's hashrate, the more confirmations are required.

51% Attack Simulation Code

The probability formula alone may not make it click. Let's simulate the attack directly using the PoW from Lesson 5 and the chain structure from Lesson 6. The code below creates both an honest chain and an attacker's chain, then shows which one wins under the longest chain rule.

# attack_demo.py
# Simplified 51% attack simulation

import hashlib
import time
import random

def simple_hash(data):
    """Simple hash function"""
    return hashlib.sha256(data.encode()).hexdigest()

def mine_block(prev_hash, data, difficulty=2):
    """PoW mining from Lesson 5 — spin a nonce to find leading zeros"""
    nonce = 0
    target = "0" * difficulty
    while True:
        block_str = f"{prev_hash}{data}{nonce}"
        block_hash = simple_hash(block_str)
        if block_hash[:difficulty] == target:
            return {"prev_hash": prev_hash, "data": data,
                    "nonce": nonce, "hash": block_hash}
        nonce += 1

def build_chain(genesis_hash, blocks_data, difficulty=2):
    """Mine blocks consecutively to build a chain"""
    chain = []
    prev = genesis_hash
    for data in blocks_data:
        block = mine_block(prev, data, difficulty)
        chain.append(block)
        prev = block["hash"]
    return chain

# Genesis block
genesis = simple_hash("genesis")

# Honest chain: includes a transfer of 5 BTC from A to B
print("⛏️ Mining honest chain...")
honest_chain = build_chain(genesis, [
    "tx: A→B 5BTC",
    "tx: C→D 2BTC",
    "tx: E→F 1BTC"
], difficulty=2)

# Attacker's chain: branches from the same point but without the A→B transaction!
print("😈 Mining attacker's chain...")
attacker_chain = build_chain(genesis, [
    "tx: A→A 5BTC",   # Send back to self (double spend!)
    "tx: G→H 3BTC",
    "tx: I→J 4BTC",
    "tx: K→L 1BTC"    # One more block mined! (longer chain)
], difficulty=2)

print(f"\nHonest chain length: {len(honest_chain)} blocks")
print(f"Attacker chain length: {len(attacker_chain)} blocks")
print(f"\n🏆 Chain selected by network: "
      f"{'Attacker ❌' if len(attacker_chain) > len(honest_chain) else 'Honest ✅'}")
print("→ Attacker's chain adopted by the longest chain rule!")
print("→ A→B 5BTC transaction erased, replaced by A→A 5BTC!")
# Expected output:
⛏️ Mining honest chain...
😈 Mining attacker's chain...

Honest chain length: 3 blocks
Attacker chain length: 4 blocks

🏆 Chain selected by network: Attacker ❌
→ Attacker's chain adopted by the longest chain rule!
→ A→B 5BTC transaction erased, replaced by A→A 5BTC!

Remember how add_block() in Lesson 6 linked each block to the previous block's hash? The attacker follows the exact same rules while building an alternate chain with different transactions. It's not a rule violation. It's exploitation of the rules.


The Longest Chain Rule and Chain Selection

So why does this rule exist? If the "longest chain wins" rule enables attacks, why not remove it?

Why "Longest Chain = Correct Chain"?

In Lesson 5 we learned the key to PoW — "hard to solve, easy to verify." A longer chain is evidence that more computational work was invested. If the majority of hashpower is honest, the honest chain will always be the longest. This is the fundamental assumption of Nakamoto consensus. Attacks are only possible when this assumption breaks — when the honest side falls below a majority.

In the diagram above, the chain forks after Block 2. The honest chain extends to Block 4, while the attacker's chain reaches Block 5'. Under the longest chain rule, the red chain wins.

Implementing resolve_conflicts

Let's put this rule into code. We add a key method to the Blockchain class from Lesson 6. This is the logic for selecting the longest valid chain when multiple nodes hold different chains. is_valid_chain verifies both PoW and hash linkage, and resolve_conflicts adopts the longest among the valid chains.

# resolve_conflicts.py
# Chain conflict resolution logic that selects the longest valid chain

import hashlib

def simple_hash(text):
    return hashlib.sha256(text.encode()).hexdigest()

def is_valid_chain(chain, difficulty=2):
    """Verify that all blocks in the chain are valid"""
    target = "0" * difficulty
    for i, block in enumerate(chain):
        # Check if hash satisfies the difficulty condition (PoW verification from Lesson 5)
        block_str = f"{block['prev_hash']}{block['data']}{block['nonce']}"
        computed = simple_hash(block_str)
        if computed != block["hash"] or not computed.startswith(target):
            return False
        # Check that linkage to previous block is correct (chain integrity from Lesson 6)
        if i > 0 and block["prev_hash"] != chain[i-1]["hash"]:
            return False
    return True

def resolve_conflicts(our_chain, other_chains, difficulty=2):
    """
    Select the longest valid chain.
    Returns: (selected chain, whether replacement occurred)
    """
    best_chain = our_chain
    replaced = False

    for chain in other_chains:
        # Only consider chains that are longer and valid
        if len(chain) > len(best_chain) and is_valid_chain(chain, difficulty):
            best_chain = chain
            replaced = True

    return best_chain, replaced

# Test: scenario where 3 nodes each hold a different chain
from attack_demo import build_chain, simple_hash as sh

genesis = sh("genesis")
node_a_chain = build_chain(genesis, ["tx1", "tx2", "tx3"])        # length 3
node_b_chain = build_chain(genesis, ["tx1", "tx4", "tx5", "tx6"]) # length 4
node_c_chain = build_chain(genesis, ["tx1", "tx2"])               # length 2

result, was_replaced = resolve_conflicts(
    node_a_chain, [node_b_chain, node_c_chain]
)
print(f"Node A chain length: {len(node_a_chain)}")
print(f"Node B chain length: {len(node_b_chain)}")
print(f"Node C chain length: {len(node_c_chain)}")
print(f"Selected chain length: {len(result)}")
print(f"Chain replaced: {was_replaced}")
# Expected output:
Node A chain length: 3
Node B chain length: 4
Node C chain length: 2
Selected chain length: 4
Chain replaced: True

🤔 Think about it: What problem arises if is_valid_chain only compares length without PoW verification?

View answer

An attacker could rapidly generate fake blocks without proof of work. Without finding a nonce, any value can be plugged in to fabricate a chain — creating an arbitrarily long chain with zero actual computational cost. PoW verification is an essential step that confirms "was real computational work actually invested in this chain?" This is exactly the verification side of "hard to solve, easy to verify" from Lesson 5.


Exchange Deposit Confirmation: ❌ → 🤔 → ✅ Step-by-Step Improvement

The biggest victims of the ETC attack were exchanges. They confirmed deposits too quickly. Let's compare three approaches to how an exchange actually handles deposit transactions. You'll feel firsthand why the attack probability formula we learned earlier matters in practice.

❌ WRONG WAY: Confirm deposit as soon as the transaction appears

# ❌ Never do this
# Immediately credit balance as soon as a transaction appears in the mempool

def process_deposit_wrong(tx_hash, amount):
    """Immediately credit unconfirmed mempool transactions — the worst approach"""
    tx = get_transaction(tx_hash)
    if tx:
        # No confirmation check! Doesn't even verify inclusion in a block!
        user_balance[tx.receiver] += amount
        enable_withdrawal(tx.receiver)  # Allow immediate withdrawal
        print(f"✅ {amount} coins deposited — withdrawal enabled")

Why is this dangerous? Transactions in the mempool haven't been included in any block yet. An attacker can create another transaction spending the same UTXO to replace the original (Replace-By-Fee). Trusting a transaction that hasn't even been included in a block is equivalent to receiving a check and handing over cash without verifying it won't bounce.

🤔 BETTER: Wait for a fixed confirmation count

# 🤔 Better, but still insufficient
# Apply the same confirmation count to all coins

FIXED_CONFIRMATIONS = 6  # 6 confirmations for all coins

def process_deposit_better(tx_hash, amount):
    """Confirm deposit after fixed number of confirmations — improved but incomplete"""
    tx = get_transaction(tx_hash)
    if tx and tx.confirmations >= FIXED_CONFIRMATIONS:
        user_balance[tx.receiver] += amount
        enable_withdrawal(tx.receiver)
        print(f"✅ {amount} coins deposited ({FIXED_CONFIRMATIONS} confirmations verified)")
    else:
        print(f"⏳ Waiting... ({tx.confirmations}/{FIXED_CONFIRMATIONS} confirmations)")

Why is this insufficient? 6 Bitcoin confirmations are safe. But is 6 confirmations enough for a small coin whose hashrate is 1/10,000 of Bitcoin's? Absolutely not. This is precisely why the ETC attack succeeded — exchanges applied the same standard to every coin. Look at the probability table again. Even with the same 6 confirmations, the difficulty of attacking differs enormously based on network hashrate.

✅ BEST: Dynamic confirmations based on coin hashrate and amount

# ✅ Production-level — dynamically adjust confirmation count based on coin and amount

# Per-coin security profiles (based on hourly attack cost)
COIN_SECURITY = {
    "BTC":  {"attack_cost_per_hour": 1_000_000, "base_confirms": 6},
    "ETH":  {"attack_cost_per_hour": 500_000,   "base_confirms": 12},
    "ETC":  {"attack_cost_per_hour": 5_000,     "base_confirms": 40_000},
    "BTG":  {"attack_cost_per_hour": 1_200,     "base_confirms": 60_000},
}

def required_confirmations(coin, amount_usd):
    """
    Calculate required confirmations by comparing deposit amount to coin's attack cost.
    Core principle: set confirmations so that attack cost > attack profit.
    """
    profile = COIN_SECURITY.get(coin)
    if not profile:
        return 100  # Be conservative for unknown coins

    base = profile["base_confirms"]
    cost_per_hour = profile["attack_cost_per_hour"]

    # If the amount exceeds 10% of hourly attack cost, scale up confirmations proportionally
    if amount_usd > cost_per_hour * 0.1:
        # Add confirmations so the cost is at least 10x the attack profit
        risk_multiplier = amount_usd / (cost_per_hour * 0.1)
        extra = int(base * risk_multiplier * 0.5)
        return base + extra
    return base

def process_deposit_best(tx_hash, coin, amount, amount_usd):
    """Dynamic confirmation count + amount-based risk assessment"""
    required = required_confirmations(coin, amount_usd)
    tx = get_transaction(tx_hash)

    if tx and tx.confirmations >= required:
        user_balance[tx.receiver] += amount
        # Apply additional withdrawal delay (cooldown) for large deposits
        if amount_usd > 50_000:
            enable_withdrawal_with_delay(tx.receiver, delay_hours=24)
            print(f"✅ {amount} {coin} deposited — withdrawal available after 24 hours")
        else:
            enable_withdrawal(tx.receiver)
            print(f"✅ {amount} {coin} deposited ({required} confirmations verified)")
    else:
        confirmations = tx.confirmations if tx else 0
        print(f"⏳ Waiting... ({confirmations}/{required} confirmations)")

# Same $50,000 deposit, but required confirmations differ completely by coin
print("=== Required Confirmations for $50,000 Deposit by Coin ===")
for coin in ["BTC", "ETH", "ETC", "BTG"]:
    confirms = required_confirmations(coin, 50_000)
    cost = COIN_SECURITY[coin]["attack_cost_per_hour"]
    print(f"{coin:>4}: {confirms:>6} confirmations required  (hourly attack cost: ${cost:>10,})")
# Expected output:
=== Required Confirmations for $50,000 Deposit by Coin ===
 BTC:      6 confirmations required  (hourly attack cost: $ 1,000,000)
 ETH:     12 confirmations required  (hourly attack cost: $   500,000)
 ETC:  240000 confirmations required  (hourly attack cost: $     5,000)
 BTG:  870000 confirmations required  (hourly attack cost: $     1,200)

Sending $50,000 in BTC requires only 6 confirmations (about 1 hour), but the same amount in ETC requires waiting 240,000 confirmations (several days). Coinbase actually adopted this approach after the ETC attack. This is economic deterrence implemented in code — it ensures that the cost of renting hashpower while waiting for confirmations exceeds the potential profit.


Selfish Mining: Dangerous Even Without 51%

A 51% attack is a frontal assault — brute force. But in 2013, it was discovered that unfair profits could be gained with far less hashpower.

The Core Strategy

A 2013 paper by Eyal & Sirer at Cornell shocked the blockchain community. It proved that with just 33% hashpower, a miner could earn more rewards than honest mining.

Think of it like poker: instead of betting immediately when you get a good hand, you wait for your opponent to raise before flipping everything at once.

  1. When you mine a block, don't broadcast it immediately (keep it secret)
  2. Once your secret chain is 1 block ahead of the public chain, wait for an honest node to mine a block
  3. When an honest block appears, immediately publish your secret chain — creating a fork of equal length
  4. Use your network influence to get your chain selected

Profitability Simulation

It sounds plausible in theory, but is it actually profitable? The simulation below compares the profitability of selfish vs. honest mining at various hashpower ratios. With strategy="selfish", the secret chain is published when it's 2 or more blocks ahead, and when tied at 1 block difference, the outcome is decided by 50% probability.

# selfish_mining.py
# Profitability comparison: selfish mining vs honest mining simulation

import random

def simulate_mining(attacker_ratio, rounds=10000, strategy="honest"):
    """
    attacker_ratio: Attacker's hashpower ratio
    strategy: "honest" or "selfish"
    Returns: Fraction of blocks obtained by the attacker
    """
    attacker_blocks = 0
    honest_blocks = 0
    secret_chain = 0  # Length of selfish miner's secret chain

    for _ in range(rounds):
        # Determine who found the block based on hashpower ratio
        attacker_found = random.random() < attacker_ratio

        if strategy == "honest":
            if attacker_found:
                attacker_blocks += 1
            else:
                honest_blocks += 1
        else:  # selfish mining
            if attacker_found:
                secret_chain += 1
                # If secret chain is 2 or more ahead, publish to invalidate honest chain
                if secret_chain >= 2:
                    attacker_blocks += secret_chain
                    secret_chain = 0
            else:
                if secret_chain == 1:
                    # Competing state: 50% chance selfish miner's block is adopted
                    if random.random() < 0.5:
                        attacker_blocks += 1
                    else:
                        honest_blocks += 1
                    secret_chain = 0
                else:
                    honest_blocks += 1

    total = attacker_blocks + honest_blocks
    return attacker_blocks / total if total > 0 else 0

# Compare at various hashpower ratios
print("=== Selfish Mining vs Honest Mining Profitability ===")
print(f"{'Hashpower':>9} | {'Honest Rev':>10} | {'Selfish Rev':>11} | {'Gain':>8}")
print("-" * 50)
for ratio in [0.1, 0.2, 0.25, 0.33, 0.4, 0.45]:
    honest_rev = simulate_mining(ratio, 50000, "honest")
    selfish_rev = simulate_mining(ratio, 50000, "selfish")
    diff = selfish_rev - honest_rev
    marker = "✅ Gain" if diff > 0.005 else "❌ Loss"
    print(f"{ratio*100:>8.0f}% | {honest_rev:>9.1%} | {selfish_rev:>10.1%} | {marker}")
# Expected output (approximate due to randomness):
=== Selfish Mining vs Honest Mining Profitability ===
Hashpower |  Honest Rev |  Selfish Rev |     Gain
--------------------------------------------------
      10% |      10.0% |        5.2% | ❌ Loss
      20% |      20.1% |       14.8% | ❌ Loss
      25% |      25.0% |       21.3% | ❌ Loss
      33% |      33.0% |       33.8% | ✅ Gain
      40% |      40.0% |       44.2% | ✅ Gain
      45% |      45.1% |       51.7% | ✅ Gain

The line flips at 33%. In my experience, this result matters in practice. When auditing DeFi protocols, I always consider the scenario: "What if an MEV (Maximal Extractable Value) attacker combines selfish mining strategies?" The barrier to entry is far lower than 51%.

🔍 Deep Dive: Why Is 33% the Threshold?

The key in the mathematical analysis of selfish mining is expected reward. When the secret chain is 2 blocks ahead, publishing it invalidates 1 honest block. For this strategy to be more profitable than honest mining, the probability of finding 2 consecutive blocks first must be high enough.

Calculating probabilistically: when the attacker's ratio is q, the probability of 2 consecutive blocks is . For this to exceed honest mining's expected reward q, we must also account for network propagation ability (γ). At γ=0.5 (propagation to half the network first), the threshold is approximately 1/3 ≈ 33%.


Sybil Attack: Deceiving Through the Power of Numbers

Both 51% attacks and selfish mining use "computational power" as a weapon. The Sybil attack takes a completely different approach. Instead of computation, identity forgery is the weapon.

What If You Create 1,000 Identities?

One entity creates thousands of fake identities (nodes) to dominate the network — this is the essence of a Sybil attack. The name comes from the case of a patient named "Sybil" who had multiple personality disorder.

When we studied the UTXO model in Lesson 7, we tracked the inputs and outputs of each transaction. If consensus were vote-based, a Sybil attacker with 1,000 fake nodes could manipulate the vote. The code below illustrates this — even if 10 honest nodes vote for "BlockA," if 20 fake nodes vote for "BlockB_malicious," the malicious block gets adopted.

# sybil_demo.py
# Sybil attack: vote-based consensus vs PoW-based consensus

class SimpleVoteConsensus:
    """Vote-based consensus — vulnerable to Sybil attacks!"""
    def __init__(self):
        self.nodes = {}  # {nodeID: vote}

    def register_node(self, node_id):
        self.nodes[node_id] = None

    def vote(self, node_id, choice):
        if node_id in self.nodes:
            self.nodes[node_id] = choice

    def result(self):
        votes = {}
        for choice in self.nodes.values():
            if choice:
                votes[choice] = votes.get(choice, 0) + 1
        return max(votes, key=votes.get) if votes else None

# Scenario: 10 honest nodes vs Sybil attacker
consensus = SimpleVoteConsensus()

# Register & vote for 10 honest nodes
for i in range(10):
    consensus.register_node(f"honest_{i}")
    consensus.vote(f"honest_{i}", "BlockA")

# 🚨 Sybil attack: create 20 fake nodes!
for i in range(20):
    consensus.register_node(f"sybil_{i}")
    consensus.vote(f"sybil_{i}", "BlockB_malicious")

print(f"Total node count: {len(consensus.nodes)}")
print(f"Honest nodes: 10, Sybil nodes: 20")
print(f"Vote result: {consensus.result()}")
print(f"→ Sybil attack successful! Malicious block adopted 😱")
# Expected output:
Total node count: 30
Honest nodes: 10, Sybil nodes: 20
Vote result: BlockB_malicious
→ Sybil attack successful! Malicious block adopted 😱

How PoW Defends Against Sybil Attacks

Does the same attack work on a PoW-based network? Let's verify with the code below. In PoW, hashpower is the voting right, not node count. Creating 1,000 nodes doesn't increase your say if the total hashpower is the same.

# pow_vs_sybil.py
# Why Sybil attacks are meaningless in PoW

class PoWConsensus:
    """PoW-based consensus — hashpower is voting right"""
    def __init__(self):
        self.miners = {}  # {minerID: hashpower}

    def register_miner(self, miner_id, hashpower):
        self.miners[miner_id] = hashpower

    def total_hashpower(self):
        return sum(self.miners.values())

    def win_probability(self, miner_id):
        total = self.total_hashpower()
        return self.miners.get(miner_id, 0) / total if total > 0 else 0

pow_net = PoWConsensus()

# 10 honest miners, each with 100 TH/s
for i in range(10):
    pow_net.register_miner(f"honest_{i}", 100)

# Sybil attack: create 1000 nodes... but total hashpower is only 100 TH/s
for i in range(1000):
    pow_net.register_miner(f"sybil_{i}", 0.1)  # Total: 100 TH/s

total = pow_net.total_hashpower()
honest_power = sum(v for k, v in pow_net.miners.items() if "honest" in k)
sybil_power = sum(v for k, v in pow_net.miners.items() if "sybil" in k)

print(f"Total hashpower: {total} TH/s")
print(f"Honest miner hashpower: {honest_power} TH/s ({honest_power/total:.0%})")
print(f"Sybil node hashpower: {sybil_power} TH/s ({sybil_power/total:.0%})")
print(f"Sybil node count: 1000")
print(f"\n→ 10x more nodes, but same hashpower = same say")
print(f"→ Sybil attacks are meaningless in PoW! 🛡️")
# Expected output:
Total hashpower: 1100.0 TH/s
Honest miner hashpower: 1000 TH/s (91%)
Sybil node hashpower: 100.0 TH/s (9%)
Sybil node count: 1000

→ 10x more nodes, but same hashpower = same say
→ Sybil attacks are meaningless in PoW! 🛡️

In Lesson 5, we compared PoW to "solving a sudoku puzzle." A Sybil attack is like submitting 1,000 entry forms to a sudoku competition. No matter how many forms you submit, it doesn't help you solve sudoku faster. You need actual computational ability.

🤔 Think about it: How does PoS (Proof of Stake) defend against Sybil attacks? Is it the same principle as PoW?

View answer

Same principle! Just as "hashpower" is the voting right in PoW, the amount of staked coins is the voting right in PoS. Creating 1,000 nodes doesn't increase your say if the total staked amount is the same. The core principle is linking voting rights to a scarce resource (electricity/coins) to make unlimited duplication impossible.


Attack Comparison and Defense Strategies

Placing the three attacks side by side makes each one's character distinct.

Attack TypeRequired ResourceGoalDefense MechanismReal-World Examples
51% AttackMajority hashpowerDouble spendHigh hashrate, multiple confirmationsETC 2020, BTG 2018
Selfish Mining33%+ hashpowerUnfair rewardsOptimized block propagation, Uncle rewardsTheoretical (no confirmed large-scale cases)
Sybil AttackMany fake identitiesNetwork dominationPoW/PoS, identity costP2P networks in general

3 Core Defense Principles

1. Economic Deterrence If attack cost > attack profit, there's no reason to attack. This is exactly why Bitcoin's hashrate is so high — the higher the wall, the more expensive the siege.

2. Confirmation Depth Look again at the probability table earlier in this article. The deeper the confirmation, the exponentially lower the attack success probability.

3. Decentralization In Lesson 8 we learned how SPV clients using Merkle trees can verify transactions without downloading the full block. This kind of lightweight verification enables more nodes to participate, decentralizes the network, and raises the cost of attacks.


🔨 Project Update

This lesson adds two things:

  1. Add resolve_conflicts() method to blockchain.py
  2. attack_simulation.py — 51% attack demonstration and safety experiments by confirmation count

blockchain.py (Cumulative code — new additions marked with # 🆕 Lesson 9)

# blockchain.py — Lessons 1~9 cumulative project

import hashlib
import time
import json

# ═══ Lesson 1: Hash Functions ═══
def sha256_hash(data):
    """Calculate SHA-256 hash"""
    return hashlib.sha256(data.encode()).hexdigest()

# ═══ Lesson 3: Transactions ═══
class Transaction:
    def __init__(self, sender, receiver, amount):
        self.sender = sender
        self.receiver = receiver
        self.amount = amount
        self.timestamp = time.time()

    def to_dict(self):
        return {
            "sender": self.sender,
            "receiver": self.receiver,
            "amount": self.amount,
            "timestamp": self.timestamp
        }

    def calculate_hash(self):
        tx_string = json.dumps(self.to_dict(), sort_keys=True)
        return sha256_hash(tx_string)

# ═══ Lesson 8: Merkle Tree ═══
def build_merkle_root(tx_hashes):
    """Calculate Merkle root from a list of transaction hashes"""
    if not tx_hashes:
        return sha256_hash("")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # Duplicate last if odd count
        next_level = []
        for i in range(0, len(level), 2):
            combined = sha256_hash(level[i] + level[i+1])
            next_level.append(combined)
        level = next_level
    return level[0]

# ═══ Lessons 4+5: Block ═══
class Block:
    def __init__(self, index, transactions, previous_hash, difficulty=2):
        self.index = index
        self.timestamp = time.time()
        self.transactions = transactions
        self.previous_hash = previous_hash
        self.difficulty = difficulty
        self.nonce = 0
        # Lesson 8: Merkle root
        tx_hashes = [tx.calculate_hash() for tx in transactions] if transactions else []
        self.merkle_root = build_merkle_root(tx_hashes)
        self.hash = self.mine()

    def calculate_hash(self):
        block_data = (f"{self.index}{self.timestamp}{self.merkle_root}"
                      f"{self.previous_hash}{self.nonce}")
        return sha256_hash(block_data)

    def mine(self):
        """Lesson 5: Proof of Work — find leading zeros"""
        target = "0" * self.difficulty
        while True:
            self.hash = self.calculate_hash()
            if self.hash[:self.difficulty] == target:
                return self.hash
            self.nonce += 1

    def to_dict(self):
        return {
            "index": self.index,
            "timestamp": self.timestamp,
            "transactions": [tx.to_dict() for tx in self.transactions],
            "previous_hash": self.previous_hash,
            "merkle_root": self.merkle_root,
            "nonce": self.nonce,
            "hash": self.hash,
            "difficulty": self.difficulty
        }

# ═══ Lessons 6+7+9: Blockchain ═══
class Blockchain:
    def __init__(self, difficulty=2):
        self.difficulty = difficulty
        self.chain = []
        self.pending_transactions = []
        self.utxo_pool = {}  # Lesson 7: UTXO pool
        self._create_genesis_block()

    def _create_genesis_block(self):
        genesis = Block(0, [], "0" * 64, self.difficulty)
        self.chain.append(genesis)

    def get_latest_block(self):
        return self.chain[-1]

    def add_transaction(self, sender, receiver, amount):
        tx = Transaction(sender, receiver, amount)
        self.pending_transactions.append(tx)
        return tx

    def mine_pending(self, miner_address):
        """Mine pending transactions into a block"""
        # Mining reward transaction
        reward_tx = Transaction("NETWORK", miner_address, 50)
        txs = self.pending_transactions + [reward_tx]

        new_block = Block(
            len(self.chain),
            txs,
            self.get_latest_block().hash,
            self.difficulty
        )
        self.chain.append(new_block)
        self.pending_transactions = []
        return new_block

    def is_chain_valid(self):
        """Lesson 6: Verify chain integrity"""
        for i in range(1, len(self.chain)):
            current = self.chain[i]
            previous = self.chain[i - 1]
            if current.previous_hash != previous.hash:
                return False
            if not current.hash.startswith("0" * self.difficulty):
                return False
        return True

    # 🆕 Lesson 9: Select the longest valid chain
    def resolve_conflicts(self, other_chains):
        """
        Compares against other nodes' chains and adopts the longest valid chain.
        other_chains: list of Blockchain instances
        Returns: True if our chain was replaced, False if it was kept
        """
        longest_chain = self.chain
        replaced = False

        for other_bc in other_chains:
            other_chain = other_bc.chain
            # Only adopt chains that are longer and valid
            if len(other_chain) > len(longest_chain) and other_bc.is_chain_valid():
                longest_chain = other_chain
                replaced = True

        if replaced:
            self.chain = longest_chain
        return replaced

    def __len__(self):
        return len(self.chain)

    def print_chain(self):
        for block in self.chain:
            tx_summary = ", ".join(
                f"{tx.sender}{tx.receiver}:{tx.amount}"
                for tx in block.transactions
            ) or "genesis"
            print(f"  Block {block.index} | {block.hash[:16]}... | [{tx_summary}]")


if __name__ == "__main__":
    print("=== PyChain Test ===")
    bc = Blockchain(difficulty=2)
    bc.add_transaction("Alice", "Bob", 5)
    bc.add_transaction("Bob", "Charlie", 2)
    bc.mine_pending("Miner1")
    bc.add_transaction("Charlie", "Dave", 1)
    bc.mine_pending("Miner2")
    bc.print_chain()
    print(f"Chain validity: {bc.is_chain_valid()}")
    print(f"Chain length: {len(bc)}")

attack_simulation.py (🆕 Written new this lesson)

# attack_simulation.py — 51% attack demonstration and safety experiments by confirmation count

from blockchain import Blockchain
import time

def simulate_51_percent_attack(difficulty=2, honest_blocks=3, attacker_blocks=4):
    """
    51% attack simulation:
    1. Build honest chain
    2. Attacker builds a longer alternate chain
    3. Demonstrate chain replacement via resolve_conflicts
    """
    print("=" * 60)
    print("🔴 51% Attack Simulation")
    print("=" * 60)

    # Step 1: Build honest chain
    honest_chain = Blockchain(difficulty=difficulty)
    honest_chain.add_transaction("Alice", "Bob", 10)  # Key transaction
    honest_chain.mine_pending("HonestMiner")

    for i in range(honest_blocks - 1):
        honest_chain.add_transaction("User", f"Vendor{i}", i + 1)
        honest_chain.mine_pending("HonestMiner")

    print(f"\n✅ Honest chain ({len(honest_chain)} blocks):")
    honest_chain.print_chain()

    # Step 2: Attacker's chain — built without the Alice→Bob transaction!
    attacker_chain = Blockchain(difficulty=difficulty)
    attacker_chain.add_transaction("Alice", "Alice", 10)  # Send to self (double spend!)
    attacker_chain.mine_pending("AttackerMiner")

    for i in range(attacker_blocks - 1):
        attacker_chain.add_transaction("Fake", f"Addr{i}", i)
        attacker_chain.mine_pending("AttackerMiner")

    print(f"\n😈 Attacker's chain ({len(attacker_chain)} blocks):")
    attacker_chain.print_chain()

    # Step 3: Run resolve_conflicts
    print(f"\n⚔️ Resolving chain conflict...")
    was_replaced = honest_chain.resolve_conflicts([attacker_chain])

    print(f"Chain replaced: {was_replaced}")
    if was_replaced:
        print(f"🚨 Attack successful! Honest chain replaced by attacker's chain!")
        print(f"   Alice→Bob 10BTC transaction erased!")
    else:
        print(f"🛡️ Defense successful! Honest chain maintained.")

    return was_replaced

def confirmation_safety_experiment(difficulty=2):
    """
    Safety experiment by confirmation count:
    Demonstrates how many additional blocks an attacker must mine to overtake the honest chain
    """
    print("\n" + "=" * 60)
    print("🔬 Safety Experiment by Confirmation Count")
    print("=" * 60)
    print(f"{'Confirms':>8} | {'Honest Chain':>12} | {'Blocks Needed to Attack':>22} | Safety")
    print("-" * 60)

    for confirmations in [1, 2, 3, 4, 6, 10]:
        # Honest chain length = genesis + confirmation count
        honest_len = 1 + confirmations
        # Attacker needs to be 1 block longer than honest chain to succeed
        needed = confirmations + 1
        # Simple approximation of success probability for a 30% hashpower attacker
        q = 0.3
        prob = (q / (1 - q)) ** confirmations
        safety = "🟢 Safe" if prob < 0.01 else "🟡 Caution" if prob < 0.1 else "🔴 Danger"
        print(f"{confirmations:>8} | {honest_len:>12} | {needed:>22} | {safety} ({prob:.4%})")

if __name__ == "__main__":
    # 51% attack demonstration
    simulate_51_percent_attack(difficulty=2, honest_blocks=3, attacker_blocks=4)

    # Safety experiment by confirmation count
    confirmation_safety_experiment()

    print("\n" + "=" * 60)
    print("📊 Conclusions")
    print("=" * 60)
    print("• Majority hashpower → always able to build a longer chain → double spend succeeds")
    print("• Deeper confirmations → exponential increase in attack cost")
    print("• Bitcoin's 6 confirmations ≈ <0.1% success rate even for a 30% hashpower attacker")

How to Run

# Save blockchain.py first, then
python attack_simulation.py
# Expected output:
============================================================
🔴 51% Attack Simulation
============================================================

✅ Honest chain (4 blocks):
  Block 0 | 00a3f7b2c8d1e9f4... | [genesis]
  Block 1 | 003e8a1f5b2c7d90... | [Alice→Bob:10, NETWORK→HonestMiner:50]
  Block 2 | 00c4d2e6f8a1b3c5... | [User→Vendor0:1, NETWORK→HonestMiner:50]
  Block 3 | 007f1a2b3c4d5e6f... | [User→Vendor1:2, NETWORK→HonestMiner:50]

😈 Attacker's chain (5 blocks):
  Block 0 | 00a3f7b2c8d1e9f4... | [genesis]
  Block 1 | 001b2c3d4e5f6a7b... | [Alice→Alice:10, NETWORK→AttackerMiner:50]
  Block 2 | 00d4e5f6a7b8c9d0... | [Fake→Addr0:0, NETWORK→AttackerMiner:50]
  Block 3 | 00e5f6a7b8c9d0e1... | [Fake→Addr1:1, NETWORK→AttackerMiner:50]
  Block 4 | 00f6a7b8c9d0e1f2... | [Fake→Addr2:2, NETWORK→AttackerMiner:50]

⚔️ Resolving chain conflict...
Chain replaced: True
🚨 Attack successful! Honest chain replaced by attacker's chain!
   Alice→Bob 10BTC transaction erased!

============================================================
🔬 Safety Experiment by Confirmation Count
============================================================
Confirms |  Honest Chain |     Blocks Needed to Attack | Safety
------------------------------------------------------------
       1 |             2 |                           2 | 🔴 Danger (42.8571%)
       2 |             3 |                           3 | 🔴 Danger (18.3673%)
       3 |             4 |                           4 | 🟡 Caution (7.8717%)
       4 |             5 |                           5 | 🟡 Caution (3.3736%)
       6 |             7 |                           7 | 🟢 Safe (0.6203%)
      10 |            11 |                          11 | 🟢 Safe (0.0049%)

============================================================
📊 Conclusions
============================================================
• Majority hashpower → always able to build a longer chain → double spend succeeds
• Deeper confirmations → exponential increase in attack cost
• Bitcoin's 6 confirmations ≈ <0.1% success rate even for a 30% hashpower attacker

Run the project you've built so far yourself. If the resolve_conflicts() method in blockchain.py and attack_simulation.py work correctly, you've understood blockchain attacks and defenses through code.


Full Structure at a Glance


3 Things to Apply in Practice

1. If you operate an exchange or service — apply different confirmation counts per coin Raise the confirmation count for coins with low hashrate. After the ETC attack, Coinbase raised ETC confirmations to 3,500 or more.

2. If you're a DeFi developer — design for chain reorganization (Reorg) This is something I always check in smart contract audits. "Is this contract safe after a reorg?" Logic that depends on block numbers is vulnerable to reorganization.

3. If you're an investor — recognize the risk of small PoW coins Low hashrate coin = coin with low attack cost. Sites like crypto51.app let you check attack costs. My personal rule: I never hold a large position in any coin with an hourly attack cost below $10,000.


Difficulty Fork

🟢 If it was easy

Key takeaways:

  • 51% attack = create alternate chain with majority hashpower → double spend
  • Selfish mining = unfair profit with just 33%, strategy of delaying block publication
  • Sybil attack = mass creation of fake identities, PoW provides natural defense
  • resolve_conflicts() = select the longest valid chain
  • 6 confirmations = practical safety standard

Preview of next lesson: In Lesson 10, we'll integrate all the code built so far into a REST API node as a capstone project. We'll create real HTTP endpoints with Flask so you can manipulate the blockchain from a browser!

🟡 If it was difficult

Think of a 51% attack with a different analogy: election fraud.

If you control the majority of votes (= block mining), you can change the outcome. But:

  • Polling stations are distributed worldwide (= node distribution)
  • Every vote requires buying an expensive ballot (= electricity/hashpower)
  • Once cast, votes are hard to reverse (= block depth)

So the more expensive the ballot (= hashrate↑), and the more votes already cast (= confirmations↑), the harder it is to commit fraud.

Additional practice: Change difficulty=1 in attack_simulation.py and run it. You'll experience firsthand how faster mining speeds up both attacks and defenses.

🔴 Challenge

Interview question: "What is finality in a blockchain, and does true finality exist in a PoW chain?"

In a PoW chain, theoretically only probabilistic finality exists. No matter how deep the confirmations, an attacker with infinite hashpower can still reverse the chain. This is why Ethereum introduced checkpoints when switching to PoS — providing deterministic finality where certain blocks can never be reverted.

Production challenge: Implement resolve_conflicts_v2 that uses "cumulative difficulty (total work)" as the criterion instead of "chain length." Real Bitcoin selects chains based on total work, not length.

Code Playground

Python아래 코드는 공격자의 해시파워 비율과 컨펌 수에 따른 추월 확률을 계산한다. 나카모토 백서의 포아송 분포 근사를 그대로 구현한 것이다.
Python확률 공식만으로는 감이 안 올 수 있다. 레슨 5의 PoW와 레슨 6의 체인 구조를 직접 사용해서 공격을 시뮬레이션해보자. 아래 코드는 정직한 체인과 공격자 체인을 각각 만든 뒤, 가장 긴 체인 규칙으로 어느 쪽이 승리하는지 보여준다.
Python이 규칙을 코드로 만들어보자. 레슨 6에서 만든 `Blockchain` 클래스에 핵심 메서드를 추가한다. 여러 노드가 서로 다른 체인을 가질 때, 가장 긴 유효 체인을 선택하는 로직이다. `is_valid_chain`이 PoW 검증과 해시 연결을 모두 확인하고, `resolve_conflicts`가 유효한 체인 중 가장 긴 것을 채택한다.
Python
Python
Python
Python말로만 들으면 그럴듯해 보이지만, 정말로 이득일까? 아래 시뮬레이션은 이기적 채굴과 정직한 채굴의 수익성을 해시파워 비율별로 비교한다. `strategy="selfish"`일 때 비밀 체인이 2블록 이상 앞서면 공개하고, 1블록 차이에서 경쟁이 붙으면 50% 확률로 승부를 가른다.
Python레슨 7에서 UTXO 모델을 배울 때 각 트랜잭션의 입출력을 추적했다. 만약 투표 기반 합의라면, 시빌 공격자는 가짜 노드 1000개로 투표를 조작할 수 있다. 아래 코드가 그 과정을 보여준다 — 정직한 노드 10개가 "블록A"에 투표해도, 가짜 노드 20개가 "블록B_악성"에 투표하면 악성 블록이 채택된다.
Python그렇다면 PoW 기반 네트워크에서도 같은 공격이 통할까? 아래 코드로 확인해보자. PoW에서는 노드 수가 아니라 **해시파워**가 투표권이다. 노드를 1000개 만들어도 해시파워가 같으면 발언권은 동일하다.
Python

Q&A