Anatomy of the Blockchain Skeleton — Hash Functions, Block Structure, and the Principles of Chain Linking

Lesson 2 · 23 min · 4,609 chars

Learning Objectives

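  • Explain the 4 core properties of hash functions (determinism, one-wayness, the avalanche effect, and collision resistance)
  • Explain the role of each of the 6 fields in the Bitcoin block header
  • Diagram why tampering with a single block has a cascading effect on every subsequent block
  • Generate SHA-256 hashes in Python and verify the avalanche effect firsthand
  • Implement a mini blockchain and write the logic that verifies chain integrity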

Dissecting the Blockchain Skeleton — Hash Functions, Block Structure, and the Principles of Chain Linking

In 2008, Satoshi Nakamoto declared he would replace trust with code. Last time, we defined the background of that declaration — distributed ledgers, proof of work, economic incentives — as the "soul" of Bitcoin. Today's topics — hash functions, block structure, and chain linking — are the "skeleton" that holds that soul.

If last time was a design meeting discussing "why this building is necessary," today you go on site and weld the steel frame yourself. Theory is over. Starting today, we write code.


🎯 Today's Mission

By the end of today's lesson, you should have these in hand:

  1. The experience of running the SHA-256 hash function yourself and seeing the "avalanche effect" with your own eyes
  2. The ability to explain what each of the 6 block header fields does
  3. Code where you've built a 5-block mini blockchain in Python yourself
  4. A blockchain simulator in Google Sheets where you can visually confirm that tampering with a middle block breaks the chain

The only tools are Python (runnable in the browser) and Google Sheets. No special installation required.


1. Hash Functions — The Digital Fingerprint of Data

Understanding What a Hash Function Is in 30 Seconds

When I first studied blockchain, the word "hash" was the most confusing thing. Open a cryptography textbook and math symbols run rampant, but the essence is surprisingly simple.

Hash function = A machine that takes any data as input and outputs a "fingerprint" of fixed length

Just as a person's fingerprint uniquely identifies that person, a hash value identifies its data uniquely for all practical purposes.

Whether the input is 5 characters or 9 pages, the output is always a 64-character hexadecimal string (256 bits). This is SHA-256.

Running SHA-256 Yourself

Seeing is believing. Let's create a hash directly.

# First experience with hash functions — basic SHA-256 usage
import hashlib

# Hashing a string with SHA-256
message = "Hello"
hash_result = hashlib.sha256(message.encode()).hexdigest()

print(f"Input: {message}")
print(f"Hash: {hash_result}")
print(f"Hash length: {len(hash_result)} characters")

# Output:
# Input: Hello
# Hash: 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969
# Hash length: 64 characters

.encode() converts the string to bytes, and .hexdigest() returns the result as a hexadecimal string. These two lines can fingerprint any data in the world.

❌ → 🤔 → ✅ : Not All "Hash Functions" Are the Same

There's a common mistake when first studying blockchain — thinking "I just need to make a hash" and using any hash function at all. Let's compare three approaches in code.

❌ WRONG WAY: Using Python's built-in hash()

# ❌ Never do this — using Python hash() for blockchain
data = "Alice→Bob 1 BTC"

# Python built-in hash() — results change every run!
print(f"Run 1: {hash(data)}")
print(f"Run 2: {hash(data)}")  # Same within a session, but...

# 💥 Problem: In Python 3.3+, hash() mixes in a random seed for security.
# Restarting the program produces completely different values for the same input!
# → The "determinism" property breaks. Verification between nodes becomes impossible.
# → Also, output length is not fixed, and there's no cryptographic security.

hash() is a function designed for internal use like dictionary key lookups. When the process restarts, it outputs different values for the same input. Use this in a blockchain? Node A and node B calculate different hashes for the same block. Consensus is impossible from the start.
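The restart problem can be reproduced without restarting anything by hand. The sketch below launches fresh Python processes and pins their hash seed through the `PYTHONHASHSEED` environment variable, which stands in for two nodes that happened to draw different random seeds. The helper name `hash_in_fresh_process` and the ASCII arrow in the sample string are choices made for this illustration only.

```python
import os
import subprocess
import sys

def hash_in_fresh_process(text: str, seed: int) -> int:
    """Return hash(text) as computed by a brand-new Python process (hypothetical helper)."""
    # PYTHONHASHSEED fixes the seed that str hashing would otherwise pick at random.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())

data = "Alice->Bob 1 BTC"  # ASCII arrow keeps the command line portable

same_seed_a = hash_in_fresh_process(data, seed=1)
same_seed_b = hash_in_fresh_process(data, seed=1)
other_seed = hash_in_fresh_process(data, seed=2)

print(f"Same seed, two processes agree:   {same_seed_a == same_seed_b}")
print(f"Different seeds, processes agree: {same_seed_a == other_seed}")
```

Two processes only agree when they share a seed, and in normal operation each process picks its own. That is fine for dictionaries and unusable for consensus.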

🤔 BETTER: Using MD5 — Deterministic but a Broken Algorithm

# 🤔 Better but still insufficient — using MD5
import hashlib

data = "Alice→Bob 1 BTC"
md5_hash = hashlib.md5(data.encode()).hexdigest()

print(f"MD5 hash: {md5_hash}")
print(f"Hash length: {len(md5_hash)} characters (128 bits)")

# Output:
# MD5 hash: 7a1f2b3c4d5e6f708192a3b4c5d6e7f8  (example)
# Hash length: 32 characters (128 bits)

# ⚠️ MD5 is deterministic and has fixed-length output. But:
# 1) Collision resistance is broken — in 2004, attacks were demonstrated
#    that create pairs of different inputs with the same hash
# 2) 128-bit output — half the size of SHA-256, also weak against brute force
# 3) MD5 collisions have been exploited in real SSL certificate forgery cases

MD5 at least satisfies the basic requirements of "deterministic" and "fixed length." But in 2004, its collision resistance collapsed. This means you can manipulate two different files to have the same MD5 hash. Without collision resistance in a blockchain? An attacker can create a forged transaction with the same hash as a legitimate one.

✅ BEST: Using SHA-256 — The Blockchain Standard

# ✅ The correct way — using SHA-256
import hashlib

data = "Alice→Bob 1 BTC"
sha256_hash = hashlib.sha256(data.encode()).hexdigest()

print(f"SHA-256 hash: {sha256_hash}")
print(f"Hash length: {len(sha256_hash)} characters (256 bits)")

# Output:
# SHA-256 hash: 9b4e7f2a... (64 characters)
# Hash length: 64 characters (256 bits)

# ✅ Why SHA-256 is the blockchain standard:
# 1) Deterministic: same result on any computer, at any time
# 2) 256-bit output: 2^256 possible values (about 1.16 × 10^77, close in scale to the ~10^80 atoms in the observable universe)
# 3) Collision resistance intact: SHA-1 was theoretically broken in 2005 (a practical collision followed in 2017), but no practical attack on SHA-256 is known
# 4) Excellent avalanche effect: changing 1 bit flips ~50% of output bits on average
# 5) Used by Bitcoin since its 2009 launch: more than a decade and a half of real-world validation

| Comparison | ❌ hash() | 🤔 MD5 | ✅ SHA-256 |
| --- | --- | --- | --- |
| Determinism | ❌ Changes each run | ✅ Always the same | ✅ Always the same |
| Fixed-length output | ❌ Varies by platform | ✅ 128 bits | ✅ 256 bits |
| Collision resistance | ❌ None | ❌ Broken in 2004 | ✅ Intact |
| Avalanche effect | ❌ Not guaranteed | 🤔 Partial | ✅ Ideal (~50%) |
| Blockchain suitability | ❌ Unusable | ❌ Dangerous | ✅ Industry standard |

Key lesson: Just because something has "hash" in the name doesn't mean all hashes are equal. In blockchain, you must use a cryptographically secure hash function (SHA-256, SHA-3, BLAKE2, etc.). Bitcoin's choice of SHA-256 was no accident: it was the most proven option that met every requirement in the comparison above.
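For reference, the cryptographic hash functions named here are all available from Python's standard `hashlib` module. The short sketch below hashes the same sample transaction with each and prints the output size. Which algorithms to list is just a choice for this illustration.

```python
import hashlib

data = "Alice→Bob 1 BTC".encode("utf-8")

# Algorithm names as accepted by hashlib.new(); all ship with the standard library.
algorithms = ("sha256", "sha3_256", "blake2s", "blake2b")

digest_bits = {}
for name in algorithms:
    digest = hashlib.new(name, data).digest()
    digest_bits[name] = len(digest) * 8
    print(f"{name:9s} {digest_bits[name]:3d}-bit output")
```

All of them are deterministic and fixed-length. They differ in internal design and output size, not in how you call them.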

The 4 Core Properties of Hash Functions

What I learned painfully through smart contract auditing is that failing to understand the 4 properties of hash functions precisely means missing security vulnerabilities. Let's prove each one with code.

Property 1: Deterministic

Same input → always same output. Run it a hundred times, run it on any computer — the result is identical.

# Property 1 proof: Determinism — same input always gives same hash
import hashlib

for i in range(5):
    h = hashlib.sha256("Bitcoin".encode()).hexdigest()
    print(f"Attempt {i+1}: {h[:16]}...")  # Print first 16 characters only

# Output:
# Attempt 1: b0d56e4c6f25b1a2...
# Attempt 2: b0d56e4c6f25b1a2...
# Attempt 3: b0d56e4c6f25b1a2...
# Attempt 4: b0d56e4c6f25b1a2...
# Attempt 5: b0d56e4c6f25b1a2...

This seems obvious, but it's critically important. If hashes changed every time, there would be no way to prove "this block's data hasn't been tampered with" in a blockchain.

Property 2: Avalanche Effect

This is where the real magic begins. Change just one character in the input and the hash value changes completely.

# Property 2 proof: Avalanche effect — one character difference completely changes the hash
import hashlib

texts = ["Hello", "Hello!", "hello", "Hellp"]

for text in texts:
    h = hashlib.sha256(text.encode()).hexdigest()
    print(f"'{text:6s}' → {h[:20]}...")

# Output:
# 'Hello ' → 185f8db32271fe25f561...
# 'Hello!' → 334d016f755cd6dc58c9...
# 'hello ' → 2cf24dba5fb0a30e26e8...
# 'Hellp ' → a8cfab14b0981f2260c1...

"Hello" and "Hellp" differ by exactly one character. The hash? The first 20 characters look completely unrelated. Like a single pebble rolling down a hillside and shaking an entire mountain, a small change overturns the result completely. That's why it's called the "avalanche effect."

Measuring it at the bit level makes it even clearer:

# Measuring the avalanche effect at the bit level
import hashlib

def to_bits(hex_str):
    return bin(int(hex_str, 16))[2:].zfill(256)

h1 = hashlib.sha256("Hello".encode()).hexdigest()
h2 = hashlib.sha256("Hellp".encode()).hexdigest()

bits1 = to_bits(h1)
bits2 = to_bits(h2)

# Count differing bits
diff_bits = sum(b1 != b2 for b1, b2 in zip(bits1, bits2))

print(f"Total bits: 256")
print(f"Differing bits: {diff_bits}")
print(f"Change rate:   {diff_bits/256*100:.1f}%")

# Output:
# Total bits: 256
# Differing bits: 131
# Change rate:   51.2%

131 bits — about half of 256 — were flipped. An ideal hash function inverts about 50% of the output bits when even 1 input bit changes. SHA-256 approaches this ideal almost exactly.
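A single pair of inputs is only one sample. As a rough check of the "about 50%" claim, the sketch below flips each individual bit of the 5-byte message "Hello" in turn (40 flips in total) and averages how many output bits change. The helper names are made up for this example.

```python
import hashlib

def sha256_int(data: bytes) -> int:
    """SHA-256 digest interpreted as one 256-bit integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def flipped_output_bits(data: bytes, bit_index: int) -> int:
    """Flip exactly one input bit and count how many of the 256 output bits change."""
    mutated = bytearray(data)
    mutated[bit_index // 8] ^= 1 << (bit_index % 8)
    diff = sha256_int(data) ^ sha256_int(bytes(mutated))  # XOR marks differing bits
    return bin(diff).count("1")

message = b"Hello"
counts = [flipped_output_bits(message, i) for i in range(len(message) * 8)]
average = sum(counts) / len(counts)

print(f"Single-bit flips tested:     {len(counts)}")
print(f"Fewest / most bits changed:  {min(counts)} / {max(counts)}")
print(f"Average output bits changed: {average:.1f} of 256")
```

Individual flips scatter above and below 128, but the average should sit close to half of the 256 output bits.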

🤔 Think about it: If there were no avalanche effect, so that the hashes of "Hello" and "Hellp" were similar, what problems would arise in a blockchain?

Answer:

If someone could slightly alter transaction data inside a block and still get a nearly identical hash, other nodes would have difficulty detecting the tampering. Thanks to the avalanche effect, modifying even 1 byte completely changes the hash, so all nodes can immediately tell "something's off with this block." This is the core principle of blockchain's tamper evidence.

Property 3: Pre-image Resistance (One-Way)

Reverse-engineering the original input from a hash value is practically impossible. The only way to find "Hello" from 185f8db3...? Try every possible input one by one (brute force).

A mistake I've repeatedly witnessed while auditing Ethereum smart contracts: developers store passwords on-chain as hashes believing "it's hidden by hashing, so it must be safe." The hash itself can't be reversed, but if the input space is narrow (e.g., 4-digit numbers — only 10,000 possibilities), brute force breaks it quickly. Pre-image resistance is only meaningful when the input space is sufficiently large.
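To make the narrow-input-space problem concrete, here is a minimal sketch that recovers a 4-digit PIN from its SHA-256 hash by trying all 10,000 candidates. The PIN value and the function names are hypothetical, chosen only for the demonstration.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()

# What a careless contract might store "safely" on-chain (hypothetical PIN).
stored_hash = sha256_hex("4821")

def crack_four_digit_pin(target_hash: str):
    """Try every PIN from 0000 to 9999 and return the one whose hash matches."""
    for candidate in range(10_000):
        pin = f"{candidate:04d}"
        if sha256_hex(pin) == target_hash:
            return pin
    return None

recovered = crack_four_digit_pin(stored_hash)
print(f"Recovered PIN: {recovered}")
```

The hash function was never "reversed" here. The attacker simply enumerated every possible input, which finishes almost instantly when there are only 10,000 of them.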

Property 4: Collision Resistance

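Finding two different inputs that produce the same hash is computationally infeasible. SHA-256 has 2²⁵⁶ (about 1.16 × 10⁷⁷) possible hash values, a scale that approaches the number of atoms in the observable universe (about 10⁸⁰ ≈ 2²⁶⁶). Collisions must exist in principle, because infinitely many inputs map to finitely many outputs, but even a birthday-style search would need on the order of 2¹²⁸ hash computations.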
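These magnitudes are easy to check with Python's arbitrary-precision integers. The snippet below is only arithmetic, and the 10^80 figure is the commonly quoted rough estimate rather than a measured value.

```python
import math

hash_space = 2 ** 256        # number of possible SHA-256 outputs
atoms_estimate = 10 ** 80    # rough estimate for the observable universe

digits = len(str(hash_space))
atoms_log2 = math.log2(atoms_estimate)

print(f"2^256 has {digits} decimal digits (about {hash_space:.2e})")
print(f"10^80 is about 2^{atoms_log2:.1f}")
print(f"Square root of the hash space: 2^{256 // 2}")
```

The last line is the order of magnitude that matters for collisions: a generic collision search needs roughly the square root of the output space, not the full space.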

| Property | One-line description | Role in blockchain |
| --- | --- | --- |
| Determinism | Same input → same output | All nodes compute the same hash for the same block |
| Avalanche effect | 1 bit change → hash completely changes | Even microscopic data tampering is instantly detected |
| Pre-image resistance | Hash → original cannot be reversed | Transaction contents cannot be inferred from the hash alone |
| Collision resistance | Different inputs → same hash virtually impossible | Each block's hash serves as a unique identifier |

2. Dissecting the Block Interior — Let's Open the Hood

Now that we have the four properties of hash functions in hand, it's time to see where this tool is actually used. Let's open a single block. A Bitcoin block is composed of two main parts: the block header and the transaction list.

The 6 Fields of a Block Header

A block header is exactly 80 bytes. Small enough to store millions on a single USB drive. Yet compressed inside these 80 bytes is all the core information sustaining blockchain security.

Let's dissect each field one by one:

| # | Field | Size | Role | Analogy |
| --- | --- | --- | --- | --- |
| ① | Version | 4 bytes | Protocol rules this block follows | Document template version number |
| ② | Previous Block Hash | 32 bytes | Double SHA-256 hash of the immediately preceding block's header | "Previous page seal" in a notarized ledger |
| ③ | Merkle Root | 32 bytes | Hash summarizing all transactions into one | Table of contents checksum |
| ④ | Timestamp | 4 bytes | Block creation time (Unix time) | Notarization date stamp |
| ⑤ | Bits (Difficulty Target) | 4 bytes | Compact encoding of the mining puzzle difficulty | Exam passing cutoff score |
| ⑥ | Nonce | 4 bytes | Number miners vary while searching for a valid hash | Lottery number |

Of these, ② Previous Block Hash is today's protagonist. This field is the key that binds blocks into a "chain." ⑤ Difficulty and ⑥ Nonce will be explored deeply in lesson 3, so for today just note that "these exist."

Merkle Tree — Efficient Summarization of Transactions

A block can contain thousands of transactions. The structure that compresses these thousands into a single hash (the Merkle root) is the Merkle tree. Think of a tournament bracket.

Starting from the bottom, pair up and hash two at a time. Then pair up those results and hash them again. Repeat until one hash remains at the top. This way, if even 1 transaction changes, the Merkle root changes completely — the avalanche effect works here too.

🤔 Think about it: Say a block has 4,096 transactions. To prove that a specific transaction is included in this block, do you have to show all 4,096?

Answer:

No! Thanks to the Merkle tree, you only need 12 hashes (log₂(4096) = 12). This is called a "Merkle Proof," and it's the core principle that lets Bitcoin light clients (SPV wallets) verify transactions without downloading the entire blockchain. Instead of checking all 4,096, you just follow the "path" through the tree, making it enormously more efficient.
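The same idea can be seen at a small scale. The sketch below builds a 4-transaction tree by hand and proves that one transaction is included using only 2 sibling hashes (log₂(4) = 2). It is deliberately simplified: it uses single SHA-256 over hex strings, whereas Bitcoin uses double SHA-256 over raw bytes, and the "left"/"right" labels are a convention invented for this example.

```python
import hashlib

def h(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()

def parent(left: str, right: str) -> str:
    """Hash of two child hashes, in left-to-right order."""
    return h(left + right)

txs = ["Alice→Bob 1 BTC", "Bob→Charlie 0.5 BTC", "Dave→Eve 2.3 BTC", "Eve→Alice 0.1 BTC"]
leaves = [h(tx) for tx in txs]
h01 = parent(leaves[0], leaves[1])
h23 = parent(leaves[2], leaves[3])
root = parent(h01, h23)

# Proof for txs[2]: its sibling leaf (on the right), then the neighboring subtree (on the left).
proof = [(leaves[3], "right"), (h01, "left")]

def verify_proof(tx: str, proof, expected_root: str) -> bool:
    """Recompute the path from one transaction up to the root."""
    current = h(tx)
    for sibling, side in proof:
        current = parent(current, sibling) if side == "right" else parent(sibling, current)
    return current == expected_root

genuine_ok = verify_proof("Dave→Eve 2.3 BTC", proof, root)
forged_ok = verify_proof("Dave→Eve 230 BTC", proof, root)
print(f"Genuine transaction verifies: {genuine_ok}")
print(f"Forged transaction verifies:  {forged_ok}")
```

The verifier never sees the other three transactions, only two hashes and the root from the block header.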


3. Chain Linking — Why "Blockchain"?

I said the "Previous Block Hash" — one of the 6 block header fields — is today's protagonist. Now it's time to reveal why.

A single block is just a bundle of data. Nothing more than a page in a ledger. But the moment these pages are woven into a chain, it becomes a "blockchain," and tampering becomes practically impossible.

The principle is surprisingly simple: each block stores the previous block's hash in its own header.

Block #1's "previous hash" field contains Block #0's hash (0a3f...). Block #2's "previous hash" contains Block #1's hash (7c1d...). This is the entirety of the "chain."

Why Tampering Is Impossible — The Domino Effect

Here comes the key insight. If someone tampers with Block #1's data, what happens?

  1. Block #1's data changes → Block #1's hash changes completely (avalanche effect)
  2. Block #2 stores Block #1's original hash → mismatch!
  3. Block #2 must also be fixed → Block #2's hash also changes
  4. Block #3, #4, ... all the way to the end must be recalculated

Think of dominoes. Pull out and replace one in the middle, and the angle and sequence of every domino that follows changes. Bitcoin has more than 830,000 blocks stacked up. To tamper with one block, you'd need to recalculate every subsequent block, and do it faster than the rest of the network keeps extending the chain. In practice, that is infeasible.

Let's prove it directly with code.

# Chain linking and tamper detection — demonstrating the core principle
import hashlib

def calc_hash(data, prev_hash):
    """Function to calculate a block's hash"""
    content = prev_hash + data
    return hashlib.sha256(content.encode()).hexdigest()

# Constructing a mini chain of 3 blocks
block0_data = "Genesis block"
block0_prev = "0" * 64  # Genesis block has no previous hash
block0_hash = calc_hash(block0_data, block0_prev)

block1_data = "Alice→Bob 1 BTC"
block1_hash = calc_hash(block1_data, block0_hash)

block2_data = "Bob→Charlie 0.5 BTC"
block2_hash = calc_hash(block2_data, block1_hash)

print("=== Normal chain ===")
print(f"Block0 hash: {block0_hash[:16]}...")
print(f"Block1 hash: {block1_hash[:16]}...")
print(f"Block2 hash: {block2_hash[:16]}...")

# Tamper with Block 1's data!
print("\n=== Tampering with Block 1 data ===")
tampered_data = "Alice→Bob 100 BTC"  # Manipulating 1 BTC to 100 BTC
tampered_hash = calc_hash(tampered_data, block0_hash)

print(f"Original Block1 hash: {block1_hash[:16]}...")
print(f"Tampered Block1 hash: {tampered_hash[:16]}...")
print(f"Match: {block1_hash == tampered_hash}")  # False!

# Output:
# === Normal chain ===
# Block0 hash: 5765e0b1f3a870f6...
# Block1 hash: a3c41f8d22b5e9c1...
# Block2 hash: 8f2e14d6b7a3c095...
#
# === Tampering with Block 1 data ===
# Original Block1 hash: a3c41f8d22b5e9c1...
# Tampered Block1 hash: d9f742e1083bc6a4...
# Match: False

This code combines the previous hash and data to create a block hash, then shows how the hash changes when data is tampered with. Just changing 1 BTC to 100 BTC completely changes the hash. Block #2 remembers the original hash (a3c4...), so it immediately knows "this was tampered with."

🤔 Think about it: Couldn't an attacker succeed in tampering by recalculating all the hashes from Block #1 to the last block?

Answer:

Theoretically, yes. But this is where Proof of Work serves as the defense. Producing a valid hash for each block requires enormous computation (you must find a hash satisfying specific conditions by changing the nonce billions of times). Even the whole network working together needs about 10 minutes on average per block, and while an attacker redoes old blocks, honest miners keep adding new ones. Unless the attacker controls more than 50% of the network's hash power (the so-called 51% attack), the chance of catching up to the honest chain shrinks rapidly with every additional block. This will be covered in detail in lesson 3.
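To get a feel for why recalculation is expensive, here is a toy version of the nonce search. It is a sketch under simplifying assumptions: the "condition" is just a hash starting with four hex zeros, and the block is a plain string, whereas Bitcoin compares a double SHA-256 of the 80-byte header against a numeric target that is vastly harder to hit.

```python
import hashlib

def mine(prev_hash: str, data: str, prefix: str = "0000"):
    """Increase the nonce until the hash starts with the required prefix (toy rule)."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{prev_hash}{data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

nonce, digest = mine("0" * 64, "Alice→Bob 1 BTC")
print(f"Valid nonce found after {nonce + 1} attempts")
print(f"Hash: {digest[:16]}...")
```

With four hex zeros, the search takes tens of thousands of attempts on average. Every extra required zero multiplies that by 16, and an attacker who changes old data has to repeat the search for every block that follows.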


4. Building a Mini Blockchain Yourself

Now that we understand the principles, it's time to actually build one. We define a block with a single Python class and add chain verification logic.

# Mini blockchain implementation — block creation + chain verification
import hashlib
import time

class Block:
    def __init__(self, index, data, prev_hash):
        self.index = index            # Block number
        self.timestamp = time.time()  # Creation time
        self.data = data              # Transaction data
        self.prev_hash = prev_hash    # Previous block hash
        self.hash = self.calc_hash()  # Current block hash
    
    def calc_hash(self):
        """Combine block header info and generate SHA-256 hash"""
        header = (
            str(self.index) +
            str(self.timestamp) +
            self.data +
            self.prev_hash
        )
        return hashlib.sha256(header.encode()).hexdigest()

# Genesis block (first block)
genesis = Block(0, "Genesis block", "0" * 64)

# Add 4 blocks
chain = [genesis]
transactions = [
    "Alice→Bob 1 BTC",
    "Bob→Charlie 0.5 BTC",
    "Dave→Eve 2.3 BTC",
    "Eve→Alice 0.1 BTC"
]

for i, tx in enumerate(transactions):
    new_block = Block(i + 1, tx, chain[-1].hash)
    chain.append(new_block)

# Print the full chain
for b in chain:
    print(f"Block #{b.index} | Hash: {b.hash[:12]}... | Previous: {b.prev_hash[:12]}... | Data: {b.data}")

# Output (hash values will differ on every run because the timestamp is part of the hash):
# Block #0 | Hash: 7a8f3b2c1d0e... | Previous: 000000000000... | Data: Genesis block
# Block #1 | Hash: 3e5d9c4a7b81... | Previous: 7a8f3b2c1d0e... | Data: Alice→Bob 1 BTC
# Block #2 | Hash: c2f1a8d6e934... | Previous: 3e5d9c4a7b81... | Data: Bob→Charlie 0.5 BTC
# Block #3 | Hash: 91b4e7f2a305... | Previous: c2f1a8d6e934... | Data: Dave→Eve 2.3 BTC
# Block #4 | Hash: f6d8c1b5e4a2... | Previous: 91b4e7f2a305... | Data: Eve→Alice 0.1 BTC

This code is designed so that every time a Block is instantiated, it automatically calculates its own hash by combining the index, timestamp, data, and previous hash. In the output, verify with your own eyes that Block #1's "Previous" exactly matches Block #0's "Hash." This connection is what blockchain is all about.
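One design detail is worth a side note. Gluing fields together as plain strings with no separators can make two different blocks serialize to the same text. The sketch below shows the effect with a simplified stand-in for the block hash, plus one possible remedy (serializing the fields as JSON). Both function names are hypothetical, and this is not how Bitcoin serializes blocks, which uses fixed-width binary fields.

```python
import hashlib
import json

def naive_hash(index: int, data: str, prev_hash: str) -> str:
    """Fields concatenated with no separators, as in the teaching code."""
    return hashlib.sha256((str(index) + data + prev_hash).encode()).hexdigest()

def canonical_hash(index: int, data: str, prev_hash: str) -> str:
    """Fields serialized with explicit names, so field boundaries cannot blur."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True, ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

prev = "0" * 64
# "1" + "2 BTC to Bob" and "12" + " BTC to Bob" produce the same concatenated text.
naive_collides = naive_hash(1, "2 BTC to Bob", prev) == naive_hash(12, " BTC to Bob", prev)
canonical_collides = canonical_hash(1, "2 BTC to Bob", prev) == canonical_hash(12, " BTC to Bob", prev)

print(f"Naive concatenation gives identical hashes: {naive_collides}")
print(f"JSON serialization gives identical hashes:  {canonical_collides}")
```

For a 5-block classroom chain this ambiguity is harmless, but it is the kind of detail that matters once hashes are used for security decisions.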

Adding a Chain Verification Function

Now that we've built blocks, we also need a "doctor" to check if this chain is healthy.

# Chain integrity verification function — tamper detection
def verify_chain(chain):
    """Traverse all blocks in the chain (genesis included) and check integrity"""
    for i, current in enumerate(chain):
        # Check 1: Does the stored hash match a hash recomputed from the block's fields?
        if current.hash != current.calc_hash():
            print(f"❌ Block #{i} hash mismatch! Data has been tampered")
            return False

        # Check 2: Does prev_hash match the actual previous block? (genesis has none)
        if i > 0 and current.prev_hash != chain[i - 1].hash:
            print(f"❌ Block #{i} chain broken! Previous hash mismatch")
            return False

    print("✅ Chain integrity verification complete — all blocks normal")
    return True

# Verify normal chain
verify_chain(chain)

# Secretly tamper with Block #2's data
chain[2].data = "Bob→Charlie 999 BTC"  # 0.5 to 999!

# Verify tampered chain
verify_chain(chain)

# Output:
# ✅ Chain integrity verification complete — all blocks normal
# ❌ Block #2 hash mismatch! Data has been tampered

This verification function checks two things. First, whether the block's hash matches the value recalculated from the current data (data tampering detection). Second, whether the "previous hash" each block remembers matches the actual previous block's hash (chain link verification). The data was changed but the hash wasn't recalculated, so it's immediately caught by the first check. This is the essence of blockchain tamper prevention.

🔍 Deep dive: What if you fix the hash too?

Good question. What if an attacker changes Block #2's data and then calls calc_hash() again to update the hash? Check 1 would pass, but Check 2 would catch it. This is because Block #3 remembers Block #2's original hash. Ultimately, Blocks #3, #4, etc. all need to be recalculated too, and in a proof-of-work environment this represents an astronomical cost.

# Confirming the chain breaks even when the hash is also fixed
chain[2].hash = chain[2].calc_hash()  # Recalculate hash
verify_chain(chain)
# Output: ❌ Block #3 chain broken! Previous hash mismatch

The location where the chain breaks just moved from #2 to #3. Like dominoes, fixing one breaks the next.
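Following the dominoes to the end: what if the attacker keeps going and re-hashes every block after the tampered one? The self-contained sketch below uses a simplified dictionary-based chain with hypothetical helper names. It shows that the chain becomes internally consistent again, but its final hash no longer matches the one honest nodes already hold.

```python
import hashlib

def block_hash(index, data, prev_hash):
    return hashlib.sha256(f"{index}|{data}|{prev_hash}".encode()).hexdigest()

def build_chain(entries):
    chain, prev = [], "0" * 64
    for i, data in enumerate(entries):
        block = {"index": i, "data": data, "prev_hash": prev}
        block["hash"] = block_hash(i, data, prev)
        chain.append(block)
        prev = block["hash"]
    return chain

def is_valid(chain):
    for i, b in enumerate(chain):
        if b["hash"] != block_hash(b["index"], b["data"], b["prev_hash"]):
            return False  # stored hash does not match the block's contents
        if i > 0 and b["prev_hash"] != chain[i - 1]["hash"]:
            return False  # link to the previous block is broken
    return True

def rehash_from(chain, start):
    """What an attacker must do: relink and re-hash every block from `start` onward."""
    for i in range(start, len(chain)):
        if i > 0:
            chain[i]["prev_hash"] = chain[i - 1]["hash"]
        chain[i]["hash"] = block_hash(chain[i]["index"], chain[i]["data"], chain[i]["prev_hash"])

chain = build_chain(["Genesis block", "Alice→Bob 1 BTC", "Bob→Charlie 0.5 BTC",
                     "Dave→Eve 2.3 BTC", "Eve→Alice 0.1 BTC"])
honest_tip = chain[-1]["hash"]  # the latest hash that honest nodes remember

chain[2]["data"] = "Bob→Charlie 999 BTC"
valid_after_tamper = is_valid(chain)

rehash_from(chain, 2)
valid_after_rehash = is_valid(chain)
tip_unchanged = chain[-1]["hash"] == honest_tip

print(f"Valid right after tampering:    {valid_after_tamper}")
print(f"Valid after re-hashing #2..#4:  {valid_after_rehash}")
print(f"Tip matches honest nodes' copy: {tip_unchanged}")
```

In this toy chain, re-hashing is free, so the forgery is cheap to make self-consistent. That gap is exactly what proof of work closes by making every re-hash expensive.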


5. Looking at Real Blocks with a Block Explorer

Now that we've internalized the principles through code, it's time to look at the actual Bitcoin network. Click on any block at mempool.space and you can verify all the fields we've learned.

The actual values for Bitcoin's genesis block (Block #0) are:

| Field | Actual Value |
| --- | --- |
| Version | 1 |
| Previous Block Hash | 0000000000000000000000000000000000000000000000000000000000000000 |
| Merkle Root | 4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b |
| Timestamp | 2009-01-03 18:15:05 UTC |
| Difficulty (Bits) | 1d00ffff |
| Nonce | 2083236893 |

The genesis block's previous hash being all zeros simply means "there is no previous block." It's the very first block, hardcoded by Satoshi Nakamoto himself.
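These six values are enough to reproduce the genesis block's hash yourself. The sketch below packs them into the 80-byte header layout and applies SHA-256 twice. The details it relies on: integers are stored little-endian, and the two 32-byte hashes are stored in reverse byte order compared with how block explorers display them. The variable names are chosen for this example.

```python
import hashlib
import struct
from datetime import datetime, timezone

version = 1
prev_hash = bytes(32)  # 32 zero bytes: there is no previous block
merkle_root = bytes.fromhex("4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b")
timestamp = int(datetime(2009, 1, 3, 18, 15, 5, tzinfo=timezone.utc).timestamp())
bits = 0x1d00ffff
nonce = 2083236893

# "<" = little-endian without padding; I = 4-byte unsigned int; 32s = 32 raw bytes
header = struct.pack("<I32s32sIII", version, prev_hash[::-1], merkle_root[::-1],
                     timestamp, bits, nonce)

# Block hash = SHA-256 applied twice, shown in reversed byte order
block_hash = hashlib.sha256(hashlib.sha256(header).digest()).digest()[::-1].hex()

print(f"Header size: {len(header)} bytes")
print(f"Unix time:   {timestamp}")
print(f"Block hash:  {block_hash}")
```

The printed hash should match what a block explorer shows for Block #0, a value beginning with ten hex zeros.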

The message Satoshi embedded in the genesis block's coinbase transaction is also famous:

"The Times 03/Jan/2009 Chancellor on brink of second bailout for banks"

A January 3, 2009 London Times headline. A bank bailout article permanently inscribed on the blockchain. The answer to "why was Bitcoin necessary" — covered in lesson 1 — is compressed into this single line.

🤔 Think about it: Can the 50 BTC mining reward from the genesis block be spent?

Answer:

It cannot. Due to special handling in Bitcoin's code, the genesis block's coinbase transaction is not included in the UTXO set. Whether Satoshi designed this intentionally or it was a coding mistake, no one knows. But as a result, these 50 BTC are locked forever. This will be revisited in lesson 4 (UTXO model).


🔨 Project Update

In the last lesson (lesson 1), there was no project code yet — we were only laying conceptual foundations. Starting today, we build the project in earnest.

What We're Adding in This Lesson

Google Sheets blockchain simulator + Python simulator code

Step 1: Build the Complete Simulator in Python

# === Complete Blockchain Simulator (Lesson 2 Project) ===
# Copy and run this code
import hashlib

def calc_block_hash(index, prev_hash, data):
    """Combine block data and calculate SHA-256 hash"""
    raw = f"{index}{prev_hash}{data}"
    return hashlib.sha256(raw.encode()).hexdigest()

def create_chain():
    """Create a 5-block mini blockchain"""
    chain = []
    
    # List of block data
    block_data = [
        "Genesis block — Start",
        "Alice→Bob 1.5 BTC",
        "Bob→Charlie 0.8 BTC",
        "Charlie→Dave 2.0 BTC",
        "Dave→Eve 0.3 BTC"
    ]
    
    for i, data in enumerate(block_data):
        prev_hash = "0" * 64 if i == 0 else chain[i-1]["hash"]
        block_hash = calc_block_hash(i, prev_hash, data)
        
        block = {
            "index": i,
            "data": data,
            "prev_hash": prev_hash,
            "hash": block_hash
        }
        chain.append(block)
    
    return chain

def print_chain(chain):
    """Print the chain in a readable format"""
    print("=" * 70)
    print("  Blockchain Simulator — 5 Blocks")
    print("=" * 70)
    for b in chain:
        print(f"\n📦 Block #{b['index']}")
        print(f"   Data:          {b['data']}")
        print(f"   Previous hash: {b['prev_hash'][:16]}...")
        print(f"   Current hash:  {b['hash'][:16]}...")

def verify_and_report(chain):
    """Chain integrity verification + results report"""
    print("\n" + "=" * 70)
    print("  Integrity Verification Results")
    print("=" * 70)
    all_valid = True
    
    for i in range(len(chain)):
        # Recalculate hash
        expected = calc_block_hash(
            chain[i]["index"],
            chain[i]["prev_hash"],
            chain[i]["data"]
        )
        hash_ok = (chain[i]["hash"] == expected)
        
        # Check chain link (excluding first block)
        link_ok = True
        if i > 0:
            link_ok = (chain[i]["prev_hash"] == chain[i-1]["hash"])
        
        status = "✅" if (hash_ok and link_ok) else "❌"
        if not (hash_ok and link_ok):
            all_valid = False
        
        print(f"  Block #{i}: {status}  Hash={'OK' if hash_ok else 'FAIL'}  Link={'OK' if link_ok else 'BROKEN'}")
    
    print(f"\n  Final result: {'✅ Chain normal' if all_valid else '❌ Tampering detected!'}")
    return all_valid

# === Run ===
print("\n🟢 [Step 1] Create normal chain")
chain = create_chain()
print_chain(chain)
verify_and_report(chain)

print("\n\n🔴 [Step 2] Tamper with Block #2 data")
chain[2]["data"] = "Bob→Charlie 999 BTC"  # Manipulating 0.8 to 999!
print(f"  → Changed Block #2 data to 'Bob→Charlie 999 BTC'")
verify_and_report(chain)

# Output:
# 🟢 [Step 1] Create normal chain
# ======================================================================
#   Blockchain Simulator — 5 Blocks
# ======================================================================
#
# 📦 Block #0
#    Data:          Genesis block — Start
#    Previous hash: 0000000000000000...
#    Current hash:  a1b2c3d4e5f67890...
# (... Blocks #1~#4 output ...)
#
# ======================================================================
#   Integrity Verification Results
# ======================================================================
#   Block #0: ✅  Hash=OK  Link=OK
#   Block #1: ✅  Hash=OK  Link=OK
#   Block #2: ✅  Hash=OK  Link=OK
#   Block #3: ✅  Hash=OK  Link=OK
#   Block #4: ✅  Hash=OK  Link=OK
#
#   Final result: ✅ Chain normal
#
# 🔴 [Step 2] Tamper with Block #2 data
#   → Changed Block #2 data to 'Bob→Charlie 999 BTC'
# ======================================================================
#   Integrity Verification Results
# ======================================================================
#   Block #0: ✅  Hash=OK  Link=OK
#   Block #1: ✅  Hash=OK  Link=OK
#   Block #2: ❌  Hash=FAIL  Link=OK
#   Block #3: ✅  Hash=OK  Link=OK
#   Block #4: ✅  Hash=OK  Link=OK
#
#   Final result: ❌ Tampering detected!

Step 2: Build the Google Sheets Simulator

Moving the same principles to a spreadsheet is much more visually intuitive.

Sheet setup instructions:

  1. Create a new tab in Google Sheets and name it "Blockchain Simulator"
  2. Enter the following structure:
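|   | A | B | C | D | E | F |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Block Number | Data | Previous Hash | Current Hash | Verification | Recorded Hash |
| 2 | 0 | Genesis block | 0000000000 | (formula below) | (formula below) | (pasted value) |
| 3 | 1 | Alice→Bob 1.5 BTC | (formula below) | (formula below) | (formula below) | (pasted value) |
| 4 | 2 | Bob→Charlie 0.8 BTC | (formula below) | (formula below) | (formula below) | (pasted value) |
| 5 | 3 | Charlie→Dave 2.0 BTC | (formula below) | (formula below) | (formula below) | (pasted value) |
| 6 | 4 | Dave→Eve 0.3 BTC | (formula below) | (formula below) | (formula below) | (pasted value) |

  3. Enter formulas:

    • C2 (genesis previous hash): type '0000000000 with a leading apostrophe so that Sheets keeps it as text instead of collapsing it to the number 0
    • D2~D6 (current hash): =SHA256(A2&B2&C2), filled down to row 6 (Google Sheets has no built-in SHA256 function, so this relies on the Apps Script custom function added in the next step)
    • C3~C6 (previous hash): =D2, =D3, =D4, =D5
    • F2~F6 (recorded hash): once the hashes appear in D2:D6, copy D2:D6 and paste it into F2:F6 with Edit → Paste special → Values only. This snapshot plays the role of the hashes the rest of the network already holds.
    • E2~E6 (verification): =IF(D2=F2, "✅ Normal", "❌ Tampered"), filled down to row 6. Comparing D2 against a live =SHA256(A2&B2&C2) would always report "Normal", because both sides recalculate together.
  4. Add SHA256 function via Google Apps Script (Extensions → Apps Script):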

// Custom function to use SHA256 hash in Google Sheets
function SHA256(input) {
  // Convert input value to string
  var rawInput = String(input);
  // Calculate SHA-256 hash with Utilities.computeDigest
  var rawHash = Utilities.computeDigest(
    Utilities.DigestAlgorithm.SHA_256,
    rawInput,
    Utilities.Charset.UTF_8
  );
  // Convert byte array to hexadecimal string
  var hash = '';
  for (var i = 0; i < rawHash.length; i++) {
    var byte = rawHash[i];
    if (byte < 0) byte += 256; // Correct negative bytes
    var hex = byte.toString(16);
    if (hex.length === 1) hex = '0' + hex; // Pad single digit with 0
    hash += hex;
  }
  return hash;
}
  5. Set up conditional formatting:

    • Select the E column range → Format → Conditional formatting
    • Rule 1: Text contains "✅ Normal" → background color green (#d4edda)
    • Rule 2: Text contains "❌ Tampered" → background color red (#f8d7da)
  6. Test: Change cell B4 ("Bob→Charlie 0.8 BTC") to "Bob→Charlie 999 BTC". You should see E4 through E6 turn red!

Run the project you've built so far. In the Python simulator, changing Block #2's data should print ❌ Tampering detected!, and in Google Sheets you should see the rows from that point turn red.
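If you want to confirm that the Apps Script function is hashing correctly, you can recompute a cell in Python. The sketch below mirrors the spreadsheet formula =SHA256(A2&B2&C2) by joining the cell values as text, and it assumes the genesis row holds 0, "Genesis block", and the text 0000000000. The helper name `sheet_sha256` is made up for this check.

```python
import hashlib

def sheet_sha256(*cells) -> str:
    """Mirror of =SHA256(A2&B2&C2): join the cells as text, then hash the UTF-8 bytes."""
    joined = "".join(str(cell) for cell in cells)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

genesis_row_hash = sheet_sha256(0, "Genesis block", "0000000000")
print(f"Expected value of D2: {genesis_row_hash}")
```

If D2 in your sheet shows a different value, check whether C2 was stored as the number 0 rather than as text.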


Summary — The Blockchain Skeleton at a Glance

3-Line Summary

  1. Hash functions convert any data into a fixed-length fingerprint, and thanks to the avalanche effect, even microscopic tampering is immediately revealed
  2. The block header packs 6 fields (previous hash, Merkle root, nonce, and so on) into exactly 80 bytes
  3. Because each block holds the previous block's hash, touching one invalidates all subsequent blocks — this is the core of blockchain security

Preview of Next Lesson

Today we saw how blocks are linked. But one question remains unanswered: "who has the authority to create a new block?" That's mining and proof of work. Next time we'll dig into the process of finding a hash satisfying specific conditions by changing the nonce billions of times, and why the difficulty automatically adjusts so that one block is produced about every 10 minutes on average.


Difficulty Fork

🟢 If it was easy — revisiting only the essentials

If you followed along well, just remember these 3 things:

  • SHA-256: input → 64-character fixed output, one-way, avalanche effect
  • Block header: 6 fields (version, previous hash, Merkle root, timestamp, difficulty, nonce)
  • Chain linking: storing the previous hash means changing one breaks everything after it

Next lesson is mining and proof of work. The "nonce" and "difficulty" fields we learned today play a central role.

🟡 If it was difficult — re-explained with a different analogy

Let's think about it again with the notary office analogy:

  • Hash function = A notary stamp machine. Insert a document and a unique seal number comes out. Change even one character in the document and the seal number changes completely.
  • Block = One page in a notarized ledger. It contains transaction records + the seal number from the previous page.
  • Chain = Since each page records the previous page's seal, if you tear out a middle page and swap it, all the seals on subsequent pages won't match.

If this analogy becomes clear in your mind, read through the code again slowly. prev_hash is "the previous page's seal" and calc_hash() is "the notary stamp machine."

Additional practice: In the Python simulator, try changing Block #0 (genesis) data and see which block starts failing verification.

🔴 Challenge — Interview/professional level problems

Challenge 1: Implement a Merkle tree

Write a function in Python that calculates the Merkle root when there are 8 transactions. Use hashlib.sha256, and include handling for when the transaction count is odd by duplicating the last one.

Challenge 2: Birthday attack simulation

Instead of full SHA-256, create a "weak hash function" that truncates the hash output to the first 4 characters (16 bits), and insert random inputs until a collision occurs. How many attempts does it take on average? Compare with the theoretical expectation, which is on the order of 2^8 = 256 attempts (the birthday estimate √(πN/2) with N = 2^16 gives about 321).

Challenge 3: Real interview question

You're asked: "Instead of the Merkle root in the Bitcoin block header, why can't we just list all the transaction hashes directly?" Answer in 3 sentences from the perspective of space complexity and verification efficiency.
