Anatomy of a Block — The Data Chain Built with Headers, Timestamps, and Previous Hashes
Learning Objectives
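- ✓ Diagram the input/output structure of a Bitcoin transaction and explain the role of each field
- ✓ Create a signed transaction in Python and compute its txid
- ✓ Write code that verifies a transaction's signature from a third party's point of view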
Transaction Design: Structuring Sender, Recipient, and Amount
In Lesson 2, we completed the Wallet class. The flow of signing with a private key and verifying with a public key. We now hold a tool that proves "I sent this" mathematically. But a signature alone is useless. What you sign is what matters.
Today we build that "what." The smallest unit of value transfer in a blockchain: the Transaction.
When I first wrote smart contracts, I glossed over transaction structure. The results were catastrophic. I couldn't grasp why msg.sender was needed in Solidity, where msg.value came from, and I ended up missing a reentrancy bug in a DeFi contract. Understanding transaction structure properly isn't just "studying Bitcoin." It's building the mental model that runs through all of blockchain.
Today's Mission
By the end of today's lesson, you'll have:
- transaction.py: a Transaction class plus create_transaction() and validate_transaction() functions
- A working system that actually connects hash_utils.py from Lesson 1 and wallet.py from Lesson 2
- A complete flow for creating signed transactions and independently verifying them as a third party
What Is a Transaction — The 'Digital Check' Analogy
Think of a bank check. What's written on it?
| Check Field | Transaction Field | Role |
|---|---|---|
| Issuer name | sender (sender address) | Who is sending |
| Payee name | recipient (recipient address) | Who is receiving |
| Amount | amount | How much is sent |
| Signature | signature | Did the issuer truly consent |
| Check number | txid | Unique identifier for this transaction |
There's one critical difference from a check. A bank teller verifies a check signature "by eye," but a blockchain transaction is verified mathematically. The ECDSA signature verification we learned in Lesson 2 is that math.
SHA-256 from Lesson 1 reappears here too. The key property of hash functions — same input always produces same output, changing even 1 bit produces completely different output. We use this property to create txids. Feed transaction content into SHA-256 and out comes a unique "fingerprint" for that transaction. That's the txid.
🤔 Think about it: A check has a "date." Do blockchain transactions have timestamps? If so, where?
View answer
Transactions themselves usually don't have timestamps. Instead, blocks carry timestamps. The moment a transaction is included in a block, that block's timestamp becomes the transaction's time. We'll look at this in detail in Lesson 4 when we cover block structure.
Bitcoin Transaction Inputs and Outputs — Simplified Real Structure
I'll be honest. At first I thought of transactions as simple records like "A sent B 5 BTC." I was wrong. Bitcoin's actual model is UTXO (Unspent Transaction Output) based. We'll dive deep in Lesson 7, but let's grab just the core idea now.
The key formula:
Fee = Sum of Inputs − Sum of Outputs
Alice referenced a 10 BTC output and sent 7 BTC to Bob and 2.99 BTC back to herself. The remaining 0.01 BTC? There's no explicit output, so it becomes the miner's fee. I still vividly remember the shock of first understanding this structure. "Ah, Bitcoin doesn't even have a concept of 'balance.'"
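The Alice example can be written out as data. This is only a minimal sketch: the names `TxOutput` and `SimpleUtxoTx` are hypothetical, amounts are assumed to be integer satoshis, and it is not the UTXO design PyChain adopts in Lesson 7.

```python
from dataclasses import dataclass, field

SATOSHI_PER_BTC = 100_000_000  # 1 BTC = 100,000,000 satoshis


@dataclass
class TxOutput:
    """Hypothetical output record: who receives how many satoshis."""
    address: str
    amount: int  # satoshis


@dataclass
class SimpleUtxoTx:
    """Toy UTXO-style transaction: spends earlier outputs, creates new ones."""
    inputs: list[TxOutput] = field(default_factory=list)   # outputs being spent
    outputs: list[TxOutput] = field(default_factory=list)  # newly created outputs

    def fee(self) -> int:
        """Fee = sum of inputs - sum of outputs (whatever is left unassigned)."""
        return sum(i.amount for i in self.inputs) - sum(o.amount for o in self.outputs)


tx = SimpleUtxoTx(
    inputs=[TxOutput("alice", 10 * SATOSHI_PER_BTC)],
    outputs=[
        TxOutput("bob", 7 * SATOSHI_PER_BTC),
        TxOutput("alice", 299_000_000),  # 2.99 BTC change back to Alice
    ],
)
print(tx.fee())  # 1000000 satoshis = 0.01 BTC
```

Notice that nothing in the structure stores a balance or a fee. Both are derived from the inputs and outputs.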
🤔 Think about it: What if Alice accidentally omits the change output and puts input 10 BTC, output 7 BTC (to Bob) only?
View answer
All 3 BTC becomes the fee! The miner gets all 3 BTC. This has actually happened in Bitcoin history. In 2016, someone broadcast a transaction that accidentally paid 291 BTC (roughly $135,000 USD at the time) as a fee. This is why code that creates transactions must always compute the implied fee (sum of inputs minus sum of outputs) and confirm it is the small amount you intended, not a forgotten change output.
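One way to guard against a forgotten change output is to compute the implied fee before signing and refuse anything above a limit. The sketch below assumes integer satoshi amounts; the function name `check_implied_fee` and the `max_fee` threshold are hypothetical choices for illustration, not part of Bitcoin or PyChain.

```python
def check_implied_fee(input_total: int, output_amounts: list[int], max_fee: int) -> int:
    """Return the implied fee in satoshis, or raise if it looks like a mistake.

    max_fee is a hypothetical safety limit chosen by the wallet author.
    """
    fee = input_total - sum(output_amounts)
    if fee < 0:
        raise ValueError(f"outputs exceed inputs by {-fee} satoshis")
    if fee > max_fee:
        raise ValueError(f"implied fee {fee} exceeds limit {max_fee}; missing change output?")
    return fee


# 10 BTC in, 7 BTC to Bob, 2.99 BTC change: fee is 0.01 BTC
print(check_implied_fee(1_000_000_000, [700_000_000, 299_000_000], max_fee=5_000_000))

# Change output forgotten: the 3 BTC implied fee trips the guard
try:
    check_implied_fee(1_000_000_000, [700_000_000], max_fee=5_000_000)
except ValueError as err:
    print(f"Rejected: {err}")
```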
We'll simplify this UTXO model in PyChain to start. Today we build the basic structure with sender, recipient, and amount fields, then expand to UTXO in Lesson 7. Why simplify first? In my experience, trying to implement the full UTXO from the start buries the more important concept of the sign-verify flow.
Step 1: The Transaction Skeleton — Serialization and txid
To create a transaction, we first need to structure the data. Let's start with a Python dictionary.
# Basic structure of transaction data
import json
import hashlib
tx_data = {
"sender": "alice_address_abc123",
"recipient": "bob_address_def456",
"amount": 7.0
}
# Serialization: dictionary → sorted JSON string
# sort_keys=True is crucial — different key order means different hash!
tx_string = json.dumps(tx_data, sort_keys=True)
print(f"Serialized transaction: {tx_string}")
# Generate txid: create a 'fingerprint' with SHA-256 from Lesson 1
txid = hashlib.sha256(tx_string.encode()).hexdigest()
print(f"Transaction ID (txid): {txid}")
# Expected output:
# Serialized transaction: {"amount": 7.0, "recipient": "bob_address_def456", "sender": "alice_address_abc123"}
# Transaction ID (txid): 6a7f8d... (64-character hex hash)
In this code, json.dumps(tx_data, sort_keys=True) converts the dictionary to a JSON string sorted alphabetically by key. Then we feed it into SHA-256 to get a 64-character hash — the txid. The information "who, to whom, how much" stored in the dictionary gets compressed into a single unique fingerprint.
So what happens if we remove sort_keys=True?
❌ Don't do this:
# Serialize without sort_keys
tx_string_bad1 = json.dumps({"sender": "alice", "recipient": "bob", "amount": 7.0})
tx_string_bad2 = json.dumps({"amount": 7.0, "sender": "alice", "recipient": "bob"})
hash1 = hashlib.sha256(tx_string_bad1.encode()).hexdigest()
hash2 = hashlib.sha256(tx_string_bad2.encode()).hexdigest()
print(f"Hash 1: {hash1}")
print(f"Hash 2: {hash2}")
print(f"Same content, same hash? {hash1 == hash2}")
# Expected output:
# Hash 1: 3f2a1b...
# Hash 2: 9c7d4e...
# Same content, same hash? False
The same transaction produces different txids. The "Avalanche Effect" from Lesson 1 triggers here. Change even 1 bit of input and the hash changes completely. Different JSON key order means a different string, which inevitably means a different hash. I once spent 3 days debugging a production system because of this. Serialization must always be deterministic.
✅ Always use sort_keys=True.
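Key order is not the only thing that changes the serialized bytes. Whitespace between tokens and the numeric type of a value (7 versus 7.0) do too. The sketch below uses a hypothetical helper, `canonical_txid`, that pins the separators as well. Because of the compact separators, its hashes will not match the txids computed elsewhere in this lesson; treat it as an illustration of the idea, not a drop-in replacement.

```python
import hashlib
import json


def canonical_txid(tx_data: dict) -> str:
    """Hash a canonical JSON form: sorted keys, fixed separators, ASCII only."""
    tx_string = json.dumps(tx_data, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(tx_string.encode("utf-8")).hexdigest()


a = canonical_txid({"sender": "alice", "recipient": "bob", "amount": 7.0})
b = canonical_txid({"amount": 7.0, "recipient": "bob", "sender": "alice"})
print(a == b)  # True: key order no longer matters

# Value types still matter: 7 and 7.0 serialize as "7" and "7.0"
c = canonical_txid({"sender": "alice", "recipient": "bob", "amount": 7})
print(a == c)  # False
```

The second comparison is one more reason to settle on a single numeric representation for amounts before hashing.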
🤔 Think about it: Are there other ways to guarantee deterministic serialization besides json.dumps with sort_keys=True?
💡 Hint
Think about Python's collections.OrderedDict, or binary serialization formats like Protocol Buffers or MessagePack.
View answer
There are several options:
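- OrderedDict: explicitly fix key order
- Protocol Buffers / MessagePack: binary formats where you control field order (Protocol Buffers through its schema)
- CBOR (Concise Binary Object Representation): a binary format with a defined deterministic encoding, used by Cardano among others (Ethereum's consensus layer uses its own format, SSZ)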
Real Bitcoin doesn't use JSON but its own binary serialization format. In PyChain we use sort_keys=True JSON for learning purposes, but binary formats are the standard in production.
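To get a feel for binary serialization, here is a minimal sketch built on the standard `struct` module. The layout is hypothetical and is not Bitcoin's real wire format: each string is length-prefixed and the amount is an unsigned 64-bit integer in satoshis. The function name `serialize_tx_binary` is made up for this example.

```python
import hashlib
import struct


def serialize_tx_binary(sender: str, recipient: str, amount_satoshi: int) -> bytes:
    """Hypothetical fixed-layout encoding (NOT Bitcoin's real wire format).

    Layout: [len][sender bytes][len][recipient bytes][amount as uint64],
    all integers big-endian. Field order is fixed by the code, not by a dict.
    """
    s = sender.encode("utf-8")
    r = recipient.encode("utf-8")
    return (
        struct.pack(">H", len(s)) + s
        + struct.pack(">H", len(r)) + r
        + struct.pack(">Q", amount_satoshi)
    )


raw = serialize_tx_binary("alice_addr", "bob_addr", 700_000_000)
print(len(raw))  # 2 + 10 + 2 + 8 + 8 = 30 bytes
print(hashlib.sha256(raw).hexdigest())
```

Since the byte layout is decided by the function, there is no key order or whitespace to get wrong, which is the main appeal of binary formats for hashing.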
Step 2: Connecting to Wallet — Adding the Signature
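Now that we know how to structure data and create a txid, let's use the Wallet from Lesson 2 to add a signature to the transaction. The core flow is: build the transaction data, hash it into a txid, sign the txid with the sender's private key, then attach the signature and public key to the transaction.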
An important point here: the signature is made over the txid (the hash of the transaction). We don't sign the entire transaction data directly — we sign the hash of that data. The reason was covered in Lesson 2 — ECDSA signing works on fixed-length messages, and SHA-256 compresses data of any size to 32 bytes.
# We need wallet.py from previous lessons (full code provided in project update below)
import json
import hashlib

# === Building blocks: transaction data and txid ===
# (signing with the ecdsa-based wallet.py is added in Step 5)

def create_tx_data(sender, recipient, amount):
    """Create a transaction data dictionary"""
    return {
        "sender": sender,
        "recipient": recipient,
        "amount": amount
    }

def get_txid(tx_data):
    """Serialize transaction data and return its SHA-256 hash (txid)"""
    tx_string = json.dumps(tx_data, sort_keys=True)
    return hashlib.sha256(tx_string.encode()).hexdigest()

# Create transaction
tx_data = create_tx_data("alice_addr", "bob_addr", 5.0)
txid = get_txid(tx_data)
print(f"Transaction data: {tx_data}")
print(f"txid: {txid}")
print(f"txid length: {len(txid)} chars (SHA-256 = always 64 chars)")

# Expected output:
# Transaction data: {'sender': 'alice_addr', 'recipient': 'bob_addr', 'amount': 5.0}
# txid: a1b2c3d4... (64-character hash)
# txid length: 64 chars (SHA-256 = always 64 chars)
This code uses create_tx_data() to create a dictionary and get_txid() to convert it to a sorted JSON string and extract the SHA-256 hash. It's the same process we did manually in Step 1, now organized into functions. Signing the txid with the Wallet's sign() method completes the transaction — the full integration is shown in Step 5.
3-Stage Evolution of Transaction Construction — From Novice to Pro
In Steps 1 and 2, we learned the individual pieces of serialization and signing. This is where many beginners make a mistake. They have no sense of "how complete does a transaction need to be?" Compare the three stages below. This is the actual path I went through when I first wrote blockchain code.
❌ WRONG WAY — Broadcasting a raw dictionary as-is
# "Can't we just send the data?"
import json

# 🚫 Transaction with no signature and no txid
raw_tx = {
    "sender": "alice_addr",
    "recipient": "bob_addr",
    "amount": 5.0
}

# Assume we broadcast this directly to the network
def send_to_network(tx):
    print(f"Broadcast: {json.dumps(tx)}")
    return True

send_to_network(raw_tx)

# The problem: anyone can create a dictionary like this!
fake_tx = {
    "sender": "alice_addr",    # Pretending to be Alice!
    "recipient": "eve_addr",   # To Eve (the attacker)
    "amount": 1000.0           # 1000 BTC!
}
send_to_network(fake_tx)  # Broadcasts without any resistance 😱

# Expected output:
# Broadcast: {"sender": "alice_addr", "recipient": "bob_addr", "amount": 5.0}
# Broadcast: {"sender": "alice_addr", "recipient": "eve_addr", "amount": 1000.0}
What's the problem? Without a txid there's no way to verify data integrity, and without a signature there's no way to tell whether Alice really sent it or Eve is impersonating her. This is like dropping an unsigned check in a mailbox — anyone can write the issuer's name.
🤔 BETTER — Adding txid to ensure integrity
import json
import hashlib

def build_tx_with_id(sender, recipient, amount):
    """Create a transaction with txid included"""
    tx_data = {
        "sender": sender,
        "recipient": recipient,
        "amount": amount
    }
    tx_string = json.dumps(tx_data, sort_keys=True)
    txid = hashlib.sha256(tx_string.encode()).hexdigest()
    return {**tx_data, "txid": txid}

tx = build_tx_with_id("alice_addr", "bob_addr", 5.0)
print(f"Transaction: {tx['txid'][:16]}... | {tx['amount']} BTC")

# Verification: can detect if amount was tampered with in transit
tx["amount"] = 500.0  # Attacker tampers with amount!
recalc = hashlib.sha256(
    json.dumps({"sender": tx["sender"], "recipient": tx["recipient"],
                "amount": tx["amount"]}, sort_keys=True).encode()
).hexdigest()
print(f"txid match? {tx['txid'] == recalc}")  # False → tampering detected!

# ⚠️ But... what if the attacker also recalculates the txid?
tx["txid"] = recalc  # Recalculate txid too!
print(f"txid match after recalculation? {tx['txid'] == recalc}")  # True → tampering not caught!

# Expected output:
# Transaction: a1b2c3d4e5f67890... | 5.0 BTC
# txid match? False
# txid match after recalculation? True
One step forward. Simple data tampering can be caught by comparing txids. But what if a crafty attacker changes both the data and the txid together? It still gets through. A hash guarantees integrity only — not authentication (who created it).
✅ BEST — Complete protection with txid + digital signature
from ecdsa import SigningKey, VerifyingKey, SECP256k1, BadSignatureError
import json
import hashlib

def build_signed_tx(sender_sk, sender_address, recipient, amount):
    """Create a complete signed transaction"""
    # 1. Build core data
    tx_data = {"sender": sender_address, "recipient": recipient, "amount": amount}
    tx_string = json.dumps(tx_data, sort_keys=True)
    # 2. Generate txid (integrity)
    txid = hashlib.sha256(tx_string.encode()).hexdigest()
    # 3. Sign with private key (authentication) — this is the decisive difference!
    signature = sender_sk.sign(txid.encode())
    public_key = sender_sk.get_verifying_key()
    return {
        **tx_data,
        "txid": txid,
        "signature": signature.hex(),
        "public_key": public_key.to_string().hex()
    }

def verify_signed_tx(tx):
    """Full verification of a signed transaction"""
    # Recalculate txid
    tx_data = {"sender": tx["sender"], "recipient": tx["recipient"], "amount": tx["amount"]}
    expected_txid = hashlib.sha256(
        json.dumps(tx_data, sort_keys=True).encode()
    ).hexdigest()
    if tx["txid"] != expected_txid:
        return False, "txid mismatch"
    # Verify signature — attackers cannot get past this wall
    try:
        vk = VerifyingKey.from_string(bytes.fromhex(tx["public_key"]), curve=SECP256k1)
        vk.verify(bytes.fromhex(tx["signature"]), tx["txid"].encode())
        return True, "valid"
    except BadSignatureError:
        return False, "invalid signature"

# Generate Alice's keys
alice_sk = SigningKey.generate(curve=SECP256k1)
alice_addr = hashlib.sha256(alice_sk.get_verifying_key().to_string()).hexdigest()[:40]

# Create signed transaction
tx = build_signed_tx(alice_sk, alice_addr, "bob_addr", 5.0)
valid, msg = verify_signed_tx(tx)
print(f"Normal transaction: {valid} ({msg})")

# Attack attempt: tamper with amount + recalculate txid
tx["amount"] = 500.0
tx["txid"] = hashlib.sha256(
    json.dumps({"sender": tx["sender"], "recipient": tx["recipient"],
                "amount": tx["amount"]}, sort_keys=True).encode()
).hexdigest()
valid, msg = verify_signed_tx(tx)
print(f"After crafty attack: {valid} ({msg})")
# → The signature was created for the original 5.0 BTC txid, so it mismatches the new 500.0 BTC txid!

# Expected output:
# Normal transaction: True (valid)
# After crafty attack: False (invalid signature)
Comparison summary:
| Approach | Integrity (tamper detection) | Authentication (impersonation prevention) | Defense against crafty attacks |
|---|---|---|---|
| ❌ Raw dictionary | Impossible | Impossible | Impossible |
| 🤔 Add txid | Possible | Impossible | Impossible |
| ✅ txid + signature | Possible | Possible | Possible |
Remember this evolution. Hash (integrity) + Signature (authentication) = Complete protection. Neither alone is sufficient. This is the fundamental principle of blockchain transaction design, and the Transaction class we implement in Steps 3–5 is the code embodiment of this "✅ BEST" stage.
Step 3: Verification — How a Third Party Checks a Transaction
The core value of blockchain is trustless verification. Bob doesn't need to trust Alice. He just checks mathematically. There are four things to verify in the validation logic:
- Is the signature valid? — verify the signature with the sender's public key
- Is the amount positive? — a negative-amount transaction is essentially theft
- Are sender and recipient different? — sending to yourself is (usually) meaningless
- Is the txid correct? — re-hash the transaction data and compare against the txid
# Core logic of the transaction validation function
def validate_transaction_basic(tx):
    """Basic transaction validity check (excluding signature verification)"""
    errors = []

    # 1. Check required fields exist
    required = ["sender", "recipient", "amount", "txid"]
    for field in required:
        if field not in tx:
            errors.append(f"Missing required field: {field}")
    if errors:
        return False, errors

    # 2. Validate amount — must be positive
    if tx["amount"] <= 0:
        errors.append(f"Invalid amount: {tx['amount']} (must be positive)")

    # 3. Verify integrity by recalculating txid
    tx_data = {
        "sender": tx["sender"],
        "recipient": tx["recipient"],
        "amount": tx["amount"]
    }
    expected_txid = get_txid(tx_data)  # function created above
    if tx["txid"] != expected_txid:
        errors.append("txid mismatch — possible data tampering")

    is_valid = len(errors) == 0
    return is_valid, errors

# Validate a normal transaction
good_tx = {"sender": "alice", "recipient": "bob", "amount": 5.0}
good_tx["txid"] = get_txid(good_tx)
valid, errs = validate_transaction_basic(good_tx)
print(f"Normal transaction: valid={valid}, errors={errs}")

# Validate a tampered transaction — amount changed to 50
tampered_tx = dict(good_tx)    # copy
tampered_tx["amount"] = 50.0   # Tamper with amount! But txid stays the same
valid, errs = validate_transaction_basic(tampered_tx)
print(f"Tampered transaction: valid={valid}, errors={errs}")

# Expected output:
# Normal transaction: valid=True, errors=[]
# Tampered transaction: valid=False, errors=['txid mismatch — possible data tampering']
This function checks whether required fields exist, validates that the amount is positive, and recalculates the txid directly to compare against the existing one. Changing the amount from 5 to 50 made the recalculated txid completely different, instantly detecting the tampering. The integrity verification power of the hash function from Lesson 1 shines here.
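The function above covers the amount and txid checks from the list, but not the "are sender and recipient different?" check. A minimal sketch of how that could be added, assuming a hypothetical helper named `check_parties` whose error list would be merged into the validator's `errors`:

```python
def check_parties(tx: dict) -> list[str]:
    """Hypothetical extra check: sender and recipient must be non-empty and differ."""
    errors = []
    sender = tx.get("sender", "")
    recipient = tx.get("recipient", "")
    if not sender or not recipient:
        errors.append("Sender and recipient must both be set")
    elif sender == recipient:
        errors.append("Sender and recipient are the same address")
    return errors


print(check_parties({"sender": "alice", "recipient": "bob", "amount": 5.0}))
# []
print(check_parties({"sender": "alice", "recipient": "alice", "amount": 5.0}))
# ['Sender and recipient are the same address']
```

Whether a self-transfer should be rejected is a design choice. Once change outputs arrive with the UTXO model, sending value back to your own address becomes normal.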
🤔 Think about it: What if an attacker tampers with the amount and simultaneously recalculates and inserts a new txid? Isn't txid verification alone insufficient?
View answer
Exactly right! txid alone is insufficient. If an attacker changes the amount and recalculates the txid, it passes txid validation. That's why digital signatures are needed. A signature can only be created with the original sender's private key, so if an attacker tampers with the data, the signature becomes invalid. As confirmed in Lesson 2 — if data changes by even 1 bit, signature verification fails. Hash (integrity) + Signature (authentication) = Complete protection.
Step 4: The Secret of Fees — Why Miners Process Transactions
We've secured transaction integrity and authentication with hashes and signatures. But one question remains: why do miners include other people's transactions in their blocks? It's not charity. It's because of an economic incentive called a fee.
# Fee calculation simulation
def calculate_fee(input_amount, outputs):
    """
    Fee = Sum of inputs - Sum of outputs
    Negative means invalid transaction (attempt to spend nonexistent money)
    """
    output_total = sum(outputs)
    fee = input_amount - output_total
    return fee

# Scenario 1: Normal transaction
fee1 = calculate_fee(
    input_amount=10.0,
    outputs=[7.0, 2.99]  # 7 to Bob, 2.99 change
)
print(f"Scenario 1 — Fee: {fee1} BTC")  # about 0.01 BTC

# Scenario 2: Forgot the change output!
fee2 = calculate_fee(
    input_amount=10.0,
    outputs=[7.0]  # No change output
)
print(f"Scenario 2 — Fee: {fee2} BTC (accidentally donating 3 BTC!)")

# Scenario 3: Outputs exceed inputs (invalid!)
fee3 = calculate_fee(
    input_amount=10.0,
    outputs=[7.0, 5.0]  # Total 12 > input 10
)
print(f"Scenario 3 — Fee: {fee3} BTC (negative = invalid transaction!)")

# Expected output:
# Scenario 1 — Fee: 0.009999999999999787 BTC
# Scenario 2 — Fee: 3.0 BTC (accidentally donating 3 BTC!)
# Scenario 3 — Fee: -2.0 BTC (negative = invalid transaction!)
This code shows fee calculation with three scenarios. Scenario 1 yields the intended fee of about 0.01 BTC. Scenario 2 omits the change output, sending all 3 BTC to the miner. Scenario 3 has outputs exceeding inputs, which gives a negative fee: an invalid transaction attempting to spend nonexistent money.
🔍 Deep dive: The floating-point trap
Did you notice 0.009999999999999787 in the output above? 10.0 - (7.0 + 2.99) is not exactly 0.01. This is a limitation of IEEE 754 floating-point: values like 2.99 have no exact binary representation. Real Bitcoin avoids this problem by handling amounts in satoshi units as integers. 1 BTC = 100,000,000 satoshis. PyChain can be converted to integer-based later, but for now we'll use floats for learning convenience.
# The actual Bitcoin way — satoshi (integer) based
input_satoshi = 1_000_000_000 # 10 BTC
output1_satoshi = 700_000_000 # 7 BTC
output2_satoshi = 299_000_000 # 2.99 BTC
fee_satoshi = input_satoshi - output1_satoshi - output2_satoshi
print(f"Satoshi-based fee: {fee_satoshi} satoshi = {fee_satoshi / 1e8} BTC")
# Output: Satoshi-based fee: 1000000 satoshi = 0.01 BTC
Using floats in DeFi contracts can throw off funding rate calculations by millions of dollars. This is exactly why Solidity has no float type at all.
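If amounts arrive as human-readable BTC values, one way to reach integer satoshis without passing through a float is Python's `decimal` module. This is a sketch under that assumption; the helper names `btc_to_satoshi` and `satoshi_to_btc` are hypothetical and not part of PyChain.

```python
from decimal import Decimal

SATOSHI_PER_BTC = 100_000_000


def btc_to_satoshi(amount_btc: str) -> int:
    """Convert a decimal BTC string to integer satoshis without float rounding."""
    satoshi = Decimal(amount_btc) * SATOSHI_PER_BTC
    if satoshi != satoshi.to_integral_value():
        raise ValueError(f"{amount_btc} BTC has more than 8 decimal places")
    return int(satoshi)


def satoshi_to_btc(amount_satoshi: int) -> str:
    """Format integer satoshis as a BTC string with 8 decimal places."""
    return f"{Decimal(amount_satoshi) / SATOSHI_PER_BTC:.8f}"


fee = btc_to_satoshi("10") - btc_to_satoshi("7") - btc_to_satoshi("2.99")
print(fee)                  # 1000000
print(satoshi_to_btc(fee))  # 0.01000000
```

Passing the amount as a string matters here: `Decimal(2.99)` would inherit the float's rounding error, while `Decimal("2.99")` is exact.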
Step 5: Putting It All Together — A Complete Signed Transaction
We've covered every individual piece: serialization, txid, validation, and fees. Now let's put it all together. Here's the complete flow of actually signing and verifying using the ecdsa library.
# Creating and verifying a complete signed transaction
from ecdsa import SigningKey, VerifyingKey, SECP256k1, BadSignatureError
import json
import hashlib

# --- Step 1: Create wallets (Lesson 2 review) ---
alice_sk = SigningKey.generate(curve=SECP256k1)
alice_vk = alice_sk.get_verifying_key()
alice_address = hashlib.sha256(alice_vk.to_string()).hexdigest()[:40]

bob_sk = SigningKey.generate(curve=SECP256k1)
bob_vk = bob_sk.get_verifying_key()
bob_address = hashlib.sha256(bob_vk.to_string()).hexdigest()[:40]

print(f"Alice address: {alice_address}")
print(f"Bob address: {bob_address}")

# --- Step 2: Create transaction data ---
tx_data = {
    "sender": alice_address,
    "recipient": bob_address,
    "amount": 5.0
}
tx_string = json.dumps(tx_data, sort_keys=True)
txid = hashlib.sha256(tx_string.encode()).hexdigest()

# --- Step 3: Alice signs with her private key ---
signature = alice_sk.sign(txid.encode())

# --- Step 4: Complete transaction object ---
transaction = {
    **tx_data,
    "txid": txid,
    "signature": signature.hex(),
    "public_key": alice_vk.to_string().hex()
}
print(f"\nTransaction created!")
print(f"txid: {txid[:16]}...")
print(f"Signature length: {len(signature.hex())} chars")

# --- Step 5: Third party (network node) verifies ---
# The verifier doesn't know Alice's private key. They only need the public key!
vk_from_tx = VerifyingKey.from_string(
    bytes.fromhex(transaction["public_key"]),
    curve=SECP256k1
)
try:
    vk_from_tx.verify(
        bytes.fromhex(transaction["signature"]),
        transaction["txid"].encode()
    )
    print("✅ Signature verification succeeded — transaction valid!")
except BadSignatureError:
    print("❌ Signature verification failed — transaction invalid!")

# Expected output:
# Alice address: 7a3f2b... (40 chars)
# Bob address: 9d1e8c... (40 chars)
#
# Transaction created!
# txid: a1b2c3d4e5f67890...
# Signature length: 128 chars
# ✅ Signature verification succeeded — transaction valid!
This code is the heart of blockchain. Step 1 generates key pairs and addresses for Alice and Bob. Step 2 sorts-serializes the transaction data and extracts the txid. Step 3 signs the txid with Alice's private key. Step 4 bundles everything into a single transaction object. Step 5 lets a third party verify the signature using only the public key. No trusted third party needed — no bank, no notary.
🔨 Project Update
Time to add the third module to our PyChain project. First, let's review the code so far.
hash_utils.py from Lesson 1:
# hash_utils.py — SHA-256 hash utilities (Lesson 1)
import hashlib

def sha256_hash(data: str) -> str:
    """Returns the SHA-256 hash of string data as a hex string"""
    return hashlib.sha256(data.encode('utf-8')).hexdigest()

def double_hash(data: str) -> str:
    """Bitcoin-style double SHA-256 hash"""
    first = hashlib.sha256(data.encode('utf-8')).digest()
    return hashlib.sha256(first).hexdigest()

# Test
if __name__ == "__main__":
    test = "Hello, PyChain!"
    print(f"Input: {test}")
    print(f"SHA-256: {sha256_hash(test)}")
    print(f"Double hash: {double_hash(test)}")
wallet.py from Lesson 2:
# wallet.py — Wallet class (Lesson 2)
from ecdsa import SigningKey, VerifyingKey, SECP256k1, BadSignatureError
import hashlib

class Wallet:
    """Bitcoin-style wallet — key pair generation, address derivation, signing/verification"""

    def __init__(self):
        # Generate private key (SECP256k1 elliptic curve)
        self.private_key = SigningKey.generate(curve=SECP256k1)
        # Derive public key (private → public is one-way)
        self.public_key = self.private_key.get_verifying_key()
        # Address: first 40 chars of SHA-256 hash of public key
        self.address = hashlib.sha256(
            self.public_key.to_string()
        ).hexdigest()[:40]

    def sign(self, message: str) -> str:
        """Sign a message and return the hex signature string"""
        signature = self.private_key.sign(message.encode('utf-8'))
        return signature.hex()

    def get_public_key_hex(self) -> str:
        """Return the public key as a hex string"""
        return self.public_key.to_string().hex()

    @staticmethod
    def verify(public_key_hex: str, signature_hex: str, message: str) -> bool:
        """Verify a signature with a public key — anyone can do this"""
        try:
            vk = VerifyingKey.from_string(
                bytes.fromhex(public_key_hex),
                curve=SECP256k1
            )
            vk.verify(bytes.fromhex(signature_hex), message.encode('utf-8'))
            return True
        except BadSignatureError:
            return False

# Test
if __name__ == "__main__":
    w = Wallet()
    print(f"Address: {w.address}")
    msg = "test message"
    sig = w.sign(msg)
    print(f"Signature: {sig[:32]}...")
    print(f"Verification: {Wallet.verify(w.get_public_key_hex(), sig, msg)}")
🆕 transaction.py added in Lesson 3:
# transaction.py — Transaction class (Lesson 3)
import hashlib
import json
from hash_utils import sha256_hash
from wallet import Wallet

class Transaction:
    """Blockchain transaction — structures sender, recipient, and amount"""

    def __init__(self, sender: str, recipient: str, amount: float,
                 signature: str = "", public_key: str = ""):
        self.sender = sender                # Sender address
        self.recipient = recipient          # Recipient address
        self.amount = amount                # Transfer amount
        self.signature = signature          # Digital signature (hex)
        self.public_key = public_key        # Sender's public key (hex)
        self.txid = self._calculate_txid()  # Unique transaction ID

    def _get_tx_data(self) -> dict:
        """Extract only the core data that gets signed"""
        return {
            "sender": self.sender,
            "recipient": self.recipient,
            "amount": self.amount
        }

    def _calculate_txid(self) -> str:
        """SHA-256 hash of transaction data = txid"""
        tx_string = json.dumps(self._get_tx_data(), sort_keys=True)
        return sha256_hash(tx_string)

    def to_dict(self) -> dict:
        """Convert transaction to dictionary (for serialization)"""
        return {
            "txid": self.txid,
            "sender": self.sender,
            "recipient": self.recipient,
            "amount": self.amount,
            "signature": self.signature,
            "public_key": self.public_key
        }

    def __repr__(self):
        return (f"TX({self.txid[:8]}... | "
                f"{self.sender[:8]}→{self.recipient[:8]} | "
                f"{self.amount} BTC)")

def create_transaction(sender_wallet: Wallet, recipient_address: str,
                       amount: float) -> Transaction:
    """Create a signed transaction using a wallet"""
    # 1. Create transaction skeleton
    tx = Transaction(
        sender=sender_wallet.address,
        recipient=recipient_address,
        amount=amount
    )
    # 2. Sign txid with wallet's private key
    tx.signature = sender_wallet.sign(tx.txid)
    tx.public_key = sender_wallet.get_public_key_hex()
    return tx

def validate_transaction(tx: Transaction) -> tuple[bool, list[str]]:
    """Validate a transaction — anyone can run this"""
    errors = []

    # Check 1: Verify amount is positive
    if tx.amount <= 0:
        errors.append(f"Invalid amount: {tx.amount}")

    # Check 2: Verify data integrity by recalculating txid
    expected_txid = tx._calculate_txid()
    if tx.txid != expected_txid:
        errors.append("txid mismatch — suspected data tampering")

    # Check 3: Verify signature exists
    if not tx.signature or not tx.public_key:
        errors.append("Missing signature or public key")
        return False, errors

    # Check 4: Public key must hash to the sender address (same rule as Wallet.address).
    # Without this, anyone could sign with their own key while claiming another sender.
    derived_address = hashlib.sha256(bytes.fromhex(tx.public_key)).hexdigest()[:40]
    if derived_address != tx.sender:
        errors.append("Public key does not match sender address")

    # Check 5: Verify digital signature (the key check!)
    is_sig_valid = Wallet.verify(tx.public_key, tx.signature, tx.txid)
    if not is_sig_valid:
        errors.append("Signature verification failed — forged transaction")

    return len(errors) == 0, errors

# === Test: Run the full flow ===
if __name__ == "__main__":
    print("=" * 60)
    print("PyChain Transaction System Test")
    print("=" * 60)

    # 1. Create wallets
    alice = Wallet()
    bob = Wallet()
    print(f"\n👛 Alice address: {alice.address}")
    print(f"👛 Bob address: {bob.address}")

    # 2. Create Alice → Bob 5 BTC transaction
    tx = create_transaction(alice, bob.address, 5.0)
    print(f"\n📝 Transaction created: {tx}")
    print(f" txid: {tx.txid}")
    print(f" signature: {tx.signature[:32]}...")

    # 3. Verify (anyone can do this)
    is_valid, errors = validate_transaction(tx)
    print(f"\n✅ Validation result: valid={is_valid}")

    # 4. Tampering attempt — change amount to 500
    print("\n--- Attack simulation: amount tampering ---")
    tx.amount = 500.0
    # Without recalculating txid, caught by txid mismatch
    # (the signature still matches the stale txid, so only the txid check fires)
    is_valid, errors = validate_transaction(tx)
    print(f"After tampering: valid={is_valid}, errors={errors}")

    # 5. Crafty attack — tamper amount + recalculate txid
    print("\n--- Crafty attack: tamper amount + recalculate txid ---")
    tx.amount = 500.0
    tx.txid = tx._calculate_txid()  # Recalculate txid too!
    is_valid, errors = validate_transaction(tx)
    print(f"After crafty tampering: valid={is_valid}, errors={errors}")
    print("→ The signature was made for the original txid (5 BTC), so it mismatches the new txid (500 BTC)!")

# Expected output:
# ============================================================
# PyChain Transaction System Test
# ============================================================
#
# 👛 Alice address: 7a3f2b9e... (40 chars)
# 👛 Bob address: 9d1e8cf4... (40 chars)
#
# 📝 Transaction created: TX(a1b2c3d4... | 7a3f2b9e→9d1e8cf4 | 5.0 BTC)
# txid: a1b2c3d4...
# signature: c4d5e6f7... (first 32 hex chars of the raw 64-byte signature)
#
# ✅ Validation result: valid=True
#
# --- Attack simulation: amount tampering ---
# After tampering: valid=False, errors=['txid mismatch — suspected data tampering']
#
# --- Crafty attack: tamper amount + recalculate txid ---
# After crafty tampering: valid=False, errors=['Signature verification failed — forged transaction']
# → The signature was made for the original txid (5 BTC), so it mismatches the new txid (500 BTC)!
Run the project you've built so far:
# Install required library
pip install ecdsa
# File structure:
# pychain/
# ├── hash_utils.py (Lesson 1)
# ├── wallet.py (Lesson 2)
# └── transaction.py (Lesson 3) ← Added today!
# Run:
python transaction.py
Review — Self-Check Checklist
If you've run the code, use the checklist below to assess your understanding.
| # | Check item | Done |
|---|---|---|
| 1 | Can you explain the 5 fields of a transaction (sender, recipient, amount, signature, txid)? | ☐ |
| 2 | Can you explain that txid is a SHA-256 hash and why sort_keys=True is necessary? | ☐ |
| 3 | Do you understand the difference between "amount-only tampering" and "amount+txid recalculation" attacks? | ☐ |
| 4 | Do you understand the formula: fee = sum of inputs − sum of outputs? | ☐ |
| 5 | Can you explain why create_transaction() requires the wallet's private key? | ☐ |
Top 3 common mistakes:
- Ignoring serialization order: forgetting sort_keys=True produces different txids for the same transaction
- Signing the wrong target: you must sign the txid (the hash of core data), not the entire transaction
- Omitting the public key from the transaction: the verifier has no way to check the signature without the public key
Next Level — How a Senior Engineer Thinks Differently
When I audit actual DeFi protocols, there are things I always check in transactions:
- Replay Attack defense: An attack that reuses the same transaction on a different chain. This actually happened on Ethereum before EIP-155 came out. Adding a chain_id field to our Transaction class defends against this.
- Nonce field: Sending the same transaction twice produces the same txid. To prevent this, each transaction needs an incrementing nonce (sequence number).
- Gas Limit: In Ethereum, a transaction specifies not just a fee but also a cap on how much computation it may consume. It's a safeguard protecting the network from infinite-loop contracts.
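The nonce idea from the list above can be sketched in a few lines. This is an illustration on plain dictionaries, not the Transaction class; `txid_with_nonce` and `NonceTracker` are hypothetical names, and the rule "accept only the exact next nonce" is one simple policy among several possible ones.

```python
import hashlib
import json


def txid_with_nonce(sender: str, recipient: str, amount: float, nonce: int) -> str:
    """txid over data that includes a per-sender sequence number."""
    data = {"sender": sender, "recipient": recipient, "amount": amount, "nonce": nonce}
    return hashlib.sha256(json.dumps(data, sort_keys=True).encode()).hexdigest()


class NonceTracker:
    """Hypothetical node-side record of the next nonce expected from each sender."""

    def __init__(self):
        self.next_nonce: dict[str, int] = {}

    def accept(self, sender: str, nonce: int) -> bool:
        """Accept only the exact next nonce, so a resent transaction is rejected."""
        expected = self.next_nonce.get(sender, 0)
        if nonce != expected:
            return False
        self.next_nonce[sender] = expected + 1
        return True


# Two identical payments get different txids because the nonce differs
print(txid_with_nonce("alice", "bob", 5.0, 0) == txid_with_nonce("alice", "bob", 5.0, 1))  # False

tracker = NonceTracker()
print(tracker.accept("alice", 0))  # True: first transaction
print(tracker.accept("alice", 0))  # False: same nonce replayed
print(tracker.accept("alice", 1))  # True: next in sequence
```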
🔍 Deep dive: What is a Replay Attack?
When Ethereum and Ethereum Classic split in 2016 (hard fork), a transaction sent on one chain could be copied and submitted to the other chain where it would execute identically. If Alice sent Bob 10 ETH on the ETH chain, an attacker could submit that same transaction to the ETC chain and Alice's 10 ETC would also go to Bob. This was later fixed in EIP-155 by including chain_id in the signature.
# Example of adding chain_id for Replay Attack defense
tx_data_with_chain = {
"sender": "alice_addr",
"recipient": "bob_addr",
"amount": 5.0,
"chain_id": 1 # 1: Ethereum mainnet, 61: ETC
}
# Now ETH and ETC transactions have different txids, so reuse is impossible
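To see the effect concretely, the following sketch hashes the same payment under two chain IDs. It assumes the lesson's JSON-plus-SHA-256 txid scheme; the helper names `txid_for_chain` and `accepts` are hypothetical.

```python
import hashlib
import json


def txid_for_chain(chain_id: int) -> str:
    """txid of the same payment, bound to a given chain_id."""
    data = {"sender": "alice_addr", "recipient": "bob_addr", "amount": 5.0, "chain_id": chain_id}
    return hashlib.sha256(json.dumps(data, sort_keys=True).encode()).hexdigest()


def accepts(node_chain_id: int, tx_chain_id: int) -> bool:
    """A node would also reject transactions declared for another chain."""
    return node_chain_id == tx_chain_id


eth_txid = txid_for_chain(1)   # Ethereum mainnet
etc_txid = txid_for_chain(61)  # Ethereum Classic
print(eth_txid == etc_txid)  # False: a signature over one txid does not verify against the other
print(accepts(61, 1))        # False: an ETC node refuses a transaction marked for chain 1
```

Because the signature covers the txid and the txid covers chain_id, an attacker cannot simply edit the field to retarget the transaction.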
Final Summary Before the Quiz
Difficulty Fork
🟢 If it was easy
3-line summary:
- Transaction = sender + recipient + amount + signature + txid
- txid = SHA-256(serialized data), signature = sign(private_key, txid)
- Verification = compare recalculated txid + verify signature with public key
Preview of next lesson: In Lesson 4, we bundle multiple transactions into a single block. We'll learn the role of block headers, nonces, and timestamps, and implement the structure of how transactions fit into blocks.
🟡 If it was difficult
Let's revisit the transaction with a package delivery analogy:
- Sender address = shipper
- Recipient address = recipient
- Amount = contents of the package
- Signature = shipper's official seal — the courier verifies "yes, this person really sent it"
- txid = tracking number — auto-generated from the package contents, so if contents change, the number changes too
If the signature part is confusing, run the Wallet code from Lesson 2 again. Tracing through how sign() and verify() work by hand will make today's code much clearer.
Extra practice: In transaction.py, try changing the order of checks in validate_transaction(). What happens if you do signature verification first?
🔴 Challenge
Challenge 1 — Add Nonce:
Add a nonce field to the Transaction class so that even if sender→recipient→amount is identical, a different txid is generated each time. The nonce should be the cumulative number of transactions by the sender.
Challenge 2 — Multi-output transaction:
Extend the amount field from a single number to outputs: list[dict] ([{"address": "...", "amount": 3.0}, ...]). Implement fee calculation logic as well.
Interview question: "What is Bitcoin transaction malleability, and how did SegWit solve it?"
Interview question hint
Before SegWit, signature (witness) data was included in the txid calculation. Slightly altering the signature format (same mathematical signature but different encoding) would change the txid, making it impossible to predict a txid before the transaction was included in a block. SegWit separated signatures into a separate "witness" area and excluded them from txid calculation.
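A toy model makes the hint easier to see. It is a deliberate simplification using this lesson's JSON hashing rather than Bitcoin's real serialization, and the two signature strings are made-up placeholders standing in for "the same valid signature, encoded differently".

```python
import hashlib
import json

core = {"sender": "alice", "recipient": "bob", "amount": 5.0}


def legacy_style_txid(core: dict, signature_hex: str) -> str:
    """Toy pre-SegWit model: the signature bytes are part of what gets hashed."""
    payload = json.dumps({**core, "signature": signature_hex}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def segwit_style_txid(core: dict, signature_hex: str) -> str:
    """Toy SegWit model: the signature is carried separately and not hashed."""
    payload = json.dumps(core, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


# Placeholder values, not real signatures
sig_a = "3044aabb"
sig_b = "3045aabb"

print(legacy_style_txid(core, sig_a) == legacy_style_txid(core, sig_b))  # False: txid changed
print(segwit_style_txid(core, sig_a) == segwit_style_txid(core, sig_b))  # True: txid stable
```

PyChain's `_calculate_txid()` hashes only the sender, recipient, and amount, so in this one respect it already behaves like the second function.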