Vivory Codex

On-Chain Data Design: Structs, Mappings, and Building the Campaign Registry

Lesson 3

Learning Objectives

  • Define a struct that models a real-world entity with multiple typed fields
  • Use mappings with the counter pattern to store and retrieve records by unique ID
  • Explain the gas cost difference between storage writes and memory operations
  • Distinguish between storage references and memory copies when working with structs
  • Implement a createCampaign function that validates input, increments a counter, and stores structured data on-chain
  • Read structured data from a mapping using a view function with proper data location annotations

Last lesson, you deployed a contract skeleton with an owner address, a campaignCount set to zero, and a getCampaignCount() view function. You've got the bones. Now we need organs.

Here's the reality: every serious smart contract is fundamentally a database with business logic attached. Your FundChain contract needs to store campaigns — each with a creator, a title, a goal amount, a deadline, and more. A single uint256 won't cut it anymore. Today, you learn how Solidity lets you design complex on-chain data, and more importantly, how to do it without burning your users' ETH on gas.

I've seen contracts blow through $50k in unnecessary gas costs because the developer chose the wrong data structure. Arrays where they needed mappings. Storage writes where memory would do. By the end of today, you'll understand exactly why those mistakes happen — and you won't make them.


Today's Mission

Concrete output: A working FundChain contract with:

  • A Campaign struct holding 7 fields
  • A campaigns mapping linking IDs to Campaign data
  • A createCampaign() function that stores new campaigns on-chain
  • A getCampaign() view function to read them back
  • Tested in Remix: create 2+ campaigns, read them, verify they persist

You'll walk away able to design data models for any smart contract, not just crowdfunding.


Structs: Your Custom Data Blueprint

Remember from Lesson 2 — Solidity gives you primitive types: uint256, address, bool, string. But a crowdfunding campaign isn't a single number or a single address. It's a bundle of related data. That's what structs are for.

Think of a struct like a custom type you're inventing. Solidity doesn't know what a "Campaign" is — you teach it.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

contract StructDemo {
    // Define the blueprint — no data stored yet
    struct Campaign {
        address creator;
        string title;
        string description;
        uint256 goalAmount;
        uint256 currentAmount;
        uint256 deadline;
        bool isCompleted;
    }

    // Now USE the blueprint to create an actual variable
    Campaign public myCampaign;

    function setCampaign() public {
        myCampaign = Campaign(
            msg.sender,         // creator
            "Build a Bridge",   // title
            "Community bridge", // description
            5 ether,            // goalAmount
            0,                  // currentAmount
            block.timestamp + 30 days, // deadline
            false               // isCompleted
        );
    }
}
// Deploy → call setCampaign() → click myCampaign
// Output: tuple with all 7 fields populated

My strong opinion: always list struct fields in a logical order — identity fields first (creator), content fields next (title, description), financial fields (goalAmount, currentAmount), then temporal and status fields (deadline, isCompleted). I've reviewed dozens of production contracts, and the ones that are readable six months later all follow this pattern. The ones that don't? I've seen isCompleted wedged between title and goalAmount. Debugging those is misery.

🤔 Think about this: In Lesson 1, we said the EVM has three data areas — stack, memory, and storage. When you declare Campaign public myCampaign as a state variable, which area does it live in?

Answer

Storage. State variables always live in contract storage, the persistent key-value store that survives between transactions. This is the most expensive area to write to (around 20,000 gas for a fresh slot write), but it's the only place data persists permanently on-chain. Memory and the stack are transient: they're discarded as soon as the call that used them finishes.

Two Ways to Initialize a Struct

There's a positional syntax and a named-field syntax. I always prefer named fields in production — here's why:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

contract InitStyles {
    struct Campaign {
        address creator;
        string title;
        uint256 goalAmount;
    }

    Campaign public campA;
    Campaign public campB;

    function positionalInit() public {
        // ❌ Positional — which field is which?
        campA = Campaign(msg.sender, "Alpha Fund", 1 ether);
    }

    function namedInit() public {
        // ✅ Named — crystal clear, order doesn't matter
        campB = Campaign({
            title: "Beta Fund",
            creator: msg.sender,
            goalAmount: 2 ether
        });
    }
}
// Deploy → call both functions → click campA and campB
// campA: (your_address, "Alpha Fund", 1000000000000000000)
// campB: (your_address, "Beta Fund", 2000000000000000000)

Positional init looks clean with 3 fields. Now imagine 7 fields. Or 12. You will swap two uint256 values by accident, and the compiler won't catch it because the types match. I've done this. The contract deployed with goalAmount and deadline switched. Named fields eliminate that entire class of bugs.
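To make that failure mode concrete, here's a minimal sketch. The contract name SwapBug and its function names are made up for illustration and are not part of FundChain. Both functions compile cleanly, but the positional one quietly stores the deadline in goalAmount and the goal in deadline.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: SwapBug is a hypothetical demo contract.
contract SwapBug {
    struct Campaign {
        address creator;
        uint256 goalAmount;
        uint256 deadline;
    }

    Campaign public swapped;
    Campaign public correct;

    function positionalSwapped() external {
        // Arguments 2 and 3 are in the wrong order. Both are uint256,
        // so the types still match and the compiler accepts the call.
        swapped = Campaign(msg.sender, block.timestamp + 30 days, 5 ether);
    }

    function namedCorrect() external {
        // Each value is tied to a field name, so argument order is irrelevant.
        correct = Campaign({
            deadline: block.timestamp + 30 days,
            creator: msg.sender,
            goalAmount: 5 ether
        });
    }
}
```

Deploy it, call both functions, then compare swapped and correct: the second and third values trade places in swapped.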


Mappings: The On-Chain Key-Value Store

Arrays let you store ordered lists. But on-chain, you almost never want to iterate through a list — each step costs gas, and if your array grows to 10,000 items, a loop over it could exceed the block gas limit and make your function permanently uncallable.

Mappings are Solidity's answer. They're hash-based lookups: give me a key, I give you a value. Constant time. No iteration. That's the tradeoff — and for 90% of smart contract use cases, it's the right one.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

contract MappingDemo {
    // mapping(KeyType => ValueType) visibility name;
    mapping(uint256 => string) public names;
    mapping(address => uint256) public balances;

    function setData() public {
        names[1] = "Alice";
        names[2] = "Bob";
        balances[msg.sender] = 100;
    }

    function getData() public view returns (string memory, uint256) {
        return (names[1], balances[msg.sender]);
    }
}
// Deploy → call setData() → call getData()
// Output: ("Alice", 100)
// Try: names(1) → "Alice", names(99) → "" (default!)

Here's the critical thing most tutorials skip: every possible key in a mapping already "exists" with its default value. Call names(99) and you won't get an error — you'll get an empty string. Call balances on a random address and you'll get 0. The mapping doesn't know what's been explicitly set and what hasn't.

This matters enormously. You can't ask a mapping "how many entries do you have?" or "give me all the keys." That's why we need the counter pattern — more on that shortly.
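One common way to work around the "everything exists" behavior is to keep an explicit flag next to the data. The sketch below assumes a hypothetical ExistenceDemo contract with a companion isRegistered mapping; it's one option among several, not the approach FundChain uses.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: ExistenceDemo and isRegistered are hypothetical names.
contract ExistenceDemo {
    mapping(address => uint256) public balances;
    // Companion flag: true only for addresses that called register().
    mapping(address => bool) public isRegistered;

    function register(uint256 _startingBalance) external {
        balances[msg.sender] = _startingBalance;
        isRegistered[msg.sender] = true;
    }

    // Returns the flag alongside the value so callers can tell
    // "registered with balance 0" apart from "never registered".
    function lookup(address _who) external view returns (bool found, uint256 balance) {
        return (isRegistered[_who], balances[_who]);
    }
}
```

Call register(0) and then lookup with your own address: you get (true, 0). Call lookup with any other address and you get (false, 0). The balance alone could not tell those two cases apart.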

Feature               | Mapping               | Dynamic Array
----------------------|-----------------------|---------------------
Lookup by key/index   | O(1), constant        | O(1), constant
Check if key exists   | No built-in way       | Can check length
Iterate all items     | Not possible          | Yes, but costly
Get total count       | Need separate counter | .length built-in
Gas for single access | ~200 gas (warm)       | ~200 gas (warm)
Gas for iteration     | N/A                   | O(n), dangerous
Use case              | Most contract data    | Small, bounded lists

My rule of thumb: default to mappings. Only use arrays when you truly need to enumerate every element (like listing all donor addresses for a UI) — and even then, cap the array size.

🤔 Think about this: If mappings can't be iterated and don't have a length, how does a DApp frontend display "all campaigns"? How does Etherscan show all the token holders of an ERC-20?

Answer

Two approaches:

  1. Events (Lesson 6): Contracts emit events when state changes. Off-chain indexers (like The Graph or Etherscan) listen to these events and build a queryable database. The blockchain is the source of truth; the indexer makes it searchable.
  2. Counter pattern + sequential reads: Keep a count variable, use sequential IDs (1, 2, 3...), and let the frontend loop through getCampaign(1), getCampaign(2), etc. This is exactly what we'll build today.

Most production DApps use both.


Storage vs. Memory vs. Calldata: Where Your Data Lives

This is where Solidity gets genuinely tricky, and where Lesson 1's EVM architecture pays off. Remember the three data regions? Now you need to explicitly tell Solidity where complex data (strings, arrays, structs) should live during function execution.

Here's a concrete example that shows the difference:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

contract DataLocations {
    struct Player {
        string name;
        uint256 score;
    }

    Player public player1;

    // calldata: read-only, cheapest for inputs
    function setPlayer(string calldata _name) external {
        // memory: temporary copy, modifiable
        Player memory tempPlayer = Player(_name, 0);
        tempPlayer.score = 100; // ✅ can modify memory

        // storage: writing to persistent state
        player1 = tempPlayer; // copies memory → storage
    }

    function getScore() external view returns (uint256) {
        // Reading from storage: free when called off-chain (eth_call),
        // but it still costs gas when another contract calls it in a transaction
        return player1.score;
    }
}
// Deploy → setPlayer("Alex") → getScore()
// Output: 100

The rule is simple:

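  • Function parameters (complex types like string, arrays): use calldata when the function only reads the argument (allowed on both external and public functions since Solidity 0.6.9), and memory when the function needs to modify it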
  • Local variables (complex types): use memory
  • State variables: always storage (you don't even write the keyword — it's implicit)
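Here's a small sketch of the parameter rule in action. ParamLocations and both function names are hypothetical, chosen only to contrast a read-only calldata argument with a mutable memory copy.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: ParamLocations is a hypothetical demo contract.
contract ParamLocations {
    // calldata: the function reads the caller's bytes in place, no copy made.
    function firstByte(bytes calldata _data) external pure returns (bytes1) {
        if (_data.length == 0) {
            return 0x00;
        }
        // Writing `_data[0] = 0x00;` here would not compile:
        // calldata is read-only.
        return _data[0];
    }

    // memory: the argument is copied into memory, so it can be modified.
    function zeroFirstByte(bytes memory _data) public pure returns (bytes memory) {
        if (_data.length > 0) {
            _data[0] = 0x00; // changes only this function's copy
        }
        return _data;
    }
}
```

If the function never modifies the argument, calldata skips the copy. If it has to change the data, it needs memory.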

Where it gets dangerous is storage references:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

contract StorageRef {
    uint256[] public numbers;

    constructor() {
        numbers.push(10);
        numbers.push(20);
    }

    function dangerousModify() public {
        // storage reference — points directly at state!
        uint256[] storage ref = numbers;
        ref[0] = 999; // ⚠️ This modifies the ACTUAL state variable
    }

    function safeRead() public view returns (uint256) {
        // memory copy — independent from state
        uint256[] memory copy = numbers;
        // copy[0] = 999 would NOT affect `numbers`
        return copy[0];
    }
}
// Deploy → safeRead() returns 10
// Call dangerousModify() → safeRead() now returns 999!

A storage reference is a pointer to the actual on-chain data. Modify it and you modify the state. A memory copy is independent: changes to it don't affect storage. I've seen junior developers create storage references thinking they were copies, then accidentally overwrite contract state. Since Solidity 0.5.0 the compiler requires an explicit data location (storage, memory, or calldata) on every local reference-type variable, so you can't get one by accident through omission. It still can't tell you which one you meant, and understanding the concept is what saves you.
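The same distinction applies to structs, which is where it will matter for FundChain. The following is a sketch under assumed names (StructRefDemo, bumpViaStorage, bumpViaMemory are all hypothetical): one function edits a struct through a storage reference, the other edits a memory copy.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: StructRefDemo is a hypothetical demo contract.
contract StructRefDemo {
    struct Player {
        string name;
        uint256 score;
    }

    mapping(uint256 => Player) public players;

    constructor() {
        players[1] = Player("Alex", 10);
    }

    function bumpViaStorage() external {
        // Storage reference: p points at players[1] itself.
        Player storage p = players[1];
        p.score += 1; // persists on-chain
    }

    function bumpViaMemory() external view returns (uint256) {
        // Memory copy: p is an independent snapshot of players[1].
        Player memory p = players[1];
        p.score += 1; // changes the copy only
        return p.score; // the stored score is untouched
    }
}
```

Right after deployment, bumpViaMemory() returns 11 while players(1) still shows a score of 10. Call bumpViaStorage() once and players(1) shows 11.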

🔍 Deep Dive: Why does a storage write cost 20,000 gas?

Every storage slot modification must be:

  1. Processed by the executing node
  2. Committed to the state trie, whose root hash is recorded in the block header
  3. Re-executed and stored by every full node on the network (thousands of them)
  4. Stored permanently (or until overwritten)

You're not just writing to one SSD. You're writing to a globally replicated database whose history can't be rewritten. That 20,000 gas is the economic cost of permanent, decentralized persistence. That's why in Lesson 4, when we start moving ETH, gas optimization becomes critical: every unnecessary storage write is real money wasted.
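If you'd like to see the gap yourself, here's a rough measurement sketch using the built-in gasleft() function. GasProbe and measure are hypothetical names, and the numbers you get will include a little bookkeeping overhead and will shift with compiler and optimizer settings, so treat them as a comparison, not a benchmark.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: GasProbe is a hypothetical measurement helper.
contract GasProbe {
    uint256 public stored;

    function measure() external returns (uint256 storageGas, uint256 memoryGas) {
        uint256[1] memory scratch;

        uint256 beforeStorage = gasleft();
        stored = block.timestamp;      // storage write
        uint256 afterStorage = gasleft();

        scratch[0] = block.timestamp;  // memory write
        uint256 afterMemory = gasleft();

        // Gas consumed by each section (includes minor overhead).
        storageGas = beforeStorage - afterStorage;
        memoryGas = afterStorage - afterMemory;
    }
}
```

In the Remix VM the two values should show up in the transaction's decoded output. The first call is the expensive case because the slot goes from zero to non-zero; later calls overwrite an existing value and cost less, but still far more than the memory write.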


The Counter Pattern: Sequential IDs Without an Array

Now let's combine everything. We need to assign each campaign a unique ID. In a traditional database, you'd use auto-increment. Solidity doesn't have auto-increment. But we already have campaignCount from Lesson 2 — and that's our counter.

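The pattern: increment the counter first, then use the new count as the mapping key for the record you're storing.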

Why increment first, then store? Because IDs start at 1, not 0. Mapping keys that haven't been set return default values (all zeros). If your first campaign had ID 0, you couldn't distinguish "campaign 0 exists" from "this ID was never used." Starting at 1 means any campaign with a zero-address creator was never created. This is a well-known pattern in Solidity and I use it in every contract.

Let me show you the counter pattern in isolation before we apply it to FundChain:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

contract CounterPattern {
    struct Item {
        string name;
        uint256 value;
    }

    uint256 public itemCount;
    mapping(uint256 => Item) public items;

    function addItem(string calldata _name, uint256 _value) external {
        itemCount++;  // 0 → 1, then 1 → 2, etc.
        items[itemCount] = Item(_name, _value);
    }

    function getItem(uint256 _id) external view returns (string memory, uint256) {
        Item memory item = items[_id];
        return (item.name, item.value);
    }
}
// Deploy → addItem("Sword", 100) → addItem("Shield", 50)
// itemCount() → 2
// getItem(1) → ("Sword", 100)
// getItem(2) → ("Shield", 50)
// getItem(99) → ("", 0) ← default values, not an error!
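Because IDs are handed out sequentially starting at 1, the counter also gives you a cheap existence check. The sketch below adds a hypothetical exists() helper to the same idea; it assumes items are never deleted, which holds for everything in this lesson.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: CounterWithExists and exists() are hypothetical additions.
contract CounterWithExists {
    struct Item {
        string name;
        uint256 value;
    }

    uint256 public itemCount;
    mapping(uint256 => Item) public items;

    function addItem(string calldata _name, uint256 _value) external returns (uint256 id) {
        itemCount++;          // first ID handed out is 1
        id = itemCount;
        items[id] = Item(_name, _value);
    }

    // Valid IDs are exactly 1..itemCount, so a range check is enough.
    function exists(uint256 _id) public view returns (bool) {
        return _id >= 1 && _id <= itemCount;
    }
}
```

After two addItem calls, exists(1) and exists(2) return true, while exists(0) and exists(99) return false.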

🤔 Think about this: What happens if two users call addItem() at the exact same time? Could they get the same ID?

Answer

No. Ethereum transactions execute sequentially within a block, not in parallel. Even if two transactions are in the same block, the EVM processes them one after another. The first transaction increments itemCount to, say, 3, and the second sees it as 3 and increments to 4. Race conditions as you know them from multi-threaded programming don't exist in the EVM. This is one of the beauties of blockchain: deterministic, sequential execution.

(There are subtle ordering issues called MEV — Miner/Maximal Extractable Value — but that's well beyond today's scope.)


Building createCampaign() and getCampaign()

Time to apply everything. Let's extend our FundChain contract with the full campaign registry.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

contract FundChain {
    // ── From Lesson 2 ──
    address public owner;
    uint256 public campaignCount;

    // ── NEW: Campaign struct ──
    struct Campaign {
        address creator;
        string title;
        string description;
        uint256 goalAmount;
        uint256 currentAmount;
        uint256 deadline;
        bool isCompleted;
    }

    // ── NEW: Campaign storage ──
    mapping(uint256 => Campaign) public campaigns;

    constructor() {
        owner = msg.sender;
    }

    // ── From Lesson 2 ──
    function getCampaignCount() public view returns (uint256) {
        return campaignCount;
    }

    // ── NEW: Create a campaign ──
    function createCampaign(
        string calldata _title,
        string calldata _description,
        uint256 _goalAmount,
        uint256 _durationInDays
    ) external {
        campaignCount++;
        campaigns[campaignCount] = Campaign({
            creator: msg.sender,
            title: _title,
            description: _description,
            goalAmount: _goalAmount,
            currentAmount: 0,
            deadline: block.timestamp + (_durationInDays * 1 days),
            isCompleted: false
        });
    }

    // ── NEW: Read a campaign ──
    function getCampaign(uint256 _id) external view returns (
        address creator,
        string memory title,
        string memory description,
        uint256 goalAmount,
        uint256 currentAmount,
        uint256 deadline,
        bool isCompleted
    ) {
        Campaign memory c = campaigns[_id];
        return (
            c.creator,
            c.title,
            c.description,
            c.goalAmount,
            c.currentAmount,
            c.deadline,
            c.isCompleted
        );
    }
}

Let me explain the design decisions:

Why calldata for _title and _description? The function is external, and these string parameters are only read, never modified. calldata is the cheapest option — it avoids copying the data into memory. For strings in external functions, calldata over memory every single time. It's free optimization.

Why _durationInDays instead of a raw timestamp? Usability. Making the caller compute block.timestamp + (30 * 86400) is error-prone. The contract does the math: block.timestamp + (_durationInDays * 1 days). The 1 days suffix is Solidity syntactic sugar — the compiler converts it to 86400 seconds.
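A tiny sketch to confirm what the time suffix does. TimeUnits and its two functions are hypothetical helpers, not part of FundChain.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: TimeUnits is a hypothetical demo contract.
contract TimeUnits {
    // Same conversion createCampaign performs before adding block.timestamp.
    function toSeconds(uint256 _durationInDays) external pure returns (uint256) {
        return _durationInDays * 1 days;
    }

    // The suffix is just a multiplier: 1 days is the literal 86400.
    function unitsMatch() external pure returns (bool) {
        return 1 days == 86400 && 30 days == 30 * 86400;
    }
}
```

toSeconds(30) returns 2592000 and unitsMatch() returns true. Because Solidity 0.8 uses checked arithmetic, an absurdly large _durationInDays makes the multiplication revert instead of wrapping around.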

Why does getCampaign return individual fields instead of the struct? Since Solidity 0.8.0 (where ABI coder v2 is the default), an external function can return a struct as long as the return type is declared memory, but returning named individual fields makes the ABI cleaner for frontend integration. Both approaches work. I prefer explicit returns because it's unambiguous what the frontend receives.
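For comparison, here's roughly what the struct-returning alternative looks like. This is a trimmed-down sketch with a three-field Campaign and hypothetical names (StructReturn, create, getCampaignStruct), not a replacement for the function above.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: StructReturn is a hypothetical, trimmed-down variant.
contract StructReturn {
    struct Campaign {
        address creator;
        string title;
        uint256 goalAmount;
    }

    uint256 public campaignCount;
    mapping(uint256 => Campaign) private campaigns;

    function create(string calldata _title, uint256 _goalAmount) external {
        campaignCount++;
        campaigns[campaignCount] = Campaign({
            creator: msg.sender,
            title: _title,
            goalAmount: _goalAmount
        });
    }

    // Returns the whole struct; the ABI encodes it as a single tuple.
    function getCampaignStruct(uint256 _id) external view returns (Campaign memory) {
        return campaigns[_id];
    }
}
```

The frontend receives one tuple here instead of separately named outputs, which is the trade-off described above.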


Step-by-Step: Test It in Remix

  1. Open Remix → create a new file FundChain.sol → paste the full contract above

  2. Compile → select Solidity 0.8.26 (or a later 0.8.x; the ^0.8.26 pragma rejects older compilers)

  3. Deploy → use Remix VM (Cancun)

  4. Check initial state:

    • Click getCampaignCount → should return 0
    • Click owner → should show your deployer address
  5. Create Campaign #1:

    • In createCampaign, enter:
      • _title: "Build a School"
      • _description: "Community school in rural area"
      • _goalAmount: 5000000000000000000 (5 ETH in wei)
      • _durationInDays: 30
    • Click transact
  6. Create Campaign #2:

    • _title: "Clean Ocean Project"
    • _description: "Remove plastic from coastline"
    • _goalAmount: 10000000000000000000 (10 ETH in wei)
    • _durationInDays: 60
    • Click transact
  7. Verify:

    • getCampaignCount() → 2
    • getCampaign(1) → returns all 7 fields for "Build a School"
    • getCampaign(2) → returns all 7 fields for "Clean Ocean Project"
    • getCampaign(99) → returns all zeros/empty strings (no error!)

Expected output for getCampaign(1):
  creator: 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4
  title: "Build a School"
  description: "Community school in rural area"
  goalAmount: 5000000000000000000
  currentAmount: 0
  deadline: (some future timestamp, e.g., 1745000000)
  isCompleted: false

💡 Stuck? Common Remix Errors
  • "Invalid type for argument": Make sure strings are in double quotes "like this", and numbers have no quotes.
  • "Gas estimation failed": Usually means you're passing wrong parameter types. _goalAmount must be a raw number, not "5 ether" as a string.
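  • Transaction reverted: At this stage, our function has no require checks, so reverts are unlikely. The one built-in exception is arithmetic overflow: an absurdly large _durationInDays makes the deadline calculation revert under Solidity 0.8's checked math. Otherwise, recheck your compiler version.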
  • getCampaign returns all zeros: You probably queried an ID that doesn't exist. Try 1 or 2, not 0.

🔨 Project Update

Here's the cumulative code — everything from Lessons 2 and 3 combined. Copy-paste this into Remix and you have a working campaign registry:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

/// @title FundChain — Decentralized Crowdfunding Platform
/// @notice Lessons 2-3: Contract skeleton + Campaign registry

contract FundChain {
    // ═══════════════════════════════════════════
    //  STATE — Lesson 2
    // ═══════════════════════════════════════════
    address public owner;
    uint256 public campaignCount;

    // ═══════════════════════════════════════════
    //  DATA STRUCTURES — Lesson 3 (NEW)
    // ═══════════════════════════════════════════
    struct Campaign {
        address creator;
        string title;
        string description;
        uint256 goalAmount;
        uint256 currentAmount;
        uint256 deadline;
        bool isCompleted;
    }

    mapping(uint256 => Campaign) public campaigns;

    // ═══════════════════════════════════════════
    //  CONSTRUCTOR — Lesson 2
    // ═══════════════════════════════════════════
    constructor() {
        owner = msg.sender;
    }

    // ═══════════════════════════════════════════
    //  READ FUNCTIONS — Lesson 2
    // ═══════════════════════════════════════════
    function getCampaignCount() public view returns (uint256) {
        return campaignCount;
    }

    // ═══════════════════════════════════════════
    //  CAMPAIGN MANAGEMENT — Lesson 3 (NEW)
    // ═══════════════════════════════════════════
    function createCampaign(
        string calldata _title,
        string calldata _description,
        uint256 _goalAmount,
        uint256 _durationInDays
    ) external {
        campaignCount++;
        campaigns[campaignCount] = Campaign({
            creator: msg.sender,
            title: _title,
            description: _description,
            goalAmount: _goalAmount,
            currentAmount: 0,
            deadline: block.timestamp + (_durationInDays * 1 days),
            isCompleted: false
        });
    }

    function getCampaign(uint256 _id) external view returns (
        address creator,
        string memory title,
        string memory description,
        uint256 goalAmount,
        uint256 currentAmount,
        uint256 deadline,
        bool isCompleted
    ) {
        Campaign memory c = campaigns[_id];
        return (
            c.creator,
            c.title,
            c.description,
            c.goalAmount,
            c.currentAmount,
            c.deadline,
            c.isCompleted
        );
    }
}

Run the project you've built so far:

  1. Deploy to Remix VM
  2. getCampaignCount() → 0
  3. createCampaign("Build a School", "Community project", 5000000000000000000, 30) → transaction succeeds
  4. getCampaignCount() → 1
  5. getCampaign(1) → returns all 7 fields with your address as creator
  6. Create a second campaign and verify getCampaignCount() → 2

Expected: Two campaigns stored on-chain, each retrievable by its unique ID. The data persists between calls because it's in storage — the permanent layer of the EVM you learned about in Lesson 1.


Review: Self-Check

Before moving on, verify these:

  • Your Campaign struct has exactly 7 fields with correct types
  • campaignCount increments before being used as the mapping key (IDs start at 1)
  • createCampaign uses calldata for string parameters
  • getCampaign uses memory for the local Campaign copy
  • Querying a non-existent ID returns defaults (not an error)
  • You understand why mappings can't be iterated

Common mistakes I see:

  1. Forgetting to increment the counter — campaigns all overwrite ID 0
  2. Using memory instead of calldata for external function string params — works but wastes gas
  3. Trying to return Campaign storage from a view function — won't compile for external calls
  4. Assuming campaigns[0] is the first campaign — it's not, it's the default empty Campaign

Next Level: Seniors Do This Differently

Packed structs for gas savings. Storage slots in the EVM are 32 bytes. Solidity packs smaller types together. If you reorder struct fields so that smaller types are adjacent, they share a slot:

// ❌ Naive ordering — wastes storage slots
struct BadLayout {
    bool isCompleted;     // 1 byte  → slot 0 (31 bytes wasted)
    uint256 goalAmount;   // 32 bytes → slot 1
    bool isActive;        // 1 byte  → slot 2 (31 bytes wasted)
    uint256 deadline;     // 32 bytes → slot 3
}
// Uses 4 slots

// ✅ Packed ordering — bools share a slot
struct GoodLayout {
    uint256 goalAmount;   // 32 bytes → slot 0
    uint256 deadline;     // 32 bytes → slot 1
    bool isCompleted;     // 1 byte  → slot 2
    bool isActive;        // 1 byte  → slot 2 (packed!)
}
// Uses 3 slots — saves ~20,000 gas on first write

For our current FundChain contract, the strings (title, description) each occupy at least one slot of their own regardless of ordering, and the three uint256 fields fill a slot each. The only packing available is moving isCompleted (1 byte) next to creator (20 bytes) so the two share a slot, which saves one slot out of seven. That's a small win here, but when you build contracts with many boolean flags or small integer fields, struct packing is a real gas saver. I've optimized contracts where reordering struct fields saved 15% on deployment costs.
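As a rough sketch, this is what a packed ordering of a Campaign-shaped struct could look like. PackedLayoutSketch and PackedCampaign are hypothetical names, and the slot numbers in the comments are relative to where the struct starts.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: PackedLayoutSketch and PackedCampaign are hypothetical
// names, not part of the FundChain contract built in this lesson.
contract PackedLayoutSketch {
    struct PackedCampaign {
        address creator;       // 20 bytes → slot 0
        bool isCompleted;      // 1 byte   → slot 0 (shares it with creator)
        string title;          // slot 1 (long strings spill into extra slots)
        string description;    // slot 2
        uint256 goalAmount;    // slot 3
        uint256 currentAmount; // slot 4
        uint256 deadline;      // slot 5
    }

    mapping(uint256 => PackedCampaign) public campaigns;
}
```

The cost is readability: the status flag no longer sits at the end with the other temporal and status fields. For a single saved slot, I'd only make that trade on a contract where creation gas really matters.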

Enums for status tracking. Right now we have bool isCompleted — a campaign is either done or not. But real crowdfunding needs more states: Active, Successful, Failed, Cancelled. In Lesson 7, we'll replace the bool with a proper state machine. For now, keep it simple.

🔍 Deep Dive: Why not use an array instead of mapping + counter?

You could use Campaign[] public campaigns — Solidity dynamic arrays support .push() and .length. Some developers prefer this. Here's why I don't for primary data storage:

  1. Deletion gaps. If you delete campaigns[3] in an array, you get a zeroed-out element at index 3 but the length stays the same. You can't "close the gap" without moving every subsequent element — extremely expensive.
  2. Front-end coupling. Arrays return data differently from mappings in the ABI, and some front-end libraries handle one better than the other.
  3. Consistency. The counter + mapping pattern works the same whether you're storing campaigns, users, orders, or proposals. Learn it once, use it everywhere.

The one case where I use arrays: when a contract needs to enumerate all items for on-chain logic (like distributing rewards to all stakers). Even then, I cap the array size.
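Here's a short sketch of the deletion gap from point 1, next to the swap-and-pop workaround many contracts use. ArrayGaps and its function names are hypothetical, and swap-and-pop is my addition for contrast; note that it changes element order.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: ArrayGaps is a hypothetical demo contract.
contract ArrayGaps {
    uint256[] public values;

    constructor() {
        values.push(10);
        values.push(20);
        values.push(30);
    }

    // Leaves a zero at _index; the length does not change.
    function deleteInPlace(uint256 _index) external {
        delete values[_index];
    }

    // Overwrites _index with the last element, then shrinks the array by one.
    // Cheap, but the original ordering is lost.
    function swapAndPop(uint256 _index) external {
        values[_index] = values[values.length - 1];
        values.pop();
    }

    function count() external view returns (uint256) {
        return values.length;
    }
}
```

After deployment, deleteInPlace(1) leaves values(1) at 0 with count() still 3. A following swapAndPop(1) puts 30 at index 1 and count() drops to 2.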


Summary Diagram

Looking ahead to Lesson 4: We've built the skeleton (Lesson 2) and the data layer (Lesson 3). But our campaigns can't receive ETH yet — there's no payable function, no msg.value handling. Next lesson dives into gas economics and moving real money on Ethereum. You'll learn why some transactions cost $2 and others cost $200, and you'll add the fundCampaign() function that accepts ETH contributions. The data structures you built today are the foundation that money will flow into.


Difficulty Fork

🟢 This was easy for me

Great. You now know the three pillars of on-chain data: structs for shape, mappings for lookup, and the counter pattern for IDs. Key takeaway: mappings are your default data structure, not arrays. In Lesson 4, we add money to the mix with payable and msg.value.

Quick practice: try adding an updateTitle() function that lets only the campaign creator change the title. You'll need campaigns[_id].creator == msg.sender as a check — we'll formalize this with require in Lesson 6, but try it now with a simple if statement.
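If you get stuck, here's one possible shape for it, sketched on a trimmed-down two-field Campaign so it runs on its own. TitleUpdateSketch is a hypothetical contract; adapt the updateTitle body to your full FundChain struct.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: TitleUpdateSketch is a hypothetical, trimmed-down variant.
contract TitleUpdateSketch {
    struct Campaign {
        address creator;
        string title;
    }

    uint256 public campaignCount;
    mapping(uint256 => Campaign) public campaigns;

    function createCampaign(string calldata _title) external {
        campaignCount++;
        campaigns[campaignCount] = Campaign({creator: msg.sender, title: _title});
    }

    function updateTitle(uint256 _id, string calldata _newTitle) external {
        // Storage reference: we only touch the fields we need.
        Campaign storage c = campaigns[_id];
        if (c.creator == msg.sender) {
            c.title = _newTitle;
        }
        // Any other caller, or an unused ID (creator is the zero address),
        // falls through and nothing changes.
    }
}
```

Notice the weakness: a caller who isn't the creator gets a successful transaction that silently does nothing. That's the gap require will close later.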

🟡 This was challenging

That's normal. The storage/memory distinction is the hardest concept in this lesson — and arguably one of the trickiest parts of Solidity overall.

Think of it like this:

  • Storage = your hard drive. Data stays after you turn the computer off. Slow and expensive to write.
  • Memory = your RAM. Fast, cheap, but everything vanishes when the program closes.
  • Calldata = a read-only USB drive someone plugged in. You can read from it, but you can't change what's on it.

Go back to the StorageRef contract above and try it in Remix. Call dangerousModify() and watch the state change. Then try the same experiment with a memory copy and see that state doesn't change. That hands-on experiment will make the concept click.

🔴 Challenge: Interview-level question

Question: You have a contract that stores user profiles in a mapping(address => Profile). A function reads a profile, checks a condition, and conditionally updates one field. Which is more gas-efficient?

// Option A
function updateA(address user) external {
    Profile memory p = profiles[user];
    if (p.score > 100) {
        profiles[user].score = 0; // write to storage directly
    }
}

// Option B
function updateB(address user) external {
    Profile storage p = profiles[user];
    if (p.score > 100) {
        p.score = 0; // modify via storage reference
    }
}

Answer: Option B is more gas-efficient. Option A copies the entire Profile struct from storage to memory (every slot it occupies is read, and each read costs gas), then writes the one field to storage. Option B creates a storage reference (essentially a pointer): reading p.score loads only that one slot, and writing p.score = 0 modifies that single slot without ever copying the full struct. When you only need to read/write one field of a large struct, storage references avoid the overhead of copying fields you never use.

In production, this matters. A struct whose 10 fields each fill their own slot means 10 storage reads when copied to memory. A storage reference reading 1 field means 1 storage read.
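To try the comparison in Remix, you need a Profile definition, which the question leaves open. The sketch below assumes a hypothetical four-field Profile and a seed() helper; only updateA and updateB come from the question itself.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.26;

// Illustrative only: the Profile fields and seed() are assumptions made
// so the two options from the question can be deployed and compared.
contract ProfileGas {
    struct Profile {
        string name;
        string bio;
        uint256 joinedAt;
        uint256 score;
    }

    mapping(address => Profile) public profiles;

    function seed(uint256 _score) external {
        profiles[msg.sender] = Profile({
            name: "demo",
            bio: "demo profile",
            joinedAt: block.timestamp,
            score: _score
        });
    }

    // Option A: copies every field into memory before checking one of them.
    function updateA(address user) external {
        Profile memory p = profiles[user];
        if (p.score > 100) {
            profiles[user].score = 0;
        }
    }

    // Option B: reads and writes only the score slot via a storage reference.
    function updateB(address user) external {
        Profile storage p = profiles[user];
        if (p.score > 100) {
            p.score = 0;
        }
    }
}
```

Call seed(500), run updateA with your address, call seed(500) again, then run updateB, and compare the execution cost Remix reports for the two update transactions.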

