In this MP you will create a blockchain. Broadly speaking, a blockchain consists of two parts:

1 Initial Files

Initial files are available in mp10.zip. You will edit only blockchain.py.

2 Task overview

You will complete a class BlockChain where we provided a set of methods to implement. You can design its member variables and helper functions however you wish.

The main thing you’ll do is create and store blocks. A block is a dict with two keys: "change" is another dict, and signature is an int.

Blocks will be referred to in many places by the hash of their "change" value, which acts much like a pointer to or identifier of the block. Places where a change’s hash is used to reference a block include:

The "old" entry in each block’s "change".
The return value of get_head_hash.
The argument of get_block and is_live.
The keys in the dict returned by get_chain.

Blocks form a quasi-linked-list, where each block is a node and a block’s block["change"]["old"] value points to another node in the list. But the linked list might fork, with two or more different blocks pointing to the same old block. Because of this, you’ll need to both:

Have a way of finding the block that’s the head of the list. When this changes will always change to the most recently added block, but it won’t change on every block addition and might change blocks that used to be on the list starting at the head to no longer be on the list.

The head of the list is used when creating a block (as the new block’s "old" value) and when getting the current account balances.

The head of the list is potentially modified when adding a block (which can change the computed head).

Other tasks neither depend on nor change the head of the list.
Have a way of computing the state represented by a given block. Because the list can change heads and lose nodes it used to contain in the process, state is not as simple as a single class-wide dict that each new block edits.

3 Necessary Cryptography

3.1 Hashing values

We need all agents to agree on the hash of an object, even if they store that object in different ways internally. We define how to do that as follows:

Encoded the object to a string using JSON, with the following disambiguation:
- No optional spaces, newlines, or indentation: {"a":1,"b":2} not {"a": 1, "b": 2}.
- Object keys in sorted order, using standard string comparison.
- No character escape sequences except those required by JSON.
In Python, this means
```
string_val = json.dumps(change, separators=(',',':'), indent=None, sort_keys=True, ensure_ascii=False)
```
The string is encoded to bytes using UTF-8:
```
byte_string = string_val.encode('utf-8')
```
The bytes of the string are sent to the SHA-256 cryptographic hash function:
```
hash_bytes = hashlib.sha256(byte_string).digest()
```
The bytes of the SA-256 hash are converted to a big-endian integer:
```
hash_int = int.from_bytes(hash_bytes, byteorder='big')
```

To hash the empty dict:

Convert to json: "{}"
Encode to UTF-8 bytes: b'{}'
Hash with SHA-256:
b'D\x13o\xa3U\xb3g\x8a\x11F\xad\x16\xf7\xe8d\x9e\x94\xfbO\xc2\x1f\xe7~\x83\x10\xc0\xf6\x1c\xaa\xff\x8a'
Convert to int
30791614295234051711832508548800469788824342480481074093233550318061354680202

3.2 Signing and checking signatures

We will use RSA and hashes to create and check signatures.

An RSA public key is a large integer.

An RSA private key is sometimes described as being another large integer, or sometimes as a public key paired with another large integer. We’ll use the first terminology, meaning all keys are integers but private keys are only useful if paired with their public keys.

Signatures and hashes are also integers.

To sign a value:

Hash the value
signature = pow(hash, private_key, public_key)

To check a signature

Hash the value
valid = (hash == pow(signature, 0x10001, public_key))

Putting these together can be used to check that a private key matches a public key: if using them to sign a hash and then validate the signature succeeds, then the public and private keys match.

What the pow function is doing

The documentation for pow describes several operations, including that pow(base, exp, mod) is equivalent to, but more efficient than, (base**exp) % mod.

Here’s a basic outline of how it is implemented:

Observe that $x^a x^b = x^{a+b}$ .
Apply that observation to the binary representation of the exponent.

$\begin{split}x^{13} &= x^{0b1101}\\ &= x^{0b1000 + 0b100 + 0b1}\\ &= x^{0b1000}x^{0b100}x^{0b1}\\&=x^8 x^4 x^1\end{split}$
Modulo can be distributed to multiplication’s operands as long as it is kept outside the operation too.

(a * b) % c == ((a%c) * (b%c)) % c
A value raised to a power-of-two power can be found by repeated squaring.

$x^{0b1000} = x^{8} = \big((x^2)^2\big)^2$

Putting these pieces together, we have something like

Let $x$ start as the base
Let $v$ , which will become the answer, start as 1
For each bit of the exponent from smallest to largest,
- If the bit is 1, let $v$ become $v \cdot x \mod \text{base}$
- Let $x$ become $x^2 \mod \text{base}$

The repeated modulus operations ensure all values stay bounded and thus make the algorithm linear in the bits in the exponent instead of exponential in that term.

4 Parts of a Block

4.1 State

It is common to treat each block in the ledger as a change to a state. The state itself is not stored directly in the blockchain, but can be recovered by applying the changes each block represents in sequence.

For this MP, there are two key parts to the state of the ledger:

Account balances.

Each account (both player and booth) has an integer ticket balance, which may be positive or negative.
Paid status.

Each account is either a player account or a booth account. Booths may only transfer tickets to players if that player first transfers tickets to the booth.

A player is said to have paid a booth if both
- There are balance transfers between the player to the booth in the blockchain; and
- The most recent such transfer is from player to booth, not the other way around.

The full set of accounts (both player and booth) are provided in a file pubkeys.json. Booth accounts end _b, while player accounts do not.

4.2 Change

A change represents a transfer of tickets between two accounts. It also represents the state resulting after that transfer.

A change is encoded as a map with five entries:

"old"’s value is the hash of another change; or the value 30791614295234051711832508548800469788824342480481074093233550318061354680202 if this is the very first change.
"src"’s value is an account name
"dst"’s value is an account name
"n"’s value is a positive integer no greater than 5
"memo"’s value is an arbitrary string, useable by clients to help flag the purpose of a transfer

A change represents decrementing the account balance of the src account by $n$ and incrementing the account balance of the dst account by the same amount. The special hash

30791614295234051711832508548800469788824342480481074093233550318061354680202

represents the state where all accounts have 20 tickets.

The following change:

{
    "old": 30791614295234051711832508548800469788824342480481074093233550318061354680202,
    "src": "t23",
    "dst": "t12_b",
    "n": 4,
    "memo": "this is placeholder text"
}

represents the state where t23 has 16 tickets, t12_b has 24 tickets, and all other accounts have 20 tickets.

The hash of that change is 110871783320773641057401402892794555637327327860213801464912164781292788545720.

The following change:

{
    "old": 110871783320773641057401402892794555637327327860213801464912164781292788545720,
    "src": "t23",
    "dst": "t08_b",
    "n": 3,
    "memo": "different placeholder"
}

represents the state where t23 has 13 tickets, t12_b has 24 tickets, t08_b has 23 tickets, and all other accounts have 20 tickets.

The hash of that change is 49587917409408816145332928946146028929296101045042549461135519830931180427271.

A change is valid if the following rules hold:

Both src and dst are known accounts.
One of src or dst is a booth and the other is a player.
This is not a self-transfer: that is, the booth’s name is not the player’s name with _b at the end.
n is an integer between 1 and 5 (inclusive) if src is a player, or between 0 and 10 (inclusive) if src is a booth.
If src is a booth, the player has a paid status with the booth.

4.3 Block

A block is a change paired with a digital signature.

A block is valid if the following rules hold:

The change is valid.
The signature is of the change using the key of the src account.

Agents should keep track of all valid blocks they have seen, such that if another agent requests one by its hash the agent can reply with the block.

Suppose the account luthert has public key 22085872136742192509001032546141143848782927575687387791524012819582188685104998353868641044646749834749318267209831 and private key 702274283632342700076196609347937307372034185848496394604139693631645304800930250893614694681445728465698775749201. Then the following is a valid block:

{
    "change": {
        "old": 30791614295234051711832508548800469788824342480481074093233550318061354680202,
        "src": "luthert",
        "dst": "drschatz_b",
        "n": 3,
        "memo": "test case 0"
    },
    "signature": 17364724725136084366499967940080890396095698570866881944104465122106018537782099691459320895775930095874780392194196
}

The hash we use for this block is the hash of its change entry, which is 72437272566220696310379569536899579591372329069879712385069173735988452269622. The signature can be created from that hash and both the private and public key, as pow(hash, private, public), and can be verified using just the public key, as hash == pow(signature, 0x10001, public).

5 Consensus on a blockchain

When an agent wishes to add a block to the blockchain, they send it to other agents.

When receiving a block where the old hash is not known, they reply with a query asking what the block of that hash is.

The blockchain ignores (neither stores nor responds to) any invalid blocks it receives.

Structurally, the blockchain is a tree with only parent (old) pointers; a block may have any number of children. For any given blockchain, one leaf node is the head of the blockchain. The head of the blockchain has

the longest path to the starting block; of those,
the smallest hash.

When a change initiator sees the head become a block that does not descend from the change, it re-sends the change with the new head.

Preventing bad actors

In commercial blockchains, one concern is that a bad actor might do something like the following:

Spend a lot of the currency to purchase an item with real-world value.
Once the item is attained, go back to the block before the purchase and add a long chain of new blocks on that, overtaking the head of the chain that includes the purchase with a new head that does not.

This type of cheating can be partially prevented by policy, though such policies are never complete. For example, we prohibit a user giving tickets back and forth between their own player and booth accounts; but two users could still collude, bouncing tickets between themselves to create a new head.

The most common solution to this is to make adding to the blockchain expensive: either charging some in-chain cost (proof-of-stake, this is what Etherium does) or charging some real-world energy cost (proof-of-work, this is what Bitcoin does). Another solution is to have the identities of each participant known and have bad actors punished outside of the blockchain system itself; this is what Ripple does, and also what we’ll do for our project in this class.

Do not try to overtake the head with a long chain based on an old block. Doing so, alone or with partners, will result in failing the project. If half head creation or other bad messages (such as denial of service attacks or attempting to break encryption) interferes with the project’s success, that is a violation of Student Code prohibition on tampering with an educational resource (see §1-402(f)) and will be punished accordingly.

6 The BlockChain class

In this MP, you will implement most of the blockchain logic in a class BlockChain. The starter file has method headers with docstrings describing what they ought to do. A few other considerations are included here.

6.1 Type annotations

The starter file has some more advanced type annotations in it. Type annotations have not direct impact on code behavior, but they do communicate programmer intent and there are tools like mypy which can use them to help catch errors. The type annotations in the starter file are also used by the tests and the web client that will attach this to your project, so do not remove or edit them.

Using mypy

mypy is not installed by default; in your Dockerfile, you can add mypy to the list of packages being built and then rebuild the container to get mypy installed; on your VM, you can run sudo apt install mypy in the terminal.

Once installed, you can run mypy somefilename.py to have it check that any type annotation you have in the file are matched by the code in the file. This can catch a variety of programmer errors.

type Sender = typing.Callable[[dict],typing.Awaitable[None]]

Your blockchain will be used by an asyncio tool to build consensus between many clients. Mostly that will be hidden from your code, but your code does need to know one piece: if you get a block with an unknown old value, you’ll need to ask the block provider for that value. You’ll do that by calling a provided Sender type, meaning a function (typing.Callable) accepting a single dict argument ([dict]) and requiring await (typing.awaitable) but ultimately not having a return value ([None]).

class Change(typing.TypedDict):
    old: int
    src: str
    dst: str
    n: int
    memo: str

class Block(typing.TypedDict):
    change: Change
    signature: int

class UserData(typing.TypedDict, total=False):
    key: int
    host: str

Your code will interact with quite a bit of JSON data, and these three types define how they ought to look. A Change is a dict with five keys: old and n have integer values while src, dst, and memo have string values. A Block is has two keys: change’s value is a Change and signature’s value is an int. A UserData has up to two keys (the total=False allows some keys to be missing): key’s value is an int (and is always provided in practice); host’s value is a str (and is sometimes omitted).

6.2 Four `get_` methods

The get_chain method needs to return a dict with change hashes as keys and signed blocks as values. This is the set of all valid blocks the blockchain has ever seen. You don’t need to store the blocks in this format internally, but doing so can work.

The get_block method is like get_chain followed by an indexing operation: it returns one block given the hash of its change (or None if there’s no such block).

The get_head_hash returns the hash of the block that is currently at the head of the blockchain. The return value should always either be a key that would be returned by get_chain or the ROOT_HASH if there are no such blocks.

The get_accounts returns a dict of the account balances of each agent. Note that each block represents a different set of account balances; the return of this function is the balances of the current head block. Because the head can change often, this should either be computed on the fly or computed each time the head changes or computed and stored internally for every block in the blockchain.

You will likely want to store, or have helper methods to compute, additional data such as the length of the chain ending at each block and the paid/unpaid status of each player+booth pair.

6.3 Checking liveness

The is_live method returns whether or not a given block (specified by its change’s hash) is on the path from the head to the root. We provide¹¹ There was a typo in this implementation fixed 2025-12-01 8:50am, replacing self.get_block[ptr]['change']['old'] with self.get_block(ptr)['change']['old'] an implementation of this that works, but because it will be called often when there are hundreds of thousands of blocks on the chain, if your chosen way of storing data can make it more efficient, please do so.

How could a block not be live?

In many ways, the blockchain seems like a singly-linked list: each node points to one other node (where the old value is the pointer). These pointers are functionally previous, not next, pointers: the root is null and the head is the last item in the list.

In the ideal world, the blockchain would act like list in all ways.

But the blockchain is also decentralized, and that means that race conditions might arise where two agents try to add new nodes to the list at the same time on two different computers. Computer 1 might create the list [A, B, C, D, X] while computer 2 is creating the list [A, B, C, D, Y]. When those two computers learn about one another’s added blocks, one of the two will win: either X or Y will be the new head and the other one will stop being live. Alas, that’s not a final decision: maybe X wins out over X and Y, but computer 3 saw Y and then created a new node Z to get [A, B, C, D, Y, Z] before learning about X. Now Z is the winning head (because it makes a longer list) and X stops being live again.

6.4 Creating and adding blocks

The create_block method creates a block, but does not add it to the blockchain! Its purpose is to prepare a block that can be sent to other agents as part of the blockchain consensus process.

create_block always uses the current head as the old value of the block, and checks for a variety of possible failures, returning an error string if they occur. We do test for the specific error strings given in the docstring.

The add_block method adds a block to the blockchain, if and only if it is a valid block. That means it needs to perform essentially the same validity checks that create_block does, though it does not return error messages: it simply ignores invalid blocks, doing nothing at all if they are provided.

add_block needs to have special logic for the case where the old value of the provided block is not a known block in the blockchain. In particular,

Given a block with an unknown old,
1. Call (and await) the provided send_json function with an argument {"missing": old}, where old is the value of the provided block’s old key. This will allow the web app to contact the block provider, asking for the missing block.
2. Store the provided block, but not as a part of the blockchain. It won’t be added until the missing old block is also on the blockchain.
Given a valid block with a known old value,
1. Add it to the blockchain.
2. See if there are any stored-but-not-added blocks were missing this block as their old; if so, finish validating them and adding them to the blockchain.

Note that one aspect of block validity (namely, if src is a booth, the player has a paid status with the booth) depends on the old value because that is needed to know the status of the chain that the new block is modifying. Thus, that validity check piece cannot be done until the old is provided.

Missing old values and denial-of-service attacks

Denial-of-service (DOS) attacks occur when a malicious agent floods a servier with a very large number of requests, consuming server resources and causing the server to bog down or crash. One common class of protections is to log which computers send requests and ignore requests from those that send too many. To counter that, hackers sometimes take control of millions of devices and have each of them send requests to the same server, creating a Distributed denial-of-service (DDOS) attack.

For our blockchain, one type of DOS would be to fill an agent’s memory with missing-old blocks that never get added to the chain. To protect against that you could add a time limit to elements in your collection of blocks with missing old values by using something like

async def timeout_entry(theset, thevalue, timeout):
    await asyncio.sleep(timeout) # wait this many seconds
    ... # remove thevalue from theset of block missing their old value

... # add thevalue to theset of blocks missing their old value
asyncio.create_task(timeout_entry(unfinished, newblock, 10)) # run in background

You could also track unfinished blocks on a per-sender basis and ignore future messages from an agent that sends too many.

Having DOS protection is not required for this MP, because any student engaging in a DOS attack is in violation of the Student Code prohibition on tampering with an educational resource (see §1-402(f)) and will be punished accordingly; but if you would like to add this protection, you are welcome to do so.

6.5 Efficiency Targets

We expect hundreds of thousands of blocks to be added to the blockchain during our final project checkoff: 200 players × 2+ blocks per game played × dozens of games each = a hundred thousand blocks.

get_chain we expect to be called infrequently, and to take non-trivial time to run.
get_block, is_live we expect to be called by each game. Ideally they should be fast for the common case where the block in question exists, is live, and was added to the blockchain recently.
get_head_hash and create_block we expect to be called by each game and each player on their currrent server. Ideally they should be fast.
add_block will be called very often, likely dozens of times a second. If at all possible, it should be $O(1)$ or at worst $O(\log(n))$ , where $n$ is the number of blocks you have. If add_block is $O(n)$ or worse, expect your server to stop working after a few minutes of the project checkoff.

We do not test efficiency as part of MP10, but if you’re too inefficient you’ll have trouble passing the final project that uses your MP10 as part of its mechanism.