It is often useful to establish a global ordering of events across processes so that consistent actions can be performed across all of them. In this MP, your events will be transactions that move money between accounts; you will process them to maintain account balances, and use ordering to decide which of the transactions are successful.
Note that in an asynchronous system, it is impossible to know for sure which of two events happened first. As a result, you will want to use totally ordered multicast to ensure the ordering. You also will have to detect and handle the potential failure of any of the nodes in your system, so unlike MP0, you will not be able to use a centralized node as a manager.
Your system needs to keep track of accounts, each of which has a non-negative integer balance.
Each account is named by a string of lowercase alphabetic characters, such as wqkby and yxpqg. Initially, all accounts start with a balance of 0. A transaction either deposits some amount of funds into an account, or transfers funds between accounts:
DEPOSIT wqkby 10
DEPOSIT yxpqg 75
TRANSFER yxpqg -> wqkby 13
The first transaction deposits 10 units into account wqkby; the second one deposits 75 units into yxpqg; and the third one transfers 13 from yxpqg to wqkby. Every 5 seconds, each node should print out a list of balances:
BALANCES wqkby:23 yxpqg:62
All DEPOSIT transactions are always successful; an account is automatically created if it does not exist. A TRANSFER transaction must use a source account that exists and has enough funds; the destination account is again automatically created if needed. A TRANSFER that would create a negative balance must be rejected. For example, suppose that after the above 3 transactions we processed:
TRANSFER wqkby -> hreqp 20
TRANSFER wqkby -> buyqa 15
The first transfer would succeed, but the second one would be rejected, because there would not be enough funds left in wqkby for the second transfer. The balances at the end would be:
BALANCES wqkby:3 yxpqg:62 hreqp:20
On the other hand, if the two transactions arrived in the opposite order:
TRANSFER wqkby -> buyqa 15
TRANSFER wqkby -> hreqp 20
then the transfer to buyqa would succeed and the transfer to hreqp would fail, resulting in:
BALANCES wqkby:8 yxpqg:62 buyqa:15
(You may choose to include or omit accounts with 0 balances; any account with a non-zero balance must be reported)
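The rules above can be sketched as a small function over a balance map. This is a hedged sketch, not a required structure; the function names and the dictionary representation are our own choices:

```python
# Sketch of the balance rules described above (names are our own).
def apply_transaction(balances, tx):
    """Apply one transaction line; return True if it was accepted."""
    parts = tx.split()
    if parts[0] == "DEPOSIT":          # DEPOSIT account amount
        account, amount = parts[1], int(parts[2])
        balances[account] = balances.get(account, 0) + amount
        return True                    # deposits always succeed
    if parts[0] == "TRANSFER":         # TRANSFER src -> dst amount
        src, dst, amount = parts[1], parts[3], int(parts[4])
        if balances.get(src, 0) < amount:
            return False               # would go negative: reject
        balances[src] -= amount
        balances[dst] = balances.get(dst, 0) + amount
        return True
    return False                       # unrecognized transaction

def balances_line(balances):
    """Format the periodic report, omitting zero balances."""
    items = sorted((a, b) for a, b in balances.items() if b > 0)
    return "BALANCES " + " ".join(f"{a}:{b}" for a, b in items)
```

Here the accounts are printed in sorted order so the output is deterministic; the spec leaves the ordering of accounts on the BALANCES line unspecified.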
Your node should receive transactions from the standard input.
Each node will be told how many other nodes will be running in an argument and connect to them. For simplicity, you may assume that the nodes are running on the first n VMs in your group (e.g., -01, -02, and -03 if n=3). You can also add a small startup delay to the generation of transactions to allow the nodes to come online and connect to each other, and you can assume that no node failures will occur during this time. You can also add an argument for a port number to use in your communications; e.g., mp1_node 3 4321 to run 3 nodes on port 4321.
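The argument handling described above might look like the following sketch; the `vm` hostname prefix is a placeholder assumption, not your group's real VM names:

```python
# Sketch of the command line described above: mp1_node <n> <port>.
# The "vm" hostname prefix below is a placeholder, not a real VM name.
def parse_args(argv):
    n, port = int(argv[1]), int(argv[2])
    # Assume nodes run on the first n VMs in the group, suffixed -01, -02, ...
    hosts = [f"vm-{i:02d}" for i in range(1, n + 1)]
    return n, port, hosts
```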
Once nodes are connected, they should multicast any transactions they receive on stdin to each
other. This should follow the constraints of totally ordered, reliable multicast. Briefly, all nodes
should process the same set of transactions in the same order, and any transaction received by a
node that has not crashed must eventually be processed by all nodes. Note that the BALANCES messages may not be exactly the same across nodes, since they capture the state of accounts at different points in time.
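One common way to satisfy these constraints is an ISIS-style agreed-ordering scheme. The sketch below simulates the priority-proposal idea in a single process, with no networking, retransmission, or failure handling; the class and method names are our own:

```python
import heapq

class Node:
    """One participant in an ISIS-style totally ordered multicast (sketch)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counter = 0        # largest priority this node has seen/proposed
        self.proposed = {}      # msg -> proposed (not yet agreed) priority
        self.holdback = []      # heap of (final_priority, msg), agreed
        self.delivered = []     # messages delivered to the application

    def propose(self, msg):
        # Propose the next priority, tie-broken by node id.
        self.counter += 1
        priority = (self.counter, self.node_id)
        self.proposed[msg] = priority
        return priority

    def finalize(self, msg, final):
        # Record the agreed priority, then deliver every agreed message
        # that can no longer be preceded by a still-pending proposal.
        self.counter = max(self.counter, final[0])
        del self.proposed[msg]
        heapq.heappush(self.holdback, (final, msg))
        while self.holdback and (not self.proposed or
                self.holdback[0][0] < min(self.proposed.values())):
            self.delivered.append(heapq.heappop(self.holdback)[1])

def multicast(nodes, msg):
    # The sender collects a proposed priority from every node and picks
    # the maximum as the agreed final priority for the message.
    final = max(node.propose(msg) for node in nodes)
    for node in nodes:
        node.finalize(msg, final)
```

Because every node orders its hold-back queue by the same agreed priorities, all nodes deliver the same sequence. A real implementation must additionally retransmit messages and re-run agreement for messages whose sender fails mid-protocol.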
You should detect and handle node failures. Any of your nodes can fail, so your design should be decentralized (or, if you use a centralized node in some way, you should be able to handle its failure). Note that a node failure is an abrupt event: you should not expect a failing node to send any sort of “disconnect” message. Your system should remain functional with 1 out of 3 nodes failing in the small-scale scenario and 3 out of 8 nodes failing in the large-scale scenario (see below).
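Failure detection is commonly done with timeouts on expected traffic (heartbeats or regular multicasts). A minimal sketch, assuming a 5-second timeout and an injectable clock for testability (both our own choices):

```python
import time

# Sketch of timeout-based failure detection. The 5-second timeout and the
# injectable clock are our own assumptions, not a required design.
TIMEOUT = 5.0

class FailureDetector:
    def __init__(self, peers, now=time.monotonic):
        self.now = now
        # Treat startup as the last time we heard from each peer.
        self.last_heard = {p: now() for p in peers}

    def heard_from(self, peer):
        # Call on every message (or heartbeat) received from a peer.
        self.last_heard[peer] = self.now()

    def failed_peers(self):
        # A peer is suspected failed if it has been silent past the timeout.
        cutoff = self.now() - TIMEOUT
        return {p for p, t in self.last_heard.items() if t < cutoff}
```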
You can test out functionality by entering transactions directly into each node. We have also provided a simple transaction generator for you, gentx.py. As in MP0, it takes an optional rate argument:
python3 -u gentx.py 0.5 | mp1_node 3 4321
By default it uses 26 accounts (a through z) and generates only valid transactions, but you are welcome to modify it any way you wish during your testing and enable occasional invalid transaction attempts. Note that we will likely use a different transaction generator in testing, to explore corner cases.
As in the last assignment, we want to track the bandwidth of the nodes and the delay in message propagation to all nodes. For the former, you should report the bandwidth for each of the nodes. For the latter, report when the first and last node (of those that have not failed) have processed each message. There is no set performance goal for this MP, but excessive overhead may be penalized.
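The bookkeeping for these metrics can be as simple as the following sketch (class and method names are our own; in practice the processing timestamps would come from NTP-adjusted clocks across the VMs):

```python
from collections import defaultdict

# Sketch of the metric bookkeeping described above (structure is our own):
# bytes transferred per node, and first/last processing time per message.
class Metrics:
    def __init__(self):
        self.bytes = defaultdict(int)    # node -> total bytes sent
        self.times = defaultdict(list)   # message id -> processing timestamps

    def record_send(self, node, payload):
        self.bytes[node] += len(payload)

    def record_processed(self, msg_id, timestamp):
        self.times[msg_id].append(timestamp)

    def propagation_delay(self, msg_id):
        # Gap between the first and last node processing the message.
        ts = self.times[msg_id]
        return max(ts) - min(ts)
```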
You will need to generate graphs to evaluate your system in the following scenarios.
You should write up a short description of the design of the protocol you are using, and explain how it ensures reliable message delivery and ordering. Also explain how your protocol handles node failures. Your document should justify the correctness of your design.
Your code needs to be in a git repository hosted by the Engineering GitLab. After creating your repository, please add the following NetIDs as members with Reporter access:
You will also need to submit a report through Gradescope. In your report, you must include the URL of your repository and the full revision number that you want us to grade, as shown by git rev-parse HEAD. Make sure that revision is pushed to GitLab before you submit.
In addition to this, your report should include:
a Makefile if you’re using a compiled language! If there are any libraries or packages that need to be installed, please list those, too.