MP0 — Event Logging

Due: 11:59 p.m., Wednesday, Sep 15 (originally Monday, Sep 13)

Notes

This MP is largely intended to (re)familiarize yourself with the process of writing networked applications. Future MPs will be more challenging.
You can do this (and all other MPs) in a group of 1 or 2 students. If you choose to work by yourself, the expected amount of work is the same, so we do encourage finding a partner.
You are allowed to use any programming language; however, the TAs will only help with the four supported languages: C/C++, Go, Java, and Python.
Your implementation will be tested on the CS425 VMs. It is your responsibility to make sure that your code runs on these VMs. Make sure to request your VM using this form.

Overview

A key tool in distributed systems is the ability to collect a log of events that have occurred across multiple processes. This allows you to obtain a picture of what is happening in your system and aids debugging and diagnosing crashes or other incorrect behavior. In this MP, you will implement a simple logging system that will send events from several nodes to a centralized log server that will collect them all in one place. You may find this functionality useful in future MPs.

Event Generator

Events can be generated using this Python script: generator.py. For this assignment, events are represented as random strings of characters. Each event is preceded by its generation timestamp, expressed as a fractional number of seconds since the Unix epoch (January 1, 1970, UTC). Here is a sample of the output of generator.py:

% python3 generator.py 0.1
1579666871.892629 58f7eb5b7d25906471ff1e1b8847c891f5b275aecd71451a8c040fe0fd2011a0
1579666871.9974 2c7d235d2dc1ceee78d5521fae1e53c21f216af3b6685a37d3263137a95e116b
1579666872.10252 ad0d8bb72c4fb74ee9095c1cac3e11e7f56b180eb19fb1e01faded0feff8984a
1579666872.2044811 79d3baf0060850ec70ac766d92ab0070e526f33780b0ed6cdfed175e72d57ad9
1579666872.307186 38e88dd2999db43368662bccfb03587280cbfd51208aed27bd81462b0404508f
1579666872.409765 c7da3c0a7135342ff80a111f36b3d32f5b80c4cecd536caf96930ae5dfc6b5ed
1579666872.514535 bc1e0d37e939d4b58607d259e517259c9e6e9a872ccdebbaeb8aae80ecf502cf
1579666872.614976 9e30f271fe6e65416eae5b9787e3d42ac406a260160e18692f379c4fa19e7a27
1579666872.7168908 6b8cba43143f1f5f29a3ea57614c91622be0214ce020bebcd88bed2a4169743a
1579666872.820116 17fb03ee5ffe9431b0c44c89995c3d82060267c96a2e4954564bd456f653b4c7
[...]

The argument to generator.py is the average rate at which events occur, specified in hertz (i.e., events / second), so 0.1 Hz means 1 event every 10 seconds on average. The gaps between events are exponentially distributed, so you may receive bursts of events spaced closely together.
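To make the effect of the rate argument concrete, here is a minimal sketch of exponentially distributed gaps. It is not the code of generator.py; the function name interarrival_times and its seed parameter are hypothetical and exist only for illustration.

```python
import random


def interarrival_times(rate_hz, count, seed=None):
    """Return `count` exponentially distributed gaps, in seconds.

    The mean gap is 1 / rate_hz, so a rate of 0.1 Hz averages 10 seconds
    between events, but individual gaps can be much shorter or longer.
    """
    rng = random.Random(seed)  # seeded only so that runs are repeatable
    return [rng.expovariate(rate_hz) for _ in range(count)]
```

Calling interarrival_times(0.1, 5) returns five gaps whose long-run average is 10 seconds; short gaps are common, which is why a node can see bursts.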

Centralized Logger

Your centralized logger should start by listening on a port, specified on the command line, and allow nodes to connect to it and start sending it events. It should then print the events, along with the name of the node sending them, to standard output. (If you want to include diagnostic messages, make sure those are sent to stderr.)

% logger 1234
1579666871.892629 - node1 connected
1579666871.9974 node1 2c7d235d2dc1ceee78d5521fae1e53c21f216af3b6685a37d3263137a95e116b
1579666872.10252 node1 ad0d8bb72c4fb74ee9095c1cac3e11e7f56b180eb19fb1e01faded0feff8984a
1579666872.2044811 - node2 connected
1579666872.307186 node1 38e88dd2999db43368662bccfb03587280cbfd51208aed27bd81462b0404508f
1579666872.409765 node2 c7da3c0a7135342ff80a111f36b3d32f5b80c4cecd536caf96930ae5dfc6b5ed
1579666872.514535 - node2 disconnected
1579666872.614976 node1 9e30f271fe6e65416eae5b9787e3d42ac406a260160e18692f379c4fa19e7a27
1579666872.7168908 - node3 connected
1579666872.820116 node3 17fb03ee5ffe9431b0c44c89995c3d82060267c96a2e4954564bd456f653b4c7
[...]

The first field is the time of the event; the second field is the name of the node that generated the event. The remainder of the line is the event itself. Connection events are specified by using - as the node name, as shown above. You do not need to implement an explicit failure detector; it is sufficient to create a TCP connection from the nodes to the logger and have the logger report when it closes.
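One possible shape for such a logger is sketched below, using one thread per connection from Python's standard socketserver module. The wire protocol is an assumption and is not specified by this handout: each node is assumed to send its name on the first line, followed by one "timestamp event" line per event. The names emit, NodeHandler, LoggerServer, and make_logger are hypothetical.

```python
import socketserver
import sys
import threading
import time

print_lock = threading.Lock()  # keeps lines from different nodes from interleaving


def emit(line, out):
    """Write one complete line to the output stream."""
    with print_lock:
        out.write(line + "\n")
        out.flush()


class NodeHandler(socketserver.StreamRequestHandler):
    """Handles a single node connection on its own thread."""

    def handle(self):
        out = self.server.out
        # Assumed protocol: the first line carries the node name.
        name = self.rfile.readline().decode(errors="replace").strip()
        if not name:
            return
        emit(f"{time.time()} - {name} connected", out)
        try:
            for raw in self.rfile:  # one newline-terminated event per iteration
                line = raw.decode(errors="replace").strip()
                if not line:
                    continue
                timestamp, _, event = line.partition(" ")
                emit(f"{timestamp} {name} {event}", out)
        except ConnectionError:
            pass  # an abrupt close is still reported as a disconnect below
        emit(f"{time.time()} - {name} disconnected", out)


class LoggerServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True


def make_logger(port, host="", out=None):
    """Create (but do not start) a logger listening on host:port."""
    server = LoggerServer((host, port), NodeHandler)
    server.out = sys.stdout if out is None else out
    return server
```

Under these assumptions, make_logger(int(sys.argv[1])).serve_forever() would start the logger on the port given on the command line. The loop over rfile ends when the node closes its TCP connection, which is how the sketch detects a disconnect without a separate failure detector.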

Node

Your node should receive events from the standard input (as sent by the generator) and send them to the centralized logger. You will run your node as follows:

% python3 -u generator.py 0.1 | node node1 10.0.0.1 1234

The first argument is the name of the node. The second and third arguments are the address and port of the centralized logging server. The address should be that of the VM running the centralized logger (e.g., VM0), and the port should be the one the logger is listening on.
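A node along these lines could be sketched as follows. It assumes a hypothetical newline-delimited protocol in which the node sends its name first and then forwards each generator line unchanged; the function name run_node and its source parameter are illustrative and not part of the assignment.

```python
import socket
import sys


def run_node(name, host, port, source=None):
    """Forward newline-terminated events from `source` to the logger."""
    source = sys.stdin if source is None else source
    with socket.create_connection((host, port)) as sock:
        # Assumed protocol: announce the node name on the first line.
        sock.sendall((name + "\n").encode())
        for line in source:
            line = line.strip()
            if line:
                # TCP is a byte stream, so a trailing newline marks where
                # each event ends.
                sock.sendall((line + "\n").encode())
    # Leaving the `with` block closes the socket, which the logger can
    # observe as a disconnect.
```

With this sketch, run_node(sys.argv[1], sys.argv[2], int(sys.argv[3])) would match the command line shown above. The -u flag on the generator keeps Python from buffering its output, so events reach the node as they are produced.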

Graphs

Once you see that your system is working, you will also want to evaluate its performance. To do this, you will need to generate graphs. In this assignment we want to track two metrics:

Delay from the time the event is generated to the time it shows up in the centralized logger
The amount of bandwidth used by the centralized logger

You will want to create an auxiliary log maintained by the centralized logger to track these two metrics. For the delay, you can just use the difference between the current time when you are about to print the event and the timestamp of the event itself. For measuring the bandwidth, you will need to track the length of all the messages received by the logger.
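As a rough illustration, the auxiliary log could hold one row per received message. The CSV layout (arrival time, delay, message size in bytes) and the names metric_row and append_metric are assumptions made for this sketch, not a required format.

```python
import time


def metric_row(event_timestamp, message_bytes, now=None):
    """Return (arrival_time, delay_seconds, message_bytes) for one event."""
    now = time.time() if now is None else now
    # Delay = logger's current time minus the generation timestamp.
    return (now, now - float(event_timestamp), message_bytes)


def append_metric(log_file, event_timestamp, raw_message, now=None):
    """Append one CSV row describing a received message to `log_file`."""
    arrival, delay, size = metric_row(event_timestamp, len(raw_message), now)
    log_file.write(f"{arrival:.6f},{delay:.6f},{size}\n")
```

Because the two timestamps come from different machines, the measured delay also includes any clock offset between the node's VM and the logger's VM.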

You should produce a graph of these two metrics over time. For the bandwidth, you should plot the average bandwidth over each second of the experiment. For the delay, you should plot the minimum, maximum, median, and 90th percentile delay for each second. Make sure your graphs and axes are well labeled, with units.
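The per-second aggregation can be sketched as below. The input is assumed to be (arrival_time, delay, bytes) tuples taken from an auxiliary log; the nearest-rank percentile definition and the names percentile and per_second_stats are choices made for this example, and other percentile definitions are equally valid.

```python
import math
import statistics
from collections import defaultdict


def percentile(sorted_values, p):
    """Nearest-rank percentile of a non-empty, ascending list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def per_second_stats(rows):
    """Group (arrival_time, delay, bytes) rows by whole second of arrival."""
    buckets = defaultdict(list)
    for arrival, delay, size in rows:
        buckets[int(arrival)].append((delay, size))
    stats = {}
    for second, items in sorted(buckets.items()):
        delays = sorted(d for d, _ in items)
        stats[second] = {
            "min": delays[0],
            "max": delays[-1],
            "median": statistics.median(delays),
            "p90": percentile(delays, 90),
            # Total bytes received in this one-second window.
            "bytes_per_sec": sum(s for _, s in items),
        }
    return stats
```

Seconds in which no events arrive do not appear in the result; when plotting, they correspond to zero bandwidth and have no delay samples. Subtracting the first key from every key gives seconds since the start of the experiment.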

Evaluation scenarios

You will need to generate graphs to evaluate your system in two scenarios:

3 nodes, 0.5 Hz each, running for 100 seconds
8 nodes, 5 Hz each, running for 100 seconds

Your logger and each of the nodes will need to run on its own separate VM.

Submission instructions

To submit the assignment, we will be using GitHub Classroom. Please use the invite code posted to CampusWire and submit the assignment by pushing your commit to GitHub. You are encouraged to submit your code early; we will always grade the latest commit available. (If the latest commit is after the deadline, we will grade it and adjust your grade accordingly.)

You will also need to submit a report through Gradescope. Your report should include:

The names and NetIDs of the group members
The cluster number you are working on (gXX)
Instructions for building and running your code. Please include a Makefile if you’re using a compiled language! If there are any libraries or packages that need to be installed, please list those, too. Make sure the instructions are clear; if we cannot run your code we will not be able to give you functionality points.
A description of how you are measuring the delay and bandwidth
Graphs of the evaluation as described above

High-Level Rubric

Report: 5 points
- Clear instructions on how to build and run your code
Functionality testing: 30 points
- Small scenario: 3 nodes, 0.5 Hz each, 15 points
- Large scenario: 8 nodes, 5 Hz each, 15 points
Graphs: 15 points
- Clear, readable graphs with labeled axes, units