Change Log

2024-10-23 09:05 CDT: Updated mp7.zip to fix the test cases for status codes 415 and 422 (the previous file had these swapped).

MP 3 had you implement part of the PNG specification and hide arbitrary payloads inside a special uiuc chunk of PNG files. This MP has you wrap that MP in a web microservice so that it can be run by visiting a website rather than by having the source code on hand.

1 Initial Files

mp7.zip contains initial files. You will modify

chunkservice.py

To facilitate working on this MP with your VM, we offer the following:

In mp7.zip is a script vmpoweron.sh which, if run as bash vmpoweron.sh from your Docker terminal (or possibly from your native terminal if it supports bash and curl), will power on your virtual machine.
You can get mp7.zip onto your VM by
1. copying the link target (available in the right-click menu of the link)
2. in the VM terminal typing wget, a space, pasting the link, then pressing Enter
3. unzipping using unzip mp7.zip
In mp7.zip is a script submitcode.sh which, if run as bash submitcode.sh from your virtual machine terminal, will upload your current MP7 to the submission page without requiring a web browser. To use this script, you first need to sudo apt install jq because it uses jq as part of its processing.

2 Specification

Sending a request with method GET to path / should return a response with the contents of the supplied index.html.
Sending a request with method POST to path /extract with message body being a multipart form with a file in field png containing a PNG image should return a response with the extracted file from the uiuc chunk.

The response should have a content type that matches the contents of the chunk, as reported by the file command-line tool.

If the uploaded file is not a PNG file, return status code 415 Unsupported Media Type along with some useful text for the end-user. If the uploaded file is a PNG with no uiuc chunk, return status code 422 Unprocessable Content along with some useful text for the end-user.
Sending a request with method POST to path /insert with message body being a multipart form with a file in field png containing a PNG image and field hide containing some data should return a response with the image with the data hidden inside a uiuc chunk.

The response should have the image/png content type.

If the uploaded file is not a PNG file, return status code 415 Unsupported Media Type along with some useful text for the end-user.
The resulting service must be running on your course virtual machine at the time of grading.

3 Tips

3.1 aiohttp tips

3.1.1 Response types

There are four types of Response objects you can use in aiohttp; two might be useful for this MP.

A plain Response can have an arbitrary body, status code, and content_type.
A FileResponse creates a response from a file specified by its path on your local drive, setting the content type based on the file extension (i.e., if the file path ends in .png, it uses content type image/png regardless of what the file contains).

FileResponse can make handling path / very simple:

@routes.get('/')
async def index(req : Request) -> StreamResponse:
    return FileResponse(path="index.html")

FileResponse might also be useful for responding to /extract and /insert, depending on how you pick destination file names, but if you need to set the content type manually you’ll need to use Response instead.

Type annotations

Response and FileResponse are both subtypes of StreamResponse. If you want to check type annotations with mypy or pyright, StreamResponse is the better type to use for the return type of aiohttp handler functions.

3.1.2 Files in requests

This web service will work with an HTML form we provide. HTML forms send files in a somewhat strange way known as a multipart message body. That format splits the message body into a sequence of parts, separated by random strings including many hyphens; each part has headers like an HTTP message would, followed by its own body.
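To make the format described above concrete, here is a minimal sketch that builds a multipart body by hand using only the standard library. The function name build_multipart, the boundary prefix, and the application/octet-stream part type are illustrative assumptions; real browsers and curl pick their own boundary strings and per-part headers.

```python
import uuid


def build_multipart(fields: dict[str, tuple[str, bytes]]) -> tuple[str, bytes]:
    """Return (Content-Type header value, body) for a multipart/form-data message.

    `fields` maps a form field name to a (filename, file contents) pair.
    This is a hypothetical helper for illustration, not part of aiohttp.
    """
    # The boundary is a random string; senders usually include many hyphens.
    boundary = "----sketch" + uuid.uuid4().hex
    chunks = []
    for name, (filename, data) in fields.items():
        # Each part starts with "--boundary", then its own headers,
        # then a blank line, then the part's body.
        head = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n"
            "\r\n"
        )
        chunks.append(head.encode("ascii") + data + b"\r\n")
    # The final boundary has two extra hyphens at the end.
    chunks.append(f"--{boundary}--\r\n".encode("ascii"))
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)
```

Printing the body returned for a small input, such as build_multipart({"png": ("a.png", b"abc")}), shows the separator lines and per-part headers that aiohttp parses for you.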

Fortunately, aiohttp can parse all of this for us. The usual structure to read a submitted file that was submitted as field name bazzle would look like

async def my_aiohttp_function(req : Request) -> Response:
    multi = await req.multipart()
    filename, filedata = None, None
    async for part in multi:
        if part.name == 'bazzle':
            filename = part.filename
            filedata = await part.read()

Note that you must use an async for because each iteration might block while waiting for more data to come over the network, and also must await the read() inside the loop because the next loop iteration will skip past any data you haven’t awaited and stored in a variable. You don’t need to await the filename because it was parsed as part of the async for.

If you want to see what the raw multipart format looks like, you can print the results of await req.read(). Note that calling read() consumes the request body and means you cannot also do await req.multipart().

3.1.3 Working with files

If you need to read from or write to a file in aiohttp, best practice is to use asyncio.to_thread. This means writing a non-async function that does the file work, then calling it with asyncio.to_thread, like so:

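import asyncio
import os

def save_file_helper(data, save_as):
    directory = os.path.dirname(save_as)
    os.makedirs(directory, exist_ok=True)
    with open(save_as, 'wb') as f:  # closes the file even if write fails
        f.write(data)
# inside an async function, run the helper in a worker thread:
await asyncio.to_thread(save_file_helper, mydata, my_destination_path)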

We won’t enforce doing this in this MP: if you use open and so on directly in your code you’ll still get full points. That said, using to_thread for file operations will make your code faster if you try to do many file operations concurrently, for example by getting a hundred of your friends to all access your application simultaneously.

3.2 Invoking programs from Python

You’ll need to invoke a few programs from Python. Python has at least half a dozen tools for this, each suited to a different context. When writing async functions, the best tool is asyncio.create_subprocess_shell (or its close cousin asyncio.create_subprocess_exec).

The documentation page has an example of how to use these, which also uses asyncio.run; you should not use asyncio.run because aiohttp handles its operations itself (via run_app).
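As a minimal sketch of that pattern without asyncio.run, the coroutine below starts a program with asyncio.create_subprocess_exec, waits for it to finish, and returns its exit code and captured output. The name run_program is a hypothetical helper, not something in the starter code.

```python
import asyncio


async def run_program(*argv: str) -> tuple[int, bytes, bytes]:
    """Run a program and return (exit code, stdout bytes, stderr bytes).

    Hypothetical helper: argv[0] is the program, the rest are its arguments.
    """
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,  # capture standard output
        stderr=asyncio.subprocess.PIPE,  # capture standard error
    )
    # communicate() waits for the process to exit and collects both streams.
    stdout, stderr = await proc.communicate()
    return proc.returncode, stdout, stderr
```

Inside an aiohttp handler you would await it directly, for example code, out, err = await run_program("file", "--mime-type", "myfile"), and then inspect the exit code to decide how to respond.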

3.2.1 Your MP3 solution

You’ll need to invoke extractuiucchunk and insertuiucchunk from MP3. We recommend copying your source code into MP7’s directory and having its compilation handled by the Makefile. If you don’t have a working MP3, we provide a precompiled reference program you can use instead; if you don’t have the .c files in your mp7 folder, the Makefile will use the provided binaries. We compiled these on a course virtual machine; they may or may not function properly elsewhere.

Using these programs requires the uploaded values to be in files on the disk, and they store their results on the disk as well. We recommend putting those files in a directory named temp/ that your code creates in setup_app using os.makedirs.
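One possible way to manage those files is sketched below. The helper names ensure_temp_dir and unique_temp_path are hypothetical, and generating names with uuid is an assumption made here so that two requests arriving at the same time do not overwrite each other's files; any naming scheme that avoids collisions would do.

```python
import os
import uuid

TEMP_DIR = "temp"  # directory name suggested in the text above


def ensure_temp_dir(path: str = TEMP_DIR) -> str:
    """Create the directory if needed; safe to call more than once."""
    os.makedirs(path, exist_ok=True)
    return path


def unique_temp_path(suffix: str = "", directory: str = TEMP_DIR) -> str:
    """Return a fresh path inside `directory`, e.g. temp/<random hex>.png."""
    return os.path.join(directory, uuid.uuid4().hex + suffix)
```

For example, calling ensure_temp_dir() once at startup and unique_temp_path(".png") per request gives each upload its own file name.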

3.2.2 Content type with `file`

You need to match the Content-Type header of your response to the content type of the file you’re returning, but for extraction that content type is not immediately available. file is a tool that looks into the contents of a file to try to guess its content-type.

There are two ways you might want to use file in this MP:

file --extension myfile will give back something like png or jpg, suitable to append to the file name as myfile.png and then send that renamed file into a FileResponse. However, it has trouble with some text files (many file formats are text and are hard to tell apart from content), so it will report all of them with ???, which you should treat as meaning txt.

Note that the output might contain newlines or other whitespace you’ll need to remove before using it.
file --mime-type myfile will give back something like text/html or image/gif, suitable to send directly in the Content-Type header. However, that value cannot be directly specified in a FileResponse so you’ll need to send a regular Response instead.

Note: originally, the values put into the content type header were called MIME types because they were defined as part of the Multipurpose Internet Mail Extensions (MIME) standard. Later they were renamed to media types, but older tools and sources still often call them MIME types.

Either of the above will work for this MP.
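If you take the --extension route, the cleanup described above might be sketched as follows. The function name is hypothetical, and two details of the output format are assumptions to verify on your VM: that the result may be prefixed with the file name and a colon, and that several candidate extensions may be listed separated by slashes.

```python
def extension_from_file_output(raw: str) -> str:
    """Turn the text printed by `file --extension` into one extension.

    Hypothetical helper; the handling of a "name: " prefix and of
    slash-separated alternatives is an assumption about the output format.
    """
    text = raw.strip()  # drop the trailing newline and other whitespace
    # If the output looks like "myfile: png", keep only what follows the colon.
    if ": " in text:
        text = text.rsplit(": ", 1)[1].strip()
    # If several extensions are listed (e.g. "jpeg/jpg"), take the first.
    first = text.split("/")[0]
    # "???" means file could not decide; the text above says to treat it as txt.
    return "txt" if first == "???" else first
```

The returned value can then be appended to the stored file's name before passing the path to FileResponse.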

3.3 Manually Testing Your Microservice

Launch your service by starting your web application:

python3 chunkservice.py

Then, you can test your program in two different ways:

By using your web browser and visiting your web server (ex: http://localhost:5000/).
By the command line, using curl to make a request to your web server.
```
curl -f -o output.txt http://localhost:5000/extract -F "png=@tests/hiddentxt.png"
curl -f -o newimage.png http://localhost:5000/insert -F "png=@tests/onered.png" -F "hide=@Makefile"
```
- You can replace @tests/hiddentxt.png and so on with other files. The @ symbols tell curl to send the contents of the file, not just the file name.
- You should inspect output.txt and newimage.png (including extracting the hidden uiuc chunk from newimage.png) to ensure the extraction was successful.
- Make sure that you get an error when sending either endpoint an invalid PNG file.
- Make sure you also get an error when sending the extract endpoint a PNG file without a hidden uiuc chunk.
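As one way to inspect newimage.png without your MP3 tools, the sketch below lists the chunks of a PNG using the layout from the PNG specification: an 8-byte signature, then chunks made of a 4-byte big-endian length, a 4-byte type, the payload, and a 4-byte CRC. The function name list_chunks is hypothetical, and the sketch does not verify CRCs.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def list_chunks(data: bytes) -> list[tuple[str, bytes]]:
    """Return (chunk type, payload) pairs in file order; CRCs are not checked."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = []
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # 4-byte big-endian payload length, then the 4-byte chunk type.
        (length,) = struct.unpack(">I", data[offset:offset + 4])
        chunk_type = data[offset + 4:offset + 8].decode("ascii", errors="replace")
        payload = data[offset + 8:offset + 8 + length]
        chunks.append((chunk_type, payload))
        # Skip length (4) + type (4) + payload + CRC (4).
        offset += 12 + length
    return chunks
```

For example, reading newimage.png with open("newimage.png", "rb").read() and passing the bytes to list_chunks should show a uiuc entry whose payload matches the file you hid.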

3.4 Pytest Test Suite

A pytest suite is provided for you.

python3 -m pytest

This is also exactly what make test does. Because part of the score on this MP comes from checking that your service is running on your VM and visible from other computers, using our private list mapping students to their VMs, make test does not provide a full points breakdown.

4 Deploying your MicroService

You have a Virtual Machine (VM) provided for you as part of being in CS 340. If you have not done so already, learn more about your VM on our environment page.

You might need to install additional libraries to get your VM fully set up. For system packages you can use something like sudo apt-get install gcc. For Python libraries, you can either use the system libraries with sudo apt-get install python3-aiohttp python3-pytest or you can use the Python package system to install them instead with something like python3 -m pip install pytest aiohttp.

4.1 Deployment Requirements

From the time you submit your code to the upload page until your feedback appears (usually 30 minutes or less), your server must be running on your VM using port 5000 and no IP filtering (host 0.0.0.0). This port and host are set up for you in the starter code.

If you have done this correctly, you can visit the URL of your VM, with port 5000, as e.g. http://fa24-cs340-???.cs.illinois.edu:5000/, and see and interact with your web service.

If you want to leave your code running without staying logged in to your VM, you can use nohup to start it and killall to later stop it, like so:

# to start in the background, surviving a log-out:
nohup python3 chunkservice.py </dev/null >/dev/null 2>/dev/null &
# note: the VM will still eventually turn itself off, stopping the app

# to stop all python programs, even those running in the background:
killall --signal SIGINT python3