Understanding aiohttp webservices

As of Fall 2024, aiohttp is the most popular web framework on PyPI, with over 6 million downloads a day.

The goal of this page is to give a conceptual overview of how it works under the hood: that is, what it is doing when you run it.

1 Coroutines everywhere

Web frameworks try to hide the complexity of having many clients connect concurrently. Some do this with threads, but aiohttp does it with coroutines. Almost all of its code, and almost all code you write to use it, consists of async def coroutine functions.

Recall that coroutine functions in Python do not fully execute when you invoke them; instead they return coroutine objects that can be started, stopped, and resumed by a scheduling program like that found in asyncio.run.
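As a standalone illustration (plain asyncio, no aiohttp), invoking a coroutine function produces a coroutine object, and a scheduler like asyncio.run is what actually executes it:

```python
import asyncio

# A coroutine function: invoking it builds a coroutine object
# but does not execute its body.
async def greet(name):
    await asyncio.sleep(0)  # a point where the scheduler may pause and resume us
    return f"Hello, {name}"

coro = greet("web")         # a coroutine object, not the string
result = asyncio.run(coro)  # the scheduler starts, runs, and finishes it
print(result)               # Hello, web
```

aiohttp's handlers are coroutine functions in exactly this sense; the framework's scheduler, not your code, decides when each one runs.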

aiohttp uses a wrapper around asyncio.run called aiohttp.web.run_app. This function never returns: it accepts an Application object and uses it to decide how to respond to any incoming HTTP requests.

2 Parsing and Routing Requests

The main services provided by aiohttp are

  1. Setting up a TCP/IP server (i.e. opening sockets).
  2. Parsing HTTP requests sent to it (like you did in MP 5).
  3. Sending those parsed requests to functions you registered based on their path and method.
  4. Sending the results of your functions back to those who sent the requests, formatted as HTTP responses.
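A rough sketch of steps 1, 2, and 4 can be built with only the standard library. This is illustrative only, with a made-up handler; real aiohttp parses full headers, supports keep-alive, pipelining, and much more:

```python
import asyncio

# Step 2 (simplified): parse just the request line, e.g. "GET /path HTTP/1.1".
# Step 4 (simplified): write back a fixed-format HTTP response.
async def handle_client(reader, writer):
    request_line = (await reader.readline()).decode().strip()
    method, path, _version = request_line.split(" ", 2)
    body = f"you sent {method} {path}"
    writer.write(
        (
            "HTTP/1.1 200 OK\r\n"
            f"Content-Length: {len(body)}\r\n"
            "\r\n"
            f"{body}"
        ).encode()
    )
    await writer.drain()
    writer.close()

# Step 1: open a server socket and accept clients forever.
async def main():
    server = await asyncio.start_server(handle_client, "127.0.0.1", 8080)
    async with server:
        await server.serve_forever()  # like run_app, this never returns

# asyncio.run(main())  # uncomment to actually serve
```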

The third of these steps requires the most intervention from us: we have to tell aiohttp which methods and paths should be routed to which of our functions. There are multiple ways to do that, but a common and effective one is to

  1. Make a routing table, which is effectively a dict with (method, path) pairs as keys and your functions as values.

    from aiohttp import web
    routes = web.RouteTableDef()
  2. Decorate each function you write with the method and path you want to send to it

    @routes.get("/some/path")
    async def handle_this_path(req: web.Request) -> web.Response:
        return web.Response(status=500, text='Server is incomplete')

    This decorator does allow some fancy processing with placeholders in the path and the like, but its basic operation is something like

    class RouteTableDef:
        """for illustrative purposed only; do not use"""
        def __init__(self):
            self.routes = {}
        def get(self, path):
            def wrapper(func):
                self.routes[('GET', path)] = func
                return func
            return wrapper

    In other words, get(path) returns a function wrapper which, given a function func, remembers that func should be called for GET-method requests to path and then returns func unchanged.
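The placeholder feature mentioned above lets one template like "/users/{name}" match many concrete paths; aiohttp exposes the matched values to your handler via the request's match_info. One plausible sketch of how such a template could be compiled for matching (illustrative only, not aiohttp's real matcher) is:

```python
import re

# Illustrative only: turn a template like "/users/{name}" into a regular
# expression with a named group per placeholder, then extract matched values.
def compile_path(template):
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    return re.compile("^" + pattern + "$")

rx = compile_path("/users/{name}")
print(rx.match("/users/ada").groupdict())  # {'name': 'ada'}
print(rx.match("/users/ada/posts"))        # None: [^/]+ stops at the next slash
```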

  3. After defining all the functions you want to use, if the file is being run as a program (not imported), make an app that uses those routes and start it

    if __name__ == '__main__':
        app = web.Application()
        app.add_routes(routes)
        web.run_app(app) # this function never returns

    run_app opens a server socket and accepts client sockets. It parses each message a client sends as an HTTP request and compares it to the entries in routes; if it finds a match it calls the corresponding function, and otherwise it returns a 404 Not Found HTTP response.
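That lookup-or-404 logic can be sketched in a few lines (synchronous and illustrative only; aiohttp's real dispatch is asynchronous and also handles placeholders, normalization, and more):

```python
# Illustrative sketch of the dispatch step: look up (method, path) in the
# routing table, falling back to 404 Not Found when nothing matches.
def dispatch(routes, method, path):
    handler = routes.get((method, path))
    if handler is None:
        return (404, "Not Found")
    return handler()

table = {("GET", "/ping"): lambda: (200, "pong")}
print(dispatch(table, "GET", "/ping"))   # (200, 'pong')
print(dispatch(table, "GET", "/nope"))   # (404, 'Not Found')
```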

    Because it might take arbitrarily long for an HTTP request to arrive and there’s no upper limit on how many may arrive, run_app never returns: it keeps running until you kill the app from the command line with Ctrl+C. Code in the file after the call to run_app won’t be executed.

In essence, these components mean you can focus on writing functions that decide what response to give to each request without worrying about all of the details of sockets and HTTP formatting and so on.