When you first learned to program you focused on gaining language fluency. Language fluency is the ability to express in the programming language on hand the behaviors you want the computer to take. This is partly about correctness, but also partly about idiomatic usage; for example, in C while (s[0] != '\0') and while (*s) are equally correct, but the second is more idiomatic, more what we’d expect to see an experienced C programmer write.

Once you had gained basic fluency in a language or two, you moved on to specifying behaviors using algorithms and data structures. Algorithms describe how to derive desired information from available information. Data structures describe how to store and retrieve information. Both provide a simplified interface – an API or ADT – and both are accompanied by formal descriptions of their performance, scalability, and requirements for correct functioning. Together, they allow us to add new behaviors to the vocabulary of the language itself.

To build functioning and useful software systems, we must go beyond fluency and behavior specification to designing information flow within the system. When something changes in on place, how do we make sure that other places that need to react are aware of that? How do we know if a resource is done being used and ready to be deallocated or merely on pause and worth holding onto? How do we ensure that confidential information is available to the parts of the system that need it but not leaked to anyone else? A well-designed system includes data structures and algorithms in the same way that a well-designed building includes furnaces and wiring, but the quality of the system depends more on how well those components are integrated than on the quality of the components themselves.

Software design is not a topic that gets much attention in the average CS degree program. Partly this is due to the need to build language and algorithmic fluency first. Partly it is due to the scale of meaningful design projects making them challenging to fit within a semester. Partly it is due to the challenge of delivering design instruction to classes of hundreds of students. Partly it is due to the fact that design is part of software engineering, not computer science, but the two fields tend to share departments. Whatever the reason, you are likely to receive only scattered instruction in software design.

1 Our PNG reading design

We had you implement a PNG reader. In doing this, we did some design for you, requesting that you create a function that reads one chunk at a time and a separate function that frees a chunk. Those two functions embodied an important design idea:

We made an API for file format that was entirely independent of the GIF-hiding task we had in mind.
We made an API that mirrored the malloc/free pairing for memory management, but with some application-specific smarts.

We could have instead designed the task to have one function that skimmed the entire file looking for the uiuc chunk and only when it was found do any additional work. That would have been a functional design, but it also would have meant that that one function would have to have understood all of the pieces of the PNG format and have handled memory resources correctly, making it a much more cognitively-intricate piece of code and thus much more susceptible to error. It also would have made writing a chunk significantly more complicated than reading one, further exacerbating the challenges introduced by bad design.

2 `malloc` as a design problem

Implementing malloc is theoretically simple. You need a map from pointer to (size, free) pairs; a list of free blocks; and an algorithm for seeing if two free blocks are adjacent.

The frustrating challenge of implementing malloc is that it is hard to debug: when malloc breaks, everything breaks, including debuggers.

The more intellecually interesting challenge in implementing malloc is about design. Given you don’t have a separate part of memory to store the data structures you need in, how can you integrate them with the allocated blocks of memory? How do you adjust algorithms that would be simple on a stand-alone list to work with your merged list-and-block system?

3 Worker threads

We will soon have you start writing multi-threaded applications. These will have multiple functions running concurrently, each doing their own thing, while all sharing the same memory. A major challenge of multi-threaded application design is the same challenge faced by any multi-agent system such as an office with multiple employees: keeping the agents work separate enough to work on it on their own but coordinated enough that their individual work combines to achieve a common goal.

One design we’ll use is similar to a manager with several employees: one main thread will keep track of the overall status of the work and send specific tasks to individual worker threads to accomplish on their own.

Worker threads are far from the only threading design, and we’ll also explore some less centralized threading designs; but worker threads are an effective and transferable design.

4 as-a-Service designs

Later in the semester we’ll have you both use and implement various Something-as-a-Service designs. These will each have a similar design principle: we’ll

Identify some part of the computing ecosystem.
Define the smallest, simplest way of requesting that part that we can.
Build a system that accepts and responds to those requests.

Figuring out what such a request interface should look like is a major creative endeavor that has helped refine and improve software design many times. Realizing that some task can be isolated from all others and implemented all on its own helps tame the complexity of the software design space and is a major driver of every-increasing software quality and capability.

1 Our PNG reading design

2 malloc as a design problem

3 Worker threads

4 as-a-Service designs

2 `malloc` as a design problem