A maze is a type of undirected graph.
The classic definition of an undirected graph is two sets: a set of vertices and a set of edges where each edge is a set of two vertices.
Because our mazes will be grids, the natural representation of a vertex is an pair.
Python supplies sets with two types: set
can be mutated (good for adding and removing edges) but cannot be a member of another set; and frozenset
cannot be mutated but can be a member of another set.
This suggests the following data structure:
(0,0)
frozenset([(0,0), (1,0)])
To make a 7×7 maze, the set of vertices would range from (0,0)
to (6,6)
and the edges would be a subset of all possible horizontal edges frozenset([(x,y),(x+1,y)])
and all possible vertical edges frozenset([(x,y),(x,y+1)])
. Which subset you pick changes which maze you have.
dict
s for edges.
The set-of-edges approach makes finding the edges that include a given vertex slightly tedious: you have to generate the four possibilities and check which ones are in the set.
An alternative is to store bidirectional edges in a dict
. Use vertices as keys and the set of vertices the key is connected to as values. This makes finding the edges of a path easy, but makes randomly picking an edges more complicated and also necessitates updating two entries for every edit to the edges to keep them bidirectional.
dict
s if you prefer.
You’ll need a subset of the following four special edges to be exits to your maze:
left: | |
down: | |
right: | |
up: |
Those exit edges should be the only edges that connect to a vertex not . We recommend adding them as a last step after your maze is fully built to avoid your maze generation algorithm being confused by an edge to nowhere.
We need to send mazes between computers, which means we need a standard way of representing them. We’ll use the following format:
A maze is represented by a list of strings, one string per row of vertices in the maze, ordered top to bottom.
A row is represented by a string of hexadecimal digits, one digit per vertex in the maze, ordered left to right.
A cell is represented by a hexadecimal digit (i.e. a number between 0 and 15 inclusive) where each bit represents whether that cell as a wall (1) or not (0) in a given direction. We also use these same hexadecimal digits to represent which borders of your maze should have exits.
We map bits to directions as follows:
Direction | Bit | Exit vertex | Exit edge |
---|---|---|---|
Left | |||
Down | |||
Right | |||
Up |
Cell 0
has no walls.
Cell 2
has one wall, on its bottom side.
Cell 5
has walls to its left and right because 0x5 = 1 + 4
.
Cell F
has walls in all directions, because 0xF = 1 + 2 + 4 + 8
.
Maze 2
has exits on all sides except down. That means it has three exit edges (, , and ) and that the maze is sufficiently navigable that a path exists between the three exit vertices: a path between
and and a path between and and a path between and . Note that due to the transitivity of paths, if two of those paths exist the third is
guaranteed to exist as well.
Maze F
has no exists and needn’t be navigable at all.
The maze
["988088c","1000004","1000004","0000000","1000004","1000004","3220226"]
is an empty 7×7 room with exits on all four borders.
The maze
["98a0a8c","1494b04","1430e14","0280820","1e575d5","1e1c575","3a202a6"]
has walls that spell out CS 340
with exits on all four borders.
This will create ugly
mazes with open areas and/or inaccessible areas, but it is easy to implement and sufficient for our needs.
Any spanning tree of the graph with all edges included is a fully navigable maze. You likely learned some minimal spanning tree algorithm in your data structures course which could be applied to maze generation.
Note that minimal spanning tree algorithms are biased toward mazes with short solutions and many dead ends.2 This bias arises because the first acyclic path added to the graph connecting two entry points wins. Minimal spanning tree algorithms add one edge at a time, meaning longer paths require more steps to create and are thus less likely to appear before shorter paths.
Wilson’s algorithm provides a uniform distribution over all possible spanning trees. It uses loop-erased random walks as its primary computational element.
The simplest implementation strategy for Wilson’s algorithm that I know goes as follows:
unused.
unused.
used.
used:
used; call this vertex both and .
used:
used:
used.
used.
In the two pick a vertex
steps (3. and 4.a.), which vertex you pick won’t change the final outcome: you can pick them at random, pick them in a fixed predetermined order, or anything else you wish.
Step 4.b. executes a random walk of the graph: it starts somewhere and moves in a random direction over and over. 4.b.ii. marks the vertices visited during the walk so that the random walk can be repeated later.
Random walks tend to cross over themselves often, creating loops in the walk that would prevent them from being part of a spanning tree. However, those loops are erased
in this code by subsequent executions of 4.b.ii.: if we visit the same vertex multiple times, only the last visit’s path will be recorded in the vertex’s label.
Because 4.b. erased any loops, 4.c. has a random walk with no loops that ends at a vertex in the maze so far. Adding that walk to the maze extends the maze into a larger spanning tree. Because each addition is a truly random walk and may be any length, there’s no bias introduced by adding super-short single-edge paths early on and then later additions being limited by those early choices.
Many simple maze solving algorithms rely on the maze being a tree, not an arbitrary graph, and fail if the maze contains loops. Adding some loops to a maze can simultaneously make it harder to solve and make the shorted path through the maze shorter.
The easiest way to add loops to a maze is to generate a maze without loops first and then add a few edges. Of particular interest is a maze with no dead ends, which can be created by adding another edge to any vertex that only has one edge.
You don’t need to make your mazes entirely at random. It’s perfectly acceptable to draw out a set of walls you find interesting in some way and use that.
It’s also possible to craft small wall patterns and combine those programmatically into a full maze; for example, a 7×7 maze is large enough to represent 6 decimal digits as walls, and most letters can also be created.