CS440/ECE448 Spring 2021
Problem One: Search


i. introduction

Key terms: - starting position - waypoint

In this assignment, we will implement general-purpose search algorithms and use them to solve `pacman'-like maze puzzles. This assignment has five parts.

  1. Breadth-first search, with one waypoint.
  2. A* search, with one waypoint.
  3. A* search, with several waypoints.
  4. A* search, with many waypoints.

Throughout this assignment, the goal will be to find a path from a given starting position in a maze which passes through a given set of waypoints elsewhere in the maze. We will begin by finding a path from the starting position to a single destination waypoint. Then we will generalize the implementation to handle multiple waypoints. Finally, we will explore heuristics to handle large numbers of waypoints in a reasonable amount of time.

ii. environment

Key terms: - python 3 - pygame - pip3

This assignment is written in Python 3. If you have never used Python before, a good place to start is the Python tutorial. We recommend using Python version 3.6 or later.

This assignment contains visualization tools which depend on pygame. You can install pygame locally using the pip3 tool.

iii. getting started

Key terms: - agent - path

To get started on this assignment, download the template code. The template contains the following files and directories:

The data/ directory contains the maze files we will be using in this assignment.

The main.py file will be the primary entry point for this assignment. Let’s start by running it as follows:

python3 main.py --human data/part-1/medium

This will open a pygame-based interactive visualization of the data/part-1/medium maze.

pygame visualization

The blue dot represents the agent. You can move the agent, using the arrow keys, to trace out a path, shown in color.

note: if the red-green gradient is hard for you to see, you can make the visualization use an alternative color scheme by specifying the --altcolor option.

The black dots represent the maze waypoints. Observe that this maze contains a single waypoint, in the lower left-hand corner. When you implement the search algorithms for this assignment, you will need to compute a path that takes the agent through all of the waypoints in the maze.

We will grade your submissions using grade.py. This file is available to you, so that you can understand how this assignment will be graded.

Let's see what happens when we run this script:

./grade.py
running in student mode (instructor key unavailable)
{'visibility': 'visible',
 'tests': ({'name': "part-1: `validate_path(_:)` for 'tiny' maze",
            'score': 0.0,
            'max_score': 2.0,
            'visibility': 'visible'},
           {'name': "part-1: correct path length for 'tiny' maze",
            'score': 0.0,
            'max_score': 2.0,
            'visibility': 'visible'},
           {'name': "part-1: `validate_path(_:)` for 'small' maze",
            'score': 0.0,
            'max_score': 2.0,
            'visibility': 'visible'},
           {'name': "part-1: correct path length for 'small' maze",
            'score': 0.0,
            'max_score': 2.0,
            'visibility': 'visible'},
...

As you can see, we have a score of 0, since we have not implemented any of the functions in search.py. We will do this in the next few sections.

1. BFS for a single waypoint

Key terms: - breadth-first search - maze legend - wall cell - start cell - waypoint cell

For Part 1 of this assignment, we will implement breadth-first search for a single waypoint. Specifically, we will implement a function bfs(_:) in search.py with the following signature:

# search.py
def bfs(maze):

This function takes a maze parameter of type maze. This type is defined in maze.py. You can inspect maze cells programatically using the __getitem__(_:) subscript.

cell = maze[row, column] 

warning: this subscript uses matrix notation, meaning the first index is the row, not the column. Spatially, this means the y-coordinate comes before the x-coordinate.

In the rest of this guide, we will use i to refer to a row index, and j to refer to a column index.

The maze size is given by the size member. The size.x value specifies the number of columns in the maze, and the size.y value specifies the number of rows in the maze.

rows    = maze.size.y
columns = maze.size.x

Keep in mind that the coordinate order in size is reversed with respect to the two-dimensional indexing scheme!

Each cell in the maze is represented by a single character of type str. There are four kinds of cells, which should be self-explanatory:

For obvious reasons, a maze will only ever contain one starting position, but it can contain an arbitrary number of waypoint cells. However, for Part 1, you can assume there is only one waypoint cell in the maze.

You can determine what kind of cell a particular maze cell is by using the maze legend, available in the legend property.

# `True` if the cell is a wall cell
cell == maze.legend.wall 

# `True` if the cell is the starting position
cell == maze.legend.start

# `True` if the cell contains a waypoint
cell == maze.legend.waypoint 

You can assume that any cell that is not a wall, start position, or waypoint is empty.

The maze type also supports the following interfaces, which may or may not be useful to you:

Your bfs(_:) implementation should return a maze path, which should be a sequence of (i, j) coordinates. The first vertex of the path should be start, and the last vertex should be waypoints[0].

hint: the deque type, available in the collections module, may be useful.

To make the rest of the assignment easier, we strongly suggest that your implementation for Part 1 use the state representation, transition model, and goal test needed for solving the problem in the general case of multiple dots. For the state representation, besides your current position in the maze, is there anything else you need to keep track of? For the goal test, keep in mind that in the case of multiple waypoints, the agent can have more than one possible ending position.

You can view your generated path, and some interesting statistics about it, by running main.py as follows:

python3 main.py data/part-1/medium --search bfs

You can test the other mazes by replacing the specified maze file medium (in data/part-1/) with one of tiny, small, large, or open.

python3 main.py data/part-1/tiny --search bfs
python3 main.py data/part-1/small --search bfs
python3 main.py data/part-1/large --search bfs
python3 main.py data/part-1/open --search bfs

If you have implemented bfs(_:) correctly, you should see your score from grade.py increase to 50% for the Part 1 test cases. This is because we have not provided the full answer key containing the expected path vertices in key-student, for obvious reasons. However, if you upload your search.py file to Gradescope, the autograder will show you your complete score for Part 1.

2. A* for a single waypoint

Key terms: - heuristic function - manhattan distance

For Part 2 of this assignment, solve the same set of mazes as in the previous part (the directory data/part-2/ is a symbolic link to data/part-1/), this time using the A* search algorithm.

Write your implementation as a function astar_single(_:) in search.py with the following signature:

def astar_single(maze):

hint: the heapq module may be useful.

Since all the test mazes contain only a single waypoint, you can use the manhattan distance from the agent’s current position to the singular waypoint as the A* heuristic function. For two grid coordinates a and b, the manhattan distance is given by:

abs(a.i - b.i) + abs(a.j - b.j)

You can test your implementation by running main.py as follows, replacing data/part-2/medium as needed:

python3 main.py data/part-2/medium --search astar_single

You should find that the path lengths returned by A* are the same as those computed by breadth-first search, but A* explores fewer states.

3. A* for corner waypoints

For Part 3 of this assignment, we will consider a somewhat harder search problem. The mazes for Part 3, available in the directory data/part-3 contain up to four waypoints, one in each corner of the maze. We will use A* search to compute an optimal path taking the agent through all the waypoints in the maze.

You may notice that for some mazes, such as data/part-3/tiny, the shortest path does not visit the waypoint closest to the starting position first. This means you cannot compute a path to the nearest waypoint and apply your solution for the single-waypoint case recursively.

Since we now have multiple dots, you must design a new admissible and consistent heuristic function for the new, generalized A* search implementation.

Write your implementation as a function astar_corner(_:) in search.py with the following signature:

def astar_corner(maze):

You can test your implementation by running main.py as follows, replacing data/part-3/medium as needed:

python3 main.py data/part-3/medium --search astar_corner

You should be able to handle the tiny maze using uninformed breadth-first search. In fact, it is a good idea to try that first for debugging purposes, to make sure your representation works with multiple dots. For the other two mazes, it is crucial to use A* with a good heuristic function. Your heuristic should compute the solution for the medium and large mazes with many fewer explored states than uninformed breadth-first search and in a reasonable amount of time. Make sure your algorithm executes in less than 2 seconds for each test maze.

note: to be admissible, the heuristic values must be lower bounds on the actual shortest path cost to the nearest waypoint, and non-negative (like the manhattan distance in Part 2). to be consistent, it must additionally hold that if an action has cost c, then taking that action can only cause a drop in the heuristic value of at most c.

Once you have brainstormed an admissible heuristic that works well, you can check whether it is indeed consistent, too. The only way to guarantee consistency is with a proof. However, inconsistency can often be detected by verifying that for each node you expand, its successor nodes are equal or higher in f-value. Moreover, if A* ever returns paths of different lengths, your heuristic is inconsistent.

4. A* for many waypoints

Now, consider the more general and harder problem of finding the shortest path through a maze while hitting multiple waypoints. As instructed in Part 1, your state representation, goal test, and transition model should already be adapted to deal with this scenario. The next challenge is to solve different mazes using A* search with an admissible heuristic designed by you.

You can still debug your method using the tiny maze with uninformed BFS, or the heuristic defined in part 2. However, to successfully handle all the inputs, it is crucial to use A* and come up with a better heuristic with better efficiency. Once again, for full credit, your heuristic should be admissible and should permit you to find the solution for the medium search 1) with much fewer explored states than uninformed BFS and 2) in a reasonable amount of time. If you have some other clever way to approach the multiple-dot problem, implement that for part 5.

Write your implementation as a function astar_multiple(_:) in search.py with the following signature:

def astar_multiple(maze):

You can test your implementation by running main.py as follows, replacing data/part-4/medium as needed:

python3 main.py data/part-4/medium --search astar_multiple

Hints for Part 4

In the past almost all working solutions to this problem have used a heuristic based on the minimum spanning tree. The minimum spanning tree of a set of points can be computed easily via Kruskal's algorithm or Prim's algorithm. If T is the total length of the edges in the minimum spanning tree, then the shortest path connecting all the points must have length between T and 2T.

Now, suppose you are in the middle of a search. You're at some location (x,y) with a set of S dots still to reach. Your heuristic function h might be the sum of the distance from (x,y) to the nearest dot, plus the MST length for the dots in S. To compute the MST for a set of dots, you'll need the distance between each pair of dots. The Manhattan distances (or real shortest-path lengths precomputed by A* for single dot) will work here. You may also be able to find a better method.

During search, you'll have many states with the same set of objectives S. So, once you compute the MST length for a set of dots S, you'll probably need to use this number again. Make a table of known MST values to avoid re-doing the MST computation.

Submitting to Gradescope

Submit this assignment by uploading search.py to Gradescope. You can upload other files with it, but only search.py will be retained by the autograder.

This assignment is worth 100 points. The grading rubric is as follows:

part total points # mazes points per maze
1 20 points 5 4 points
2 20 points 5 4 points
3 30 points 3 10 points
4 30 points 3 10 points

Plagiarism

It's ok to copy small amounts of utility code from 3rd party sources, as long as the source is acknowledged. However, it is not ok to consult or copy from full implementations of the algorithm in question, e.g. old code from similar MPs (at UIUC or elsewhere). Gradescope has a pretty good code similarity detection system, which checks for similarity of your code to both your classmates and to online sources, and we use it. Please do not copy code from your classmates. Thanks!