Skip to content

Graph Traversal

Hello, welcome to competitive programming!

Today we will cover Depth First Search and Breadth First Search.

Objectives

You will be able to implement both of these algorithms and show how to use them to solve some classic graph algorithms like connected components and …

DFS and BFS

DFS Basics

DFS is one of the first search algorithms computer science students learn for graphs and trees. It is very simple. Starting with the root, you mark the current node as visited, and then recursively visit any unvisited children.

It takes up memory proportional to the depth of the tree or graph you are processing. This usually gives good performance, but if you are dealing with a infinitely deep tree, such as a virtual representation of a decision tree where nodes are calculated and not stored, this can cause trouble.

Recursive DFS Code (from the book)

Here is a recursive implementation. For most of the problems you will solve, an adjacency list is the best implementation. There are exceptions, such as if the graph is extremely dense, and hopefully you will recognize that case from the problem statement.

One of the customs in competitive programming is to create these crazy typedefs for integer pairs, vectors of integer pairs, and integer vectors.

This particular representation uses a pair to represent an edge; the first part gives us the neighbor the edge connects to, the second part is the weight.

We also set two constants, visited equal to 1, and unvisited equal to -1. The dfs_num vector will keep track of the nodes we have visited or not, and the adjacency list is kept in AdjList. The only other trick here is that we have to cast size to an integer since size returns a different type; basically an unsigned quantity.

Notice how this code is similar to a dynamic programming algorithm, where we are only concerned about whether or not we visited a node.

BFS Basics

BFS is the dual of DFS. Whereas DFS prefers to go deep, BFS prefers to go wide.

BFS is interesting for a few reasons. First, if there are more that one node that fulfills the search criterea, BFS will find the one closest to the root of the search. Second, BFS can handle virtual trees of infinite depth without trouble, assuming they aren’t too wide.

The tradeoff is that BFS takes a lot more memory – potentially – than DFS. If you consider the effect of BFS on a binary tree you can see what happens; toward the end of the algorithm all the leaf nodes will be in the queue. Given that half of a binary tree’s nodes are leaves, you will need to plan on the queue possibly growing to be pretty large. Usually this is not a problem though.

BFS Implementation, also from the book

To implement this, we initialize a vector d of size (V) with an initial value INF. We can just pick some huge integer constant to represent infinity. The infinities are how we are going to keep track of whether we have visited a node yet.

After we push the source node into the queue, we can begin.

BFS tends to be iterative instead of recursive; it’s not that hard to write it recursively, but the recursion depth may be a problem for a large graph. Anyway, as long as there are elements in the queue, we pop them out, and enqueue any children that haven’t already been visited, after we set their distances.

Solving Problems

Connected Components

There are many kinds of problems you can solve using these traversals. One of them is to find all the groups of connected components in an undirected graph.

All you have to do is loop though your node vector. If a nodes is marked as unvisited, you just DFS it, and increment your count of connected components.

Flood Fill

Another use for DFS or BFS is called flood fill. The use here is to give each connected component of a 2d space a color. The dr and dc vectors are interesting; they give us the directions on a graph for the eight canonical directions. You’ll see where we recurse that the code is much cleaner than if we had to make eight recursive calls explicitly, or else nest some for loops.

Variables r and c are the coordinates, c1 is the color of the component that interests us, and c2 is the new color we will assign it. As a bonus, we are also able to find the size of the component.

Topological Sort

Topological sort is another classic example. The idea is we want to get a list of nodes that we can visit in such a way that we always visit a nodes parents before the nodes children. If you think about it; it’s just a DFS in which you write down the nodes in the order that you could visit them.

The normal DFS code can do this with a simple modification: after we have called the children, we push the current node onto a vector. When you are done, the vector will have a topological sort, but note that it will be in reverse order.

Calling It

Inside of your main function, you will start by clearing out the ts vector and setting your dfs vector to UNVISITED. Afterwards, the code is similar to the connected components code. The reason you need to do this is if the graph you have is not completely connected. Note that we assume the graph has no cycles.