Classic Logic

DFS Subtlety

I’ve been brushing up my algorithms knowledge during my free time. My preferred way to do this is working on Steven Skiena’s excellent Algorithm Design Manual.

Depth-first traversal of graphs have struck me as being subtle. In this post I’m jotting down the subtleties involved in discovering back-edges, so that I don’t have to re-think it every time.

Here’s a python implementation of the DFS routine using pseudocode from the book:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
def depth_first_search(graph, vertex, parent, state):
    state[vertex] = VertexState.DISCOVERED
    process_vertex_early(vertex)
    for adj_vertex in graph[vertex]:
        if state[adj_vertex] != VertexState.DISCOVERED:
            parent[adj_vertex] = vertex
            process_edge(vertex, adj_vertex)
            depth_first_search(graph, adj_vertex, parent, state)
        elif state[adj_vertex] != VertexState.PROCESSED and parent[vertex] != adj_vertex:
            # adj_vertex has been discovered before
            print("back-edge from {} to {}".format(vertex, adj_vertex))
            process_edge(vertex, adj_vertex)

    # we have processed all our adjacent vertices (if any)
    # therefore we are processed
    process_vertex_late(vertex)
    state[vertex] = VertexState.PROCESSED

VertexState is an enum with UNDISCOVERED, DISCOVERED, and PROCESSED states, denoting whether we have discovered and completely processed a vertex in our depth-first traversal

state[vertex] is initialized to VertexState.UNDISCOVERED for all vertices.

Vertex states can only progress from undiscovered to discovered to processed.

Look at the first if:

if state[adj_vertex] != VertexState.DISCOVERED:
            parent[adj_vertex] = vertex
            process_edge(vertex, adj_vertex)
            depth_first_search(graph, adj_vertex, parent, state)

  • If the adjacent vertex is not (yet) discovered, we’re seeing it for the first time.
  • Parent of the adjacent vertex is therefore current vertex
  • We process the edge as required, and do a depth-first traversal on the newly-discovered adjacent vertex
  • Marking of the adjacent vertex as DISCOVERED is done as the first step in the recursive call

Now take a look at the elif:

elif state[adj_vertex] != VertexState.PROCESSED and parent[vertex] != adj_vertex:
    print("back-edge from {} to {}".format(vertex, adj_vertex))
    process_edge(vertex, adj_vertex)

  • if the adj_vertex is not in the processed state, then it is in the DISCOVERED state. If it wasn’t even in the discovered state, it would be in the UNDISCOVERED state, and would have gone to the if condition above.
  • The edge (vertex, adjacent_vertex) can be:
    • Cannot be the edge to an undiscovered child because of the above point
    • Cannot be the edge to a discovered child (and not ancestor) because in a depth-first traversal, we process all our children before examining our undiscovered children. ie., our children are either undiscovered, or they are processed. They cannot end up in a discovered state. Once you discover a child, you completely process it before dealing with other children.
    • May (or may not) be the edge to our immediate parent.

If it turns out that this adjacent vertex is infact not our parent (checked by the 2nd part of the elif condition), the only other explanation would be if it is an ancestor. This means we discovered a back-edge.