A Quick Note on Graph Implementations

Today we are going to briefly introduce graphs in the context of computer science and implementations in Python. I have written about graphs before so I don’t want to belabor the point too much, but there are a couple ideas about how graphs can be implemented in code that I haven’t explicitly written about before. Also, if you are interested, I have played around with graph implementations in code thanks to tutorials by [1] and [2].

As we’ve seen before, graphs are a collection of nodes with edges between some or all of the nodes, depending on the type of graph [3]. The edges can be directed or undirected. You may be allowed to have disconnected subgraphs, or your graph may have a path that traverses all the nodes, in which case you have a connected graph. The graph may be cyclic (you can travel in a circle around a group of nodes to return to the node you started at) or it may be acyclic [3].

The concept that I have not really discussed before*1 is how we can represent graphs in code. There are two common methods to do this, using either an adjacency list or an adjacency matrix [3]. With an adjacency list, every vertex is stored as a list, where the entries in the list are the vertices that that particular vertex is connected to. If the graph is undirected, a given edge (a, b) will be stored twice in the lists for both vertex a and b [3].

The alternative to adjacency lists is to use an adjacency matrix, which is an N x N Boolean matrix, where N is the number of nodes [3]. At a given location in the matrix, for example at matrix[i][j], a true value would indicate that there is an edge between vertex i and vertex j. The matrix will be symmetric if the graph is undirected since we will need to have a true value in location matrix[i][j] and location matrix[j][i] [3].

Why does it matter what implementation you choose to use? Your choice of implementation method should be driven by the algorithm you want to apply to the graph [3]. An adjacency list is best for applications where you need to iterate through the neighbors of a specific node very quickly, because you will leverage the advantages of a list data structure [3]. The adjacency matrix will not perform so well if you need to look through all the neighbors of a specific node, because you will need to iterate through all the nodes in the graph to find all the neighbors of one specific node [3].

Algorithms for operating on graphs, starting with breadth-first search (BFS) and depth-first search (DFS), are an entire topic unto themselves. I will definitely cover BFS and DFS in a separate post in the future, and perhaps other topics as well.

Footnotes:

*1 I sort of touched on this in my discussion of graph-based GANS where we used an adjacency tensor, which is basically an adjacency matrix approach on steroids, because our matrix is now three dimensional.

References:

[1] “How to implement a graph in Python.” Edpresso. https://www.educative.io/edpresso/how-to-implement-a-graph-in-python Visited 19 Jan 2021.

[2] Klein, Bernd. “Graphs in Python.” Python Advanced Course Topics. https://www.python-course.eu/graphs_python.php Visited 19 Jan 2021.

[3] Laakmann McDowell, G. Cracking the Coding Interview, 6th edition. 2016. CareerCup, LLC.

Written on January 19, 2021