A Quick Note on Graph Implementations
Today we are going to briefly introduce graphs in the context of computer science and implementations in Python. I have written about graphs before so I don’t want to belabor the point too much, but there are a couple ideas about how graphs can be implemented in code that I haven’t explicitly written about before. Also, if you are interested, I have played around with graph implementations in code thanks to tutorials by [1] and [2].
As we’ve seen before, graphs are a collection of nodes with edges between some or all of the nodes, depending on the type of graph [3]. The edges can be directed or undirected. You may be allowed to have disconnected subgraphs, or your graph may have a path that traverses all the nodes, in which case you have a connected graph. The graph may be cyclic (you can travel in a circle around a group of nodes to return to the node you started at) or it may be acyclic [3].
The concept that I have not really discussed before*1 is how we can represent graphs in code. There are two common methods to do this, using either an adjacency list or an adjacency matrix [3]. With an adjacency list, every vertex is stored as a list, where the entries in the list are the vertices that that particular vertex is connected to. If the graph is undirected, a given edge (a, b) will be stored twice in the lists for both vertex a and b [3].
The alternative to adjacency lists is to use an adjacency matrix, which is an N x N Boolean matrix, where N is the number of nodes [3]. At a given location in the matrix, for example at matrix[i][j], a true value would indicate that there is an edge between vertex i and vertex j. The matrix will be symmetric if the graph is undirected since we will need to have a true value in location matrix[i][j] and location matrix[j][i] [3].
Why does it matter what implementation you choose to use? Your choice of implementation method should be driven by the algorithm you want to apply to the graph [3]. An adjacency list is best for applications where you need to iterate through the neighbors of a specific node very quickly, because you will leverage the advantages of a list data structure [3]. The adjacency matrix will not perform so well if you need to look through all the neighbors of a specific node, because you will need to iterate through all the nodes in the graph to find all the neighbors of one specific node [3].
Algorithms for operating on graphs, starting with breadth-first search (BFS) and depth-first search (DFS), are an entire topic unto themselves. I will definitely cover BFS and DFS in a separate post in the future, and perhaps other topics as well.
Footnotes:
*1 I sort of touched on this in my discussion of graph-based GANS where we used an adjacency tensor, which is basically an adjacency matrix approach on steroids, because our matrix is now three dimensional.
References:
[1] “How to implement a graph in Python.” Edpresso. https://www.educative.io/edpresso/how-to-implement-a-graph-in-python Visited 19 Jan 2021.
[2] Klein, Bernd. “Graphs in Python.” Python Advanced Course Topics. https://www.python-course.eu/graphs_python.php Visited 19 Jan 2021.
[3] Laakmann McDowell, G. Cracking the Coding Interview, 6th edition. 2016. CareerCup, LLC.