Week of May 17 Paper Reading
With the start of summer, and the transition to the final two years of the PhD program (yikes!), I have been thinking a lot about what I want to focus on learning during these next two years. One thing I want to do is read papers more frequently so that I stay on top of what is happening in my field. I have never been very good at this, but now that I have completed most of the requirements of the PhD program (classes, TA assignments, etc.), I have more time to dedicate to doing it well. I’m going to try to write a blog post every week summarizing the papers I have read, and my target for now is 4-5 papers a week. This is my first attempt at recording what I’ve read and learned.
Paper 1: Structure motif-centric learning framework for inorganic crystalline systems by Banjade et al. [1]
This paper presents a novel approach to predicting material properties (like bandgap and formation energy) for a dataset of approximately 22,000 inorganic crystalline compounds (specifically, metal ions). First, the paper shows that the “motif” of a crystalline structure (i.e., the base structural unit that is repeated to form the full structure) contains key information that helps cluster the samples in the dataset; they demonstrate this using t-SNE. The results of this clustering do not seem to be used in the second part of the paper, so I am a little confused about the purpose of this section. Next, the authors present a dual-graph approach (i.e., two graphs: one representing the motif itself at the atomic level, and one representing how the motifs assemble) that learns to predict the properties of the structures input to the model. The model, called AMDNet, is shown to outperform other standard models used in the field [1].
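(As a note to myself on what that clustering step looks like in practice, here is a minimal sketch using scikit-learn’s t-SNE on motif feature vectors. The array names and the random stand-in data are my own placeholders, not anything from the paper’s code.)

```python
# Minimal sketch: t-SNE embedding of motif feature vectors, then a scatter plot.
# `motif_features` and `labels` are random stand-ins for real motif descriptors
# and class labels; they are placeholders, not data from the paper.
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
motif_features = rng.normal(size=(500, 64))   # stand-in motif descriptors
labels = rng.integers(0, 5, size=500)         # stand-in class labels

embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(motif_features)

plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, s=5, cmap="tab10")
plt.title("t-SNE of motif features (illustrative only)")
plt.show()
```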
I like this paper because it presents a framework for checking whether the data you have collected about a structure actually captures important information about it. It also presents a model that I could potentially adapt to my self-assembly system to make predictions about what structures it will form. I would like to read references 31 and 48 if I have time.
Paper 2: Motif-Driven Contrastive Learning of Graph Representations by Zhang et al. [2]
This is reference 48 from paper 1 above (reference 31 turned out to be outside my research area). It is a more recent paper than [1] that builds on the idea that graph motifs can be helpful in training graph neural networks (GNNs) for various tasks. Specifically, the authors present three key contributions. First, they argue that using domain-specific graph motifs (here they work with small molecules, so a motif could be a benzene ring, for instance) yields improvements in contrastive learning performance over methods that operate at the node level or use randomly selected subgraphs. Second, they show that they can embed a graph motif in a continuous representation, which allows for differentiation and learning via gradient descent. Finally, they show that GNNs pre-trained with these motifs via contrastive learning outperform the state of the art on chemical property prediction tasks [2].
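(To ground the contrastive learning idea for myself, here is a minimal sketch of an InfoNCE-style loss between graph-level embeddings and motif/subgraph embeddings. This is the generic form of the objective, not necessarily the authors’ exact loss, and the tensors below are random placeholders.)

```python
# Generic InfoNCE-style contrastive loss between graph-level and motif/subgraph
# embeddings. This is a common formulation, not necessarily the exact loss in [2];
# `graph_emb` and `motif_emb` are placeholder tensors of shape (batch, dim).
import torch
import torch.nn.functional as F

def contrastive_loss(graph_emb, motif_emb, temperature=0.1):
    g = F.normalize(graph_emb, dim=1)
    m = F.normalize(motif_emb, dim=1)
    logits = g @ m.t() / temperature      # similarity of every graph to every motif embedding
    targets = torch.arange(g.size(0))     # the matching pair is the positive; the rest are negatives
    return F.cross_entropy(logits, targets)

graph_emb = torch.randn(32, 128)
motif_emb = torch.randn(32, 128)
print(contrastive_loss(graph_emb, motif_emb))
```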
This paper was interesting to me for two main reasons. First, I think the idea of leveraging motifs in the training process could be useful in my work too. Second, the paper does a good job of demonstrating, through an ablation study, how each of its novel contributions adds to the model’s performance gains [2]. I might try to use a similar approach in a future paper, too.
Paper 3: Universal Invariant and Equivariant Graph Neural Networks by Keriven and Peyré [3]
This week I attended a talk by the Scientific Machine Learning Webinar Series at CMU about using GNNs to model fluid flow. One of the questions during the Q&A period mentioned the idea of invariant and equivariant GNNs, and this idea was entirely new to me. I found this paper while I was looking for answers, and it gave a good overview of the concept, as well as a detailed set of mathematical proofs [3].
The key idea here is that, depending on the task, a GNN should be either invariant or equivariant to permutations of its input. A GNN that is invariant to permutations of the nodes of the input graph returns the same output regardless of the order of the input. For example, if I provide a graph as input using an adjacency matrix, it should not matter what order the nodes are listed in the matrix: my GNN should return the same output value every time, regardless of the node ordering. This property is appropriate when we are using a GNN to take a graph as input and produce some scalar as output. One example could be using a GNN to predict the solubility of small molecules when their atomic structure is provided as input [3].
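(To convince myself of this, here is a tiny NumPy check with a made-up graph, features, and weights rather than anything from the paper: one round of neighbor aggregation followed by sum pooling gives the same scalar after the nodes are relabeled.)

```python
# Toy check of permutation invariance: one round of neighbor aggregation followed
# by sum pooling returns the same scalar for any node ordering.
# The graph, features, and weights are all random placeholders.
import numpy as np

rng = np.random.default_rng(0)
A = np.triu(rng.integers(0, 2, size=(5, 5)), 1)
A = A + A.T                                   # random undirected adjacency matrix
X = rng.normal(size=(5, 3))                   # node features
W = rng.normal(size=(3, 4))                   # shared weight matrix

def invariant_readout(A, X, W):
    H = np.tanh(A @ X @ W)                    # simple message-passing step
    return H.sum()                            # sum pooling over nodes -> scalar

P = np.eye(5)[rng.permutation(5)]             # random permutation matrix
print(invariant_readout(A, X, W))                # same value...
print(invariant_readout(P @ A @ P.T, P @ X, W))  # ...after relabeling the nodes
```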
Similarly, a GNN is equivariant if its output changes to follow permutations of the input. This is appropriate when the GNN is going to produce a graph as output. For instance, imagine that you want to take in a modular robot structure represented as a graph and return a graph representing the commands that should be sent to each module (many thanks to my colleague Julian Whitman for this idea). In this situation, if you change the order in which you list the modules on the robot, you want the GNN’s output to be permuted in the same way, so that each command stays associated with the appropriate module [3].
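(And the matching toy check for equivariance: the per-node outputs of the same kind of layer are simply permuted in the same way as the input nodes. Again, everything here is a random placeholder, not the construction from the paper.)

```python
# Toy check of permutation equivariance: relabeling the input nodes permutes the
# per-node output rows in exactly the same way. All values are random placeholders.
import numpy as np

rng = np.random.default_rng(1)
A = np.triu(rng.integers(0, 2, size=(5, 5)), 1)
A = A + A.T                                   # random undirected adjacency matrix
X = rng.normal(size=(5, 3))                   # node features
W = rng.normal(size=(3, 4))                   # shared weight matrix

def equivariant_layer(A, X, W):
    return np.tanh(A @ X @ W)                 # per-node outputs, one row per node

P = np.eye(5)[rng.permutation(5)]             # random permutation matrix
out = equivariant_layer(A, X, W)
out_permuted = equivariant_layer(P @ A @ P.T, P @ X, W)
print(np.allclose(P @ out, out_permuted))     # True: outputs follow the node relabeling
```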
This paper provides a mathematical proof that certain classes of invariant and equivariant GNNs are universal approximators. Specifically, the authors prove “that a GNN described by a single set of parameters can approximate uniformly well a function that acts on graphs of varying size.” They explain that they are building upon a fundamental result for classical MLPs developed in the 1980s and 1990s, called the universal approximation theorem [3]. I want to read more about this next because I do not know anything about the significance of this result and it sounds important.
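(For my own reference, the classical statement is roughly the following, in its one-hidden-layer form; this is the standard textbook version, paraphrased, not taken from [3]. For any continuous function $f : K \to \mathbb{R}$ on a compact set $K \subset \mathbb{R}^n$, any $\varepsilon > 0$, and a fixed non-polynomial activation $\sigma$, there exist a width $N$, weights $w_i \in \mathbb{R}^n$, and scalars $c_i, b_i$ such that

$$\sup_{x \in K} \Big| f(x) - \sum_{i=1}^{N} c_i \, \sigma(w_i^\top x + b_i) \Big| < \varepsilon.$$

In other words, a single hidden layer can approximate any continuous function arbitrarily well on a compact domain.)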
To be honest, I did not work through the mathematical details of the proof. I also got the impression during the webinars that these two properties have inspired different approaches to building or training GNNs, but I do not fully understand that idea yet. I heard that Prof. Rose Yu is thinking about the importance of invariance and equivariance in her research on GNNs, so perhaps I should start by reading some of her work to get a better understanding of the concepts’ significance.
References:
[1] Banjade, H. R., Hauri, S., Zhang, S., Ricci, F., Gong, W., Hautier, G., Vucetic, S., & Yan, Q. (2021). Structure motif–centric learning framework for inorganic crystalline systems. Science Advances, 7(17), eabf1754. https://doi.org/10.1126/sciadv.abf1754
[2] Zhang, S., Hu, Z., Subramonian, A., & Sun, Y. (n.d.). Motif-Driven Contrastive Learning of Graph Representations.
[3] Keriven, N., & Peyré, G. (2019). Universal Invariant and Equivariant Graph Neural Networks. arXiv:1905.04943.