Large-scale GNN training with DGL, SAMPL Talk, 2021-12-02
Title: Large-scale GNN training with DGL
Speaker: Da Zheng (AWS AI)

Abstract: Graph neural networks (GNNs) have shown great success in learning from graph-structured data. They are widely used in applications such as recommendation, fraud detection, and search. In these domains, the graphs are typically large, containing hundreds of millions of nodes and several billions of edges. To scale graph neural network training to such graphs, we adopt hybrid CPU/GPU mini-batch training: we store the graph data and sample nodes and their neighbors on the CPU, and perform the mini-batch computation on GPUs. In this talk, I will discuss optimizations for GNN mini-batch training in two aspects. First, I will discuss our effort to scale GNN training to a cluster of CPUs and GPUs. We develop multiple optimizations to address the challenges of distributed hybrid CPU/GPU training, namely reducing data movement and balancing the load of mini-batch computation. With these optimizations, …
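To make the hybrid CPU/GPU scheme concrete, here is a minimal single-machine sketch of mini-batch training with neighbor sampling in DGL. It assumes a DGL release from around the time of the talk (0.6/0.7) with the PyTorch backend; the dataset (Citeseer), the two-layer GraphSAGE model, the fan-outs, and the batch size are illustrative placeholders, not the setup used in the talk.

import dgl
import dgl.nn as dglnn
import torch
import torch.nn as nn
import torch.nn.functional as F

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# The full graph and its node features stay in CPU memory.
dataset = dgl.data.CiteseerGraphDataset()   # placeholder dataset
g = dataset[0]
train_nids = g.nodes()[g.ndata['train_mask']]

# Neighbor sampling runs on the CPU over the in-memory graph.
sampler = dgl.dataloading.MultiLayerNeighborSampler([10, 25])
dataloader = dgl.dataloading.NodeDataLoader(
    g, train_nids, sampler,
    batch_size=1024, shuffle=True, drop_last=False, num_workers=4)

class SAGE(nn.Module):
    def __init__(self, in_feats, n_hidden, n_classes):
        super().__init__()
        self.layers = nn.ModuleList([
            dglnn.SAGEConv(in_feats, n_hidden, 'mean'),
            dglnn.SAGEConv(n_hidden, n_classes, 'mean')])

    def forward(self, blocks, x):
        # Each block is the bipartite sampled subgraph for one layer.
        for i, (layer, block) in enumerate(zip(self.layers, blocks)):
            x = layer(block, x)
            if i != len(self.layers) - 1:
                x = F.relu(x)
        return x

model = SAGE(g.ndata['feat'].shape[1], 128, dataset.num_classes).to(device)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for input_nodes, output_nodes, blocks in dataloader:
    # Only the sampled subgraphs ("blocks") and the features/labels of
    # the sampled nodes are copied to the GPU for each mini-batch.
    blocks = [b.to(device) for b in blocks]
    x = g.ndata['feat'][input_nodes].to(device)
    y = g.ndata['label'][output_nodes].to(device)
    loss = F.cross_entropy(model(blocks, x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()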
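The distributed part of the talk builds on DGL's dgl.distributed API (DistGraph). Below is a hedged sketch of how one trainer process in the cluster might be set up, assuming the distributed API circa DGL 0.6; the IP config path 'ip_config.txt', the graph name 'mygraph', and the fan-outs are placeholders.

import dgl
import torch

# Connect this trainer to the graph servers listed in the IP config.
dgl.distributed.initialize('ip_config.txt')           # placeholder path
torch.distributed.init_process_group(backend='gloo')  # gradient sync

# DistGraph gives every trainer a view of the whole graph while the
# partitioned data stays in the CPU memory of the cluster machines.
g = dgl.distributed.DistGraph('mygraph')              # placeholder name
pb = g.get_partition_book()

# Give each trainer a roughly equal share of training nodes, preferring
# nodes in its local partition: this balances the mini-batch load and
# reduces cross-machine data movement during sampling.
train_nids = dgl.distributed.node_split(
    g.ndata['train_mask'], pb, force_even=True)

def sample_blocks(seeds):
    # Neighbor sampling executes on the machines owning the partitions.
    seeds = torch.LongTensor(seeds)
    blocks = []
    for fanout in [10, 25]:
        frontier = dgl.distributed.sample_neighbors(g, seeds, fanout)
        block = dgl.to_block(frontier, seeds)
        seeds = block.srcdata[dgl.NID]
        blocks.insert(0, block)
    return blocks

dataloader = dgl.distributed.DistDataLoader(
    dataset=train_nids.numpy(), batch_size=1024,
    collate_fn=sample_blocks, shuffle=True)

# The per-batch loop then mirrors the single-machine sketch above:
# fetch features with g.ndata['feat'][blocks[0].srcdata[dgl.NID]],
# move the blocks to the GPU, and train a DistributedDataParallel-
# wrapped model.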