GraphAnalysis.org: Workshop on Graphs, Architectures, Programming, and Learning (GrAPL 2020)

GrAPL 2020: Workshop on Graphs, Architectures, Programming, and Learning

Virtual

19 May 2020

Scope and Goals:

GrAPL is the result of the combination of two IPDPS workshops:

GABB: Graph Algorithms Building Blocks
GraML: Workshop on The Intersection of Graph Algorithms and Machine Learning

Data analytics is one of the fastest-growing segments of computer science. Much of the recent focus in data analytics has been on machine learning, which is understandable given the success of deep learning over the last decade. However, many real-world analytic workloads mix graph and machine learning methods. Graphs play an important role in the synthesis and analysis of relationships and organizational structures, furthering the ability of machine learning methods to identify signature features. Given the difference in the parallel execution models of graph algorithms and machine learning methods, current tools, runtime systems, and architectures do not deliver consistently good performance across data analysis workflows. In this workshop we are interested in graphs, how their synthesis (representation) and analysis are supported in hardware and software, and the ways graph algorithms interact with machine learning. The workshop’s scope is broad, a natural outgrowth of the wide range of methods used in large-scale data analytics workflows.

This workshop seeks papers on the theory, model-based analysis, simulation, and analysis of operational data for graph analytics and related machine learning applications. We are particularly interested in papers that:

Provide tractability and performance analysis in terms of complexity, time-to-solution, problem size, and quality of solution for systems that deal with mixed data analytics workflows;
Discuss the problem domains and problems addressable with graph methods, machine learning methods, or both;
Discuss programming models and associated frameworks such as Pregel, Galois, Boost, GraphBLAS, GraphChi, etc., for building large multi-attributed graphs;
Discuss how frameworks for building graph algorithms interact with those for building machine learning algorithms;
Discuss hardware platforms specialized for addressing large, dynamic, multi-attributed graphs and associated machine learning.

Besides regular papers, papers describing work-in-progress or incomplete but sound, innovative ideas related to the workshop theme are also encouraged.

Location:

This workshop is co-located with IPDPS 2020, held virtually 18-22 May 2020. Registration information for IPDPS 2020 can be found on the IPDPS website.

Program:

Keynote Talk 1:

Deep Graph Library: Overview, Updates, and Future Developments

George Karypis
(AWS Deep Learning Science & University of Minnesota)

Learning from graph and relational data plays a major role in many applications including social network analysis, marketing, e-commerce, information retrieval, knowledge modeling, medical and biological sciences, engineering, and others. In the last few years, Graph Neural Networks (GNNs) have emerged as a promising new supervised learning framework capable of bringing the power of deep representation learning to graph and relational data. This ever-growing body of research has shown that GNNs achieve state-of-the-art performance for problems such as link prediction, fraud detection, target-ligand binding activity prediction, knowledge-graph completion, and product recommendations.

Deep Graph Library (DGL) is an open source development framework for writing and training GNN-based models. It is designed to simplify the development of such models by using graph-based abstractions while at the same time achieving high computational efficiency and scalability by relying on optimized sparse matrix operations and existing highly optimized standard deep learning frameworks (e.g., MXNet, PyTorch, and TensorFlow). This talk provides an overview of DGL, describes some recent developments related to high-performance multi-GPU, multi-core, and distributed training, and describes our future development roadmap.
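To make the link between graph-based abstractions and sparse matrix operations concrete, the following minimal sketch shows one neighbor-aggregation step of a GNN written as a sparse-times-dense product over a CSR adjacency structure. It is plain Python for illustration only and does not use or reproduce DGL's API; the function names (`to_csr`, `mean_aggregate`), the mean aggregator, and the example graph are all assumptions made for this sketch.

```python
# Illustrative sketch only: plain Python, not DGL's API.

def to_csr(num_nodes, edges):
    """Build CSR arrays (indptr, indices) over the incoming edges of each node.

    `edges` is a list of (src, dst) pairs; row v lists the sources u
    that send a message to v.
    """
    counts = [0] * num_nodes
    for _, dst in edges:
        counts[dst] += 1
    indptr = [0] * (num_nodes + 1)
    for v in range(num_nodes):
        indptr[v + 1] = indptr[v] + counts[v]
    indices = [0] * len(edges)
    cursor = indptr[:-1].copy()  # next free slot in each row
    for src, dst in edges:
        indices[cursor[dst]] = src
        cursor[dst] += 1
    return indptr, indices


def mean_aggregate(indptr, indices, features):
    """Sparse-dense product: each node averages its in-neighbors' features."""
    dim = len(features[0])
    out = []
    for v in range(len(indptr) - 1):
        row = indices[indptr[v]:indptr[v + 1]]
        acc = [0.0] * dim
        for u in row:
            for k in range(dim):
                acc[k] += features[u][k]
        if row:  # nodes without in-neighbors keep a zero vector
            acc = [x / len(row) for x in acc]
        out.append(acc)
    return out


# Hypothetical 4-node directed graph with 2-dimensional node features.
edges = [(0, 1), (2, 1), (1, 2), (3, 2), (0, 3)]
features = [[1.0, 0.0], [0.0, 2.0], [4.0, 4.0], [2.0, 6.0]]

indptr, indices = to_csr(4, edges)
aggregated = mean_aggregate(indptr, indices, features)
print(aggregated)
```

In a real framework this loop would be replaced by an optimized sparse kernel running on the CPU or GPU; the sketch only shows why message passing over edges and a sparse matrix product describe the same computation.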

George Karypis is a Distinguished McKnight University Professor and an ADC Chair of Digital Technology at the Department of Computer Science & Engineering at the University of Minnesota, Twin Cities. His research interests span the areas of data mining, high performance computing, information retrieval, collaborative filtering, bioinformatics, cheminformatics, and scientific computing. His research has resulted in the development of software libraries for serial and parallel graph partitioning (METIS and ParMETIS), hypergraph partitioning (hMETIS), parallel Cholesky factorization (PSPASES), collaborative filtering-based recommendation algorithms (SUGGEST), clustering high-dimensional datasets (CLUTO), finding frequent patterns in diverse datasets (PAFI), and protein secondary structure prediction (YASSPP). He has coauthored over 280 papers on these topics and two books: “Introduction to Protein Structure Prediction: Methods and Algorithms” (Wiley, 2010) and “Introduction to Parallel Computing” (Addison Wesley, 2003, 2nd edition). In addition, he serves on the program committees of many conferences and workshops on these topics, and on the editorial boards of the IEEE Transactions on Knowledge and Data Engineering, ACM Transactions on Knowledge Discovery from Data, Data Mining and Knowledge Discovery, Social Network Analysis and Data Mining Journal, International Journal of Data Mining and Bioinformatics, the journal on Current Proteomics, Advances in Bioinformatics, and Biomedicine and Biotechnology. He is a Fellow of the IEEE.

Keynote Talk 2:

The GraphIt Universal Graph Framework: Achieving High-Performance across Algorithms, Graph Types, and Architectures

Saman Amarasinghe (Massachusetts Institute of Technology)
Video

In recent years, large graphs with billions of vertices and trillions of edges have emerged in many domains, such as social network analytics, machine learning, physical simulations, and biology. However, optimizing the performance of graph applications is notoriously difficult due to irregular memory access patterns and load imbalance across cores. The performance of graph programs depends heavily on the algorithm, the size and structure of the input graphs, and the features of the underlying hardware. No single set of optimizations or single hardware platform works well across all applications.

In this talk, I will present the GraphIt Universal Graph Framework, a domain-specific language (DSL) that achieves consistently high performance across different algorithms, graphs, and architectures, while offering an easy-to-use high-level programming model. The GraphIt language decouples the program specification (algorithm language) from performance optimizations (scheduling language), and the GraphIt compiler separates hardware-independent transformations from multiple architecture-specific backends (GraphVMs). I will describe how GraphIt achieves up to 4.8x speedup over state-of-the-art graph frameworks on CPUs and GPUs, while reducing the lines of code by up to an order of magnitude.
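The idea of separating what a graph program computes from how it is traversed can be sketched in a few lines. The example below is plain Python, not GraphIt syntax, and every name in it (`reverse`, `bfs_levels`, the `direction` parameter, the example graph) is a hypothetical chosen for illustration. The breadth-first search logic is fixed, while the `direction` argument stands in for a schedule that selects a push or pull traversal without changing the result.

```python
# Illustrative sketch only: plain Python, not GraphIt syntax.

def reverse(out_adj):
    """Derive incoming-edge lists from outgoing-edge lists."""
    in_adj = {v: [] for v in out_adj}
    for u, nbrs in out_adj.items():
        for v in nbrs:
            in_adj[v].append(u)
    return in_adj


def bfs_levels(out_adj, in_adj, source, direction="push"):
    """Breadth-first search whose result does not depend on `direction`.

    The algorithm (expand the frontier level by level) is fixed; the
    `direction` argument only changes how each level is traversed.
    """
    level = {source: 0}
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        next_frontier = set()
        if direction == "push":
            # Frontier vertices push along their outgoing edges.
            for u in frontier:
                for v in out_adj[u]:
                    if v not in level:
                        next_frontier.add(v)
        elif direction == "pull":
            # Unvisited vertices pull from their incoming edges.
            for v in in_adj:
                if v not in level and any(u in frontier for u in in_adj[v]):
                    next_frontier.add(v)
        else:
            raise ValueError(f"unknown direction: {direction}")
        for v in next_frontier:
            level[v] = depth
        frontier = next_frontier
    return level


# Hypothetical 5-vertex directed graph.
out_adj = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
in_adj = reverse(out_adj)

levels_push = bfs_levels(out_adj, in_adj, 0, direction="push")
levels_pull = bfs_levels(out_adj, in_adj, 0, direction="pull")
print(levels_push == levels_pull)
```

Which traversal is faster depends on the frontier size and the graph structure, which is the kind of choice a separate scheduling layer is meant to expose without touching the algorithm.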

Saman P. Amarasinghe is a Professor in the Department of Electrical Engineering and Computer Science at Massachusetts Institute of Technology and a member of its Computer Science and Artificial Intelligence Laboratory (CSAIL) where he leads the Commit compiler group. Under Saman's guidance, the Commit group developed the StreamIt, StreamJIT, PetaBricks, Halide, Simit, MILK, Cimple, TACO, GraphIt, Tiramisu, BioStream and Seq domain specific languages and compilers, DynamoRIO dynamic instrumentation system, Superword level parallelism for SIMD vectorization, Program Shepherding to protect programs against external attacks, the OpenTuner extendable autotuner, and the Kendo deterministic execution system. He was the co-leader of the Raw architecture project. Saman was the founder of Determina Corporation, and a co-founder of Lanka Internet Services Ltd., and Venti Technologies Corporation. Saman received his BS in Electrical Engineering and Computer Science from Cornell University in 1988, and his MSEE and Ph.D. from Stanford University in 1990 and 1997, respectively. He is an ACM Fellow.

Time	Event

8:00~8:45	Session 1
	Welcome message
	Algorithms and Applications

	Kronecker Graph Generation with Ground Truth for 4-Cycles and Dense Structure in Bipartite Graph [paper]
	  Trevor Steil (University of Minnesota), Scott McMillan (SEI, Carnegie Mellon University), Geoffrey Sanders (LLNL), Roger Pearce (LLNL), Benjamin Priest (LLNL)

	A scalable graph generation algorithm to sample over a given shell distribution [paper]
	  M. Yusuf Özkaya (Georgia Institute of Technology), Muhammed Fatih Balin (Georgia Institute of Technology), Ali Pinar (SNL), Ümit V. Çatalyürek (Georgia Institute of Technology)

	An incremental GraphBLAS solution for the 2018 TTC Social Media case study [paper]
	  Márton Elekes (Budapest University of Technology and Economics), Gábor Szárnyas (Budapest University of Technology and Economics)

	Linear Algebraic Louvain Method in Python [paper]
	  Tze Meng Low (Carnegie Mellon University), Daniele Spampinato (Carnegie Mellon University), Scott McMillan (SEI, Carnegie Mellon University), Michel Pelletier (FPX, LLC)

8:45~9:00	Break

9:00~9:45	Session 2
	Keynote - The GraphIt Universal Graph Framework: Achieving High-Performance across Algorithms, Graph Types, and Architectures
	  Saman Amarasinghe (Massachusetts Institute of Technology)
	APIs and Implementations

	Parallelizing Maximal Clique Enumeration on Modern Manycore Processors [paper]
	  Jovan Blanuša (IBM Research - Zürich, EPFL), Radu Stoica (IBM Research - Zürich), Paolo Ienne (EPFL), Kubilay Atasu (IBM Research - Zürich)

	A Roadmap for the GraphBLAS C++ API [paper]
	  Benjamin A. Brock (UC Berkeley), Aydın Buluç (LBNL), Timothy G. Mattson (Intel), Scott McMillan (SEI, Carnegie Mellon University), José E. Moreira (IBM)

	Considerations for a Distributed GraphBLAS API [paper]
	  Benjamin A. Brock (UC Berkeley), Aydın Buluç (LBNL), Timothy G. Mattson (Intel), Scott McMillan (SEI, Carnegie Mellon University), José E. Moreira (IBM), Roger Pearce (LLNL), Oguz Selvitopi (LBNL), Trevor Steil (University of Minnesota)

	75,000,000,000 Streaming Inserts/Second Using Hierarchical Hypersparse GraphBLAS Matrices [paper]
	  Jeremy Kepner (MIT Lincoln Laboratory)

Details and Dates

Due to the COVID-19 emergency, the physical IPDPS 2020 conference has been canceled. Papers and static presentations, including the GrAPL workshop material, will be made available to all IPDPS registrants by Friday, May 15th.

The GrAPL organizing committee has planned an exciting online program consisting of two live 45-minute Q&A sessions on May 18 (starting at 8:00 AM PDT, 3:00 PM UTC, 5:00 PM CEST) with the keynote speakers and the invited talks, according to the program schedule. The schedule contains videos of the keynote talks and 3-5 lightning talks in which authors of accepted papers pitch their work to the GrAPL community, so that attendees can read the papers and prepare questions for the online sessions.

Register for free at the IPDPS website to get instructions on how to access papers and static presentations for GrAPL: http://www.ipdps.org

To attend the Zoom sessions, we ask participants to register in advance at the following link: https://tinyurl.com/grapl2020. The organizing committee will then provide the link to the session.

Workshop Organizers:

General Co-Chairs:

Scott McMillan (CMU SEI)
Manoj Kumar (IBM)

Program Co-Chairs:

Danai Koutra (University of Michigan, Ann Arbor)
Mahantesh Halappanavar (PNNL)

GrAPL's Little Helpers:

Tim Mattson (Intel)
Antonino Tumeo (PNNL)

Technical Program Committee members (in addition to the chairs):

Nesreen K Ahmed, Intel Research and Intel AI, USA
Sasikanth Avancha, Intel Labs - Parallel Computing Lab, India
Aydın Buluç, Lawrence Berkeley National Lab, USA
Timothy A. Davis, University of Florida, USA
Jana Doppa, Washington State University, USA
John Gilbert, University of California at Santa Barbara, USA
Sergio Gómez, Universitat Rovira i Virgili, Catalonia
Will Hamilton, McGill University, Mila, Canada
Stratis Ioannidis, Northeastern University, Boston, USA
Bharat Kaul, Intel Labs - Parallel Computing Labs, India
Kamesh Madduri, The Pennsylvania State University, USA
Henning Meyerhenke, Humboldt University of Berlin, Germany
Indranil Roy, Natural Intelligence, USA
Robert Rallo, Pacific Northwest National Lab, USA
P. Sadayappan, University of Utah, USA
Yizhou Sun, University of California, Los Angeles, USA
Flavio Vella, Free University of Bozen, Italy

Steering committee:

David A. Bader (New Jersey Institute of Technology)
Aydın Buluç (LBNL)
John Feo (PNNL)
John Gilbert (UC Santa Barbara)
Tim Mattson (Intel)
Ananth Kalyanaraman (Washington State University)
Jeremy Kepner (MIT Lincoln Laboratory)
Antonino Tumeo (PNNL)
