GraphLab: A New Framework For Parallel Machine Learning

Yucheng Low, Joseph E. Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein

Published 2010 in Conference on Uncertainty in Artificial Intelligence

ABSTRACT

Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and Pthreads leave ML experts repeatedly solving the same design challenges. By targeting common patterns in ML, we developed GraphLab, which improves upon abstractions like MapReduce by compactly expressing asynchronous iterative algorithms with sparse computational dependencies while ensuring data consistency and achieving a high degree of parallel performance. We demonstrate the expressiveness of the GraphLab framework by designing and implementing parallel versions of belief propagation, Gibbs sampling, Co-EM, Lasso and Compressed Sensing. We show that using GraphLab we can achieve excellent parallel performance on large scale real-world problems.
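The pattern the abstract describes, asynchronous iterative updates driven by sparse dependencies, can be illustrated with a minimal sketch. This is not the GraphLab API: the names `run_async_updates` and `pagerank_update`, the tolerance, and the toy graph are all hypothetical, and the loop runs sequentially, so it does not model parallel execution or the data-consistency guarantees the paper addresses. It only shows how a vertex update reads its neighbors' latest values and reschedules those neighbors when its own value changes.

```python
from collections import deque


def run_async_updates(neighbors, values, update, tolerance=1e-6, max_updates=100_000):
    """Apply `update` to scheduled vertices until no value moves by more than `tolerance`.

    neighbors: dict mapping each vertex to a list of adjacent vertices
    values:    dict mapping each vertex to its current value (modified in place)
    update:    function(vertex, values, neighbors) -> new value for that vertex
    Returns the number of vertex updates performed.
    """
    queue = deque(neighbors)      # initially every vertex is scheduled
    scheduled = set(neighbors)    # prevents queueing the same vertex twice
    updates = 0
    while queue and updates < max_updates:
        v = queue.popleft()
        scheduled.discard(v)
        new_value = update(v, values, neighbors)
        changed = abs(new_value - values[v]) > tolerance
        values[v] = new_value     # visible immediately to later updates (asynchronous style)
        updates += 1
        if changed:
            # Only vertices that depend on v need to be recomputed.
            for u in neighbors[v]:
                if u not in scheduled:
                    scheduled.add(u)
                    queue.append(u)
    return updates


def pagerank_update(v, values, neighbors, damping=0.85):
    """PageRank-style update on an undirected graph (illustrative choice of algorithm)."""
    n = len(neighbors)
    incoming = sum(values[u] / len(neighbors[u]) for u in neighbors[v])
    return (1 - damping) / n + damping * incoming


# Toy undirected path graph: 0 - 1 - 2 - 3 (hypothetical data).
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ranks = {v: 1.0 / len(graph) for v in graph}
num_updates = run_async_updates(graph, ranks, pagerank_update)
```

In this sketch the work queue plays the role of a dynamic scheduler: vertices whose inputs have stopped changing are never revisited, which is the kind of saving that sparse dependencies make possible.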

PUBLICATION RECORD

Publication year
2010
Venue
Conference on Uncertainty in Artificial Intelligence
Publication date
2010-06-25
Fields of study
Mathematics, Computer Science
Identifiers
arXiv 1408.2041
Source metadata
Semantic Scholar

REFERENCES

Gaussian belief propagation: theory and application (פעפוע אמונות גאוסייני.)
2009 (cited by this paper)
Stochastic gradient boosted distributed decision trees
2009 (cited by this paper)
Predicting Risk from Financial Reports with Regression
2009 (cited by this paper)
Residual Splash for Optimally Parallelizing Belief Propagation
2009 (cited by this paper)
PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce
2009 (cited by this paper)
Distributed Parallel Inference on Large Factor Graphs
2009 (cited by this paper)
Fully distributed EM for very large datasets
2008 (cited by this paper)
MapReduce: simplified data processing on large clusters
2008 (cited by this paper)
Pig latin: a not-so-foreign language for data processing
2008 (cited by this paper)
Dryad: distributed data-parallel programs from sequential building blocks
2007 (cited by this paper)
An Interior-Point Method for Large-Scale $\ell_1$-Regularized Least Squares
2007 (cited by this paper)
Residual Belief Propagation: Informed Scheduling for Asynchronous Message Passing
2006 (cited by this paper)
Map-Reduce for Machine Learning on Multicore
2006 (cited by this paper)
Learning to Extract Entities from Labeled and Unlabeled Text
2005 (cited by this paper)
Analyzing the effectiveness and applicability of co-training
2000 (influential reference)
Penalized Regressions: The Bridge versus the Lasso
1998 (cited by this paper)
Partial Solutions Manual Parallel and Distributed Computation: Numerical Methods
1997 (cited by this paper)
Regression Shrinkage and Selection via the Lasso
1996 (cited by this paper)
Probabilistic reasoning in intelligent systems: Networks of plausible inference
1991 (cited by this paper)
Parallel and distributed computation
1989 (cited by this paper)
Bounds for certain multiprocessing anomalies
1966 (cited by this paper)

CITED BY

CGA: Accelerating BFS Through an Sparsity-Aware Adaptive Framework on Heterogeneous Platforms
2026 (cites this paper)
Metrics for spin-based computing
2025 (cites this paper)
Ripple: Asynchronous Programming for Spatial Dataflow Architectures
2025 (cites this paper)
Scalable Diversity-Aware Feature Scoring for Biomedical Big Data via Hypercube-Based Density Estimation
2025 (cites this paper)
Taijigraph: an Out-Of-Core Graph Processing System Enhanced with Computational Storage
2025 (cites this paper)
LSM-Community: A Graph Storage System Exploiting Community Structure in Social Networks
2025 (cites this paper)
Near optimal edge partitioning via intersecting families
2025 (cites this paper)
Scaling down to scale up: benchmarking single-machine graph processing frameworks in a hardware-constrained environment
2025 (cites this paper)
ACGraph: An Efficient Asynchronous Out-of-Core Graph Processing Framework
2025 (cites this paper)
PANNS: Enhancing Graph-based Approximate Nearest Neighbor Search through Recency-aware Construction and Parameterized Search
2025 (cites this paper)
Fast Adaptive Approximate Nearest Neighbor Search with Cluster-Shaped Indices
2025 (influential citation)
Distributed and Adaptive Partitioning for Large Graphs in Geo-Distributed Data Centers
2025 (cites this paper)
NOVA: A Novel Vertex Management Architecture for Scalable Graph Processing
2025 (cites this paper)
High-Performance Graph Storage and Mutation for Graph Processing and Streaming
2025 (cites this paper)
Graph Crypto-Stego System for Securing Graph Data Using Association Schemes
2024 (cites this paper)
Secure shortest distance queries over encrypted graph in cloud computing
2024 (cites this paper)
BCSR on GPU: A Way Forward Extreme-scale Graph Processing on Accelerator-enabled Frontier Supercomputer
2024 (cites this paper)
LocalTGEP: A Lightweight Edge Partitioner for Time-Varying Graph
2024 (cites this paper)
Swift: A Multi-FPGA Framework for Scaling Up Accelerated Graph Analytics
2024 (cites this paper)
An effective algorithm for genealogical graph partitioning
2024 (cites this paper)
Optimising Queries for Pattern Detection Over Large Scale Temporally Evolving Graphs
2024 (cites this paper)
Comparative Study of Large Graph Processing Systems
2024 (cites this paper)
VPCS: Verifiable Query Scheme for Privacy-preserving Constrained Shortest Path over Encrypted Graph Data
2024 (cites this paper)
The Selection Problem in Multi-Query Optimization: a Comprehensive Survey
2024 (cites this paper)
Dynamic-ACTS - A Dynamic Graph Analytics Accelerator For HBM-Enabled FPGAs
2024 (cites this paper)
TGLite: A Lightweight Programming Framework for Continuous-Time Temporal Graph Neural Networks
2024 (cites this paper)
Parallel optimization and application of unstructured sparse triangular solver on new generation of Sunway architecture
2024 (cites this paper)
Graph Computation with Adaptive Granularity
2024 (influential citation)
Core Graph: Exploiting Edge Centrality to Speedup the Evaluation of Iterative Graph Queries
2024 (cites this paper)
A Configurable Framework for High-Performance Graph Storage and Mutation
2024 (cites this paper)
Chasing Parallelism in Aggregating Graph Queries
2024 (cites this paper)
Knowledge graph based reasoning in medical image analysis: A scoping review
2024 (cites this paper)
Efficient Edge Computing: A Survey of High-Throughput Concurrent Processing Strategies for Graph Data
2024 (cites this paper)
SharkGraph: A Time Series Distributed Graph System
2023 (cites this paper)
OneGraph: a cross-architecture framework for large-scale graph computing on GPUs based on oneAPI
2023 (cites this paper)
MEGA Evolving Graph Accelerator
2023 (cites this paper)
Flip: Data-centric Edge CGRA Accelerator
2023 (cites this paper)
ByteGAP: A Non-continuous Distributed Graph Computing System using Persistent Memory
2023 (cites this paper)
Lotan: Bridging the Gap between GNNs and Scalable Graph Analytics Engines
2023 (cites this paper)
A lightweight distributed processing computer system
2023 (cites this paper)
Balanced parallel triangle enumeration with an adaptive algorithm
2023 (cites this paper)
Modern Data Pricing Models: Taxonomy and Comprehensive Survey
2023 (cites this paper)
The Evolution of Distributed Systems for Graph Neural Networks and Their Origin in Graph Processing and Deep Learning: A Survey
2023 (cites this paper)
The Analysis of Distributed Computing Systems with Machine Learning
2023 (cites this paper)
iQAN: Fast and Accurate Vector Search with Efficient Intra-Query Parallelism on Multi-Core Architectures
2023 (cites this paper)
Distributed and deep vertical federated learning with big data
2023 (cites this paper)
Efficiently Exploiting Irregular Parallelism Using Keys at Scale
2023 (cites this paper)
Betty: Enabling Large-Scale GNN Training with Batch-Level Graph Partitioning
2023 (cites this paper)
ACTS: A Near-Memory FPGA Graph Processing Framework
2023 (cites this paper)
An unsupervised learning-guided multi-node failure-recovery model for distributed graph processing systems
2023 (influential citation)
Liberator: A Data Reuse Framework for Out-of-Memory Graph Computing on GPUs
2023 (influential citation)
Real Time Data Processing and Predictive Analytics Using Cloud Based Machine Learning
2023 (cites this paper)
Research on communication mechanism optimization based on distributed graph computing environment
2023 (cites this paper)
Efficient Multi-GPU Graph Processing with Remote Work Stealing
2023 (cites this paper)
CommonGraph: Graph Analytics on Evolving Data
2023 (cites this paper)
Parallel and Distributed Machine Learning Techniques for Anomaly Detection Systems
2023 (cites this paper)
Saba: Rethinking Datacenter Network Allocation from Application's Perspective
2023 (cites this paper)
Using Local Cache Coherence for Disaggregated Memory Systems
2023 (cites this paper)
Multi-Task Processing in Vertex-Centric Graph Systems: Evaluations and Insights
2023 (influential citation)
PieRank: Embedded Large-Scale Sparse Matrix Processing
2023 (cites this paper)
Detection of fickle trolls in large-scale online social networks
2022 (cites this paper)
A Machine Learning Framework for Predicting Sports Results Based on Multi-Frame Mining
2022 (cites this paper)
An Improved/Optimized Practical Non-Blocking PageRank Algorithm for Massive Graphs
2022 (cites this paper)
Speed-ANN: Low-Latency and High-Accuracy Nearest Neighbor Search via Intra-Query Parallelism
2022 (cites this paper)
FUSED-PAGERANK: Loop-Fusion based Approximate PageRank
2022 (cites this paper)
Programming big data analysis: principles and solutions
2022 (influential citation)
Phases, Modalities, Spatial and Temporal Locality: Domain Specific ML Prefetcher for Accelerating Graph Analytics
2022 (cites this paper)
MeNDA: a near-memory multi-way merge solution for sparse transposition and dataflows
2022 (cites this paper)
WawPart: Workload-Aware Partitioning of Knowledge Graphs
2022 (cites this paper)
A parallel compression framework for fractal images using DCT block classification
2022 (cites this paper)
Carbink: Fault-Tolerant Far Memory
2022 (cites this paper)
UniCon: A unified star-operation to efficiently find connected components on a cluster of commodity hardware
2022 (cites this paper)
Application of Matrix Algorithm Based on Graph Theory in Real-time Fault Diagnosis Knowledge Perfection Detection of Spacecraft Telemetry Data
2022 (cites this paper)
A Model for Scalable and Balanced Accelerators for Graph Processing
2022 (cites this paper)
Prediction and Management of Regional Economic Scale Based on Machine Learning Model
2022 (cites this paper)
Machine Learning-based Selection of Graph Partitioning Strategy Using the Characteristics of Graph Data and Algorithm
2022 (cites this paper)
The Development and Application of Artificial Intelligence Chips
2022 (cites this paper)
Adaptive Partitioning for Large-Scale Graph Analytics in Geo-Distributed Data Centers
2022 (cites this paper)
A survey on machine learning in array databases
2022 (cites this paper)
A survey of continuous subgraph matching for dynamic graphs
2022 (cites this paper)
TileSpMSpV: A Tiled Algorithm for Sparse Matrix-Sparse Vector Multiplication on GPUs
2022 (cites this paper)
Correct Compilation of Semiring Contractions
2022 (cites this paper)
A Fast Data Structure for Dynamic Graphs Based on Hash-Indexed Adjacency Blocks
2022 (cites this paper)
gSuite: A Flexible and Framework Independent Benchmark Suite for Graph Neural Network Inference on GPUs
2022 (cites this paper)
Streaming Sparse Graphs using Efficient Dynamic Sets
2021 (cites this paper)
Algorithm analysis based on machine learning in alarm information of metering system
2021 (cites this paper)
A Hybrid Synchronization Mechanism for Parallel Sparse Triangular Solve
2021 (cites this paper)
swMR: A Framework for Accelerating MapReduce Applications on Sunway Taihulight
2021 (cites this paper)
Magas: matrix-based asynchronous graph analytics on shared memory systems
2021 (influential citation)
A Novel Map Reduced Based Parallel Feature Selection and Extreme Learning for Micro Array Cancer Data Classification
2021 (cites this paper)
Local Graph Edge Partitioning
2021 (cites this paper)
Bibliometrics of Machine Learning Research Using Homomorphic Encryption
2021 (cites this paper)
Natural Language Processing Techniques to Reveal Human-Computer Interaction for Development Research Topics
2021 (cites this paper)
Privacy and efficiency guaranteed social subgraph matching
2021 (cites this paper)
GRAPE for fast and scalable graph processing and random-walk-based embedding
2021 (cites this paper)
LightFed: An Efficient and Secure Federated Edge Learning System on Model Splitting
2021 (cites this paper)
Big Data Analysis in Bioinformatics
2021 (cites this paper)
Unsupervised MKL in Multi-layer Kernel Machines
2021 (cites this paper)
An Improved and Optimized Practical Non-Blocking PageRank Algorithm for Massive Graphs
2021 (cites this paper)
JetStream: Graph Analytics on Streaming Data with Event-Driven Hardware Accelerator
2021 (cites this paper)