Scalable Single-Source SimRank Computation for Large Graphs

Xingkun Gao, Nianyuan Bao, Jie Liu, Jie Tang, Gangshan Wu

Published 2016 in International Conference on Parallel and Distributed Systems

ABSTRACT

SimRank is an effective measure of similarity between vertices in a graph and has become a fundamental technique in graph analytics. Despite its popularity, computing SimRank is often costly in both space and time, especially given the ever-growing scale of graph data. In this paper, we focus on Single-Source SimRank computation: given a query vertex, return the similarity between this vertex and every other vertex in the graph. Traditional centralized SimRank algorithms are not efficient for this problem. To fully utilize the computing power of modern distributed systems, we propose sssSimRank, an efficient distributed algorithm based on the random walk model. Our algorithm achieves scalability by minimizing the total number, the space cost, and the matching time of random walks. We implement our approach on the popular distributed processing platform Spark. Experimental results demonstrate the effectiveness, efficiency, and scalability of our method.
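
The random walk model mentioned in the abstract can be illustrated with a minimal single-machine sketch. This is not sssSimRank and not the authors' Spark implementation; it is the generic Monte Carlo view of SimRank, in which the similarity of two vertices is estimated as the average of c^t, where t is the step at which two reverse random walks (one from each vertex) first meet. The function names, the `in_nbrs` adjacency format, and the parameters `c`, `walks`, and `length` are all assumptions made for illustration. Truncating walks at `length` steps and reusing one source walk per sample are simplifications.

```python
import random


def reverse_walk(in_nbrs, start, length, rng):
    """Walk backwards along in-edges; stop early at a vertex with no in-neighbours."""
    path = [start]
    for _ in range(length):
        preds = in_nbrs.get(path[-1], [])
        if not preds:
            break
        path.append(rng.choice(preds))
    return path


def single_source_simrank(in_nbrs, source, c=0.6, walks=2000, length=10, seed=0):
    """Estimate SimRank between `source` and every other vertex (illustrative only).

    in_nbrs maps each vertex to the list of its in-neighbours (hypothetical format).
    """
    rng = random.Random(seed)
    vertices = set(in_nbrs) | {p for preds in in_nbrs.values() for p in preds}
    totals = {v: 0.0 for v in vertices}
    for _ in range(walks):
        # One walk from the query vertex is matched against a walk from each vertex.
        src_path = reverse_walk(in_nbrs, source, length, rng)
        for v in vertices:
            if v == source:
                continue
            path = reverse_walk(in_nbrs, v, length, rng)
            for step, (a, b) in enumerate(zip(src_path, path)):
                if a == b:  # first meeting of the two walks
                    totals[v] += c ** step
                    break
    scores = {v: total / walks for v, total in totals.items()}
    scores[source] = 1.0  # a vertex is maximally similar to itself
    return scores
```

For example, on a graph where vertices `a` and `b` each have the single in-neighbour `c`, both walks reach `c` after one step, so the estimate for the pair (`a`, `b`) equals the decay factor `c`. The cost of this naive version grows with the number of walks times the number of vertices, which is the kind of overhead the abstract says the proposed algorithm aims to reduce.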

PUBLICATION RECORD

Publication year: 2016
Venue: International Conference on Parallel and Distributed Systems
Publication date: 2016-12-01
Fields of study: Computer Science
Identifiers: DOI 10.1109/ICPADS.2016.0143
Source metadata: Semantic Scholar

REFERENCES

Scalable similarity search for SimRank (2014; cited by this paper)
Assessing single-pair similarity over graphs by aggregating first-meeting probabilities (2014; cited by this paper)
Towards efficient SimRank computation on large networks (2013; cited by this paper)
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing (2012; influential reference)
On Top-k Structural Similarity Search (2012; cited by this paper)
Delta-SimRank computing on MapReduce (2012; cited by this paper)
Fast Single-Pair SimRank Computation (2010; influential reference)
A space and time efficient algorithm for SimRank computation (2010; cited by this paper)
Parallel SimRank computation on large graphs with iterative aggregation (2010; cited by this paper)
Fast computation of SimRank for static and dynamic information networks (2010; cited by this paper)
Accuracy estimate and optimization techniques for SimRank computation (2008; influential reference)
Towards a Theory of Scale-Free Graphs: Definition, Properties, and Implications (2005; cited by this paper)
Scaling link-based similarity search (2005; cited by this paper)
Entity Resolution in Graphs (2005; cited by this paper)
SimRank: a measure of structural-context similarity (2002; influential reference)
The PageRank Citation Ranking: Bringing Order to the Web (1999; cited by this paper)
Finding Related Pages in the World Wide Web (1999; cited by this paper)
Measures of the Amount of Ecologic Association Between Species (1945; cited by this paper)
Etude comparative de la distribution florale dans une portion des Alpes et des Jura (year unknown; cited by this paper)

CITED BY

IbLT: An effective granular computing framework for hierarchical community detection (2021; cites this paper)