Boosted seed oversampling for local community ranking

Emmanouil Krasanakis, Emmanouil Schinas, S. Papadopoulos, Y. Kompatsiaris, A. Symeonidis

Published 2020 in Information Processing & Management

ABSTRACT

Local community detection is an emerging topic in network analysis that aims to detect well-connected communities encompassing sets of previously known seed nodes. In this work, we explore the similar problem of ranking network nodes based on their relevance to the communities characterized by seed nodes. However, seed nodes may not be central enough or sufficiently many to produce high quality ranks. To solve this problem, we introduce a methodology we call seed oversampling, which first runs a node ranking algorithm to discover more nodes that belong to the community and then reruns the same ranking algorithm for the new seed nodes. We formally discuss why this process improves the quality of calculated community ranks if the original set of seed nodes is small and introduce a boosting scheme that iteratively repeats seed oversampling to further improve rank quality when certain ranking algorithm properties are met. Finally, we demonstrate the effectiveness of our methods in improving community relevance ranks given only a few random seed nodes of real-world network communities. In our experiments, boosted and simple seed oversampling yielded better rank quality than the previous neighborhood inflation heuristic, which adds the neighborhoods of original seed nodes to seeds.
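The two-pass idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of personalized PageRank as the ranking algorithm, the rule of promoting every node that ranks at least as high as the lowest-ranked original seed, the toy graph, and all function names (`build_graph`, `personalized_pagerank`, `seed_oversampling`, `neighborhood_inflation`) are assumptions made for illustration. The neighborhood inflation baseline mentioned in the abstract is included for comparison.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency list from an edge list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, []).append(u)
    return graph


def personalized_pagerank(graph, seeds, alpha=0.85, iterations=100):
    """Power iteration for personalized PageRank that restarts at the seeds.

    Assumes every node has at least one neighbor (no dangling nodes).
    """
    seeds = set(seeds)
    restart = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in graph}
    ranks = dict(restart)
    for _ in range(iterations):
        nxt = {v: (1.0 - alpha) * restart[v] for v in graph}
        for v, neighbors in graph.items():
            share = alpha * ranks[v] / len(neighbors)
            for u in neighbors:
                nxt[u] += share
        ranks = nxt
    return ranks


def seed_oversampling(graph, seeds, rank=personalized_pagerank):
    """Rank once, promote well-ranked nodes to seeds, then rank again.

    Assumed promotion rule: keep every node whose first-pass rank is at
    least that of the lowest-ranked original seed.
    """
    seeds = set(seeds)
    first_pass = rank(graph, seeds)
    threshold = min(first_pass[s] for s in seeds)
    expanded = {v for v, r in first_pass.items() if r >= threshold}
    return expanded, rank(graph, expanded)


def neighborhood_inflation(graph, seeds, rank=personalized_pagerank):
    """Baseline: add every neighbor of a seed to the seed set, then rank."""
    seeds = set(seeds)
    inflated = seeds | {u for s in seeds for u in graph[s]}
    return inflated, rank(graph, inflated)


# Hypothetical toy network: two 4-cliques joined by one edge, plus a leaf.
edges = list(combinations(range(0, 4), 2))   # community A: nodes 0..3
edges += list(combinations(range(4, 8), 2))  # community B: nodes 4..7
edges += [(3, 4), (0, 8)]                    # bridge and a leaf on node 0
graph = build_graph(edges)

seeds = {1, 8}  # few seeds, one of them peripheral
expanded, oversampled_ranks = seed_oversampling(graph, seeds)
inflated, inflated_ranks = neighborhood_inflation(graph, seeds)

print("oversampled seeds:", sorted(expanded))
print("inflated seeds:   ", sorted(inflated))
```

The promotion rule is the main design choice in this sketch: it never drops an original seed, and it only adds nodes that the first pass already considers as relevant as some seed. The boosting scheme mentioned in the abstract repeats this step iteratively; it is not sketched here because the abstract does not specify how the repeated ranks are combined.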

PUBLICATION RECORD

Publication year
2020
Venue
Information Processing & Management
Publication date
2020-03-01
Fields of study
Computer Science
Identifiers
DOI 10.1016/J.IPM.2019.06.002
Source metadata
Semantic Scholar

REFERENCES

Mean Field Analysis of Personalized PageRank with Implications for Local Graph Clustering
2018, cited by this paper
Inductive Representation Learning on Large Graphs
2017, cited by this paper
Local Higher-Order Graph Clustering
2017, cited by this paper
Random Walk with Restart on Large Graphs Using Block Elimination
2016, cited by this paper
Community detection in networks: A user guide
2016, cited by this paper
Block models and personalized PageRank
2016, cited by this paper
The ground truth about metadata and community detection in networks
2016, cited by this paper
Personalized PageRank Estimation and Search: A Bidirectional Approach
2015, cited by this paper
One-Class Classification with Extreme Learning Machine
2015, cited by this paper
BEAR: Block Elimination Approach for Random Walk with Restart on Large Graphs
2015, cited by this paper
Robust Local Community Detection: On Free Rider Effect and Its Elimination
2015, cited by this paper
Neighbourhood sampling in bagging for imbalanced data
2015, cited by this paper
FAST-PPR: scaling personalized pagerank estimation for large graphs
2014, cited by this paper
Learning Deep Representations for Graph Clustering
2014, cited by this paper
Cost-sensitive decision tree ensembles for effective imbalanced classification
2014, cited by this paper
Heat kernel based community detection
2014, cited by this paper
A Theoretical Analysis of NDCG Type Ranking Measures
2013, cited by this paper
A Theoretical Analysis of NDCG Ranking Measures
2013, cited by this paper
Overlapping community detection using seed set expansion
2013, cited by this paper
GMAC: A Seed-Insensitive Approach to Local Community Detection
2013, cited by this paper
Local Community Detection Using Link Similarity
2012, cited by this paper
Community-Affiliation Graph Model for Overlapping Network Community Detection
2012, cited by this paper
Defining and evaluating network communities based on ground-truth
2012, cited by this paper
A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches
2012, cited by this paper
Vertex neighborhoods, low conductance cuts, and good seeds for local community methods
2012, cited by this paper
A Method for Local Community Detection by Finding Core Nodes
2012, cited by this paper
Community detection in Social Media
2012, cited by this paper
Overlapping community detection in networks: The state-of-the-art and comparative study
2011, cited by this paper
Learning Overlapping Communities in Complex Networks via Non-Negative Matrix Factorization
2011, cited by this paper
Partitioning Breaks Communities
2011, cited by this paper
A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms
2011, cited by this paper
Fast Incremental and Personalized PageRank
2010, cited by this paper
RUSBoost: A Hybrid Approach to Alleviating Class Imbalance
2010, cited by this paper
Empirical comparison of algorithms for network community detection
2010, cited by this paper
Semi-Supervised Classification of Network Data Using Very Few Labels
2010, cited by this paper
A Local Graph Partitioning Algorithm Using Heat Kernel Pagerank
2009, cited by this paper
Community detection in graphs
2009, cited by this paper
Learning to Rank by Optimizing NDCG Measure
2009, cited by this paper
ArnetMiner: extraction and mining of academic social networks
2008, cited by this paper
Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters
2008, cited by this paper
One-Class Collaborative Filtering
2008, cited by this paper
An Extension on "Statistical Comparisons of Classifiers over Multiple Data Sets" for all Pairwise Comparisons
2008, cited by this paper
The heat kernel as the pagerank of a graph
2007, cited by this paper
Measurement and analysis of online social networks
2007, cited by this paper
Near linear time algorithm to detect community structures in large-scale networks
2007, cited by this paper
Fast Random Walk with Restart and Its Applications
2006, cited by this paper
Statistical mechanics of community detection
2006, cited by this paper
Modularity and community structure in networks
2006, cited by this paper
Statistical Comparisons of Classifiers over Multiple Data Sets
2006, cited by this paper
Local Graph Partitioning using PageRank Vectors
2006, cited by this paper
The dynamics of viral marketing
2005, cited by this paper
Uncovering the overlapping community structure of complex networks in nature and society
2005, cited by this paper
Boosting Weak Ranking Functions to Enhance Passage Retrieval for Question Answering
2004, cited by this paper
Automatic multimedia cross-modal correlation discovery
2004, cited by this paper
Mining with rarity: a unifying framework
2004, cited by this paper
Finding communities in linear time: a physics approach
2003, cited by this paper
Learning with Local and Global Consistency
2003, cited by this paper
AUC: a Statistically Consistent and more Discriminating Measure than Accuracy
2003, cited by this paper
One-Class SVMs for Document Classification
2002, cited by this paper
IR evaluation methods for retrieving highly relevant documents
2000, cited by this paper
An Efficient Boosting Algorithm for Combining Preferences
1998, cited by this paper
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997, cited by this paper
The meaning and use of the area under a receiver operating characteristic (ROC) curve
1982, cited by this paper
Overlapping Community Detection Using Neighborhood-Inflated Seed Expansion (IEEE Transactions on Knowledge and Data Engineering)
year unknown, cited by this paper
Ranking, Boosting, and Model Adaptation
year unknown, cited by this paper
Resampling-Based Ensemble Methods for Online Class Imbalance Learning (IEEE Transactions on Knowledge and Data Engineering)
year unknown, cited by this paper

CITED BY

A depth-first search approach to detect the community structure of weighted networks using the neighbourhood proximity measure
2024, cites this paper
Prior Signal Editing for Graph Filter Posterior Fairness Constraints
2021, influential citation
pygrank: A Python Package for Graph Node Ranking
2021, cites this paper
Foreword to the special issue on mining actionable insights from social networks
2020, cites this paper
Applying Fairness Constraints on Graph Node Ranks Under Personalization Bias
2020, cites this paper
Unsupervised evaluation of multiple node ranks by reconstructing local structures
2020, cites this paper
Stopping Personalized PageRank without an Error Tolerance Parameter
2020, cites this paper
LinkAUC: Unsupervised Evaluation of Multiple Network Node Ranks Using Link Prediction
2019, cites this paper