Optimal Bipartite Network Clustering

Published 2018 in Journal of Machine Learning Research

ABSTRACT

We consider the problem of bipartite community detection in networks, or more generally the network biclustering problem. We present a fast two-stage procedure based on spectral initialization followed by the application of a pseudo-likelihood classifier twice. Under mild regularity conditions, we establish the weak consistency of the procedure (i.e., the convergence of the misclassification rate to zero) under a general bipartite stochastic block model. We show that the procedure is optimal in the sense that it achieves the optimal convergence rate that is achievable by a biclustering oracle, adaptively over the whole class, up to constants. The optimal rate we obtain sharpens some of the existing results and generalizes others to a wide regime of average degree growth. As a special case, we recover the known exact recovery threshold in the $\log n$ regime of sparsity. To obtain the general consistency result, as part of the provable version of the algorithm, we introduce a sub-block partitioning scheme that is also computationally attractive, allowing for distributed implementation of the algorithm without sacrificing optimality. The provable version of the algorithm is derived from a general blueprint for pseudo-likelihood biclustering algorithms that employ simple EM type updates. We show the effectiveness of this general class by numerical simulations.
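The two-stage idea in the abstract (spectral initialization, then pseudo-likelihood refinement) can be illustrated with a minimal sketch. This is not the authors' provable algorithm: it omits the sub-block partitioning scheme, assumes the numbers of row and column clusters are known, uses a plain SVD with a hand-written k-means for the initialization, and treats block sums as Poisson with a uniform prior over labels. All function names (`sample_bipartite_sbm`, `spectral_init`, `pl_update_rows`, `misclassification_rate`) and all parameter values are hypothetical choices made for illustration.

```python
from itertools import permutations

import numpy as np


def sample_bipartite_sbm(row_labels, col_labels, P, rng):
    """Draw a biadjacency matrix with A[i, j] ~ Bernoulli(P[z_i, w_j])."""
    probs = P[np.ix_(row_labels, col_labels)]
    return (rng.random(probs.shape) < probs).astype(float)


def kmeans(X, k, rng, n_iter=50):
    """Lloyd iterations with farthest-point initialization (illustrative only)."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):  # keep the old center if a cluster empties
                centers[c] = X[labels == c].mean(axis=0)
    return labels


def spectral_init(A, k_rows, k_cols, rng):
    """Cluster rows and columns using the leading singular vectors of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = min(k_rows, k_cols)
    rows = kmeans(U[:, :r] * s[:r], k_rows, rng)
    cols = kmeans(Vt[:r].T * s[:r], k_cols, rng)
    return rows, cols


def pl_update_rows(A, row_labels, col_labels, k_rows, k_cols, eps=1e-10):
    """Relabel rows given column labels, treating block sums as Poisson."""
    Z_col = np.eye(k_cols)[col_labels]   # one-hot column labels, (m, k_cols)
    B = A @ Z_col                        # B[i, l]: edges from row i into column cluster l
    Z_row = np.eye(k_rows)[row_labels]   # one-hot row labels, (n, k_rows)
    counts = np.maximum(Z_row.sum(axis=0), 1.0)
    lam = (Z_row.T @ B) / counts[:, None] + eps  # estimated Poisson rates lam[k, l]
    # Poisson log-likelihood of each row under each label, dropping terms constant in k.
    loglik = B @ np.log(lam).T - lam.sum(axis=1)[None, :]
    return loglik.argmax(axis=1)


def misclassification_rate(true, est, k):
    """Fraction of mislabeled nodes, minimized over label permutations."""
    return min(np.mean(np.array(perm)[est] != true) for perm in permutations(range(k)))


rng = np.random.default_rng(0)
z = np.repeat([0, 1], 100)               # true row labels (assumed sizes)
w = np.repeat([0, 1], 75)                # true column labels (assumed sizes)
P = np.array([[0.5, 0.1], [0.1, 0.5]])   # assumed connectivity matrix
A = sample_bipartite_sbm(z, w, P, rng)

rows, cols = spectral_init(A, 2, 2, rng)
for _ in range(2):                       # two refinement rounds, echoing "twice" in the abstract
    rows = pl_update_rows(A, rows, cols, 2, 2)
    cols = pl_update_rows(A.T, cols, rows, 2, 2)

row_err = misclassification_rate(z, rows, 2)
col_err = misclassification_rate(w, cols, 2)
print(row_err, col_err)
```

The column update reuses the row update on the transposed matrix, which keeps the sketch short. The connectivity values are deliberately well separated, so this example says nothing about the sparse regimes the paper analyzes.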

PUBLICATION RECORD

Publication year
2018
Venue
Journal of Machine Learning Research
Publication date
2018-03-15
Fields of study
Mathematics, Computer Science
Identifiers
arXiv 1803.06031
Source metadata
Semantic Scholar

REFERENCES

Analysis of spectral clustering algorithms for community detection: the general bipartite setting
2018 (influential reference)
Community Detection in Hypergraphs: Optimal Statistical Limit and Efficient Algorithms
2018 (cited by this paper)
Non-Asymptotic Chernoff Lower Bound and Its Application to Community Detection in Stochastic Block Model
2018 (cited by this paper)
Spectral clustering in the dynamic stochastic block model
2017 (cited by this paper)
Community detection and stochastic block models: recent developments
2017 (influential reference)
Optimal rates for community estimation in the weighted stochastic block model
2017 (cited by this paper)
Matched bipartite block model with covariates
2017 (cited by this paper)
Theoretical and Computational Guarantees of Mean Field Variational Inference for Community Detection
2017 (cited by this paper)
Community Detection in Degree-Corrected Block Models
2016 (cited by this paper)
Performance of a community detection algorithm based on semidefinite programming
2016 (cited by this paper)
Community detection in networks: A user guide
2016 (cited by this paper)
Optimal Estimation and Completion of Matrices with Biclustering Structures
2015 (influential reference)
Achieving Optimal Misclassification Proportion in Stochastic Block Models
2015 (influential reference)
Minimax Rates of Community Detection in Stochastic Block Models
2015 (cited by this paper)
Multisection in the Stochastic Block Model using Semidefinite Programming
2015 (cited by this paper)
A spectral method for community detection in moderately sparse degree-corrected stochastic block models
2015 (cited by this paper)
Community Detection in General Stochastic Block models: Fundamental Limits and Efficient Algorithms for Recovery
2015 (influential reference)
A semidefinite program for unbalanced multisection in the stochastic block model
2015 (cited by this paper)
Mathematical Foundations of Infinite-Dimensional Statistical Models
2015 (cited by this paper)
Random Laplacian Matrices and Convex Relaxations
2015 (cited by this paper)
Stochastic Block Model and Community Detection in Sparse Graphs: A spectral algorithm with optimal rate of recovery
2015 (cited by this paper)
Semidefinite programs on sparse random graphs and their application to community detection
2015 (cited by this paper)
A simple proof of Stirling’s formula for the gamma function
2015 (cited by this paper)
Achieving Exact Cluster Recovery Threshold via Semidefinite Programming: Extensions
2015 (cited by this paper)
Consistency Thresholds for the Planted Bisection Model
2014 (cited by this paper)
Non-backtracking Spectrum of Random Graphs: Community Detection and Non-regular Ramanujan Graphs
2014 (cited by this paper)
Achieving Exact Cluster Recovery Threshold via Semidefinite Programming
2014 (cited by this paper)
Inferring structure in bipartite networks using the latent blockmodel and exact ICL
2014 (cited by this paper)
Efficiently inferring community structure in bipartite networks
2014 (cited by this paper)
A Simple SVD Algorithm for Finding Hidden Partitions
2014 (cited by this paper)
Exact Recovery in the Stochastic Block Model
2014 (cited by this paper)
Community detection in sparse networks via Grothendieck’s inequality
2014 (cited by this paper)
Accurate Community Detection in the Stochastic Block Model via Spectral Algorithms
2014 (cited by this paper)
On semidefinite relaxations for the block model
2014 (cited by this paper)
Community detection thresholds and the weak Ramanujan property
2013 (cited by this paper)
Consistency of spectral clustering in stochastic block models
2013 (cited by this paper)
Consistency of Spectral Clustering in Sparse Stochastic Block Models
2013 (cited by this paper)
Spectral redemption in clustering sparse networks
2013 (cited by this paper)
Co-clustering for directed graphs: the Stochastic co-Blockmodel and spectral algorithm Di-Sim
2012 (cited by this paper)
Pseudo-likelihood methods for community detection in large sparse networks
2012 (influential reference)
Consistent Adjacency-Spectral Partitioning for the Stochastic Block Model When the Model Parameters Are Unknown
2012 (cited by this paper)
Asymptotic Normality of Maximum Likelihood and its Variational Approximation for Stochastic Blockmodels
2012 (cited by this paper)
Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications
2011 (cited by this paper)
Consistency of maximum-likelihood and variational estimators in the Stochastic Block Model
2011 (cited by this paper)
Consistency of community detection in networks under degree-corrected stochastic block models
2011 (cited by this paper)
Spectral clustering and the high-dimensional stochastic blockmodel
2010 (cited by this paper)
Identification of Regulatory Modules in Time Series Gene Expression Data Using a Linear Time Biclustering Algorithm
2010 (cited by this paper)
A nonparametric view of network models and Newman–Girvan and other modularities
2009 (cited by this paper)
Bipartite network projection and personal recommendation
2007 (cited by this paper)
Elements of Information Theory
2005 (cited by this paper)
Information-theoretic co-clustering
2003 (cited by this paper)
Discovering statistically significant biclusters in gene expression data
2002 (cited by this paper)
Co-clustering documents and words using bipartite spectral graph partitioning
2001 (cited by this paper)
Estimation and Prediction for Stochastic Blockstructures
2001 (cited by this paper)
Biclustering of Expression Data
2000 (cited by this paper)
Linear Assignment Problems and Extensions
1999 (cited by this paper)
The tail of the hypergeometric distribution
1979 (cited by this paper)
Direct Clustering of a Data Matrix
1972 (cited by this paper)
The Poisson Approximation to the Poisson Binomial Distribution
1960 (cited by this paper)
A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the sum of Observations
1952 (influential reference)

CITED BY

Optimizing Multilayer Networks Through Time-Dependent Decision-Making: A Comparative Study
2025 (cites this paper)
Using human mobility data to quantify experienced urban inequalities
2025 (cites this paper)
Ultra Energy-Efficient Butterfly Counting in Bipartite Networks via Algorithm-Architecture Co-Optimization
2025 (cites this paper)
Joint estimation of asymmetric community numbers in directed networks
2025 (cites this paper)
Model-Based Co-Clustering in Customer Targeting Utilizing Large-Scale Online Product Rating Networks
2024 (cites this paper)
Counting Butterflies over Streaming Bipartite Graphs with Duplicate Edges
2024 (cites this paper)
Statistical Guarantees for Consensus Clustering
2023 (cites this paper)
Graphon Estimation in bipartite graphs with observable edge labels and unobservable node labels
2023 (cites this paper)
Cohesive Subgraph Search Over Large Heterogeneous Information Networks
2022 (cites this paper)
Studying Asymmetric Structure in Directed Networks by Overlapping and Non-Overlapping Models
2022 (cites this paper)
Bipartite Mixed Membership Distribution-Free Model. A novel model for community detection in overlapping bipartite weighted networks
2022 (cites this paper)
Identifiability and parameter estimation of the overlapped stochastic co-block model
2022 (cites this paper)
Minimax optimal clustering of bipartite graphs with a generalized power method
2022 (cites this paper)
Estimating Graph Dimension with Cross-validated Eigenvalues
2021 (cites this paper)
Subspace Estimation from Unbalanced and Incomplete Data Matrices: $\ell_{2,\infty}$ Statistical Guarantees
2021 (cites this paper)
A simpler spectral approach for clustering in directed networks
2021 (cites this paper)
Sparse Partial Least Squares for Coarse Noisy Graph Alignment
2021 (cites this paper)
Boolean Matrix Factorization via Nonnegative Auxiliary Optimization
2021 (cites this paper)
Directed mixed membership stochastic blockmodel
2021 (cites this paper)
Rate-Optimal Subspace Estimation on Random Graphs
2021 (cites this paper)
Biclustering and Boolean Matrix Factorization in Data Streams
2020 (cites this paper)
Adjusted chi-square test for degree-corrected block models
2020 (cites this paper)
Improved Clustering Algorithms for the Bipartite Stochastic Block Model
2020 (cites this paper)
Recent Developments in Boolean Matrix Factorization
2020 (influential citation)
Rate optimal Chernoff bound and application to community detection in the stochastic block models
2020 (cites this paper)
Subspace Estimation from Unbalanced and Incomplete Data Matrices: $\ell_{2,\infty}$ Statistical Guarantees
2019 (cites this paper)
Subspace Estimation from Unbalanced and Incomplete Data Matrices: $\ell_{2,\infty}$ Statistical Guarantees
2019 (cites this paper)
Analysis of spectral clustering algorithms for community detection: the general bipartite setting
2018 (influential citation)
Non-Asymptotic Chernoff Lower Bound and Its Application to Community Detection in Stochastic Block Model
2018 (cites this paper)
Heteroskedastic PCA: Algorithm, optimality, and applications
2018 (cites this paper)
Matched bipartite block model with covariates
2017 (cites this paper)