Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression

Published 2012 in Symposium on the Theory of Computing

ABSTRACT

Low-distortion embeddings are critical building blocks for developing random sampling and random projection algorithms for common linear algebra problems. We show that, given a matrix $A \in \mathbb{R}^{n \times d}$ with $n \gg d$ and a $p \in [1, 2)$, with a constant probability, we can construct a low-distortion embedding matrix $\Pi \in \mathbb{R}^{O(\mathrm{poly}(d)) \times n}$ that embeds $\mathcal{A}_p$, the $\ell_p$ subspace spanned by $A$'s columns, into $(\mathbb{R}^{O(\mathrm{poly}(d))}, \|\cdot\|_p)$; the distortion of our embeddings is only $O(\mathrm{poly}(d))$, and we can compute $\Pi A$ in $O(\mathrm{nnz}(A))$ time, i.e., input-sparsity time. Our result generalizes the input-sparsity time $\ell_2$ subspace embedding by Clarkson and Woodruff [STOC'13]; and for completeness, we present a simpler and improved analysis of their construction for $\ell_2$. These input-sparsity time $\ell_p$ embeddings are optimal, up to constants, in terms of their running time; and the improved running time propagates to applications such as $(1 \pm \epsilon)$-distortion $\ell_p$ subspace embedding and relative-error $\ell_p$ regression. For $\ell_2$, we show that a $(1+\epsilon)$-approximate solution to the $\ell_2$ regression problem specified by the matrix $A$ and a vector $b \in \mathbb{R}^n$ can be computed in $O(\mathrm{nnz}(A) + d^3 \log(d/\epsilon)/\epsilon^2)$ time; and for $\ell_p$, via a subspace-preserving sampling procedure, we show that a $(1 \pm \epsilon)$-distortion embedding of $\mathcal{A}_p$ into $\mathbb{R}^{O(\mathrm{poly}(d))}$ can be computed in $O(\mathrm{nnz}(A) \cdot \log n)$ time, and we also show that a $(1+\epsilon)$-approximate solution to the $\ell_p$ regression problem $\min_{x \in \mathbb{R}^d} \|Ax - b\|_p$ can be computed in $O(\mathrm{nnz}(A) \cdot \log n + \mathrm{poly}(d) \log(1/\epsilon)/\epsilon^2)$ time. Moreover, we can also improve the embedding dimension or equivalently the sample size to $O(d^{3+p/2} \log(1/\epsilon)/\epsilon^2)$ without increasing the complexity.
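
The abstract does not spell out the embedding construction, so the following is only a minimal sketch of the idea behind "input-sparsity time" for the l2 case. It assumes the standard CountSketch-style sparse embedding (each row of A is hashed to one sketch row and multiplied by a random sign), which is the form of the Clarkson and Woodruff l2 embedding that the abstract refers to. The paper's lp construction for p in [1, 2) is different and is not shown here. The function names `countsketch` and `sq_norm_of_product`, the triple-based sparse format, and all parameter values are hypothetical choices made for illustration.

```python
import random


def countsketch(entries, n, d, s, seed=0):
    """Apply a CountSketch-style sparse embedding Pi (s x n) to a sparse n x d matrix A.

    entries: iterable of (row, col, value) triples listing the nonzeros of A.
    Returns Pi @ A as a dense s x d list of lists.
    """
    rng = random.Random(seed)
    # Each input row i is assigned one sketch row h[i] and one random sign.
    h = [rng.randrange(s) for _ in range(n)]
    sign = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    sketch = [[0.0] * d for _ in range(s)]
    # A single pass over the nonzeros: O(nnz(A)) additions after O(n) setup.
    for i, j, v in entries:
        sketch[h[i]][j] += sign[i] * v
    return sketch


def sq_norm_of_product(M, x):
    """Return the squared l2 norm of M @ x for a dense list-of-lists M."""
    return sum(sum(a * b for a, b in zip(row, x)) ** 2 for row in M)


if __name__ == "__main__":
    rng = random.Random(1)
    n, d, s = 2000, 3, 200  # illustrative sizes only, not the paper's bounds
    # A tall sparse matrix with one Gaussian nonzero per row.
    entries = [(i, rng.randrange(d), rng.gauss(0.0, 1.0)) for i in range(n)]
    A = [[0.0] * d for _ in range(n)]
    for i, j, v in entries:
        A[i][j] += v
    SA = countsketch(entries, n, d, s, seed=2)
    x = [1.0, -2.0, 0.5]
    # For a good embedding this ratio should be close to 1.
    print(sq_norm_of_product(SA, x) / sq_norm_of_product(A, x))
```

The loop touches each nonzero of A exactly once, which is where the O(nnz(A)) cost comes from; the sketch size `s` here is picked by hand and is not derived from the paper's embedding-dimension bounds.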

PUBLICATION RECORD

Publication year: 2012
Venue: Symposium on the Theory of Computing
Publication date: 2012-10-10
Fields of study: Mathematics, Physics, Computer Science
Identifiers: DOI 10.1145/2488608.2488621; arXiv 1210.3135
Source metadata: Semantic Scholar

REFERENCES

The L1-norm best-fit hyperplane problem
2013, cited by this paper
OSNAP: Faster Numerical Linear Algebra Algorithms via Sparser Subspace Embeddings
2012, influential reference
Low-Rank Approximation and Regression in Input Sparsity Time
2012, influential reference
Subspace embeddings for the L1-norm with applications
2011, cited by this paper
Fast approximation of matrix coherence and statistical leverage
2011, cited by this paper
LSRN: A Parallel Iterative Solver for Strongly Over- or Underdetermined Systems
2011, influential reference
An almost optimal unrestricted fast Johnson-Lindenstrauss transform
2010, influential reference
A sparse Johnson-Lindenstrauss transform
2010, cited by this paper
Blendenpik: Supercharging LAPACK's Least-Squares Solver
2010, cited by this paper
CUR matrix decompositions for improved data analysis
2009, cited by this paper
A fast randomized algorithm for overdetermined linear least-squares regression
2008, cited by this paper
Faster least squares approximation
2007, cited by this paper
Sampling algorithms and coresets for ℓp regression
2007, cited by this paper
Improved Approximation Algorithms for Large Matrices via Random Projections
2006, cited by this paper
Sampling algorithms for ℓ2 regression and applications
2006, cited by this paper
Subgradient and sampling algorithms for ℓ1 regression
2005, cited by this paper
Embedding the diamond graph in Lp and dimension reduction in L1
2004, cited by this paper
A bound on the deviation probability for sums of non-negative random variables
2003, cited by this paper
Polynomial Interior Point Cutting Plane Methods
2003, cited by this paper
On the Impossibility of Dimension Reduction in ℓ1
2003, cited by this paper
Dimension reduction in the ℓ1 norm
2002, cited by this paper
Chapter 8 - Local Operator Theory, Random Matrices and Banach Spaces
2001, cited by this paper
Stable Distributions. Models for Heavy Tailed Data
2001, cited by this paper
Approximation of zonoids by zonotopes
1989, cited by this paper
A Method for Simulating Stable Random Variables
1976, cited by this paper
Calcul des Probabilités
1926, cited by this paper
Stat260/cs294: Randomized Algorithms for Matrices and Data
year unknown, cited by this paper

CITED BY

A Numerical Analysis of Sketched Linear Squares Problems and Stopping Criteria for Iterative Solvers
2026, cites this paper
A Unified Zeroth-Order Optimization Framework via Oblivious Randomized Sketching
2025, cites this paper
A High Performance GPU CountSketch Implementation and Its Application to Multisketching and Least Squares Problems
2025, cites this paper
Efficient Least-Squares State Estimation Using Uniform Sampling
2025, cites this paper
GPU-Parallelizable Randomized Sketch-and-Precondition for Linear Regression using Sparse Sign Sketches
2025, cites this paper
High-precision randomized preconditioned iterative methods for the random feature method
2025, cites this paper
A Count Sketch Randomized Average Block Kaczmarz Method for Solving Highly Overdetermined Linear Systems
2025, cites this paper
Fast Rank Adaptive CUR via a Recycled Small Sketch
2025, influential citation
Streaming Algorithms For ℓp Flows and ℓp Regression
2025, cites this paper
Panda: partially approximate newton methods for distributed minimax optimization with unbalanced dimensions
2025, cites this paper
The Ubiquitous Sparse Matrix-Matrix Products
2025, cites this paper
Private Sketches for Linear Regression
2025, cites this paper
Importance Sampling for Nonlinear Models
2025, cites this paper
Accelerated Kaczmarz methods via randomized sketch techniques for solving consistent linear systems
2025, cites this paper
Randomized biorthogonalization through a two-sided Gram-Schmidt process
2025, cites this paper
Efficient QR-based Column Subset Selection through Randomized Sparse Embeddings
2025, cites this paper
Online Algorithms with Limited Data Retention
2024, cites this paper
Faster Sampling Algorithms for Polytopes with Small Treewidth
2024, cites this paper
Distributed Differentially Private Data Analytics via Secure Sketching
2024, cites this paper
Optimal Oblivious Subspace Embeddings with Near-optimal Sparsity
2024, cites this paper
Randomized LU-Householder CholeskyQR
2024, cites this paper
Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning
2024, cites this paper
High-precision randomized iterative methods for the random feature method
2024, cites this paper
Faster Linear Systems and Matrix Norm Approximation via Multi-level Sketched Preconditioning
2024, cites this paper
Distributed Least Squares in Small Space via Sketching and Bias Reduction
2024, cites this paper
Sketchy Moment Matching: Toward Fast and Provable Data Selection for Finetuning
2024, cites this paper
Adaptive Parallelizable Algorithms for Interpolative Decompositions via Partially Pivoted LU
2023, cites this paper
Optimal Embedding Dimension for Sparse Subspace Embeddings
2023, cites this paper
Robust Blockwise Random Pivoting: Fast and Accurate Adaptive Interpolative Decomposition
2023, cites this paper
Least-Mean-Squares Coresets for Infinite Streams
2023, cites this paper
Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training
2023, cites this paper
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
2023, cites this paper
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification
2023, cites this paper
SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series Data
2023, cites this paper
Online Adaptive Mahalanobis Distance Estimation
2023, cites this paper
Analysis of Randomized Householder-Cholesky QR factorization with multisketching
2023, cites this paper
Krylov Methods are (nearly) Optimal for Low-Rank Approximation
2023, cites this paper
New Subset Selection Algorithms for Low Rank Approximation: Offline and Online
2023, influential citation
Federated Empirical Risk Minimization via Second-Order Method
2023, cites this paper
A Nearly-Optimal Bound for Fast Regression with ℓ∞ Guarantee
2023, cites this paper
Solving Dense Linear Systems Faster Than via Preconditioning
2023, cites this paper
An Online and Unified Algorithm for Projection Matrix Vector Multiplication with Application to Empirical Risk Minimization
2023, cites this paper
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
2023, cites this paper
Elliptic PDE learning is provably data-efficient
2023, cites this paper
On using affine sketches for multiple-response dynamic graph regression
2022, cites this paper
Faster Randomized Interior Point Methods for Tall/Wide Linear Programs
2022, cites this paper
Dynamic Tensor Product Regression
2022, cites this paper
Sketching Meets Differential Privacy: Fast Algorithm for Dynamic Kronecker Projection Maintenance
2022, cites this paper
Tight Bounds for Sketching the Operator Norm, Schatten Norms, and Subspace Embeddings
2022, cites this paper
An Efficient Algorithm for Computing the Approximate t-URV and its Applications
2022, influential citation
Tight Bounds for ℓ1 Oblivious Subspace Embeddings
2022, influential citation
Online Lewis Weight Sampling
2022, cites this paper
Efficient Bounds and Estimates for Canonical Angles in Randomized Subspace Approximations
2022, cites this paper
On a Connection Between Fast and Sparse Oblivious Subspace Embeddings
2022, cites this paper
Stochastic Variance-Reduced Newton: Accelerating Finite-Sum Minimization with Large Batches
2022, cites this paper
Optimal subsampling for large-sample quantile regression with massive data
2022, cites this paper
Fast Regression for Structured Inputs
2022, cites this paper
p-Generalized Probit Regression and Scalable Maximum Likelihood Estimation via Sketching and Coresets
2022, cites this paper
Sketching Algorithms and Lower Bounds for Ridge Regression
2022, cites this paper
High-Dimensional Geometric Streaming in Polynomial Space
2022, cites this paper
Dynamic Least-Squares Regression
2022, cites this paper
Sketching for First Order Method: Efficient Algorithm for Low-Bandwidth Channel and Vulnerability
2022, cites this paper
On randomized sketching algorithms and the Tracy–Widom law
2022, cites this paper
Low-rank approximation with $1/\epsilon^{1/3}$ matrix-vector products
2022, cites this paper
On Coresets for Fair Regression and Individually Fair Clustering
2022, cites this paper
Principled interpolation of Green's functions learned from data
2022, cites this paper
Iterative Double Sketching for Faster Least-Squares Optimization
2022, cites this paper
pylspack: Parallel Algorithms and Data Structures for Sketching, Column Subset Selection, Regression, and Leverage Scores
2022, cites this paper
Sparse Regression Faster than $d^\omega$
2021, cites this paper
Sketched quantile additive functional regression
2021, cites this paper
Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update
2021, cites this paper
No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
2021, cites this paper
A Very Sketchy Talk
2021, cites this paper
Fast Sketching of Polynomial Kernels of Polynomial Degree
2021, cites this paper
Streaming and Distributed Algorithms for Robust Column Subset Selection
2021, cites this paper
Overview of accurate coresets
2021, cites this paper
Hashing embeddings of optimal dimension, with applications to linear least squares
2021, cites this paper
Continual Learning with Sketched Structural Regularization
2021, cites this paper
Exponentially Improved Dimensionality Reduction for ℓ1: Subspace Embeddings and Independence Testing
2021, cites this paper
Learning a Latent Simplex in Input-Sparsity Time
2021, cites this paper
Estimating leverage scores via rank revealing methods and randomization
2021, cites this paper
Accumulation of Sub-Sampling Matrices with Applications to Statistical Computation
2021, cites this paper
Simpler is better: a comparative study of randomized pivoting algorithms for CUR and interpolative decompositions
2021, cites this paper
Simpler is better: A comparative study of randomized algorithms for computing the CUR decomposition
2021, cites this paper
Lifelong Learning with Sketched Structural Regularization
2021, cites this paper
Non-PSD Matrix Sketching with Applications to Regression and Optimization
2021, cites this paper
An Introduction to Johnson-Lindenstrauss Transforms
2021, cites this paper
Near-Optimal Algorithms for Linear Algebra in the Current Matrix Multiplication Time
2021, cites this paper
Few-Shot Data-Driven Algorithms for Low Rank Approximation
2021, cites this paper
Fast Graph Neural Tangent Kernel via Kronecker Sketching
2021, influential citation
Improved iteration complexities for overconstrained p-norm regression
2021, cites this paper
Faster $p$-Norm Regression Using Sparsity
2021, cites this paper
Fast & Accurate Randomized Algorithms for Linear Systems and Eigenvalue Problems
2021, cites this paper
Query Complexity of Least Absolute Deviation Regression via Robust Uniform Convergence
2021, cites this paper
Faster Randomized Methods for Orthogonality Constrained Problems
2021, cites this paper
Nearly sharp structured sketching for constrained optimization
2020, cites this paper
A User-Friendly Computational Framework for Robust Structured Regression with the L2 Criterion
2020, cites this paper
Randomized Linear Algebra Approaches to Estimate the von Neumann Entropy of Density Matrices
2020, cites this paper
Compressed Deep Networks: Goodbye SVD, Hello Robust Low-Rank Approximation
2020, influential citation
Explicitly Defined Sampling Categories for ESA on a Bipartite Graph
2020, cites this paper