Lower bounds on locality sensitive hashing

Published 2005 in SCG '06

ABSTRACT

Given a metric space (X,dX), c≥1, r>0, and p,q ∈ [0,1], a distribution over mappings H : X → N is called a (r,cr,p,q)-sensitive hash family if any two points in X at distance at most r are mapped by H to the same value with probability at least p, and any two points at distance greater than cr are mapped by H to the same value with probability at most q. This notion was introduced by Indyk and Motwani in 1998 as the basis for an efficient approximate nearest neighbor search algorithm, and has since been used extensively for this purpose. The performance of these algorithms is governed by the parameter ρ = log(1/p)/log(1/q), and constructing hash families with small ρ automatically yields improved nearest neighbor algorithms. Here we show that for X = ℓ1 it is impossible to achieve ρ ≤ 1/(2c). This almost matches the construction of Indyk and Motwani which achieves ρ ≤ 1/c.
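As a rough illustration of the quantity ρ in the abstract, the sketch below evaluates ρ = log(1/p)/log(1/q) for bit sampling on the Hamming cube {0,1}^d, where a hash function returns one uniformly random coordinate. Bit sampling is used here only as a familiar example and is an assumption of this sketch, not something taken from this record. The function names `rho` and `bit_sampling_probs` and the values d = 100, r = 10, c = 2 are likewise hypothetical.

```python
import math


def rho(p: float, q: float) -> float:
    """Return rho = log(1/p) / log(1/q) for an (r, cr, p, q)-sensitive family."""
    if not (0.0 < q < p < 1.0):
        raise ValueError("expected 0 < q < p < 1")
    return math.log(1.0 / p) / math.log(1.0 / q)


def bit_sampling_probs(d: int, r: float, c: float) -> tuple[float, float]:
    """Collision probabilities (p, q) for sampling one random coordinate of {0,1}^d.

    Two points at Hamming distance t agree on a uniformly random coordinate
    with probability 1 - t/d, so distance <= r gives p = 1 - r/d and
    distance > c*r gives at most q = 1 - c*r/d.
    """
    if not (c >= 1.0 and r > 0.0 and c * r < d):
        raise ValueError("expected c >= 1, r > 0 and c*r < d")
    return 1.0 - r / d, 1.0 - c * r / d


if __name__ == "__main__":
    d, r, c = 100, 10, 2.0  # illustrative values only
    p, q = bit_sampling_probs(d, r, c)
    value = rho(p, q)
    print(f"p = {p:.3f}, q = {q:.3f}, rho = {value:.4f}")
    print(f"upper bound 1/c = {1.0 / c:.4f}, lower bound 1/(2c) = {1.0 / (2.0 * c):.4f}")
```

For these values ρ comes out at roughly 0.47, which lies between the lower bound 1/(2c) = 0.25 and the upper bound 1/c = 0.5 quoted in the abstract. That ρ ≤ 1/c for this family follows from Bernoulli's inequality, (1 − x)^c ≥ 1 − cx.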

PUBLICATION RECORD

Publication year
2005
Venue
SCG '06
Publication date
2005-10-29
Fields of study
Mathematics, Computer Science
Identifiers
DOI 10.1145/1137856.1137881 arXiv cs/0510088
Source metadata
Semantic Scholar

REFERENCES

Entropy based nearest neighbor search in high dimensions
2005cited by this paper
Locality-sensitive hashing scheme based on p-stable distributions
2004influential reference
A replacement for Voronoi diagrams of near linear size
2001cited by this paper
Inequalities in Fourier analysis
1975cited by this paper
Étude des coefficients de Fourier des fonctions de $L^p(G)$
1970cited by this paper

CITED BY

GPU-Accelerated ANNS: Quantized for Speed, Built for Change
2026influential citation
CleANN: Efficient Full Dynamism in Graph-based Approximate Nearest Neighbor Search
2025cites this paper
Neural auto-association with optimal Bayesian learning
2024cites this paper
Learning to Hash for Recommendation: A Survey
2024cites this paper
Mixture of Experts Residual Learning for Hamming Hashing
2023cites this paper
Average distortion embeddings, nonlinear spectral gaps, and a metric John theorem (after Assaf Naor)
2022cites this paper
Approximate Nearest Neighbor (ANN) Search in High Dimensions
2022cites this paper
Private Approximate Nearest Neighbor Search with Sublinear Communication
2022cites this paper
Nearly Optimal Property Preserving Hashing
2022cites this paper
A data-driven situation-aware framework for predictive analysis in smart environments
2022cites this paper
Optimal Las Vegas Approximate Near Neighbors in ℓp
2022cites this paper
Clone detection through srcClone: A program slicing based approach
2021cites this paper
Lightweight Private Similarity Search
2021cites this paper
Lower bounds on lattice sieving and information set decoding
2021cites this paper
A Faster Algorithm for Finding Closest Pairs in Hamming Metric
2021cites this paper
Dynamic Enumeration of Similarity Joins
2021cites this paper
A Survey on Deep Hashing Methods
2020cites this paper
Faster Deterministic and Las Vegas Algorithms for Offline Approximate Nearest Neighbors in High Dimensions
2020cites this paper
Subsets and supermajorities: Optimal hashing-based set similarity search
2020influential citation
Locality Sensitive Hashing for Set-Queries, Motivated by Group Recommendations
2020cites this paper
Return of the Lernaean Hydra: Experimental Evaluation of Data Series Approximate Similarity Search
2019cites this paper
SETH vs Approximation
2019cites this paper
Optimal Las Vegas Approximate Near Neighbors in l_p
2019cites this paper
Subsets and Supermajorities: Optimal Hashing-based Set Similarity Search
2019cites this paper
Approximation and Hardness: Beyond P and NP
2019cites this paper
Subsets and Supermajorities: Unifying Hashing-based Set Similarity Search
2019cites this paper
On the Distortion of Locality Sensitive Hashing
2019cites this paper
Graph-based Nearest Neighbor Search: From Practice to Theory
2019cites this paper
Algorithms for Similarity Search and Pseudorandomness
2019influential citation
Hardness of Approximate Nearest Neighbor Search
2018cites this paper
A new coding-based algorithm for finding closest pair of vectors
2018cites this paper
Approximate Nearest Neighbor Search in High Dimensions
2018cites this paper
Optimal Las Vegas Approximate Near Neighbors in 𝓁p
2018cites this paper
Local Density Estimation in High Dimensions
2018cites this paper
Why locality sensitive hashing works: A practical perspective
2018cites this paper
Parallel Hashing Using Representative Points in Hyperoctants
2018cites this paper
On Closest Pair in Euclidean Metric: Monochromatic is as Hard as Bichromatic
2018cites this paper
Algorithms above the noise floor
2018cites this paper
A New Algorithm for Finding Closest Pair of Vectors
2018cites this paper
Hardness of approximate nearest neighbor search
2018cites this paper
Privacy Preserving Query over Encrypted Multidimensional Massive Data in Cloud Storage
2018cites this paper
SKETCHING AND EMBEDDING ARE EQUIVALENT FOR
2018cites this paper
Randomized Primitives for Big Data Processing
2018cites this paper
New Algorithmic Tools for Distributed Similarity Search and Edge Estimation
2018cites this paper
Real-time community detection in full social networks on a laptop
2018cites this paper
Explaining the Success of Nearest Neighbor Methods in Prediction
2018cites this paper
Nearest Neighbors in High-Dimensional Spaces
2017cites this paper
LSH Forest: Practical Algorithms Made Theoretical
2017cites this paper
Hypercube LSH for approximate near neighbors
2017cites this paper
Distributed PCP Theorems for Hardness of Approximation in P
2017cites this paper
High-dimensional similarity search and sketching: algorithms and hardness
2017influential citation
Distance-Sensitive Hashing
2017influential citation
Lattice-based Locality Sensitive Hashing is Optimal
2017cites this paper
Lower Bounds on Time-Space Trade-Offs for Approximate Near Neighbors
2016influential citation
A Survey on Learning to Hash
2016cites this paper
Nearest Neighbor Search in the Metric Space of a Complex Network for Community Detection
2016cites this paper
Set similarity search beyond MinHash
2016cites this paper
Partial least squares for face hashing
2016cites this paper
Hashing-based clustering in high dimensional data
2016cites this paper
Handbook of Big Data
2016cites this paper
Optimal Hashing-based Time-Space Trade-offs for Approximate Near Neighbors
2016cites this paper
Real-Time Community Detection in Large Social Networks on a Laptop
2016cites this paper
Search problems in cryptography: from fingerprinting to lattice sieving
2016cites this paper
Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
2016cites this paper
A Framework for Similarity Search with Space-Time Tradeoffs using Locality-Sensitive Filtering
2016cites this paper
Efficient Associative Computation with Discrete Synapses
2016cites this paper
Massively-Parallel Similarity Join, Edge-Isoperimetry, and Distance Correlations on the Hypercube
2016cites this paper
Explicit Correlation Amplifiers for Finding Outlier Correlations in Deterministic Subquadratic Time
2016cites this paper
Practical linear-space Approximate Near Neighbors in high dimension
2016cites this paper
Fast Cross-Polytope Locality-Sensitive Hashing
2016cites this paper
Scalability and Total Recall with Fast CoveringLSH
2016cites this paper
IMPLEMENTATION OF ACTIVE STORAGE IN EFFICIENT VIRTUAL FILE SYSTEM
2015cites this paper
On the Complexity of Inner Product Similarity Join
2015cites this paper
Tradeoffs for nearest neighbors on the sphere
2015cites this paper
Fast Approximate near Neighbor Algorithm by Clustering in High Dimensions
2015cites this paper
Smooth Tradeoffs between Insert and Query Complexity in Nearest Neighbor Search
2015cites this paper
Optimal Data-Dependent Hashing for Approximate Near Neighbors
2015influential citation
Practical and Optimal LSH for Angular Distance
2015cites this paper
Selective Hashing: Closing the Gap between Radius Search and k-NN Search
2015cites this paper
Finding Correlations in Subquadratic Time, with Applications to Learning Parities and the Closest Pair Problem
2015cites this paper
CS 229r: Algorithms for Big Data
2015cites this paper
Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing
2015influential citation
Nearest Neighbor search in Complex Network for Community Detection
2015cites this paper
A Faster Subquadratic Algorithm for Finding Outlier Correlations
2015cites this paper
Multi-Granularity Locality-Sensitive Bloom Filter
2015cites this paper
In the last lecture we discussed how distributional JL implies Gordon's theorem, and began our discussion of sparse JL. We wrotek xk 2 = T A TAx and bounded the expression using Hanson- Wright in terms of the Frobenius norm.
2015cites this paper
Large Scale Nearest Neighbor Search - Theories, Algorithms, and Applications
2014cites this paper
A Directed Isoperimetric Inequality with application to Bregman Near Neighbor Lower Bounds
2014influential citation
Sketching and Embedding are Equivalent for Norms
2014cites this paper
Algorithms for High Dimensional Data
2014cites this paper
Hashing for Similarity Search: A Survey
2014cites this paper
High-dimensional indexing technologies for large scale content-based image retrieval: a review
2013cites this paper
Beyond Locality-Sensitive Hashing
2013cites this paper
Approximating Minimization Diagrams and Generalized Proximity Search
2013cites this paper
Computational Geometric Learning Approximate Nearest Neighbor Search for Points on Lower Dimensional Flats
2013cites this paper
Euclidean spanners in high dimensions
2013cites this paper
SIMP: accurate and efficient near neighbor search in high dimensional spaces
2012influential citation
Multimedia semantics-aware query-adaptive hashing with bits reconfigurability
2012influential citation
Distributed approximate spectral clustering for large-scale datasets
2012cites this paper