Outlier Detection Using Nonconvex Penalized Regression
Published 2010 in arXiv.org
ABSTRACT
This article studies the outlier detection problem from the standpoint of penalized regression. In the regression model, we add one mean shift parameter for each of the n data points. We then apply a regularization favoring a sparse vector of mean shift parameters. The usual L1 penalty yields a convex criterion, but fails to deliver a robust estimator; the L1 penalty corresponds to soft thresholding. We introduce a thresholding (denoted by Θ) based iterative procedure for outlier detection (Θ–IPOD). A version based on hard thresholding correctly identifies outliers on some hard test problems. We describe the connection between Θ–IPOD and M-estimators. Our proposed method has one tuning parameter with which to both identify outliers and estimate regression coefficients; a data-dependent choice can be made based on the Bayesian information criterion (BIC). The tuned Θ–IPOD shows outstanding performance in identifying outliers in various situations compared with other existing approaches. In addition, Θ–IPOD is much faster than iteratively reweighted least squares for large data, because each iteration costs at most O(np) (and sometimes much less), avoiding an O(np²) least squares estimate. This methodology can be extended to high-dimensional modeling with p ≫ n if both the coefficient vector and the outlier pattern are sparse.
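The abstract describes a mean-shift model y = Xβ + γ + ε, where the sparse vector γ carries one shift parameter per observation, and a thresholding-based iteration for estimating it. The following minimal Python sketch illustrates the idea with a simple alternating update (least squares on the shifted responses, then hard thresholding of the residuals); the function names and this particular update scheme are illustrative assumptions, not the authors' exact Θ–IPOD algorithm:

```python
import numpy as np

def hard_threshold(r, lam):
    # Hard thresholding Θ: keep entries with |r_i| > lam, zero out the rest.
    return np.where(np.abs(r) > lam, r, 0.0)

def theta_ipod(X, y, lam, n_iter=100, tol=1e-8):
    """Illustrative thresholding-based iteration for outlier detection.

    Alternates two steps until the outlier-shift vector gamma stabilizes:
      1. beta  <- least squares fit of (y - gamma) on X
      2. gamma <- hard threshold of the residuals y - X beta
    Nonzero entries of gamma flag the observations treated as outliers.
    """
    n, p = X.shape
    gamma = np.zeros(n)
    X_pinv = np.linalg.pinv(X)           # reused across iterations
    for _ in range(n_iter):
        beta = X_pinv @ (y - gamma)      # LS step on shifted responses
        gamma_new = hard_threshold(y - X @ beta, lam)
        if np.max(np.abs(gamma_new - gamma)) < tol:
            gamma = gamma_new
            break
        gamma = gamma_new
    return beta, gamma
```

On a toy line y = 2 + 3x with one response shifted upward, the nonzero entry of the returned gamma identifies the contaminated point while beta recovers the clean coefficients; the threshold lam plays the role of the single tuning parameter mentioned in the abstract.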
PUBLICATION RECORD
- Publication year
2010
- Venue
arXiv.org
- Publication date
2010-06-14
- Fields of study
Mathematics, Computer Science
- Source metadata
Semantic Scholar