Adjustment for Confounding using Pre-Trained Representations

Rickmer Schulte, David Rügamer, Thomas Nagler

Published 2025 in International Conference on Machine Learning

ABSTRACT

There is growing interest in extending average treatment effect (ATE) estimation to incorporate non-tabular data, such as images and text, which may act as sources of confounding. Neglecting these effects risks biased results and flawed scientific conclusions. However, incorporating non-tabular data necessitates sophisticated feature extractors, often in combination with ideas from transfer learning. In this work, we investigate how latent features from pre-trained neural networks can be leveraged to adjust for sources of confounding. We formalize conditions under which these latent features enable valid adjustment and statistical inference in ATE estimation, illustrating the results with double machine learning. We discuss critical challenges inherent to latent feature learning and downstream parameter estimation arising from the high dimensionality and non-identifiability of representations. Common structural assumptions for obtaining fast convergence rates with additive or sparse linear models are shown to be unrealistic for latent features. We argue, however, that neural networks are largely insensitive to these issues. In particular, we show that neural networks can achieve fast convergence rates by adapting to intrinsic notions of sparsity and dimension of the learning problem.
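
As a rough illustration of the workflow the abstract describes, the following minimal sketch runs cross-fitted double machine learning on simulated data. It is not the authors' implementation and rests on several assumptions: the matrix `Z` is a hypothetical stand-in for latent features from a pre-trained network, the data follow a partially linear model with a constant treatment effect (so the treatment coefficient coincides with the ATE), and closed-form ridge regression replaces the neural-network nuisance learners the paper argues for. All function names are illustrative.

```python
import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression with an unpenalized intercept (stand-in nuisance learner)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    penalty = lam * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def ridge_predict(coef, X):
    return coef[0] + X @ coef[1:]


def dml_plr(Z, t, y, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta * t + g(Z) + noise."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    res_t = np.empty(n)
    res_y = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Nuisance models are fit on the training folds only (cross-fitting).
        coef_t = ridge_fit(Z[train], t[train])
        coef_y = ridge_fit(Z[train], y[train])
        res_t[test_idx] = t[test_idx] - ridge_predict(coef_t, Z[test_idx])
        res_y[test_idx] = y[test_idx] - ridge_predict(coef_y, Z[test_idx])
    # Residual-on-residual regression yields the orthogonalized estimate.
    theta = (res_t @ res_y) / (res_t @ res_t)
    psi = res_t * (res_y - theta * res_t)
    se = np.sqrt(np.mean(psi**2) / np.mean(res_t**2) ** 2 / n)
    return theta, se


def simulate(n=2000, d=20, theta=1.0, seed=1):
    """Simulated data in which a linear function of Z confounds treatment and outcome."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(n, d))  # hypothetical latent features
    beta = rng.normal(size=d) / np.sqrt(d)
    confounder = Z @ beta
    t = confounder + rng.normal(size=n)
    y = theta * t + 2.0 * confounder + rng.normal(size=n)
    return Z, t, y


Z, t, y = simulate()
theta_hat, se = dml_plr(Z, t, y)
naive = (t @ y) / (t @ t)  # regression of y on t that ignores Z
print(f"naive: {naive:.3f}  DML: {theta_hat:.3f}  (se {se:.3f})")
```

In this simulation the naive slope absorbs the confounding through `Z`, while the cross-fitted estimate should land near the true value of 1.0. The linear setup is chosen only to keep the sketch short; the abstract's point is that such structural assumptions are unrealistic for real latent features.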

PUBLICATION RECORD

Publication year: 2025
Venue: International Conference on Machine Learning
Publication date: 2025-06-17
Fields of study: Mathematics, Computer Science
Identifiers: DOI 10.48550/arXiv.2506.14329; arXiv 2506.14329

REFERENCES

The Effect of Intrinsic Dataset Properties on Generalization: Unraveling Learning Differences Between Natural and Medical Images
2024, cited by this paper
Efficient adjustment for complex covariates: Gaining efficiency with DOPE
2024, influential reference
DoubleMLDeep: Estimation of Causal Effects with Multimodal Data
2024, cited by this paper
Inference for regression with variables generated from unstructured data
2024, cited by this paper
End-To-End Causal Effect Estimation from Unstructured Natural Language Data
2024, cited by this paper
Causal inference through multi-stage learning and doubly robust deep neural networks
2024, cited by this paper
ROSE Random Forests for Robust Semiparametric Efficient Estimation
2024, influential reference
Integrating Earth Observation Data into Causal Inference: Challenges and Opportunities
2023, cited by this paper
Debiasing Machine-Learning- or AI-Generated Regressors in Partial Linear Models
2023, cited by this paper
Image-based Treatment Effect Heterogeneity
2022, cited by this paper
Causal Transformer for Estimating Counterfactual Outcomes
2022, cited by this paper
Estimating Causal Effects Under Image Confounding Bias with an Application to Poverty in Africa
2022, cited by this paper
Datasets: A Community Library for Natural Language Processing
2021, cited by this paper
Proximal Causal Learning with Kernels: Two-Stage Estimation and Moment Restriction
2021, cited by this paper
The Intrinsic Dimension of Images and Its Impact on Learning
2021, cited by this paper
DoubleML - An Object-Oriented Implementation of Double Machine Learning in Python
2021, cited by this paper
On the rate of convergence of fully connected deep neural network regression estimates
2021, influential reference
TorchXRayVision: A library of chest X-ray datasets and models
2021, cited by this paper
Estimation of a regression function on a manifold by fully connected deep neural networks
2021, cited by this paper
5分で分かる!? 有名論文ナナメ読み (Understand in 5 Minutes!? Skimming Famous Papers): Jacob Devlin et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
2020, cited by this paper
On the limits of cross-domain generalization in automated X-ray prediction
2020, cited by this paper
Deep Neural Networks for Estimation and Inference
2020, influential reference
Embedding Learning
2020, cited by this paper
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
2020, cited by this paper
CausalML: Python Package for Causal Machine Learning
2020, cited by this paper
Adapting Neural Networks for the Estimation of Treatment Effects
2019, cited by this paper
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
2019, cited by this paper
Intrinsic dimension of data representations in deep neural networks
2019, cited by this paper
Using Embeddings to Correct for Unobserved Confounding
2019, cited by this paper
Deep ReLU network approximation of functions on a manifold
2019, cited by this paper
Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds
2019, cited by this paper
Adaptive Approximation and Generalization of Deep Neural Network with Intrinsic Dimensionality
2019, cited by this paper
On the Intrinsic Dimensionality of Image Representations
2018, cited by this paper
Causal Inference with Noisy and Missing Covariates via Matrix Factorization
2018, cited by this paper
Asymptotic statistics
2018, influential reference
Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images for Classification
2018, cited by this paper
Smooth Manifolds
2018, cited by this paper
Nearly-tight VC-dimension and Pseudodimension Bounds for Piecewise Linear Neural Networks
2017, cited by this paper
Double/debiased machine learning for treatment and structural parameters
2017, influential reference
Double/Debiased/Neyman Machine Learning of Treatment Effects
2017, cited by this paper
Nonparametric regression using deep neural networks with ReLU activation function
2017, cited by this paper
Identifying Causal Effects With Proxy Variables of an Unmeasured Confounder.
2016, cited by this paper
Densely Connected Convolutional Networks
2016, influential reference
Low Bias Local Intrinsic Dimension Estimation from Expected Simplex Skewness
2015, cited by this paper
Measurement bias and effect restoration in causal inference
2014, cited by this paper
Testing the Manifold Hypothesis
2013, cited by this paper
Targeted Maximum Likelihood Learning (The International Journal of Biostatistics)
2011, cited by this paper
Targeted Learning: Causal Inference for Observational and Experimental Data
2011, cited by this paper
Why Does Unsupervised Pre-training Help Deep Learning?
2010, cited by this paper
ImageNet: A large-scale hierarchical image database
2009, cited by this paper
Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness
2009, cited by this paper
Local polynomial regression on unknown manifolds
2007, cited by this paper
Feature selection, L1 vs. L2 regularization, and rotational invariance
2004, cited by this paper
Maximum Likelihood Estimation of Intrinsic Dimension
2004, influential reference
Semiparametric Efficiency in Multivariate Regression Models with Missing Data
1995, cited by this paper
Root-N-Consistent Semiparametric Regression
1988, cited by this paper
Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy.
1986, cited by this paper
Additive Regression and Other Nonparametric Models
1985, cited by this paper
Optimal Global Rates of Convergence for Nonparametric Regression
1982, cited by this paper
An Algorithm for Finding Intrinsic Dimensionality of Data
1971, cited by this paper

CITED BY

Optimal neural network approximation of smooth compositional functions on sets with low intrinsic dimension
2026, cites this paper
Orthogonal Representation Learning for Estimating Causal Quantities
2025, cites this paper