A novel bi-level clustering optimization approach to balance treatment of crash data.

Tanveer Ahmed, Vikash V. Gayah

Published 2025 in Accident Analysis and Prevention

ABSTRACT

Understanding the impact of safety countermeasures on crash outcomes is crucial but challenging. When cross-sectional data are used to quantify a countermeasure's effectiveness, underlying differences in road characteristics can lead to imbalances between treated sites and control sites (those without the countermeasure), which can bias the evaluation. Propensity score-based matching methods have been widely used in the traffic safety literature to identify treated and control sites with more balanced covariates; however, the use of propensity scores does not guarantee that bias between treated and control entities is minimized, and its success depends heavily on how the propensity score model is formulated. To address this issue, this study introduces a novel Bi-Level Clustering Optimization (BLCO) method that matches treated and control sites in a way that minimizes imbalance across the two groups. The proposed method uses competitive learning to directly minimize the sum of squares of the standardized bias of covariates across the treated and control groups, better simulating the conditions of a randomized trial with non-random observational data. The proposed BLCO method was compared with propensity score matching based on binary logit regression and on random forest algorithms, as well as with the genetic matching method. The results demonstrate that the proposed BLCO method significantly outperforms these benchmarks at balancing covariates across treated and control groups, reducing mean absolute standardized bias by 96.16% compared to the unmatched data and achieving an 88.76% improvement over propensity score matching. Additionally, treatment effects on the treated estimated using the optimally clustered data showed better model fit than those estimated with the other methods. The proposed method is robust across varying dataset sizes and efficiently handles high-dimensional covariates without transformation, making it applicable to different domains for treatment effect estimation and informed decision-making.
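
The balance measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula, so this assumes the common definition of standardized bias: the difference in group means divided by the square root of the average of the two group variances, expressed as a percentage. The function names (`standardized_bias`, `balance_summary`, `percent_reduction`) are hypothetical, and the sketch only computes the objective being balanced; it is not the BLCO matching procedure itself.

```python
from math import sqrt
from statistics import mean, variance


def standardized_bias(treated, control):
    """Percent standardized bias of one covariate (assumed definition)."""
    diff = mean(treated) - mean(control)
    # Pooled standard deviation: root of the average of the two sample variances.
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        # Both groups are constant: balanced if the means agree, otherwise unbounded.
        return 0.0 if diff == 0 else float("inf")
    return 100 * diff / pooled_sd


def balance_summary(treated_rows, control_rows):
    """Return (sum of squared standardized bias, mean absolute standardized bias).

    Each row is a sequence of covariate values for one site.
    """
    n_covariates = len(treated_rows[0])
    biases = [
        standardized_bias(
            [row[j] for row in treated_rows],
            [row[j] for row in control_rows],
        )
        for j in range(n_covariates)
    ]
    sum_squared = sum(b * b for b in biases)
    mean_absolute = mean([abs(b) for b in biases])
    return sum_squared, mean_absolute


def percent_reduction(before, after):
    """Percent reduction in a balance measure relative to its unmatched value."""
    return 100 * (before - after) / before
```

Under these assumptions, a matching method would be judged by calling `balance_summary` on the unmatched and matched site sets and comparing the two mean absolute values with `percent_reduction`.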
