Leveraging Deep Learning for Fault Detection and Localization in Distributed Systems
Debolina Ghosh, Jay Prakash Singh
Published 2025 in IEEE Access
ABSTRACT
The dynamic and complex nature of distributed systems makes fault localization extremely difficult, frequently leading to extended outages and higher operating expenses. This study proposes a deep learning-based fault localization framework that combines Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), LSTM+CNN, and Autoencoder+LSTM models. The data undergo extensive preprocessing, including log parsing, feature extraction using TF-IDF and Word2Vec, and min-max normalisation, before the models are trained and assessed on five benchmark datasets: HDFS, OpenStack, Spark, Hadoop, and BGL. To ensure robustness, the methodology incorporates a 5-fold cross-validation strategy, model-specific architecture tuning, and 1-D sequence modelling. According to the experimental results, CNN achieves the best overall performance on the HDFS dataset, with a Mean Squared Error (MSE) of 0.00002 and a coefficient of determination (R² score) of 0.996. CNN consistently outperforms the other models in accuracy and performance across all datasets. The key contributions of this study are: 1) a thorough deep learning-based fault localization framework for distributed systems; 2) a comparison of five cutting-edge architectures on five real-world datasets; and 3) statistically validated performance benchmarks backed by Wilcoxon signed-rank tests and t-tests. These contributions provide useful guidance for implementing accurate and scalable fault localization in real-world distributed computing environments.
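The abstract names min-max normalisation, 5-fold cross-validation, MSE, and the R² score but gives no formulas. The sketch below shows the standard textbook definitions of each in plain Python, purely as an illustration. It is not the authors' implementation: all function names are hypothetical, and the contiguous, unshuffled fold split is an assumption, since the abstract does not say how folds were formed.

```python
from typing import Iterator, List, Sequence, Tuple


def min_max_normalise(values: Sequence[float]) -> List[float]:
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature is mapped to zeros to avoid dividing by 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """MSE: the average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """R^2 = 1 - SS_res / SS_tot, where SS_tot is measured around the target mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

Under these definitions, an R² score close to 1 together with an MSE close to 0 means the predictions track the targets closely, which is how the figures reported for CNN on HDFS should be read.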
PUBLICATION RECORD
- Publication year: 2025
- Venue: IEEE Access
- Fields of study: Computer Science, Engineering
- Source metadata: Semantic Scholar