Uncertainty in Deep Learning for EEG under Dataset Shifts

Mats Tveter, Thomas Tveitstøl, Christoffer Hatlestad-Hall, Hugo L Hammer, Ira R. J. Hebold Haraldsen

Published 2025 in bioRxiv

ABSTRACT

Objective: As artificial intelligence (AI) is increasingly integrated into medical diagnostics, it is essential that predictive models provide not only accurate outputs but also reliable estimates of uncertainty. In clinical applications, where decisions have significant consequences, understanding the confidence behind each prediction is as critical as the prediction itself. Uncertainty modelling plays a key role in improving trust, guiding decision-making, and identifying unreliable outputs, particularly under dataset shift or in out-of-distribution settings. The primary aim of uncertainty metrics is to align model confidence closely with actual predictive performance, ensuring confidence estimates dynamically adjust to reflect increasing errors or decreasing reliability of predictions. This study investigates how different ensemble learning strategies affect both performance and uncertainty estimation in a clinically relevant task: classifying Normal, Mild Cognitive Impairment, and Dementia from electroencephalography (EEG) data.

Approach: We evaluated the performance and uncertainty of ensemble methods and Monte Carlo dropout on a large EEG dataset. The models were assessed in three settings: (1) in-distribution performance on a held-out test set, (2) generalisation to three out-of-distribution datasets, and (3) performance under gradual, EEG-specific dataset shifts simulating noise, drift, and frequency perturbation.

Main results: Ensembles consisting of multiple independently trained models, such as deep ensembles, consistently achieved higher performance in both the in-distribution test set and the out-of-distribution datasets. These models also produced more informative and responsive uncertainty estimates under various types of EEG dataset shifts.

Significance: These results highlight the benefits of ensemble diversity and independent training to build robust and uncertainty-aware EEG classification models. The findings are particularly relevant for clinical applications, where reliability under distribution shift and transparent uncertainty are essential for safe deployment.
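
As an illustration of how an ensemble can yield an uncertainty estimate for the three-class task described above, the following is a minimal sketch, not the authors' implementation. It assumes each ensemble member (or each Monte Carlo dropout pass) outputs a probability vector over Normal, Mild Cognitive Impairment, and Dementia, and it uses predictive entropy and member disagreement as the uncertainty measures; the abstract does not state which metrics the study used, and the function names here are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def ensemble_mean(member_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the class probabilities across ensemble members."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def predictive_uncertainty(member_probs: Sequence[Sequence[float]]) -> Dict[str, object]:
    """Summarise one input's uncertainty from per-member class probabilities.

    Assumed (illustrative) measures:
      total        = entropy of the averaged prediction
      disagreement = total minus the mean per-member entropy
                     (large when members contradict each other)
    """
    mean = ensemble_mean(member_probs)
    total = entropy(mean)
    expected = sum(entropy(p) for p in member_probs) / len(member_probs)
    return {"mean_probs": mean, "total": total, "disagreement": total - expected}


# Hypothetical outputs for one EEG recording; class order is
# (Normal, Mild Cognitive Impairment, Dementia).
agreeing = [[0.9, 0.05, 0.05]] * 3
conflicting = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

low = predictive_uncertainty(agreeing)
high = predictive_uncertainty(conflicting)
```

When all members agree, the disagreement term is close to zero; when each member confidently predicts a different class, the averaged prediction is uniform and the disagreement reaches its maximum of log(3) nats. A measure that rises in this way as inputs are progressively perturbed is one example of the "responsive" uncertainty the abstract refers to.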
