FOVA: Offline Federated Reinforcement Learning With Mixed-Quality Data

Published 2025 in IEEE Transactions on Networking

ABSTRACT

Offline Federated Reinforcement Learning (FRL), which combines federated learning with offline reinforcement learning, has attracted increasing interest recently. Despite this progress, we find that the performance of most existing offline FRL methods drops dramatically when they are given mixed-quality data, that is, when the logged behaviors (offline data) are collected by policies of varying quality across clients. To overcome this limitation, this paper introduces a new vote-based offline FRL framework, named FOVA. It uses a vote mechanism to identify high-return actions during local policy evaluation, alleviating the negative effect of low-quality behaviors from diverse local learning policies. In addition, building on advantage-weighted regression (AWR), we construct consistent local and global training objectives, significantly enhancing the efficiency and stability of FOVA. Further, we conduct an extensive theoretical analysis and rigorously show that the policy learned by FOVA enjoys strict policy improvement over the behavioral policy. Extensive experiments corroborate the significant performance gains of our proposed algorithm over existing baselines on widely used benchmarks.
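
For readers unfamiliar with advantage-weighted regression, which the abstract names as the basis of the training objectives, the following is a minimal sketch of the generic AWR weighting idea only. It is not FOVA's objective and does not model the vote mechanism, neither of which is specified in this record. The function names, the temperature `beta`, and the clipping value `max_weight` are illustrative assumptions.

```python
import math

def awr_weights(advantages, beta=1.0, max_weight=20.0):
    """Exponentiated-advantage weights exp(A / beta), capped at max_weight.

    beta and max_weight are assumed hyperparameters for illustration.
    The exponent is capped before calling exp to avoid overflow.
    """
    cap = math.log(max_weight)
    return [math.exp(min(a / beta, cap)) for a in advantages]

def awr_policy_loss(log_probs, advantages, beta=1.0, max_weight=20.0):
    """Weighted negative log-likelihood: -mean(w_i * log pi(a_i | s_i)).

    Actions with higher estimated advantage receive larger weights, so the
    policy is regressed more strongly toward them.
    """
    weights = awr_weights(advantages, beta, max_weight)
    total = sum(w * lp for w, lp in zip(weights, log_probs))
    return -total / len(log_probs)
```

In this sketch, a logged action with a high estimated advantage contributes more to the loss than one with a low advantage, which is the general intuition behind down-weighting low-quality behaviors in a dataset.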

PUBLICATION RECORD

Publication year: 2025
Venue: IEEE Transactions on Networking
Publication date: 2025-12-02
Fields of study: Computer Science
Identifiers: DOI 10.1109/TON.2025.3637043; arXiv 2512.02350
Source metadata: Semantic Scholar

REFERENCES

Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments
2024, cited by this paper
Federated Offline Policy Optimization with Dual Regularization
2024, influential reference
Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data
2024, cited by this paper
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
2024, influential reference
NetLLM: Adapting Large Language Models for Networking
2024, cited by this paper
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning
2024, cited by this paper
Federated Offline Reinforcement Learning With Multimodal Data
2024, cited by this paper
Federated Offline Reinforcement Learning with Proximal Policy Evaluation
2024, cited by this paper
Multi-Dimensional QoS Evaluation and Optimization of Mobile Edge Computing for IoT: A Survey
2024, cited by this paper
Distributed Deep Reinforcement Learning-Based Gradient Quantization for Federated Learning Enabled Vehicle Edge Computing
2024, cited by this paper
PoPeC: PAoI-Centric Task Offloading With Priority Over Unreliable Channels
2023, cited by this paper
Federated Temporal Difference Learning with Linear Function Approximation under Environmental Heterogeneity
2023, cited by this paper
MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator
2023, cited by this paper
Path Planning Technique for Mobile Robots: A Review
2023, cited by this paper
Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL
2023, cited by this paper
DFRD: Data-Free Robustness Distillation for Heterogeneous Federated Learning
2023, cited by this paper
FedCache: A Knowledge Cache-Driven Federated Learning Architecture for Personalized Edge Intelligence
2023, cited by this paper
On UAV Serving Node Deployment for Temporary Coverage in Forest Environment: A Hierarchical Deep Reinforcement Learning Approach
2023, cited by this paper
Distributed Offline Policy Optimization Over Batch Data
2023, cited by this paper
AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration
2023, cited by this paper
Client Selection for Federated Policy Optimization with Environment Heterogeneity
2023, cited by this paper
The Blessing of Heterogeneity in Federated Q-learning: Linear Speedup and Beyond
2023, cited by this paper
Federated Ensemble-Directed Offline Reinforcement Learning
2023, influential reference
FedFormer: Contextual Federation with Attention in Reinforcement Learning
2022, cited by this paper
Federated Offline Reinforcement Learning for Autonomous Systems
2022, influential reference
FedKL: Tackling Data Heterogeneity in Federated Reinforcement Learning by Penalizing KL Divergence
2022, cited by this paper
Federated Offline Reinforcement Learning
2022, cited by this paper
Level-5 Autonomous Driving—Are We There Yet? A Review of Research Literature
2022, cited by this paper
Model-Based Offline Meta-Reinforcement Learning with Regularization
2022, cited by this paper
A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems
2022, cited by this paper
Federated Reinforcement Learning with Environment Heterogeneity
2022, cited by this paper
Mildly Conservative Q-Learning for Offline Reinforcement Learning
2022, cited by this paper
Deep learning, reinforcement learning, and world models
2022, cited by this paper
Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
2021, cited by this paper
COMBO: Conservative Offline Model-Based Policy Optimization
2021, influential reference
Federated Reinforcement Learning Acceleration Method for Precise Control of Multiple Devices
2021, cited by this paper
A Minimalist Approach to Offline Reinforcement Learning
2021, cited by this paper
Kernel Continual Learning
2021, cited by this paper
Federated Reinforcement Learning: Techniques, Applications, and Open Challenges
2021, cited by this paper
Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning
2021, cited by this paper
Offline Reinforcement Learning with Implicit Q-Learning
2021, cited by this paper
Federated Reinforcement Learning for Training Control Policies on Multiple IoT Devices
2020, cited by this paper
Deep Reinforcement Learning for Autonomous Driving: A Survey
2020, cited by this paper
When Deep Reinforcement Learning Meets Federated Learning: Intelligent Multitimescale Resource Management for Multiaccess Edge Computing in 5G Ultradense Network
2020, cited by this paper
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
2020, cited by this paper
Federated Deep Reinforcement Learning for Internet of Things With Decentralized Cooperative Edge Caching
2020, cited by this paper
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
2020, cited by this paper
Communication-Efficient Federated Learning for Wireless Edge Intelligence in IoT
2020, cited by this paper
Conservative Q-Learning for Offline Reinforcement Learning
2020, influential reference
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
2019, cited by this paper
China
2019, cited by this paper
Federated Transfer Reinforcement Learning for Autonomous Driving
2019, cited by this paper
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
2019, influential reference
Edge Computing for Autonomous Driving: Opportunities and Challenges
2019, cited by this paper
Reinforcement Learning: Theory and Algorithms
2019, cited by this paper
Federated Optimization in Heterogeneous Networks
2018, cited by this paper
Deep Learning Based Caching for Self-Driving Cars in Multi-Access Edge Computing
2018, cited by this paper
Device vs Edge Computing for Mobile Services: Delay-Aware Decision Making to Minimize Power Consumption
2017, cited by this paper
Constrained Policy Optimization
2017, cited by this paper
Communication-Efficient Learning of Deep Networks from Decentralized Data
2016, cited by this paper
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
2016, cited by this paper
Notes on Kullback-Leibler Divergence and Likelihood
2014, cited by this paper
MuJoCo: A physics engine for model-based control
2012, cited by this paper
The edge of intelligence
2009, cited by this paper
REGAL: A Regularization based Algorithm for Reinforcement Learning in Weakly Communicating MDPs
2009, cited by this paper
Near-optimal Regret Bounds for Reinforcement Learning
2008, cited by this paper
Fitted Q-iteration by Advantage Weighted Regression
2008, cited by this paper
Reinforcement learning by reward-weighted regression for operational space control
2007, cited by this paper
Reinforcement Learning: An Introduction
1998, cited by this paper
Reinforcement Learning: A Survey
1996, cited by this paper

CITED BY

FORLER: Federated Offline Reinforcement Learning with Q-Ensemble and Actor Rectification
2026, cites this paper