Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents

K. Zhang, Zhuoran Yang, Han Liu, Tong Zhang, T. Başar

Published 2018 in International Conference on Machine Learning

ABSTRACT

We consider the problem of fully decentralized multi-agent reinforcement learning (MARL), where the agents are located at the nodes of a time-varying communication network. Specifically, we assume that the reward functions of the agents might correspond to different tasks, and are only known to the corresponding agent. Moreover, each agent makes individual decisions based on both the information observed locally and the messages received from its neighbors over the network. Within this setting, the collective goal of the agents is to maximize the globally averaged return over the network through exchanging information with their neighbors. To this end, we propose two decentralized actor-critic algorithms with function approximation, which are applicable to large-scale MARL problems where both the number of states and the number of agents are massively large. Under the decentralized structure, the actor step is performed individually by each agent with no need to infer the policies of others. For the critic step, we propose a consensus update via communication over the network. Our algorithms are fully incremental and can be implemented in an online fashion. Convergence analyses of the algorithms are provided when the value functions are approximated within the class of linear functions. Extensive simulation results with both linear and nonlinear function approximations are presented to validate the proposed algorithms. Our work appears to be the first study of fully decentralized MARL algorithms for networked agents with function approximation, with provable convergence guarantees.
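
The critic step described in the abstract (a local update followed by a consensus update over the network, with a linear value function) might be sketched as follows. This is an illustrative sketch only, not the paper's algorithm: the average-reward TD(0) form of the local step, the function names, the fixed three-agent line graph, and the Metropolis-style weight matrix are all assumptions made for the example.

```python
def local_td_step(w, phi, phi_next, reward, avg_reward, step):
    """One local TD(0) step for a linear critic q = w . phi.

    Assumption: an average-reward TD error; the abstract does not give the formula.
    """
    q = sum(wi * pi for wi, pi in zip(w, phi))
    q_next = sum(wi * pi for wi, pi in zip(w, phi_next))
    td_error = reward - avg_reward + q_next - q
    return [wi + step * td_error * pi for wi, pi in zip(w, phi)]


def consensus_step(params, weights):
    """Replace each agent's parameters with a weighted average over agents.

    params[i] is agent i's parameter vector; weights[i][j] is the weight agent i
    places on agent j, and is zero when j is not a neighbor of i.
    """
    n, d = len(params), len(params[0])
    return [
        [sum(weights[i][j] * params[j][k] for j in range(n)) for k in range(d)]
        for i in range(n)
    ]


def disagreement(params):
    """Largest coordinate-wise distance of any agent from the network mean."""
    n, d = len(params), len(params[0])
    mean = [sum(p[k] for p in params) / n for k in range(d)]
    return max(abs(p[k] - mean[k]) for p in params for k in range(d))


# Hypothetical network: agents 0 - 1 - 2 on a line, doubly stochastic weights.
W = [
    [2 / 3, 1 / 3, 0.0],
    [1 / 3, 1 / 3, 1 / 3],
    [0.0, 1 / 3, 2 / 3],
]

# Each agent holds its own critic parameters after a local step.
critic_params = [[0.0, 3.0], [3.0, 0.0], [6.0, 3.0]]
mixed = consensus_step(critic_params, W)

print(disagreement(critic_params), disagreement(mixed))
```

Because the assumed weight matrix is doubly stochastic, one consensus step leaves the network-wide average of the parameters unchanged while pulling the agents' estimates closer together. No agent needs another agent's reward, only its neighbors' parameter vectors.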

PUBLICATION RECORD

Publication year
2018
Venue
International Conference on Machine Learning
Publication date
2018-02-23
Fields of study
Mathematics, Computer Science
Identifiers
arXiv 1802.08757
Source metadata
Semantic Scholar

REFERENCES

The Reactor: A Sample-Efficient Actor-Critic Architecture
2017cited by this paper
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
2017cited by this paper
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
2017cited by this paper
Distral: Robust multitask reinforcement learning
2017cited by this paper
Deep Reinforcement Learning: An Overview
2017cited by this paper
Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning
2017cited by this paper
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
2017cited by this paper
Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability
2017cited by this paper
Mastering the game of Go without human knowledge
2017cited by this paper
Cooperative Multi-agent Control Using Deep Reinforcement Learning
2017influential reference
Counterfactual Multi-Agent Policy Gradients
2017cited by this paper
Diff-DAC: Distributed Actor-Critic for Multitask Deep Reinforcement Learning
2017cited by this paper
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
2017influential reference
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
2016cited by this paper
Safe and Efficient Off-Policy Reinforcement Learning
2016cited by this paper
Mastering the game of Go with deep neural networks and tree search
2016cited by this paper
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
2016cited by this paper
Nonlinear Gossip
2016cited by this paper
Achieving Geometric Convergence for Distributed Optimization Over Time-Varying Graphs
2016cited by this paper
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving
2016cited by this paper
Sample Efficient Actor-Critic with Experience Replay
2016cited by this paper
Asynchronous Methods for Deep Reinforcement Learning
2016cited by this paper
An Actor-Critic Algorithm for Sequence Prediction
2016cited by this paper
Decentralized Q-Learning for Stochastic Teams and Games
2015cited by this paper
Deep Learning
2015cited by this paper
Continuous control with deep reinforcement learning
2015cited by this paper
Policy evaluation with temporal differences: a survey and comparison
2015influential reference
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
2015cited by this paper
High-Dimensional Continuous Control Using Generalized Advantage Estimation
2015cited by this paper
Human-level control through deep reinforcement learning
2015cited by this paper
Cooperative Optimal Control for Multi-Agent Systems on Directed Graph Topologies
2014cited by this paper
Actor-Critic Algorithms for Learning Nash Equilibria in N-player General-Sum Games
2014cited by this paper
Deterministic Policy Gradient Algorithms
2014cited by this paper
Cooperative Control of Multi-Agent Systems: Optimal and Adaptive Design Approaches
2013influential reference
Reinforcement learning in robotics: A survey
2013cited by this paper
Diffusion Strategies Outperform Consensus Strategies for Distributed Estimation Over Adaptive Networks
2012cited by this paper
Performance of a Distributed Stochastic Approximation Algorithm
2012cited by this paper
Distributed Optimal Power Flow for Smart Microgrids
2012influential reference
QD-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations
2012influential reference
Achieving Controllability of Electric Loads
2011cited by this paper
Distributed delayed stochastic optimization
2011cited by this paper
Diffusion Adaptation Strategies for Distributed Optimization and Learning Over Networks
2011influential reference
Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers
2011influential reference
Cooperative Convex Optimization in Networked Systems: Augmented Lagrangian Algorithms With Directed Gossip Communication
2010cited by this paper
An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
2010influential reference
Broadcast Gossip Algorithms for Consensus
2009influential reference
A Convergent Online Single Time Scale Actor Critic Algorithm
2009cited by this paper
Distributed Subgradient Methods for Multi-Agent Optimization
2009influential reference
Natural actor-critic algorithms
2009influential reference
Stochastic Approximation: A Dynamical Systems Viewpoint
2008influential reference
Diffusion recursive least-squares for distributed estimation over adaptive networks
2008cited by this paper
Distributed Stochastic Subgradient Projection Algorithms for Convex Optimization
2008cited by this paper
Multi-task reinforcement learning: a hierarchical Bayesian approach
2007cited by this paper
Incremental Natural Actor-Critic Algorithms
2007cited by this paper
Distributed Consensus Algorithms in Sensor Networks: Quantized Data and Random Link Failures
2007cited by this paper
Randomized gossip algorithms
2006influential reference
Collaborative Multiagent Reinforcement Learning by Payoff Propagation
2006cited by this paper
Stability of Stochastic Approximation under Verifiable Conditions
2005cited by this paper
A scheme for robust distributed sensor fusion based on average consensus
2005cited by this paper
Distributed optimization in sensor networks
2004cited by this paper
Information flow and cooperative control of vehicle formations
2004influential reference
Reinforcement Learning with the Off-Policy Actor-Critic Algorithm
2004influential reference
Stochastic Approximation and Recursive Algorithms and Applications
2003cited by this paper
Multi-Agent Reinforcement Learning: a critical survey
2003cited by this paper
Natural Actor-Critic
2003cited by this paper
Networked Robots: Flying Robot Navigation using a Sensor Net
2003influential reference
Nash Q-Learning for General-Sum Stochastic Games
2003cited by this paper
Coverage control for mobile sensing networks
2002influential reference
Reinforcement learning of coordination in cooperative multi-agent systems
2002cited by this paper
Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games
2002cited by this paper
Coordinated Reinforcement Learning
2002cited by this paper
A COOPERATIVE MULTI-AGENT TRANSPORTATION MANAGEMENT AND ROUTE GUIDANCE SYSTEM
2002cited by this paper
Value-function reinforcement learning in Markov games
2001cited by this paper
Learning Algorithms for Markov Decision Processes with Average Cost
2001cited by this paper
A Natural Policy Gradient
2001cited by this paper
The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
2000cited by this paper
An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems
2000influential reference
Direct gradient-based reinforcement learning
2000cited by this paper
Policy Gradient Methods for Reinforcement Learning with Function Approximation
1999influential reference
Actor-Critic Algorithms
1999influential reference
Dynamics of stochastic approximation algorithms
1999cited by this paper
Average cost temporal-difference learning
1999influential reference
Natural Gradient Works Efficiently in Learning
1998cited by this paper
Reinforcement Learning: An Introduction
1998cited by this paper
An Analysis of Temporal-Difference Learning with Function Approximation
1998cited by this paper
An analysis of temporal-difference learning with function approximation
1997cited by this paper
Planning, Learning and Coordination in Multiagent Decision Processes
1996cited by this paper
Markov Games as a Framework for Multi-Agent Reinforcement Learning
1994cited by this paper
Markov Decision Processes: Discrete Stochastic Dynamic Programming
1994cited by this paper
Applications of a Kushner and Clark lemma to general classes of stochastic algorithms
1984cited by this paper
Distributed Asynchronous Deterministic and Stochastic Gradient Optimization Algorithms
1984influential reference
Stochastic approximation methods for constrained and unconstrained systems
1978influential reference
Discrete Parameter Martingales
1975influential reference
Natural Gradient Works Efficiently in Learning
year unknowncited by this paper

CITED BY

IMAGINE: Intelligent Multi-Agent Godot-based Indoor Networked Exploration
2026cites this paper
Intelligent Energy Efficiency and Service Reliability Optimization for UAV-Aided Terrestrial Networks
2026cites this paper
Consensus-Based Distributed Reinforcement Learning With Primal–Dual Update for Networked Microgrids On-Line Coordination
2026influential citation
A Unified Framework for Locality in Scalable MARL
2026cites this paper
Resilient Multi-Agent Reinforcement Learning for Tiered Mixed Autonomy
2026cites this paper
Intelligent Semantic Communication Scheme Integrating ISAC for Low-Altitude Intelligent Networks
2026cites this paper
Pareto-Aware Dual-Preference Optimization for Task-Oriented Dialogue
2026cites this paper
Adaptive Policy Switching for Multi-Agent ASVs in Multi-Objective Aquatic Cleaning Environments
2026cites this paper
User Scheduling and Trajectory Design for Heterogeneous UAV Communication Networks With CNN-Assisted DRL
2026cites this paper
Fully-Decentralized MADDPG with Networked Agents
2025influential citation
Networked Communication for Decentralised Cooperative Agents in Mean-Field Control
2025cites this paper
A Novel Perspective of Energy Management Strategies on Multistack Fuel Cell Hybrid Electric Vehicles: Trends and Challenges
2025cites this paper
Spatial-temporal intention representation with multi-agent reinforcement learning for unmanned surface vehicles strategies learning in asset guarding task
2025cites this paper
Shooting Large-scale Traffic Engineering by Combining Deep Learning and Optimization Approach
2025cites this paper
Signal Whisperers: Enhancing Wireless Reception Using DRL-Guided Reflector Arrays
2025cites this paper
Byzantine-Resilient Decentralized Parallel Policy Gradient
2025cites this paper
Decentralized Counterfactual Multi-Agent Actor-Critic Algorithms
2025influential citation
Bayesian Ego-graph inference for Networked Multi-Agent Reinforcement Learning
2025influential citation
Energy-Aware MARL for Coordinated Data Collection in Multi-AUV Systems
2025cites this paper
SEROS: Shared exploration and reward optimization task scheduling strategy for multi-agent collaboration in edge computing networks
2025cites this paper
Actor-Critic Learning for Risk-Constrained Linear Quadratic Regulation
2025influential citation
Regret Lower Bounds for Decentralized Multi-Agent Stochastic Shortest Path Problems
2025cites this paper
Improving String Stability in Cooperative Adaptive Cruise Control Through Multiagent Reinforcement Learning With Potential-Driven Motivation
2025cites this paper
Multi-Agent Dynamically Networked and Decentralized Pursuit-Evasion
2025cites this paper
An Introduction to Reinforcement Learning Methods for Strategic Decisions in Competitive and Cooperative Multi-Agent Games
2025cites this paper
PrELIN: Provably Efficient Local-Information Networked Multi-Agent Reinforcement Learning
2025cites this paper
Signalling and social learning in swarms of robots
2025cites this paper
Provably Robust Federated Reinforcement Learning
2025cites this paper
Enhancing feeder bus service coverage with Multi-Agent Reinforcement Learning: A case study in Hong Kong
2025cites this paper
Exploring Communication in Multi-Agent Reinforcement Learning Under Agent Malfunction
2025cites this paper
Integrated blockchain and federated learning for the cybersecurity of distributed energy resources
2025cites this paper
Enabling Pareto-Stationarity Exploration in Multi-Objective Reinforcement Learning: A Multi-Objective Weighted-Chebyshev Actor-Critic Approach
2025cites this paper
A Decentralized Actor–Critic Algorithm With Entropy Regularization and Its Finite-Time Analysis
2025cites this paper
Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methods for Decentralized Multi-Agent Reinforcement Learning
2025influential citation
Distributed policy evaluation over multi-agent network with communication delays
2025cites this paper
Policy Optimization in Multi-Agent Settings Under Partially Observable Environments
2025influential citation
H-NeiFi: Non-Invasive and Consensus-Efficient Multi-Agent Opinion Guidance
2025cites this paper
Adaptability in Multi-Agent Reinforcement Learning: A Framework and Unified Review
2025influential citation
Structure-Exploiting Reinforcement Learning for Networked Systems
2025cites this paper
Model-Free Q-Learning for Output Feedback Nash Strategy of Decentralized Nonzero-Sum Games
2025cites this paper
Federated Multi-Agent Reinforcement Learning for Privacy-Preserving and Energy-Aware Resource Management in 6G Edge Networks
2025cites this paper
Task Offloading in Vehicular Edge Computing using Deep Reinforcement Learning: A Survey
2025cites this paper
Game-Theoretic Understandings of Multi-Agent Systems with Multiple Objectives
2025cites this paper
A Communication-Efficient Decentralized Actor-Critic Algorithm
2025cites this paper
O-DQR: A Multi-Agent Deep Reinforcement Learning for Multihop Routing in Overlay Networks
2025cites this paper
Decentralized Intelligence for Centralized Control: Multi-Agent Reinforcement Learning for SD-WAN
2025cites this paper
Community-based Multi-Agent Reinforcement Learning with Transfer and Active Exploration
2025influential citation
Nash Equilibrium-Driven Adaptive Behavior in Swarm Intelligence with Self-Organizing Maps
2025cites this paper
Multi-Agent Formation Navigation Using Diffusion-Based Trajectory Generation
2025cites this paper
Risk-Aware Multi-Agent Reinforcement Learning for Cooperative Decision Making
2025cites this paper
Foundation models and intelligent decision-making: Progress, challenges, and perspectives
2025cites this paper
Taming Byzantine Adversaries in Decentralized Multi-Agent Reinforcement Learning
2025cites this paper
Distributed Neural Policy Gradient Algorithm for Global Convergence of Networked Multiagent Reinforcement Learning
2025cites this paper
Distributed Policy Evaluation with Local Updates over Time-Varying Communication Network
2025cites this paper
MARFT: Multi-Agent Reinforcement Fine-Tuning
2025cites this paper
Multiagent Reinforcement Learning for Constrained Markov Decision Processes by Consensus-Based Primal–Dual Method
2025influential citation
Collaborative Swarm Robotics for Sustainable Environment Monitoring and Exploration: Emerging Trends and Research Progress
2025cites this paper
Intelligent Offloading in Vehicular Edge Computing: A Comprehensive Review of Deep Reinforcement Learning Approaches and Architectures
2025cites this paper
Cooperative Multi-Agent Planning with Adaptive Skill Synthesis
2025cites this paper
JLOS: A Cooperative UAV-Based Optical Wireless Communication With Multi-Agent Reinforcement Learning
2025cites this paper
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning
2025cites this paper
Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning
2025cites this paper
Multi-Agent AMIX-DAPG of Dual-Arm Robot for Long Horizon Lifecare Tasks
2025cites this paper
Learning Closed-Loop Parametric Nash Equilibria of Multi-Agent Collaborative Field Coverage
2025cites this paper
Multi-Agent Reinforcement Learning for Graph Discovery in D2D-Enabled Federated Learning
2025cites this paper
Solving Multi-Agent Safe Optimal Control with Distributed Epigraph Form MARL
2025cites this paper
Exploiting inter-agent coupling information for efficient reinforcement learning of cooperative LQR
2025cites this paper
Research on collaborative adversarial strategies for drone swarms based on deep reinforcement learning
2025cites this paper
Neighbor-Based Decentralized Training Strategies for Multi-Agent Reinforcement Learning
2025cites this paper
Multi-Agent Reinforcement Learning With Decentralized Distribution Correction
2025influential citation
A Scale-Independent Multi Agent Deep Reinforcement Learning Framework for Wireless Resource Allocation in UAV-Based Logistics Networks
2025cites this paper
Trust in Smart City Mobility Applications: A Multi-Agent System Perspective
2025cites this paper
FAuNO: Semi-Asynchronous Federated Reinforcement Learning Framework for Task Offloading in Edge Systems
2025cites this paper
Ego-centric Learning of Communicative World Models for Autonomous Driving
2025cites this paper
f-Divergence Policy Optimization in Fully Decentralized Cooperative MARL
2025cites this paper
Multistep Q-Learning-Based Optimal Consensus Control of Linear Discrete-Time Multiagent Systems
2025cites this paper
Distributed Multiagent Reinforcement Learning Approach for Multiserver Multiuser Task Offloading
2025cites this paper
CREW-WILDFIRE: Benchmarking Agentic Multi-Agent Collaborations at Scale
2025cites this paper
Towards safe control parameter tuning in distributed multi-agent systems
2025cites this paper
Deception in Game Theory and Control: A Tutorial
2025cites this paper
A Framework for Objective-Driven Dynamical Stochastic Fields
2025cites this paper
Factorizing value function with hierarchical residual Q-network in multi-agent reinforcement learning
2025cites this paper
Policy Consensus-Based Distributed Deterministic Multi-Agent Reinforcement Learning Over Directed Graphs
2025influential citation
Multi-Agent Cooperative Pursuit Algorithm for UGVs Based on MASAC
2025cites this paper
Learning Individual Potential-Based Rewards in Multiagent Reinforcement Learning
2025cites this paper
Learn to Schedule: Data Freshness-Oriented Intelligent Scheduling in Industrial IoT
2025influential citation
Finite-Time Analysis of Heterogeneous Federated Temporal Difference Learning
2025cites this paper
Wonder Wins Ways: Curiosity-Driven Exploration through Multi-Agent Contextual Calibration
2025cites this paper
Structured Cooperative Multi-Agent Reinforcement Learning: a Bayesian Network Perspective
2025cites this paper
MAT-Agent: Adaptive Multi-Agent Training Optimization
2025cites this paper
Signal attenuation enables scalable decentralized multi-agent reinforcement learning over networks
2025cites this paper
GRA: Graph-based Reward Aggregation for cooperative multi-agent reinforcement learning
2025cites this paper
Consensus-based Decentralized Multi-agent Reinforcement Learning for Random Access Network Optimization
2025cites this paper
An Efficient Approach for Cooperative Multi-Agent Learning Problems
2024cites this paper
How Collective Intelligence Emerges in a Crowd of People Through Learned Division of Labor: A Case Study
2024cites this paper
An entity graph with spatial correlation for networked multi-agent reinforcement learning
2024cites this paper
Hierarchical Policy Optimization for Cooperative Multi-Agent Reinforcement Learning
2024cites this paper
An Advantage-based Optimization Method for Reinforcement Learning in Large Action Space
2024cites this paper
Achieving collective welfare in multi-agent reinforcement learning via suggestion sharing
2024cites this paper
Matrix-Scaled Consensus on Switching Networks
2024cites this paper