Improving Real-World Applicability of Networked Mean-Field Games using Function Approximation and Empirical Mean-Field Estimation
Keywords: mean-field games, networked communication, real-world applicability, multi-agent systems, function approximation, deep learning
TL;DR: Networked communication has recently been introduced to mean-field games to improve their real-world applicability; we make further progress towards real-world scalability by additionally introducing deep learning and empirical mean-field estimation.
Abstract: The mean-field game (MFG) framework can be used to approximate the solutions of games involving very large populations of agents, which is useful in real-world applications where other multi-agent algorithms struggle to scale. Recent algorithms allow decentralised agents, possibly connected via a communication network, to learn equilibria in MFGs from a single, non-episodic run of the empirical system. Although this reflects real-world situations better than prior MFG work, these recent approaches are given only for tabular settings. This limits the size of the state/action spaces that are computationally feasible, and also means the algorithms cannot generalise beyond policies that depend only on the agent’s local state to so-called ‘population-dependent’ policies. We address these limitations for real-world applicability by introducing function approximation to the existing setting, drawing on the Munchausen Online Mirror Descent method that has previously been employed only in finite-horizon, episodic, centralised settings. While this permits us to include the population’s mean-field distribution in the observation for players’ policies, it is unrealistic to assume that decentralised agents would have access to this global information: we therefore additionally provide new algorithms that allow agents to locally estimate the global empirical distribution, and to improve this estimate via inter-agent communication. We prove theoretically that exchanging policy information helps networked agents outperform both independent and even centralised agents in function-approximation settings. Our experiments demonstrate this empirically, and show that the communication network allows decentralised agents to estimate the mean field for population-dependent policies.
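To make the idea of locally estimating the empirical mean field through communication more concrete, the following is a minimal illustrative sketch. It is not the algorithm proposed in this submission. It swaps in standard consensus (gossip) averaging over a fixed ring network as a stand-in. The function names, the ring topology, the equal mixing weights, and the finite state space are all assumptions made for illustration only.

```python
import numpy as np


def empirical_mean_field(states, num_states):
    """Global empirical distribution over states (needs a central view of all agents)."""
    counts = np.bincount(states, minlength=num_states)
    return counts / len(states)


def ring_mixing_matrix(num_agents):
    """Hypothetical communication network: each agent averages equally
    with itself and its two neighbours on a ring."""
    W = np.zeros((num_agents, num_agents))
    for i in range(num_agents):
        for j in (i - 1, i, i + 1):
            W[i, j % num_agents] += 1.0 / 3.0
    return W


def gossip_estimates(states, num_states, W, rounds):
    """Each agent starts from a one-hot vector of its own state (purely local
    information) and repeatedly averages its estimate with its neighbours'."""
    estimates = np.eye(num_states)[states]  # shape: (num_agents, num_states)
    for _ in range(rounds):
        estimates = W @ estimates  # one communication round
    return estimates


# Usage: 10 agents, 4 states, 200 communication rounds (all values assumed).
rng = np.random.default_rng(0)
states = rng.integers(0, 4, size=10)
W = ring_mixing_matrix(10)
local = gossip_estimates(states, 4, W, rounds=200)
target = empirical_mean_field(states, 4)
print(np.abs(local - target).max())  # largest gap between any agent's estimate and the true mean field
```

Because the assumed mixing matrix is symmetric and doubly stochastic on a connected network, every agent's estimate converges to the true empirical distribution as the number of communication rounds grows. With few rounds, each agent holds only a rough local approximation.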
Submission Number: 19