Formation Game
theory developed by Máté Badó Miklós
A mean-field formalization of a multi-agent formation task. Read the math, then run agents on a 30×30 grid under any of four coupling variants and watch them self-organise into a target pattern from random initial positions.
1. The setup
Many agents occupy locations on a grid. The goal is for the population to take on a particular spatial shape (a ring, a letter, a target density). Each agent only pays attention to local payoffs and moves at a small cost. The basic question: when can such a population organise into a target pattern under purely local, decentralised updates?
The key modelling trick is to stop tracking individual agents and instead track their empirical distribution:
$$\mu^N_t = \frac{1}{N}\sum_{i=1}^{N}\delta_{x_i(t)},$$
where $x_i(t)$ is the location of agent $i$. In the large-population limit $N \to \infty$, $\mu^N_t \to m_t$, with $m_t$ a density on the spatial domain $\Omega$. The desired shape is a target density $\rho$, and the formation objective is to make $m_t$ approximate $\rho$.
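On a grid, the empirical distribution is just the fraction of agents in each cell. The following is a minimal sketch of that bookkeeping, assuming agents are stored as integer (row, col) pairs; the function name `empirical_distribution` and the array layout are illustrative choices, not part of the original theory.

```python
import numpy as np

def empirical_distribution(positions, grid_shape):
    """Return the fraction of agents in each grid cell.

    positions:  integer array of shape (N, 2), one (row, col) pair per agent
                (an assumed storage layout).
    grid_shape: (rows, cols) of the grid.
    """
    positions = np.asarray(positions)
    counts = np.zeros(grid_shape, dtype=float)
    # np.add.at accumulates correctly when several agents share a cell.
    np.add.at(counts, (positions[:, 0], positions[:, 1]), 1.0)
    return counts / len(positions)

# Example: 200 agents placed uniformly at random on a 12x12 grid.
rng = np.random.default_rng(0)
positions = rng.integers(0, 12, size=(200, 2))
mu = empirical_distribution(positions, (12, 12))
```

The result sums to one by construction, so it can be compared cell by cell with a target density that is normalised the same way.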
2. Finite-agent formation game
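Let $\mathcal{G}$ be a finite grid. Each agent $i$ occupies a cell $x_i \in \mathcal{G}$. The occupation count of cell $y$ is $n(y) = \#\{i : x_i = y\}$. A natural formation error compares the population to the target:
$$E(n) = \sum_{y \in \mathcal{G}} \left(\frac{n(y)}{N} - \rho(y)\right)^2.$$
Suppose agent $i$ moves from $x$ to $y$, changing the counts from $n$ to $n'$. The marginal utility of the move is
$$\Delta u_i(x \to y) = E(n) - E(n') - c(x, y).$$
The first two terms are the global error reduction; $c(x, y) \ge 0$ is the movement cost. Defining $\Phi = -E$ gives a potential game: any unilateral payoff-improving move increases $\Phi$. The local best-response update is
$$x_i^{t+1} \in \arg\max_{y \in A_r(x_i^t)} \Delta u_i(x_i^t \to y),$$
where $A_r(x)$ is the local action set within perception/movement radius $r$.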
Below: agents on a 12×12 grid, moving one cell per tick under exactly this rule. The shading is a centred Gaussian target.
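The best-response rule of this section can be sketched in a few lines. This is an illustration under stated assumptions, not the simulator's actual code: the formation error is taken to be the squared distance between the empirical distribution and the target, the action set is a square window of the given radius, the movement cost is a small constant times the Manhattan distance, and agents update one at a time in random order. All function names are hypothetical.

```python
import numpy as np

def gaussian_target(shape, sigma=2.0):
    """Centred Gaussian target density, normalised to sum to 1."""
    rr, cc = np.indices(shape)
    r_mid, c_mid = (shape[0] - 1) / 2, (shape[1] - 1) / 2
    g = np.exp(-((rr - r_mid) ** 2 + (cc - c_mid) ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def occupation_counts(positions, shape):
    counts = np.zeros(shape)
    np.add.at(counts, (positions[:, 0], positions[:, 1]), 1.0)
    return counts

def formation_error(counts, target):
    """Assumed error: squared distance between empirical and target density."""
    return float(np.sum((counts / counts.sum() - target) ** 2))

def best_response_sweep(positions, target, radius=1, move_cost=1e-4, rng=None):
    """One asynchronous sweep; positions is updated in place."""
    rng = np.random.default_rng() if rng is None else rng
    rows, cols = target.shape
    counts = occupation_counts(positions, target.shape)
    for i in rng.permutation(len(positions)):
        r0, c0 = int(positions[i, 0]), int(positions[i, 1])
        base = formation_error(counts, target)
        best_gain, best_cell = 0.0, (r0, c0)  # staying put has zero gain
        for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
                if (r, c) == (r0, c0):
                    continue
                # Tentatively move the agent, score the move, then undo it.
                counts[r0, c0] -= 1
                counts[r, c] += 1
                cost = move_cost * (abs(r - r0) + abs(c - c0))
                gain = base - formation_error(counts, target) - cost
                counts[r0, c0] += 1
                counts[r, c] -= 1
                if gain > best_gain:
                    best_gain, best_cell = gain, (r, c)
        if best_cell != (r0, c0):
            counts[r0, c0] -= 1
            counts[best_cell] += 1
            positions[i] = best_cell
    return positions, formation_error(counts, target)

rng = np.random.default_rng(0)
target = gaussian_target((12, 12))
positions = rng.integers(0, 12, size=(100, 2))
initial_error = formation_error(occupation_counts(positions, target.shape), target)
errors = []
for _ in range(30):
    positions, err = best_response_sweep(positions, target, radius=1, rng=rng)
    errors.append(err)
```

Because a move is accepted only when its error reduction exceeds a non-negative movement cost, the recorded error can never increase from one sweep to the next, which is the potential-game property in action. Recomputing the full error for every candidate move is wasteful but keeps the sketch short; only two cells change per move, so an incremental update would be the natural optimisation.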
3. The large-population limit
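In the limit $N \to \infty$, individual identities don't matter; only the density does. Write a representative agent's state $X_t$ and control $\alpha_t$, with mean field $m_t$:
$$dX_t = \alpha_t\,dt + \sigma\,dW_t, \qquad m_t = \mathrm{Law}(X_t).$$
Each agent has an individual cost $J(\alpha; m) = \mathbb{E}\left[\int_0^T f(X_t, \alpha_t, m_t; \rho)\,dt + g(X_T, m_T; \rho)\right]$, where the running cost $f$ and the terminal cost $g$ take the target $\rho$ as a parameter.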
4. Mean-field control
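If agents are cooperative and share an objective, the limit is mean-field control: a central planner picks a feedback $\alpha(t, x)$ for the whole population, and the density evolves by Fokker–Planck:
$$\partial_t m = \frac{\sigma^2}{2}\Delta m - \nabla\cdot(m\,\alpha).$$
For target-density formation, a natural running cost is
$$f(x, \alpha, m; \rho) = \frac{1}{2}|\alpha|^2 + \frac{1}{2}\big((K * m)(x) - \rho(x)\big)^2,$$
where $K$ is a perception/smoothing kernel and $(K * m)(x) = \int_\Omega K(x - y)\,m(y)\,dy$. If $K$ is local, agents only see their neighbourhood, matching the local-information regime explored in the simulator below.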
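To make the role of a local kernel concrete, here is a small sketch of kernel smoothing on a grid. It assumes a normalised disk-shaped kernel and zero padding at the boundary; `disk_kernel`, `smooth`, and `local_mismatch` are hypothetical names, and the squared-difference mismatch is one plausible choice rather than a confirmed formula.

```python
import numpy as np

def disk_kernel(radius):
    """Normalised indicator of a disk of the given integer radius."""
    offs = np.arange(-radius, radius + 1)
    dr, dc = np.meshgrid(offs, offs, indexing="ij")
    k = (dr ** 2 + dc ** 2 <= radius ** 2).astype(float)
    return k / k.sum()

def smooth(density, kernel):
    """Kernel-smoothed density with zero padding (kernel is symmetric)."""
    rad = kernel.shape[0] // 2
    padded = np.pad(density, rad)
    rows, cols = density.shape
    out = np.zeros((rows, cols), dtype=float)
    # Sum shifted copies of the density, weighted by the kernel entries.
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out

def local_mismatch(density, target, kernel):
    """Assumed form: squared gap between the perceived density and the target."""
    return (smooth(density, kernel) - target) ** 2

# A single agent's worth of mass, as perceived through a radius-2 kernel.
density = np.zeros((12, 12))
density[5, 5] = 1.0
kernel = disk_kernel(2)
seen = smooth(density, kernel)
```

A point mass is perceived as a copy of the kernel centred on it, so shrinking the radius sharpens what each agent sees while narrowing how far it can see. With radius 0 the kernel is the identity and the perceived density equals the true one.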
5. Mean-field game (HJB–FP)
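If agents are non-cooperative, the limit is a mean-field game: each agent solves its own optimisation while the mean field is taken as given. An equilibrium satisfies the coupled Hamilton–Jacobi–Bellman / Fokker–Planck system:
$$\begin{aligned} -\partial_t u - \frac{\sigma^2}{2}\Delta u + H(x, \nabla u, m) &= 0, & u(T, x) &= g(x, m_T; \rho), \\ \partial_t m - \frac{\sigma^2}{2}\Delta m - \nabla\cdot\big(m\,\partial_p H(x, \nabla u, m)\big) &= 0, & m(0, x) &= m_0(x), \end{aligned}$$
where $u$ is the value function, $f$ and $g$ are the running and terminal costs, $H(x, p, m) = \sup_{a}\{-a \cdot p - f(x, a, m; \rho)\}$ is the Hamiltonian, and the optimal feedback is $\alpha^*(t, x) = -\partial_p H(x, \nabla u, m)$.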
The decentralised feedback used by each finite agent yields an $\varepsilon_N$-Nash equilibrium with $\varepsilon_N \to 0$ as $N \to \infty$. That's the game-theoretic justification for the mean-field limit: instead of solving an intractable $N$-agent strategic problem, solve a representative-agent fixed point and get approximately-Nash decentralised strategies for free.
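The Fokker–Planck half of the system is easy to step numerically once a feedback is fixed. The sketch below assumes the standard form $\partial_t m = \frac{\sigma^2}{2}\Delta m - \nabla\cdot(m\,\alpha)$ in one dimension on a periodic domain, and uses an explicit conservative scheme with upwind advection. The function name, the drift, and the step sizes are illustrative assumptions; solving the full coupled system would also need a backward HJB solve and a fixed-point iteration, which are not shown.

```python
import numpy as np

def fokker_planck_step(m, alpha, sigma, dx, dt):
    """One explicit step of dm/dt = (sigma^2/2) m_xx - (m alpha)_x, periodic in x."""
    m_right = np.roll(m, -1)                      # m at cell i+1
    a_face = 0.5 * (alpha + np.roll(alpha, -1))   # drift at the face i+1/2
    # Upwind advection: mass is taken from the cell the flow is leaving.
    advective = np.where(a_face > 0, a_face * m, a_face * m_right)
    diffusive = -0.5 * sigma ** 2 * (m_right - m) / dx
    flux = advective + diffusive                  # total flux through face i+1/2
    # Conservative update: what leaves one cell enters its neighbour.
    return m - dt / dx * (flux - np.roll(flux, 1))

n = 100
dx = 1.0 / n
x = np.arange(n) * dx
alpha = np.sin(2 * np.pi * x)   # assumed drift, pointing toward x = 0.5
m = np.ones(n)                  # uniform initial density, integrates to 1
for _ in range(500):
    m = fokker_planck_step(m, alpha, sigma=0.1, dx=dx, dt=0.002)
```

Writing the update in flux form means total mass is conserved up to rounding, mirroring the fact that agents are neither created nor destroyed. The time step here is chosen small enough that dt * (max|alpha| / dx + sigma^2 / dx^2) stays below 1, which keeps this explicit scheme stable and the density non-negative.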
6. Cost variants
Different formation tasks come from different choices of the running cost $f$ and the terminal cost $g$. The simulator below lets you swap between four of these.
§6.1 Target-density. Match a prescribed $\rho$ exactly.
§6.2 Congestion-avoiding. Move toward useful regions but punish crowding.
§6.3 Local information. Replace $K$ with a localised kernel $K_r$ whose support is a ball of radius $r$; this is directly the "perception radius" slider below.
§6.5 Coverage. Spread out to cover a domain weighted by importance $w(x)$.
§6.9 Minority / anti-coordination. Prefer underused cells; useful when $\rho$ is uniform or when you want soft target-following.
Simulator
The grid is 30×30. Pick a target shape, pick a cost variant, hit play, and watch the agents self-organise from random initial positions. The formation error in the readout is normalised so you can compare runs.
What to look for. Compare target-density against anti-coord on the ring pattern: both produce something ring-ish, but the anti-coord variant tolerates much fuzzier convergence. Try a small perception radius on the T pattern: agents get stuck in local optima because they can't see far enough to "find" the shape.
Based on Mean-Field Formalization of a Formation Game, theory developed by Máté Badó Miklós, June 2026. The simulator implements the finite-agent best-response rule from §2 with four of the coupling variants from §6. Perception-radius slider exposes the local-information regime (§6.3).