Field Notes · v0.1 · prototype
An interactive field guide

Field notes on a thinking grid

How a colony of two-line-of-code ants can find food through a maze — and why, surprisingly, the intelligence lives in the floor and not the feet.

James Robert Somers · log entry 01 · inspired by Peter Ashwell
Two colonies, one grid. The movement, the fighting, the territory — every bit of it is arithmetic on the floor.

In the autumn of 2011, my friend Peter Ashwell and I entered Google's AI Challenge. The game was Ants — write a program that controlled a colony fighting other colonies over food, territory, and hills, on a map it had never seen, with half a second per turn to think. We weren't serious about winning. We were just having fun, and I was lucky enough to be learning out loud next to someone with an unusual range — the kind of person who, in a single conversation, would move from ant pheromone trails to statistical mechanics to the right way to structure a Python module, and make all of it feel like one subject.

The approach that follows is his. I've wanted to write about it for fourteen years, partly because it's beautiful on its own terms, and partly because working through it with Peter was one of the most enjoyable collaborations I've had. This is my attempt to pass it on.

Section one · the field

A single piece of food, and its rumour

A note before we start. If you've read about ants and algorithms before, you may have met the pheromone-trail kind — ants deposit marks, the marks fade, other ants follow them. This isn't that. Here the ants don't write anything. The food does. See the history section at the end if you want to know how the two traditions relate.

Start with almost nothing. A small grid of empty cells. Every cell holds one number — call it an energy, though rumour is closer to what it does. One cell — the food — is pinned at 1.0. The rest start at zero.

Now the rule. Once per turn, every non-food cell looks at its four neighbours, takes the average of how much they exceed it, and updates itself by a small fraction of that difference. In code it's one line; on paper it's a claim about gossip. "If my neighbours, on average, know something I don't, I should know a little of it too." The food cell never listens. It only broadcasts.
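Here is that rule as a minimal sketch in Python, assuming a numpy grid; the rate constant and the edge handling are this sketch's choices, not Peter's:

```python
import numpy as np

def diffuse(field, food_mask, rate=0.25):
    """One tick of the gossip rule: every non-food cell moves a small
    fraction of the way toward the mean of its four neighbours.
    `rate` and the edge handling are this sketch's choices, not Peter's."""
    padded = np.pad(field, 1, mode="edge")   # edge cells borrow their own value
    neighbour_mean = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                      padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
    field = field + rate * (neighbour_mean - field)
    field[food_mask] = 1.0                   # the food never listens; it only broadcasts
    return field
```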

Press step. Watch what happens.

1.1   one source, no obstacles, optional exploration layer
Click any cell to plant or remove a food source.

That's the whole trick, in miniature. The food doesn't send a signal. It doesn't know anyone's there. It just sits at 1.0 while, through nothing but arithmetic, the cells around it come to hold weaker copies of it. Two steps away, a fraction. Ten steps away, fainter. A hundred steps, a whisper.

Imagine an ant anywhere on this grid. It doesn't need a map or a plan. It looks at its four neighbours, picks the largest, and walks there. Every turn. The ant ends up at the food — not because it searched, but because the grid told it which way was warmer.
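The ant itself really does fit in two lines. A sketch, ignoring ties and board edges for clarity:

```python
def step_ant(field, r, c):
    """The whole ant: read the four neighbours, walk to the largest.
    A sketch; ties and board edges are left out for clarity."""
    neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return max(neighbours, key=lambda rc: field[rc])
```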

Now toggle explore on. A second field, in gold, fills in everywhere the food field isn't. Every unvisited cell is, in its small way, attractive too. Two fields, same grid, neither aware of the other. An ant standing on this grid doesn't have to choose between food and exploration — it just adds the two values at each neighbouring cell and walks to the largest. We'll come back to that in §4. First: what happens when we give the grid something to work around?


Section two · walls

Obstacles don't block the rumour. They reroute it.

Add a wall. A line of cells that refuse to participate — they don't hold energy, they don't pass it on. The rumour has to go around.

Nothing in the rule changed. Every non-wall cell still averages its non-wall neighbours. But because the wall is silent, the energy flows the long way: through the gap, over the top.
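The change to the sketch from §1 is a mask, nothing more. A plain-loop version for legibility; `wall_mask` is this example's name:

```python
def diffuse_with_walls(field, food_mask, wall_mask, rate=0.25):
    """Same rule, but wall cells neither hold energy nor pass it on.
    A plain-loop sketch over numpy arrays; slow but legible."""
    h, w = field.shape
    new = field.copy()
    for r in range(h):
        for c in range(w):
            if wall_mask[r, c] or food_mask[r, c]:
                continue                       # walls are silent; food is pinned below
            ns = [field[nr, nc]
                  for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                  if 0 <= nr < h and 0 <= nc < w and not wall_mask[nr, nc]]
            if ns:                             # a cell boxed in by walls stays put
                new[r, c] += rate * (sum(ns) / len(ns) - field[r, c])
    new[food_mask] = 1.0
    return new
```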

Plant food (or a wall) with the tools, then run the diffusion.

2.1   the gradient finds the opening on its own
Toggle the gradient overlay once the field has settled. Each arrow points to the neighbour with the highest energy — the way an ant at that cell would walk. Every arrow together is the colony's plan, and no one drew it.

There's no pathfinder here. No search. Every cell runs the same four-line rule, in parallel, every tick — and the shortest passable route emerges as the direction of steepest ascent. It costs the same to compute whether you have one ant on the board or a thousand. A* runs a fresh search for every ant; diffusion is one pass over the cells per tick, full stop, no matter how many ants read it. The ants are free.

And the ants themselves

Same field, ants on top. Eight of them, each obeying the same two-line rule: look at the four neighbours, pick the biggest, walk there. The gradient does the navigation; the ants are just little arrows pointed at the steepest ascent. They're drawn in pencil here — less because the simulation demands it than because that's how a field guide looks.

2.2   eight ants walking the gradient you just drew
Same field, same rule. The ants are a way to see the gradient.

Section three · the zoom

What the rule actually does, in one cell

Before we stack anything on top, it's worth slowing down for one cell. The exhibit below is a 5 × 5 zoom — small enough that we can look at the numbers directly. The food sits in the middle. The rest of the cells have run the diffusion enough times to settle into a stable shape.

Hover any non-edge cell. The four arrows that appear show its neighbours contributing their differences to the update. The formula on the right fills in with the actual values. That's the whole step.

3.1   hover a cell to see the arithmetic
new = 0.96 × ( c + mean( N−c, E−c, S−c, W−c ) )
Hover with a mouse, tap-hold on touch. Edge cells are skipped — they have fewer than four neighbours, so the mean works slightly differently.

A few things to notice. The differences can be negative. If a neighbour has less energy than the center, its contribution pulls the cell's value down. That's why, even as the food's rumour spreads outward, the energy at any given cell stops rising once it's in equilibrium with the cells around it. The field isn't a wave. It's a pressure.
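To make the equilibrium concrete, here is one cell's update with made-up numbers, where the positive and negative pulls cancel exactly:

```python
c = 0.40                                            # the hovered cell (values made up)
diffs = [0.55 - c, 0.50 - c, 0.35 - c, 0.20 - c]    # two positive, two negative
mean_diff = sum(diffs) / 4                          # exactly 0.0: the pulls cancel
new = 0.96 * (c + mean_diff)                        # 0.384; the 0.96 is the exhibit's damping factor
```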

The other thing. The rule is exactly the same for every cell. There's no special logic for "cells near the food" or "cells near the edge" or "cells on the diagonal." There's only the four-neighbour average, applied uniformly. All the structure you saw in §1 and §2 — the bloom, the gradient, the flow around walls — came out of that one local operation, run over and over, in parallel.

Which raises the question: what happens if we run two of these operations on the same grid at once?


Section four · stacking

Two fields, one grid, and an ant that adds

Food isn't the only thing an ant cares about. It also wants to cover ground it hasn't seen — because that's where food it doesn't yet know about might be. So we run a second diffusion, on the same grid, with a different source pattern: every cell that hasn't been looked at emits a little attraction of its own. Unexplored territory becomes, on the margin, inviting.
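A sketch of that second field, on the same machinery as §1; the floor value and the `visited` grid are this example's names, not Peter's:

```python
import numpy as np

def explore_step(explore, visited, floor=0.1, rate=0.25):
    """The second field, same machinery: every unvisited cell is pinned
    at a small constant and broadcasts it. `floor` is an illustrative
    value; `visited` is a boolean grid of seen cells."""
    padded = np.pad(explore, 1, mode="edge")
    neighbour_mean = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                      padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
    explore = explore + rate * (neighbour_mean - explore)
    explore[~visited] = floor    # unexplored territory is itself a weak source
    return explore
```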

Below, the same twelve-by-eighteen grid is shown three times. The top panel shows the food field alone. The middle shows the explore field alone. The bottom shows both fields overlaid — and, crucially, an ant's decision at any given cell is based on the sum of the two values at each of its neighbours. Click to plant food. Pull the slider to rebalance the ant's preferences between the two fields.

4.1   two diffusions on one grid — and an ant walking the sum
food · teal, source at the pellets
explore · amber, low floor on every unvisited cell
sum · what the ant actually sees — click to plant food
The slider rescales the two fields before they're summed. Slide left to make the ants myopically food-driven; slide right to make them restless explorers. Nothing in the diffusion changed — only how much the ant cares about each layer.

This is the move Peter was building toward. A single ant's behaviour — "walk to the neighbour with the highest value" — never changed. What changed is what the value is at each neighbour. By putting a cheap linear combination between the grid and the ant, you get a whole new kind of ant. Slide the weight toward food: greedy, beelines toward pellets, ignores unmapped territory. Slide the other way: a scout, fans out into the dark, barely notices food under its feet.
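As a sketch, that linear combination is one small function sitting between the grid and the ant; the parallel-list layout is this example's choice:

```python
def step_ant_weighted(fields, weights, r, c):
    """Same two-line ant, but the value at each neighbour is a weighted
    sum of layers. `fields` and `weights` are parallel lists; the names
    are this sketch's, not Peter's."""
    def value(rc):
        return sum(w * f[rc] for w, f in zip(weights, fields))
    neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return max(neighbours, key=value)

# e.g. step_ant_weighted([food, explore], [0.8, 0.2], r, c)
```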

In Peter's actual code, there were more layers than two. A hill protection layer that repelled enemy ants from your home. An enemy attraction layer that pulled attackers toward their hill. A friendly spread layer that pushed ants away from each other so they covered ground. Each one a separate grid running the same four-line rule, each with its own weight. The ants themselves were never more than "look at the neighbours, add up the layers, walk to the biggest sum." And that was enough.

next Turning these weights into personalities — each ant carrying its own vector, its own version of the sum.

Section five · personality

Different weights, different kinds of ant

The slider in §4 moved every ant at once — a colony of identical preferences, drifting between greedy and curious in lockstep. But that's not how Peter's ants worked. Each ant carried its own pair of weights. Some were hungry — a high wFood — and they beelined for any pellet they'd heard of. Others were restless — high wExplore — and they fanned out into the dark. They all ran the same four-line decision rule. What differed was how much each of them cared about each layer.

Nothing on the grid changed. The diffusion stepped the same way. The decision — "look at the four neighbours, pick the highest" — ran the same way. The only change: the sum each ant computed used its own personal weights. wFood · foodField + wExplore · exploreField — same arithmetic, different constants per ant.

5.1   eight ants, eight weight vectors, one grid
Each ant draws its own weights once at spawn, from the colony means with noise scaled by the jitter slider. At jitter = 0 the colony is homogeneous — like §4. Turn jitter up and individuals drift apart: a greedy one here, a scout there, all sampled from the same distribution.
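That spawn step, sketched; the means, the jitter scale, and the Gaussian noise are all illustrative choices:

```python
import random
from dataclasses import dataclass

@dataclass
class Ant:
    r: int
    c: int
    w_food: float
    w_explore: float

def spawn(r, c, mean_food=1.0, mean_explore=1.0, jitter=0.3):
    """Each ant draws its weights once at spawn: the colony means plus
    Gaussian noise scaled by `jitter`. At jitter = 0 every ant is
    identical. All constants here are illustrative, not Peter's."""
    return Ant(r, c,
               w_food=random.gauss(mean_food, jitter),
               w_explore=random.gauss(mean_explore, jitter))
```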

Once you have this move, you get role diversity for free. In Peter's real bot there were hill protectors, foragers, and soldiers — all the same eight lines of decision logic, running with different weight profiles. Adding a new role didn't mean writing new behaviour. It meant naming a new set of dials.


Section six · danger

A source that repels — and an ant that routes around it

Everything so far has been about attraction. Food broadcasts, the field blooms, ants climb the gradient toward warmer cells. What about something an ant should avoid?

In Peter's bot there was an enemy layer — a field that identified hostile ants and broadcast them as sources, the same way food did. Except the weight an ant applied to this field was negative. When summing the neighbours, enemy energy subtracted. The ant still picked the neighbour with the highest sum; the high-enemy cells just looked, to that sum, like pits.
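As a sketch, the whole change is one extra term with a minus sign on its weight; the numbers are illustrative:

```python
def value(rc, food, enemy, w_food=1.0, w_enemy=-2.0):
    """What one cell looks like to this ant. The enemy term enters with
    a negative coefficient, so high-enemy cells read as pits.
    Weights are illustrative, not Peter's numbers."""
    return w_food * food[rc] + w_enemy * enemy[rc]
```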

Below, one ant, one food pellet, and one stationary enemy sitting near the straight-line path between them. The food field is teal, as before. The enemy field is in sienna. Press play and watch the ant navigate.

6.1   one ant, one pellet, one hazard — routing by sum
Toggle the enemy field off to see what the ant is reacting to. With no sienna visible the detour looks like caution; turning it back on resolves the mystery in one glance.

Three things are worth noticing. First, nothing about the ant changed. Same two lines — look at the four neighbours, pick the biggest sum, walk there. The contents of the sum got one more term, with a minus sign in front of it. That's it.

Second, the enemy field diffuses the same way food does. Sienna doesn't need a special "danger propagation" routine. It's just a source on the same grid machinery, read with a negative coefficient.

Third — and this is the move that makes the framework earn its keep — the weight is a dial. Drag the slider down toward zero and the ant stops caring; it marches straight into the enemy. Drag it up and the ant becomes timid, taking a wide berth. Same field, same ant, different parameter. A brave forager and a cautious scout are the same program with different numbers.

next Two colonies on the same map. Everything you've seen, scaled up — and combat that falls out of the arithmetic without anyone writing combat logic.

Section seven · two colonies

Two sides, one grid, and a fight nobody programmed

Everything up to here has been one colony. Now put two on the map, each with its own hill, each with its own copy of the weighted-sum rule. What happens when they meet?

The setup: a larger grid, a red hill on the left, a blue hill on the right, food pellets scattered between them. A single ant starts at each hill. When one of your ants walks onto a pellet, it eats it — the pellet vanishes and a new ant hatches at your hill. Food respawns elsewhere on the map, so the supply stays roughly constant. A colony that collects food faster grows faster.

Each ant evaluates four terms when it picks its next cell — attract toward food, attract toward the enemy hill, a force-ratio term that pulls back when it's locally outnumbered (and does nothing when it isn't, so the attack-the-enemy-hill pull takes over as local numbers catch up), and a spread-out term so the colony covers ground rather than clumping. Same two-line ant as every previous exhibit. The only thing that's different is the sign and size of each term.
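Spelled out as a sketch, one neighbour's score under those four terms; the field names, signs, and constants here are a reading of the description above, not Peter's code:

```python
def score(rc, food, enemy_hill, retreat, crowding,
          w_food=1.0, w_hill=0.5, w_ratio=2.0, w_spread=0.3):
    """One neighbour's value under the four-term sum. All names and
    constants are illustrative guesses, not Peter's code."""
    return (w_food   * food[rc]          # pull toward food
          + w_hill   * enemy_hill[rc]    # pull toward the enemy hill
          - w_ratio  * retreat[rc]       # nonzero only where locally outnumbered
          - w_spread * crowding[rc])     # friendly density reads as repulsive
```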

Combat resolves on its own. An ant dies if it's outnumbered at close range — specifically, if any enemy in range has fewer enemies around them than the ant has around itself. Nothing in the movement logic knows about combat; nothing in the combat logic knows about fields. They share a grid, and the interesting behaviour lives in the interference pattern.
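The resolution itself fits in one line. A sketch; both helpers are assumed, and treating ties as survival is this sketch's choice:

```python
def dies(ant, enemies_in_range, enemies_around):
    """The one-line combat rule described above: an ant dies if any
    enemy in range has fewer enemies around them than the ant has
    around itself. `enemies_in_range` and `enemies_around` are assumed
    helpers; tie handling is a choice this sketch makes."""
    return any(enemies_around(e) < enemies_around(ant)
               for e in enemies_in_range(ant))
```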

7.1   two hills, two colonies, one emerging fight
Both colonies share the same four dials, so the match is symmetric by default. Drag one dial at a time and watch the style of play change — a low force-ratio makes ants oblivious to numbers (careless advances, pointless deaths); a very high spread-out makes them drift lonely into enemy territory.

A few things are worth noticing. First, numerical advantage wins fights — an ant that's outnumbered at close range dies, so a colony that gets locally concentrated first can afford to close and the other can't. That's not a tactic anyone wrote; it's the combat rule meeting the force-ratio term. Second, fronts stabilize in a way that isn't a plan — each side's ratio term pulls its ants back when the nearby enemy clump is bigger than its own, and pushes them forward when the balance flips. Third, razes happen through the flanks: the attacker that gets through is the one with the clearest gradient toward the enemy hill and the fewest enemies in the way. That's just the arithmetic.

Peter's actual bot had a few more fields than this — a dedicated hill-defence term, a friendly-spread term tuned independently from the enemy-avoidance term, a food-ownership signal to avoid two ants chasing the same pellet. Adding any of those is the same move: one more field on the same grid, one more term in the sum. The framework keeps paying off.

next What a whole game of this looks like at steady state. No tuning, no controls, just the machinery doing what machinery does.

Section eight · full colony

What the whole machine does when left alone

Everything so far has been an exhibit — one move at a time, isolated for clarity. The bot Peter actually wrote ran this machinery at scale, for four or five hundred turns a game, against opponents doing the same thing in slightly different shapes. What follows is approximately that. Bigger grid, more ants, food that respawns, no dials you can touch. Just watch it run.

Everything on the screen comes from the same small parts you've already seen. The same four-line diffusion on each field. The same two-line ant that adds its neighbours up and walks to the highest sum. The same one-line combat resolution. No plans, no state machines, no pathfinder, no tactics. The scouting, the fronts, the flanks, the occasional raze — all of it is parameters and arithmetic, nothing else.

8.1   the whole thing, running
When a hill falls the game continues for a short coda, then the board resets and another one starts. Leave it on — every match looks different; every match looks like a match.

Every match is different. Every match looks the same. That's what I think is beautiful about the trick, and why I've wanted to write it down for fourteen years. Peter built this in a few hundred lines of Python in the autumn of 2011, and everything the ants do — the cooperation that isn't cooperation, the defence that isn't defence, the roles that aren't roles — lives in the grid underneath them, the field that knows the answers before anyone walks anywhere.


Appendix · history

Where this technique came from

The idea of treating a goal as a source that radiates into surrounding space, and an obstacle as something that repels, goes back at least to Oussama Khatib's 1986 paper on robot arm collision avoidance. His "artificial potential field" was a workaround for a practical problem — how do you stop a Puma 560 from swinging into a table — and it worked because the gradient descent felt, to the arm, like rolling downhill into a valley.

Twenty years later, at the University of Colorado, Alexander Repenning reframed the same trick as an answer to an object-oriented programming question. In his 2006 OOPSLA paper Collaborative Diffusion: Programming Antiobjects, he asked: why should a Pac-Man ghost run its own pathfinder every frame? Instead, let the maze tiles compute their distance to Pac-Man, in parallel, and let the ghost just read off the nearest neighbour. The computation moves from the chaser to the ground it walks on. Repenning showed the same technique could run a whole soccer team.

In June 2011, Alex Champandard published "The Core Mechanics of Influence Mapping" on AiGameDev.com — a concise summary of how this family of techniques, by then well established in commercial game AI from RTS bots to shooters like Killzone, actually works under the hood. Four months later, Google opened the Ants Challenge. The forum filled with people trying to port the idea to a grid of three hundred ants. Peter's version is the cleanest I've seen.

A note on stigmergy

There's a neighbouring technique that's easy to confuse with this one. In stigmergy — the approach behind Ant Colony Optimization, Dorigo's 1992 work, and most of the ant simulators you'll find online — ants lay down pheromone as they walk, and the pheromone evaporates over time. Trails emerge from positive feedback: a path that's walked more accumulates more pheromone, which attracts more ants. The ants write the environment.

The technique in this piece is different. Here the ants don't write anything. The food does. Sources broadcast into a static field, the field diffuses, ants read the gradient. If you've been reading about ants and algorithms for a while and you're thinking "isn't this just ACO?" — it isn't. The vocabulary overlaps but the machinery runs the opposite direction. The two techniques can be combined, and in commercial game AI often are, but this piece is strictly about the diffusion-from-static-source version.
