Galton Boards Everywhere
Summing hidden branches beats textbook odds at explaining the world’s smoothness
I loved looking at visitor stats for my educational website. For two straight decades, the start of each school year brought successively higher peaks, and more eyes on it inspired further improvement. By the second Monday of August, I could line up the numbers with that same Monday last year and predict the size of the first September spike to within 1%. Zoom out and the same eerie regularity was there at every scale. Wednesdays matched Wednesdays, and year after year October swelled into the same familiar hump.
The world has a billion students going online to do their homework, and many of them find my site. How could I guess how many would arrive between 8 and 9pm on some Thursday a week away, to within a percent? That doesn’t seem possible.
Textbook probability handwaves this away. “Law of large numbers. Central limit theorem. It all averages out.” That names the phenomenon but doesn’t explain it. Why should it average out? Why do a billion independent procrastination decisions conspire to draw a smooth hill on my analytics dashboard instead of jagged noise?
First, we need to understand why the normal distribution is that shape. It’s not a parabola or other familiar curve. The equation to draw it is pretty complicated. Here it is in its simplest form, centered at 0 with a standard deviation of 1.
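$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}$$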
Pretty impenetrable for a result that appears everywhere. Why isn’t it a triangle with its point in the middle? Why that particular bulge?
It’s not obvious at first glance because deep down, the bell curve is a summation result. In calculus, we learned to approximate an area by chopping it into thin bars and adding them up. In the limit, those sums become an integral. Usually that makes the integral feel like the “real” object and the sums like rough approximations. This time, it’s reversed. The smooth normal curve is what you get after you’ve already averaged over a huge number of discrete cases. The real work is being done by simple counts of how many discrete histories land in each slot.
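For a ball that bounces n times, the chance of landing in the slot reached by exactly k rightward bounces is just that count, scaled by the probability of any one specific path:

$$P(\text{slot } k) = \binom{n}{k}\left(\frac{1}{2}\right)^{n}$$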
That’s it. No special functions. No infinite limits. Just “how many distinct sequences of lefts and rights end here?” times ½ for each decision.
You’ve seen the Galton board: balls dropped through a pegboard, bouncing back and forth, and ultimately landing in the shape of a normal distribution. I’ve never seen its importance explained properly. Like all those visual Pythagorean “proofs” they toss at us, it’s sold as yet another trick for producing the same shape. “Look, the normal distribution appears in nature!” and they move on. That undersells it, because the board is the true physical implementation of what creates the curve.
The key thing to notice is that the balls were all dropped from the same midpoint. They wanted to fall straight down, but hit obstacles over and over again. At each peg, they had another chance to shift left or right. The bell shape isn’t because randomness likes the middle. It’s because there are more distinct paths that land in the middle. You can go left-then-right or right-then-left and end up in the same slot. You can only go left-then-left to end up on the left edge.
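If you want to see the path count doing the work, here’s a quick Python sketch; the board size and the number of drops are arbitrary choices on my part.

```python
import math
import random
from collections import Counter

n = 10  # rows of pegs -- an arbitrary board size

# Count the distinct left/right paths into each slot: one row of Pascal's triangle.
paths = [math.comb(n, k) for k in range(n + 1)]
print("paths per slot:", paths)   # the edges get 1 path each; the middle slot gets 252

# Now drop actual balls and watch the same shape emerge, with noise on top.
drops = Counter(sum(random.random() < 0.5 for _ in range(n)) for _ in range(100_000))
print("balls per slot:", [drops[k] for k in range(n + 1)])
```

The first line of output is just a row of Pascal’s triangle; the second is the same bulge recovered one random ball at a time.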
What made it click for me was to stop thinking about the balls and start thinking about the board as a tiny many‑worlds generator. Each peg is a fork where reality could have gone one of two ways. Each full path from top to bottom is a “micro‑world” in which the ball happened to bounce left then right then right then left. When you look at the bins at the bottom, you’re not seeing one run. You’re seeing the summed‑together shadow of all those micro‑worlds. The smoothness of the curve is not one universe behaving nicely. It’s what you get when you add up everything that could have happened. My web stats were doing the same thing.
A student doesn’t schedule themselves to visit the website at exactly 8:03pm. Their plan is a fuzzy “sometime before bed.” Between dinner and midnight, their evening hits pegs. Practice runs long. A friend calls. A sibling hogs the computer. A game finishes early. A parent yells from down the hall. Each tiny interruption or early finish nudges their homework time forward or back by a few minutes.
Each student’s day is a swarm of these could‑have‑beens. Why is there a clean hump at 8–9pm instead of noise? For the same reason the Galton board bulges in the middle. There are many, many more sequences of nudges that land you near the midpoint. The middle is thick because there are more ways to be moderately delayed or advanced than there are to be delayed by every single bump (far right) or to dodge them all (far left).
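Here’s the student version of the same pileup, with numbers I made up on the spot: everyone aims for roughly 8:30, then gets knocked around by a dozen small interruptions or early finishes.

```python
import random
from collections import Counter

def visit_minute():
    """One student's evening: a fuzzy plan plus a dozen small, random nudges."""
    planned = 150                                  # "around 8:30 pm", in minutes after 6 pm
    nudges = sum(random.choice([-10, 10]) for _ in range(12))
    return planned + nudges

# Tally one school night's worth of students into hourly bins.
arrivals = Counter(visit_minute() // 60 for _ in range(100_000))
for hour in sorted(arrivals):
    print(f"{6 + hour:>2}-{7 + hour} pm: {arrivals[hour]:6d}")
```

Most of the simulated visits stack up in the 8–9pm bin, not because anyone chose it, but because that’s where the most sequences of nudges end.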
From the site owner’s perspective, that combinatorial pileup is what I was seeing. I wasn’t predicting individual people. I was summing over the geometry of possible evenings. Once you have enough students, the “sum over universes” version and the “law of large numbers” version become the same. But only one of them feels like it answers why the curve is smooth down to the hour.
The same geometry of paths is why October kept winning the traffic contest. Somewhere in the fall, almost every chemistry class has to march through the periodic table. Some schools hit it in late September, others in November, and that chapter leans on earlier topics while later topics lean on it. Assemblies, sick days, snow days, surprise quizzes in other subjects—every scheduling bump shoves that unit forward or backward. Across thousands of classrooms, those nudges smear out the exact date but don’t erase the effect. They pile into one broad, reliable October peak, flowing through an unseen Galton board.
So far we’ve only used the Galton board in the forward direction. Start from a single point, sprinkle random nudges, and watch the fan‑out of possibilities pile into a bell curve. A lot of real problems don’t feel like that. They feel backwards. You’re already in a very specific situation and you’re asking “how likely is it that X caused this?” or “how surprised should I be by this pattern?”
That’s Bayes’ rule territory. The Galton board gives you a way to feel Bayes instead of just memorizing it. Forward, you ask “If a hypothesis is true, what pattern of evidence would I typically see?” That’s dropping balls from the top and watching which bins light up. Backward, you fix a particular bin at the bottom and say “I’m in this bin. What paths up the board could have led here?”
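In symbols it’s the formula you’ve seen before; in board terms, the numerator counts the paths that run through hypothesis H and reach your bin, and the denominator counts every path that reaches your bin, whatever it ran through:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$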
This is especially important when your “evidence” is your own existence. Imagine an alien civilization that sends out probes. It’s unlikely that a probe’s effect on observers like us is neutral: either it wipes out life, or it grabs life and folds it into some tightly controlled network. If the probes work as intended and one shows up here, the number of humans like you left sitting around goes to approximately zero.
Naively, you might think that if these probes are incredibly reliable, then any probe you see is probably working. That’s urn logic. If 99.999% of the balls in the urn are functional probes and you draw one at random, you should assume you drew a functional one. The Galton board tells a different story when you run it backwards.
Picture the space of possible futures of Earth as a board. At some pegs, civilizations launch things. At others, probes arrive. In a huge fraction of branches where a probe reaches a planet like ours, it does what it was designed to do. Those branches snip off almost all of the paths that lead to “humans pondering alien hardware while reading this.”
Now start from your actual bin on the far, far edge of that board. Run your attention back up the board and prune everything that doesn’t contain you. Any branch where the probe successfully sterilized or assimilated humanity fails that test. Those universes don’t have you in them. What’s left are the incredibly rare failures. Among those surviving branches, almost every probe you see is broken, because in the branches where they weren’t, you aren’t around to observe them.
The forward question is “Given the probe’s design, how often do probes work?” while the backward question is “Given that I’m a normal observer still alive to look at a probe, how often did this one work?” Those are not the same, and your intuition can easily conflate them without a picture like this.
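You can run the two questions side by side with a few invented numbers. Nothing below is a claim about real probes; it’s only meant to show how conditioning on your own survival flips the answer.

```python
# Invented numbers, purely to illustrate the backward pruning.
p_works = 0.99999               # forward view: probes almost never fail
p_alive_if_works = 1e-9         # made-up: a working probe almost never leaves observers behind
p_alive_if_broken = 1.0         # a broken probe leaves you around to stare at it

# Forward question: draw a probe from the urn and ask if it works.
print(f"urn logic, P(works):               {p_works}")

# Backward question: condition on the one thing you actually know -- you're still here.
posterior = (p_works * p_alive_if_works) / (
    p_works * p_alive_if_works + (1 - p_works) * p_alive_if_broken
)
print(f"backward, P(works | you're alive): {posterior:.6f}")
```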
The same backward-reasoning mistake shows up at smaller scales. Suppose you write a post about a famous product. It gets traction. The actual creator emails you and asks for your address so they can send you something. You remember that “half of people online who ask for your address are scammers” and feel a jolt of fear.
That rule of thumb comes from a different situation: strangers, out of the blue, asking random people for addresses. That’s urn thinking. You imagine an urn full of “address requests,” half good and half bad, and assume this one was drawn from it. Flipping back to urn logic is like turning the board upside down and pretending only the last peg matters.
On your actual Galton board, something happened first. You publicly wrote about this person in front of a lot of people. That peg funnels a huge number of paths into “the real creator sees it and replies.” Very few paths route through “a scammer, in character, coincidentally shows up right after.” Proper backward pruning says start from the full bin of “I wrote about X and then X emailed me about that post” and delete histories that don’t contain that whole pattern. Once you do that, most of the “random scammer” worlds were never on your board in the first place.
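Same arithmetic, different pegs. Again the numbers are invented; the point is only how hard the “you wrote about them first” peg squeezes the scammer branches.

```python
# Invented numbers again -- only the structure matters.
p_scam = 0.5                    # the generic "half of address requests are scams" prior
p_pattern_if_scam = 1e-4        # made-up: a scammer impersonating this creator, right after your post
p_pattern_if_real = 0.05        # made-up: the real creator noticing the post and reaching out

posterior_scam = (p_scam * p_pattern_if_scam) / (
    p_scam * p_pattern_if_scam + (1 - p_scam) * p_pattern_if_real
)
print(f"P(scam | the whole pattern): {posterior_scam:.3f}")   # ~0.002, not 0.5
```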
If all you ever do is roll dice and flip coins, the forward view is enough. “What are the odds of getting 7 heads in 10 flips?” is a question you can answer by pure combinatorics or a closed‑form formula. You don’t have to think about paths or conditioning on being you. But most of the questions we actually argue about are not of that type. “Why are my web stats so smooth?” “Should we expect to see functional alien probes?” “Is this email suspicious given what happened yesterday?” All of those are really about summing over lots of possible micro‑histories, and pruning based on what you already know.
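For the record, that forward answer really is just the path count in closed form:

$$P(7 \text{ heads in } 10 \text{ flips}) = \binom{10}{7}\left(\frac{1}{2}\right)^{10} = \frac{120}{1024} \approx 11.7\%$$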
You could argue this is reinventing the wheel with extra steps and imaginary universes, and you’d be right. You can do all this without ever drawing a pegboard, but for me, that never stuck. “It averages out” is meaningless until you can see what’s doing the averaging. “Update on evidence” doesn’t help unless you see which histories get deleted and why.
All of life is shadows on a board you cannot see. Forward, paths fan out through pegs that nudge you earlier or later, taller or shorter, richer or poorer. Heights, exam scores, and resting heart rates all take the same shape because there are far more micro‑stories that land you near the middle than micro‑stories that push you all the way into the extremes.
Backward, you stand in one bin, erase every history that doesn’t pass through it, and ask what’s left. The middle is crowded because there are more ways to get there. There are Galton boards everywhere for those with the eyes to see. Once you start noticing them, probability stops looking like a bag of tricks and starts looking like what it really is: counting paths, then being honest about which ones could have led you here.
Tomorrow, how to navigate the Galton board of your life when you struggle to see the good branches.
