AGI Beefs

Every few months there's a new round of arguing about AGI. We talk about "timelines" as if the thing is definitely coming and the only open question is when, while still not really agreeing on what it would even mean if it arrived. The most recent flare-up is Demis Hassabis vs Yann LeCun. LeCun says general intelligence is "complete BS": human cognition is specialized for the physical world, and our sense of generality is an illusion produced by being unable to imagine what we're blind to. Hassabis disagrees, saying brains are "approximate Turing Machines" capable of learning anything computable, and points to chess and mathematics and civilization as the evidence.

I think they're both wrong, and not in the boring "truth is in the middle" way. They're having the wrong argument entirely. Both of them treat intelligence as a function, something that takes inputs and produces outputs and can be measured on benchmarks and in principle copied between substrates and scaled up. They disagree about whether the function is "general" or "specialized" but they agree on what kind of thing they're talking about, and that's the part I want to push on.

I get why the function frame is appealing. It's clean: you can talk about intelligence in the abstract, compare systems on leaderboards, imagine porting minds between substrates the way you'd port a binary. But I don't think intelligence is a function at all. It's a process, a continuous history of a particular lump of matter, entangled with its environment in ways that don't survive abstraction.

So when people say "we'll just copy the weights", I want to push back, because the intelligence isn't in the state; it's in the ongoing process that produced the state and keeps modifying it. You clone the weights and what you get is a snapshot. The intelligence was the training run, the continuous adaptation, the being-in-the-world (or being-in-whatever-environment).

Which doesn't mean software can't ever be intelligent. Software that runs continuously, learns in real time, and is coupled to an environment where something is genuinely at stake might count. But that's not what we're building. What we're building are frozen artifacts: the weights get fixed after training, there's no continuous learning happening, no real-time entanglement with anything, nothing at stake for the model in any sense that matters.
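To make the frozen-versus-continuous distinction a bit more concrete, here's a toy sketch. It is only an illustration of the mechanics, not evidence for anything about intelligence, and every name in it (`make_stream`, `run`, the one-weight "model") is made up for the example. Two learners fit a single slope with plain SGD. Both train on the first half of a data stream, then one is frozen while the other keeps updating, and the environment changes halfway through.

```python
import random


def make_stream(n, seed=0):
    """Yield (x, y) pairs; the true slope flips from +2 to -2 halfway through."""
    rng = random.Random(seed)
    for t in range(n):
        slope = 2.0 if t < n // 2 else -2.0
        x = rng.uniform(-1.0, 1.0)
        yield x, slope * x


def run(n=2000, lr=0.1, seed=0):
    """Return mean squared error of (frozen, online) learners after the shift."""
    frozen_w = 0.0  # updated only during the first half, then fixed
    online_w = 0.0  # keeps updating for the whole stream
    frozen_err = 0.0
    online_err = 0.0
    for t, (x, y) in enumerate(make_stream(n, seed)):
        in_training = t < n // 2
        if not in_training:
            # Score each learner on the new point before anyone sees the label.
            frozen_err += (frozen_w * x - y) ** 2
            online_err += (online_w * x - y) ** 2
        # One SGD step on squared error for the learner that never stops.
        online_w += lr * (y - online_w * x) * x
        if in_training:
            # The frozen learner gets the same updates, but only until the cutoff.
            frozen_w += lr * (y - frozen_w * x) * x
    evaluated = n - n // 2
    return frozen_err / evaluated, online_err / evaluated


if __name__ == "__main__":
    frozen, online = run()
    print(f"frozen learner MSE after shift: {frozen:.3f}")
    print(f"online learner MSE after shift: {online:.3f}")
```

Under these assumptions the frozen learner keeps predicting the old slope forever, so its error after the shift stays large, while the online learner recovers within a few hundred steps. That's all the sketch shows: a snapshot is only as good as the world it was taken in.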

I like to think of current AI systems as cognitive optics, lenses and mirrors that reflect and refract and amplify human cognition. They see the way telescopes see, which is to say they don't; they just bend light that's already there. What the complaint that "AIs reproduce our biases" points at isn't a bug to be patched out. It's the whole thing they do, the only thing they could do. The training data is human, the reward signal is human, the prompts are human; the whole apparatus is a hall of mirrors pointed back at us, sometimes usefully warped, sometimes just flatteringly so.

What this means is that LLMs are doing compression rather than explanation. The weights encode statistical regularities (which patterns co-occur in training data, which sequences follow which), and that's a different thing from understanding why anything works. ARC (the Abstraction and Reasoning Corpus) is the test case I keep coming back to here, because it was designed specifically to resist memorization and demand genuine abstraction, and LLMs fail at it, I think, precisely because compression isn't the same as understanding. The scaling thesis says more parameters, more data, more compute, and eventually intelligence will fall out. I think that's a bet rather than an engineering certainty, and I think the bet is wrong. Better optics are still optics.

Obvious objection at this point: if I want continuous learning and environmental coupling, fine, just simulate it. Build a digital environment with entropy and evolution and death, evolve agents inside it, let them have stakes. Fair enough. If you build software that actually runs continuously, that actually adapts, that has something on the line (even something simulated), that's at least a different kind of thing from what we're building now, and whether simulated stakes are enough to ground real intelligence is a genuinely open question to me. But it's also worth saying that nobody is seriously trying this. The research program we have is to scale pattern-matching on static datasets and hope generality falls out.

So to be clear, I'm not saying AGI is impossible. I'm saying the current approach won't get there. Frozen artifacts can be staggeringly capable inside their training distribution, but I don't think they can be genuinely general, however impressive the demos look. Could software be general? Probably. But what it would look like is something quite different from what we're building: something that learns continuously, that's coupled to an environment in real time, and that has actual stakes. And we don't really know how to build any of those things yet, let alone all three at once.