The Eternal Sloptember
Source: Hacker News
Introduction
I’m calling it now: the adoption of AI agents into software development will be one of the most costly mistakes in the field’s history. Agents cannot program, and it’s taking longer and longer to realize that they can’t. They are highly sophisticated statistical models designed to mimic the distribution of programming. The output is broken, but in a way that’s getting harder to detect—exactly what you’d expect from an increasingly accurate statistical model.
Personal Experiments with AI Agents
At first I rejected this. I bought into the Twitter explanation of status anxiety: I do define some of my self‑worth by my programming ability, so it made sense that I would get defensive about a perceived loss. I wondered whether I was simply denying that the models could code, for as long as I could, to preserve my ego.
It’s clear that they can solve math problems I couldn’t hope to solve even if I devoted my life to them. So why can’t they program? Maybe I’m just not a good enough programmer to recognize their “genius.”
I really tried for the last six months:
- I wrote some parts of tinygrad with agents: test/mockgpu/amd/emu.py
- I reversed a USB ↔ PCIe chip with agents: asm2464pd-firmware
Each time I suspected I could have done it better and faster manually. The agent front‑loads all the progress, then gives you a slot‑machine lever to pull in hopes of polishing the result. It never quite gets there.
I’ve tried all the different models, harnesses, and prompts. The “you are using it wrong” argument itself sounds like slot‑machine advice: you have to bet on multiple lines after a cherry, so no wonder you aren’t winning.
Utility and Limitations
I’m not saying AI isn’t useful—it clearly is. It’s a better Google for most searches, and for quick prototypes where polish isn’t required it’s absurdly fast. But it is not a software engineer at the level required by any company I’ve worked at. The key aspect is knowing when to use it and when not to.
Impact on Organizations and the Industry
I thought more about the self‑worth preservation angle. Tools like AFL have found more bugs than LLMs, and nobody felt threatened by them. Chess and Go are more popular than ever. I cannot wait until I have armies of robot associates I can trust to clean up my code! I don’t fear loss of status; I suspect this is a psy‑op to sell agents. Fear of loss is one of the only ways to make big companies move, and in that fear they are making a big mistake.
Agents will end up hurting large organizations more than high‑performing individuals or small teams. In my observations over the past six months, high‑performing people still error‑correct and can spot “slop” in AI‑generated code. They spend time exploring, exploiting, and tuning the outer loops—when to use agents, when to trust them, how to integrate their output. I haven’t seen them move to a model where they blindly accept every line, except in very confined domains.
Large organizations, however, have slower feedback loops and less alignment. Bottom performers lack the self‑check and can produce ten times more output with agents. The average output of such organizations, and ultimately of the world, will be flooded with low‑quality code. More code, more apps, and more features will be generated, creating a golden era of “buckets of slop” and a dark age for gems of quality.
Future Outlook
Apple is reportedly pushing AI on all its engineers. When people think abstractly, they assume AI will handle everything, but concrete examples matter. Will macOS get better or worse in the next two years?
When people see an artifact, they assume the creator had a human state of mind. That assumption no longer holds. AI‑produced artifacts can break in ways that weren’t previously possible, rendering old proxies of quality—syntax, grammar—useless. The subtle statistical differences become obvious when you try to interact with and build on the artifact in human ways.
I’m now aligned with the LeCun/Marcus camp on LLMs: I don’t think models like these will ever be able to program in the true sense, because the process matters. Deep learning remains a solution, but real programming agents will need world models, not the current RLVR (reinforcement learning with verifiable rewards) approach that merely comments out failing tests and declares them passing.
Conclusion
The real story of this era will be who manages to avoid harming themselves in their AI psychosis.