Last August, some of the best cybersecurity teams in the business gathered in Las Vegas for DARPA’s AI Cyber Challenge (AIxCC). The setup was simple enough: scan 54 million lines of real software code that had been injected with artificial flaws. The teams were supposed to find those planted bugs. And they did.
But here’s the thing that caught everyone off guard. Their automated tools went beyond the planted bugs. They found more than a dozen real vulnerabilities that DARPA hadn’t inserted at all. That’s not just a win for the contestants; it’s a wake-up call for anyone who thinks AI can be neatly contained.
Then came the real earthquake. Anthropic dropped Claude Mythos this month, and suddenly the conversation shifted from “AI can find bugs” to “AI can find bugs faster than any human team, and it’s getting better every week.” The numbers are staggering. Claude Mythos reportedly identifies vulnerabilities at a rate that makes traditional pen-testing look like a hobby. I’ve been in this space long enough to know that every few years someone claims a breakthrough, but this one feels different.
What worries me isn’t the capability itself. It’s who gets access. The term “script kiddie” has been around since the 90s: someone who uses existing tools without understanding them, often causing chaos by accident. Now imagine those same kids with Claude Mythos or something similar. They won’t need to understand buffer overflows or SQL injection. They’ll just ask the AI to find them.
DARPA’s challenge proved that AI can find bugs humans missed. Anthropic’s model proved it can do it at scale. The logical next step is that someone will weaponize this. Not a nation-state actor with unlimited resources, but a bored teenager in a basement with a stolen API key.
We’ve seen this pattern before. When Metasploit went mainstream, the barrier to entry for running exploits dropped dramatically. When Shodan made it easy to find vulnerable devices, the same thing happened. AI bug-finding tools are about to lower that barrier again, only faster and with even less effort required.
The irony is that the same tools could patch those bugs automatically. DARPA’s challenge showed that automated repair is possible. But the incentives aren’t aligned. Finding a zero-day is worth a lot of money on the black market. Patching it is just good hygiene.
I don’t have a neat solution here. Better authentication for AI models? Maybe. Stricter API usage policies? Sure, but those get circumvented. The genie is out of the bottle, and the script kiddies are lining up to make a wish.
The only thing that gives me some hope is that the defenders are getting better tools too. The same AI that finds vulnerabilities can also detect exploitation attempts faster than any human analyst. It’s an arms race, and for once, the good guys might have a chance to keep up.
But I’m not holding my breath. The attack of the killer script kiddies is coming, and we’re not ready.