Google's new TPUs are built for agents, not just bigger models

Google has been quietly doing its own thing on the AI hardware front for years. While everyone else fights over Nvidia’s latest H100s or B200s, Google keeps iterating on its custom Tensor Processing Units. The seventh-gen Ironwood TPU landed in 2025, and now we’re looking at the eighth generation. But this isn’t just a faster chip with bigger numbers on the spec sheet.

Google is splitting its new TPUs into two distinct flavors: the TPU8t for training and the TPU8i for inference. This isn’t a marketing gimmick. The company is making a bet that the “agentic era”—where AI systems don’t just generate text but take actions, use tools, and operate autonomously—requires fundamentally different hardware than what we’ve been using.

Training frontier models has always been the bottleneck. The TPU8t is Google’s attempt to shrink training runs from months to weeks. That’s a big claim, but if it holds up, it changes the economics of building large models. Training costs are the single biggest barrier to entry in this space, and anything that cuts them down is worth paying attention to.

The TPU8i, on the other hand, is for the inference side—running those trained models in production. This is where the agent angle gets interesting. Agents don’t just answer a single query and stop. They loop, they call external APIs, they reason over multiple steps. Inference for agents is a different workload than inference for a chatbot. Google seems to have designed the 8i with that continuous, interactive pattern in mind.
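To make that difference concrete, here is a toy Python sketch of the two patterns. Nothing in it comes from Google or reflects how the TPU8i works: StubModel, fake_tool, chatbot, and agent are hypothetical names I made up, and the "model" does no inference at all. It only counts how often it gets called and how much context it is handed each time.

```python
from dataclasses import dataclass


@dataclass
class StubModel:
    """Hypothetical stand-in for an inference endpoint; it counts work instead of doing it."""
    calls: int = 0
    prompt_items_seen: int = 0

    def generate(self, context: list[str]) -> str:
        self.calls += 1
        self.prompt_items_seen += len(context)
        # Assumption for the demo: the model keeps asking for a tool
        # until three tool results are present in its context.
        tool_results = sum(1 for msg in context if msg.startswith("tool:"))
        return "FINAL" if tool_results >= 3 else f"CALL lookup {tool_results}"


def fake_tool(request: str) -> str:
    """Hypothetical external API call."""
    return f"tool: result for '{request}'"


def chatbot(model: StubModel, question: str) -> str:
    # One request, one inference call, done.
    return model.generate([question])


def agent(model: StubModel, task: str, max_steps: int = 10) -> str:
    # Loop: ask the model, run the tool it requests, feed the result back in.
    context = [task]
    for _ in range(max_steps):
        reply = model.generate(context)
        if reply == "FINAL":
            return reply
        context.append(reply)
        context.append(fake_tool(reply))
    return "GAVE UP"


if __name__ == "__main__":
    chat_model, agent_model = StubModel(), StubModel()
    chatbot(chat_model, "What is a TPU?")
    agent(agent_model, "Book a flight")
    print("chatbot:", chat_model.calls, "call(s),", chat_model.prompt_items_seen, "context item(s) read")
    print("agent:  ", agent_model.calls, "call(s),", agent_model.prompt_items_seen, "context item(s) read")
```

Against this stub, the chatbot path makes one call, while the agent path makes four and re-reads a context that grows at every step (1, 3, 5, then 7 messages). Real serving stacks cache much of that repeated context, so treat the counts as a picture of the access pattern rather than a cost model.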

I’ve seen a lot of companies talk about “AI-first” hardware, but Google has the advantage of actually running massive production workloads on its own chips. They know where the pain points are. The TPU line has been through enough iterations that they have real data on what works and what doesn’t.

That said, Google is still competing against Nvidia’s software ecosystem, which is miles ahead. CUDA and the surrounding tooling are sticky. If Google wants TPUs to be more than just an internal advantage, they need to make the developer experience genuinely compelling. Faster chips only matter if people can actually use them without rewriting everything.

The agentic era framing feels a bit like marketing spin, but the underlying point is valid. AI workloads are diversifying. One-size-fits-all accelerators won’t cut it forever. Google is betting that specialization wins, and they’re putting their money where their mouth is.
