As VP of Product at Google Cloud, Michael Gerstenhaber works primarily on Vertex, the company's unified platform for deploying enterprise AI. The role gives him a high-level view of how companies are actually using AI models and what still needs to be done to unlock the potential of agentic AI.
As I spoke with Michael, one idea in particular struck me that I hadn't heard before. As he put it, AI models are pushing three limits at once: raw intelligence, response time, and a third quality that is less about capability than price, namely whether the model can be deployed cheaply enough to run at massive and unpredictable scale. It's a new way of thinking about what's possible with a model, and it's especially valuable for anyone trying to push frontier models in a new direction.
This interview has been edited for length and clarity.
Why don’t you start by walking us through your experience with AI and what you do at Google?
I've been in AI for about two years now: I was at Anthropic for a year and a half, and I've been at Google for almost half a year. I run Vertex, Google's developer platform. Most of our customers are engineers building their own applications. They want access to agentic models, to the agent platform, and to the output of the smartest models in the world. I provide that to them, but I don't provide the apps themselves. That's for Shopify, Thomson Reuters, and our other customers to build in their own domains.
What brought you to Google?
I think Google is unique in the world in that we have everything from the interface down to the infrastructure layer. We build data centers. We buy electricity and build power plants. We have our own chips. We have our own model. We have an inference layer that we control. We have an agent layer that we control. We have an API for memory and for writing interleaved code. On top of that, we have a compliance and management layer. And then we even have chat interfaces, with Gemini Enterprise and Gemini chat for consumers, right? So part of the reason I came here is that I saw Google as uniquely vertically integrated, and that's a strength for us.
It's strange, because despite all the differences between the companies, I feel like the three big labs are really close in capability. Is it just a race for more intelligence, or is it more complicated than that?
I see three boundaries. Models like Gemini Pro are tuned for raw intelligence. Think about writing code: you just want the best code you can get, and it doesn't matter if it takes 45 minutes, because I have to maintain that code and put it into production. I only want the best.
Then there's a second boundary: latency. If I'm doing customer support and I need to know how to apply policies, you need intelligence to apply those policies. Are you allowed to make returns? Can I upgrade my seat on the plane? But it doesn't matter how right you are if it took 45 minutes to get the answer. So for these cases, you want the smartest model within that latency budget, because more intelligence doesn't matter once the person gets bored and hangs up.
And then there's the last bucket, where someone like Reddit or Meta wants to moderate the entire internet. They have big budgets, but they can't take on enterprise risk unless they know how it scales. They don't know how many toxic posts there will be today or tomorrow. So they have to pick the most intelligent model they can afford within their budget, in a way that scales to an effectively unbounded number of posts. That's why price becomes very, very important.
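The three boundaries described above amount to a simple selection rule: pick the most capable model that still fits your latency budget and your per-call cost budget. Here is a minimal sketch of that rule; the model names, capability scores, latencies, and prices are invented placeholders, not real benchmark or pricing data:

```python
# Sketch of the three-constraint framing: intelligence, latency, cost.
# All model names and numbers are hypothetical illustrations.

from dataclasses import dataclass
from typing import Optional


@dataclass
class Model:
    name: str
    capability: float     # hypothetical quality score, higher is better
    latency_s: float      # typical seconds per response
    cost_per_call: float  # hypothetical dollars per request


CATALOG = [
    Model("frontier-pro",   capability=95, latency_s=40.0, cost_per_call=0.50),
    Model("balanced-flash", capability=80, latency_s=3.0,  cost_per_call=0.05),
    Model("bulk-lite",      capability=60, latency_s=0.8,  cost_per_call=0.002),
]


def pick_model(latency_budget_s: float, cost_budget: float) -> Optional[Model]:
    """Return the most capable model within both budgets, or None."""
    eligible = [m for m in CATALOG
                if m.latency_s <= latency_budget_s
                and m.cost_per_call <= cost_budget]
    return max(eligible, key=lambda m: m.capability, default=None)


# Coding assistant: latency barely matters, so raw intelligence wins.
print(pick_model(latency_budget_s=3600, cost_budget=1.0).name)   # frontier-pro

# Customer support: a tight latency budget rules out the slow frontier model.
print(pick_model(latency_budget_s=5, cost_budget=1.0).name)      # balanced-flash

# Internet-scale moderation: cost per call dominates the choice.
print(pick_model(latency_budget_s=5, cost_budget=0.01).name)     # bulk-lite
```

The same catalog yields a different "best" model for each workload, which is the point of the framing: there is no single frontier, only the smartest model inside a given budget.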
One of the things I've been wondering about is why agentic systems are taking so long to catch on. It feels like the models are there, and I've seen some incredible demos, but we're not seeing the major changes I would have expected a year ago. What do you think is holding them back?
The technology is basically two years old, and there is still a lot of missing infrastructure. We don't have patterns for auditing what agents are doing. We don't have patterns for authorizing an agent to access data. These are designs that will take work to bring into production, and production is always the real indicator of what a technology is capable of. So two years isn't long enough to see what this intelligence supports in production, and that's where people struggle.
I think it has moved uniquely quickly in software engineering because it fits well into the software development life cycle. We have a dev environment where it's safe to break things, and then we promote from dev to a test environment. The process of writing code at Google requires two people to audit that code and confirm that it's good enough to put the Google brand behind it and ship it to our customers. So we have a lot of those human-in-the-loop processes that make deployment extremely low-risk. But we have to build those patterns in other places and for other professions.