Baseten CEO Tuhin Srivastava discusses the intense AI inference market, highlighting the company's 30X growth driven by the widespread adoption of custom, post-trained models. He emphasizes the critical capacity crunch for AI compute, the strategic importance of a robust software layer for inference, and the belief that the application layer will thrive due to unique user signals. Srivastava also touches on the geopolitical implications of open-source models and the future where intelligence becomes a pervasive "concierge" for all aspects of life.
Summarized by Podsumo
Explosive Growth & Market Shift: Baseten has grown 30X in a year, driven by mainstream adoption of custom, post-trained AI models and by enterprises bringing intelligence in-house. By Tuhin's estimate, 99% of enterprise AI adoption is still ahead.
Strategic Importance of Application Layer: Tuhin believes the independent application layer will persist against frontier labs, as companies leverage unique user signals and workflows to post-train specialized models, creating defensible moats.
Severe AI Compute Capacity Crunch: The market is running at "uncomfortably high utilization" (mid-90s percent), with little slack and a shortage of reliable data center operators. As a result, capacity is being secured through 3-5 year contracts with significant prepayments.
Inference with Software is Sticky: Unlike commodity GPUs-as-a-service, Baseten's inference platform with its software layer is "incredibly sticky" (400% annual net dollar retention, or NDR), making access to compute a strategic advantage.
Jevons Paradox in AI: Decreasing the cost of inference leads to increased consumption and embedding more intelligence into applications, driving better user experiences and more revenue, indicating a long-term growth trajectory for the inference market.
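To make the 400% NDR figure concrete, here is a minimal sketch of how net dollar retention is commonly computed for a fixed customer cohort. The function name and the dollar amounts are hypothetical and chosen only for illustration; they are not Baseten's figures, and the episode does not describe how Baseten calculates the metric.

```python
def net_dollar_retention(start_revenue, expansion, contraction=0.0, churned=0.0):
    """Return NDR for one cohort over a period.

    NDR = (start + expansion - contraction - churned) / start,
    where every term refers to the same set of customers.
    """
    if start_revenue <= 0:
        raise ValueError("start_revenue must be positive")
    return (start_revenue + expansion - contraction - churned) / start_revenue


# Hypothetical cohort: $1.0M annualized spend a year ago, no churn or
# contraction, and $3.0M of added spend from those same customers.
ndr = net_dollar_retention(1_000_000, expansion=3_000_000)
print(f"{ndr:.0%}")  # 400%
```

Read this way, a 400% annual NDR means the same customers spend roughly four times what they spent a year earlier, before counting any new customers.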
"No post-training pre-product market fit is what I have."
— Tuhin Srivastava
"If we don't have access to that intelligence in that form, I think it's just a massive loss. And as a country, we won't be able to innovate as fast because the cost of intelligence going down in control of intelligence, well, we have seen just more intelligence. Intelligence being embedded in more places."
— Tuhin Srivastava
"Inference with the software layer included is incredibly sticky. You know, like, just like, you know, none of our top 30 customers have ever churned. You're talking like 400% annual NDR around our business."
— Tuhin Srivastava