Grafana Labs co-founder Anthony Woods discusses how AI-generated code and autonomous agents are straining software operations, forcing a shift in which agents themselves become the primary consumers of observability data. He shares how Grafana is leveraging its open-source ecosystem to build AI-native tools that help teams manage overwhelming telemetry volumes, while warning about the risks of black-box AI systems and the erosion of accountability. Woods also highlights the critical challenge of training junior engineers as AI automates the 'easy' work that traditionally built expertise.
Summarized by Podsumo
Grafana's Assistant tool, built by a two-person hackathon team in one quarter, relies on foundation models that were already pre-trained on Grafana's massive open-source ecosystem (25M+ users, blog posts, GitHub repos), so it can provide AI-powered investigations without custom training.
The shift to agent-to-agent communication creates an accountability crisis: if two AI agents from different organizations interact autonomously and something goes wrong, there's no clear 'throat to choke' for responsibility.
OpenTelemetry is emerging as the 'HTTP of logging': a vendor-neutral standard for application instrumentation that is becoming essential for AI agents to understand telemetry data consistently.
AI agents can now act as first responders in incident response: spinning up investigations that find root causes in minutes before human operators even reach their computers, potentially resolving 60-70% of issues automatically.
Grafana's Loki log system was architecturally redesigned for faster AI-driven queries, using novel filtering techniques to reduce petabyte-scale scans to 10-terabyte searches, which is crucial for agents running analytic 'needle in a haystack' queries.
"We don't inherently trust the AI to do the right thing. That's okay, right? We shouldn't, right? We should be kind of cautious because if there is an opportunity for it to do something wrong, it will find that opportunity eventually."
— Anthony Woods
"The thing that scares me the most is that all of the easy things are being taken care of by AI—that's the work you give your junior people to learn. As we take that opportunity away, it's like how are we going to give people the opportunity to develop the skills they need to become experienced senior SREs?"
— Anthony Woods
"Don't go and build an AI solution. Make AI part of the solution. We're in that hype cycle where everyone wants to label everything with 'AI, AI, AI.' Users are going to care less and less about that over time. What they actually care about is: does it work?"
— Anthony Woods