Excellent breakdown of LLM observability from the product perspective. The running coach example really drives home why this matters. What I find interesting is that traditional APM vendors like Datadog have been racing to build LLM observability features precisely because of what you outlined here: they already have the infrastructure for monitoring APIs, databases, and latency, so extending it to trace LLM calls and prompts is a natural evolution. The challenge is that LLM observability needs different primitives than traditional observability. You can't just track response times; you also need to evaluate output quality, token usage, prompt effectiveness, and so on. Tools that nail the product leader perspective you described will win because they connect technical metrics to actual user experience.
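To make the "different primitives" point concrete, here's a rough sketch of what an LLM trace might capture beyond plain latency. This is purely illustrative; `traced_llm_call`, `score_output`, and `LLMTrace` are hypothetical names I'm making up, not any vendor's actual API, and the token counting is a crude stand-in for real tokenization.

```python
import time
from dataclasses import dataclass


@dataclass
class LLMTrace:
    latency_ms: float        # the traditional APM primitive
    prompt_tokens: int       # cost/usage primitives specific to LLMs
    completion_tokens: int
    quality_score: float     # output evaluation, e.g. a rubric or LLM-as-judge score


def traced_llm_call(llm_call, score_output, prompt: str) -> tuple[str, LLMTrace]:
    """Wrap an LLM call so every request emits a trace with LLM-specific fields.

    llm_call: callable taking a prompt string and returning the model's text output.
    score_output: callable taking (prompt, output) and returning a float quality score.
    Both are assumptions for this sketch.
    """
    start = time.perf_counter()
    output = llm_call(prompt)
    latency_ms = (time.perf_counter() - start) * 1000

    trace = LLMTrace(
        latency_ms=latency_ms,
        prompt_tokens=len(prompt.split()),        # placeholder for real token accounting
        completion_tokens=len(output.split()),
        quality_score=score_output(prompt, output),
    )
    return output, trace
```

The interesting part is the last field: quality scoring has no analogue in request/response monitoring, which is exactly why bolting LLM observability onto existing APM pipelines isn't trivial.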
Insightful. How do you see the complexity of these application layers evolving with multi-modal LLMs? Great breakdown, really helps clarify the product perspective.