OpenTelemetry Normalization for GenAI: Solving the Multi-SDK, Multi-Framework Complexity Problem
Key Takeaways
- ▸OpenTelemetry adoption in GenAI is fragmented: different SDKs, frameworks, and providers emit radically different structures for identical operations (e.g., three different message formats for a single user-assistant conversation)
- ▸Complexity is multidimensional: the same SDK + provider combination produces different telemetry shapes depending on whether it's called directly, through LangChain, LangGraph, CrewAI, or other frameworks
- ▸Tool calls, messages, and metadata are represented inconsistently across vendors (Anthropic uses content blocks, OpenAI uses separate tool_calls arrays, LangChain wraps everything in constructor objects)
Summary
A technical deep-dive explores the surprising complexity of normalizing OpenTelemetry (OTel) data across diverse GenAI implementations. The article reveals that different SDKs (Traceloop, LangSmith, eBPF), frameworks (LangChain, LangGraph, CrewAI), and AI providers (Anthropic, OpenAI) emit fundamentally incompatible telemetry structures for the same operations, making standardization a three-axis problem rather than simple attribute renaming.
Part 2 of the analysis demonstrates concrete examples: the same conversation represented three different ways across SDKs (indexed attributes, constructor objects, raw API payloads), identical SDK-provider combinations producing different shapes depending on orchestration framework, and framework-specific metadata that changes the required parsing logic. Tool calls, messages, and completion structures vary drastically, requiring separate parsers for each combination.
The groundcover OTel Normalizer addresses this by implementing SDK-specific, provider-specific, and framework-aware extraction logic. The article highlights that normalization requires understanding not just data format differences but semantic meaning embedded in span hierarchies and metadata patterns that vary by framework.
- Effective normalization requires separate parsing logic for each SDK-provider-framework combination, not just simple attribute mapping
Editorial Opinion
This article exposes a critical blind spot in GenAI observability: while vendors tout 'OpenTelemetry support,' the standard is being interpreted so differently that telemetry data from one system is virtually incompatible with another. groundcover's normalization work is essential infrastructure for any production GenAI system that needs unified observability across multiple frameworks and providers, but it also highlights the urgent need for the OpenTelemetry community to establish stricter semantic conventions specifically for GenAI instrumentation.



