Deprecating Span Events API
OpenTelemetry's decision to deprecate the Span Events API is less about technical superiority and more about reducing cognitive overhead in an already complex observability landscape. If you're instrumenting LLM systems today, this matters because you're likely emitting dozens of events per request—prompt construction, retrieval steps, model calls, guardrail checks—and the distinction between span events and structured logs has always been murky.
The core issue is API duplication. Span events and log-based events serve nearly identical purposes: capturing discrete occurrences within a trace context. Both can carry structured attributes, both get correlated to the active span, and both show up in trace visualizations. The difference has been mostly an implementation detail—span events live on the span object itself, while log-based events are separate records linked via trace context. This overlap has led to inconsistent instrumentation patterns across teams and vendors, with some preferring span events for "trace-native" data and others treating everything as logs.
For LLM observability specifically, this creates real friction. When you're tracking a RAG pipeline, you might emit a span event for "documents_retrieved" with metadata about chunk count and relevance scores, while simultaneously logging the actual retrieved text as a structured log entry. Different vendors handle these differently—some collapse them in the UI, others show duplicate timeline entries, and trace analysis becomes harder when semantically similar data lives in two places.
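To make the overlap concrete, here is a toy sketch—deliberately not the real OpenTelemetry SDK, just illustrative dataclasses—showing how the same "documents_retrieved" occurrence takes two shapes: nested inside the span versus a standalone record that merely references the span through trace context.

```python
from dataclasses import dataclass, field
from typing import Any

# Toy data model, NOT the real OpenTelemetry SDK. It only illustrates how a
# span event and a trace-correlated log record carry the same information in
# two different shapes.

@dataclass
class SpanEvent:
    name: str
    attributes: dict[str, Any]

@dataclass
class Span:
    trace_id: str
    span_id: str
    events: list[SpanEvent] = field(default_factory=list)

    def add_event(self, name: str, attributes: dict[str, Any]) -> None:
        # Span-event style: the occurrence lives *on* the span object.
        self.events.append(SpanEvent(name, attributes))

def log_record(span: Span, body: str, attributes: dict[str, Any]) -> dict:
    # Log-based style: a standalone record that *references* the span via
    # trace context instead of being nested inside it.
    return {"body": body, "trace_id": span.trace_id,
            "span_id": span.span_id, "attributes": attributes}

span = Span(trace_id="abc123", span_id="def456")
retrieval_meta = {"chunk_count": 4, "max_relevance": 0.91}

span.add_event("documents_retrieved", retrieval_meta)             # on the span
record = log_record(span, "documents_retrieved", retrieval_meta)  # peer record

# Same occurrence, same attributes, two storage locations.
assert span.events[0].attributes == record["attributes"]
```

Either shape renders identically in a trace timeline, which is exactly why the duplication went unnoticed for so long.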
The migration path is straightforward but not instant. New instrumentation should use the logging API with explicit trace context correlation rather than calling add_event on span objects. In practice, this means switching from span.add_event("llm_call_started", attributes) to logger.info("llm_call_started", extra=attributes) with the logger configured to capture trace context. The event still appears on the span in trace views, but it's now a first-class log record that can be queried independently.
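A minimal sketch of that migration target, using only the standard library: in a real OpenTelemetry setup the SDK's logging handler attaches the active trace context for you, so here a `logging.Filter` stubs that behavior with hard-coded IDs, and an in-memory handler stands in for a log exporter so the correlation is easy to inspect.

```python
import logging

class TraceContextFilter(logging.Filter):
    """Stamp every record with the (stubbed) active trace context.
    In production this is done by the OpenTelemetry logging integration."""
    def __init__(self, trace_id: str, span_id: str):
        super().__init__()
        self.trace_id, self.span_id = trace_id, span_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = self.trace_id
        record.span_id = self.span_id
        return True

class ListHandler(logging.Handler):
    """Collect records in memory, standing in for a real log exporter."""
    def __init__(self):
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)

logger = logging.getLogger("llm.instrumentation")
logger.setLevel(logging.INFO)
handler = ListHandler()
logger.addHandler(handler)
logger.addFilter(TraceContextFilter(trace_id="abc123", span_id="def456"))

# Before: span.add_event("llm_call_started", attributes)
# After: a first-class log record, still correlated to the span.
logger.info("llm_call_started", extra={"model_name": "gpt-4"})

rec = handler.records[0]
print(rec.getMessage(), rec.trace_id, rec.model_name)
```

Because the record carries `trace_id` and `span_id` as ordinary fields, the same event can be rendered inline on the span in a trace view and queried independently in the log backend.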
What breaks during this transition? Primarily custom tooling that directly queries span event storage. If you've built internal dashboards that aggregate span events—say, counting retrieval failures across all traces—you'll need to migrate those queries to target log backends instead. The data model shift is subtle but real: span events were always children of spans, while logs are peers that happen to reference spans. This affects how you join data in analytics pipelines.
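The join-model shift looks like this in miniature—toy in-memory data with illustrative field names ("events", "body", "trace_id"), not any real backend's schema:

```python
# Toy data: two spans in span-event storage, and the equivalent peer log
# records in a log backend. Field names are illustrative, not a real schema.
spans = [
    {"trace_id": "t1", "events": [{"name": "retrieval_failed"}]},
    {"trace_id": "t2", "events": []},
]
logs = [
    {"trace_id": "t1", "body": "retrieval_failed"},
    {"trace_id": "t3", "body": "retrieval_failed"},
]

# Before: events are children of spans, so aggregation walks span storage.
failures_from_spans = sum(
    1 for s in spans for e in s["events"] if e["name"] == "retrieval_failed")

# After: events are peer log records, so you filter the log backend directly
# and join back to traces via trace_id only when you need span detail.
failures_from_logs = sum(1 for r in logs if r["body"] == "retrieval_failed")

print(failures_from_spans, failures_from_logs)
```

The counts differ here precisely because the two stores are populated independently—which is the kind of drift a migration has to reconcile before the old span-event queries can be retired.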
The practical impact on LLM systems is minimal if you're using modern SDKs and vendor-provided instrumentation. Libraries like LangChain and LlamaIndex will update their event emission patterns, and trace UIs will continue showing events inline regardless of the underlying mechanism. The bigger question is whether your custom instrumentation is tightly coupled to the span event API. If you're directly manipulating span objects to add events in middleware or decorators, budget time to refactor toward structured logging with context propagation.
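A sketch of what that refactor can look like for a decorator: instead of fetching the current span and calling `add_event` on it, the decorator emits log records, with a `contextvars.ContextVar` standing in for OpenTelemetry's context-propagation API. The names (`traced_step`, `current_trace_id`) are hypothetical.

```python
import contextvars
import functools
import logging

# The contextvar stands in for OpenTelemetry's context API; in a real setup
# the SDK propagates the active trace context for you.
current_trace_id = contextvars.ContextVar("current_trace_id", default="unset")
logger = logging.getLogger("llm.middleware")

def traced_step(step_name: str):
    """Decorator emitting start/finish events as structured log records."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Old pattern: span = get_current_span(); span.add_event(...)
            # New pattern: log records that travel with the propagated context.
            logger.info("%s_started", step_name,
                        extra={"trace_id": current_trace_id.get()})
            try:
                return func(*args, **kwargs)
            finally:
                logger.info("%s_finished", step_name,
                            extra={"trace_id": current_trace_id.get()})
        return wrapper
    return decorator

@traced_step("guardrail_check")
def check_output(text: str) -> bool:
    return "unsafe" not in text

current_trace_id.set("abc123")
print(check_output("all good"))
```

The decorated function no longer needs a handle on any span object, so the same middleware works whether or not a span is recording.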
One underappreciated benefit: log-based events integrate better with existing log aggregation infrastructure. Many teams already pipe application logs to systems like Loki or CloudWatch, and having trace events flow through the same path simplifies data governance and retention policies. You're not managing separate storage for span events versus operational logs.
The deprecation timeline is measured in years, not months, so there's no immediate fire drill. But if you're designing new observability instrumentation for LLM chains or agent workflows today, default to log-based events. The API surface is cleaner, the data model is more flexible, and you're aligning with where the ecosystem is headed.