New OpenTelemetry Kotlin SDK

OpenTelemetry Blog

OpenTelemetry's new Kotlin SDK targets a real instrumentation gap for teams running Kotlin Multiplatform workloads, but whether it matters for your ML infrastructure depends heavily on where Kotlin sits in your stack. If you're running inference services on JVM or building mobile ML applications with on-device models, this could consolidate your observability story. If you're pure Python for training and serving, it's irrelevant.

The core value proposition is unified instrumentation across Android, JVM, browser, and desktop from a single codebase. For ML systems, this becomes interesting in two specific scenarios. First, teams serving models through Kotlin-based API layers on JVM can now instrument request latency, token throughput, and error rates using the same semantic conventions as their Python training infrastructure. Second, mobile ML teams running TensorFlow Lite or Core ML models in Kotlin-based Android apps finally get a path to production-grade distributed tracing without rolling custom solutions or relying on platform-specific SDKs that don't interoperate with backend telemetry.
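The first scenario can be sketched in a few lines. This is illustrative, not the SDK's actual API: `InferenceSpan` is a hypothetical stand-in for a real span object, though the `gen_ai.*` attribute keys follow OpenTelemetry's published generative-AI semantic conventions (the derived throughput key is a custom addition):

```kotlin
// Sketch: recording inference telemetry in the shape of OTel gen-AI
// semantic conventions. InferenceSpan is a hypothetical stand-in for a
// real SDK span; gen_ai.request.model and gen_ai.usage.* are published
// convention keys, the tokens_per_second key is a custom attribute.
class InferenceSpan(val name: String) {
    val attributes = mutableMapOf<String, Any>()
    fun setAttribute(key: String, value: Any) { attributes[key] = value }
}

fun recordInference(
    model: String,
    inputTokens: Int,
    outputTokens: Int,
    durationMs: Long
): InferenceSpan {
    val span = InferenceSpan("chat $model")
    span.setAttribute("gen_ai.request.model", model)
    span.setAttribute("gen_ai.usage.input_tokens", inputTokens)
    span.setAttribute("gen_ai.usage.output_tokens", outputTokens)
    // Derived throughput: output tokens over wall-clock duration
    span.setAttribute("gen_ai.server.tokens_per_second",
        outputTokens * 1000.0 / durationMs)
    return span
}
```

Because the attribute keys match what a Python serving layer would emit, the backend can aggregate across both without translation.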

The practical question is whether this actually reduces operational complexity or just adds another SDK to maintain. For shops already standardized on OpenTelemetry across their Python serving layer and Go orchestration services, adding Kotlin support means your mobile inference metrics flow into the same Jaeger or Tempo backend. You can trace a request from your edge model through your API gateway to your GPU serving cluster using consistent span attributes. That's valuable when debugging latency spikes or analyzing which inference path users actually hit.
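The cross-service stitching works because every OpenTelemetry SDK propagates the same W3C Trace Context `traceparent` header. The header format below is standard; the parsing helpers themselves are illustrative, not part of any SDK:

```kotlin
// Sketch of W3C Trace Context propagation, the mechanism OpenTelemetry
// uses to join spans across services. Format: 00-<traceId:32 hex>-
// <spanId:16 hex>-<flags:2 hex>. Helper functions are illustrative only.
data class TraceContext(
    val traceId: String,
    val parentSpanId: String,
    val sampled: Boolean
)

fun parseTraceparent(header: String): TraceContext? {
    val parts = header.split("-")
    if (parts.size != 4 || parts[1].length != 32 || parts[2].length != 16) return null
    return TraceContext(parts[1], parts[2], parts[3] == "01")
}

// Each hop keeps the trace ID and substitutes its own span ID,
// which is what lets the backend reassemble the full request path.
fun childTraceparent(ctx: TraceContext, newSpanId: String): String =
    "00-${ctx.traceId}-$newSpanId-${if (ctx.sampled) "01" else "00"}"
```

The trace ID stays constant from the edge model through the gateway to the GPU cluster, so a single query in Jaeger or Tempo returns the whole path.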

The tradeoffs get messier for teams not already bought into OpenTelemetry. If you're using Datadog or New Relic with their native Kotlin agents, switching to OpenTelemetry means configuring exporters, validating that custom metrics still work, and potentially losing vendor-specific features like automatic error grouping or APM integrations. The Kotlin SDK is also young: it began as a donation from Embrace and remains in active development, so API stability isn't guaranteed and you're betting on community momentum to fill feature gaps.

From a metrics perspective, what you actually get is standardized collection of request duration, throughput, and error rates with proper context propagation. For LLM serving specifically, you'll still need custom instrumentation for model-specific metrics like time to first token (TTFT), inter-token latency percentiles, cache hit rates, and prompt token counts. OpenTelemetry gives you the plumbing but not the semantic understanding of what makes an LLM request slow or expensive.
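That custom layer is straightforward to build on top of the plumbing. A hypothetical helper, assuming you record each token's arrival time in milliseconds since the request started:

```kotlin
// Hypothetical helper deriving LLM-specific latencies from a stream of
// token arrival timestamps (ms since request start) -- the kind of
// custom instrumentation the SDK does not provide out of the box.
data class LlmLatency(
    val ttftMs: Long,          // time to first token
    val p50InterTokenMs: Long, // median gap between consecutive tokens
    val p99InterTokenMs: Long  // tail gap, where streaming stalls show up
)

fun llmLatency(tokenTimesMs: List<Long>): LlmLatency {
    require(tokenTimesMs.size >= 2) { "need at least two tokens" }
    val ttft = tokenTimesMs.first()
    // Gaps between consecutive token arrivals, sorted for percentiles
    val gaps = tokenTimesMs.zipWithNext { a, b -> b - a }.sorted()
    fun pct(p: Double): Long = gaps[((gaps.size - 1) * p).toInt()]
    return LlmLatency(ttft, pct(0.50), pct(0.99))
}
```

The resulting values can then be attached as span attributes or exported as histograms through whichever OTel pipeline you already run.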

The KMP angle matters most if you're sharing business logic between platforms. If your feature extraction, preprocessing, or lightweight model inference runs in shared Kotlin code deployed to both Android and backend services, instrumenting that once and getting telemetry in both environments is legitimately useful. But if your KMP usage is limited to UI code or simple data classes, the observability benefit is minimal.
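The "instrument once" pattern usually means shared code depends on a small tracing abstraction while each target supplies its own backing implementation. A minimal sketch with invented names (a real KMP project would typically use `expect`/`actual` declarations; a simple interface keeps the idea visible in one file):

```kotlin
// Sketch of sharing instrumentation across KMP targets: common code
// depends on a small tracing interface; each platform plugs in its own
// implementation. All names here are illustrative, not SDK API.
interface Telemetry {
    fun <T> span(name: String, block: () -> T): T
}

// A trivial JVM-side implementation that just records span names;
// an Android or backend target would wrap its real tracer instead.
class RecordingTelemetry : Telemetry {
    val recorded = mutableListOf<String>()
    override fun <T> span(name: String, block: () -> T): T {
        recorded += name
        return block()
    }
}

// Shared preprocessing logic, instrumented once, reused on every target.
fun normalizeFeatures(telemetry: Telemetry, raw: List<Double>): List<Double> =
    telemetry.span("feature.normalize") {
        val max = raw.maxOrNull() ?: 1.0
        raw.map { it / max }
    }
```

The same `feature.normalize` span then shows up whether the code ran on a phone or in a backend service, which is the whole point of the shared-code angle.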

For teams evaluating this, the decision hinges on existing OpenTelemetry adoption and how much Kotlin you're actually running in production inference paths. If you're already exporting traces from Python FastAPI services to an OTel collector, adding Kotlin instrumentation to your mobile edge models creates a complete picture. If you're starting fresh, the ecosystem maturity of vendor SDKs might outweigh the standardization benefits until the Kotlin SDK stabilizes and proves itself in production ML workloads.