Introducing the UX Research Working Group

Prometheus Blog

Prometheus has launched a UX Research Working Group, and if you've ever wrestled with PromQL's learning curve or tried to explain recording rules to a new team member, you'll understand why this matters. This isn't about making dashboards prettier. It's about addressing systemic friction in how operators actually use the tool.

The impetus came from research presented at PromCon 2025 examining Prometheus and OpenTelemetry workflows. While that study focused on interoperability scenarios, it surfaced broader patterns: onboarding remains steep, configuration semantics aren't intuitive, and users' mental models don't align well with how the system actually behaves. Anyone who has debugged why a scrape_interval change didn't produce the expected results, or explained the difference between rate() and irate() to someone coming from StatsD, knows these aren't trivial papercuts.
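To make the rate() versus irate() confusion concrete, here is a deliberately simplified sketch in Python. It is not how Prometheus implements either function: the real rate() also handles counter resets and extrapolates to the window boundaries, both of which are skipped here. The function names simple_rate and simple_irate and the sample data are made up for illustration.

```python
from typing import List, Tuple

# One sample is (timestamp_seconds, counter_value). Illustrative type only.
Sample = Tuple[float, float]


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase between the first and last sample in the window.

    Simplification: ignores counter resets and boundary extrapolation.
    """
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)


def simple_irate(samples: List[Sample]) -> float:
    """Per-second increase between only the last two samples in the window.

    Simplification: ignores counter resets.
    """
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    return (v_last - v_prev) / (t_last - t_prev)


# A counter scraped every 15s: flat for 45s, then a burst in the final interval.
window = [(0.0, 100.0), (15.0, 100.0), (30.0, 100.0), (45.0, 100.0), (60.0, 160.0)]

print(simple_rate(window))   # 1.0  (60 requests spread over 60 seconds)
print(simple_irate(window))  # 4.0  (60 requests over the last 15 seconds)
```

Same data, same window, and the two answers differ by a factor of four. The averaged value smooths the burst, while the instantaneous one reports only the most recent interval, which is exactly the kind of distinction newcomers tend to miss.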

What's interesting here is the acknowledgment that technical excellence alone doesn't solve these problems. Prometheus handles high cardinality better than most alternatives, the TSDB is genuinely impressive, and the pull model has real operational advantages. But if engineers spend two days figuring out how to properly configure service discovery for a new Kubernetes cluster, or if the documentation assumes knowledge that newcomers don't have, that's a real cost that compounds across every team adopting the tool.

The working group's scope is deliberately practical. It is offering to conduct user research for maintainers evaluating competing approaches: think structured feedback on whether a proposed API change actually makes sense to operators, not just whether it's technically sound. It will also produce artifacts like user journeys and wireframes, which sounds fluffy until you consider how many Prometheus features have grown organically without a coherent model of how someone would actually discover and use them together.

The value proposition for maintainers is clearest when you're making decisions with significant UX implications. Should the default scrape timeout be 10s or 30s? How should we surface staleness handling in the UI? What should happen when someone misconfigures a relabel_config? These questions have technical dimensions, but they also have usability dimensions that are harder to reason about without structured research.

For practitioners, the working group is soliciting participation in research studies. This is worth engaging with if you have opinions about Prometheus UX, because it's a mechanism to influence product direction beyond filing GitHub issues. The difference between anecdotal complaints and structured research is that the latter can identify patterns across different use cases and quantify the severity of pain points.

The cynic might ask whether a working group can actually move the needle on something as established as Prometheus. Fair question. The test will be whether this produces actionable changes: not just reports that sit in a repo, but actual improvements to configuration ergonomics, better defaults, clearer documentation, or smoother integration paths. The fact that the group is explicitly partnering with maintainers rather than operating as a separate layer is encouraging. UX research only matters if it feeds into decisions that ship.

If you're running Prometheus at scale and have accumulated a list of things that regularly trip up your team, participating in this research is probably worth an hour of your time. The alternative is that these decisions get made without input from people actually operating the system in production.