Business metrics in Grafana Cloud: Get an AI assist to help securely analyze your data

Grafana Labs Blog

Grafana Cloud's Private Data Source Connect solves a problem that most platform teams have encountered: how to safely query business data sitting in private networks without punching holes in your firewall or standing up VPN infrastructure. The approach is straightforward—deploy a lightweight agent in your VPC that establishes an outbound SSH tunnel to Grafana Cloud. All database queries route through this encrypted tunnel, meaning your PostgreSQL or MySQL instances never need public exposure.

The architecture inverts the typical connection model. Instead of Grafana Cloud reaching into your network, your PDC agent reaches out and maintains a persistent tunnel. This matters because it sidesteps the usual security review friction around inbound access rules and eliminates the operational overhead of managing VPN concentrators or bastion hosts. You're trading network-level complexity for a simple agent deployment pattern that most teams can handle with existing container orchestration.

Where this gets interesting is the combination with Grafana Assistant, their LLM integration. The practical value isn't about replacing SQL knowledge—it's about accelerating the iteration cycle when building dashboards against unfamiliar schemas. Point the assistant at a PostgreSQL database and ask "show me revenue trends by region for Q4," and it generates the appropriate SELECT statement with GROUP BY clauses and time bucketing. For teams supporting multiple business units with different data models, this reduces the context-switching cost when someone needs a one-off dashboard.
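For a prompt like that, the generated query might look something like the following sketch. The `orders` table and its `region`, `amount`, and `created_at` columns are hypothetical names for illustration, not from the post:

```sql
-- Hypothetical schema: orders(region TEXT, amount NUMERIC, created_at TIMESTAMPTZ)
-- Monthly revenue by region for Q4 2024, using PostgreSQL time bucketing
SELECT
  region,
  date_trunc('month', created_at) AS month,
  SUM(amount)                     AS revenue
FROM orders
WHERE created_at >= '2024-10-01'
  AND created_at <  '2025-01-01'
GROUP BY region, date_trunc('month', created_at)
ORDER BY region, month;
```

In an actual Grafana panel you would typically replace the hard-coded date range with the `$__timeFilter(created_at)` macro so the dashboard's time picker drives the WHERE clause.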

The SQL generation quality depends heavily on schema complexity. Simple star schemas work well. Heavily normalized databases with dozens of junction tables require more prompt refinement. The assistant understands common PostgreSQL patterns like window functions and CTEs, but you'll still need to validate the generated queries for performance. An unoptimized JOIN across large fact tables can easily time out or spike your database CPU. The tool accelerates initial query construction but doesn't replace query plan analysis.
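That validation step can be as lightweight as inspecting the plan before a query lands on a dashboard. A minimal sketch, reusing the same hypothetical `orders` table:

```sql
-- EXPLAIN (ANALYZE, BUFFERS) executes the query and reports actual row counts,
-- timings, and buffer usage; drop ANALYZE to get only the planner's estimate
-- without running the query against production data.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  region,
  date_trunc('month', created_at) AS month,
  SUM(amount)                     AS revenue
FROM orders
WHERE created_at >= '2024-10-01'
  AND created_at <  '2025-01-01'
GROUP BY 1, 2;
```

A sequential scan over a large fact table in the plan is the usual red flag; an index on the time column (B-tree, or BRIN for append-only tables) often turns it into an index scan.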

The Terraform blueprint they've published demonstrates proper infrastructure-as-code patterns for this stack. It provisions an RDS instance in a private subnet, deploys the PDC agent with appropriate IAM roles, and configures the Grafana data source—all parameterized for different environments. The key detail is how it handles secrets: service account tokens go through Terraform variables rather than hardcoded values, though you'll want to integrate with your actual secrets management system for production use.

One operational consideration: the PDC agent becomes a critical path component. If it crashes or loses connectivity, your dashboards stop working. The agent supports high availability deployments, and you should treat it like any other infrastructure service—monitor its health, set up alerting on tunnel status, and have runbooks for common failure modes. The agent exposes Prometheus metrics, so instrument it properly.

The business metrics use case is compelling because it unifies operational and business observability in one platform. Instead of SREs looking at Prometheus for service health while product teams use Looker for conversion funnels, both datasets live in Grafana. This matters during incidents when you need to correlate technical metrics with business impact. Seeing request latency spike alongside a drop in checkout completions in the same dashboard changes how you prioritize remediation.

For teams already running Grafana Cloud, PDC is worth evaluating if you're currently maintaining separate analytics tooling or struggling with secure database access patterns. The Terraform blueprint provides a working reference implementation you can adapt. Just remember that adding LLM-generated queries to your workflow means validating query performance before those dashboards hit production load.