Grafana Cloud

Grafana Cloud is a hosted version of Grafana, Prometheus, Loki, and Tempo.

If you prefer this observability stack, you no longer have to host and maintain it yourself. Self-hosting comes at a cost: providing a service with good uptime, retention, and durability poses problems that need solving.


If you decide to self-host, this chapter is still worth reading. Although it covers setting up Grafana Cloud, the Grafana Cloud pricing model forces you to be diligent with your metrics: store only a subset of all available metrics, with an emphasis on low cardinality. You will eventually have to do the same exercise with a self-hosted stack in order to provide a reliable service.

Shipping metrics

Grafana Cloud has a mostly self-explanatory setup. You install Prometheus, Loki, and Tempo on your cluster in a shipping configuration. Prometheus still runs in your cluster and scrapes all metrics, but it also has a remote_write configuration: after scraping, a subset of the metrics is forwarded to Grafana Cloud to benefit from its storage, dashboards, and retention. See the setup guide.

The same is true for Loki.

Alternatively, you can use Grafana Agent, a project based on the open-source Prometheus and Loki projects that bundles only the metric and log shipping parts into a small package.

If you choose Gimlet Stack as the installation method, it comes with a preconfigured Grafana Cloud integration with pruned metrics and logs.

Day-two operations

Billing alerts

When you use Grafana Cloud, you should always set billing alerts.

The built-in Grafana Cloud Billing dashboard allows you to track your usage. Make a copy of this dashboard, and set alerts for the total billable logs and metrics series.

On the included quota

Depending on your package, Grafana Cloud includes:

  • 100GB logs per month
  • 15000 metrics

The logging quota is fairly straightforward, but the metrics quota is not so self-evident.

  • Most off-the-shelf exporters push you over the 15K limit.
  • To ship only the metrics you use in your dashboards, put them on an allow list. See how.
  • Cloud billing is a dark art; learn how Grafana bills.
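An allow list like the one mentioned above is commonly expressed as a keep rule under write_relabel_configs in the remote_write section of the Prometheus configuration. The following is a minimal sketch that builds such a rule from a list of metric names. The endpoint URL and the metric names are hypothetical placeholders, and the helper names are made up for this illustration.

```python
import json
import re

# Valid Prometheus metric names; used to reject typos before building the regex.
METRIC_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def allow_list_rule(metric_names):
    """Build a write_relabel_configs entry that keeps only the named metrics."""
    for name in metric_names:
        if not METRIC_NAME.fullmatch(name):
            raise ValueError(f"not a valid metric name: {name!r}")
    return {
        "source_labels": ["__name__"],
        # Prometheus anchors relabel regexes, so plain alternation is enough.
        "regex": "|".join(sorted(set(metric_names))),
        "action": "keep",
    }


def remote_write_section(url, metric_names):
    """Return the remote_write part of a Prometheus config as plain data."""
    return [{"url": url, "write_relabel_configs": [allow_list_rule(metric_names)]}]


# Hypothetical endpoint and metric names, for illustration only.
section = remote_write_section(
    "https://prometheus.example.invalid/api/prom/push",
    ["up", "http_requests_total", "image_process_time_bucket"],
)
print(json.dumps({"remote_write": section}, indent=2))
```

The printed structure mirrors what would go into the Prometheus configuration file. Every series whose name is not matched by the regex is dropped before it is shipped, so it does not count toward the quota. When allow-listing a histogram, remember that its _sum and _count series have their own names.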

On the cost of Histogram metrics

Histogram metrics weigh more heavily on the quota than other metrics. Every distinct combination of label values counts as a separate metric series.

If you have 3 labels with 10 different values each, that is up to 10 x 10 x 10 = 1000 series. So be careful with the number of different values you have per label.

This is especially true for histograms: every bucket is its own series (expect roughly 10 or more buckets by default, depending on the client library), so a histogram with 10 buckets coming from a single server, pod, or thread already counts as 10 series.

If you have 10 buckets and 10 workers, that is 100 series coming from a single metric line in code.
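The arithmetic above can be sketched in a few lines. The function names are made up for this illustration, and the numbers are the ones used in the text.

```python
from math import prod


def max_series(label_value_counts):
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_value_counts)


def histogram_bucket_series(buckets, sources):
    """Bucket series produced by one histogram exposed by `sources` servers/pods/threads."""
    return buckets * sources


print(max_series([10, 10, 10]))         # 3 labels, 10 values each -> 1000
print(histogram_bucket_series(10, 10))  # 10 buckets, 10 workers -> 100
```

These are bucket series only. A classic Prometheus histogram also exposes a _sum and a _count series per source, so the real number is slightly higher.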

To identify your largest metrics, you can run:

topk(10, count by (__name__)({__name__=~".+"}))

The top metric for me had 672 metric series.
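If you would rather run this query from a script than from the Grafana UI, the Prometheus HTTP API offers an instant query endpoint at /api/v1/query. The sketch below assumes a Prometheus server you can reach without authentication. The helper names are hypothetical, and a hosted endpoint will likely need credentials on top of this.

```python
import json
import urllib.parse
import urllib.request

TOP_METRICS_QUERY = 'topk(10, count by (__name__)({__name__=~".+"}))'


def build_query_url(base_url, query=TOP_METRICS_QUERY):
    """URL for an instant query against the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def parse_top_metrics(response):
    """Turn an instant-query response into (name, series count) pairs, largest first."""
    if response.get("status") != "success":
        raise RuntimeError(f"query failed: {response}")
    rows = [
        (item["metric"].get("__name__", ""), int(item["value"][1]))
        for item in response["data"]["result"]
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def fetch_top_metrics(base_url):
    """Run the query; needs network access to the Prometheus server."""
    with urllib.request.urlopen(build_query_url(base_url), timeout=30) as resp:
        return parse_top_metrics(json.load(resp))
```

Calling fetch_top_metrics("http://localhost:9090") would return the ten largest metrics, assuming a Prometheus instance is listening on that (hypothetical) address.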

Querying the metric, I could see that it has only a few labels (cluster, job, and le), albeit with rather high cardinality.

  • count(count by (le) (image_process_time_bucket)) shows 21 buckets
  • count(count by (job) (image_process_time_bucket)) shows 31 distinct jobs
  • count(count by (cluster) (image_process_time_bucket)) shows 2 clusters
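Note that the per-label counts only give an upper bound on the number of series. They do not have to multiply out to the observed 672, presumably because not every job and bucket combination exists in both clusters. A quick check, using only the numbers quoted above:

```python
from math import prod

# Distinct values per label, taken from the three count queries above.
distinct_values = {"le": 21, "job": 31, "cluster": 2}
observed_series = 672  # series count reported for the metric

upper_bound = prod(distinct_values.values())
fill_ratio = observed_series / upper_bound

print(upper_bound)           # 1302
print(round(fill_ratio, 2))  # 0.52
```

So roughly half of the theoretically possible label combinations actually show up as series.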

Since I pay $16 per 1000 series, the 672 series of this single metric (one line in code) work out to roughly $10.75 a month. High-cardinality histograms are rather expensive.


See how to analyze metrics cardinality

Grafana Cloud also includes a dashboard for cardinality analysis.