Prometheus and PromQL

A brief set of notes on Prometheus metrics and PromQL

Metric Types

Counter

  • Used for values that only go up (they reset to zero only when the process restarts)
  • Used when we want to calculate the rate of increase of the value (requests etc.)
  • rate(metric_name[5m])
    • This shows the per-second rate of increase of metric_name, averaged over a 5 minute window
  • Ex
    • Request Count
    • Tasks Completed

Gauges

  • For values which can go up or down
  • For metrics where the current value matters and you do not need to calculate a rate
  • Ex
    • CPU utilization
    • Memory Utilization
    • Queue Length

Histogram

  • Measures how often observed values fall into pre-defined buckets.
    • For example, we might track HTTP response times with buckets whose upper bounds are 0.005, 0.1, 1.0 and so on.
    • For each bucket we store the number of requests at or below its upper bound, so the buckets are cumulative (le means "less than or equal").
    • We might need to configure custom buckets if the default ones do not work for us.
  • Use this when
    • we want to later calculate averages or percentiles
    • we are not bothered by the exact values and approximations work for us
    • we know the range of values beforehand, so we can use the default bucket definitions or define our own buckets
  • Ex
    • request duration
    • payload size
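
To make the bucketing concrete, here is a minimal Python sketch of how a cumulative-bucket histogram could track observations. It is a toy model for these notes, not the Prometheus client library; the class name SimpleHistogram and its methods are hypothetical.

```python
from bisect import bisect_left


class SimpleHistogram:
    """Toy model of a cumulative-bucket histogram (hypothetical, not the real client)."""

    def __init__(self, upper_bounds):
        # Sorted upper bounds; +Inf catches every value above the last finite bound.
        self.upper_bounds = sorted(upper_bounds) + [float("inf")]
        self.bucket_counts = [0] * len(self.upper_bounds)  # per-bucket, not yet cumulative
        self.count = 0   # corresponds to the _count series
        self.sum = 0.0   # corresponds to the _sum series

    def observe(self, value):
        # Index of the first bucket whose upper bound is >= value.
        index = bisect_left(self.upper_bounds, value)
        self.bucket_counts[index] += 1
        self.count += 1
        self.sum += value

    def cumulative(self):
        # Exposed form: each bucket counts all observations <= its upper bound.
        running, result = 0, []
        for bound, n in zip(self.upper_bounds, self.bucket_counts):
            running += n
            result.append((bound, running))
        return result
```

Observing 0.003, 0.007, 0.02 and 0.2 with bounds 0.005, 0.01, 0.025 and 0.05 leaves the last observation only in the +Inf bucket, which is why exact values are lost and only approximations are possible later.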

Working Example

If the name of the metric is request_duration, then the Prometheus client library will automatically expose several time series for the same metric with additional information, like

  • request_duration_bucket{le="0.005"}
  • request_duration_bucket{le="0.01"}
  • request_duration_bucket{le="0.025"}
  • request_duration_bucket{le="0.05"}
  • request_duration_count
  • request_duration_sum

The last two, _sum and _count, are used to calculate averages (sum divided by count). Percentiles are estimated from the _bucket series, for example with histogram_quantile().
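
As a rough illustration of how these series are used, the Python sketch below computes an average from _sum and _count and estimates a quantile by linear interpolation inside the matching bucket. This is a simplified model of the idea behind histogram_quantile(), not its exact implementation; the function names and the sample numbers are made up for the example.

```python
def average(total_sum, total_count):
    # Average observation = request_duration_sum / request_duration_count.
    return total_sum / total_count


def estimate_quantile(q, cumulative_buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    cumulative_buckets: list of (upper_bound, cumulative_count), sorted by
    upper bound, with the last entry being the +Inf bucket.
    """
    total = cumulative_buckets[-1][1]
    rank = q * total  # how many observations lie at or below the quantile
    lower_bound, lower_count = 0.0, 0  # assume the first bucket starts at 0
    for upper_bound, count in cumulative_buckets:
        if count >= rank:
            in_bucket = count - lower_count
            if upper_bound == float("inf") or in_bucket == 0:
                # Cannot interpolate here; fall back to the last known bound.
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound
```

With buckets (0.005, 10), (0.01, 30), (0.025, 80), (0.05, 95), (+Inf, 100), the median falls in the 0.01 to 0.025 bucket and is estimated at about 0.016. The result is only as precise as the bucket boundaries allow.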

Summary

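  • Similar to histograms, but quantiles are calculated on the client side and exposed directly, so they cannot be aggregated across instances.
  • Use this mainly when we don't know the range of values beforehand and hence cannot pick sensible histogram buckets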

Operations and Patterns

rate

Calculates the per-second average rate of increase of the time series over the given window.
i.e. If the counter is the distance in a distance-time graph, this gives you the speed (the average slope over the window).
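
The calculation can be sketched in Python as follows. This is a simplified model that assumes strictly increasing timestamps and ignores the extrapolation Prometheus applies at the window edges; simple_rate is a hypothetical helper, not a Prometheus API.

```python
def simple_rate(samples):
    """Per-second average rate of increase over a window of counter samples.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    """
    if len(samples) < 2:
        return None  # at least two samples are needed to measure a change
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            increase += curr - prev
        else:
            # A drop means the counter was reset; it restarted from zero.
            increase += curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed
```

For a counter scraped every 60 seconds that grows from 100 to 400 over 300 seconds, this gives 1.0 per second. Handling resets is what makes a rate meaningful on counters even across process restarts.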

increase

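Calculates the total increase in the time series over the given window.
Roughly syntactic sugar for rate(time_series[5m]) * 300, i.e. the per-second rate multiplied by the number of seconds in the window.

i.e. This gives you the distance covered during the window in a distance-time graph.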
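
The relationship between the two functions can be checked with a small Python sketch. It uses only the first and last sample of a window and ignores counter resets and extrapolation; window_rate and window_increase are hypothetical names for this illustration.

```python
def window_rate(first, last):
    # first and last are (timestamp_seconds, counter_value) samples.
    (t0, v0), (t1, v1) = first, last
    return (v1 - v0) / (t1 - t0)


def window_increase(first, last):
    # The increase is the per-second rate multiplied by the window length in seconds.
    window_seconds = last[0] - first[0]
    return window_rate(first, last) * window_seconds
```

For a counter that goes from 1200 to 1500 over a 300 second window, the rate is 1.0 per second and the increase is 300.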

sum

For summing a metric across its label dimensions. With a by (...) clause it behaves like SUM with GROUP BY in SQL: series that share the listed label values are added together.
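
The grouping behaviour can be sketched in Python like this. The sum_by helper and the label names in the usage note are hypothetical; the sketch treats a missing label as an empty string.

```python
from collections import defaultdict


def sum_by(series, keep_labels):
    """Sum series values, grouping by the given label names.

    series: list of (labels_dict, value) pairs.
    Returns a dict keyed by a tuple of (label_name, label_value) pairs.
    """
    totals = defaultdict(float)
    for labels, value in series:
        # Only the kept labels form the group key; all other labels are dropped.
        key = tuple((name, labels.get(name, "")) for name in keep_labels)
        totals[key] += value
    return dict(totals)
```

Given GET series from two instances with values 3.0 and 2.0 plus one POST series with value 1.0, grouping by method yields 5.0 for GET and 1.0 for POST, with the instance label gone from the result.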

Resources