I know QuestDB exposes metrics in Prometheus format, but I’ve never used Prometheus. What would be the best way to scrape those metrics and store them in a table on QuestDB itself?
That way I can reuse what I already know about QuestDB and easily integrate with Grafana, as I already have a dashboard set up and connected to other QuestDB tables.
By default, QuestDB doesn’t expose any metrics, but setting metrics.enabled=true in the server configuration, or setting the environment variable QDB_METRICS_ENABLED=TRUE, makes them available at localhost:9003/metrics.
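If you have never seen the Prometheus text format, here is a rough Python sketch of what a response looks like and how little it takes to read it. The metric names in the sample are made up for illustration (they are not actual QuestDB output), and the parser assumes each sample line is just a name and a value, with no trailing timestamp.

```python
# Sample text in the Prometheus exposition format.
# The metric names are hypothetical; check your own /metrics output
# for the real ones.
SAMPLE = """\
# TYPE example_queries_total counter
example_queries_total 42
# TYPE example_memory_bytes gauge
example_memory_bytes{kind="heap"} 1048576
"""


def parse_metrics(text):
    """Return {metric name (with labels): float value}, skipping comments."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumption: no trailing timestamp, so the value is the last token.
        name, value = line.rsplit(None, 1)
        metrics[name] = float(value)
    return metrics


parsed = parse_metrics(SAMPLE)
```

To try it against a live instance, you could pass in the body returned by urllib.request.urlopen("http://localhost:9003/metrics"), decoded as text.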
You could use Prometheus to scrape those metrics, but you can also use any server agent that understands the Prometheus format. It turns out Telegraf has an input plugin for Prometheus and output plugins that can write to QuestDB, so you can use it to read the metrics from the endpoint and insert them into a QuestDB table.
This is the telegraf.conf configuration that works for me (using default ports):
# Configuration for Telegraf agent
[agent]
## Default data collection interval for all inputs
interval = "5s"
omit_hostname = true
precision = "1ms"
flush_interval = "5s"
# -- INPUT PLUGINS ------------------------------------------------------ #
[[inputs.prometheus]]
## An array of urls to scrape metrics from.
urls = ["http://questdb-origin:9003/metrics"]
url_tag=""
metric_version = 2 # all entries will be on a single table
ignore_timestamp = false
# -- AGGREGATOR PLUGINS ------------------------------------------------- #
# Merge metrics into multifield metrics by series key
[[aggregators.merge]]
## If true, the original metric will be dropped by the
## aggregator and will not get sent to the output plugins.
drop_original = true
# -- OUTPUT PLUGINS ----------------------------------------------------- #
[[outputs.socket_writer]]
# Write metrics to the target QuestDB instance over TCP (InfluxDB line protocol, default port 9009)
address = "tcp://questdb-target:9009"
A few things to note:
I omit the hostname, so I don’t end up with an extra column I don’t need. If I were monitoring several QuestDB instances, it would be worth keeping.
I set url_tag to blank for the same reason. By default the Prometheus plugin for Telegraf adds the URL as an extra column, and we don’t need it.
I am using metric_version 2 for the input plugin. This is to make sure I get all the metrics into a single table, rather than one table for each different metric, which I find annoying.
I am using the merge aggregator. We need this because the Prometheus input plugin outputs a separate row for each available metric, with a value in only one column. That is because Telegraf was originally designed for InfluxDB, where this sparse format makes sense, but on QuestDB querying data in that format would be annoying. The merge aggregator combines all the metrics with the same timestamp into a single row before outputting, which is what we want. I then set drop_original to discard the sparse rows and output just the merged version.
In my config, I used a different hostname for the QuestDB output, so the metrics are collected on a separate instance. For production this is the better practice, but for development you can just write to the same host you are monitoring.
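To make the merge step more concrete, here is a minimal Python sketch of the idea, not Telegraf’s actual implementation. The rows and field names are hypothetical, and the sketch groups on timestamp only, whereas the real aggregator also matches on measurement name and tags.

```python
from collections import defaultdict

# Sparse rows, one field per row, similar in shape to what the Prometheus
# input produces. Field names and values are hypothetical.
sparse_rows = [
    {"timestamp": 1000, "fields": {"example_queries_total": 42.0}},
    {"timestamp": 1000, "fields": {"example_memory_bytes": 1048576.0}},
    {"timestamp": 2000, "fields": {"example_queries_total": 45.0}},
]


def merge_by_timestamp(rows):
    """Combine rows that share a timestamp into one multi-field row."""
    merged = defaultdict(dict)
    for row in rows:
        merged[row["timestamp"]].update(row["fields"])
    # Return the merged rows ordered by timestamp.
    return [
        {"timestamp": ts, "fields": fields}
        for ts, fields in sorted(merged.items())
    ]


merged_rows = merge_by_timestamp(sparse_rows)
```

With the three sparse rows above, the result has two rows: one for timestamp 1000 holding both fields, and one for timestamp 2000 holding a single field. That wide shape is what ends up as one row per timestamp in the QuestDB table.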