Hello,
we need to import 4,800,000,000 rows into QuestDB as fast as possible. The data comes from 50 million Kafka events. Each event holds the 96 15-minute values of an energy meter for one whole day. The timestamps within each event are ordered, but the events themselves are not sorted on the Kafka broker, so the data should arrive mostly out of order.
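To make the data shape concrete, here is a minimal sketch of how one event expands into rows. It assumes each event covers a UTC calendar day starting at midnight; the class and method names are only illustrative and are not taken from our service.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;

public class MeterEvent {
    // One event = one day of 15-minute readings.
    static final int VALUES_PER_EVENT = 96;
    static final Duration STEP = Duration.ofMinutes(15);

    // Assumption: the day starts at 00:00 UTC. The real events may use another zone.
    static List<Instant> timestampsFor(LocalDate day) {
        Instant start = day.atStartOfDay(ZoneOffset.UTC).toInstant();
        List<Instant> result = new ArrayList<>(VALUES_PER_EVENT);
        for (int i = 0; i < VALUES_PER_EVENT; i++) {
            result.add(start.plus(STEP.multipliedBy(i)));
        }
        return result;
    }

    // Total number of rows produced by a given number of events.
    static long totalRows(long events) {
        return events * VALUES_PER_EVENT;
    }

    public static void main(String[] args) {
        List<Instant> ts = timestampsFor(LocalDate.of(2024, 1, 1));
        System.out.println(ts.get(0) + " .. " + ts.get(ts.size() - 1));
        System.out.println(totalRows(50_000_000L)); // 4800000000
    }
}
```

Within one event the timestamps are ascending, but two consecutive events can belong to any meter and any day, which is why the rows reach the table out of order.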
We use a Java service to read the data from Kafka and submit it to QuestDB via the questdb-java-client, using ILP over HTTP.
The QuestDB server has 32 CPU cores and ~396 GB RAM. QuestDB and the Java service run on the same machine. The Java service never used more than four cores and 4 GB of RAM.
Our first attempt started okay at ~850,000 rows/s, but the performance decreased quickly and ended below 40,000 rows/s. The Java service wrote the data to the WAL very fast, so the number of pending rows temporarily grew to 3,000,000,000. The WAL-to-table apply process seemed to run in intervals, with the gaps between them growing over time.
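For a sense of scale, here is a rough back-of-the-envelope sketch of what these two rates would mean for the full import. It assumes a constant rate over the whole run, which is not what we observed, so the numbers are only bounds; the class name is illustrative.

```java
import java.time.Duration;

public class ImportEstimate {
    // Time to import `rows` at a constant `rowsPerSecond` (fractional seconds dropped).
    static Duration estimate(long rows, long rowsPerSecond) {
        return Duration.ofSeconds(rows / rowsPerSecond);
    }

    public static void main(String[] args) {
        long rows = 4_800_000_000L;
        // At the initial rate: roughly one and a half hours.
        System.out.println(estimate(rows, 850_000L)); // PT1H34M7S
        // At the final rate: more than 33 hours.
        System.out.println(estimate(rows, 40_000L));  // PT33H20M
    }
}
```

So the difference between the initial and the final rate is the difference between about 1.5 hours and about 33 hours for the whole import.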
We are desperately looking for any information that would help us speed up the import. Or is it hopeless with this kind of data? Maybe there are more options that we do not see. If you need more information, please ask.
Kind regards
Sebastian
This is our table:
CREATE TABLE 'import_values' (
id SYMBOL,
value DECIMAL(15,6),
timestamp TIMESTAMP
) timestamp(timestamp) PARTITION BY HOUR
DEDUP UPSERT KEYS(id,timestamp);
We have adjusted the following config parameters; all others are at their defaults:
shared.network.worker.count=10
cairo.max.uncommitted.rows=10000000
cairo.o3.min.lag=5s
cairo.o3.max.lag=60s
cairo.o3.column.memory.size=32M
cairo.writer.data.append.page.size=64M
cairo.iouring.enabled=true
