Hello. I recently tried to ingest financial data into a single table via the influx line protocol using the C++ client. The table is partitioned by hour. Most of the data is sparse: of the 15 fields (all integer types), only about 4 carry values in a typical record, and according to the docs QuestDB should handle such cases with a minimal footprint. But I was surprised: about 2.7 GB of data was ingested in roughly half an hour (around 600 million records), and the resulting disk usage was about 90 GB! I expected the columnar model to optimise storage, yet on my filesystem every column file in a partition has the same size. For example, a column that contains almost no data still occupies about 512 MB, just like the others. Could this be a bug? I tried to minimise out-of-order writes (the timestamps fluctuate rather than increasing monotonically), but some still occur. Is there a good article explaining this behaviour? Thanks in advance.
Hi @alex-aparin ,
There is some info on the storage model here: Storage model | QuestDB
We use sentinel (null) values to keep the columns aligned, and those sentinels still take disk space.
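As a rough sanity check (assuming your integer fields ended up as 8-byte LONG columns, which is the default type for ILP integers): 600 million rows × 15 value columns × 8 bytes ≈ 72 GB, plus roughly 5 GB for the designated timestamp column. That already lands in the same ballpark as the ~90 GB you are seeing, so this looks like expected behaviour rather than a bug.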
If you expect to have very sparse data, you should deploy on ZFS, so the filesystem can compress those mostly-empty column files. You can also convert partitions to Parquet format, which will compress them, but this is a beta feature, so I would hold off for a little bit.
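For example, once you are comfortable with the beta, converting older partitions would look roughly like this (table and column names are placeholders, and please check the docs for the exact syntax your version supports):

```sql
-- Convert partitions older than a given date to Parquet (currently a beta feature).
-- 'trades' and 'timestamp' are placeholders for your table and designated
-- timestamp column; verify the exact syntax against the docs for your version.
ALTER TABLE trades CONVERT PARTITION TO PARQUET
WHERE timestamp < '2025-01-01';
```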
Please could you share your schema? Perhaps we can alter it a little. It may be better to store your data in a row-modelled (narrow) layout and then convert it into an appropriately dense format later.
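To illustrate the row-modelled idea, here is a rough sketch of a narrow layout (all names are made up) where only the handful of populated fields is written per tick:

```sql
-- Narrow (one row per populated field) layout: you only write the ~4 fields
-- that actually have data, instead of 15 mostly-null fixed-width columns.
-- Table and column names are placeholders.
CREATE TABLE ticks_narrow (
  ts    TIMESTAMP,
  sym   SYMBOL,   -- instrument identifier
  field SYMBOL,   -- which of the 15 fields this row carries
  value LONG      -- the field's value
) TIMESTAMP(ts) PARTITION BY HOUR;
```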
You could also normalise the data and then join it back together, so you don’t write so many unnecessary values.
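As a sketch of the join-back step (assuming the narrow table above and a single instrument per table, with made-up field names), you could rebuild a wide view like this:

```sql
-- Rebuild a wide shape for two of the fields from the narrow table.
-- 'bid_size' and 'ask_size' are placeholder field names.
SELECT b.ts, b.value AS bid_size, a.value AS ask_size
FROM (SELECT ts, value FROM ticks_narrow WHERE field = 'bid_size') b
JOIN (SELECT ts, value FROM ticks_narrow WHERE field = 'ask_size') a
ON b.ts = a.ts;
```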
P.S. We will hopefully be releasing PIVOT in the next release, which will help you re-map your data from a narrow to a wide schema.