Hello QuestDB team,
I am using QuestDB to store data from a Siemens PLC (IPC227G). I have set up a data pipeline that routes data to QuestDB via MQTT and Telegraf (and it is running fine). Now I want to estimate the storage consumption of the QuestDB data. I believe QuestDB does not apply any special compression of its own; it simply stores the data in its columnar format.
My questions:
- If I want to estimate the storage consumption, can I do it by simply multiplying the number of rows by the storage consumed by each column's data type? (I am aware that I also need to add some overhead for indexes, symbol dictionaries, etc.; see the rough sketch after this list.)
- I will also have data being ingested continuously and queried by my application (via the Python API). If I enable ZFS compression, will that compression step put additional load on RAM/CPU?
- As storage consumption is a prime factor for us, I was wondering whether Parquet file storage will be introduced in the near future? (I believe Parquet files take far less storage than the current file format.)
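For reference, this is roughly how I am estimating it right now, a minimal sketch in Python. The per-value sizes are the ones I found for QuestDB's fixed-size types, while the 10% overhead factor and the example row shape (timestamp + symbol + double) are just my own assumptions:

```python
# Rough estimate (my own sketch): rows * bytes-per-row for fixed-size types,
# times an overhead factor for symbol dictionaries, indexes and partially
# filled partition files. The overhead factor is an assumption, not a
# QuestDB-provided figure.

TYPE_BYTES = {
    "byte": 1,
    "short": 2,
    "int": 4,
    "float": 4,
    "long": 8,
    "double": 8,
    "date": 8,
    "timestamp": 8,
    "symbol": 4,  # stored as an int per row; the dictionary goes into the overhead
}

def estimate_bytes(rows: int, column_types: list[str], overhead: float = 1.10) -> float:
    """Estimate on-disk size in bytes for `rows` rows of the given column types."""
    bytes_per_row = sum(TYPE_BYTES[t.lower()] for t in column_types)
    return rows * bytes_per_row * overhead

# Example: 10 tags sampled once per second for 30 days,
# each row = designated timestamp + symbol + double value.
rows = 10 * 86_400 * 30
print(f"~{estimate_bytes(rows, ['timestamp', 'symbol', 'double']) / 1024**2:.1f} MiB")
```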
TIA and regards!
Thanks for the quick feedback! I checked with the Siemens device currently in use, and it does not yet support ZFS compression. Also, most of our server machines run on Windows (which makes ZFS more complex). So it seems wise to wait for the Parquet version.
Is there a fixed date when a stable version of the Parquet-based storage is expected to be released? A quick PoC with such a stable version would be great to have!
Many thanks and regards
We will ship 9.2.0 with a new Parquet export API imminently. This will allow you to test what kind of compression ratios you can get.
In already-released OSS versions it is possible to convert native table partitions to Parquet. Querying them is a little slower than querying native partitions, but that may matter less than storage size for your use case.
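For example, something along these lines lets you trigger the conversion and inspect partition sizes from Python via the HTTP /exec endpoint. The table name and partition cut-off are placeholders, and the exact ALTER TABLE ... CONVERT PARTITION syntax can vary between releases, so please double-check against the docs for the version you run:

```python
# Sketch: convert older partitions to Parquet and check the resulting sizes.
# Uses QuestDB's HTTP /exec endpoint (default port 9000). Table name and the
# partition cut-off below are placeholders for your own schema.
import requests

EXEC_URL = "http://localhost:9000/exec"

def run(sql: str) -> dict:
    resp = requests.get(EXEC_URL, params={"query": sql})
    resp.raise_for_status()
    return resp.json()

# Convert everything older than a given day to Parquet (syntax may differ by version).
run("ALTER TABLE plc_metrics CONVERT PARTITION TO PARQUET WHERE timestamp < '2025-01-01'")

# Compare per-partition disk sizes before/after to see the compression ratio you actually get.
result = run("SELECT * FROM table_partitions('plc_metrics')")
cols = [c["name"] for c in result["columns"]]
for row in result["dataset"]:
    print(dict(zip(cols, row)))
```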
Please feel free to report back any issues you encounter. We are sequencing optimisations for Parquet reads/writes over the next couple of months.
With QuestDB’s native columnar format, ingestion and query performance are excellent, no question at all! But with no compression possible on our end (e.g. ZFS), storage consumption becomes a problem.
I would be more interested in an “in-place” conversion where the actual data resides as Parquet (and not in QuestDB’s columnar format). A quick test shows the data being compressed by about 2.6x after conversion to Parquet. I know this compression ratio is further configurable.
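For context, this is roughly how I measured that figure: just summing the file sizes under the table’s directory before and after the conversion. The db root path and table name below are placeholders for my setup:

```python
# Sketch: sum file sizes under the table directory before/after conversion.
# The db root is an assumption; adjust it to wherever your instance keeps its data.
import os

def dir_size(path: str) -> int:
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

table_dir = "/var/lib/questdb/db/plc_metrics"  # placeholder path and table name
print(f"{dir_size(table_dir) / 1024**2:.1f} MiB on disk")
```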
With the version I currently use (v8.3.3), this conversion has to be performed externally every time (as an SQL command). It would be interesting to see future QuestDB versions store data directly as *.parquet (as InfluxDB V3 Enterprise does).
I believe QuestDB Enterprise might have options to automate this process, along with optimized query/write performance? As you said, future versions could perform much more efficiently than the current ones.
It will be tied to a TTL setting. The current TTL only drops data; we have not released the TTL syntax to auto-convert partitions to Parquet files yet.
The most recent partition will always be in native format, for the best real-time read/write performance.
Enterprise is separated from OSS by further extensions that move data away from local disk.
A quick test shows the data being compressed by about 2.6x after conversion to Parquet. I know this compression ratio is further configurable.
In 8.3.3, the conversion will probably write uncompressed files. You should move to a newer version and reconfigure.
In 9.2.0, the default is ZSTD with level 9, but LZ4_RAW is also a good option.