Corrupted table / Metadata read timeout - Cannot disable WAL

Hi everyone,

I am rather new to QuestDB and really want to use it. Unfortunately, I am running into a problem with corrupted data.

When I query my table, it returns an error: Metadata read timeout [src=reader, timeout=1000ms].
This happens even with a simple query like: select * from INTEGER_THREEMONTHS LIMIT 1

After some more searching, I ran the following query:
select * from wal_tables() where suspended = true
It shows that the WAL for this table is suspended and has an error:

ErrorMessage: SymbolMap does not exist: D:\QuestDatabase\db\INTEGER_THREEMONTHS~26\PlcDataTypeId.o

When I check in the file explorer, there is indeed no file called PlcDataTypeId.o, but there is a PlcDataTypeId.o.4. I tried copying this file and renaming the copy to PlcDataTypeId.o, but then it gives an error about the PlcDataTypeId.k file missing. When I also copy the PlcDataTypeId.k.4 file to PlcDataTypeId.k, it gives an error about the format of the .o file being wrong.

Currently the table details show that the WAL is 471 transactions behind. At this moment I am less concerned about losing the data in the WAL; what I really want is to be able to access the data in the table.
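
For anyone who wants to check the same lag without the table details view, it can probably be read from wal_tables() directly. This is only a minimal sketch: the column names (name, suspended, writerTxn, sequencerTxn, errorMessage) are assumed from the wal_tables() documentation and may differ between versions, and pendingTxns is just an alias chosen here for illustration.

```sql
-- Sketch: list suspended WAL tables and how far the writer is behind.
-- Column names are assumed from the wal_tables() docs and may vary by version.
SELECT
  name,
  suspended,
  writerTxn,                                -- last transaction applied to the table
  sequencerTxn,                             -- last transaction committed to the WAL
  sequencerTxn - writerTxn AS pendingTxns,  -- committed to the WAL but not yet applied
  errorMessage
FROM wal_tables()
WHERE suspended = true;
```

If those assumptions hold, pendingTxns for INTEGER_THREEMONTHS should match the number of transactions the table details report as behind.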

Unfortunately, running ALTER TABLE INTEGER_THREEMONTHS SET TYPE BYPASS WAL also results in Metadata read timeout [src=reader, timeout=1000ms][errno=0].

After more digging through log files I found the errors started after a power loss.
This instance is running on a Windows 11 machine.

Is there a way to maybe just re-create the metadata based on the actual records?

Any help would be appreciated.

[QuestDB info]
Version = PostgreSQL 12.3, compiled by Visual C++ build 1914, 64-bit, QuestDB
Build = Build Information: QuestDB 9.3.5, JDK 17.0.9, Commit Hash 4bb96297b8908ba355692ef3f663281123296b7e
SERVER_VERSION = 12.3 (questdb)

Edit: added more info

Hi,

What does your deployment look like?

Did you encounter a hard shutdown or power loss scenario?

There is no easy way out other than restoring from backup; the data is corrupted in this case.

Hi @nwoolmer,

Thank you for your answer. Yes, I just edited my post to include more info after more searching.
I did indeed encounter a power loss scenario.

It is unfortunate that the data is corrupted then. I had hoped there would be a way to rebuild the metadata based on the files that are still available. I have a backup, but it is 8 days old. Still, that is better than nothing.

In the meantime I found that setting cairo.commit.mode=sync in server.conf could help to prevent this kind of corruption. However, this installation runs on Windows 11. Would that still help?

Unfortunately, corruptions are non-deterministic. The data might be fixable manually by an expert, but it could take an unbounded amount of time, and still ultimately fail.

sync will reduce the likelihood of it occurring again for sure. We also have some performance improvements on the way for the database when it's in sync mode, so in the long run, the performance gap will narrow.

It will mainly increase CPU/disk usage during ingestion, but not affect other processes.