Accepting INTEGER as an alias for INT could be fairly easy to fix.
If it is easier to get epoch micros, use that instead of the ISO format string. LONG micros are trivially castable to timestamp.
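For example, a minimal sketch (the literal value is only illustrative):
SELECT cast(1702900800000000 AS timestamp); -- a LONG of epoch micros is read as 2023-12-18T12:00:00.000000Z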
That would be very nice! I wonder if there are more of those variations in SQL92.
If it is easy to fix, I will of course continue my testing with Tableau and QuestDB.
Does the conf/text_loader.json file work for COPY, or only with the REST interface?
While waiting for exports and conversions, I’m reading your documentation and found settings about timestamps. But they are in the “REST Interface” part, so I’m not sure whether those extra formats also help when parsing the CSV with the COPY command.
That would avoid having to convert every timestamp, as I also saw locale settings in the example at Import CSV | QuestDB.
Re: variations, there certainly will be. We try to fix incompatibilities with the SQL standard and PostgreSQL where appropriate, but we are not a classic OLTP database, so in some cases this is not possible or easy!
Re: text_loader.json, you can specify formats etc. directly on the command, e.g.
COPY weather FROM 'weather.csv' WITH HEADER true FORMAT 'yyyy-MM-ddTHH:mm:ss.SSSUUUZ' ON ERROR SKIP_ROW;
If the examples on this or the CSV docs do not work, then let us know and we can fix them. It is sometimes helpful to use a ‘known working CSV’ so you are confident the settings/SQL is working for you, and then adapt your inputs.
We do have this PR to merge: feat(sql): multi-threaded read_parquet() execution by puzpuzpuz · Pull Request #5256 · questdb/questdb · GitHub
This makes reading from Parquet files much faster. If you do have a means to get Parquet out of Tableau, perhaps by round-tripping via DuckDB or similar, this might be an alternate route.
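If it helps, here is a rough sketch of that round trip (the file names are illustrative, and read_parquet() path resolution depends on your QuestDB version and server configuration):
-- DuckDB: convert the exported CSV to Parquet
COPY (SELECT * FROM read_csv_auto('vos_all.csv')) TO 'vos_all.parquet' (FORMAT PARQUET);
-- QuestDB: query the Parquet file directly
SELECT * FROM read_parquet('vos_all.parquet');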
P.S. Raised an issue here: Add `INTEGER` alias for `INT` for compatibility · Issue #5258 · questdb/questdb · GitHub
Well, the wait is over!
I imported about 445 million rows. It took some time, as you can see: more than 3 hours!
Now I can start testing if Tableau can show the data and how compatible it is with SQL92.
Update #1:
Hmmm. Lots and lots of errors now that I have lots of data.
This is one of them:
2024-12-18T18:58:51.479881Z E i.q.c.p.m.PGConnectionContextModern undersized receive buffer or someone is abusing protocol [recvBufferSize=1048576]
Others seem to relate to unsupported syntax such as HAVING, and to some CASTs (again).
CASTing problems:
Update #2:
Increased the recv/send buffer sizes to 5M. That stops the above error about the receive buffer.
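In server.conf that corresponds to something like the following (assuming the pg.recv.buffer.size / pg.send.buffer.size keys; check the exact names and value syntax against your version):
pg.recv.buffer.size=5M
pg.send.buffer.size=5M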
However, many dashboards seem to use the HAVING token, which is not supported by QuestDB. When this error happens, Tableau locks up and I have to kill the process. Unluckily for me, once Tableau starts again, it shows the same dashboard and locks up again…
2024-12-18T18:57:45.069562Z E i.q.g.e.QueryProgress err [id=-1, sql=`SELECT SUM(1) AS "COL"
FROM "public"."vos_all" "vos_all"
HAVING COUNT(1)>0`, principal=admin, cache=false, jit=false, time=292027, msg=unexpected token [HAVING], errno=0, pos=57]
HAVING:
The syntax for the HAVING clause is as follows:
SELECT column1, column2, aggregate_function(column3)
FROM table_name
GROUP BY column1, column2
HAVING condition;
column1, column2: Columns you want to select and group by.
aggregate_function(column3): The aggregate function (e.g., SUM, COUNT, AVG) applied to column3.
table_name: The name of the table.
condition: The condition used to filter grouped results based on the aggregate function.
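For hand-written queries (Tableau’s generated SQL obviously cannot be edited this way), the same filter can usually be expressed without HAVING by wrapping the aggregation in a subquery; a sketch based on the query from the log above:
-- equivalent of: SELECT SUM(1) AS "COL" FROM vos_all HAVING COUNT(1) > 0
SELECT * FROM (
  SELECT count() AS "COL" FROM vos_all
) WHERE "COL" > 0;
Update #3: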
Some of the (for me) standard Tableau functions are not available anymore and/or don’t work:
It seems that these functions rely on the backend database. But QuestDB supports WINDOW functions, so at this moment I have no idea what the exact problem might be. Maybe SQL92 does not support them?
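For comparison, this is the kind of window query QuestDB itself accepts (column names here are made up, purely to illustrate the syntax):
SELECT ts, val,
       avg(val) OVER (ORDER BY ts ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) AS moving_avg
FROM vos_all;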
Update #4:
One day later, and one day more of testing!
Regarding Update #3:
So far it comes down to problems around:
Furthermore, speed can come with a lot of disk space usage. If I compare the databases for those 445 million rows / 106 fields:
Lucky me, the SSD is 2 TB
It is a pity many dashboards don’t work due to the CASTs. I was really wondering how QuestDB handles strpos() functions. TimescaleDB does a sequential scan for these and takes ages (30 minutes or more) before the dashboard is shown. Tableau, on the other hand, only needs a few seconds.
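To make the comparison concrete, the kind of predicate I mean is something like this (the column name is made up):
-- strpos() returns 0 when the substring is not found, so > 0 means "contains"
SELECT count() FROM vos_all WHERE strpos(remarks, 'ERROR') > 0;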