Latest data by Id

CREATE TABLE counterData (
  counterId INT,
  timestamp TIMESTAMP,
  value DOUBLE
);

SELECT * FROM counterData
LATEST ON timestamp PARTITION BY counterId;

Is it a way to have kind of latestCounterData to get latest data by counterId?
I’m not able to run LATEST ON cuz it will scan a lot of data.
I used to have a sql trigger in MSSQL.

Hey Vadym,

If I understood it well, you need what LATEST ON provides, but you find LATEST ON not performant enough?

In your example you are using an INT column. If you use instead a SYMBOL column, LATEST ON might perform better.

When you are partitioning by a single SYMBOL column, LATEST ON knows how many different values you have in your column, so it can start scanning the table by reverse timestamp, and stop scanning the moment it has found a row for each different column value.

If you are not using a SYMBOL, LATEST ON needs to do a full scan to find all possible values.

In some cases, even when using a SYMBOL column, it might be the case some values are very rare, and you only get one once in a while, or even you might have values that are not in use anymore but were at some point. In that case, LATEST ON needs to scan a big chunk of data. If your use case can afford it, it is a good idea to set a maximum timestamp using WHERE, as in

SELECT * FROM counterData
WHERE timestamp > dateadd('M', -1, now())
LATEST ON timestamp PARTITION BY counterId;

So any symbols not present in the past month, for example, will be ignored.

2 Likes

Thank you very much for your answer. I will try to use SYMBOL instead of INT. Currently dateadd('M', -1, now())I takes 1.5 sec. I do preaggregation on the data so it’s crucial to be fast.

Perfect. Let us know if it worked :slight_smile: