I’m working with a dataset where an IoT device sends a `wh` (watt-hour) value at discrete timestamps, identified by an `operationId`. I want to visualize this data by plotting the sum of average power per operation, broken down by hour.
Here is the raw data:
| timestamp | operationId | wh |
| --- | --- | --- |
| 2025-04-01T14:10:59.000000Z | 1001 | 0 |
| 2025-04-01T14:20:01.000000Z | 1002 | 0 |
| 2025-04-01T15:06:29.000000Z | 1003 | 0 |
| 2025-04-01T18:18:05.000000Z | 1001 | 200 |
| 2025-04-01T20:06:36.000000Z | 1003 | 200 |
| 2025-04-01T22:20:10.000000Z | 1002 | 300 |
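For reference, the table can be recreated with something like this (a minimal sketch; the `LONG`/`DOUBLE` column types and the partitioning are my assumptions, the table name `meter` matches the query below):

```sql
-- Minimal reproduction of the sample data (column types are assumptions)
CREATE TABLE meter (
  timestamp TIMESTAMP,
  operationId LONG,
  wh DOUBLE
) TIMESTAMP(timestamp) PARTITION BY DAY;

INSERT INTO meter (timestamp, operationId, wh) VALUES
  ('2025-04-01T14:10:59.000000Z', 1001, 0),
  ('2025-04-01T14:20:01.000000Z', 1002, 0),
  ('2025-04-01T15:06:29.000000Z', 1003, 0),
  ('2025-04-01T18:18:05.000000Z', 1001, 200),
  ('2025-04-01T20:06:36.000000Z', 1003, 200),
  ('2025-04-01T22:20:10.000000Z', 1002, 300);
```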
### Compute mean power per operation
I’m using the `lag()` window function to get the previous `wh` and timestamp per operation, and compute the average power:
```sql
SELECT
    timestamp AS end_time,
    cast(prev_ts AS timestamp) AS start_time,
    operationId,
    -- energy delta divided by elapsed time in hours
    -- (timestamps are in microseconds, so 3600000000 µs = 1 hour) -> mean power in W
    (wh - prev_wh) / ((cast(timestamp AS DOUBLE) - prev_ts) / 3600000000.0) AS mean_power_w
FROM (
    SELECT
        timestamp,
        wh,
        operationId,
        lag(wh) OVER (PARTITION BY operationId ORDER BY timestamp) AS prev_wh,
        lag(cast(timestamp AS DOUBLE)) OVER (PARTITION BY operationId ORDER BY timestamp) AS prev_ts
    FROM meter
)
WHERE prev_ts IS NOT NULL
ORDER BY timestamp;
```
This gives me:
| end_time | start_time | mean_power_w | operationId |
| --- | --- | --- | --- |
| 2025-04-01T18:18:05.000000Z | 2025-04-01T14:10:59.000000Z | 48.563334682314 | 1001 |
| 2025-04-01T20:06:36.000000Z | 2025-04-01T15:06:29.000000Z | 39.984450491475 | 1003 |
| 2025-04-01T22:20:10.000000Z | 2025-04-01T14:20:01.000000Z | 37.488284910965 | 1002 |
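As a quick sanity check on the units (a standalone back-of-the-envelope query, not part of the pipeline; I'm assuming `datediff` accepts `'s'` as the period):

```sql
-- Operation 1001: 200 Wh delivered over ~4.118 h should give ~48.56 W
SELECT
  200.0 / (datediff('s',
                    cast('2025-04-01T14:10:59.000000Z' AS timestamp),
                    cast('2025-04-01T18:18:05.000000Z' AS timestamp)) / 3600.0) AS mean_power_w;
```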
### Expand the value across hours
Now I’d like to expand each row into hourly buckets between `start_time` and `end_time`, filling each hour with the energy accumulated during that hour based on the mean power (e.g. operation 1001’s 14:00 bucket only covers 14:10:59–15:00:00, so it gets 48.5633 W × ~0.817 h ≈ 39.67 Wh).
For our example:
| ts | operationId | energy |
| --- | --- | --- |
| 2025-04-01T14:00:00.000Z | 1001 | 39.67354647241263 |
| 2025-04-01T14:00:00.000Z | 1002 | 24.981776528168066 |
| 2025-04-01T15:00:00.000Z | 1001 | 48.563334682314 |
| 2025-04-01T15:00:00.000Z | 1003 | 35.66390848003506 |
| 2025-04-01T15:00:00.000Z | 1002 | 37.488284910965 |
| 2025-04-01T16:00:00.000Z | 1001 | 48.563334682314 |
| 2025-04-01T16:00:00.000Z | 1003 | 39.984450491475 |
| 2025-04-01T16:00:00.000Z | 1002 | 37.488284910965 |
| 2025-04-01T17:00:00.000Z | 1001 | 48.563334682314 |
| 2025-04-01T17:00:00.000Z | 1003 | 39.984450491475 |
| 2025-04-01T17:00:00.000Z | 1002 | 37.488284910965 |
| 2025-04-01T18:00:00.000Z | 1001 | 48.563334682314 |
| 2025-04-01T18:00:00.000Z | 1003 | 39.984450491475 |
| 2025-04-01T18:00:00.000Z | 1002 | 37.488284910965 |
| 2025-04-01T19:00:00.000Z | 1003 | 39.984450491475 |
| 2025-04-01T19:00:00.000Z | 1002 | 37.488284910965 |
| 2025-04-01T20:00:00.000Z | 1003 | 39.984450491475 |
| 2025-04-01T20:00:00.000Z | 1002 | 37.488284910965 |
| 2025-04-01T21:00:00.000Z | 1002 | 37.488284910965 |
| 2025-04-01T22:00:00.000Z | 1002 | 37.488284910965 |
How can I expand the results between `start_time` and `end_time` per hour, filling each hour with the correct `mean_power_w` (converted to `wh` for that period)?
Is there a way to achieve this using `SAMPLE BY`, joins, or maybe `generate_series`-like logic in QuestDB?
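For context, this untested sketch is roughly the kind of `generate_series`-like expansion I have in mind: cross join each operation interval with a generated run of hours (here `long_sequence(48)`, which assumes no interval spans more than 48 hours and that a table function can be used as a join operand) and keep the hours that fall inside the interval. It also doesn't yet prorate the partial first/last hours:

```sql
-- Untested sketch: expand each (start_time, end_time, mean_power_w) interval
-- into hourly buckets by cross joining it with a generated sequence of hours.
WITH per_op AS (
  SELECT
    timestamp AS end_time,
    cast(prev_ts AS timestamp) AS start_time,
    operationId,
    (wh - prev_wh) / ((cast(timestamp AS DOUBLE) - prev_ts) / 3600000000.0) AS mean_power_w
  FROM (
    SELECT
      timestamp,
      wh,
      operationId,
      lag(wh) OVER (PARTITION BY operationId ORDER BY timestamp) AS prev_wh,
      lag(cast(timestamp AS DOUBLE)) OVER (PARTITION BY operationId ORDER BY timestamp) AS prev_ts
    FROM meter
  )
  WHERE prev_ts IS NOT NULL
)
SELECT
  -- n-th hour bucket after the start of the interval, aligned to the hour
  dateadd('h', cast(x - 1 AS int), timestamp_floor('h', p.start_time)) AS ts,
  p.operationId,
  p.mean_power_w AS energy  -- still needs prorating for partial first/last hours
FROM per_op p
CROSS JOIN long_sequence(48)  -- caps intervals at 48 hours
WHERE dateadd('h', cast(x - 1 AS int), timestamp_floor('h', p.start_time)) <= p.end_time
ORDER BY ts, operationId;
```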