I want to get both the maximum/minimum value and the time when the maximum/minimum value occurred
create table
CREATE TABLE IF NOT EXISTS weather_behavior (
city STRING,
site STRING,
air int,
temperature double,
humidity int,
windLevel int,
ts TIMESTAMP
) TIMESTAMP(ts) PARTITION BY MONTH WAL;
query sql
SELECT
to_str(ts, 'yyyy-MM-dd HH:mm') AS hour_time,
sum(temperature) AS total_value,
avg(temperature) AS avg_value,
max(temperature) AS max_value,
--max_value_time
min(temperature) AS min_value,
--min_value_time
FROM weather_behavior
WHERE ts in ('2020-12-01')
SAMPLE BY 1h
I tried the sub_query/frist_value/last_value but later the syntax was wrong
then I Use Join table t2,t3,but timeout, query aborted xxx
with t1 as(
SELECT
city,
ts,
to_str(ts, 'yyyy-MM-ddTHH:mm') AS hour_time,
sum(temperature) AS total_value,
avg(temperature) AS avg_value,
max(temperature) AS max_value,
--max_value_time
min(temperature) AS min_value,
--min_value_time
FROM weather_behavior
WHERE ts in ('2020-12-01') AND city = 'test'
SAMPLE BY 1h
)select t1.*,min(t2.ts) max_value_ts ,min(t3.ts) min_value_ts from t1
join weather_behavior t2 on t2.ts in t1.hour_time AND t2.city= t1.city and t2.temperature=t1.max_value
join weather_behavior t3 on t3.ts in t1.hour_time AND t3.city= t1.city and t3.temperature=t1.min_value
``
### I just want to find out the results through 1 sql, do not want to query many times, what is the best way
DECLARE
@sample_interval := 1h,
@sample_interval_str := '1h'
with w_max_min as (
select *,
rank() over(partition by timestamp_floor(@sample_interval_str, ts) order by temperature ASC) as ranking_min,
rank() over(partition by timestamp_floor(@sample_interval_str, ts) order by temperature DESC) as ranking_max
FROM weather_behavior
)
SELECT
ts,
sum(temperature) AS total_value,
avg(temperature) AS avg_value,
max(temperature) AS max_value,
first_not_null(case WHEN ranking_max = 1 THEN ts END) as ts_of_max,
min(temperature) AS min_value,
first_not_null(case WHEN ranking_min = 1 THEN ts END) as ts_of_min,
FROM w_max_min
SAMPLE BY @sample_interval;
I am first using window functions to get the ranking of the rows according to max and min temperature. Since you want to SAMPLE BY 1h, I had to use a trick and partition by timestamp_floor('1h', ts), so the ranking will be relative to events in the same hour.
Now in the SAMPLE BY query, I use a CASE to nullify a column for every row which does not have 1 as the ranking, then I use first_not_null to extract the max/min timestamp for each hour interval.
Note I am using DECLARE in the query, as it can be tricky to use the same units across the query. Unfortunately I have to use two variables as the timestamp_floor expects a string and SAMPLE BY expects a literal. Still, I like to have it at the top, so I can easily change the sampling interval.
glad it helped! By the way, in case there are multiple timestamps with the same max or min value for the hour, it will just return the first one. If you want to return instead a list of all the timestamps, the query can be adapted using string_agg like this:
DECLARE
@sample_interval := 1h,
@sample_interval_str := '1h'
with w_max_min as (
select *,
rank() over(partition by timestamp_floor(@sample_interval_str, ts) order by temperature ASC) as ranking_min,
rank() over(partition by timestamp_floor(@sample_interval_str, ts) order by temperature DESC) as ranking_max
FROM weather_behavior
)
SELECT
ts,
sum(temperature) AS total_value,
avg(temperature) AS avg_value,
max(temperature) AS max_value,
string_agg(DISTINCT case WHEN ranking_max = 1 THEN ts::varchar END, ',') as ts_of_max,
min(temperature) AS min_value,
string_agg(DISTINCT case WHEN ranking_min = 1 THEN ts::varchar END, ',') as ts_of_min,
FROM w_max_min
SAMPLE BY @sample_interval;