Time Series Downsampling/Upsampling in MonetDB
Introduction
MonetDB is a column-oriented SQL database that can store and query large volumes of timestamped data efficiently, but changing the sampling rate of such data takes some care. In this article, we will explore how to downsample and upsample time series data using MonetDB.
Understanding Time Series Data in MonetDB
In MonetDB, time series data is stored as an ordinary table with one column per attribute (e.g., timestamp, station ID, temperature). The data is queried with standard SQL, using the timestamp column to filter, group, and order rows. Because MonetDB stores each column separately, a query that reads only the timestamp and temperature columns does not have to scan the others.
Aggregating Time Series Data
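To downsample time series data in MonetDB, we aggregate the data over a fixed period (e.g., 2 days or 12 hours). This can be achieved with aggregate functions such as AVG, SUM, and MIN combined with date functions such as CAST("time" AS DATE) and EXTRACT(HOUR FROM "time"). Note that EXTRACT accepts fields such as YEAR, MONTH, DAY, and HOUR; DATE is not one of them, so the date part of a timestamp is obtained with CAST. Upsampling goes the other way: it adds rows between existing readings and therefore cannot be done by aggregation alone.
As in any SQL database, every non-aggregated expression in the SELECT list must also appear in the GROUP BY clause. A period that is not a single calendar unit therefore needs more than one grouping expression: 12-hour periods, for example, are identified by the date together with the half of the day derived from the hour.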
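Grouping by 12-hour periods can be sketched as follows. This is a minimal sketch, assuming the table is named datapoints and has the columns shown in the sample data; the half_day alias is introduced here for illustration only. The column name is quoted because time is also the name of an SQL data type.

```sql
-- Average temperature per station for each 12-hour period.
-- EXTRACT(HOUR ...) yields 0-23; integer division by 12 gives
-- 0 for 00:00-11:59 and 1 for 12:00-23:59.
SELECT id_station,
       CAST("time" AS DATE)           AS "day",
       EXTRACT(HOUR FROM "time") / 12 AS half_day,
       AVG(temperature)               AS avg_temp
FROM datapoints
GROUP BY id_station,
         CAST("time" AS DATE),
         EXTRACT(HOUR FROM "time") / 12
ORDER BY id_station, "day", half_day;
```

Grouping by id_station keeps the stations separate; leaving it out would average the readings of all stations together.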
Sample Data
Let’s look at sample data that represents temperature and discharge readings taken every 10 seconds over a period of 24 hours. The dataset covers three stations; the excerpt below shows station 0 only.
+----------------------------+------------+-------------+-----------+
| time                       | id_station | temperature | discharge |
+============================+============+=============+===========+
| 2019-03-01 00:00:00.000000 |          0 |     407.052 |     0.954 |
| 2019-03-01 00:00:10.000000 |          0 |     407.052 |     0.954 |
| 2019-03-01 00:00:20.000000 |          0 |     407.051 |     0.954 |
| 2019-03-01 00:00:30.000000 |          0 |     407.051 |     0.953 |
| 2019-03-01 00:00:40.000000 |          0 |     407.051 |     0.952 |
| ...                        |        ... |         ... |       ... |
| 2019-03-01 23:59:50.000000 |          0 |     406.982 |     0.944 |
+----------------------------+------------+-------------+-----------+
Querying the Data
We want to query this data and obtain an average temperature reading every two days. A natural first step is a daily average, grouping on the date part of the timestamp:
SELECT CAST("time" AS DATE) AS "day", AVG(temperature) AS avg_temp
FROM datapoints
GROUP BY CAST("time" AS DATE);
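This query returns one average per calendar day, but the goal is one value every two days. MonetDB has no SAMPLE BY clause (that syntax comes from other time series engines), so the two-day period has to be computed explicitly and used as the grouping expression.
Downsampling Data
-- Assumes sys.epoch() applied to a timestamp returns whole seconds since 1970-01-01.
SELECT sys.epoch("time") / 172800 AS bucket,  -- 172800 s = 2 days, integer division
       MIN("time") AS first_reading,
       AVG(temperature) AS avg_temp
FROM datapoints
GROUP BY sys.epoch("time") / 172800;
This query assigns every reading to a two-day bucket counted from the Unix epoch and returns one average per bucket, so the result has far fewer rows than the original dataset. Because the sample above spans only 24 hours, a longer dataset is needed to see more than one or two rows. To keep the stations separate, add id_station to both the SELECT list and the GROUP BY clause.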
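As a worked illustration of fixed two-day periods, one option is to number the periods from the Unix epoch. This is a sketch under the assumption that timestamps are interpreted as UTC; the symbols b and t are introduced here for illustration. With t the number of seconds since 1970-01-01 00:00:00 and a period of 2 × 86400 = 172800 seconds:

```latex
b(t) = \left\lfloor \frac{t}{172800} \right\rfloor
\qquad
\text{2019-03-01 00:00:00} \;\Rightarrow\; t = 1551398400,\quad
b = \left\lfloor \frac{1551398400}{172800} \right\rfloor = 8978
```

Since 1551398400 = 8978 × 172800 exactly, this timestamp opens a new period, and every reading before 2019-03-03 00:00:00 shares the same value of b. Grouping on b therefore yields one row per two-day period.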
Upsampling Data
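To upsample the data, aggregation is not enough, because the result must contain timestamps that do not exist in the table. One approach is to generate the denser time grid and join the readings onto it. Note that INTERVAL is a data type for durations, not an aggregate function.
-- Assumes sys.generate_series accepts timestamp bounds with an interval step
-- and exposes its output as a column named value; the upper bound is exclusive.
SELECT g.value AS "time", d.temperature
FROM sys.generate_series(TIMESTAMP '2019-03-01 00:00:00',
                         TIMESTAMP '2019-03-02 00:00:00',
                         INTERVAL '5' SECOND) AS g
LEFT JOIN datapoints AS d
       ON d."time" = g.value AND d.id_station = 0;
This query produces one row every 5 seconds for station 0, twice the original 10-second rate. Timestamps that have no matching reading carry a NULL temperature; these gaps are then filled by interpolating between the neighbouring readings, either in a follow-up query or in the client.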
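The interpolation step can be made concrete with linear interpolation. This is one common choice and an assumption here, since the text does not name a method. For a timestamp t between two readings (t_0, T_0) and (t_1, T_1), with the readings at 00:00:10 and 00:00:20 from the sample data as a worked case:

```latex
T(t) = T_0 + (T_1 - T_0)\,\frac{t - t_0}{t_1 - t_0}
\qquad
T(\text{00:00:15}) = 407.052 + (407.051 - 407.052)\cdot\frac{5}{10} = 407.0515
```

The interpolated value lies halfway between the two readings because 00:00:15 is halfway between their timestamps.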
Using MonetDB’s SAMPLE Clause
MonetDB does not provide a SAMPLE function that can be called inside the SELECT list. What it provides is a SAMPLE clause at the end of a query, which takes one argument: either a whole number of rows or a fraction between 0 and 1. For example:
SELECT "time", id_station, temperature
FROM datapoints
SAMPLE 1000;
This query returns up to 1000 rows chosen at random from the table. Because the rows are random rather than evenly spaced in time, the SAMPLE clause is useful for quick inspection of a large table but is not a substitute for period-based downsampling with GROUP BY.
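Where the goal is to keep every second daily average in a predictable way, a deterministic alternative is to number the rows and filter on the row number. The following is a minimal sketch using only standard window functions; the aliases daily, numbered, and rn are hypothetical names introduced for this example.

```sql
-- Keep every second daily average (rows 1, 3, 5, ... in date order).
SELECT "day", avg_temp
FROM (
    SELECT "day", avg_temp,
           ROW_NUMBER() OVER (ORDER BY "day") AS rn
    FROM (
        -- One average per calendar day, as in the daily query.
        SELECT CAST("time" AS DATE) AS "day",
               AVG(temperature)     AS avg_temp
        FROM datapoints
        GROUP BY CAST("time" AS DATE)
    ) AS daily
) AS numbered
WHERE MOD(rn, 2) = 1
ORDER BY "day";
```

Unlike grouping into two-day buckets, this approach discards every other daily value instead of averaging it in, so it is better suited to thinning a series than to summarising it.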
Conclusion
Time series downsampling and upsampling in MonetDB takes some care. Downsampling is an aggregation problem: AVG, SUM, and MIN combined with GROUP BY on an expression derived from the timestamp reduce the data to one row per period (e.g., 2 days or 12 hours). Upsampling requires generating the missing timestamps and filling them in, typically by interpolation. The SAMPLE clause returns a random subset of rows and serves a different purpose from either.
By following the steps outlined in this article, you should be able to downsample and upsample time series data using MonetDB.
Last modified on 2025-03-28