Computing daily difference using synthetic variable


#1

I have a raw variable which is continuously increasing (until it eventually overflows and restarts at 0). It’s similar to a water meter that generates pulses.

What I would like to create is a new synthetic variable, which contains one value each day, and each value is equal to pulse count at midnight - yesterday’s pulse count at midnight. Then I want to display this as a chart on my dashboard, so that users can see daily totals.

I’m sending the data every 30 seconds, with Ubidots industrial, but I don’t want to depend on assuming that all the data is there, because there could be network interruptions. Therefore I don’t want to use shift to step back over a day’s worth of data.

Here is a simple example, [SOLVED] Creating a synthetic variable for pulse counter sensors but my goal is to go farther than this, and create a very sparse time series with only 1 value for each day, where each value represents the total number of pulses in a day (doing this by month and year is also a goal of mine).

Here is my first guess at an expression, where var1 is the raw variable that is the pulse count. I am ignoring the overflow condition. Will this run every day at midnight, and be timestamped to midnight?

last(var1,‘1D’) - min(var1,‘2D’)

will these three always produce the same result, due to the fact that data_range is 1D?:
last(var1,‘1D’)
max(var1,‘1D’)
min(var1,‘1D’)

On the Synthetic Variables Basics page, https://help.ubidots.com/user-guides/analytics-synthetic-variables-basics
I have the following question regarding the offset example (“A = sum(raw_variable, “8H”, offset=6)”). Is offset required to be positive, and is offset required to be an integer?


#2

Hello Poven,

Before getting into my explanation and how to address your request, can you advise your thinking process about what would be the outcome for the expression:

last(var1,‘1D’) - min(var1,‘2D’)

From my side and knowing the synthetic variables engine, I can say this won’t give the expected result of:

but other different from that.

Beyond that, your question about whether or not the following expressions, last(var1,‘1D’), max(var1,‘1D’), min(var1,‘1D’), would render the same result, the answer is no. Respectively, the outcome for each of these expressions is:

  1. Last value received each day timestamp at the beginning of the day, that is, at 00:00:00.
  2. The maximum value of the day timestamp at the beginning of the day, that is, at 00:00:00.
  3. The minimum value of the day timestamp at the beginning of the day, that is, at 00:00:00.

Given your variable is an incremental value until it overflows and restarts at 0, sometimes it would be possible to have the below scenarios depending on when the variable overflows:

  1. last(var1,‘1D’) = max(var1,‘1D’)
  2. `last(var1,‘1D’) = min(var1,‘1D’)``

Finally, in order to achieve your goal:

It’s necessary to use shift() function in order to access your pulse count variable previous value. I would it as follows (assuming {{var}} is an incremental variable consisting of pulse counts):

last({{var}}, ''1D") - shift(last({{var}}, ''1D"), -1)

herein, you would be logging the pulse count for each day and timestamping it a the beginning of the day (00:00:00).

Also, how would a connection issue affect this computation, that was your first stopping reason to use shift() function?


#3

Thanks for your explanations. My thinking for last({{var}},'1D') - min({{var}},'2D') was that the last value is obviously the current value, and if I understand it correctly, when the date range is specified in units of D, the calculation runs at midnight. the min part of the expression was intended to capture the value from 24 hours ago, since this is a monotonic series (except in the case of the overflow), and I thought incorrectly that I would need to specify 2D, since the current day is the day of execution, which counts as 1 whole day, even though we are 0 seconds into that day. So I would specify 2 days, in order that the previous 24 hours of yesterday and the 0 seconds of today are all incorporated into the min expression. I see this was not correct.

Your further explanation of the differences between last, max, and min is enlightening. If I understand it correctly now, I see that this function basically looks at all data from within each specified date range interval, and takes the value which is applicable (e.g. maximum or minimum or last) and then remaps that data value to a timestamp for the beginning of that interval. It does this all the way through the time series, creating a new time series with a timestamp every date range interval, e.g. every 1D, or every 6H, depending on what was specified.

My hesitation with shift was that that on your Supported Synthetic Variable Functions PDF (https://res.cloudinary.com/di2vaxvhl/image/upload/v1551994025/Supported-Synthetic-Variable-Functions.pdf), you show the return value as a single value for the Data Range Functions, whereas you show the return value as an array (i.e. time series) for all other functions (including shift). Since shift takes a time series (array) as an argument, I thought that I couldn’t nest a data range function such as max inside the shift function.

Regarding my concern with the connection issue, if I were to use shift on the raw pulse count time series, my concern was that making the assumption that I could shift by (24 hours * 3600 sec/hour / ( 1 count / 30 sec )) would only work if I had a data value for each 30 seconds, if some values were missing, then my shift would be off and the errors would accumulate over time! Your data rage functions which allow the data series to be remapped to other time units are invaluable to avoid this!!! I am just learning how to use them.

I will try your code and let you know how it turns out. Thanks.


#4

Hi Peter,

Glad to know you have a better picture of how the Synthetic variables engine is built to work.

Let me know the outcome using the expression advised above.


#5

Hi David,

So far it doesn’t appear to be working. I left it overnight so that the midnight would trigger it. The synthetic variable still has no data in it. Here is my code:
(last({{btu-energytotalmode1},'1D') - shift(last({btu-energytotalmode1}},'1D'),-1)) / 3412
The raw variable btu-energytotalmode1 is in units of kBtu, and is always increasing (it is a cumulative total). I am taking the daily difference and dividing by 3412 to get MWh.

In the interest of getting results a little quicker than every day at midnight, I also created a new synthetic variable called test1 which is equal to the sum of a timestamp raw variable and a temperature raw variable (this gives it a somewhat random but generally increasing over time value). Then I made another synthetic variable with the expression
last({{test1}},'1H') - shift(last({{test1}},'1H'),-1)
but this still is not working after almost a day. It says no data found just like for the one above.


#6

I will post the solution here once I have it working. For now, one major error in my code above is that the date range string must be escaped with double quotes (") rather than single quotes (’).

e.g.:
CORRECT: last(var,“1H”)
WRONG: last(var,‘1H’)


#7

David from Ubidots looked into this further and determined that the the computation would work in a single synthetic variable if it was put into a multiline format. Here is a working example:

x = last({{var}}, “1D”)
(x - shift(x, -1)) / 3412