Always On vs simulated computations


Thanks to @seth1, I was inspired to simulate Sense’s basic Always On calculations on an hourly basis rather than Sense’s half-second basis (hourly is the best I can do using exported data), and to compare the results against my actual Always On to see if there is any correlation. I had thought about doing this in the past, but had always figured it might be a little pointless because of the hourly granularity of the export data. When @seth1 saw some correlating results, that pushed me to experiment in spite of my doubts.

To simulate, I used two calculations:

  • min24 - computes the minimum Total Usage over the previous 24 hours. This is the old “high watermark” calculation used by Sense until about mid-late 2018.

  • percent48 - computes the 1% quantile (1st percentile) of Total Usage over the past 48 hours. This is the newer computation that is more immune to data dropouts that show up as 0. The value is computed using the R quantile function.
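For anyone who wants to try the two calculations without R, here is a minimal Python sketch. It is not the attached script. It assumes the hourly Total Usage values are already in a plain list, that each window ends at (and includes) the current hour, and that hours without a full window of history get no value. The names `quantile_type7` and `rolling` are made up for illustration; the interpolation rule is the one R's quantile function uses by default (type 7).

```python
import math

def quantile_type7(values, p):
    """Quantile with linear interpolation between order statistics
    (the default rule, type 7, of R's quantile function)."""
    xs = sorted(values)
    h = (len(xs) - 1) * p              # fractional 0-based position
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])

def rolling(usage, window, fn):
    """Apply fn to each trailing window of `window` hourly values.
    Assumption: hours without a full window of history yield None."""
    out = []
    for i in range(len(usage)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(fn(usage[i + 1 - window : i + 1]))
    return out

def min24(usage):
    """Minimum hourly Total Usage over the trailing 24 hours."""
    return rolling(usage, 24, min)

def percent48(usage):
    """1% quantile of hourly Total Usage over the trailing 48 hours."""
    return rolling(usage, 48, lambda w: quantile_type7(w, 0.01))
```

One thing this makes visible: with only 48 hourly points, the 1% quantile falls between the two lowest values in the window. For a window holding a single 0 and 47 readings of 200 W, min gives 0 while the 1% quantile gives about 94 W, so at hourly granularity a lone dropout is softened rather than ignored.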

Three caveats as well:

  • hourly vs. half-second calculations - Timing is everything. A half-second sample/calculation is going to capture Total Usage minimums that are almost always lower than the hourly value, which is the average of 7200 half-second periods. This is a big deal, especially for the min24 calculation, which only looks for the absolute lowest sample in the 24-hour period. The sampling difference means that my simulated hourly calculations will almost always be larger than the corresponding Sense-computed Always On, which looks at half-second intervals.

  • Dealing with bad data - I’m not sure how Sense deals with bad data coming from the monitor, but I have seen two forms of bad data in my history timeline: negative Total Usage and basic dropouts. For my purposes, dropouts default to zero and any negative Total Usage values are converted to zero. Another option would be to convert these to NAs and leave those values out of the min and quantile calculations, but in some cases that wouldn’t leave me with enough data in the 48-hour window to do a workable quantile calculation. So there are likely more zeros coming out of my simulations than Sense would see.

  • No smartplug offset - I’m not yet looking at subtracting off smartplug or Hue “Always On” contributions that Sense put into place about the time the Hue/smartplug integrations were added. So my simulated values after Sept/Oct 2018 will also be high because of that.
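The first two caveats can be sketched in a few lines of Python (again, not the attached R script). The sketch assumes a dropout shows up as `None` in the list of readings, which is my own stand-in for however the export marks missing hours, and all three function names are hypothetical.

```python
def clean_usage(raw):
    """Map dropouts (None) and negative readings to 0.0,
    the approach described in the bad-data caveat."""
    return [0.0 if v is None or v < 0 else float(v) for v in raw]

def drop_bad(raw):
    """The alternative: discard bad readings instead of zeroing them.
    This shrinks the window, which is the drawback noted above."""
    return [float(v) for v in raw if v is not None and v >= 0]

def hourly_means(samples, per_hour=7200):
    """Average half-second samples into hourly values
    (7200 half-second samples per hour); a trailing partial hour is ignored."""
    return [
        sum(samples[i : i + per_hour]) / per_hour
        for i in range(0, len(samples) - per_hour + 1, per_hour)
    ]
```

Because every hourly mean is at least as large as the smallest sample inside that hour, `min(hourly_means(samples))` can never fall below `min(samples)`. A brief dip to 120 W inside an otherwise steady 300 W hour barely moves the hourly average, but it would set the half-second minimum.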

With that, here are the comparisons:

Here’s min24 in action vs. Always On. As expected, almost all the points are north of the 45-degree unity line. We see lots of vertical and horizontal lines in the same colors thanks to the quantization effect of looking at min values over a 24-hour period. Also as expected, there are many more zeros on the min24 simulation side.

And here’s percent48 in action. Very similar to min24, except that the statistical function (1% quantile) filters out some of the biggest outliers, hence a little tighter packing toward the unity line. BTW - I’m guessing that most of the zeros on either axis, plus the points south of the unity line, represent bad data coming out of the export due to the aforementioned data dropouts and negative Total Usage.
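To put a number on "almost all the points are north of the unity line", one option is to count the share of hours where the simulated value exceeds Sense's Always On. This is a rough Python sketch, not part of the attached R script; it assumes the simulated value is on the vertical axis, that both series are aligned hour by hour, and that missing values are `None`. The function name is hypothetical.

```python
def share_above_unity(sense_always_on, simulated):
    """Fraction of hours where the simulated value is strictly above
    Sense's Always On (i.e. north of the unity line).
    Hours where either value is missing (None) are skipped."""
    pairs = [
        (a, s)
        for a, s in zip(sense_always_on, simulated)
        if a is not None and s is not None
    ]
    if not pairs:
        return None
    above = sum(1 for a, s in pairs if s > a)
    return above / len(pairs)
```

Running it once with the min24 series and once with the percent48 series would give a quick side-by-side figure to go with the scatter plots.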

Here’s the current code:
EstAlwaysOn.R (4.4 KB)