Making Sense of 'Always On' - Chapter 2 - 'Always On' Blips

I decided to take one more look at Always On data to see how it has varied over time in my household for as long as I have had Sense data available (my data was reset at the start of Aug 17).

First a view of the detailed data:

  • Red dots are aggregated daily Always On totals
  • Blue dots are hourly Always On x 24 (to normalize with daily)
  • Green dots on the bottom indicate that there was some good data for that clock hour (still could be a data dropout for part of that hour).
  • Orange dots at the top indicate data was missing for an entire clock hour.
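
For anyone who wants to reproduce the normalization behind the red and blue dots, here is a minimal Python sketch. It assumes the hourly export has already been read into (day, hour, kWh) tuples, with None standing in for a clock hour that is missing entirely; that layout and the function name are made up for illustration and are not the actual Sense export format.

```python
from collections import defaultdict

def summarize_always_on(hourly_rows):
    """Summarize hourly Always On readings.

    hourly_rows: iterable of (day, hour, kwh) tuples, where kwh is None
    when the whole clock hour is missing (hypothetical layout).
    Returns (daily_totals, scaled_hourly, missing_hours).
    """
    daily_totals = defaultdict(float)   # red dots: kWh per day
    scaled_hourly = []                  # blue dots: hourly kWh x 24
    missing_hours = []                  # orange dots: no data for that hour
    for day, hour, kwh in hourly_rows:
        if kwh is None:
            missing_hours.append((day, hour))
            continue
        daily_totals[day] += kwh
        # Multiply by 24 so an hourly value sits on the same scale as a daily total.
        scaled_hourly.append((day, hour, kwh * 24))
    return dict(daily_totals), scaled_hourly, missing_hours
```

Note that a day with missing hours ends up with a lower daily total, which is one reason the hourly x 24 view is useful alongside the daily one.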

What do we see?

  • A long stable period from Aug 17 until June 18, where Always On mostly stays in a stable zone, though the number of hourly dips in Always On increases after March 18.
  • A wilder period of swings between June 18 and mid-Aug 18.
  • A very flat, low period between mid-Aug and mid-Sep 18.
  • A resumption of higher Always On with wider swings from mid-Sep to the end of Oct 18.
  • The start of the smartplug beta at the start of Nov 18, with Always On falling as the smartplug Always On data gets pulled out of the total. More dropouts and outlier low Always On hours.
  • The start of a new Always On calculation around Dec. 20. Lots of hourly dropouts and shorter dropouts as well.
  • Starting Jan 1, 2019, I did several things to reduce dropouts and they seem to have partially worked.

I’m guessing that each of these domains corresponds to a different tweak of the Always On computation, or of the way that Always On data gets aggregated into hours for export, each of which is probably a separate chunk of code handled by different people.

For a long time, I’ve been trying to figure out ways to detect/predict dropouts that aren’t visible in the export because they don’t cover a full clock hour. I know many exist because I can see plenty in the Power Meter, plus I observed a bunch when I was looking for discrepancies vs. my utility data. I started looking at all sorts of complex R packages to detect time series anomalies, but then realized that I had yet to try one of the simplest techniques used in R - Time Series Decomposition. It breaks down the series data into a ‘trend’, ‘seasonal/periodic’, and ‘random’ component using the Berlin Procedure algorithm.

Even though Always On is supposed to have relative stability over a 24hr (or 48hr) period because of the way it is derived, I decided to apply a 24 hour periodicity to the decomposition, and I used ‘additive’ rather than ‘multiplicative’ decomposition. The results were interesting:

I paid special attention to the ‘random’ component of the time series decomposition, as that really highlights anomalies that depart from any periodic or long-term trends. After peering at the negative dips, I found that there was a magic threshold around -0.05 kWh. Virtually all random dips below -0.05 kWh represent some kind of data dropout! Voila…
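
To make the idea concrete, here is a rough Python sketch of an additive decomposition with a 24 hour period, followed by the threshold test on the ‘random’ component. It is not the R code I ran: it implements the classical moving-average decomposition in plain Python, which may differ in detail from what a given R package does, and the function names are made up for illustration. The -0.05 kWh cutoff is just the value that worked on my household data, so treat it as an assumption to retune.

```python
def decompose_additive(x, period=24):
    """Classical additive decomposition: x = trend + seasonal + random.

    x: list of hourly kWh values with no gaps (at least 2 * period long).
    Returns (trend, seasonal, random); trend and random are None at the
    edges where the centered moving average is undefined.
    """
    n = len(x)
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")
    half = period // 2

    # Trend: centered moving average spanning one full period.
    trend = [None] * n
    for t in range(half, n - half):
        window = x[t - half : t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: the window has period + 1 points, so the two
            # end points get half weight each.
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period

    # Seasonal figure: average detrended value for each hour of the day,
    # shifted so the figure sums to zero over one period.
    sums = [0.0] * period
    counts = [0] * period
    for t in range(n):
        if trend[t] is not None:
            sums[t % period] += x[t] - trend[t]
            counts[t % period] += 1
    figure = [s / c for s, c in zip(sums, counts)]
    mean_figure = sum(figure) / period
    figure = [f - mean_figure for f in figure]
    seasonal = [figure[t % period] for t in range(n)]

    # Random: whatever is left after removing trend and seasonal parts.
    random = [
        None if trend[t] is None else x[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, random


def flag_dips(random, threshold=-0.05):
    """Return the indices whose random component falls below the threshold (kWh)."""
    return [t for t, r in enumerate(random) if r is not None and r < threshold]
```

A quick way to try it is to feed in a flat series of 0.2 kWh per hour with a single low hour; that hour should be the only index flagged. The first and last 12 hours come back as None because the moving average has no full window there, so dropouts at the very edges of the data will not be caught.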

Still have to test whether peaks have any meaning, or are just “recovery” from dips.