Some Data Science of “On”, “Off” and “Idle”

Just a caveat before I start “Installment 2” on this topic. I don’t profess to know how Sense actually does “Idle” vs. “On” detection nor do I have access to the fine-grained data they have to work with. But I want to share how one might do an assessment with the export data available.

Installment 2 - Data Science of “On”, “Off” and “Idle”

As I mentioned in the previous installment, the next important step is to figure out a density level for the second biggest maximum, or secondary mode if one exists, that indicates a true data mode, while filtering out peaks caused by data dropouts. Two observations here to help narrow the possibilities:

First, one only needs to worry about filtering out dropout noise that exists between 0.000kWh (Off) and the primary mode (the biggest one), because data dropout noise is subtractive - it lowers data values that should be attributed to the primary mode, but it cannot increase data values. So for now, I’ll look at data points above the main mode as legitimate data, but filter secondary modes that lie between 0.000kWh and the primary mode using some chosen threshold amplitude.
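One way this asymmetric filter might be sketched in Python (not necessarily the procedure behind the plots in this thread, and certainly not what Sense does internally). The function name `find_modes`, the 0.001kWh bin width, and the choice to express the threshold as a fraction of the primary mode's count are all assumptions made for illustration.

```python
from collections import Counter

def find_modes(hourly_kwh, threshold, bin_width=0.001):
    """Return (primary_mode_kwh, kept_secondary_modes_kwh).

    threshold: hypothetical cutoff, as a fraction of the primary mode's count,
    applied only to peaks that lie below the primary mode.
    """
    # Histogram: assign each hourly reading to the nearest bin index.
    counts = Counter(round(v / bin_width) for v in hourly_kwh)
    primary = max(counts, key=counts.get)
    kept = []
    for b, n in sorted(counts.items()):
        if b == primary:
            continue
        # Keep only local peaks (taller than the left bin, no shorter than the right).
        if n <= counts.get(b - 1, 0) or n < counts.get(b + 1, 0):
            continue
        if b > primary:
            # Dropouts only subtract, so peaks above the primary are taken as real.
            kept.append(round(b * bin_width, 6))
        elif n >= threshold * counts[primary]:
            # Below the primary, a peak must clear the threshold to count.
            kept.append(round(b * bin_width, 6))
    return round(primary * bin_width, 6), kept

# Synthetic example data, not real export values.
sample = [0.013] * 100 + [0.006] * 5 + [0.025] * 3
primary_mode, secondary_modes = find_modes(sample, threshold=0.1)
```

In this made-up sample, the small bump at 0.006kWh sits below the primary mode and falls under the threshold, so it is discarded as likely dropout noise, while the bump at 0.025kWh is kept because it lies above the primary mode.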

Second, with the current export resolution, I can’t do a census of the number of data dropouts. I would need a resolution of a second or finer to detect all of them, so I need to find a proxy for how frequently they occur. Fortunately, the nice digital nature of the recirc pump energy usage gives me a good proxy: the number of in-between hours vs. the total number of hours. Summing the numbers, I see 49 in-between values out of 618 samples, giving about an 8% incidence of dropout noise errors for the recirc pump. But the recirc pump is only on 3/4 of the time, so I have to multiply by 4/3 to get the worst case, or roughly 10.6%. Assuming that all HS110s see about the same % dropout, and that there are no other huge causes of bad data, a 12% threshold for the second mode density should be enough to filter out dropout-related modes, though we may accidentally lose a small real secondary mode or two in the process.
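The arithmetic behind that threshold can be checked in a few lines of Python. The counts and the 3/4 duty cycle come from the paragraph above; the variable names are only illustrative.

```python
# Figures quoted above for the recirc pump export data.
in_between_hours = 49     # hours whose value falls between the "Off" and "On" levels
total_hours = 618         # total hourly samples
pump_on_fraction = 3 / 4  # the pump only runs about 3/4 of the time

# Raw incidence of dropout-affected hours.
observed = in_between_hours / total_hours

# Dropouts can only show up while the pump is drawing power,
# so scale up to a device that is on all the time (worst case).
worst_case = observed / pump_on_fraction

# Chosen threshold, with a little headroom over the worst case.
threshold = 0.12

print(f"observed:   {observed:.1%}")
print(f"worst case: {worst_case:.1%}")
print(f"headroom:   {threshold - worst_case:.1%}")
```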

So with this new filtering criterion, let me look at a few more devices on smartplugs.

Sonos Amplifiers
Here’s what the 1 sec resolution energy waveform for our two Sonos Connect:AMPs looks like. Combined, they draw a steady 13Wh most of the time, with spikes when the music is turned on in the rooms they supply.

In the histogram, the primary mode is as expected, around 13Wh, with a small secondary mode between zero and the main mode, which again is likely attributable to dropout noise.

If I look in the hourly data, I see 3 datapoints past the local minimum to the right of the primary mode (red triangle just left of 0.020kWh). Those datapoints correspond to the 3 most recent periods I had the speakers turned on, mostly to see if I could force Sense’s algorithm to see both Idle and On states. The three datapoints I “detected” just barely met my criteria for being beyond the primary mode. Sense still hasn’t identified a separate Idle vs. On for the Sonos yet, possibly because I haven’t left it on long enough with the music cranked.
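A rough Python sketch of how one might locate that local minimum and pull out the hours beyond it. This is an assumption about one workable approach, not the actual code behind the histogram; the function name `points_beyond_primary` and the 0.001kWh bin width are made up for the example.

```python
from collections import Counter

def points_beyond_primary(hourly_kwh, bin_width=0.001):
    """Return (cutoff_kwh, readings that lie past the local minimum
    to the right of the primary mode)."""
    counts = Counter(round(v / bin_width) for v in hourly_kwh)
    primary = max(counts, key=counts.get)
    highest_bin = max(counts)
    # Walk right from the primary mode for as long as the histogram keeps falling.
    b = primary
    while b < highest_bin and counts.get(b + 1, 0) < counts.get(b, 0):
        b += 1
    # b is now the first bin where the count stops dropping: the local minimum.
    beyond = [v for v in hourly_kwh if round(v / bin_width) > b]
    return round(b * bin_width, 6), beyond

# Synthetic example data, not real export values.
sample = [0.013] * 50 + [0.014] * 10 + [0.015] * 2 + [0.021, 0.022, 0.024]
cutoff, on_candidates = points_beyond_primary(sample)
```

With this made-up sample the walk stops at 0.016kWh, and the three readings above it are flagged as candidate “On” hours.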

Washing Machine
Here’s the waveform for our washing machine with a time resolution of 1 second. Mostly Idle at around 2Wh 98% of the time, but with a very broad range of energy usage when it is running.

And here is the associated energy histogram. My algorithm suggests that 20Wh might be a good breakpoint between Idle and On. Please note that the wash cycle runs about 1 1/2 hours, so the energy used during a cycle will fall across 2-3 exported hours. The 20Wh breakpoint probably represents 5 minutes or so of a wash cycle falling into an exported hour.

If I look at the hourly data, 20Wh seems like a good breakpoint for the Idle to On transition. I see 78 hours greater than 20Wh out of a total of 969 hours sampled, or On about 8% of the time. That jibes with a quick visual check of the Washing Machine Power Meter over the time period the smartplug was installed.
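Counting the hours above a breakpoint is easy to reproduce from the hourly export. The sketch below uses synthetic readings shaped to match the counts quoted above (78 of 969 hours); `on_fraction` and the sample values are hypothetical, and only the 0.020kWh breakpoint comes from the text.

```python
def on_fraction(hourly_kwh, breakpoint_kwh=0.020):
    """Return (on_hours, total_hours, fraction) for hours above the breakpoint."""
    on_hours = sum(1 for v in hourly_kwh if v > breakpoint_kwh)
    total_hours = len(hourly_kwh)
    return on_hours, total_hours, on_hours / total_hours

# Synthetic stand-in for the export: 891 idle hours near 2Wh, 78 hours with a wash running.
sample = [0.002] * 891 + [0.150] * 78
on_hours, total_hours, fraction = on_fraction(sample)
print(f"{on_hours} of {total_hours} hours On ({fraction:.1%})")
```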

In the next installment I’m going to look at my furnaces, but they will require a little additional background because the fans for both furnaces were previously identified by Sense.