Why can't you train Sense?

So why can’t you “train” Sense? We get this question a lot and on the surface it seems reasonable: Sense looks at the signature of devices in order to detect them consistently in your home, so it should follow that if you turn off everything in your home except for a single device (or turn said device on and off repeatedly), Sense should be able to properly identify that device in isolation and then continue to look for it in the future. Unfortunately, it’s not that simple.

It’s not a matter of Sense just knowing the “pure” signature of a device in isolation. Those signatures are unique, yes, but they’re also unique to your particular home and, moreover, they look different depending on the other devices running in your home. And then there’s another complicating factor: Your home is a dynamic place. Devices get added, removed, and moved around. Power quality waxes and wanes at the microscopic scale. In short, while it’s easy to imagine the Sense device detection process as a series of simple 1:1 comparisons between your home data and a “dictionary” of device waveforms, the reality is quite different.
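To make the “dictionary” point concrete, here is a minimal Python sketch of what a naive 1:1 comparison might look like and how it can go wrong. Everything in it is an assumption made up for illustration: the device names, the wattage values, the `SIGNATURES` table, and the `naive_match` helper. It is not how Sense actually works.

```python
# Hypothetical "dictionary" of device signatures: watts over five time steps.
# All values are invented for illustration only.
SIGNATURES = {
    "kettle": [0, 1500, 1500, 1500, 0],
    "toaster": [0, 900, 900, 900, 0],
    "fridge": [0, 600, 150, 150, 0],  # startup surge, then a steady draw
}


def distance(a, b):
    """Sum of squared differences between two equal-length waveforms."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def naive_match(observed):
    """Return the name of the dictionary entry closest to the observed waveform."""
    return min(SIGNATURES, key=lambda name: distance(observed, SIGNATURES[name]))


# In isolation, the toaster matches its own entry exactly.
toaster_alone = SIGNATURES["toaster"]

# In a real home something else is usually running. Here an assumed
# 600 W load stays on for the whole window and adds to the total.
background = [600] * 5
toaster_in_home = [t + b for t, b in zip(toaster_alone, background)]

print(naive_match(toaster_alone))    # toaster
print(naive_match(toaster_in_home))  # kettle
```

With one other steady load running, the straight lookup already picks the wrong device. Real homes add noise, overlapping cycles, and devices that come and go, which is why a simple lookup does not hold up.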

The common machine learning analogy for what we do is trying to pick out a particular voice in a room of people speaking (the “cocktail party problem,” as it’s known). That’s already a very hard problem, and it’s a pretty strong analogy, but it’s not perfect. What if that voice keeps changing radically in quality (in vocal range, in accent, in speed, and so on)? That’s essentially what we’re dealing with.

With this in mind, the prospect of “training” Sense should look a lot more problematic. Sense needs to see repeated patterns, enough of them that it can consistently detect a device regardless of power line noise or any other devices running concurrently (which make the “voice” sound different). Exactly how many noise-altered iterations of a device Sense needs to see varies, but it’s far more than you would be able to comfortably tag. And this explanation hasn’t even touched on resolution: Sense looks at sub-second features, like the on/off transients of a device, to properly identify it. The resolution of the Power Meter is reduced to allow for a better viewing experience, but even if we showed you the full data at a 1 MHz resolution (our engineers are probably cringing while reading this), it would be nearly impossible to manually mark the exact beginning and end of a millisecond event with any accuracy, and Sense needs accuracy. To complicate things even further, the signatures of some devices, like LEDs, can change after being turned on and off in quick succession. And if you want another complicating factor: Manually turning off a device via its power switch or circuit breaker can look quite different from a “natural” device cycle. It’s a tough nut to crack, a nut wrapped in an inch-thick shroud of steel.
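The resolution point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1 kHz sample rate (far below the 1 MHz mentioned above), the wattage values, the 5 ms inrush spike, and the block-averaging `downsample` helper are all invented for the example and are not a description of how the Power Meter actually reduces its data.

```python
SAMPLE_RATE_HZ = 1000  # assumed rate for this sketch, one sample per millisecond


def simulate_one_second():
    """One second of made-up power data with a device turning on at 0.5 s."""
    samples = []
    for i in range(SAMPLE_RATE_HZ):
        if i < 500:
            samples.append(100.0)   # baseline load before the device turns on
        elif i < 505:
            samples.append(2000.0)  # 5 ms inrush spike (the "on" transient)
        else:
            samples.append(400.0)   # baseline plus the device's steady draw
    return samples


def downsample(samples, factor):
    """Average each block of `factor` samples into a single point."""
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples), factor)
    ]


full = simulate_one_second()
coarse = downsample(full, 500)  # two points per second, like a zoomed-out view

print(max(full))  # 2000.0
print(coarse)     # [100.0, 416.0]
```

In the full-rate data the transient is an unmistakable 2000 W spike lasting five samples. In the reduced view it only nudges the second point from 400 W to 416 W, so there is nothing left to mark by hand, let alone with millisecond accuracy.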

A related question we often get is, “Why can’t we tell you what devices we have in our home and you can look for them?” Sense Community user @kevin1 has provided a fantastic explanation via the analogy of facial recognition:

There are really two steps to the process. First, machine learning identifies the bounding boxes/circles for all the faces in a photo(s). After that a human can tag the faces with specific names. Subsequently, machine learning can begin associating names with at least some photos. But until machine learning has identified the face as a face, there’s no value in telling it that Jack is somewhere in the photo. Some photo environments do let a user define a facial region and assign a name, but that data is entirely for human benefit and NOT used for learning, because there is no “identification” for machine learning to tag with the name. But you certainly can erase incorrect names that have been automatically associated with an identified face, for improved learning. Similar to marking a device as “not on”.

We do believe that user input here is useful and we recently released our Home Details + Compare feature which does just this, but the benefit is in the long term and not immediate.

I hope that helps to explain why training Sense is just not a realistic option. Still, you can help furnish our data science team with data to refine the device detection process by renaming devices and utilizing features like Community Names. Be sure to turn on Network Identification as it can help find your networked devices, like Smart TVs. In addition, we’re pressing forward with integrations that can help your Sense monitor grab ground truth data directly from your devices, like our Philips Hue and smart plug integrations.

Thanks to our users for helping to refine the language in this explanation, especially @kevin1 and @markhovis73.


Great explanations, many thanks Ryan


Here’s one machine learning solution to the simplest case of the cocktail party problem, for those folks interested: just two people speaking over one another.