Additional input layer for Machine Learning

Great things are happening with my Sense! While it is still in the early stages of figuring out the devices in my home, great job so far!

There is a post, “Option to Verify Device Activity”, that may seem similar to this, but my suggestion differs in that it is a guided scenario still under the control of the machine-learning algorithms. I’ll explain:

I would like to see an option to turn on assisted learning and verification. If it is on, normal device detection would take place using the current algorithms, up to the point where a new device is found. At that point, it would be very useful to have the option to select a guided verification: Sense could ask me to turn the device on when prompted, and off when prompted, for N cycles to verify that it found the device.
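
To make that flow concrete, here is a rough, purely hypothetical sketch of what such a guided session could look like. The record_power_window() helper is a made-up placeholder for whatever Sense would actually capture; nothing here reflects Sense's real implementation.

```python
import time

N_CYCLES = 3  # number of on/off repetitions the user is asked to perform


def record_power_window(seconds=10):
    """Placeholder (hypothetical) for capturing a window of meter samples."""
    time.sleep(seconds)
    return {"start": time.time() - seconds, "end": time.time()}


def guided_verification(device_name):
    labeled_windows = []
    for cycle in range(1, N_CYCLES + 1):
        input(f"Cycle {cycle}/{N_CYCLES}: turn {device_name} ON, then press Enter")
        labeled_windows.append(("on", record_power_window()))
        input(f"Cycle {cycle}/{N_CYCLES}: turn {device_name} OFF, then press Enter")
        labeled_windows.append(("off", record_power_window()))
    # The labeled windows would then be checked against the detector's guess.
    return labeled_windows


if __name__ == "__main__":
    windows = guided_verification("newly detected device")
    print(f"Collected {len(windows)} labeled on/off windows")
```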

An additional option would be to flag certain occurrences as ‘false’ or ‘non-valid’, so that comparing the ‘valid’ and ‘non-valid’ detections could help refine the algorithms for everyone going forward.

Machine learning uses a log of input during algorithm generation, which is good, but in order to refine itself it also needs valid feedback during the results phase to determine whether its output is accurate or not.

2 Likes

David,
You’re suggesting the ability to “tag” power signatures to enable some supervised learning. You might want to read up on what that entails in other machine learning domains. There are two issues I can think of as a non-expert in that area.

  • Does your “tagged” signature snapshot capture enough of the signature “background” to really do training? When you do “tagged” image training for photo or video recognition, you ideally need hundreds of different photos/videos of the same kind of object from all kinds of angles and lighting/background conditions.
  • Photo/video tags have the notion of a bounding box. What is the equivalent “bounding box” for signatures? In the time domain that might be easy. Maybe you can also come up with a magnitude bounding box? But is there also a “frequency domain” bounding box?

I was just thinking more of a marker, or point in time. Once marked, it is just a reference for all the data that would normally be used anyhow. A simple example: say the machine is, for purposes of discussion, sampling millions of points per second looking for signatures, and say it already displays a visual representation of this on a time scale, not every point but the trend graph over time. Say it also marks specific transitions with a marker and the watts changed. Now this is not a single event, but one of very many that will be compared. However, if I noticed a pattern, say that each time the fridge door opens I see a 145-watt increase and each time I close it I see a roughly 140-watt drop, all of it marked by the system, I could tap the markers and provide a hint, e.g. “the fridge light seems to correspond with that.” The machine would then use that data not to simply decide that I was right, but as additional information when looking at all the signatures to finally identify the device. More types of information mean more reliable information in the end. We as humans do this, and voice recognition does this (e.g. Google Home device training to recognize individual members of the family/home for separate access via the same unit).
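
As a sketch of what I mean (all names and fields invented for illustration only), the hint would simply ride along with the marker data the system already records:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transition:
    timestamp: float                  # when the step change was marked
    delta_watts: float                # e.g. +145 W when the fridge door opens
    user_hint: Optional[str] = None   # e.g. "fridge light", added by tapping the marker


def add_hint(transitions: List[Transition], tapped_time: float, hint: str) -> None:
    """Attach a user hint to the marked transition closest to the tapped marker."""
    closest = min(transitions, key=lambda t: abs(t.timestamp - tapped_time))
    closest.user_hint = hint


# The hint is treated as one more piece of evidence, not as the final answer:
# the detector still compares all of the marked transitions, but hints that
# agree across many events raise confidence in a candidate label.
```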

David,
I’m not sure that human feature-weighting would help training and classification, based on what I know about RNNs and LSTM. It’s important to tag when things go on and off, though. What you are suggesting is very different than the supervised learning that Google Home uses to get started with multiple users. The Sense equivalent would be to turn off everything else in your house, then walk through a guided list of all the motors, fans, compressors, incandescent lights, leds, EV chargers, etc. turning on and off every appliance, at all possible power levels.

What I am suggesting is simply additional data points. There are already data points that help the machine understand that a device may be a refrigerator, etc. I am not sure whether that comes from the percentage of folks who eventually identify it as such, from initial research into the basic patterns, or from something else. However, I am confident that, beyond the signatures for heat and motors, the fact that it can now identify a dryer, a laser printer, and so on is in many cases simply based on data, patterns, and signatures of the same.

Kevin, do you work for Sense?

Are your comments specific to Sense, or are they general theory based on your own experience?

Thanks!

Hi David,
I don’t work for Sense and am really only speculating on what might work and what might not, based on thoughts about what technologies Sense is using under the hood. I’ve taken a few machine learning and data science courses, so it’s fun to make educated guesses about how Sense does their magic. I believe Sense is using recurrent neural networks (RNN) with long short-term memory (LSTM) for classification, perhaps with some DSP conditioning and normalization upfront in the Sense probe. If that’s the case, Sense is looking for and identifying patterns/signatures, back at the mothership, that are not comprehensible to humans in any meaningful way, and any inputs to the training, other than tagging a device on or off, perhaps with the identified mode of operation (microwave high power vs. popcorn), won’t help.
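
For anyone curious what that guess looks like in code, here is a toy RNN/LSTM classifier in PyTorch. The window length, feature count, and number of device classes are all invented; this is only my speculation, not Sense's actual model.

```python
import torch
import torch.nn as nn


class WaveformClassifier(nn.Module):
    def __init__(self, n_features=2, hidden=64, n_devices=10):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_devices)

    def forward(self, x):
        # x: (batch, time_steps, n_features), e.g. per-sample real/reactive power
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])      # logits over candidate device classes


model = WaveformClassifier()
windows = torch.randn(8, 500, 2)       # 8 example windows of 500 samples each (fake data)
logits = model(windows)
print(logits.shape)                    # torch.Size([8, 10])
```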

I think Sense initially bootstrapped the fundamental models (heating elements, motors, incandescent lights, microwaves, etc.) in a couple of houses that had automated “tagging” feedback from a number of representative appliances. Once they brought up basic generic models for those, they could begin remotely identifying devices in customers’ homes and having customers tag more specific data (dryer vs. garage door motor, and specific model). But fundamental models still need to be initially trained by Sense, hence each type of EV charger requires its own bring-up.

If you want to see where I’m coming from, try Googling a few simple articles on training neural networks.

1 Like

The additional input I am talking about would effectively be improving the patterns that the signatures are compared against.

I think we have exhausted this thread. I’m not sure we are fully in sync, but I appreciate your perspective.

Hi David,
If Sense is using machine learning, the only useful training feedback (Y) for classification is whether a device is on or off, not specific aspects of the waveform that a human thinks are intuitively important (and enable us to make intelligent calls on what is on or off). The whole of the waveform data (every sample) is already flowing into the network under training. It’s up to the mathematical forward propagation, back propagation and cost function minimization to determine which waveform features to key off of.
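
A minimal sketch of that point in PyTorch (fake data, invented shapes): the only supervision fed back is an on/off label, and forward propagation, backpropagation, and cost minimization decide which waveform features matter.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(500 * 2, 128), nn.ReLU(), nn.Linear(128, 1))
loss_fn = nn.BCEWithLogitsLoss()               # cost function over on/off labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

waveforms = torch.randn(32, 500, 2)            # 32 windows of raw samples (fake data)
on_off = torch.randint(0, 2, (32, 1)).float()  # the only supervision: device on (1) or off (0)

for _ in range(10):
    logits = model(waveforms)                  # forward propagation
    loss = loss_fn(logits, on_off)             # cost
    optimizer.zero_grad()
    loss.backward()                            # back propagation
    optimizer.step()                           # cost function minimization
```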

This article gives a great overview of how recurrent neural networks learn to find patterns on their own, though for language rather than waveforms.

So, wouldn’t it be helpful to explicitly turn on and off devices (particularly “discovered” but not identified) while telling Sense that we are doing so…and perhaps for large consumers not yet identified at all? I would think that turning on an oven, signaling to Sense that I did, letting it warm up fully, then turning it back off with notification…perhaps many times…would provide useful data.
After almost a year, Sense has identified only about 14 of our 57 devices, and does not even report those particularly reliably. It’s never found many of our large consumers of power, like stove, ovens, dryer, etc. So I have to conclude that the process needs (LOTS) more help.
If I could do this over again, I wouldn’t. I also tell anyone who is interested to run, not walk, away from this, at least until Sense gets beyond being an academic research project. Fortunately, I have an excellent home energy monitoring system (welserver.com) that I can use to track my usage and generation.

That makes sense, but the number of vectors (tagged on/off cycles) required for neural network training would need to be very large, across a broad range of background conditions, to be useful. Like tens of thousands of cycles. That’s why Sense needs to crowdsource across hundreds of users to get a meaningful dataset. A few dozen repetitions wouldn’t really be all that helpful.

As for me, I’ve had my Sense for about a year but did one reset mid-year. It has identified 26 devices, plus Always On, but more importantly I’m seeing roughly 70% of my usage identified (8% typically always on, 40% going to the EVs). Plus a much better view of my solar generation overlaid with usage.

1 Like

Oh well, just an idea.

In my case I already had excellent (and very accurate) monitoring of major device use and my solar production. Interestingly, my WELserver is always within 0.7% of my revenue grade meters (both solar and net consumption) going to the utility folks, while Sense differs by 3-5% monthly…and not even consistently.

Since my Sense has identified hardly any of my devices and tracks those poorly, I’m wondering if I should do a reset for its first anniversary, or just stop worrying about it and give up on a bad experiment. Sense also hasn’t identified ANY of my dozens of wireless devices yet, which should be pretty easy. Thoughts?

1 Like

Andy,
The WELserver stuff looks really cool and very flexible, plus it pulls together electrical and heating energy measurements quite nicely. I do wonder if you need a separate pulse wattmeter for every major source (solar) and sink (devices) you want to monitor? Or can you pull readings from existing meters via Zigbee/Z-Wave as well? I can see how you could get greater accuracy with WELserver, especially since the Continental wattmeters they recommend have a revenue grade option.

My 2c is that you already have a great solution in place, so you should set your expectations low, look at Sense as a fun science experiment, reset it, and watch it progress (and regress) without any great need to force it to learn. You may want to read the history of ImageNet to facilitate a little more patience - machine learning for image recognition has come a long way in the past 10 years, but it has all been about building up a huge, well-tagged image dataset.

As for wireless devices, detecting most of them via their power usage signatures is probably much more difficult than identifying a motor, heating element, microwave or lightbulb pattern, since most digital systems have much more variable patterns. At the same time, Sense has the benefit of seeing subnet packets broadcast by networked devices, so it might be able to auto-tag power events associated with certain broadcast sequences (a TV turning on or waking up from sleep mode), or Sense might even substitute neural network classification of the event for direct interpretation of the event based on those broadcast sequences. But there are thousands of different networked (wired and wireless) devices out there, with hundreds of different variations in how they handle DHCP and ARP broadcasts upon power-up/wakeup and shutdown, so it’s going to take a while for Sense to fill in the blanks. The good news: they’re going to work on recognizing devices that they have data for, so keep your Sense on the network.
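
A crude sketch of that auto-tagging idea (all names, values, and the time window are made up): pair each broadcast a device sends on wake-up with any power transition that lands close to it in time.

```python
from typing import List, Tuple

# (timestamp_seconds, source_mac) for DHCP/ARP broadcasts seen on the subnet
broadcasts: List[Tuple[float, str]] = [(100.2, "aa:bb:cc:dd:ee:ff"), (340.9, "11:22:33:44:55:66")]
# (timestamp_seconds, delta_watts) for detected power step changes
transitions: List[Tuple[float, float]] = [(101.0, 85.0), (250.0, -60.0), (341.5, 120.0)]

MAX_GAP_SECONDS = 3.0  # how close in time a broadcast and a power event must be


def auto_tag(broadcasts, transitions, max_gap=MAX_GAP_SECONDS):
    """Return (mac, delta_watts) pairs where a broadcast and a power event roughly coincide."""
    tags = []
    for b_time, mac in broadcasts:
        for t_time, delta in transitions:
            if abs(b_time - t_time) <= max_gap:
                tags.append((mac, delta))
    return tags


print(auto_tag(broadcasts, transitions))
# [('aa:bb:cc:dd:ee:ff', 85.0), ('11:22:33:44:55:66', 120.0)]
```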

1 Like

Re your “fun science experiment”, I’d add “expensive” and “misleading”, but I get your point. I’m giving up on expecting it to give me the data I bought it for, and on regularly monitoring it. Reset coming up.

I only have WattNode on my biggest consumers…way too expensive to put them everywhere, hence my purchase of Sense. The next group of consumers has individual on/off monitoring and a calculation based on Kill-a-watt monitoring for a short period of time. Of course, that misses the on-going monitoring Sense promises, but doesn’t actually deliver. The last group, 34 of my 57 loads, I don’t try to individually monitor…just the total.

Thanks for setting my expectations straight.

Andy,
You have gotten me interested in the WattNode pulse meters. Do you use one to give you a read on your solar output? The reason I ask is that my solar inverter readout is consistently high by about 4% compared to my Sense, which shows results that are within 1% of my revenue net meter. I would like to get independent confirmation of my solar output, especially since I’m paying based on the readings from the inverter. Two questions arise:

  • Do you have an L1 and L2 backfeed for your solar? I do. If you have both, which of the two options did you use to connect up the WNB-3Y-208-P?
  • Are there any cheaper and simpler (less flexible) devices than the WELserver that I could use to collect pulse output data? If I total up the cost of the WEL ($400+), plus the meter ($250), plus the cost of the time it would take me to interface them, we’re talking $1,000.

It looks like a fun project, and if I can find a more turnkey and lower-cost data acquisition unit, I’d do it…

With almost zero solar in recent weeks (cloudy, then lots of snow), I kind of wish I were paying for production, but we purchased the array and micro-inverters. I’d have thought that solar fees would be based on the revenue grade metering, like our SRECs are. We read it monthly and e-mail in a meter photo.

Yikes, the system costs seem to have gone up quite a bit since we built our geothermal/solar home and I first installed the WEL. I’ve also been adding both systems and monitoring incrementally over time, so the bite has been spread out over almost 10 years, and we didn’t install solar till year 6. See Web Energy Logger for our current monitoring system.

I also see a higher production reading from the microinverters (Enphase) than from the revenue grade meter and from my Wattnode, which match closely. Sense falls below both by a few %.

I’m told by my installer that the microinverter sensing is not as accurate as the revenue grade meter, and that it’s always higher due to transmission losses from the microinverters to the power panel. That apparently can be a few %, which is surprisingly high.

I’m not sure what you are asking in Q1 below. Our configuration is “single phase, three wire” for both the power company and the solar. The solar system is 33 panels with a microinverter on each, so it looks exactly like the utility service at the panel…phase A, phase B, and neutral…

Can’t help with other devices…I’ve been totally happy with the WEL and not much interested in others. That said, although there were hardly any residential choices when I got into this (lots of industrial units for thou$and$), there seem to be lots on the market now. I was really happy to see the WEL for what I remember as $299 for the home starter kit. I don’t remember what the WattNodes cost (bought 6 years apart), but those seem higher now too.

Regards,

Andy

Pretty cool and extensive heating and electrical systems! Much simpler here in NorCal. Only 21 panels and a single SolarEdge inverter, though all the panels have power optimizers. I don’t think suppliers started putting revenue grade metering into inverters until a couple of years ago (my system is 5 years old). Previously they would tend to measure a little bit upstream from the output to the house, so they would estimate on the high side, which plays nicely for the installer community. I pay about 10c per kWh for the 20-year life of the panels.

As for my question, it sounds like we have the same setup, with both phases and a neutral feeding into the main panel. The install manual for the PV option suggests two different approaches for measuring both phases using a single CT - either run both solar phases through the same CT, or use a CT for each and parallel them into the WattNode. Just wondering which you did? It also sounds like the WattNode loses accuracy if the two legs coming off your inverter are unbalanced, unless you want to use a WattNode for each leg. I’m going to have to think this one through a little bit.

The WattNode with Option PV only has a single phase available to measure the inverter. If you have an inverter that generates a single 120 VAC output (or 230 VAC for Europe), you can measure the energy with full rated accuracy. If you have one or more inverters that connect to both L1 and L2, then you can still get good accuracy, but not quite as good. The error will generally be one-half of the percentage difference in AC voltage between L1 and L2. The L1 and L2 voltages should be well balanced, but for example, if your L1 voltage is 119.6 and your L2 voltage is 121.4, then the difference is 1.8 VAC or 1.5%, so your error would be approximately 0.75%. If you need better accuracy, you can use two separate WattNodes instead.
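
For anyone who wants to sanity-check that example, here is the same arithmetic in a few lines of Python:

```python
l1, l2 = 119.6, 121.4                                  # example leg voltages from the note above
imbalance_pct = abs(l1 - l2) / ((l1 + l2) / 2) * 100   # ~1.5% difference between L1 and L2
error_pct = imbalance_pct / 2                          # expected metering error, ~0.75%
print(f"imbalance {imbalance_pct:.2f}%, approximate error {error_pct:.2f}%")
```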

Ah, now I understand. We’re running a full WattNode for the solar array, not the Option PV version. So I have two current transformers for each WattNode and two for the Sense, and I’m measuring both voltage and current for each line. At the moment I’m exploring additional WattNodes for the biggest consumers, perhaps lumping those through a single sub-panel to save $. That’s the analysis that I’d hoped Sense would help me out with…alas, it doesn’t.

WOW. Up here in the Northeast, power costs us 22¢/kWh. That makes a big difference in the breakeven. Also, when we installed, NH was offering some nice rebates, gone now. The economics simply didn’t support the solar-as-a-service approach. Our solar generates about 2/3 of our annual use, and payback calculations give us a breakeven in just under 7 years. We do have another roof, less optimal than the current array, so we may expand later on once the primary array is paid for. Of course, government meddling is making future solar projects much harder to justify…sigh.

Regards,

Andy

2 Likes

The solar PPA (power purchase agreement) worked out for us, but because SolarCity needs to guarantee the power and pay for the HW, they only put panels on the choicest, most irradiated parts of our roof (only about 1/3). We maybe only get 20% of our household needs from the panels (EVs are about 35% of the appetite), but because we’re on an EV TOU schedule and tiering we alleviate 40% of the cost.

Good luck installing more - Kevin.

I work in ML in medicine. Totally different domain and different statistics, of course. But for us, having expert curators tag the data with ground truth (this patient was discharged, this patient died; here are the borders of the tumor; this heart rate increase was due to exercise, seeing your granddaughter, sepsis, etc.) is helpful as we let ML do its thing. It gives us more reliable points for calculating the cost function of the current ML estimate.

I don’t know if Sense is using similar techniques, but I bet you could find a cohort of volunteer curators to add to your team of full-time data scientists.

(By the way, I’m rooting for you guys. Product Development is tough, and doing it out in public with the constant barrage of critique cannot make it any easier. Keep it up!)

Paul

6 Likes

Indeed!

Our data science team is doing a lot of this type of work on its own, and has some tricks up its sleeve to get some ground truth data from eager volunteers very soon. Great suggestion!!

3 Likes