Internet data usage - 1.79 GB uploaded in under 3 weeks

Hi folks,
Finally got some free time and switched to UniFi routers. Deep Packet Inspection shows:

125 MB down
1.79 GB up
to Amazon (assuming your server infrastructure is on AWS).
This is from the last 2.5 weeks. Is this expected behaviour?

The last 18 days of my Sense traffic, from UniFi.


I believe in a thread a while back, the "couple hundred MB a day" figure was quantified a bit more precisely as around 300 MB/day.

My usage over a 10-day period. As you have more devices detected, more of the signal processing happens locally (as I understand it), so less data upload is required.

I believe the above period includes at least one device firmware update, too.

@jamieeburgess, in line with what I have seen previously:

Wifi data useage - #4 by dave

Far less than if Sense were uploading every bit of data that it conceivably could sample.

4M samples/sec x 2 bytes per sample (really 12 or 14 bits) = 8 MB/sec
8 MB/sec x 3600 sec/hour x 24 hours/day x 30 days/month = 20,736 GB/month.
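
If you want to play with the numbers, here's the same back-of-envelope math in Python. The observed upload figure (roughly 100 MB/day, extrapolated from the reports above) is just an illustrative placeholder:

```python
# Back-of-envelope: raw sample rate vs. what Sense actually uploads.
SAMPLE_RATE = 4_000_000         # samples/sec
BYTES_PER_SAMPLE = 2            # 12-14 bit samples stored in 2 bytes

raw_bytes_per_sec = SAMPLE_RATE * BYTES_PER_SAMPLE            # 8 MB/sec
raw_gb_per_month = raw_bytes_per_sec * 3600 * 24 * 30 / 1e9   # ~20,736 GB/month

# Observed upload from the reports above: ~1.79 GB in ~18 days,
# i.e. on the order of 100 MB/day (illustrative; varies by home).
observed_gb_per_month = 0.1 * 30

print(f"Raw data rate:    {raw_gb_per_month:,.0f} GB/month")
print(f"Observed upload:  {observed_gb_per_month:,.0f} GB/month")
print(f"Reduction:        ~{raw_gb_per_month / observed_gb_per_month:,.0f}x")
```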


Something to note:

I’ve been monitoring upload/download since getting a couple of Senses and, as you might expect, a Sense with multiple smart plugs and many detected devices will upload significantly more data than a Sense with minimal detected devices (in my case the second Sense is not actually doing any detection since its CTs are clamped to individual device circuits).


Here is what mine does. Interesting to see that 1 GB of download; I didn't think SW updates would be that large on this device, if that is what it reflects. I should install my UniFi USG to get better detail.

Since 1/15

Side note: interesting to see that so many people using Sense use UniFi as well…

Thanks everyone - seems normal.
Does feel a little on the high side though.
Once the model is trained for known devices, I'm assuming the upload volume drops.

Ha…It never actually gets trained!

Is there no edge compute going on?
I would have assumed that once a signature has been processed by the cloud AI models reading the (seemingly vast) amount of data, Sense could push that signature back down to the device so it can recognize that device locally going forward and simply report on/off, without needing to send its data to the cloud.

@jamieeburgess,
Per my quick calculation above, it's clear that there is a LOT of edge computing and processing going on in the monitor, or we would be seeing 3-4 orders of magnitude (1,000x-10,000x) more raw monitor data flowing upstream.

I think you are also making a few other assumptions that may not be completely true.

  • Sense still has to stream some form of "full data" back to the mothership for two other reasons: 1) feeding the power meters, which reside on the mothership, and 2) detecting devices with slow transitions, outside the range of the edge processing done by the monitor, like EVs. You can probably calculate the minimum data required by thinking about 2 B of data every second or so for every power meter that gets populated (main power meter plus all detected devices); see the rough sketch after this list. The Sense monitor needs to send more than just on/off data back to the mothership for the power meters.
  • Sense also has to continue to pull raw-ish data for training/identification, even after you have had many devices detected and the monitor has many "models" already residing in it, because it still needs to sort out unidentified transitions.
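
To put a rough floor under the first point, here's a sketch of the minimum power-meter stream, assuming roughly 2 bytes per second per populated meter. The real encoding, compression, and protocol overhead are unknown, so treat it purely as an order-of-magnitude guess:

```python
# Rough floor for power-meter streaming, assuming ~2 bytes/sec per meter.
# Real payloads (protocol overhead, compression, extra fields) are unknown,
# so this is only an order-of-magnitude sketch.
BYTES_PER_METER_PER_SEC = 2
SECONDS_PER_DAY = 86_400

def power_meter_mb_per_day(detected_devices: int) -> float:
    meters = 1 + detected_devices   # main power meter plus one per detected device
    return meters * BYTES_PER_METER_PER_SEC * SECONDS_PER_DAY / 1e6

for n in (0, 10, 30):
    print(f"{n:2d} detected devices: ~{power_meter_mb_per_day(n):.1f} MB/day minimum")
```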

Begs the question: If bulk (edge) memory were cheap (and available), could the modeling optimization speed be orders of magnitude better?

Thought: If truly raw data were available you could go back and re-process it much more effectively.

Of course Sense must have thought about this in depth when designing the system and spec'ing the edge memory. I'm wondering, though, whether a certain number of test-bed systems (with TBs of memory) wouldn't accelerate things. With 8 TB of edge memory you could store almost 12 days of raw data!
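
For what it's worth, the almost-12-days figure checks out against the 8 MB/sec raw rate from the earlier calculation:

```python
# How long 8 TB of local storage would last at the raw 8 MB/sec sample rate.
raw_mb_per_sec = 8
storage_tb = 8
seconds = storage_tb * 1e6 / raw_mb_per_sec      # TB -> MB, then divide by rate
print(f"~{seconds / 86_400:.1f} days of raw data")   # ~11.6 days
```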

The next question is likely: Could you then ingest that data and process it at the edge without also beefing up the Sense itself … probably only in the lab.

Then again, Folding@home reached 146K TeraFLOPS and SETI@home was on over 5 million PCs. These days: Bitcoin. Wouldn’t it be fun to watch your PC power consumption in the Sense PM this time around!

Maybe that wasn’t such a crazy idea, @markandcarrie2004!

Bitwatt? Wattcoin?

Good question. In my mind, it’s all about the “signals” Sense is looking for and what it takes to “tune in” to those signals, while removing noise. I think Sense is really looking for three types of “signals” today:

  • Physics-based transitions/signatures in a 1/2 to 1 second window. Examples include resistance heaters, traditional (not LED) lightbulbs, traditional microwaves (using transformers, not DC), and AC motors (not DC). For those, much of the processing is real-time filtering and "feature extraction": identifying interesting transitions in high-speed data and reducing them to a set of parametrics (at least 17) that machine learning can use to differentiate between devices (a toy sketch of this kind of windowed feature extraction follows this list). My take is that Sense is doing the right things with the DSP real-time processing, but the big challenge is the models that "listen" to the incoming transitions. There are probably a lot of hard things to make it all work - abstracting the models enough for general usage (not just home-specific), and model management (making sure the best models are being used and interacting nicely with one another). But I don't think that poring over the detailed data is the breakthrough to better recognition.
  • Electronically managed power flow cycles over a minutes to hours window. Examples include EV charging or mini-splits in action. I don’t think any of this would be helped by close examination of the most detailed (microsecond) data.
  • Patterns of devices working together. Examples include a washing machine, furnace, etc., where multiple physics-based devices interact in predictable ways.
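
Purely as an illustration of the first bullet (and definitely not Sense's actual DSP pipeline), here's a toy sketch of windowed step detection and feature extraction; the threshold, window length, and feature names are all made up for the example:

```python
# Toy illustration of windowed "feature extraction" on a power signal:
# scan a short window, flag a step transition, and reduce it to a handful
# of parametric features a classifier could consume.
from statistics import mean

def extract_step_features(window, threshold_watts=50.0):
    """Return simple features for a step change centered in `window`,
    or None if the step is smaller than `threshold_watts`."""
    half = len(window) // 2
    before, after = window[:half], window[half:]
    delta = mean(after) - mean(before)
    if abs(delta) < threshold_watts:
        return None
    # Rise time: samples needed to cross 90% of the step after it starts.
    target = mean(before) + 0.9 * delta
    rise_samples = next(
        (i for i, w in enumerate(after) if (w >= target) == (delta > 0)),
        len(after),
    )
    return {
        "delta_watts": round(delta, 1),
        "rise_samples": rise_samples,
        "pre_level_watts": round(mean(before), 1),
    }

# Synthetic 1-second window: a ~1200 W resistive load switching on mid-window.
window = [60.0] * 30 + [1260.0] * 30
print(extract_step_features(window))
# -> {'delta_watts': 1200.0, 'rise_samples': 0, 'pre_level_watts': 60.0}
```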

I agree from a practical standpoint (meaning I should stop here!) but I do ponder the theoretical …

The cloud upload is restricted by processing, bandwidth and, top of the list perhaps, cost. With higher bandwidth, more processing, and lower cost, would Sense have chosen the upload sample rates they chose? Maybe 2x, maybe 10x, 100x?

High-frequency data is surely required at some point, as we know, because Sense is ultimately refining models based on the accuracy of detection (eventually via user feedback), and a good proportion of that detection seems to hinge on the on/off "signals" that are often begging to be looked at… in potentially raw detail (at least to make a match).

So the question becomes: if you don't have a high-sample-rate archive to parse and you know you're getting the detection wrong (missing the on/offs), you can only look to future successes rather than also analyzing past failures. Strength Through Failure!

There are definitely tradeoffs in the current system-level architecture and division of labor, driven by the relative resource (CPU, GPU, memory, bandwidth) costs and availability, but I don't see a solution, even with vastly improved cost structures, that would entail bypassing a DSP front end that filters and extracts features… Why? Because RNNs (recurrent neural networks), which are used to make predictions from time-series data, are resource limited, even with the amazing hardware becoming available. The number of parameters and computations needed for a fully connected RNN blows up quickly with every new time stage - if you wanted to look at a 1/2 second window of incoming 4M sample/sec data, you would have an RNN the size of the Sun (OK, just exaggerating). The good news is that a DSP is kind of a degenerate RNN, optimized for finding signals in real time in lots of incoming data.
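
To make that blow-up a little more concrete, here's a crude comparison of the per-window compute for a small vanilla RNN fed raw samples versus one fed a short DSP-extracted feature vector. The hidden size, feature count, and the MACs-per-step estimate are all illustrative assumptions:

```python
# Crude compute comparison: a vanilla RNN over raw samples vs. over a small
# set of DSP-extracted features. All numbers are illustrative.
def rnn_flops_per_window(timesteps: int, input_size: int, hidden_size: int = 128) -> int:
    # Per step, a vanilla RNN does roughly (input + hidden) * hidden MACs,
    # counted here as 2 FLOPs each.
    return timesteps * (input_size + hidden_size) * hidden_size * 2

raw = rnn_flops_per_window(timesteps=2_000_000, input_size=1)   # 0.5 s @ 4M samples/sec
feat = rnn_flops_per_window(timesteps=1, input_size=20)         # one feature vector

print(f"Raw-sample RNN:  ~{raw:.2e} FLOPs per 0.5 s window")
print(f"Feature-fed RNN: ~{feat:.2e} FLOPs per window")
print(f"Ratio:           ~{raw / feat:,.0f}x")
```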

There are likely a few folks at Sense who look at the raw data and work on refining the DSP front end to improve feature selection and squeeze out more, better features. I also think, based on my back-of-the-envelope calculation, that Sense does occasionally upload chunks of near-raw data for particularly interesting transitions/events. My take is that the better strategy for "strength through failure" is to continue bringing in good-quality ground truth. That's where regular fails and wins can have a strong influence on the learned identifications. But poor-quality ground truth, or misguided ground truth, can also cause identifications to bounce around.


Precisely.

… and so with sufficient grunt, in theory the DSP “RNN” could be part of the “Sense RNN”.

But yes, I agree, effort at a larger scale (longer samples, post-DSP) is no doubt a more solid and practical strategy.