Smart plug power data glitches

I own both the Wemo and Kasa smart plugs. Both will “glitch” infrequently, where they erroneously report (or Sense erroneously records) 0W or >10000W.

Kasa HS110 glitching over 10kW for a few seconds:

Kasa HS110 glitching to 0W, sometimes for minutes at a time; sometimes for days at a time:

Wemo glitching to 0W for a few seconds to a few minutes:

The main Sense CTs don’t show power changes that line up in time or magnitude with what the smart plugs report [as a Sense device] during these glitches… This is another indicator that the reported smart plug power is wrong when these glitches occur.

Have others noticed this behavior?

I’d be curious to learn the details of how the smart plugs communicate their data to Sense, and what Sense does to validate that it gets the correct data. I’m guessing the glitches to 0W are “missing data”, which I hope a Sense firmware update could eliminate, but that the >10kW glitches are caused by glitches within the Kasa HS110. By the way, does anyone know how to view the ~1Hz power samples from the smart plugs from their own GUIs?

Can only speak to TP-Link/Kasa, not Wemo, but I suspect they are similar in operation.

  • Sense sends out a UDP broadcast every 2 seconds asking all the smartplugs their status.
  • All receiving smartplugs reply with a status packet that includes power.
  • If there are network issues (latency, collisions) or if the Sense monitor is overloaded, data gets lost, and shows up as zero.
  • If you have a switch/access point between the Sense monitor and the smartplug in question, that tries to minimize the effects of broadcast packets, that might also stop information from flowing to the monitor. I have flushed a Netgear managed switch and a mesh system because it was too painful to stop them from “eating” multicast packets.
  • I don’t think there is any real-time checking of incoming smartplug data beyond making sure it came from a legitimate status response packet.
  • The Kasa app doesn’t do any real-time charting so it isn’t helpful in spotting the missed data.
  • You can write your own code to do real-time sampling.
  • @duanetiemann has some pointers to both documentation and code here:
    Anyone have any info on the TP-Link Kasa HS110 Energy Monitoring Plug API - #5 by duanetiemann

Sounds like this is a “bad” protocol used by Sense, one that can (and does) frequently result in dropped data, and that is highly dependent on customers’ LANs. A better protocol would have guaranteed delivery, at least within something like a 1 day buffer (or however long the smart plugs keep their data in their local memory).

One can only access the plugs using the API provided.

Looks like TP-Link smart plugs use a TCP protocol, and so I would think data delivery would be guaranteed. (Looks like they also support a UDP protocol, but I don’t think that one is used for the real time power data delivery.)

Perhaps even with guaranteed delivery, the API is still just “give me the latest power data”, rather than “give me all the power data between times X and Y, or since time Z”, and so if Sense is unable to send a request fast enough, the API might not provide a mechanism to pull slightly older data (that it missed out on) out of the smart plug to fill in gaps. Lame if that is the case. (If that were the case, however, I would think Sense wouldn’t treat the missing data as “0W”, but rather as whatever the last valid data sample returned was.) I dunno…I encourage Sense developers to see if a more robust method for getting the smart plug data is possible.

BTW, having the smart plug data have 0W glitches I think screws up the “Always On” data.

Couple things: first, if you haven’t already, definitely report this to the Support team. I just went through all my plug data and am not seeing any dropouts (that I can’t make sense of by comparing to my Unifi data). With that last point in mind, are you totally certain these aren’t broader network issues?

Let me bring this up with the team as well and get back to you.

I think Sense and Kasa talk to TP-Link smartplugs via TCP port 9999, but what I remember from using Wireshark is that the communication is all UDP. Sense sends a UDP broadcast with the following “emeter” request to port 9999:

{"system":{"get_sysinfo":null},"emeter":{"get_realtime":null}}

All the TP-Link devices that have power monitoring respond back to the Sense via a UDP packet containing the status, including the power reading (or not, subject to any temporary networking or device issues). AFAIK, neither TP-Link nor Wemo Insight has any local timestamped log of monitored data, but the TP-Link must have some kind of daily and monthly accumulators to support the API commands below:

Get Daily Statistic for given Month
{"emeter":{"get_daystat":{"month":1,"year":2016}}}

Get Monthly Statistic for given Year
{"emeter":{"get_monthstat":{"year":2016}}}
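
For anyone who wants to try requests like these from their own code, here is a minimal Python sketch of a UDP broadcast query. It relies on something not stated in this thread: public reverse-engineering write-ups describe the port 9999 payload as JSON obfuscated with an XOR "autokey" cipher starting from key 171. Treat that, the 255.255.255.255 broadcast address, and the function names (all hypothetical) as assumptions that may not hold for every firmware version.

```python
import json
import socket

INITIAL_KEY = 171  # assumed starting key of the XOR autokey cipher


def encrypt(plaintext: str) -> bytes:
    """XOR each byte with the previous ciphertext byte (autokey)."""
    key = INITIAL_KEY
    out = bytearray()
    for byte in plaintext.encode("utf-8"):
        key ^= byte
        out.append(key)
    return bytes(out)


def decrypt(ciphertext: bytes) -> str:
    """Invert encrypt(): XOR each byte with the previous ciphertext byte."""
    key = INITIAL_KEY
    out = bytearray()
    for byte in ciphertext:
        out.append(key ^ byte)
        key = byte
    return out.decode("utf-8")


def broadcast_query(timeout_s: float = 2.0) -> dict:
    """Broadcast one status query and collect replies keyed by plug IP."""
    request = {"system": {"get_sysinfo": None}, "emeter": {"get_realtime": None}}
    payload = encrypt(json.dumps(request, separators=(",", ":")))
    replies = {}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.settimeout(timeout_s)
        sock.sendto(payload, ("255.255.255.255", 9999))
        try:
            while True:
                data, (ip, _port) = sock.recvfrom(4096)
                replies[ip] = json.loads(decrypt(data))
        except socket.timeout:
            pass  # stop once no reply arrives within timeout_s
    return replies
```

A plug that never answers simply has no entry in the returned dict, which is exactly the "missing data" case: the caller has to decide whether to record that as 0, NA, or the last known value.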

Bottom line - As @RyanAtSense suggests, it’s probably best to contact support, since any one of a number of things could be stoppering up the communication. It would be great if the Sense/smartplug communication was end-to-end guaranteed, but the TP-Link protocol really doesn’t support it. I do wish two things:

  1. Sense would mark the data differently so we could tell the difference between a true 0 and NA (not available).
  2. Sense would flag excess NA generation, either from smartplugs or from the CT main/solar measurements, so they could trap any issues with the Sense monitor and I could check my network more quickly for issues.

I kind of alluded to that in this wishlist item here. Like this request to help influence this as a new feature.

I haven’t vetted the Sense smart plug data. But the raw data from some of my HS110s and the HS300 is glitchy.

I intend to provide a sanitizing feature. Maybe Sense already does that.

I see individual samples well over 10,000 watts. And one sustained (15 min) burst of 283,233 watts.

@duanetiemann, with your experience with the TP-Link smart plug API, is the emeter data retrieved via TCP or UDP, or are both options? Do you see occasional periods of time, seconds to minutes in duration, where you get no samples back from the smart plug, using your utility?

Does the API allow historical emeter samples to be requested? Or, if a monitoring client (like Sense or your own tool) misses an emeter sample point, is the missing data gone forever?

I have seen my Sense record >10kW power samples from my HS110 (but not for 15 minutes…I’ve only seen a few such >10kW glitches, and each only for a few seconds), so I don’t think Sense implements a “sanitizing” feature. What were you thinking…if emeter sample > 2kW (or some other threshold), then treat it as an error???..clamp it to 2kW???..return the last known “good” sample instead???..return 0W???

I’m curious if you have had any technical collaboration with TP-Link or Sense.

I use TCP. I don’t know of a UDP interface. I have seen periods of no data, but it may have just been problems in my network delaying requests.

There is a daily summary history available from the plug that can be queried with the get_daystat request, providing the month and year of interest. I don’t think individual past samples are available. BTW, you can also access the smart plugs remotely via their website. I don’t know if the plug gets the daily numbers from the website or stores them locally.
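
As a rough illustration of a TCP get_daystat query, here is a Python sketch. It assumes details that are not confirmed in this thread: that the TCP variant on port 9999 sends the same XOR-obfuscated JSON (autokey cipher, starting key 171) preceded by a 4-byte big-endian length, as public reverse-engineering notes describe. The helper names are hypothetical.

```python
import json
import socket
import struct


def _xor_autokey(data: bytes, decrypting: bool) -> bytes:
    """Assumed obfuscation: XOR each byte with the previous ciphertext byte."""
    key = 171
    out = bytearray()
    for byte in data:
        value = key ^ byte
        key = byte if decrypting else value  # key is always the ciphertext byte
        out.append(value)
    return bytes(out)


def frame_request(command: dict) -> bytes:
    """Serialize, obfuscate, and prefix with a 4-byte big-endian length."""
    body = json.dumps(command, separators=(",", ":")).encode("utf-8")
    body = _xor_autokey(body, decrypting=False)
    return struct.pack(">I", len(body)) + body


def parse_reply(frame: bytes) -> dict:
    """Strip the length prefix and decode the JSON body."""
    (length,) = struct.unpack(">I", frame[:4])
    return json.loads(_xor_autokey(frame[4:4 + length], decrypting=True))


def _recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly count bytes, since TCP may deliver a reply in pieces."""
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("plug closed the connection early")
        buf.extend(chunk)
    return bytes(buf)


def query_plug(host: str, command: dict, timeout_s: float = 2.0) -> dict:
    """Send one command to one plug over TCP and return the decoded reply."""
    with socket.create_connection((host, 9999), timeout=timeout_s) as sock:
        sock.sendall(frame_request(command))
        header = _recv_exact(sock, 4)
        (length,) = struct.unpack(">I", header)
        return parse_reply(header + _recv_exact(sock, length))
```

With a plug at a hypothetical address, the daily history would be requested as query_plug("192.168.1.50", {"emeter": {"get_daystat": {"month": 1, "year": 2016}}}).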

I’m reluctant to sanitize data. I’ve been critical of Sense zapping Other to zero when individual measures add up to more than the total observed. But these seem to be clear measurement errors that are not particularly useful. I expect I’ll provide a sanitization option so users can see where raw measurements went south if they need to, but generally expect that they would use sanitized versions. The threshold should be a setting, IMO. And glitches should be reported as missing data when sanitized.
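
A minimal Python sketch of that kind of sanitizing option, with the threshold as a setting and glitches reported as missing data (None) rather than clamped or zeroed. The function name and the 2000 W default are assumptions for illustration only, not anything Sense or my utility actually implements.

```python
from typing import List, Optional, Sequence


def sanitize(samples_w: Sequence[float],
             max_watts: float = 2000.0) -> List[Optional[float]]:
    """Replace implausible readings with None (missing data).

    A genuine 0 W reading is kept; only negative values and values above
    the user-set threshold are treated as measurement errors.
    """
    return [w if 0.0 <= w <= max_watts else None for w in samples_w]
```

The raw list is left untouched, so a user can still compare raw and sanitized versions to see where the measurements went south.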

I would expect that I reported the errors to TP-Link, but I don’t see any record of that. Maybe I didn’t get to it.

I don’t have any formal relationship with Sense or TP-Link, other than being a beta tester for Sense.

TPLink and Wemo work differently on the network.

Wemo uses a separate HTTP transaction to fetch the current wattage. This is slow and generally an extremely inefficient way to get the ~16 bytes of information which we actually need. Having to do this every few seconds contributes to the overall latency of your network and can definitely saturate Sense’s wifi connection with enough plugs.

TPLink has three ways to fetch the current wattage: TCP, UDP unicast and UDP broadcast. UDP broadcast is the best choice for Sense because:

  1. We only care about the wattage when the query is sent, and we don’t backfill from the smart plugs, so a slower protocol like TCP would not help us. Yes, TCP is more reliable, but only because it tries retransmitting unacknowledged packets. By the time those retransmissions have been done, Sense doesn’t care about the data any more.
  2. Broadcast UDP (which is distinct from multicast, which TPLink does not use) is great, because it means that Sense can send a single broadcast wattage query packet to the whole network and receive a unicast response from each plug. This means that the whole network’s load is basically as minimal as possible.

As for the weird data above, Sense believes the plugs. The plug from the first example was claiming 11188.48 watts, 15 times in a row, and it continued with its weird readings afterwards, as you see in the plot.

For the instances of dropouts, it’s either that the plug was reporting zero watts, or that the plug was not responding to queries for that time period. We see both happen, but not too frequently.

Currently, Sense does not do any scrubbing of high-but-possible numbers from the data (10kW being in the sane range for a plug to report).

Thanks for the explanation, Jonah! This kind of tech info is good for helping the user audience understand why adding and supporting different smartplugs isn’t a trivial exercise, plus the tradeoffs involved in semi-realtime data acquisition.

I wish that dropped/missed data packets would not be treated as 0W by Sense, but rather as “missing data”. Some of my smart plugs are on devices that are always on (e.g. modem/router/switch/NAS), but due to missing packets, Sense currently erroneously thinks these devices briefly consume 0W and thus doesn’t consider these devices as “always on”.

I assume perhaps naively that the complication there is that the database is “numbers” so N/A does not compute. How to encode and store it?

@JonahAtSense: “-1”?

Most data science infrastructure (R or python packages) has native encoding for ‘NA’ for all native datatypes plus ‘NaN’ (not a number), ‘Inf’ and ‘-Inf’ for numeric types. And pretty much all core R functions and packages know how to deal with these. How these are encoded underneath is hidden from the user. But since Sense communicates data to users via CSV, external coding is via ASCII/UTF. CSV readers have appropriate conversion options to convert text NA to a data science NA, etc.

I’m assuming that all our monitor data flows into a data science savvy database system. How Jonah would convey NA from the Sense monitor to the mothership is probably more a function of the data protocol of the send packets.
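
To illustrate the CSV side of this in Python, here is a small sketch that reads a power column and turns the text NA (or an empty cell) into a floating-point NaN. The column names and file layout are made up for the example and are not Sense’s actual export format.

```python
import csv
import io
import math
from typing import List


def read_watts(csv_text: str, column: str = "watts") -> List[float]:
    """Parse one numeric column, mapping 'NA' or empty cells to NaN."""
    values = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cell = row[column].strip()
        values.append(math.nan if cell in ("", "NA") else float(cell))
    return values


example = "time,watts\n0,12.5\n2,NA\n4,0\n"
watts = read_watts(example)
# The missing sample stays distinguishable from a true 0 W reading.
missing_count = sum(1 for w in watts if math.isnan(w))
```

The point is just that a true 0 and a missing sample remain different values after import, so downstream code can choose how to treat each.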

Understood … though what I was getting at was also “how does Sense encode it at their end”. The user data (CSV) seems manageable … I’m wondering, I suppose, whether what could/should be “NA” encoded in the sampling is actually “0” at the sample level?

I wish that dropped/missed data packets would not be treated as 0W by Sense, but rather as “missing data”.

We actually do have some hysteresis to prevent occasional dropped packets from producing sudden drops in the wattage graphs. A dropout has to occur for something like 30s before it’ll count as a zero wattage.

I think the request to show offline as “n/a” or some other non “0” representation is reasonable, and I’ll forward that feedback to the product team for consideration as a future improvement.
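
For readers wondering what that kind of hysteresis could look like, here is a Python sketch. It is only a guess at the general idea: it assumes the last good reading is held during a short dropout, a 2-second sample period, and a 30-second hold time. None of those details are confirmed as Sense’s actual implementation.

```python
from typing import List, Optional, Sequence


def apply_dropout_hysteresis(samples: Sequence[Optional[float]],
                             sample_period_s: float = 2.0,
                             hold_s: float = 30.0) -> List[float]:
    """Hold the last good reading through short gaps; report 0 W after hold_s.

    None marks a query that got no reply from the plug.
    """
    out = []
    last_good = 0.0   # value to repeat while a gap is still short
    missing_s = 0.0   # how long the current gap has lasted
    for sample in samples:
        if sample is not None:
            last_good = sample
            missing_s = 0.0
            out.append(sample)
        else:
            missing_s += sample_period_s
            out.append(last_good if missing_s < hold_s else 0.0)
    return out
```

With these assumed settings, a gap of up to 14 consecutive missed replies is bridged, and the 15th missed reply (30 s) is the first to show up as 0 W.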

Thanks @JonahAtSense,

Sounds like the hysteresis can “fix” short dropouts. I already have a wishlist item to optionally expose missing data in the UI and export via something like NA.

@ixu, there’s one more tricky element about dealing with NA. When I added my wishlist item, I suggested it as an option that a user could toggle on and off. Why? Aggregation operations. When Sense rolls up hourly, daily, monthly and billing cycle level data, there are two ways to do the aggregation math:

  1. Treat the NAs as NAs which means that an hour of power consumption with 59 min and 30 sec of good data and 30 sec of NA, sums to NA. That’s a good thing when you are looking for issues and trying to understand their sources. Not so useful when you want to squeeze out as much info as you can from the data.

  2. Treat the NAs as 0. That’s how Sense currently functions. That kind of hides data loss, especially when aggregated into hours, etc.

There might be a hybrid approach that aggregates an entire hour of NAs to NA, while treating the NAs like zeros if the NAs don’t fill an entire hour. But probably the best option would be to only expose the NAs in the main and smartplug Power Meters - that would pinpoint the exact times of missed data. Then Sense could continue to aggregate with NAs treated as 0, exactly as they do today.
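
The three aggregation choices above can be sketched in a few lines of Python. This is only an illustration of the math, with None standing in for NA, mean power as the roll-up, and made-up mode names; it says nothing about how Sense actually stores or aggregates data.

```python
from typing import Optional, Sequence


def aggregate_hour(samples: Sequence[Optional[float]],
                   mode: str = "zero") -> Optional[float]:
    """Mean power for one hour of samples under different NA policies.

    mode="strict": any NA makes the whole hour NA.
    mode="zero":   NAs count as 0 W (hides data loss).
    mode="hybrid": NA only if every sample in the hour is NA.
    """
    if not samples:
        return None
    missing = sum(1 for s in samples if s is None)
    if mode == "strict" and missing:
        return None
    if mode == "hybrid" and missing == len(samples):
        return None
    return sum(0.0 if s is None else s for s in samples) / len(samples)


hour = [100.0, None, 100.0, 100.0]
strict_result = aggregate_hour(hour, "strict")   # None
zero_result = aggregate_hour(hour, "zero")       # 75.0
hybrid_result = aggregate_hour(hour, "hybrid")   # 75.0
```

Note how the "zero" and "hybrid" results understate a steady 100 W load as 75 W, which is the hidden data loss described in option 2.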

@kevin1 … nice explication.

The other method, I think, though potentially complex from a few angles, would be to stick to the NA as 0 and break out the NA event into a separate log.

  • Infrequent (>1hr) NA cycles could be logged without significant processor load (?) or data overhead.

  • Mid-frequency cycling NAs (<1hr) could be logged either in detail (if processor/data allows) or aggregated and characterized in the log.

  • High-frequency cycling NAs (~1s?) could trigger some kind of modal behavior. Methodology open to debate on that.

In any case, the existing method of NA=0 could be adhered to and historical data exports remain valid.

I see some kind of log export of other data as inevitable anyway.
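
As a rough sketch of the separate-log idea, the Python below leaves the sample stream alone (NA can still be stored as 0 elsewhere) and collapses each run of missing samples into one log entry. The names and the (first_missing, last_missing) tuple layout are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def missing_runs(timestamps: Sequence[float],
                 samples: Sequence[Optional[float]]) -> List[Tuple[float, float]]:
    """Return one (first_missing, last_missing) timestamp pair per NA run."""
    events = []
    start = None  # timestamp where the current NA run began, if any
    last = None   # most recent missing timestamp in the current run
    for t, sample in zip(timestamps, samples):
        if sample is None:
            if start is None:
                start = t
            last = t
        elif start is not None:
            events.append((start, last))
            start = None
    if start is not None:  # the data ended while still in an NA run
        events.append((start, last))
    return events
```

A long outage costs a single log entry regardless of its duration, so infrequent NA cycles add almost no data overhead. Rapidly cycling NAs would produce many short entries and would need the kind of aggregation suggested above.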
