Given that we’re into the New Year, I decided to do one last analysis of my 2022 energy usage and compare against my utility bill for accuracy. Along the way I was reminded of all the weird and hidden idiosyncrasies in the Sense Data Export feature.
Starting Out: How to Pull Sense and Utility Data
Download Sense data for the year 2022 data broken in to 1 hour intervals using the Export function in the Web App.
Examine both files for missing hours and other irregularities, then fix or log them - more on the details when I discuss the irregularities, below.
Select just the “Total Usage”, “Solar Production” and “Always On” data from the Sense download, then pivot the data to a wide format where those 3 entries are columns.
Subtract “Solar Production” from " Total Usage" to get “Net Usage”, so I can compare against my utility’s net meter reading.
Sum the the 15 min utility data into hourly readings so that both data sets are in hours.
Join the data from the two sources by date and time so everything is matched up, then compare.
What I Discovered in the Exported Data
The Sense exported data isn’t necessarily in date/time order in the Exported CSV file that comes out - don’t rely on it being in order. And don’t be confused if the first entries you see in the file are not the first or the last day or hour in your interval. Sense chunks the data when exporting, resulting in potential confusion. Just remember to sort by date / time before thinking the export is wrong.
You will likely see missing hours in the exported data, for a lot of reasonable reasons:
Native Detections - can leave lots of hourly gaps, most frequently when a native detection is off for an entire hour. Sense does NOT log a 0 in the Export file for hours when the device was off, so there will not be a row of data in the Export file for that hour. Unfortunately, there are other possible circumstances and time periods where Sense fails to produce an hourly record as well - before a device has been added to your home, after the device has been removed from your home and sometimes when the Sense monitor or your home network is having difficulties. Sometimes, working with the Export file, you’ll need to interpret whether a missing hour is really a 0 measurement or truly missing data.
Smart plugs generally report continuously, so Sense Export typically does include data for every hour as long as the measurement is over 1/2 Wh or so, and as long as Sense monitor and your home networking are delivering the data. You can see the real measured data in the Device Power Meter for each smart plug. If you see gaps / 0’s that could be an indicator of data drops, that will not show up in the Export.
You are also likely to see duplicated entries for few reasonable reasons:
There might also be mismatches between in-App devices and Exported devices, beyond the duplicates situation. I’m not sure why these mismatches happen yet, but I have at least on mismatch for all of 2022 - “Floor Heater Guest” (in-App) vs “Floor Heat Guest Room” (Export).
My utility data isn’t perfect either. I’m guessing yours won’t either. My 15 min data didn’t have any holes, but it did have two issues:
My utility has the same representational problem as Sense, when “falling back” - there is no way to represent the second 1AM hour (1AM standard time), in the export format. But instead of outputting the second 1AM as a duplicated hourly result, as Sense does, the utility just ignores the second 1AM.
There are also 15 minute increments labeled as * This data was estimated, some of them when my Sense monitor wasn’t producing any data, but not all. presumably, these happen when there is either an power outage, or the utility’s own network isn’t functioning correctly.
To get a quick summary of most of this info from out of your 2022 data Export, you don’t need to be a data whiz. All you need to do is to read the CSV into Excel and do a Pivot Table.
14 Hours in 2022 Lost to Sense
Here are 14 hours of 15 minute intervals reported by my utility which Sense missed.
The first chunk on Jan 6, labeled “* This data was estimated” truly looks like a power outage. Note how small the usage is - 20Wh to 130Wh.
The second chunk, on July 29th, is more likely an internet outage or network problem on my end that lasted beyond Sense’s 6 hour storage.
DATE START END USAGE NOTES Hour
1 2022-01-06 02:00 02:14 0.02 * This data was estimated 02:00:00
2 2022-01-06 02:15 02:29 0.04 * This data was estimated 02:00:00
3 2022-01-06 02:30 02:44 0.02 * This data was estimated 02:00:00
4 2022-01-06 02:45 02:59 0.04 * This data was estimated 02:00:00
5 2022-01-06 03:00 03:14 0.02 * This data was estimated 03:00:00
6 2022-01-06 03:15 03:29 0.04 * This data was estimated 03:00:00
7 2022-01-06 03:30 03:44 0.18 * This data was estimated 03:00:00
8 2022-01-06 03:45 03:59 0.14 * This data was estimated 03:00:00
9 2022-01-06 04:00 04:14 0.12 * This data was estimated 04:00:00
10 2022-01-06 04:15 04:29 0.13 * This data was estimated 04:00:00
11 2022-01-06 04:30 04:44 0.12 * This data was estimated 04:00:00
12 2022-01-06 04:45 04:59 0.13 * This data was estimated 04:00:00
13 2022-01-06 15:00 15:14 0.01 * This data was estimated 15:00:00
14 2022-01-06 15:15 15:29 0.02 * This data was estimated 15:00:00
15 2022-01-06 15:30 15:44 0.02 * This data was estimated 15:00:00
16 2022-01-06 15:45 15:59 0.01 * This data was estimated 15:00:00
17 2022-01-06 16:00 16:14 0.02 * This data was estimated 16:00:00
18 2022-01-06 16:15 16:29 0.02 * This data was estimated 16:00:00
19 2022-01-06 16:30 16:44 0.02 * This data was estimated 16:00:00
20 2022-01-06 16:45 16:59 0.04 * This data was estimated 16:00:00
21 2022-07-29 06:00 06:14 0.35 NA 06:00:00
22 2022-07-29 06:15 06:29 0.24 NA 06:00:00
23 2022-07-29 06:30 06:44 0.22 NA 06:00:00
24 2022-07-29 06:45 06:59 0.22 NA 06:00:00
25 2022-07-29 07:00 07:14 0.16 NA 07:00:00
26 2022-07-29 07:15 07:29 0.81 NA 07:00:00
27 2022-07-29 07:30 07:44 0.12 NA 07:00:00
28 2022-07-29 07:45 07:59 0.15 NA 07:00:00
29 2022-07-29 08:00 08:14 0.26 NA 08:00:00
30 2022-07-29 08:15 08:29 0.39 NA 08:00:00
31 2022-07-29 08:30 08:44 0.33 NA 08:00:00
32 2022-07-29 08:45 08:59 0.11 NA 08:00:00
33 2022-07-29 09:00 09:14 0.15 NA 09:00:00
34 2022-07-29 09:15 09:29 0.18 NA 09:00:00
35 2022-07-29 09:30 09:44 0.23 NA 09:00:00
36 2022-07-29 09:45 09:59 0.54 NA 09:00:00
37 2022-07-29 10:00 10:14 -0.07 NA 10:00:00
38 2022-07-29 10:15 10:29 -0.24 NA 10:00:00
39 2022-07-29 10:30 10:44 -0.21 NA 10:00:00
40 2022-07-29 10:45 10:59 -0.06 NA 10:00:00
41 2022-07-29 11:00 11:14 1.07 NA 11:00:00
42 2022-07-29 11:15 11:29 1.06 NA 11:00:00
43 2022-07-29 11:30 11:44 0.21 NA 11:00:00
44 2022-07-29 11:45 11:59 0 NA 11:00:00
45 2022-07-29 12:00 12:14 -0.4 NA 12:00:00
46 2022-07-29 12:15 12:29 0.03 NA 12:00:00
47 2022-07-29 12:30 12:44 1.42 NA 12:00:00
48 2022-07-29 12:45 12:59 0.01 NA 12:00:00
49 2022-07-29 13:00 13:14 0.18 NA 13:00:00
50 2022-07-29 13:15 13:29 0.29 NA 13:00:00
51 2022-07-29 13:30 13:44 1.06 NA 13:00:00
52 2022-07-29 13:45 13:59 -0.39 NA 13:00:00
53 2022-07-29 14:00 14:14 -0.31 NA 14:00:00
54 2022-07-29 14:15 14:29 0.33 NA 14:00:00
55 2022-07-29 14:30 14:44 0.28 NA 14:00:00
56 2022-07-29 14:45 14:59 0.57 NA 14:00:00
Here are the two hours I find in my Sense Exported data, but not in what comes from my utility.
The first is an obvious, but small issue on Sense’s part - The hour associated with Jan 1, 2023 starting at 12:00AM touches 2022, but doesn’t live in 2022. So this one can safely be excluded.
The second hour, 1AM on Nov 6th, is really a reflection of me fixing the “fall back” problem in the Sense data. The 1 AM is really the 1 AM standard time that follows 1 AM daylight savings time, the extra yearly hour in the day, when we “fall back”. I fixed in the Sense data because Sense outputs duplicate 1AM data. So this one is useful to have ins the Sense data, but can’t be compared against my utility data.
DateTime Solar Always On Total Net
1 2023-01-01 00:00:00 0.001 0.513 2.41 2.41
2 2022-11-06 01:00:00 0.001 0.433 0.834 0.835
When I first looked at the Sense Export, I was baffled to see a huge number of duplicate entries - Two Device entries for the same hour often with different values. Then I realized that device Names can refer to multiple Device IDs, which are the real detections, that have been merged. I then did a count of duplicates by Name and Device ID and got the following list:
Name DevID count
1 AC 3 4466b6a8 834
2 AC 3 90172486 834
3 Always On always_on 2
4 AppleTV Family Room da94c832 2
5 AppleTV Master caa89d7f 2
6 Cable Amps 0b550d1e 2
7 Canon Printer E4KDKPsK 1
8 Canon Printer ks5w26g4 1
9 Coffee Maker 35d584f6 8
10 Coffee Maker b05db37b 8
11 Family Room Switch a12d3fb8 2
12 Fridge f022103b 2
13 Fridge 2 ebd937ef 2
14 Furnace Down dtSNTXPF 2
15 Furnace Up OsBEZsNy 2
16 HP LaserJet 3b211685 2
17 Hue Down 09190c6b 2
18 Keene Vent Bridge 46d8391d 2
19 KP115 Roamer ssi-78bdfde3 2
20 Kyle’s PC da6fd827 2
21 Microwave d0956b5a 129
22 Microwave e0a3bf5d 129
23 Modem 6f6e7d08 2
24 Office Always On 1022400f 2
25 Playroom Cluster ba04ecc2 2
26 Recirculation Pump 9f9090d6 2
27 Router 4bb72296 2
28 Solar Production solar 2
29 SolarCity Bridge 80cd62b4 2
30 Switch e4372a48 2
31 TiVo db039872 2
32 Tivo Mini 04894581 2
33 Total Usage mains 2
...
Surprisingly there appeared to be 2 kinds of duplicates.
Same Name but different DevIDs - that’s the merged device case, like AC 3, Coffee Maker and Microwave.
Same Name, single DevID - How did those happen ? How, became an apparent when I looked at the date and time of those duplicates, all at 1AM on Nov 6th. All of those are related to the extra “fall back” daylight savings transition hours.
Once I moved all those Nov 6, 1AM duplicates to the extra hour, I had a much shorter list, that made sense due to merges.
There’s still one minor mystery - the Canon Printer is not merged and is a smart plug. I suspect this duplicate happened when I was setting up the smart plug and I might have reset the device and connected a second time.
I should also point out that I have a number of smart plugs that are merged using “What’s connected to this ?”, but none of those show as duplicates in the Export. That’s because Sense Export only outputs the data from the smart plug, since it is the smart plug reflects the true data for the whole device. Not the same as merge of native detections that are additive.
You Might Find Some Hidden Surprises in Your Data Export !
I found this hidden device with Device ID ‘battery’ scattered throughout my Sense hours from July 9th until Dec 28th. No real data, but perhaps related to this ??
DateTime DevID Name kWh
1 2022-07-09 02:00:00 battery NA 0
2 2022-07-09 03:00:00 battery NA 0
3 2022-07-09 20:00:00 battery NA 0
4 2022-07-11 10:00:00 battery NA 0
5 2022-07-11 12:00:00 battery NA 0
6 2022-07-11 13:00:00 battery NA 0
7 2022-07-11 14:00:00 battery NA 0
8 2022-07-11 16:00:00 battery NA 0
9 2022-07-11 17:00:00 battery NA 0
10 2022-07-11 18:00:00 battery NA 0
11 2022-07-11 19:00:00 battery NA 0
12 2022-07-12 10:00:00 battery NA 0
13 2022-07-12 11:00:00 battery NA 0
14 2022-07-12 12:00:00 battery NA 0
15 2022-07-12 13:00:00 battery NA 0
....
656 2022-11-22 15:00:00 battery NA 0
657 2022-11-22 16:00:00 battery NA 0
658 2022-11-23 19:00:00 battery NA 0
659 2022-11-30 12:00:00 battery NA 0
660 2022-12-05 17:00:00 battery NA 0
661 2022-12-06 11:00:00 battery NA 0
662 2022-12-11 20:00:00 battery NA 0
663 2022-12-13 00:00:00 battery NA 0
664 2022-12-14 18:00:00 battery NA 0
665 2022-12-16 13:00:00 battery NA 0
666 2022-12-16 14:00:00 battery NA 0
667 2022-12-20 10:00:00 battery NA 0
668 2022-12-23 03:00:00 battery NA 0
669 2022-12-27 12:00:00 battery NA 0
670 2022-12-28 19:00:00 battery NA 0
You might also find deleted devices lurking in your Export file as well, but non this year for me.
Here’s the fast way to glean most of what I highlighted above for your data using basic Excel. It only takes a few simple steps:
Read the 1-hour data from Jan 01, 2022 to Jan 01, 2023.csv into Excel. Select the region that included the header and columns with the data, all the way to the bottom of your file.
Do into the top level Data menu and hit “Summarize with Pivot Table”. Fill in the Pivot Table Fields like this. The “# of Data Hours” user to be “Count of Device IDs”, but I changed the name in the next step.
You’ll get a new Pivot Table sheet that looks like this. I have done a couple modifications to my Pivot Table using the customization icon to the right of the Row Labels. I have removed a few devices for clarity, and sorted by the decreasing “Count of Device IDs”, then renamed to “# of Data Hours”
I have also added the type of detection used for each device, to add a little context. It also illuminates a few things.
Total Usage and Solar production are the most reliable (always produced data) - Sense produced data for these for 8747 out of 8760 hours in a non-Leap-Year. So 13 missing hours.
My most reliable smart plugs produced data for 8693 out of 8760 hours, missing 67 hours in the year. What’s more, is that it is the same 67 missing hours for all the most reliable smart plugs - the 13 hours where it looks like my Sense monitor wasn’t collecting data, plus the same 54 hours where there was likely some kind of network SNAFU in my house that borked even the most reliable smart plugs.
Always On was only produced for 7589 out of 8760 hours - 13 missing due to the Sense monitor and 1158 hours missing due to other reasons. Typically, Sense does not produce a value for Always On in export when the value of the calculation is either zero or negative, usually due to something being wrong with the top-line Solar Production and Total Usage data.
Devices with multiple Device Ids for a single Name are also clearly visible - Merged devices like AC 3, plus also measurements taken by my Wiser smart plug/switches.
You can also see the devices that don’t run much, but use lots of energy in the Native Detections at the bottom - devices like our EVs and oven.
One of my end goals was to get a better view of my big ticket device over the year. With a little more knowledge of my Exported Data, I’m ready to do that by charting my weekly usage using a program I wrote a couple years ago and have since improved.
Unfortunately, it looks like my Exported data is somewhat less amenable to being plotted than it was a couple years ago. Two things stand out and distort/obscure the results.
Other is negative every week - At least according to the Export data. This is where I have to do some digging. Three important things to realize:
Sense doesn’t output values for Other via Export. A user has to calculate them as Total Usage - all detected usage (Always On + native detections + smart plugs).
Some of the detections in the Export data might be extraneous - I found a second redundant AC 3 device that nearly matched my main one in the Export file. Extra false loads like this can create negative Other.
I’m not sure how the Sense app treats negative Other. Places where Other might have gone negative are visible in the Hourly, Weekly and Monthly trend bars - the bar drops to zero. That might indicate that the Sense app zeros out Other any time it goes negative with the app or is used for their data aggregation
I’m going to do three fixes to the data based on these discoveries and see how the results change:
Filtering out the extra ‘AC 3’ entry
Clamping the ‘Kitchen Overheads’ to 105Wh when they are on.
And the same trick Sense seems to do - clamping Other on the bottom to 0.
Time to get some understanding about how my device totals compare against the in-App totals This is going to be a manual process because I have to hand-enter trend data from the app (the Home Assistant integration only pulls Trend data for top level usage and solar production), so I’m just going to spot check the top 20 devices against my 2022 totals from the Exported data.
The only significant difference is Other - in-app yearly total is 2823kWh vs
4273.0 with Export data and calculation, when I zero out hours where Other is negative
-899.6 with Export data and calculation, when I leave the negative hourly Other data in
But we have to remember that Other is computed value (Total Usage - all detections for that hour - Always On). If any one of those values was different or not included, then Other is going to be off.
Several Exported devices had duplicates and in-App did as well:
Kitchen Overheads - the device with the huge power spike problem in Oct. I included both in my Export calculation and in-App Sense shows 2 as well. When I add the two in-App numbers together I get the same 5179.1 the Exported data shows.
AC 3 - The air conditioner device that has a nearly duplicated Export device, but for a shorter time period. I only included the one over the longer time period in my Exported data. But that matches up with the larger of the two in-App Ac 3’s.
There’s also a Floor Heater Guest device that shows up in-App but NOT in the Export, probably because I lumped a bunch of smaller devices into “Other Identified”. If expand to all devices I see something really weird. In-App, I have a “Floor Heater Guest”, spelled exactly that way at 899.2. In Export, I have a “Floor Heat Guest Room” that only comes in at 32.0. Not sure why I have this mismatch between in-App and Export device names.
I’ll also point out that if I include the second AC 3 (as Sense does), plus the Floor Heater Guest into the Other subtraction, my Export number comes much closer to Senses, down from 4273 to 3184. So I’m guessing I should do a full direct device-by-device comparison.
The Trouble with Time Series Data !
My Export vs. In-App number comparisons pointed out that I need to do an even more basic device-by-device comparison between the Exported Devices and the In-App Devices. To do this, I carefully scraped the full Device List from the current Sense Device List, from the Sense Web App. Overall statistics:
Sense In-App - 78 devices (by Device ID) with 73 unique Names (5 name duplicates)
Sense Export - 77 devices (by Device ID) with 71 unique (6 name duplicates)
So not so bad, right - maybe 1 difference ? Turns out it’s a lot more complicated.
My first table, below, is a name comparison with the names/w device IDs that are In-App but not in the Export on the vertical and the names/w device IDs in the Export but not In-App on the horizontal.
3 of the names are somewhat different but have matching Device IDs (green, orange, blue). In two the cases the names are only minimally off, one with an extra space and the other messed up around a special character ('). The last is a place where the device looks like it was renamed in-App, but somehow Export is still seeing the default detection name, Water Heater 3.
There are also 4 names, 1 in-App and 3 in the Export, where the mismatches are all understandable (the ones in green), given the sources I used. Solar, mains and battery are not in the device list. Other is not in Exported data, but is in the device list.
That leaves 5 names In-App and 2 names (duplicates) in Export that don’t have counterparts.
The bottom is a scaled down comparison I did using Device IDs. I removed the ones I resolved. Ultimately a few more unmatched Device IDs showed up, though 6 of those are probably due to duplicated devices (3 in each list), which gets us back to the same 2 and 5 mismatches (in yellow) that existed at the name level.
I should also point out that many of these mismatched devices, though not all, come from Sense integrations and tend to live on forever, even if the origin devices disappear.
Ugly huh ? But I don’t see this as sloppiness on Sense’s part. This looks very similar to problems that I have seen when working on projects for my wife’s company, having to do with analytics for keywords in Google Ads. All the time-series analytics work nicely when everything is stable without changes to Campaigns, Ad Groups and Keywords aren’t changing. But things get much more challenging from an analytics perspective when the overriding structure changes. In Google-land, it’s Campaign, Ad Group and Keyword changes as well as changes in the associated hierarchy. In Sense, it’s new devices, renamings, device deletion and rediscoveries. Both of these result in objects being around that might not have existed at the beginning or end of the analysis period, and in multiple instances with the same object name that may or may not correspond to similar named objects.