I’m doing a quick cut-and-paste of 4 separate posts I did in the more limited-access Data Analysis category. @RyanAtSense can give you access to that category if you have more interest in the kind of data analysis folks are doing with Sense.
The first thing many people ask when they first install their Sense is “How accurate is Sense vs. my utility monitoring?” We can now begin to answer that, at least in the case where one’s utility allows them to download usage data. I’m going to walk through my accuracy analysis process for all the existing data I have for 2018.
In my case, I have solar, so I have two utility data sources:
- An hourly net meter energy reading from my electric utility, PG&E, via their “Green Button”. Unlike some solar installs, I don’t have a second meter looking at my total house usage - I’ll only be able to compare Sense net usage vs. the net usage from my PG&E meter.
- An hourly solar energy reading from my SolarEdge inverter, downloaded via my SolarCity / Tesla portal. Unlike PG&E and Sense, the SolarCity download capability only allows hourly-resolution downloads one day at a time. I had to write a little script to download all 224 days of 2018 I was interested in, then merge them all together.
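The merge step of that script can be sketched as below. This is a Python sketch (my actual script details differ); the `solar_YYYY-MM-DD.csv` naming and the `timestamp`/`energy_wh` columns are assumptions, not the actual SolarCity export layout.

```python
# Merge the one-day-at-a-time SolarCity downloads into a single table.
# File naming and column names are assumed for illustration.
import glob
import pandas as pd

def merge_daily_exports(pattern="solar_2018-*.csv"):
    # Read each daily export, then concatenate into one frame
    frames = [pd.read_csv(path, parse_dates=["timestamp"])
              for path in sorted(glob.glob(pattern))]
    merged = pd.concat(frames, ignore_index=True)
    # Keep everything in time order for the later hourly comparisons
    return merged.sort_values("timestamp").reset_index(drop=True)
```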
Once the data was in, I did a little processing in R:
- Converting the narrow (long) Sense data format into a wide time-vs-device format and extracting only the Total Usage and Solar Production columns
- Calculating my Sense net usage (Sense Total Usage + Solar Production, since Solar Production is negative)
- Calculating the difference (Diff): PG&E net usage minus Sense net usage
- Calculating the percentage difference (PerDiff): that same difference as a percentage of PG&E net usage
- Aggregating the data into daily data so I could also compare that.
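For anyone who wants to replicate the steps above, here’s a rough pandas equivalent (my actual processing was in R). The column names (`DateTime`, `Name`, `kWh`, `NetUsage`) are assumptions about the export layouts, not the exact ones in the files.

```python
# A pandas sketch of the processing steps: pivot narrow -> wide,
# compute Sense net usage, Diff, PerDiff, and daily aggregates.
import pandas as pd

def prepare(sense_narrow, pge):
    # Narrow (one row per device per hour) -> wide (time vs. device)
    wide = sense_narrow.pivot(index="DateTime", columns="Name", values="kWh")
    wide = wide[["Total Usage", "Solar Production"]]
    # Sense net usage: Total Usage + Solar Production (production is negative)
    wide["SenseNet"] = wide["Total Usage"] + wide["Solar Production"]
    # Difference and percentage difference vs. the PG&E net meter reading
    merged = wide.join(pge.set_index("DateTime"))  # pge carries a "NetUsage" column
    merged["Diff"] = merged["NetUsage"] - merged["SenseNet"]
    merged["PerDiff"] = 100 * merged["Diff"] / merged["NetUsage"]
    # Aggregate the hourly rows into daily totals for the daily comparison
    daily = merged[["SenseNet", "NetUsage"]].resample("D").sum()
    return merged, daily
```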
An example of the wide Sense data is below. Note all the NAs (not available) for hours where there was no energy value available in the Sense export for those devices.
[Image: Screen Shot 2018-08-25 at 3.40.40 PM.png]
Here are the initial results for Sense net usage vs. PG&E net usage in a scatter plot. A high percentage of the data correlates quite well. The slope of the line is almost exactly 1 at both the hourly and daily level. But there are a significant number of outlier data points, especially at the hourly level, where the Sense power values are well below the PG&E values. And based on color, the furthest outliers occur in March and June. There also seems to be a peculiar upward-curving tail in the negative end of the hourly curve. I’ll investigate both of those in depth a little later.
[Image: HourlySensevsPGE.png]
[Image: DailySensevsPGE.png]
One more look at the accuracy - if I look at a histogram of the difference between PG&E net usage and Sense net usage, two things become apparent.
- There’s a long, but very thin “error tail” on the positive side, where the PG&E net is much greater than the Sense net.
[Image: DiffHist50W.png]
- But the huge majority of the differences fall within a very narrow band around zero. Based on my histograms, I’m only going to look in detail at the 217 hourly mismatches (out of 5,421 total) greater than 200W to start with.
Hourly difference is less than 500W
[Image: DiffHist10W.png]
Hourly difference is less than 200W
[Image: DiffHist10W2.png]
Comparing Solar Generation
The next thing to do is to compare Sense against the other data source, our hourly feed from SolarCity, showing production from our 4.2kW solar system that uses a 2013 vintage SolarEdge inverter. All the data is going to be negative (energy consumption is positive, production is negative), so this might also give a little more insight into the negative tail of the net usage correlation scatter plots earlier.
My first scatter plot of the hourly solar data, colored by month, was a little confusing. The “line” was more of an ellipse, and the coloring really didn’t offer any clues as to why.
[Image: HourlySolar.png]
But shifting to coloring based on time of day showed a discernible pattern: Sense gave higher readings than SolarCity in the morning, SolarCity gave higher readings in the afternoon - a systemic error. But why?
[Image: SolarHourly2.png]
Looking at the aggregated daily data, the ellipse goes away, indicating that the error cancels itself out on a daily basis. Note that I have gone back to coloring by month for the daily plot.
[Image: SolarDaily.png]
A few things we can see from the daily plot.
- There is good correlation between the Sense solar data and SolarCity, though the slope of the correlation line is slightly less than one. I’m not going to calculate it by regression until I remove the erroneous outliers for which I can find a measurement issue, but the slope differential bears out the roughly 4% SolarCity/SolarEdge inverter over-optimism I saw with earlier measurements.
- The hourly measurement ellipse is gone. That indicates there were offsetting measurement differences between Sense and SolarCity that cancel out when aggregated on a daily basis. I suspect there is an interval-assignment difference of 15-30 minutes between the two measurements. Perhaps the SolarCity measurement is for the hour centered on the timestamp, while the Sense measurement is for the hour following the timestamp. That would explain the ellipse.
- We have a number of big hourly differences between SolarCity and Sense, but only a few big relative differences between SolarCity and Sense at a daily level. That means that the big daily mismatches are likely caused by the accumulation of sequential hourly errors.
A quick look at the hourly differences in a histogram shows a different view of the ellipse: a fairly typical distribution as wide as the ellipse, with a small number of outliers worth investigating.
[Image: SolarDiffHourly.png]
The daily histogram tells us that we really only need to look closely at a small number of mismatches. The histogram once again highlights the systemic optimism of the SolarCity data - the mode of the distribution is offset by about 800W from 0.
[Image: SolarDiffDaily.png]
Now that we have looked at direct comparisons between Sense and both “utility” sources, it is time to do some detailed mismatch analysis between the Sense and PG&E data.
Pushing into the Biggest Errors
So now it’s time to figure out the origins of the 217 biggest hourly mismatches between Sense net energy (I’m correcting this from previous entries - I’m really comparing energy, not power) and my PG&E hourly net energy, out of the 5,421 exported hours. But how does one dig through that much data to analyze and categorize root causes? Fortunately there are a couple of easy clues about what to look at more carefully:
- First, the comparison between the solar hourly and daily mismatches hinted that multiple mismatches often occur in the same day. Therefore it makes sense to apply a measure to each mismatch that counts how many other mismatches occurred in the surrounding 24 hours or so. That measure of locality should help identify systemic issues that last multiple hours.
- Second, looking at the 5 largest mismatches (largest Diff) in the sorted list, it’s clear that something is wrong with Sense’s measurement during these hours - Total Usage is negative! So we should look at the magnitude of mismatches vs. Sense Usage and Sense Solar to see if there is any pattern. What’s more, most of the largest mismatches take place during only 3 different days (table below).
[Image: Screen Shot 2018-08-27 at 8.19.17 AM.png]
It’s a simple matter to add a calculation to sum the number of mismatches in the surrounding 24 hours. We’ll call that value the number of problem neighbors (ProbNeigh) and plot the results over time.
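That calculation can be sketched as below (a Python sketch; my actual version was in R). It counts, for each mismatch, the other mismatches within ±12 hours, i.e. the surrounding 24-hour window.

```python
# Count "problem neighbors" (ProbNeigh): for each mismatch timestamp,
# how many other mismatches fall within +/-12 hours of it.
import pandas as pd

def problem_neighbors(mismatch_times, window_hours=12):
    times = pd.Series(sorted(mismatch_times))
    counts = []
    for t in times:
        # Number of mismatches within the window, minus the mismatch itself
        nearby = ((times - t).abs() <= pd.Timedelta(hours=window_hours)).sum()
        counts.append(nearby - 1)
    return pd.Series(counts, index=times.values)
```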
[Image: ErrVsDate.png]
The time locality of many of the mismatches jumps out from this chart, though there also appears to be a random dispersion of other mismatches over time. Let’s push more closely into a time region with plenty of grouped issues.
[Image: ErrsVsDate2.png]
Clearly we’re going to need to look into what happened on April 29th, June 7th, June 10th and June 21st.
But before we look too closely, we should also look for patterns between the size of mismatches, Sense Usage and Sense Solar. Here’s that plot:
[Image: SenseSolarVsSenseUsage.png]
It should be immediately apparent that there are some problematic Sense measurements here. Sense Solar data should always be zero or negative (or at most a very small positive value), and Sense Total Usage should always be positive. We have two quadrants of measurements that violate those rules, with the ones showing negative Sense Total Usage and positive Sense Solar containing the mismatches of largest magnitude. What’s happening here?
Now it’s time for the painstakingly manual part of the process: looking at the detailed waveforms for each of the mismatches to see what’s actually amiss, starting with the obvious Sense measurement errors I identified earlier.
The good news is that the Sense web app makes this much easier. Once you are logged in, the Sense web app lets you call up a time specific window in the Power Meter via a start and end parameter in the URL. Using this feature, I was able to quickly generate URLs for every 2 hour window surrounding all 217 mismatches. Here’s an example URL that displays the 2 hours surrounding 9pm on the 2nd of Jan in the Power Meter:
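Generating those window URLs is a one-liner per mismatch. In the sketch below, the URL template is a deliberate placeholder - substitute the actual Sense web app address and its start/end parameter names, which I’m not reproducing here.

```python
# Build a URL for the 2-hour Power Meter window around a mismatch.
# TEMPLATE is a stand-in, not the real Sense web app URL format.
from datetime import datetime, timedelta

TEMPLATE = "https://example.invalid/meter?start={start}&end={end}"  # placeholder

def window_url(center, hours=2):
    half = timedelta(hours=hours / 2)
    start = (center - half).isoformat()
    end = (center + half).isoformat()
    return TEMPLATE.format(start=start, end=end)
```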
So now comes the fun part - actually seeing what mismatches look like. For my Sense environment, I discovered two flavors of measurement errors. The first was the negative Sense Total Usage situation, shown below.
The start of a negative Total Usage hour:
[Image: Screen Shot 2018-08-27 at 12.22.49 PM.png]
The end of a negative Total Usage hour after a reboot:
[Image: Screen Shot 2018-08-27 at 12.24.04 PM.png]
For three multi-hour periods, on March 7th, April 28-29th and June 7th, my Sense monitor went totally wonky and produced these kinds of waveforms and data, eventually correcting itself or being fixed with a reboot. As the earlier graph suggested, this Sense monitor issue was the root cause of the 5 largest errors, as well as a bunch of smaller ones. The good news is that Sense has updated the monitor firmware to alleviate this bug.
The second flavor of Sense measurement errors first made itself known as I was investigating the cluster of mismatches detected on June 10th. More pedestrian than the negative Sense Total Usage phenomenon, this mismatch stems from simple loss of data, either due to monitor down time or a networking problem extensive enough to prevent full backfill of the data from the Sense monitor buffer.
[Image: Screen Shot 2018-08-27 at 12.36.44 PM.png]
It turns out that the next batch of mismatches by magnitude was caused by this kind of data gap occurring late at night while our EVs were charging, like the situation below.
[Image: Screen Shot 2018-08-27 at 12.25.02 PM.png]
Eventually, after looking carefully at the first 20 or so waveforms, I decided to just muscle my way through all 217 mismatches, categorizing the two monitor measurement issues along the way. In the process, I discovered a third flavor of mismatch waveform - one that looked completely normal. The waveform below is associated with the largest mismatch that can’t be directly traced to an obvious Sense measurement problem or data gap.
[Image: Screen Shot 2018-08-27 at 12.27.03 PM.png]
So where did the mismatch come from? We’ll have to delve a little deeper, and more analytically, into these, plus look at the PG&E side of the equation as well.
BTW - quick statistics on the 217 largest hourly mismatches after viewing them all: 36 stemmed from negative Total Usage, 56 tied back to data gaps from the Sense monitor, and 125 look, for all intents and purposes, completely normal on the Sense side.
Analyzing the Error Types
Whew! Now that I’m done labeling all the Sense measurement errors and gaps that result in mismatches, I can do some fun stuff. I’m going to try to categorize the 217 mismatches by all the known variables, including the number of problem neighbors and whether they are Sense data-gap issues.
I decided to use simple K-Means clustering with 4-6 possible clusters, and tried to adjust the weighting to do two things:
- Clearly separate mismatches that were Sense monitor related vs. others of unknown origin, so I could focus on the ‘non-gap’ mismatches
- Create distinct groupings in the “non-gap” part of the mismatches
Without going into all the details and weighting experiments, here are some of the results given my strategy.
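The weighted clustering step can be sketched as below, using scikit-learn (my version was in R). The idea is to standardize the features, then scale each one by a hand-tuned weight before running K-Means; the feature choices and weights shown are illustrative, not the ones I settled on.

```python
# A sketch of weighted K-Means clustering of the mismatches.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def cluster_mismatches(features, weights, n_clusters=5, seed=0):
    # Standardize each feature, then scale by a hand-tuned weight so
    # the features you care about dominate the cluster splits.
    X = StandardScaler().fit_transform(features) * np.asarray(weights)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(X)
```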
My clustering cleanly separates mismatches due to Sense monitor problems from other errors, and also stratifies by the number of problem neighbors. Note that most Sense monitor errors occurred standalone, or nearly standalone, with no other nearby mismatches.
[Image: GapVsNeigbors.png]
The clustering approach also stratifies the ‘non-gap’ mismatches by the magnitude of Sense Usage. We’ll see if that is helpful in identifying the root cause. These first and second scatter charts also explain the labeling of the clusters.
[Image: GapVsUsage.png]
One cluster, in cyan, consists solely of the two types of Sense monitor issues. The reddish-orange cluster contains the mismatches that have, by far, the most problem neighbors. And the olive and lavender clusters have fewer or no neighbors, and are distinguished by whether they sit in the high or low end of the usage curve.
Now let’s apply the clusters to more familiar scatter charts.
We saw this one earlier in black and white and wondered about the positive Sense Solar Production and negative Total Usage points. It turns out those indeed were all monitor-related data errors, along with some others.
[Image: SolarVsUsage.png]
Here’s the same type of scatter chart we used to initially look at correlation, except now I’m using it to look at only the mismatches. Two things stand out to me from this chart:
- I made my dots way too small to really reflect the true size of the mismatches.
- The ‘non-gap’ mismatches are heavily skewed toward the lower end of usage. Probably worth looking at a distribution.
[Image: PGEvsSenseNet.png]
And speaking of distributions, here’s a view of the “density of mismatches” by size.
- Very few large ones - mismatches rapidly fall off.
- Virtually all the really large ones are attributable to monitor issues.
- Probably worthwhile pushing into the next set by size, which are mostly ones with few neighbors but high Total Usage.
[Image: RankofMismatches.png]
Here’s a chart to get everyone thinking. I went back to my graph of the number of problem neighbors (mismatches in the surrounding 24 hours) vs. date for the most mismatch-prone period of time, and overlaid both the cluster coloring and my best guess at when various firmware upgrades happened. It does look like firmware upgrades may have caused 10 or so of the data gaps in 3-4 different time periods. Other than that, I’m looking for comments on what I should try next.
[Image: MismatchesVsFirmware.png]
One more plot, looking closely at what mismatches really look like: hourly Sense and PG&E data overlaid with markers for the different types of mismatches, for some of the most mismatch-prone days in June. What’s fascinating is that the accuracy is typically so good that you can’t even see the Sense data line in pink - it’s completely covered by the PG&E data, except in the mismatch zones. Blue dots indicate Sense monitor data gaps, while the rest of the markers indicate the cluster categorizing that mismatch.
[Image: Rplot01.png]
BTW - these monitor issues aren’t all on Sense. I’ve been tinkering with my network for various reasons and have created a few outages on my own. If my Fing network monitor had a deeper event log, I could have overlaid my network tinkering events as well.
Charting “Cleaned” Data
Now that I’ve been able to identify some of the real measurement errors, I’m going to strip those hours out and take a look at the earlier plots again, with the known problematic data removed.
Here are the hourly and daily scatter plots, looking much better…
[Image: HourlyUsage.png]
[Image: DailyUsage.png]
And the same histograms of the difference between Sense and PG&E are predictably much tighter …
[Image: 50WBin.png]
[Image: 10WBin.png]
Plots between my solar data sources also look better - no visually dramatic outliers.
[Image: DailySolar.png]
And a tighter histogram spread:
[Image: SolarHist.png]
If I do a linear fit on both usage and solar, I get adjusted R²s that are very close to 1, meaning a very close linear fit.
The hourly usage comparison has a slope indicating that Sense is slightly more than 1% optimistic vs. my revenue meter. From this perspective it looks like there are still significant mismatches in the center of the line, and a curve in the tail of the negative zone, that bear investigating.
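The slope and adjusted R² computation can be sketched as below with numpy (the original analysis used R’s linear modeling). For a single-predictor fit the adjustment uses n - 2 in the denominator.

```python
# Linear fit of y on x, returning the slope and the adjusted R^2.
import numpy as np

def fit_stats(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    ss_res = (resid ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1 - ss_res / ss_tot
    n, p = len(x), 1  # one predictor
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return slope, adj_r2
```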
[Image: UsageFit.png]
The daily solar comparison has a slope that indicates my SolarEdge inverter is slightly more than 3% optimistic vs. my Sense data. Probably worth taking a closer look at SolarCity vs. Sense vs. PG&E during solar production. Unfortunately, there’s not a direct way to compare all three, since I only have net readings from my PG&E meter and SolarCity only offers solar data.
[Image: FitSolar.png]