I’d like to see your detailed analysis but cannot.
I’m doing a quick cut and paste of 4 separate posts I wrote in the more limited-access Data Analysis category. @RyanAtSense can give you access to that category if you are interested in the kind of data analysis folks are doing with Sense.
The first thing many people ask when they first install their Sense is “How accurate is Sense vs. my utility monitoring?” We can now begin to answer that, at least in the case where one’s utility allows them to download usage data. I’m going to walk through my accuracy analysis process for all the existing data I have for 2018.
In my case, I have solar, so I have two utility data sources:
Once the data was in, I did a little processing in R,
An example of wide Sense data below - Note all the NAs (not available) for hours where there was no energy value available in the Sense export for those devices.
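For anyone wanting to reproduce the reshaping step, here is a minimal Python sketch (the processing described above was done in R, and this is not the original script). It assumes the export has already been read into long-format rows of (hour, device name, kWh); the row layout and device names are made up for illustration. Hours with no value for a device come out as `None`, the equivalent of the NAs mentioned above.

```python
from collections import defaultdict

# Hypothetical long-format rows: (hour, device name, energy in kWh).
long_rows = [
    ("2018-01-02 20:00", "Total Usage", 1.20),
    ("2018-01-02 20:00", "Dryer", 0.65),
    ("2018-01-02 21:00", "Total Usage", 0.80),
    ("2018-01-02 21:00", "Solar Production", 0.00),
]

def to_wide(rows):
    """Pivot long rows into one dict per hour with a key per device."""
    devices = sorted({name for _, name, _ in rows})
    by_hour = defaultdict(dict)
    for hour, name, kwh in rows:
        by_hour[hour][name] = kwh
    # .get() returns None when a device has no value for that hour (an NA).
    return [
        {"hour": hour, **{d: by_hour[hour].get(d) for d in devices}}
        for hour in sorted(by_hour)
    ]

wide = to_wide(long_rows)
for row in wide:
    print(row)
```

Keeping the gaps as explicit missing values, rather than zeros, matters later on: a zero would look like a real reading when comparing against the utility data.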
Here are the initial results for Sense net usage vs. PG&E net usage in a scatter plot. A high percentage of the data correlates quite well. The slope of the line is almost exactly 1 at both the hourly and daily level. But there are a significant number of outlier data points, especially at the hourly level, where the Sense power values are well below the PG&E values. And based on color, the furthest outliers occur in March and June. There also seems to be a peculiar upward-curving tail at the negative end of the hourly curve. I’ll investigate both of those in depth a little later.
One more look at the accuracy - If I look at a histogram of my net PG&E usage minus Sense net-usage difference, two things become apparent.
Hourly difference is less than 500W
Hourly difference is less than 200W
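A quick way to put a number on what the histogram shows is to compute the share of hours whose difference falls inside a given band. The sketch below is in Python with made-up differences, not the real 2018 data. The thresholds above are labelled in W, but since the exports are hourly energy the sketch treats them as Wh; that is an assumption.

```python
def share_within(diffs_wh, limit_wh):
    """Fraction of hourly differences whose magnitude is below limit_wh."""
    if not diffs_wh:
        raise ValueError("no differences supplied")
    return sum(1 for d in diffs_wh if abs(d) < limit_wh) / len(diffs_wh)

# Made-up hourly differences (PG&E net minus Sense net), in Wh.
diffs_wh = [10, -50, 150, -300, 700, 20, -180, 90]

within_500 = share_within(diffs_wh, 500)
within_200 = share_within(diffs_wh, 200)
print(f"within 500: {within_500:.1%}, within 200: {within_200:.1%}")
```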
Comparing Solar Generation
The next thing to do is to compare Sense against the other data source, our hourly feed from SolarCity, showing production from our 4.2kW solar system that uses a 2013 vintage SolarEdge inverter. All the data is going to be negative (energy consumption is positive, production is negative), so this might also give a little more insight into the negative tail of the net usage correlation scatter plots earlier.
My first scatter plot of the hourly solar data, with coloring set to the month, was a little confusing. The “line” was more of an ellipse. And the coloring really didn’t offer any clues as to why.
But shifting to coloring based on time of day showed a discernible pattern. Sense gave higher readings than SolarCity in the morning, SolarCity gave higher readings in the afternoon - a systemic error. But why?
Looking at the aggregated daily data, the ellipse goes away, indicating that the error cancels itself out on a daily basis. Note that I have gone back to coloring by month for the daily plot.
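To illustrate why the ellipse disappears at the daily level, here is a small Python sketch with invented numbers: one source reads high in the morning and low in the afternoon, yet both sum to the same daily total. The timestamp format and values are assumptions, not the actual exports.

```python
from collections import defaultdict

def daily_totals(hourly):
    """Sum (timestamp, kWh) pairs by date; timestamps look like 'YYYY-MM-DD HH:MM'."""
    totals = defaultdict(float)
    for timestamp, kwh in hourly:
        totals[timestamp[:10]] += kwh
    return dict(totals)

# Invented solar readings (production is negative, as in the post).
sense = [("2018-06-10 09:00", -1.25), ("2018-06-10 15:00", -0.75)]
solarcity = [("2018-06-10 09:00", -1.00), ("2018-06-10 15:00", -1.00)]

sense_daily = daily_totals(sense)
solarcity_daily = daily_totals(solarcity)
print(sense_daily, solarcity_daily)
```

The hourly pairs disagree by 0.25 kWh in opposite directions, which is what opens up the ellipse, while the daily totals match.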
A few things we can see from the daily plot.
A quick look at the hourly differences in a histogram shows a different view of the ellipse: a typical distribution the width of the ellipse, with a small number of outliers worth investigating.
The daily histogram tells us that we really only need to look closely at a small number of mismatches. The histogram once again highlights the systemic optimism of the SolarCity data - the mode of the distribution is offset by about 800W from 0.
Now we have looked at direct comparisons between Sense and both “utility” sources. Next, it is time to do some detailed mismatch analysis between Sense and PG&E data.
Pushing into the Biggest Errors
So now it’s time to figure out the origins of the 217 biggest hourly mismatches between Sense net energy (I’m correcting this from previous entries - I’m really comparing energy, not power) and my PG&E hourly net energy, for the 5,421 exported hours. But how does one dig through that much data to analyze and categorize root causes? Fortunately there are a couple of easy clues about what to look at more carefully:
It’s a simple matter to add a calculation to sum the number of mismatches in the surrounding 24 hours. We’ll call that value the number of problem neighbors (ProbNeigh) and plot the results over time.
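The neighbor count can be sketched in a few lines of Python. This assumes “surrounding 24 hours” means 12 hours on either side of the mismatch and that the mismatch itself is not counted; both are my reading of the description, and the timestamps are made up.

```python
from datetime import datetime, timedelta

def problem_neighbors(mismatch_hours, half_window=timedelta(hours=12)):
    """Map each mismatch hour to the count of other mismatches within the window."""
    return {
        t: sum(1 for other in mismatch_hours
               if other != t and abs(other - t) <= half_window)
        for t in mismatch_hours
    }

# Made-up mismatch hours: a cluster of three and one loner.
mismatch_hours = [
    datetime(2018, 6, 7, 1), datetime(2018, 6, 7, 2), datetime(2018, 6, 7, 3),
    datetime(2018, 6, 21, 14),
]
prob_neigh = problem_neighbors(mismatch_hours)
for hour, count in prob_neigh.items():
    print(hour, count)
```

The pairwise comparison is quadratic, which is fine for a couple of hundred mismatches.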
The time locality of many of the mismatches totally jumps out from this chart, though there also appears to be a random dispersion of other mismatches over time. Let’s push more closely into a time region with plenty of grouped issues.
Clearly we’re going to need to look into what happened on April 29th, June 7th, June 10th and June 21st.
But before we look too closely, we should also look for patterns between size of mismatches, Sense Usage and Sense Solar. Here’s that plot:
It should be immediately apparent that there are some problematic Sense measurements here. Sense Solar data should always be zero or negative (or some very small positive value). Sense Total Usage should always be positive. We have 2 quadrants of measurements that violate those rules, with the ones in the negative Sense Total Usage and positive Sense Solar quadrant containing the mismatches of the largest magnitude. What’s happening here?
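Those two sign rules are easy to check mechanically. A minimal Python sketch follows, with invented rows; the 0.01 kWh tolerance for “very small positive” solar is an arbitrary assumption, not a figure from Sense.

```python
def sign_violations(rows, solar_tolerance_kwh=0.01):
    """Return (hour, reasons) for rows that break the expected signs."""
    flagged = []
    for hour, usage_kwh, solar_kwh in rows:
        reasons = []
        if usage_kwh < 0:
            reasons.append("negative total usage")
        if solar_kwh > solar_tolerance_kwh:
            reasons.append("positive solar")
        if reasons:
            flagged.append((hour, reasons))
    return flagged

# Invented rows: (hour, Sense Total Usage kWh, Sense Solar kWh).
rows = [
    ("2018-06-07 13:00", 1.10, -2.40),   # normal
    ("2018-06-07 14:00", -1.80, 2.10),   # both rules broken
    ("2018-06-07 22:00", 0.90, 0.004),   # tiny positive solar, tolerated
]
flagged = sign_violations(rows)
print(flagged)
```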
Now it’s time for the painstakingly manual part of the process: looking at the detailed waveforms for each of the mismatches to see what’s actually amiss, starting with the obvious Sense measurement errors I identified earlier.
The good news is that the Sense web app makes this much easier. Once you are logged in, the Sense web app lets you call up a time specific window in the Power Meter via a start and end parameter in the URL. Using this feature, I was able to quickly generate URLs for every 2 hour window surrounding all 217 mismatches. Here’s an example URL that displays the 2 hours surrounding 9pm on the 2nd of Jan in the Power Meter:
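Generating the windows in bulk could look something like the Python sketch below. The base address, the parameter names and the ISO timestamp format are all placeholders for illustration; they are NOT the real Sense web app URL format, so substitute whatever your own browser shows.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

# Placeholder only: NOT the real Sense web app address or parameter format.
BASE_URL = "https://example.invalid/power-meter"

def window_url(mismatch_hour, base_url=BASE_URL):
    """Build a URL covering the hour before and the hour after mismatch_hour."""
    start = mismatch_hour - timedelta(hours=1)
    end = mismatch_hour + timedelta(hours=1)
    query = urlencode({"start": start.isoformat(), "end": end.isoformat()})
    return f"{base_url}?{query}"

# The 2 hours surrounding 9pm on the 2nd of Jan.
url = window_url(datetime(2018, 1, 2, 21, 0))
print(url)
```

Mapping the function over the list of mismatch hours gives one link per mismatch to click through.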
So now comes the fun part - actually seeing what mismatches look like. For my Sense environment, I discovered two flavors of measurement errors. The first was the negative Sense Total Usage situation, shown below.
The start of a negative Total Usage hour:
The end of a negative Total Usage hour after a reboot:
For three multi-hour periods on March 7th, April 28-29th and June 7th, my Sense monitor went totally wonky and produced these kinds of waveforms and data, eventually correcting itself or being fixed with a reboot. As the earlier graph suggested, this Sense monitor issue was the root cause of the 5 largest errors, as well as a bunch more smaller ones. The good news is that Sense has updated the monitor firmware to alleviate this bug.
The second flavor of Sense measurement errors first made itself known as I was investigating the cluster of mismatches detected on June 10th. More pedestrian than the negative Sense Total Usage phenomenon, this mismatch stems from simple loss of data, either due to monitor down time or a networking problem extensive enough to prevent full backfill of the data from the Sense monitor buffer.
It turns out that the next batch of mismatches by magnitude were caused by this kind of data gap occurring late at night when our EVs were charging, like the situation below.
Eventually, after looking carefully at the first 20 or so waveforms, I decided to just muscle my way through all 217 mismatches, categorizing the 2 monitor measurement issues along the way. In the process, I discovered a third flavor of mismatch waveforms - one that looked completely normal. The waveform below is associated with the largest gap that can’t be directly traced to an obvious Sense measurement problem or data gap.
So where did the mismatch come from? We’ll have to delve a little deeper and more analytically on these, plus look at the PG&E side of the equation as well.
BTW - Quick statistics on the 217 largest hourly mismatches after viewing them all: 36 stemmed from negative Total Usage, 56 tied back to data gaps from the Sense monitor, and 125 look, for all intents and purposes, completely normal on the Sense side.
Analyzing the Error Types
Whew! Now that I’m done labeling all the Sense measurement errors and gaps that result in mismatches, I can do some fun stuff. I’m going to try to categorize the 217 mismatches by all the known variables, including the number of problem neighbors and whether they are Sense data gap issues.
I decided to use a simple K-Means clustering with 4-6 possible clusters, and tried to adjust the weighting to do two things:
Without going into all the details and weighting experiments that I did, here are some of the results given my strategy.
My clustering cleanly separates mismatches due to Sense monitor problems vs. other errors, and also stratifies by the number of problem neighbors. Note that most Sense monitor errors occurred standalone, or nearly standalone, with no other nearby mismatches.
The clustering approach also stratifies the ‘non-gap’ mismatches by the magnitude of Sense Usage. We’ll see if that is helpful in identifying the root cause. These first and second scatter charts also explain the labeling of the clusters.
One cluster, in cyan, consists solely of both types of Sense monitor issues. The reddish orange are the mismatches that have, by far, the most problematic neighbors. And the olive and lavender have fewer or no neighbors, but are distinguished by whether they are in the high or low end of the usage curve.
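For readers unfamiliar with the idea of weighting features before K-Means, here is a toy Python sketch. It is not the clustering run described above: the features, weights, data and the choice of 3 clusters (rather than 4-6) are all invented, and the deterministic starting centroids are a simplification. A real analysis would normally use a library implementation with several random starts.

```python
import math

def weighted_kmeans(points, weights, k, iterations=50):
    """Plain Lloyd's algorithm on features multiplied by per-feature weights."""
    scaled = [[w * x for w, x in zip(weights, p)] for p in points]
    # Simplification: take evenly spaced input rows as the starting centroids.
    centroids = [list(scaled[i * len(scaled) // k]) for i in range(k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in scaled
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(scaled, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Invented features: (is monitor issue 0/1, problem neighbors, Sense usage kWh).
points = [
    (1, 0, 0.5), (1, 1, 0.8),    # monitor issues
    (0, 9, 1.0), (0, 10, 1.2),   # many problem neighbors
    (0, 0, 0.6), (0, 1, 3.0),    # few neighbors
]
# A heavy weight on the first feature keeps monitor issues in their own cluster.
labels = weighted_kmeans(points, weights=(20, 1, 1), k=3)
print(labels)
```

Multiplying a feature by a larger weight stretches that axis, so differences along it dominate the distance calculation. That is the lever for forcing a clean split on one variable while still stratifying on the others.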
Now let’s apply the clusters to more familiar scatter charts.
We saw this one earlier in black and white and wondered about the positive Sense Solar Production and negative Total Usage. Turns out those indeed were all monitor-related data gap errors, as well as others.
Here’s the same type of scatter chart we used to initially look at correlation, except now I’m using it to look at only the mismatches. Two things stand out to me from this chart:
And speaking of distributions, here’s a view of the “density of mismatches” by size.
Here’s a chart to get everyone thinking. I went back to my graph of the number of problem neighbors (mismatches in the surrounding 24 hours) vs. date for the most mismatch-prone period of time, and overlaid both the cluster coloring plus my best guess at when various firmware upgrades happened. It does look like firmware upgrades may have caused 10 or so of the data gaps in 3-4 different time periods. Other than that, I’m looking for comments on what I should try next.
One more plot looking closely at what mismatches really look like: hourly Sense and PG&E data overlaid with markers for the different types of mismatches for some of the most mismatch-prone days in June. What’s fascinating is that the accuracy is typically so good that you can’t even see the Sense data line in pink - it’s completely covered by the PG&E data, except in the mismatch zones. Blue dots indicate Sense monitor data gaps, while the rest of the markers indicate the type of cluster categorizing that mismatch.
BTW - these monitor issues aren’t all on Sense. I’ve been tinkering with my network for various reasons and have created a few outages on my own. If my Fing network monitor had a deeper event log, I could have overlaid my network tinkering events as well.
Charting “Cleaned” Data
Now that I’ve been able to identify some of the real measurement errors, I’m going to strip those hours off and take a look at the earlier plots again, with the known problematic data removed.
Here are the hourly and daily scatter plots, looking much better…
And the same histograms of the difference between Sense and PG&E are predictably much tighter …
Plots between my solar data sources also look better - no visually dramatic outliers.
And a tighter histogram spread:
If I do a linear fit on both usage and solar, I get adjusted R² values that are very close to 1, meaning a very close linear fit.
The hourly usage comparison has a slope that indicates that Sense is slightly more than 1% optimistic vs. my revenue meter. From this perspective it looks like there are still significant mismatches in the center of the line and a curve in the tail in the negative zone that bear investigating.
The daily solar comparison has a slope that indicates my SolarEdge inverter is slightly more than 3% optimistic vs. my Sense data. Probably worth taking a closer look at SolarCity vs. Sense vs. PG&E during solar production. Unfortunately, there’s not a direct way to compare all three, since I only have net readings from my PG&E meter and SolarCity only offers solar data.
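For reference, the slope and adjusted R² from a one-variable least-squares fit can be computed as in the Python sketch below. The numbers are invented, and putting PG&E on the x axis with Sense on the y axis is an assumption about how the fit was set up.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x, with R² and adjusted R²."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    # One predictor, so the adjustment divides by n - 2.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return slope, intercept, r2, adj_r2

# Invented hourly net energy in kWh.
pge = [-1.0, 0.0, 1.0, 2.0, 3.0]
sense = [-1.02, 0.01, 1.0, 2.03, 3.03]

slope, intercept, r2, adj_r2 = linear_fit(pge, sense)
print(f"slope={slope:.3f} intercept={intercept:.3f} adj R2={adj_r2:.5f}")
```

A slope slightly above 1 with a near-zero intercept is what “about 1% optimistic” looks like in this kind of fit.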
WOW, thanks @kevin1 that’s some level of detail.
Good luck in sorting out the discrepancies. Please share when you figure them out.
Never thought of an angled CT…? I wonder how much that changes things?
Anyone try adjusting them after the initial install?
I plotted the Power company readings vs. SENSE readings for 9 months - Daily.
I plotted the ‘kWh difference’ values as a scatter point chart and looked for any trends. Two distinct baselines were observed. (It is hard to see the two dashed lines that show the average baselines on this chart at -0.137 and +0.011)
I calculated the average for each baseline, then I calculated the ‘offset’ from the baseline: ‘offset’ = kWh difference - baseline average. I plotted those values in a histogram, then I summed all the outliers (values that were not close to the bell curve).
On April 23rd, 2018 the offset baseline shifted by 0.148 kWh, and it has stayed consistently at the new baseline.
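The baseline and offset arithmetic can be sketched in Python as follows. The daily values are invented, and splitting the series at a known date is an assumption about how the two baselines were separated.

```python
from datetime import date
from statistics import mean

def offsets_from_baselines(daily_diff, split):
    """Average the kWh differences before and from `split`, then subtract."""
    before = [v for d, v in daily_diff if d < split]
    after = [v for d, v in daily_diff if d >= split]
    base_before, base_after = mean(before), mean(after)
    offsets = [
        (d, v - (base_before if d < split else base_after))
        for d, v in daily_diff
    ]
    return base_before, base_after, offsets

# Invented daily 'kWh difference' values around the shift date.
daily_diff = [
    (date(2018, 4, 20), -0.14), (date(2018, 4, 21), -0.13),
    (date(2018, 4, 22), -0.15), (date(2018, 4, 23), 0.01),
    (date(2018, 4, 24), 0.02), (date(2018, 4, 25), 0.00),
]
base_before, base_after, offsets = offsets_from_baselines(
    daily_diff, split=date(2018, 4, 23)
)
print(f"baselines: {base_before:.3f} and {base_after:.3f}")
print(f"shift: {base_after - base_before:.3f} kWh")
```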
I posted this same information with more graphs earlier in:
Thanks @kevin1! If anyone here wants to be added to the data analysis subforum, just shoot me a PM!
One additional thought that might make finding discrepancies easier:
Thanks to a question from @duanetiemann, I discovered that downward “spikes” in my Always On graph usually correspond to data dropouts from the Sense probe. The Always On calculation for downloads behaved like an anomaly detector (I can explain more if you want more info). For example, my hourly Always On for the month of Nov. is charted below.
The circled downward “spike” corresponds to the data gap in my Power Meter below:
This is a second home and the utility provides a monthly bill with no day to day breakdown. I’ll try and check a few things out when I’m next there in a couple of weeks.
Thanks for the help.
If that’s the case, then looking for spikes in your “Always On” hourly data downloaded from Sense is probably the best way to find places where Sense might have shortchanged the final monthly total. Find any negative going spikes in your hourly Always On, then look to see if there are data gaps in your power meter at the same DateTime.
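One rough way to find those negative-going spikes in a download is sketched below in Python. It flags hours that fall well below the median of the series; the 20% drop threshold and the sample values are assumptions, and a series whose Always On level changes during the month would need a rolling baseline instead.

```python
from statistics import median

def always_on_dips(hourly, drop_fraction=0.2):
    """Return timestamps whose value is more than drop_fraction below the median."""
    baseline = median(value for _, value in hourly)
    return [ts for ts, value in hourly if value < baseline * (1 - drop_fraction)]

# Invented hourly Always On values in kWh.
hourly = [
    ("2018-11-12 00:00", 0.30),
    ("2018-11-12 01:00", 0.31),
    ("2018-11-12 02:00", 0.12),   # candidate data gap
    ("2018-11-12 03:00", 0.30),
    ("2018-11-12 04:00", 0.29),
]
dips = always_on_dips(hourly)
print(dips)
```

Each flagged timestamp is then a place to open the Power Meter and look for a gap.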
I contacted support and created a ticket (133718) to look into this. The usage on my bill is typically more than 50% higher than that reported by Sense.
Looks like two phase (US style).
The left is the neutral (white lines).
120/240 60Hz where the clamps are.
Not like 240V (single phase) and 415V (3 phase) 50 Hz in AU/NZ.
But what’s that additional red and black “main” and where does it go?
I have another panel in the golf cart garage.
Sense support is telling me that I would need another monitor and another account to see the additional usage for my house. What doesn’t make sense is that my big consumers are 3 X AC units and the Pool Equipment. All of these and everything else in the house shows bubbles in Sense. They’re not all identified by Sense but they all show. Why is my usage so different from my utility bill???
This graphic shows what sense is recording vs my bill. The bill shows usage way higher than sense. I could understand this if one of the AC units or the pool equipment wasn’t showing but that’s not the case…
What’s wrong here?
Could you please share more pictures of all of your electrical panels (including the ‘Power Company’ meter)? Your main panel configuration allows two separate taps. Could you show a picture that includes the top of the previous panel so we can see where the wires are entering?
You are monitoring only the single panel and not the sub-panel in the golf cart garage that you mention. There may be additional circuits on your sub-panel that are consuming power. To get a complete reading of your whole house you would need to move the CT clamps above the main breaker cut-off (or do as SENSE support suggests and purchase a second orange monitor to register the power going to your sub-panel). You have what is termed in the US a ‘split phase’ installation. It is not 3-phase.
This is all I have of the main panel.
Zach has reached out to me and I’ll be trying a few things this weekend.
Can you provide a picture of the main breaker panel with the cover plate removed? We need to see how the wires enter from the meter base and how they are connected to your “main” service breaker.
Will do on Saturday.
I think I’ve probably found the issue. The 2nd panel has all the FAUs for the AC systems.
So, although all the AC consumption shows in the app, when fan mode is manually turned on the app shows nothing, so it is being missed. I’m surprised that the consumption for these is so high. I was already seeing large consumption for the units themselves, so I made the assumption that each unit was being accurately tracked. Silly me.
This weekend I’ll see if I can position the clamps before the split to the second panel and hope that’ll resolve things.