So You Just Installed Your Sense and You Want to See If It's Really Working

Detailed Error Analysis

So now comes the detective work. There are really two questions here:

  1. How big or small a difference really counts as an “error”? When do we stop, given that there will always be some level of difference when we are using two different measuring devices?

  2. How do we track down the causes of the differences we deem errors?

Instead of answering these questions directly, I’m going to take a cue from how physics treats measurement errors.

  • Random errors - errors that come from random factors during measurement. These errors will generally follow a normal bell-curve distribution and cancel each other out over many measurements. We have to accept these errors; the best we can do is characterize them statistically and put a bound on them.

  • Systematic errors - measurement errors caused by a consistent difference in the measuring system, by outside events (power outages, networking issues), or by something that wears or changes over time or with temperature. These errors show up as a bias or consistent offset that correlates with the outside conditions. Here we can trace the difference back to its cause and fix it, or at least compensate for it.
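To make the distinction concrete, here's a minimal sketch (in Python rather than R, purely for illustration, with made-up hourly readings): a constant systematic bias shows up as a nonzero mean difference, while random error shows up as the spread around that mean.

```python
import random
import statistics

random.seed(42)

# Hypothetical hourly kWh readings: Sense sees the same signal as the
# utility meter, plus random noise and a small constant (systematic) bias.
utility = [1.0 + 0.5 * random.random() for _ in range(500)]
bias = 0.02          # systematic error: a constant offset
noise_sd = 0.01      # random error: zero-mean Gaussian noise
sense = [u + bias + random.gauss(0, noise_sd) for u in utility]

diffs = [s - u for s, u in zip(sense, utility)]
mean_diff = statistics.mean(diffs)   # estimates the systematic bias
sd_diff = statistics.stdev(diffs)    # bounds the random error

print(f"mean difference (systematic) ~ {mean_diff:.3f} kWh")
print(f"std deviation  (random)      ~ {sd_diff:.3f} kWh")
```

Averaging over many hours recovers the bias even though any single hour's difference is dominated by noise.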

Our Analysis Tools

Key tools in our holster are below. You can do all these things in Excel, but I’m going to do the next bit in R because it’s easier and offers more flexibility.

  • Unity Plot - This is a great way to visualize the accuracy of your Sense. It’s basically just an x-y plot of your utility usage for a given hour/day vs. the Sense usage for the same interval. If your Sense is working right, the points should fall on a 45 degree line with very little divergence. Any point not on the 45 degree line needs to be explored to understand the source of the error.

  • Linear regression / linear model - a mathematical fitting of the two values for the same time interval. Linear regression produces the line that best fits all the points, including any bad data points, which can skew the line a little. A linear regression yields the slope of the line (hopefully close to 1.0), the y-intercept, plus one or two R^2 values that indicate the goodness of fit, where 1.0 is a perfect line with no outliers. You should see an R^2 value of 0.95+ if Sense is doing its job right. The nice thing about most linear regression tools vs. a unity plot is that regression also produces a residual/error measurement for each point, which represents that point’s deviation from the fitted line. Residuals give us the handle we need to find the biggest outlying data pairs that warrant investigation.

  • Error / Residual histograms - a difference histogram gives a better view of how much the difference between your utility and Sense readings varies, or the difference between your fitted line and the Sense data. You can see whether the difference is centered around zero or offset, which would indicate that one of the two measuring devices consistently reads higher than the other. You’ll also be able to see whether you have a typical, balanced normal distribution or a skewed one. In one interesting case, @Dcdyer saw a split distribution that highlighted a change in his CT setup over time that made it more accurate. A histogram will also reveal the extent of outliers, and let you adjust the analysis accordingly.

  • Difference timeline - If you want to look for changes in accuracy over time, it’s useful to chart the difference between the two measurements over time. @Dcdyer did a good job of it here and spotted the same shift he saw in his histogram. If you have a long time history (more than two weeks), I would recommend charting daily differences rather than hourly ones, for a variety of reasons.

  • Statistical difference analysis - If you really want to get to the bottom of the causes of errors / discrepancies between your utility and Sense, you’re going to need to investigate various relationships and look for patterns or links. And you’ll likely need to peel the onion: when you find one error source that affects certain data points, you’ll need to flag those errors, then fix or remove the affected points to find other errors that might have been masked by the first set. So we’ll be looking at changes in error distributions over time, and even with temperature (had I been measuring temperature in my service closet).
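The unity plot and regression bullets above boil down to a few lines of code. Here's a sketch in Python rather than R (the math is the same), using made-up hourly kWh pairs, that fits the line, computes R^2, and pulls out the residuals that point at the worst outliers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hourly kWh pairs: utility meter vs. Sense, nearly on a
# 45 degree line with a little noise.
utility = rng.uniform(0.5, 3.0, size=200)
sense = 0.99 * utility + 0.01 + rng.normal(0, 0.02, size=200)

# Least-squares fit: sense ~ slope * utility + intercept.
slope, intercept = np.polyfit(utility, sense, 1)

# R^2: goodness of fit (1.0 = a perfect line).
predicted = slope * utility + intercept
ss_res = np.sum((sense - predicted) ** 2)
ss_tot = np.sum((sense - sense.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Residuals identify the biggest outlying pairs worth investigating.
residuals = sense - predicted
worst = np.argsort(np.abs(residuals))[-5:]  # 5 largest deviations

print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.4f}")
```

The unity plot itself is just these same `utility` and `sense` arrays scattered against each other with a reference 45 degree line drawn through.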
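A difference histogram needs nothing fancier than binning. A quick sketch (Python, synthetic hourly differences with a few injected outliers), printed as a text histogram so the centering and the stray bars are visible:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical hourly differences (Sense - utility) in kWh:
# mostly zero-mean noise, plus a few injected outliers.
diffs = rng.normal(0.0, 0.02, size=300)
diffs = np.append(diffs, [0.25, -0.30, 0.40])  # outliers worth investigating

counts, edges = np.histogram(diffs, bins=20)

# Centered near zero means neither device consistently reads higher;
# lonely bars far from the center are the outliers.
for count, left, right in zip(counts, edges[:-1], edges[1:]):
    print(f"{left:+.2f} to {right:+.2f} | {'#' * int(count)}")
```

An offset center would suggest a systematic bias; a lopsided shape would suggest skew worth digging into.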
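Rolling hourly differences up to daily totals for the difference timeline is a one-liner grouping. A sketch with fabricated timestamps and a small constant bias baked in (Python again, standing in for R):

```python
from collections import defaultdict
from datetime import datetime, timedelta
import random

random.seed(7)

# Hypothetical hourly (timestamp, utility_kwh, sense_kwh) records, 30 days.
start = datetime(2020, 1, 1)
hours = [start + timedelta(hours=h) for h in range(24 * 30)]
records = [(t, 1.0, 1.0 + random.gauss(0.01, 0.02)) for t in hours]

# Roll hourly differences up to daily totals; daily charts smooth out
# hour-boundary mismatches between the two meters.
daily = defaultdict(float)
for t, utility_kwh, sense_kwh in records:
    daily[t.date()] += sense_kwh - utility_kwh

for day in sorted(daily)[:3]:  # first few days of the timeline
    print(day, f"{daily[day]:+.2f} kWh")
```

Charting `daily` against the date is then the timeline; a step change like the one @Dcdyer spotted shows up as a visible shift in the level.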
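The "peel the onion" step can be sketched as flag-then-refit: fit everything, flag points with residuals far outside the typical spread, remove them, and fit again to expose whatever the big errors were masking. A toy Python version, where a few corrupted hours (e.g., a dropout zeroing out Sense readings, purely hypothetical) skew the first fit:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical hourly pairs with a handful of corrupted points
# (say, a network dropout zeroing out some Sense hours).
utility = rng.uniform(0.5, 3.0, size=200)
sense = utility + rng.normal(0, 0.02, size=200)
sense[:5] = 0.0  # the "first set of errors"

def fit(x, y):
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept, y - (slope * x + intercept)

# Pass 1: fit everything, then flag residuals far outside the typical
# spread (median absolute deviation is robust to the outliers themselves).
slope1, _, resid = fit(utility, sense)
mad = np.median(np.abs(resid - np.median(resid)))
keep = np.abs(resid) < 10 * mad

# Pass 2: refit on the cleaned data; smaller errors are no longer masked.
slope2, _, resid2 = fit(utility[keep], sense[keep])

print(f"slope before: {slope1:.3f}  after removing flagged points: {slope2:.3f}")
```

In a real analysis you would investigate each flagged point (power outage? network gap? meter read boundary?) before deciding whether to fix or drop it.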
