How Accurate is Sense vs. Utility Metering?

OK - I decided to evaluate my old historic Sense vs. PG&E data one last time, using a slightly different methodology. My biggest issue was figuring out how to remove the bad data, which primarily resulted from Sense data gaps. This time around, my goal is to remove bad data using a linear model (regression), via a purely statistical approach. I'm going to (a rough code sketch follows the list):

  • Compare my Sense hourly net data (consumption - solar) against my utility's hourly net usage data
  • Use the data to create a linear model via regression
  • Remove the 50 data points with the largest error residuals
  • Use the improved data to create a new linear model via regression
  • Remove the next 50 data points with the largest residuals under the new model
  • Use the improved data to create another linear model via regression
  • Remove the next 50 data points with the largest residuals under that model

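Here's a rough sketch of that procedure in Python (pandas + scikit-learn), assuming the two hourly series have already been lined up in one table. The file and column names ("sense_vs_pge_hourly.csv", "sense_net", "pge_net") are just placeholders, not my actual export format:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hourly net usage from both sources, indexed by hour.
# File and column names are placeholders for your own exports.
df = pd.read_csv("sense_vs_pge_hourly.csv", parse_dates=["hour"], index_col="hour")

def fit_and_trim(data, n_remove=50):
    """Fit pge_net ~ sense_net, then drop the n_remove points
    with the largest absolute residuals."""
    X = data[["sense_net"]].values
    y = data["pge_net"].values
    model = LinearRegression().fit(X, y)
    residuals = pd.Series(y - model.predict(X), index=data.index)
    worst = residuals.abs().nlargest(n_remove).index
    return model, data.drop(worst), residuals

trimmed = df
for rnd in range(3):  # three rounds of 50 removals = 150 points total
    model, trimmed, residuals = fit_and_trim(trimmed, n_remove=50)
    print(f"Round {rnd + 1}: slope={model.coef_[0]:.4f}, "
          f"intercept={model.intercept_:.4f}, points left={len(trimmed)}")
```

Each pass refits the regression on the surviving points, so the later removals use the improved model rather than the original one.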
Doing that gives a nice graph like this showing the removal of the worst points…

If I look for the causes of the 150 biggest residuals, I can compare against some data I had set aside about Sense and SolarCity connectivity, and see that most of the very largest residuals stem from data dropouts from one of those two sources.

[Chart: SenseVsPGE - Sense vs. PG&E hourly net usage with the worst points removed]

Out of almost 6100 hourly datapoints (254 days), I removed 150, all of which had identifiable error causes:

  • 61 Data dropouts from Sense
  • 63 Cases where the Sense Solar data went crazy - started matching my Total Usage
  • 26 Cases where SolarCity data dropped for some reason.
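The actual classification was manual cross-referencing, but in code terms it boils down to something like this hypothetical sketch, where sense_gaps and solarcity_gaps are assumed to be sets of hourly timestamps with known connectivity outages, and the column names are again placeholders:

```python
# Hypothetical sketch: tag each removed hour with its likely cause.
# sense_gaps / solarcity_gaps: sets of hourly timestamps with known outages.
def classify_removed_hour(ts, row, sense_gaps, solarcity_gaps):
    if ts in sense_gaps:
        return "sense_dropout"
    # Solar reading suddenly tracking total usage instead of production
    if abs(row["sense_solar"] - row["sense_total_usage"]) < 0.01:
        return "solar_matches_total"
    if ts in solarcity_gaps:
        return "solarcity_dropout"
    return "unknown"
```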

I can also see how removing the worst points changes the distribution of the residuals, from this original distribution:

[Histogram: Residual1 - residual distribution before any removals]

To this after 50 removals:

To this after 100 removals:

To this after 150 removals:

You can see how the residual error drops and centers nicely around zero with the removal of the problematic datapoints.
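For anyone who wants to reproduce those histograms, something along these lines works against the residual series saved out of each round. Here residuals_by_round is an assumed dict keyed by how many points had been removed (0, 50, 100, 150), with each value being the pandas Series of hourly residuals (assuming kWh):

```python
import matplotlib.pyplot as plt

# residuals_by_round: assumed dict of {points_removed: pandas Series of residuals}
fig, axes = plt.subplots(2, 2, figsize=(10, 8), sharex=True)
for ax, (n_removed, resid) in zip(axes.flat, sorted(residuals_by_round.items())):
    ax.hist(resid, bins=60)
    ax.axvline(0, color="black", linewidth=1)
    ax.set_title(f"Residuals after {n_removed} removals")
    ax.set_xlabel("Residual (kWh)")
plt.tight_layout()
plt.show()
```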
