Scraping the web app Power Meter for data analysis

One final chapter on this enterprise. I invested a lot of time in “finding” data dropouts because I kept seeing their impact in Sense exported data. “Always On” data is exceptionally sensitive to dropouts, though I don’t know enough about how it is calculated to understand exactly why. Still, anomalies in “Always On” gave me a way to locate, almost automatically, 334 hours over the past 535 days that included one or more data dropout events. In the chart below, I overlay the number of hours per day with detected dropout events (black line) on top of my column chart of dropout events per day. The two track each other closely, though that should be no surprise.

What surprised me was that “Always On” anomalies alone found about 95% of those hours. Yes, I had to do a little filtering because there were some false positives. And yes, I did have to search in one more region (Always On < 0.135 kWh) to find the remaining 5% of the hours with dropouts.
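For anyone who wants to try something similar, here is a rough sketch in Python of how a search like this could be automated. It is not the exact procedure I used: the rolling-median rule, the 24-hour window and the 0.02 kWh tolerance are assumptions made up for illustration, and the readings are invented. Only the 0.135 kWh low-value cutoff comes from the search described above.

```python
from statistics import median

LOW_ALWAYS_ON_KWH = 0.135   # low-value region mentioned in the post
JUMP_TOLERANCE_KWH = 0.02   # hypothetical tolerance, not from the post


def flag_suspect_hours(hourly_kwh, window=24,
                       tolerance=JUMP_TOLERANCE_KWH,
                       low=LOW_ALWAYS_ON_KWH):
    """Return sorted indices of hours whose Always On value looks anomalous.

    Rule 1 (assumed): the value differs from the median of the preceding
    `window` hours by more than `tolerance`.
    Rule 2 (from the post): the value is below `low` kWh.
    """
    flagged = set()
    for i, value in enumerate(hourly_kwh):
        history = hourly_kwh[max(0, i - window):i]
        if history and abs(value - median(history)) > tolerance:
            flagged.add(i)
        if value < low:
            flagged.add(i)
    return sorted(flagged)


# Invented hourly Always On readings in kWh.
readings = [0.150, 0.151, 0.149, 0.150, 0.190, 0.150, 0.132, 0.150]
suspects = flag_suspect_hours(readings)
print(suspects)
```

In this made-up series, hour 4 is caught by the median rule and hour 6 only by the low-value rule, which mirrors why a second search region was needed. Any flagged hour still needs a manual check, since a rule like this will produce false positives.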

As you can see from the chart, dropout events are distinct from dropout hours. You can have more than one dropout event per hour, but a single big dropout event can also span several hours. It also looks like I missed a few hours despite my intense search, but I can easily go back and locate them now if I want to.
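To make the difference between the two counts concrete, here is a small Python sketch. It assumes each dropout event is available as a (start, end) pair of timestamps, which is my own hypothetical representation, and the events themselves are invented.

```python
from collections import Counter
from datetime import datetime, timedelta


def hours_touched(start, end):
    """Yield each clock hour that the event overlaps (end is inclusive)."""
    hour = start.replace(minute=0, second=0, microsecond=0)
    while hour <= end:
        yield hour
        hour += timedelta(hours=1)


def summarize(events):
    """Return (dropout events per day, distinct dropout hours per day)."""
    events_per_day = Counter(start.date() for start, _ in events)
    dropout_hours = {h for start, end in events
                     for h in hours_touched(start, end)}
    hours_per_day = Counter(h.date() for h in dropout_hours)
    return events_per_day, hours_per_day


# Invented events: two short dropouts in the same hour, one long dropout.
events = [
    (datetime(2021, 3, 1, 10, 5), datetime(2021, 3, 1, 10, 7)),
    (datetime(2021, 3, 1, 10, 40), datetime(2021, 3, 1, 10, 41)),
    (datetime(2021, 3, 2, 1, 50), datetime(2021, 3, 2, 4, 10)),
]
events_per_day, hours_per_day = summarize(events)
for day in sorted(events_per_day):
    print(day, events_per_day[day], "events,", hours_per_day[day], "hours")
```

The first day has two events but only one dropout hour, while the second day has one event covering four dropout hours, so the two series can diverge in either direction. An event is counted on the day it starts, which is a simplification for events that cross midnight.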

Best of all, I can put this experiment to rest and hope that Sense finds the data leaks that cause dropouts in the history timeline and export.

More on finding data dropouts using “Always On” data here: