Correlating NVEnergy Usage Data with Sense Data

Kevin,

Your guide gets my data science brain jazzed up! Ugh, this is so cool! Anyway, I followed your steps and came out with a graph that looks like this. Attached are the data tables and models used for charting. Any ideas why it could look this stupid? The reason there are two sets of data from the utility company is because that is how it came to me from them. I did the work you suggested and saved the two formatted and pivot tabled spreadsheets for analysis as Demand NVEnergy and NoDemand NVenergy and they relate to the demand and just . I’m not sure as to why the two data sets don’t match up, Also, I can’t add links or attachments because I’m ‘new to the forum’, which is a bit lame so here is a google link with ***. Just replace the *** with https, drive, google, and com to get to google drive where I uploaded them. Any help would be appreciated!

:// . . drive/folders/1Z6LSD3WkymSc8atrKwoxNIRKYuLF91zv?usp=sharing

2 Likes

Hi @chrisseivert24, I’m taking a look, right now. I have also made sure that can post .csv’s, .png’s, .jpg’s and .xlsx’s. going forward. One quick question for you - are the csv’s you put on the Google drive straight from the downloads ? The reason I ask is that sometimes Excel can change the the formatting of columns when you save a csv in Excel (the saved csv can be different, for time/date fields than the original read in format).

Yeah I noticed that! It’s odd, but I changed the format to excel workbooks so the formatting is correct.

So first step is figuring out what the files from your utility actually are. They look a lot like mine - 15 min interval smart meter output. And you have added the DateTime correctly. The one marked “Demand” looks like this:

Screen Shot 2020-10-04 at 3.51.17 PM

The units (kW) and the name (Demand) suggest that this represents the maximum power draw that you pulled during that 15min period. Utility companies usually use that to calculate demand charges above and beyond your usual per kWh charges.

I think the second utility file without Demand in the title is some kind of actual usage, since it uses delivered kWh (KWH_DEL) for the units. But the usage in that column never goes negative even though Sense shows that you have times when you are putting out power to the the grid. Do you know if you have Net Metering for your solar ?

Yeah I have net metering. I get 81% of what I give them from excess solar generated back as a credit on my bill but that’s more gee whiz stuff. I know for a fact on that sense data set I gave you that there is a negative output for solar given as excess back to the grid so it’s interesting as to why solar doesn’t share that as well. I’ll give you an unscrubbed data set straight from the source (besides name and address of course). Hold on

Look at the link I sent one more time. There should be a data masked folder with more data that hasn’t been edited at all.

1 Like

Thanks for the full data set. One more question - what kind of utility power meter do you have ? A picture of the front of it might suffice. I ask, because the “Read_Type” column in your data makes me question how your 15 min data is created. Very few reads are “Actual” and most are “Calculated”. I don’t know enough about utility meters and monitoring infrastructure to know what “Calculated” means. And once again, I’m not seeing any negative usage periods which seems wrong for net metering, because there are bound to be times when you overproduce.

ps: It is nice that your power company annotates Time of Use (ToU) periods to each 15 min interval, but worried about what that “Calculated” usage means. Wondering if other Sense users might know.

Yeah of course! I took a picture that displays the model number of usage. I’m wondering, are there any other sense users that could help with this with us?

I wish I could summon all the Sense users in Nevada to weigh in here @chrisseivert24.

Haha it’s not big deal. I’m sure we can figure it out! What would you say the next step could be?

I’d recommend creating a new post in the Technical Questions subforum with the name of your utility/region to see if other Sense users in the area have more understanding of the reports provided by your utility or recommendations for you here.

@chrisseivert24,

In the absence of more info on NVEnergy output data, I tried plotting the hourly non-Demand Usage data against Sense Net data (Total Usage + Solar Production). It looks to me like NV Energy is serving you up a “hot mess” of data with a little actual usage buried inside. It’s a hot mess from 2 dimensions.

Here’s your raw hourly data plotted and colorized by the NV Energy Read Type. Usage is from NV Energy, Net from Sense. The first big issue is that this data from NV Energy apparently converts all negative energy (you putting power on the grid) into positive energy (you using energy from the grid). Hopefully they are only doing this for downloaded data, not for actual billing :wink:

One I flip all the Sense Net negative energy (you putting energy on the grid) to positive, the chart looks a little better and begins to make sense (pun intended).

It looks like most of the time, the “Actual” reads correlate well with Sense, the “Mixed” reads (where at least some of the 15 min periods had some “Actual” reads) look to reasonably correlate, and the “Calculated” reads are all over the map.

I think NV Energy only reads the meter every once in a while (that’s a feature on your meter - on demand reads), then fills the rest using some kind algorithm. Plus they convert all your data to positive for export, even if it is really negative.

I also noticed in working with the data that every 15 min period actually has two Usage values. It’s as if two different data sets were appended to one another. I discovered that when I noticed I was summing 8 values per hour rather than 4 values. Your NVEnergy download has 6144 rows for the period in question should only have 3072 samples.

@chrisseivert24,

OK - one more mystery solved ! So as I mentioned earlier, you can ignore the “Demand” data download for the moment, and focus on the other data file. Once inside you have to look at the “powerFlow” information.

There are 8 readings per hour rather than 4 because half are “delivered” energy (from grid) and half are “received” energy (your solar to the grid).

So the first thing you need to do is to make all the Usage where the “powerFlow” is “received”, negative. Once I did that, it corrected the lack of negative values issue in your NVEnergy data. I now get a graph closer to what I would expect, and again, the hourly “Actual” reads look very close to Sense.

If I look at the 50 hourly readings, of the 768 total, that are composed of all 8 “Actual” 15 min readings, your Sense Net Usage is incredibly close to your NV Energy Net reading. If I do a linear model for just those 50 points, I get R-squared’s equal to 1, essentially a perfect linear fit. And the slope (1.0002097) and intercept (0.0003593) coefficients are near perfect as well. Not sure how much you know about linear models and regression, but when your meter is sending “Actual” reads for an entire hour, Sense is spot-on with it.

Not sure what’s happening the rest of the time with NVEnergy and “Calculated” usage, but it nets out over the entire month/billing cycle you supplied. If I sum Usage for the entire time period I get 966.275kW, while Net (from Sense) gives 969.564kWh.

Now that I know how the data is organized using R, I can give you further guidance on doing the same correlation in Excel if you want. Let me know.

I did one more experiment, mostly to understand how good a job NVEnergy does in distributing the “Calculated” energy reads. If it distributed energy intelligently between “Actual” reads, then aggregating all Net and Usage data up into individual days should show pretty good correlation. Unfortunately, NV Energy doesn’t distribute the energy usage intelligently over daily boundaries so that it matches up with your actual daily usage. The ultimate monthly totals between Sense and NVEnergy match, but the daily totals are all over the map.

Here’s a plot of the daily totals from Sense (Net) and NV Energy (Usage). The color of the points represents a count of how many Reads that day were “actual” (max = 8 x 24 = 192 if every read during the day were Actual).

Rplot04

The number of Actual reads appears to have a small influence on correlation, but ultimately all days have less than 40% Actual reads in a day.

And here’s kind of a different view of what is happening in time. Sense readings are represented by the red line, NVEnergy readings by the blue line. Once again, the data points from NVEnergy are labeled as to their Read Type by color. Blue means fully “Calculated” for that hour, green means “Mixed” (some Actual and some Calculated) and red means “Actual” where all 8 reads for that hour were actual.

@chrisseivert24,

You have two quirks in NV Energy’s energy data download that need to be taken care of before doing your NV Energy Pivot Table. The things we need to tweak:

  1. The energy Usage column has to be adjusted to negative, when the “powerFlow” column equals “Received”, so we can compute Net Usage. So we need to create a derived "Real Usage column.

  1. We have to keep track of the sum of “Actual” Read Types, vs. “Calculated” per hour, so we know when a comparison with Sense is actually meaningful. To do this, we need a derived indicator column, “IsActual”, that is set to 1 if the read is “Actual” and 0 if it is “Calculated”.

Screen Shot 2020-10-10 at 4.43.13 PM

  1. As before, we also need a text version of the DateTime hour for each measurement.

Once we have those 3 derived columns, it’s time to do a PivotTable with the following specifications:

Once you have that, result, you can copy and paste to merge with the filtered and processed (PivotTable) Sense data. You’ll only want to do detailed comparisons against Sense when you have 8 Actual reads in an hour.

1 Like

One final installment on duplicating the correlation chart I did earlier in R, using Excel.

Once I combine the hourly Net results (Usage + Solar, or Delivered - Received) from Sense and NV Energy and put them side, by side, we still have one more task in from of us. We have to divide up the hours that are entirely “Actual” reads from “Mixed” reads (some “Actual”, some “Calculated”), from the entirely “Calculated” hourly periods. To do this we need to create three new derived columns that either contain the NV Energy hourly data, as appropriate, or NA. They we can plot each column, “Actual”, “Mixed” and “Calculated” as its own series. To do that, each column need the logic functions below.

The NA’s are useful since they make those points invisible for charting.

And just for fun, I’ll also create a Diff column that shows the kWh difference between Sense and NV Energy. Maybe we can use that later. Unfortunately there are very few “Actual” read hours compared to the other two.

Once I’m done, I can plot all three series as XY Scatter charts and voila:

I’ll conclude with one more view on NV Energy’s data. Your Sense correlates incredibly well with your NV Energy “Actual” reads… But not so well with “Mixed” hourly reads (some “Actual” and some “Calculated”), and very poorly with most entirely “Calculated” hourly reads.

The histogram below highlights the error/differences distribution between the different read types.

I believe that Sense is giving you consistently a much better answer than your NV Energy download.

Geez thank you so much for all of your help. I’m sure I’ll have many other questions for the outliers. Do you think there’s a way to talk to NV energy and get them to change the number of actual reads per hour? It seems like it would have been a pain to do all of that and I appreciate it so much! My final question is wondering if the data being reported to NVenergy (which is my bill) is being calculated correctly. I assume it is being that the data correlates well?

It was some fun forensic data analysis… I’m not sure why you have so few Actual Reads, though in the old days you would only get one read per month. I’m guessing that the small percentage of Actual Reads, most frequently back-to-back, indicates a limit in NV Energy’s metering infrastructure. Most meters send small packets of data to other nearby meters, hopping the data from meter to meter until it gets to a receiving station that sends the data to their data center. If the path to the receiving station is long, through lots of other meters in the mesh, a utility will typically dial back the number of packets of data sent, just to prevent congestion. It would be worth asking NV Energy customer support if they could up the data rate / Actual Reads. I’m pretty sure that I have near 100% actual reads for every 15 min period.

On whether your bill is correct, a lot depends on what kind of plan you are on. If you have a simple fixed kWh price based on your net usage, or a monthly tiered rate based on your net usage, things should work out OK. The total net usage per month aggregates out OK (966.275kW for NVEnergy, 969.564kWh for Sense). But if you have ToU (time of use) pricing, the Calculated reads might not aggregate out correctly to match the real usage per ToU period. If I use the ToU labels in your data, there definitely is a difference between Sense/Net (continuous reads) and NV Energy/Usage (infrequent actual reads). I don’t know the rates for each of the 3 ToU but you can do the math.

Screen Shot 2020-10-12 at 10.37.25 AM

If SUMMER_EV_PEAK is the cheapest rate and SUMMER_ON_PEAK is the most expensive rate, then you are likely being charged too much. NV Energy is charging you for fewer cheap kWh and crediting you for less solar production during expensive hours than Sense is seeing.

1 Like