I’m celebrating the start of the 7th year of production on my SolarCity solar system, and decided to see if I could work out a predictive calculation as a function of yearly periodicity, days in service (dirt and degradation), plus any other time factors (like day in year). My solar install preceded Sense, so all of this daily production data comes from web access to my inverter’s data over the years.
First, I used linear regression to fit all of the daily production (kWh) against:
Cyclic = cos(2 * pi * (Yday - 172) / 365), where Yday is the day in the year
- Day 172 is Jun 21st, longest day of the year.
Days in Operation (DIO) - to account for degradation over time
Yday - any other weirdness about the time in the year.
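For anyone who wants to play with the seasonal term, here is a minimal Python sketch of the Cyclic calculation. The function name and the peak_day and period parameters are just illustrative names, not part of the original analysis (the fits in this post came from R's lm).

```python
import math

def cyclic(yday, peak_day=172, period=365):
    """Seasonal term: 1.0 on the peak day, close to -1.0 half a year away."""
    return math.cos(2 * math.pi * (yday - peak_day) / period)

print(cyclic(172))  # 1.0 on day 172 (Jun 21st)
print(cyclic(355))  # day 355 (Dec 21st), near the bottom of the cycle
```

Making peak_day a parameter makes it easy to try a different offset inside the cosine.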
Here’s what the fit looked like:
Broad distribution of points, all living inside an envelope of the max possible solar production, gated by the max normal solar radiation per day. Reasonable fit given all the negative variability depending on weather - Adjusted R-squared: 0.7107. You can also see the degradation effect in the lowering fit lines - the DIO coefficient says I lose about 1.6 Wh of daily production for every day of operation, or roughly 0.58 kWh per day after a year. A good reason to have a power purchase agreement with a guarantee. Here’s the linear model information:
    lm(formula = kWh ~ Cyclic + Yday + DIO, data = SolarHist)

    Residuals:
        Min      1Q  Median      3Q     Max
    -20.451  -1.147   1.007   2.432   6.459

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept) 19.6539052  0.2157409  91.100   <2e-16 ***
    Cyclic       8.2409384  0.1176425  70.051   <2e-16 ***
    Yday         0.0022278  0.0007835   2.843   0.0045 **
    DIO         -0.0015961  0.0001297 -12.311   <2e-16 ***
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 3.826 on 2189 degrees of freedom
    Multiple R-squared: 0.7111, Adjusted R-squared: 0.7107
    F-statistic: 1796 on 3 and 2189 DF, p-value: < 2.2e-16
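To make the coefficients concrete, here is a small Python sketch that plugs them back into the model equation. The coefficients are copied from the summary above; the function name predict_kwh and the example days are illustrative choices only.

```python
import math

# Coefficients copied from the lm() summary above (units: kWh per day).
INTERCEPT = 19.6539052
B_CYCLIC = 8.2409384
B_YDAY = 0.0022278
B_DIO = -0.0015961

def predict_kwh(yday, dio):
    """Fitted daily production for a day of year and days in operation."""
    cyclic = math.cos(2 * math.pi * (yday - 172) / 365)
    return INTERCEPT + B_CYCLIC * cyclic + B_YDAY * yday + B_DIO * dio

# Same calendar day (Jun 21st), 365 days of operation apart.
first = predict_kwh(172, 172)
later = predict_kwh(172, 172 + 365)
print(first - later)  # loss from the DIO term alone: 0.0015961 * 365 kWh
```

The difference works out to roughly 0.58 kWh per day for each year in service, which is the degradation effect visible in the lowering fit lines.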
After this, I decided to look at just the maximum value recorded for each day in the year, to help compensate for weather variability. I also pulled the days in operation (DIO) variable out of the equation, since I’m looking at the maxes over a period of 6 years.
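As a rough illustration of that "max per day in year" step, here is a Python sketch that groups readings by Yday and keeps the largest one. It is not the code used for the fit in this post, and the sample readings are made up purely to show the shape of the data.

```python
from collections import defaultdict

def max_by_yday(records):
    """records: iterable of (yday, kwh) pairs spanning several years."""
    best = defaultdict(float)  # production is never negative, so 0.0 is a safe floor
    for yday, kwh in records:
        best[yday] = max(best[yday], kwh)
    return dict(best)

# Made-up readings: three years of day 172, two years of day 355.
sample = [(172, 27.1), (172, 29.4), (172, 12.0), (355, 9.8), (355, 11.2)]
print(max_by_yday(sample))  # {172: 29.4, 355: 11.2}
```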
Much better fit with less weather variability - Adjusted R-squared: 0.9455. The Yday coefficient also turns negative, grows in magnitude, and becomes more significant. I’ll need to look more closely at why that happens. I was actually getting a slightly better fit when I used a 162 day offset, rather than a 172 day offset, inside my cyclic cosine component. Would love to see what other solar users are seeing, though this might not be the best forum.
Here’s more info on the model that fits the max:
    lm(formula = Max ~ Cyclic + Yday, data = SolarHist)

    Residuals:
        Min      1Q  Median      3Q     Max
    -7.7307 -0.8023 -0.0357  0.7073  3.9201

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept) 22.4786629  0.0579658  387.79   <2e-16 ***
    Cyclic       7.8434592  0.0411364  190.67   <2e-16 ***
    Yday        -0.0036100  0.0002757  -13.09   <2e-16 ***
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 1.347 on 2190 degrees of freedom
    Multiple R-squared: 0.9455, Adjusted R-squared: 0.9455
    F-statistic: 1.9e+04 on 2 and 2190 DF, p-value: < 2.2e-16
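One way to see what the negative Yday coefficient does in this model: compare two days that sit the same distance from day 172, so their Cyclic values are identical and only the Yday term differs. A minimal Python sketch, with coefficients copied from the summary above and predict_max as an illustrative name:

```python
import math

# Coefficients copied from the max-model summary above (units: kWh per day).
INTERCEPT = 22.4786629
B_CYCLIC = 7.8434592
B_YDAY = -0.0036100

def predict_max(yday):
    """Fitted best-case daily production for a given day of year."""
    cyclic = math.cos(2 * math.pi * (yday - 172) / 365)
    return INTERCEPT + B_CYCLIC * cyclic + B_YDAY * yday

spring = predict_max(172 - 90)  # day 82, 90 days before Jun 21st
autumn = predict_max(172 + 90)  # day 262, 90 days after Jun 21st
print(spring - autumn)  # 0.0036100 * 180, about 0.65 kWh
```

So the fitted envelope sits about 0.65 kWh higher in spring than in autumn at the same distance from the solstice. This only describes what the fitted model does; it does not explain why the data behaves that way.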