I think the only ones who could offer an exact reason are the Sense employees; I’ve scoured the community looking for one. The only response I could find was from Ryan, which was basically, “What do you hope to gain by doing so?”
Here’s my two cents as to why it might be. I have a couple of ideas, but first let me give two examples. We’ve all seen Google’s reCAPTCHA. With reCAPTCHA, Google knows the answer to two of the three entries; it wants you to help with the third. If you get the two known entries correct, then it doesn’t matter what you pick as the third: it will let you through. The idea is that a large percentage of the population will pick the correct one. Google then sees, say, a 90% rate of the item in question being identified as whatever it asked you to identify. Normally these are objects that Google wants to use for self-driving AI or Maps recognition (notice how most are crosswalks, buses, bikes, traffic lights, etc.). This process isn’t machine learning; it’s people learning, and it’s purely statistical. The answers are then used in machine learning later on to develop the models and predictive algorithms for the services they offer. The key here is that the data used to determine the truth came from thousands, if not millions, of people. There’s no need for outlier protections because, statistically, the item is correctly identified.
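The crowd-labeling idea above can be sketched as a simple majority vote with an agreement threshold. This is purely illustrative; the function name, the 90% threshold, and the labels are my assumptions, not Google’s actual pipeline.

```python
from collections import Counter

def label_from_votes(votes, threshold=0.9):
    """Assign a crowd label to an image tile from many users' answers.

    `votes` is a list of answers (e.g. "crosswalk" / "not_crosswalk").
    The tile only gets a label once agreement passes `threshold`;
    otherwise it stays unlabeled and keeps collecting votes.
    Illustrative sketch only, not Google's real system.
    """
    if not votes:
        return None
    answer, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    return answer if agreement >= threshold else None

# 9 of 10 users said the tile shows a crosswalk, so it gets labeled.
print(label_from_votes(["crosswalk"] * 9 + ["not_crosswalk"]))  # crosswalk
```

No single user’s answer matters much; with enough votes, the statistics alone settle the label, which is why no per-user outlier handling is needed.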
Now, when you do machine learning with data from a customer, or over time, you often have to build in algorithms that detect anomalous data. It’s critical that this be done; otherwise the anomalous data can skew and mislead the training process of the machine learning algorithm, resulting in longer training times, less accurate models, and ultimately poor results. Sense doesn’t rely on people to determine what is true; it relies on models and predictive algorithms built from huge amounts of data. What happened with you is that Sense detected the anomaly and alerted you. The data your monitor was providing would most likely have been considered a “collective outlier”: the values as a collection deviate significantly from the data set of all Sense monitors, but the individual data points are not themselves anomalous in either a contextual or global sense.
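To make the “collective outlier” idea concrete, here’s a minimal sketch: each reading in a window can look perfectly normal on its own, but the window’s *average* can still sit far from the fleet-wide mean. The z-score test, the fleet numbers, and the function name are all my assumptions; Sense’s actual detector isn’t public.

```python
import statistics

def is_collective_outlier(window, fleet_mean, fleet_stdev, z=3.0):
    """Flag a window of readings whose mean deviates from the fleet.

    Each reading may be individually plausible, but if the window's
    average is more than `z` standard errors from the fleet-wide mean,
    the window as a whole is anomalous. Illustrative sketch only.
    """
    n = len(window)
    standard_error = fleet_stdev / (n ** 0.5)
    deviation = abs(statistics.mean(window) - fleet_mean)
    return deviation > z * standard_error

# Fleet average: 1.0 kW with a 0.5 kW spread. A reading of 1.4 kW is
# within one standard deviation, so no single point looks anomalous,
# yet 100 such readings in a row clearly do as a collection.
print(is_collective_outlier([1.4] * 100, fleet_mean=1.0, fleet_stdev=0.5))
```

This is the shape of the author’s point: a miscalibrated monitor produces individually sane numbers whose collection gives it away.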
So those are the two examples, and here’s my guess as to why they want you to reset.
- The old data has been flagged as anomalous, so it’s just a bunch of useless data that’s expensive (over time) and taking up space. The two most common ways to handle such data are to drop it or to cap it. Because the data has been flagged, it’s excluded from the data set (capped) and provides no value to training. I would assume that all of your data prior to the detection date was capped, since they probably don’t know when the anomaly first occurred.
- The averages, which probably aren’t any form of intelligence or learning, could be skewed for a while. If, for example, Sense uses a 12-month rolling window for the annual usage numbers, then a year from now they would be correct; until then they would contain inaccurate data. If they take an average over the lifetime of the account, it could be a long time before the correct averages are shown. These numbers could also skew Sense’s ability to average many customers in the same area, etc.
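The two bullets above can be sketched in a few lines: handling flagged readings by dropping versus capping, and a rolling average that only heals once the bad months age out of the window. The record fields, the 1000 kWh ceiling, and the sample numbers are all hypothetical, not Sense’s schema.

```python
# Hypothetical monthly readings; month 3 was flagged as anomalous.
readings = [
    {"month": 1, "kwh": 900,  "flagged": False},
    {"month": 2, "kwh": 950,  "flagged": False},
    {"month": 3, "kwh": 5000, "flagged": True},
    {"month": 4, "kwh": 920,  "flagged": False},
]

# Drop: remove flagged rows entirely.
dropped = [r for r in readings if not r["flagged"]]

# Cap: keep the rows but clamp flagged values to an assumed ceiling.
CEILING = 1000
capped = [
    {**r, "kwh": min(r["kwh"], CEILING)} if r["flagged"] else r
    for r in readings
]

def rolling_average(values, window=12):
    """Average of the most recent `window` values only."""
    recent = values[-window:]
    return sum(recent) / len(recent)

# With a rolling window, the flagged month keeps skewing the average
# until enough new months push it out of the window.
print(rolling_average([r["kwh"] for r in dropped]))
```

Dropping gives clean averages immediately, while capping leaves a distorted value in place; that difference is what the last bullet is about.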
So my assumption is that the request for you to reset the data is more because they want you to rather than because you need to. They want you to take the initiative to clean up the mess in their database, which is now unusable. If it were my app, I would rather drop the data than cap it, and I think that’s what they’re asking of you.
I could be totally wrong here, but there’s no other explanation I could find, and I had a three-hour drive back home tonight to think about “why”… This was my best guess.