Netflix famously hosted a competition to see if anyone could create a movie recommendation algorithm that beat their internally developed algorithm. They were beaten, and they used the winning algorithm to provide users with a much better service. Would Sense ever be open to trying something similar?
The Sense algorithm (if you can call it that) is less like Netflix recommendations and more like, say, how you might imagine Uber drivers get directed by central command. The forces on those decisions are wide-ranging and, ultimately, fundamentally proprietary.
Having been in a couple of similar Kaggle competitions, I see 3 things needed to make something like this work:
- a raw, many-house training dataset at the same resolution as Sense’s (microsecond samples), with commensurately accurate device tagging for hundreds of devices.
- a similar testing raw dataset without tagging, representing many homes and hundreds of devices, some of which are not included in the training set.
- a “correction key” of tagged data for the testing dataset so Sense can objectively evaluate the accuracy of all submissions.
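To make the role of the “correction key” concrete, here is a minimal sketch of how submissions might be scored against it. Everything in it is an assumption for illustration: the function name `score_submission`, the data layout (a mapping from a home and time step to the set of devices that were on), and the choice of mean per-device F1 as the metric. Nothing here reflects how Sense or Kaggle would actually score such a contest.

```python
from collections import defaultdict


def score_submission(answer_key, submission):
    """Return the mean per-device F1 score of a submission.

    Both arguments are dicts mapping (home_id, time_step) to the set of
    device labels that were on at that moment. This layout and the metric
    are hypothetical, chosen only to illustrate the idea.
    """
    tp = defaultdict(int)  # device correctly reported as on
    fp = defaultdict(int)  # device reported as on, but the key says off
    fn = defaultdict(int)  # device on in the key, but missed

    # Only time steps present in the answer key are scored.
    for key, truth in answer_key.items():
        predicted = submission.get(key, set())
        for device in truth & predicted:
            tp[device] += 1
        for device in predicted - truth:
            fp[device] += 1
        for device in truth - predicted:
            fn[device] += 1

    # Every device seen in either the key or the submission counts,
    # so inventing devices that do not exist lowers the score.
    devices = set(tp) | set(fp) | set(fn)
    if not devices:
        return 0.0

    f1_scores = [
        2 * tp[d] / (2 * tp[d] + fp[d] + fn[d]) for d in devices
    ]
    return sum(f1_scores) / len(f1_scores)


# Tiny made-up example: two homes, three scored time steps.
answer_key = {
    ("home1", 0): {"fridge"},
    ("home1", 1): {"fridge", "dryer"},
    ("home2", 0): {"dryer"},
}
submission = {
    ("home1", 0): {"fridge"},
    ("home1", 1): {"fridge"},          # misses the dryer
    ("home2", 0): {"dryer", "oven"},   # reports an oven that is not there
}
score = score_submission(answer_key, submission)
```

The key stays private to the organizer, so every entrant is scored on the same hidden labels and nobody can tune against them directly.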
Given the size of the datasets, I cannot see a pragmatic way for Sense to release them, nor do I see an economical way for Sense to offer that much data access via the cloud. Besides the cost considerations on Sense’s side, there is also the issue of privacy - Sense is signed up to keep our data private, and using Sense-only data isn’t sufficient.
Finally, given the data volumes, I’m fairly certain that neither most Kagglers nor Sense would want to fund the required specialized training resources. The prize in the Netflix contest was $1M, but that was for a company with $1B in revenues when the contest started. It would probably take a similar bounty, in prize money, GPU time, or both, to do the same for Sense.
The large processing requirements for training on such a huge dataset aren’t something I had thought about. I could potentially see one of AWS, Azure, Google, or IBM offering free or discounted access to those resources, either for publicity, as a way to get more data scientists using their platforms, or something else. But it would definitely be a bigger hurdle than I was thinking.
Getting participants shouldn’t be too hard. First, I think a moderate prize would be a very reasonable investment for Sense, as limited device detection is by far the biggest complaint people have with Sense right now, but device detection is also Sense’s primary selling point and differentiation from the competition. Being a little better here would almost certainly be worth the expense. Also, I think a lot of people would do it for the practice, the challenge and experience, and their own ego and curiosity.
If you take a quick look through Kaggle, I’ll bet you’ll see a direct correlation between potential algorithm income and prize money (where’s the chart?). It speaks volumes that the current top competition is to correlate news with stock movements. Money tends to move things along at scale.
Perhaps it is worth taking a step back to ask whether device disaggregation and, more generally, energy tracking and saving should be given more attention along the lines of an X-Prize. That is, if utility-scale and governmental institutions (and taxpayers) invested in global disaggregation technology, you would think the increased scale would benefit the problems at hand.
“Free” Senses for everyone or at least tax breaks!