@monty,
A few observations from my perspective:
- If you truly believe in open sharing of data to leverage university work, then you should switch from Nest to Ecobee. Ecobee offers anonymized user datasets to a curated pool of researchers. I’ve had the pleasure of seeing some of Dr. Chong’s (Cornell) research on the subject of heating and cooling algorithms that has evolved using Ecobee data.
https://www.ecobee.com/wp-content/uploads/2017/01/DYD_Researcher-handbook_R7.pdf
Disclaimer - I gave up on Nest about 3 years ago when they seemingly lost all innovation after the Google acquisition. Ecobee started offering room sensors back then; Google has only just started.
- Read up on the MIT and CMU work on the REDD and BLUED datasets, which are used for most US university research on electrical disaggregation. Both datasets are straightforward measurements of voltage and current at a 16 kHz sample rate, plus ground truth measurements at either 1 Hz or 1 sample per minute. Even at a 16 kHz sample rate (266.67 samples per 60 Hz cycle), the researchers do some pre-processing to reduce data size by removing repetitive data, though they offer the raw data as well.
http://redd.csail.mit.edu/readme.txt
- Sense is specced at sampling 4M data points per second, or a 1 MHz sample rate (2 voltage samples and 2 current samples). Given that measurement rate vs. the Sense data upload rate several users have metered, the Sense probe is clearly doing lots of pre-processing of the raw current and voltage measurements to extract relevant features and minimize the amount of data uploaded. I’m sure this reduced format and the associated processing are part of Sense’s secret sauce. Researchers couldn’t use it without deep explanations of the preprocessing, and the format is likely subject to change (firmware update, anyone?).
- What’s more, university-bound datasets would likely require ground truth measurements for all of the end devices. Very few researchers would want to jump into the game with a gigantic, completely untagged dataset. So I don’t think the current Sense dataset is a good match for university sharing.
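
For anyone who wants to check the sample-rate numbers above, here is a quick back-of-the-envelope sketch in Python. The split into 4 channels is my reading of the "2 voltage and 2 current" spec, and the function names are made up for illustration; none of this reflects how Sense actually processes data.

```python
# Rough arithmetic on the sample rates quoted above (60 Hz mains assumed).
MAINS_HZ = 60


def samples_per_cycle(sample_rate_hz, mains_hz=MAINS_HZ):
    """Number of samples captured during one mains cycle."""
    return sample_rate_hz / mains_hz


def per_channel_rate(total_points_per_s, channels):
    """Per-channel sample rate when the total is split evenly across channels."""
    return total_points_per_s / channels


# REDD / BLUED: 16 kHz voltage and current sampling.
redd_rate_hz = 16_000
redd_cycle = samples_per_cycle(redd_rate_hz)

# Sense: 4M data points per second, assumed to be 4 channels (2 V + 2 I).
sense_rate_hz = per_channel_rate(4_000_000, 4)
sense_cycle = samples_per_cycle(sense_rate_hz)

print(f"REDD/BLUED: {redd_cycle:.2f} samples per 60 Hz cycle")
print(f"Sense: {sense_rate_hz:,.0f} Hz per channel, {sense_cycle:,.2f} samples per cycle")
print(f"Sense samples {sense_rate_hz / redd_rate_hz:.1f}x faster per channel than REDD/BLUED")
```

The gap between those two rates is the point: at roughly 60 times the REDD/BLUED sample rate, uploading raw waveforms is not realistic, so heavy on-probe feature extraction is the only plausible explanation for the upload rates users have metered.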