Much of what we do in the energy industry feels very esoteric. So, whenever an opportunity presents itself to incorporate work from other disciplines, like data science, I get really excited. That is what’s happening with the ASHRAE Great Energy Predictor Challenge III, which just launched on Kaggle and incorporates NMEC into an industry contest.
What is the Great Energy Predictor Challenge?
The objective of this year’s Energy Predictor Challenge is to create the best model to help answer the question: how much does it cost to cool a skyscraper in the summer? The challenge involves using a fantastic dataset of whole-building energy consumption from over 1,000 buildings to create a model that estimates building energy use. The model should predict what a building’s energy usage would have been if the building had not implemented an energy efficiency project.
Big cash prizes
Contestant efforts will be rewarded with big prizes. First place will take home $10,000, followed by $7,000, $5,000, $2,000 and $1,000 for second through fifth place, respectively.
Whole Building Energy Models
We use models like these in our industry to verify energy savings. Some building owners use pay-for-performance financing, which requires an accurate model to establish fair financing terms and an accurate payback analysis. However, whole-building meter data analysis is still very new, since this type of data has only recently become widely available. So, the competition seeks to draw on the expertise of data scientists and energy engineers to develop accurate models.
I’ve spent much of my career verifying the amount of energy savings achieved after recommended energy saving improvements (aka energy efficiency measures) were made to a building. We refer to this as measurement and verification or M&V for short.
Normalized Metered Energy Consumption (NMEC)
One method for verifying energy savings is to use whole-facility energy meter data collected before (baseline period) and after (reporting period) the improvement. It essentially comes down to Energy Savings = (Baseline Period Energy – Reporting Period Energy) ± Adjustments.
Conditions such as weather or building occupancy that impact energy usage may differ in these two periods. To account for this, a regression model is fit that describes baseline energy use as a function of weather (or other independent variables) and then used to predict what energy use in the reporting period would have been if no changes had been made. This is referred to as Option C in the International Performance Measurement and Verification Protocol (IPMVP) and has taken on the name Normalized Metered Energy Consumption (NMEC) more recently in California energy policy.
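To make the idea concrete, here is a minimal sketch in Python of the baseline-regression approach described above. It assumes a single independent variable (outdoor temperature) and a plain straight-line fit; the function names and all of the numbers are made up for illustration and are not taken from IPMVP or from the competition data.

```python
import numpy as np

def fit_baseline(temps, energy):
    """Least-squares fit of energy = intercept + slope * temperature."""
    X = np.column_stack([np.ones_like(temps), temps])
    coef, *_ = np.linalg.lstsq(X, energy, rcond=None)
    return coef  # [intercept, slope]

def avoided_energy(coef, reporting_temps, reporting_energy):
    """Predicted 'no project' energy minus the energy actually metered."""
    predicted = coef[0] + coef[1] * reporting_temps
    return float(np.sum(predicted - reporting_energy))

# Hypothetical baseline period: daily average temperature and daily energy use.
baseline_temps = np.array([60.0, 65.0, 70.0, 75.0, 80.0, 85.0])
baseline_energy = 100.0 + 4.0 * baseline_temps

# Hypothetical reporting period, after the efficiency project.
reporting_temps = np.array([62.0, 72.0, 82.0])
reporting_energy = np.array([320.0, 355.0, 390.0])

coef = fit_baseline(baseline_temps, baseline_energy)
savings = avoided_energy(coef, reporting_temps, reporting_energy)
print(f"Estimated savings: {savings:.1f}")
```

The model is fit only on baseline data and then asked what the building would have used under reporting-period weather. Real projects typically need more independent variables and a check of model uncertainty before the savings number can be trusted.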
Challenge of NMEC
The challenge in performing M&V in this manner is to develop an accurate prediction of what energy use would have been in the reporting period based on a model developed from the baseline period. The methods I have used over the years include those defined in the ASHRAE Inverse Modeling Toolkit and the time of week and temperature (TOWT) model published by researchers at Lawrence Berkeley National Lab (LBNL). These methods have worked well much of the time, but the quest for better predictors always exists. That’s why this year’s Energy Predictor Challenge can yield big benefits: it can draw on a global network of sharp minds to create models that serve many use cases, from energy usage predictions for M&V to fault detection to load forecasting. Contestants’ models will be shared with the world via open source code for everyone to use.
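For readers new to these methods, here is a rough Python sketch of a three-parameter cooling change-point regression, the general kind of model associated with inverse modeling of building energy use. It is not the toolkit's actual algorithm: the grid search over candidate change points, the function name, and the data are all assumptions made for illustration.

```python
import numpy as np

def fit_cooling_change_point(temps, energy, candidates):
    """Fit energy = base + slope * max(temp - change_point, 0).

    Tries each candidate change point and keeps the one with the
    smallest sum of squared errors.
    """
    best = None
    for tcp in candidates:
        X = np.column_stack([np.ones_like(temps), np.maximum(temps - tcp, 0.0)])
        coef, *_ = np.linalg.lstsq(X, energy, rcond=None)
        sse = float(np.sum((X @ coef - energy) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(tcp), float(coef[0]), float(coef[1]))
    _, change_point, base, slope = best
    return change_point, base, slope

# Hypothetical data: flat base load below 65 degrees, cooling load above it.
temps = np.arange(50.0, 95.0, 5.0)
energy = 200.0 + 5.0 * np.maximum(temps - 65.0, 0.0)

change_point, base, slope = fit_cooling_change_point(
    temps, energy, candidates=np.arange(55.0, 80.0, 1.0)
)
print(change_point, round(base, 2), round(slope, 2))
```

The appeal of a model like this is that its three numbers have a physical reading: a base load, a temperature where cooling kicks in, and how fast energy use climbs after that.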
What is Kaggle?
The challenge is hosted on a platform called Kaggle. It is relatively new to me, but it’s been frequented by a growing community of data scientists for many years. Kaggle is a web platform that holds competitions challenging anyone with machine learning skills to build the best model for predicting all kinds of things. Looking at the list of currently active competitions, you will see anything from predicting how many yards an NFL player will gain after receiving a handoff to predicting sales prices of real estate. The platform lets organizers provide datasets for a competition, lets users share notebooks that detail their work step by step, accepts results submissions, and maintains leaderboards and discussion forums.
This is the third challenge of this type sponsored by ASHRAE. The first two were held in 1993 and 1994 and are described briefly on the challenge site. There was a long lag until this year’s because several factors had to come together to create the right conditions for another competition. These include the availability of whole-building interval energy data, advances in data science, the growth of the Internet, and the emergence of platforms like Kaggle. Now hosting such a competition is much easier and much more interactive.
Join the Great Energy Predictor Challenge! (Or at least check it out)
The Great Energy Predictor III competition allows energy data scientists to test their ideas not only during this competition, but long into the future. So, check it out and tell all your data science or energy modeling friends about it as well. If you’re not proficient in data modeling, look through some of the notebooks people have posted. There’s a lot going on over there, and the competition runs until December 19th.