Machine Learning Part 1: The Machine Data

Part 1 – The Machine Data: A Detailed Perspective of Steps for Machine Learning and Predictive Chilled Water System Analytics in Facility Applications


FacilityConneX Approach to Predictive Analytics

Here we illustrate an effective FCX predictive methodology that organizations can use to detect and diagnose events that call for maintenance of their chillers. Our approach combines several machine learning methods with domain knowledge to detect equipment degradation that requires maintenance (in chillers, in this example) and to perform diagnostics and prognostics.

The typical FCX approach to predictive analytics includes the following steps:

  • Developing a probabilistic/statistical predictive model from the given historical data
  • Developing and training relevant machine learning methods
  • Connecting these two with equipment domain knowledge

Chillers and Their Key Performance Indicators

Air conditioning systems, and chillers in particular, account for a major share of total energy use in buildings. In this example, each chiller has two separate water loops, a condenser water loop and an evaporator water loop, with a refrigerant loop in between. The evaporator water loop is connected to the living space and absorbs heat from it. That heat passes through the refrigerant loop to the condenser water loop and is released in the cooling tower, which operates in the open air.

To manage chiller operational and energy efficiency, several key performance indicators (KPIs) need to be identified, monitored, and analyzed. The proper data analytics approach is to gather data for all the relevant KPIs and then perform exploratory data analysis to narrow them down to the most relevant ones. To do this properly and eliminate any artifacts that may be present in the data, we need careful data preprocessing.


For example, the chiller data fluctuates and becomes very noisy while the chiller is starting up or shutting down. This data, as well as the data recorded while the chiller status is off, should not be included in the training dataset used to build the chiller's predictive model from historical data. An example of this situation is illustrated in Figure 1.
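A minimal sketch of this filtering step is shown here. It assumes the chiller status is available as a 0/1 series aligned with the sensor readings. The function name `steady_state_mask` and the `settle_samples` margin are illustrative assumptions, not part of the FCX platform, and the sample values are made up.

```python
def steady_state_mask(status, settle_samples=2):
    """Return one boolean per reading: True only where the chiller is on
    and at least `settle_samples` readings away from an on/off transition.

    `settle_samples` is an assumed margin; with 15-minute data, 2 samples
    discards 30 minutes on each side of every start or stop.
    """
    n = len(status)
    keep = [bool(s) for s in status]      # start by dropping all "off" readings
    for i in range(1, n):
        if status[i] != status[i - 1]:    # a start or stop happened between i-1 and i
            lo = max(0, i - settle_samples)
            hi = min(n, i + settle_samples)
            for j in range(lo, hi):
                keep[j] = False           # drop the noisy readings around it
    return keep


# Hypothetical readings: status (0 = off, 1 = on) and a condenser water temperature.
status = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0]
temps = [70, 70, 85, 78, 80, 80, 80, 76, 71, 70]

mask = steady_state_mask(status, settle_samples=2)
clean = [t for t, k in zip(temps, mask) if k]
print(clean)
```

The right margin depends on how long a particular chiller takes to settle, so it is best chosen by inspecting plots such as Figure 1.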

The FCX cloud platform collects the chiller data every 15 minutes. A reasonable step in creating the predictive model is to average the preprocessed data over one-hour periods and over 24-hour periods. Once this is done for all the relevant KPIs, the mean and the standard deviation of each KPI are calculated. Based on the results, outliers are excluded from the training dataset by keeping only the data that falls within 3 standard deviations of the mean.
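The averaging and outlier steps can be sketched as follows, using only the Python standard library. The function names, the sample values, and the choice of the population standard deviation (`pstdev`) are assumptions made for illustration; the article does not say which estimator is used.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, pstdev


def average_by_window(readings, window):
    """Average (timestamp, value) readings into calendar hours or days."""
    if window not in ("hour", "day"):
        raise ValueError("window must be 'hour' or 'day'")
    buckets = defaultdict(list)
    for ts, value in readings:
        if window == "hour":
            key = ts.replace(minute=0, second=0, microsecond=0)
        else:
            key = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        buckets[key].append(value)
    return {key: mean(vals) for key, vals in sorted(buckets.items())}


def within_n_sigma(values, n_sigma=3.0):
    """Keep only the values within n_sigma standard deviations of the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [v for v in values if abs(v - mu) <= n_sigma * sigma]


# Hypothetical 15-minute readings covering two hours.
start = datetime(2024, 1, 1, 0, 0)
values = [10, 12, 14, 16, 20, 20, 20, 20]
readings = [(start + timedelta(minutes=15 * i), v) for i, v in enumerate(values)]

hourly = average_by_window(readings, "hour")
daily = average_by_window(readings, "day")

# Hypothetical KPI series with one obvious outlier.
kpi = [0.6] * 20 + [5.0]
filtered = within_n_sigma(kpi)
print(len(kpi), "->", len(filtered))
```

Note that a single extreme value inflates the standard deviation, so with very short series the 3-sigma rule may fail to flag anything; it works best on the long histories a cloud platform typically accumulates.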

This step also involves identification of known maintenance events so that we can carefully form the training dataset around them.


The objective of exploratory data analysis is to identify the most important KPIs and draw conclusions from the physics involved for a given piece of equipment. One data science approach is to create a correlation table of the KPIs and see which KPIs have the strongest correlation.
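As a rough sketch of such a correlation table, the snippet below computes pairwise Pearson correlation coefficients in plain Python. The KPI names and values are invented for illustration only, and Pearson correlation is an assumed choice, since the article does not name a specific correlation measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_table(kpis):
    """Return a nested dict: table[a][b] is the correlation of KPI a with KPI b."""
    names = list(kpis)
    return {a: {b: pearson(kpis[a], kpis[b]) for b in names} for a in names}


# Hypothetical hourly averages for three KPIs.
kpis = {
    "electric_demand_kw": [310, 340, 365, 400, 420],
    "outside_temp_f": [70, 75, 80, 86, 90],
    "chilled_water_flow_gpm": [900, 950, 1000, 1080, 1100],
}

table = correlation_table(kpis)
for name, row in table.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```

Entries close to 1 or -1 point to KPI pairs that move together and are worth examining against the physics of the equipment.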

Figure 1. Condenser water temperatures, leaving in blue and entering in green. The chiller operating status (on/off) is in red. During chiller start-up and shutdown, the temperature data fluctuates and becomes very noisy. This data, as well as the data during the off status, should not be used for creating the probabilistic model of the chiller trained on historical data.

It is desirable to gather data and perform this analysis on as many units of the same equipment as possible. Correlating a set of KPIs for 3 chillers belonging to a FacilityConneX hospital customer shows that the strongest correlations are among the electric demand, the outside temperature, the chilled water flow, and the leaving and entering temperatures of the evaporator and condenser water loops.

After completing these steps and identifying these KPIs, it is also desirable to create derived KPIs for the predictive model's training dataset. In this particular case, the following derived KPIs are used for training the predictive model:

  • Electric demand in kW/Ton, which combines the electric demand and the water flow into a single KPI.
  • Condenser water temperature difference (entering – leaving).
  • Evaporator (or chilled) water temperature difference (leaving – entering).

The outside temperature can be kept as it is. In this case we identify the electric demand in kW/Ton as the dependent variable that our model needs to predict, while the rest of the KPIs are the independent variables in the model.
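The derived KPIs listed above can be sketched as follows. The article does not state how the cooling load in tons is obtained, so this example assumes the common water-side rule of thumb, tons = flow (gpm) × ΔT (°F) / 24, and uses hypothetical field names and values. The sign conventions for the two temperature differences follow the list above.

```python
def cooling_load_tons(flow_gpm, entering_f, leaving_f):
    """Chilled water cooling load in tons.

    Assumption: water-side rule of thumb, tons = gpm * delta_T(degF) / 24.
    """
    return flow_gpm * (entering_f - leaving_f) / 24.0


def derived_kpis(row):
    """Build the model inputs and the target from one averaged reading."""
    tons = cooling_load_tons(
        row["chw_flow_gpm"], row["chw_entering_f"], row["chw_leaving_f"]
    )
    return {
        # Target the model predicts.
        "kw_per_ton": row["electric_kw"] / tons,
        # Condenser water difference, entering minus leaving (as in the list).
        "condenser_dt": row["cw_entering_f"] - row["cw_leaving_f"],
        # Evaporator water difference, leaving minus entering (as in the list).
        "evaporator_dt": row["chw_leaving_f"] - row["chw_entering_f"],
        # Kept as it is.
        "outside_temp_f": row["outside_temp_f"],
    }


# Hypothetical hourly-averaged reading.
row = {
    "electric_kw": 300.0,
    "chw_flow_gpm": 1200.0,
    "chw_entering_f": 54.0,
    "chw_leaving_f": 44.0,
    "cw_entering_f": 85.0,
    "cw_leaving_f": 95.0,
    "outside_temp_f": 88.0,
}

features = derived_kpis(row)
print(features)
```

With these sample values the cooling load is 500 tons, which gives 0.6 kW/Ton.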

Once we have the subset of the derived KPIs, we still need to do some data preparation and cleaning based on the equipment domain knowledge and the physics related to the chillers. We would like to create a baseline of a predictive model that reflects the optimal chiller operation. Once we succeed in that, our model will be able to diagnose poor chiller performance and suggest predictive maintenance.

To do this, we restrict our training dataset to a range of near-optimal values, which in this case we took to be an electric demand below 1 kW/Ton. Equipment domain knowledge in this case suggests an optimal value of 0.6 kW/Ton. This step finalizes the training dataset.
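A minimal sketch of this final restriction is shown here. The thresholds come from the text above; the row structure, the function name, and the sample values are assumptions made for illustration.

```python
OPTIMAL_KW_PER_TON = 0.6  # optimal value suggested by domain knowledge
MAX_KW_PER_TON = 1.0      # upper bound used for the baseline training set


def baseline_rows(rows, max_kw_per_ton=MAX_KW_PER_TON):
    """Keep only readings that reflect near-optimal chiller operation."""
    return [r for r in rows if r["kw_per_ton"] < max_kw_per_ton]


# Hypothetical derived-KPI rows.
rows = [
    {"kw_per_ton": 0.55},
    {"kw_per_ton": 0.62},
    {"kw_per_ton": 0.95},
    {"kw_per_ton": 1.30},
    {"kw_per_ton": 2.10},
]

training = baseline_rows(rows)
print(len(training), "of", len(rows), "rows kept for the baseline")
```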

In Part 2, you will learn more about the Machine Learning Model and how to leverage the Predictive Results.
