
Research on cooling load estimation through optimal hybrid models based on Naive Bayes


Cooling load estimation is crucial for energy conservation in cooling systems, with applications like advanced air-conditioning control and chiller optimization. Traditional methods include energy simulation and regression analysis, but artificial intelligence outperforms them. Artificial intelligence models autonomously capture complex patterns, adapt, and scale with more data. They excel at predicting cooling loads influenced by various factors, like weather, building materials, and occupancy, leading to dynamic, responsive predictions and energy optimization. Traditional methods simplify real-world complexities, highlighting artificial intelligence’s role in precise cooling load forecasting for energy-efficient building management. This study evaluates Naive Bayes-based models for estimating building cooling load consumption. Three models are considered: a single Naive Bayes model, one optimized with the Mountain Gazelle Optimizer, and one optimized with the horse herd optimization algorithm. The models are trained on 70% of the data, which incorporates eight input variables related to the geometric and glazing characteristics of the buildings; 15% of the data is used for validation, and the remaining 15% is used for testing. Based on analysis through evaluation metrics, among the three candidate models, Naive Bayes optimized with the Mountain Gazelle Optimizer (NBMG) demonstrates remarkable accuracy and stability, reducing prediction errors by an average of 18% and 31% compared to the other two models (NB and NBHH) and achieving a maximum R2 value of 0.983 for cooling load prediction.


In the contemporary era, the escalating demand for energy, primarily from residential and commercial sectors, poses challenges in efficiently managing industries like transportation and construction while striving to conserve energy [1, 2]. Recent studies emphasize the substantial contribution of a growing population to energy consumption in residential buildings [3, 4]. Efficiently managing a building’s energy consumption requires a thorough understanding of its performance, starting with the identification of energy sources and usage patterns. Key energy resources in buildings include district heating supply, electricity, and natural gas, with applications such as heating, ventilation, and air-conditioning (HVAC) systems, lighting, elevators, hot water, and kitchen equipment consuming this energy [1]. Among these, HVAC systems, important for residential infrastructure, significantly impact cooling load (CL) and heating load (HL), constituting around 40% of energy consumption in office buildings [5, 6]. Improving energy efficiency in urban residential buildings and employing dynamic load prediction in construction management are crucial measures to enhance HVAC system performance and conserve energy [7]. Forecasting dynamic air-conditioning loads is essential for HVAC system design, enabling adjustments to initiation times, curbing peak demand, optimizing costs, and improving energy utilization in cooling storage systems [8]. Accurately predicting building cooling loads is challenging due to various influencing factors, including optical and thermal characteristics and meteorological data [9,10,11].

Achieving sustainability in thermal management relies on efficiently separating latent and sensible loads in the cooling process. An effective strategy involves integrating an indirect evaporative cooler (IEC) with a dehumidification system, providing both enhanced cooling efficiency and a sustainable solution to rising energy demands. The improved IEC, featuring three significant modifications, becomes a cornerstone in this approach, pushing the coefficient of performance (COP) for cooling to an impressive 78. The dehumidification component, operating at a COP of approximately 4–5, complements the cooling-only COP, resulting in an overall COP of 7–8 [12].

Efforts to create energy-efficient buildings and enhance energy conservation are necessary in managing energy demand and resources. A primary strategy involves early predictions of HL and CL in residential structures. Accurate forecasting requires data on building specifications and local weather conditions [13]. Climatic elements such as temperature, wind speed, solar radiation, atmospheric pressure, and humidity significantly influence the prediction of building cooling and heating loads. Factors like relative compactness, roof dimensions, wall and glazing areas, roof height, and overall surface area should be considered when assessing a building’s load [14]. Building energy simulation tools play a crucial role in designing energy-efficient buildings, allowing for performance maximization and comparisons between buildings. Simulation outcomes have demonstrated high accuracy in replicating real-world measurements [15]. Although time-intensive and requiring proficient users, simulation software effectively assesses the influence of building design factors. In some cases, contemporary techniques like statistical analysis, artificial neural networks, and machine learning are adopted to predict cooling and heating loads and analyze the impact of different parameters [16].

HVAC system optimization involves three main categories: simulation, regression analysis, and artificial intelligence (AI). Simulation tools like DOE-2 [17], ESP-r [9], TRNSYS [10], and EnergyPlus [11, 18] are utilized for cooling load estimation when comprehensive building data is available. However, challenges arise in accurately measuring various parameters, and simplifying building models demands significant time and resources [19]. Simulation software is also of limited use in real-time applications like online prediction or optimal operational control [20]. Regression analysis, known for its ease of use and computational efficiency, is preferred for diverse building types [21], employing both linear and nonlinear techniques [22, 23]. Additionally, research emphasizes the efficacy of ML and AI in building energy forecasting, favoring nonlinear approaches [24, 25]. Building cooling load prediction commonly involves key factors such as outdoor temperature, relative humidity, solar irradiation, and indoor occupancy schedules [26, 27]. Feature extraction methods, including engineering, statistical, and structural approaches, help condense raw data into informative formats, addressing the complexity introduced by historical data [21].

Numerous data mining methods have been applied to predict residential building energy requirements, including principal component analysis (PCA) [28], extreme learning machine (ELM) [29, 30], support vector machines (SVM) [31,32,33], k-means [34], deep learning [32, 33, 35,36,37], decision trees (DT) [38], various regression approaches, artificial neural networks [16, 39, 40], and hybrid models [41,42,43,44]. Researchers have employed diverse methodologies to forecast heating and cooling loads and energy demand in various building contexts. For instance, one study [45] predicted building heating load using the MLP method with meteorological data, while another simultaneously [46] predicted both cooling and heating loads with meteorological and date data inputs. Another study [16] examined a building’s energy performance using machine learning techniques, including general linear regression, artificial neural networks, decision trees, support vector regression (SVR), and ensemble inference models for cooling and heating load forecasting. Structural and interior design factors’ impact on cooling loads was explored through diverse regression models [47], and HVAC system energy demand was estimated from cooling and heating load requirements using different regression models. Commercial buildings’ cooling load and electric demand were forecasted for short-term and ultrashort-term management [48], enhancing energy efficiency through a hybrid SVR approach. Additionally, the SVR method was applied [49] to project cooling loads in a large coastal office building in China, introducing a novel vector-based SVR model for increased robustness and forecasting precision [50].

Naive Bayes is a fundamental probabilistic machine learning algorithm widely employed in various fields, including natural language processing, spam filtering, and classification tasks. It is rooted in Bayes’ theorem and assumes conditional independence between features, which is where the “naive” in its name originates. This simplifying assumption enables Naive Bayes to efficiently estimate the probability of a data point belonging to a particular class. Despite its simplicity, Naive Bayes often exhibits impressive classification performance, especially when dealing with high-dimensional and large datasets. To the authors’ knowledge, no published study has used Naïve Bayes as the prediction model for the CL of buildings. In this study, the prediction performance of a single Naïve Bayes model is compared with that of two optimized counterparts, one optimized with the Mountain Gazelle Optimizer (MGO) and the other with the horse herd optimization algorithm (HHO). The following sections describe the model and the selected optimizers and present a comparative analysis of the developed models.


Data collection

The main goal of this study is to forecast the cooling load (CL) in buildings. This is achieved by using experimental data extracted from energy consumption patterns documented in previous studies [51, 52]. Table 1 reports the statistical properties (minimum, maximum, average, and standard deviation) of the variables included in the training of the developed prediction models and the output. Input parameters include relative compactness (indicating the building’s surface area-to-volume ratio), surface area, roof area, wall area, orientation, overall height, glazing area (encompassing glazing, frame, and sash components), and the distribution of glazing area, and cooling load is the expected output variable.

Table 1 The statistical properties of the input variables of NB [51, 52]

Figure 1 visually represents the correlation among the variables examined in this study. The analysis depicted in the figure reveals compelling insights. Specifically, it becomes apparent that the overall height and relative compactness exhibit the most substantial positive impact on the cooling load. In contrast, roof area and surface area emerge as variables with the most pronounced negative influence on the cooling load. This graphical representation not only highlights the interrelationships between the variables but also emphasizes the varying degrees of impact each variable has on the cooling load.

Fig. 1 The correlation between input and output parameters

Overview of machine learning methods and optimizers

Naive Bayes (NB)

The Naive Bayes (NB) classifier stands as a robust probabilistic model founded on Bayes’ theorem, which simplifies modeling by assuming independence among input variables. Its potential for substantial improvements in prediction accuracy becomes evident when combined with kernel density approximations, as highlighted in [53, 54].

The NB classifier combines the Naive Bayes probability model with a decision rule. It relies on the maximum a posteriori (MAP) decision rule, a well-established method for identifying the most probable hypothesis from a given set of options. The corresponding classifier, a Bayes classifier, assigns a class label \(y={C}_{k}\), for some \(k\) between 1 and \(K\), as follows:

$$y=\underset{k\in \left\{1,\dots ,K\right\}}{\mathrm{argmax}}\ p\left({C}_{k}\right){\prod }_{i=1}^{n}p\left({x}_{i}\mid {C}_{k}\right)$$

In the provided equation, the variable \(y\) represents the predicted class label assigned by the Naive Bayes classifier. The term \({C}_{k}\) denotes a specific class, where \(k\) ranges from 1 to \(K\), the total number of classes. The variable \(n\) represents the total number of input features or variables, and \({x}_{i}\) refers to the \(i\)-th input feature or variable. The term \(p\left({C}_{k}\right)\) represents the prior probability of class \({C}_{k}\), while \(p({x}_{i}\mid {C}_{k})\) denotes the conditional probability of observing \({x}_{i}\) given the class \({C}_{k}\).
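The MAP rule can be illustrated with a minimal Python sketch. The source does not state how \(p({x}_{i}\mid {C}_{k})\) is modeled, so a Gaussian likelihood per feature is assumed here; the function names `fit_gaussian_nb` and `predict_map` and the toy data are hypothetical and are not taken from the study. The product of probabilities is evaluated as a sum of logarithms, which leaves the argmax unchanged and avoids numerical underflow.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate class priors p(C_k) and per-feature Gaussian parameters."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Small variance floor (an assumption) to avoid division by zero.
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances

def predict_map(X, classes, priors, means, variances):
    """Return argmax_k of log p(C_k) + sum_i log p(x_i | C_k) for each row."""
    # Shape (n_samples, n_classes, n_features): log of each Gaussian density.
    diff = X[:, None, :] - means[None, :, :]
    log_lik = -0.5 * (np.log(2.0 * np.pi * variances)[None, :, :]
                      + diff ** 2 / variances[None, :, :])
    log_post = np.log(priors)[None, :] + log_lik.sum(axis=2)
    return classes[np.argmax(log_post, axis=1)]

# Toy data (illustrative only): two well-separated classes, two features.
X_train = np.array([[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0],
                    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
model = fit_gaussian_nb(X_train, y_train)
pred = predict_map(np.array([[0.1, 0.0], [5.1, 5.0]]), *model)
```

Each query point is assigned to the class whose prior and feature likelihoods give the largest posterior score.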

Mountain gazelle optimizer (MGO)

The MGO algorithm is inspired by the behavior of mountain gazelles, which are grouped into bachelor herds, maternity herds, and solitary, territorial males. It aims to find optimal solutions by designating adult male gazelles in herd territories as global optima. Mathematically defined, the algorithm balances exploitation and exploration, gradually moving toward optimal solutions using four specified exploration mechanisms [55].

Territorial solitary males

Mature mountain gazelles establish solitary territories, vigorously defending them from other males seeking access to females. Equation (2) models these territories.

$$TSM={male}_{gzl}-\left|\left({ri}_{1}\times YH-{ri}_{2}\times X\left(t\right)\right)\times F\right|\times {Cof}_{r}$$

In Eq. (2), \({male}_{gzl}\) is the position vector of the best overall solution found so far (the adult male). The variables \({ri}_{1}\) and \({ri}_{2}\) are random integers that take a value of either 1 or 2 [55]. \(YH\) denotes the coefficient vector of the young male herd, determined using Eq. (3). Similarly, \(F\) is computed using Eq. (4). In each iteration, the coefficient vector \({Cof}_{r}\), selected at random, is updated and employed to augment the search capability. This coefficient vector is specified using Eq. (5).

$$YH={X}_{ra}\times \lfloor{r}_{1}\rfloor+{M}_{pr}\times \lceil{r}_{2}\rceil, ra=\left\{\lceil\frac{N}{3}\rceil\dots N\right\}$$

Here, \({X}_{ra}\) denotes a random solution (a young male) within the range of \(ra\). \({M}_{pr}\) refers to the average of \(\lceil\frac{N}{3}\rceil\) randomly selected search agents, and \(N\) is the total number of gazelles, while \({r}_{1}\) and \({r}_{2}\) are random values in \(\left[0, 1\right]\).

$$F={N}_{1}(D)\times {\text{exp}}\left(2-Iter\times \left(\frac{2}{MaxIter}\right)\right)$$

In Eq. (4), \({N}_{1}(D)\) is a random number drawn from a standard normal distribution for each of the \(D\) dimensions of the problem, and \(exp\) is the exponential function. \(Iter\) is the current iteration number, and \(MaxIter\) is the total number of iterations.

$${Cof}_{i}=\left\{\begin{array}{l}\left(x+1\right)+{r}_{3},\\ x\times {N}_{2}\left(D\right),\\ {r}_{4}\left(D\right),\\ {N}_{3}\left(D\right)\times {N}_{4}{\left(D\right)}^{2}\times {\text{cos}}\left(\left({r}_{4}\times 2\right)\times {N}_{3}\left(D\right)\right),\end{array}\right.$$
$$x=-1+Iter\times \left(\frac{-1}{MaxIter}\right)$$

Additionally, \({r}_{3}\), \({r}_{4}\), and \(rand\) are random numbers from 0 to 1 [55]. \({N}_{2}\), \({N}_{3}\), and \({N}_{4}\) denote random numbers drawn from a normal distribution, sized according to the dimensions of the problem. \(Iter\) indicates the current iteration number, while \(MaxIter\) is the number of iterations to be performed.
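The coefficient definitions in Eqs. (4) to (6) can be sketched in a few lines of Python. This is an illustrative reading of the equations, not the authors' implementation: the function name `mgo_coefficients` is hypothetical, \({r}_{4}\) is assumed to be a vector with one entry per dimension, and the random "row" of \({Cof}_{i}\) is assumed to be chosen uniformly among the four alternatives.

```python
import numpy as np

def mgo_coefficients(iter_no, max_iter, dim, rng):
    """Return F (Eq. 4) and the four candidate Cof vectors (Eq. 5)."""
    # Eq. (4): F = N1(D) * exp(2 - Iter * (2 / MaxIter))
    F = rng.standard_normal(dim) * np.exp(2.0 - iter_no * (2.0 / max_iter))
    # Eq. (6): x = -1 + Iter * (-1 / MaxIter), moving from -1 toward -2
    x = -1.0 + iter_no * (-1.0 / max_iter)
    r3 = rng.random()          # scalar in [0, 1)
    r4 = rng.random(dim)       # assumed to be one value per dimension
    N2, N3, N4 = (rng.standard_normal(dim) for _ in range(3))
    cof = np.vstack([
        np.full(dim, (x + 1.0) + r3),             # (x + 1) + r3
        x * N2,                                   # x * N2(D)
        r4,                                       # r4(D)
        N3 * N4 ** 2 * np.cos((r4 * 2.0) * N3),   # N3 * N4^2 * cos((r4 * 2) * N3)
    ])
    return F, cof

rng = np.random.default_rng(0)
F, cof = mgo_coefficients(iter_no=1, max_iter=100, dim=8, rng=rng)
cof_r = cof[rng.integers(4)]   # one randomly selected coefficient vector
```

Because \(F\) decays exponentially with the iteration count, the step sizes shrink as the search proceeds, which shifts the balance from exploration toward exploitation.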

Maternity herds

Maternity herds hold a crucial position within the mountain gazelles’ life cycle since they are principally responsible for producing strong male gazelles. Furthermore, male gazelles may actively participate in the delivery process of the offspring and confront the presence of younger males attempting to mate with females. This behavioral interplay is expressed mathematically in Eq. (7).

$$MH=\left(YH+{Cof}_{1,r}\right)+({ri}_{3}\times {male}_{gzl}-{ri}_{4}\times {X}_{rand})\times {Cof}_{1,r}$$

Here, \(YH\) signifies the impact factor vector of the young males, which is determined using Eq. (3). \({Cof}_{1,r}\) is a coefficient vector selected at random and determined using Eq. (5). \({ri}_{3}\) and \({ri}_{4}\) are random integers that can take on a value of either 1 or 2. \({male}_{gzl}\) denotes the best global solution (adult male) in the current iteration. Finally, \({X}_{rand}\) corresponds to the position vector of a gazelle chosen at random from the entire herd.

Bachelor male herds

Male gazelles create territories after they reach adulthood and engage in mating pursuit, a period marked by intense competition between young and adult males for territory control and access to females, as mathematically captured in Eq. (8).

$$YMH=\left(X\left(t\right)-D\right)+({ri}_{5}\times {male}_{gzl}-{ri}_{6}\times YH)\times {Cof}_{r}$$
$$D=(\left|X\left(t\right)\right|+\left|{male}_{gzl}\right|)\times (2\times {r}_{6}-1)$$

where \(X\left(t\right)\) is the position vector of the gazelle in the current iteration. The variables \({ri}_{5}\) and \({ri}_{6}\) are random integers that can take a value of either 1 or 2. \({male}_{gzl}\) is the position vector of the best solution (the adult male), and \({r}_{6}\) is a random number from 0 to 1.

Migration to search for food

Equation (10), which describes how mountain gazelles forage for food, takes into account their extraordinary sprinting and leaping speed.

$$MSF=\left(ul-ll\right)\times {r}_{7}+ll$$

where \(ul\) and \(ll\) represent the upper and lower limits of the problem, respectively. Furthermore, \({r}_{7}\) is a random number in \(\left[0, 1\right]\).

The pseudo-code of MGO is available as follows:

\(\% MGO\) setting

Inputs: The population size \(N\) and maximum number of iterations \(I\)

Outputs: Gazelle’s location and fitness potential

\(\%\) initialization

Create a random population using \({X}_{i}(i = 1, 2, ..., N)\)

Calculate the gazelle’s fitness level

While (the stopping condition is not met) do

For (each gazelle (\({X}_{i}\))) do

\(\%\) Territorial solitary males

Calculate \(TSM\) using Eq. (2)

\(\%\) Maternity herds

Calculate \(MH\) using Eq. (7)

\(\%\) Bachelor male herds

Calculate \(YMH\) using Eq. (8)

\(\%\) Migration to search for food

Calculate \(MSF\) using Eq. (10)

Calculate the fitness values of \(TSM,\ MH,\ YMH,\) and \(MSF\) and then add them to the habitat

End for

Sort the entire population in ascending order of fitness

Update \({best}_{{\text{Gazelle}}}\)

Keep the \(N\) best gazelles so that the population does not exceed its maximum size

End while

Return \({X}_{{\text{BestGazelle}}}\),\(best\ Fitness\)
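The pseudo-code above can be turned into a compact runnable sketch. The version below is an illustration under stated assumptions rather than the implementation used in the study: the sphere function is a toy objective (in the study, the objective would be a prediction error of NB), candidates are clipped to the bounds, \({r}_{7}\) is drawn per dimension, fitness is minimized, and the names `mgo`, `coefficients`, and `sphere` are hypothetical.

```python
import numpy as np

def sphere(v):
    """Toy objective (an assumption): sum of squares, minimum 0 at the origin."""
    return float(np.sum(v ** 2))

def coefficients(it, max_iter, dim, rng):
    """F from Eq. (4) and the four Cof alternatives from Eqs. (5)-(6)."""
    F = rng.standard_normal(dim) * np.exp(2.0 - it * (2.0 / max_iter))
    x = -1.0 + it * (-1.0 / max_iter)
    r4 = rng.random(dim)
    N2, N3, N4 = (rng.standard_normal(dim) for _ in range(3))
    cof = np.vstack([np.full(dim, x + 1.0 + rng.random()), x * N2, r4,
                     N3 * N4 ** 2 * np.cos(2.0 * r4 * N3)])
    return F, cof

def mgo(objective, ll, ul, dim, n=20, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    X = ll + (ul - ll) * rng.random((n, dim))        # random initial population
    fit = np.array([objective(v) for v in X])
    k = int(np.ceil(n / 3))
    for it in range(1, max_iter + 1):
        best = X[np.argmin(fit)].copy()              # adult male (best solution)
        new = []
        for i in range(n):
            F, cof = coefficients(it, max_iter, dim, rng)
            ri = rng.integers(1, 3, size=6)          # random integers in {1, 2}
            # Eq. (3): young male herd vector
            M_pr = X[rng.permutation(n)[:k]].mean(axis=0)
            X_ra = X[rng.integers(k - 1, n)]
            YH = X_ra * np.floor(rng.random()) + M_pr * np.ceil(rng.random())
            c1 = cof[rng.integers(4)]
            D = (np.abs(X[i]) + np.abs(best)) * (2.0 * rng.random() - 1.0)  # Eq. (9)
            cands = np.vstack([
                # Eq. (2): territorial solitary male
                best - np.abs((ri[0] * YH - ri[1] * X[i]) * F) * cof[rng.integers(4)],
                # Eq. (7): maternity herd
                (YH + c1) + (ri[2] * best - ri[3] * X[rng.integers(n)]) * c1,
                # Eq. (8): bachelor male herd
                (X[i] - D) + (ri[4] * best - ri[5] * YH) * cof[rng.integers(4)],
                # Eq. (10): migration to search for food
                (ul - ll) * rng.random(dim) + ll,
            ])
            new.append(np.clip(cands, ll, ul))       # bound handling (assumption)
        new = np.vstack(new)
        habitat = np.vstack([X, new])                # add candidates to the habitat
        hab_fit = np.concatenate([fit, np.array([objective(v) for v in new])])
        order = np.argsort(hab_fit)[:n]              # ascending sort, keep N best
        X, fit = habitat[order], hab_fit[order]
    return X[0], float(fit[0])

best_x, best_f = mgo(sphere, -5.0, 5.0, dim=4, n=12, max_iter=30, seed=1)
```

Because the current population is kept in the habitat before truncation, the best fitness can never get worse from one iteration to the next.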

Horse herd optimization algorithm (HOA)

The HOA is based on how horses behave in the wild [56]. It models six specific behaviors: grazing, hierarchy, imitation, sociability, roaming, and defense mechanisms. These behaviors are the foundation of HOA, directing the movement of horses in each cycle, as detailed in Eq. (11):

$${X}_{m}^{Iter,A}={\overrightarrow{V}}_{m}^{Iter,A}+{X}_{m}^{(Iter-1),A}, A (Age)=\alpha ,\beta ,\gamma ,\delta$$

where \({X}_{m}^{Iter,A}\) denotes the position of the \(m\)-th horse, \(A\) represents the horse’s age range, \(Iter\) is the current iteration, and \({\overrightarrow{V}}_{m}^{Iter,A}\) indicates the velocity vector of the horse. Horses typically live between 25 and 30 years, exhibiting various behaviors throughout their lifespan. They are therefore categorized into four age groups: \(\delta\) (0–5 years), \(\gamma\) (5–10 years), \(\beta\) (10–15 years), and \(\alpha\) (older than 15 years). To assign the ages, a comprehensive matrix of the horses’ responses is sorted by how well they perform. The top 10% form group \(\alpha\), the next 20% belong to group \(\beta\), and the remaining 30% and 40% are categorized as groups \(\gamma\) and \(\delta\), respectively. The motion vectors of horses in each age group and at each iteration of the algorithm are established following these behavioral patterns.

$$\begin{array}{l}{\overrightarrow{V}}_{m}^{Iter,\alpha }={\overrightarrow{G}}_{m}^{Iter,\alpha }+{\overrightarrow{D}}_{m}^{Iter,\alpha }\\ {\overrightarrow{V}}_{m}^{Iter,\beta }={\overrightarrow{G}}_{m}^{Iter,\beta }+{\overrightarrow{H}}_{m}^{Iter,\beta }+{\overrightarrow{S}}_{m}^{Iter,\beta }+{\overrightarrow{D}}_{m}^{Iter,\beta }\\ \begin{array}{l}{\overrightarrow{V}}_{m}^{Iter,\gamma }={\overrightarrow{G}}_{m}^{Iter,\gamma }+{\overrightarrow{H}}_{m}^{Iter,\gamma }+{\overrightarrow{S}}_{m}^{Iter,\gamma }+{\overrightarrow{I}}_{m}^{Iter,\gamma }+{\overrightarrow{D}}_{m}^{Iter,\gamma }+{\overrightarrow{R}}_{m}^{Iter,\gamma }\\ {\overrightarrow{V}}_{m}^{Iter,\delta }={\overrightarrow{G}}_{m}^{Iter,\delta }+{\overrightarrow{I}}_{m}^{Iter,\delta }+{\overrightarrow{R}}_{m}^{Iter,\delta }\end{array}\end{array}$$

To elucidate the derivation of the global matrix, Eqs. (13) and (14) are utilized, and a relationship between positions (\(X\)) and their respective cost values (\(C(X)\)) is established.

$$X=\left[\begin{array}{cccc}{x}_{1,1}& {x}_{1,2}& \dots & {x}_{1,d}\\ {x}_{2,1}& {x}_{2,2}& \dots & {x}_{2,d}\\ \vdots & \vdots & \ddots & \vdots \\ {x}_{m,1}& {x}_{m,2}& \dots & {x}_{m,d}\end{array}\right], C\left(X\right)=\left[\begin{array}{c}{c}_{1}\\ {c}_{2}\\ \vdots \\ {c}_{m}\end{array}\right]$$
$$Global\ Matrix=\left[X\ C\left(X\right)\right]=\left[\begin{array}{ccccc}{x}_{1,1}& {x}_{1,2}& \dots & {x}_{1,d}& {c}_{1}\\ {x}_{2,1}& {x}_{2,2}& \dots & {x}_{2,d}& {c}_{2}\\ \vdots & \vdots & \ddots & \vdots & \vdots \\ {x}_{m,1}& {x}_{m,2}& \dots & {x}_{m,d}& {c}_{m}\end{array}\right]$$

Here, \(m\) indicates the number of horses, and \(d\) is the number of dimensions of the problem. The global matrix is then sorted according to its final column, which contains the costs, and the age group of each horse is determined from this ordering. The velocity of horses in the 0–5-year age range (\(\delta\)) is as follows:

$${\overrightarrow{V}}_{m}^{Iter,\delta }=\left[{g}_{m}^{\left(Iter-1\right),\delta } {\omega }_{g}\left(\breve {u}+P\breve {l}\right)\left[{X}_{m}^{\left(Iter-1\right)}\right]\right]+\left[{i}_{m}^{\left(Iter-1\right),\delta } {\omega }_{i}\left[\left(\frac{1}{{P}^{N}}{\sum\limits_{j=1}^{{P}^{N}}}{\widehat{X}}_{j}^{Iter-1}\right)-{X}^{Iter-1}\right]\right]+\left[{r}_{m}^{\left(Iter-1\right),\delta } {\omega }_{r}P{X}^{Iter-1}\right]$$

The velocity of horses between 5 and 10 years age range:

$${\overrightarrow{V}}_{m}^{Iter,\gamma }=\left[{g}_{m}^{\left(Iter-1\right),\gamma } {\omega }_{g}\left(\breve {u}+P\breve {l}\right)\left[{X}_{m}^{\left(Iter-1\right)}\right]\right]+\left[{h}_{m}^{\left(Iter-1\right),\gamma }{\omega }_{h}\left[{X}_{*}^{\left(Iter-1\right)}-{X}_{m}^{\left(Iter-1\right)}\right]\right]+\left[{S}_{m}^{\left(Iter-1\right),\gamma } {\omega }_{S}\left[\left(\frac{1}{N}{\sum\limits_{j=1}^{N}}{X}_{j}^{Iter-1}\right)-{X}^{Iter-1}\right]\right]+\left[{i}_{m}^{\left(Iter-1\right),\gamma } {\omega }_{i}\left[\left(\frac{1}{{P}^{N}}{\sum\limits_{j=1}^{{P}^{N}}}{\widehat{X}}_{j}^{Iter-1}\right)-{X}^{Iter-1}\right]\right]-\left[{d}_{m}^{\left(Iter-1\right),\gamma } {\omega }_{d}\left[\left(\frac{1}{{q}^{N}}{\sum\limits_{j=1}^{{q}^{N}}}{\breve {X}}_{j}^{Iter-1}\right)-{X}^{Iter-1}\right]\right]+\left[{r}_{m}^{\left(Iter-1\right),\gamma } {\omega }_{r}P{X}^{Iter-1}\right]$$

The velocity of horses between 10 and 15 years age range:

$${\overrightarrow{V}}_{m}^{Iter,\beta }=\left[{g}_{m}^{\left(Iter-1\right),\beta } {\omega }_{g}\left(\breve {u}+P\breve {l}\right)\left[{X}_{m}^{\left(Iter-1\right)}\right]\right]+\left[{h}_{m}^{\left(Iter-1\right),\beta }{\omega }_{h}\left[{X}_{*}^{\left(Iter-1\right)}-{X}_{m}^{\left(Iter-1\right)}\right]\right]+\left[{S}_{m}^{\left(Iter-1\right),\beta } {\omega }_{S}\left[\left(\frac{1}{N}{\sum\limits_{j=1}^{N}}{X}_{j}^{Iter-1}\right)-{X}^{Iter-1}\right]\right]-\left[{d}_{m}^{\left(Iter-1\right),\beta } {\omega }_{d}\left[\left(\frac{1}{{q}^{N}}{\sum\limits_{j=1}^{{q}^{N}}}{\breve {X}}_{j}^{Iter-1}\right)-{X}^{Iter-1}\right]\right]$$

Horses that are 15 years or older exhibit the following velocity:

$${\overrightarrow{V}}_{m}^{Iter,\alpha }=\left[{g}_{m}^{\left(Iter-1\right),\alpha } {\omega }_{g}\left(\breve{u}+P\breve{l}\right)\left[{X}_{m}^{\left(Iter-1\right)}\right]\right]-[{d}_{m}^{\left(Iter-1\right),\alpha } {\omega }_{d}[(\frac{1}{{q}^{N}}{\sum\limits_{j=1}^{{q}^{N}}}{\breve X}_{j}^{Iter-1})-{X}^{Iter-1}]]$$
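The age assignment described for the global matrix (sorting by the cost column and splitting the herd into 10%, 20%, 30%, and 40% groups) can be sketched as follows. This is a minimal illustration, assuming that lower cost means better performance and that group sizes are rounded to whole horses; the function name `assign_age_groups` and the sample data are hypothetical.

```python
import numpy as np

def assign_age_groups(X, costs):
    """Sort [X | C(X)] by cost and label horses alpha/beta/gamma/delta."""
    global_matrix = np.column_stack([X, costs])
    order = np.argsort(global_matrix[:, -1])   # ascending cost: best horse first
    sorted_matrix = global_matrix[order]
    m = len(costs)
    # Group sizes for the 10% / 20% / 30% split; the rest (about 40%) is delta.
    n_alpha = int(round(0.10 * m))
    n_beta = int(round(0.20 * m))
    n_gamma = int(round(0.30 * m))
    labels = np.array(["delta"] * m, dtype=object)
    labels[:n_alpha] = "alpha"
    labels[n_alpha:n_alpha + n_beta] = "beta"
    labels[n_alpha + n_beta:n_alpha + n_beta + n_gamma] = "gamma"
    return sorted_matrix, labels

rng = np.random.default_rng(0)
X = rng.random((10, 3))                        # 10 horses, 3 dimensions
costs = np.array([5.0, 3.0, 9.0, 1.0, 7.0, 2.0, 8.0, 4.0, 6.0, 0.0])
sorted_matrix, labels = assign_age_groups(X, costs)
```

Each label then selects which of the four velocity expressions is applied to the corresponding row in the next iteration.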

Results and discussion

Hyperparameter results

Hyperparameters, such as alpha and binarize, are external configuration values that shape a model’s behavior. Unlike parameters, they are set in advance and are not learned from the data. Model performance depends strongly on tuning these hyperparameters, a process that requires both experimentation and the use of optimization techniques. Table 2 lists the hyperparameter values for the NBMG and NBHH models; reporting them makes the model setups easier to understand and to reproduce.

Table 2 The results of hyperparameters for NB

Prediction performance analysis

The predictive performance of the constructed models was assessed with five metrics, computed from the observed values (\({T}_{i}\)) and the corresponding predicted values (\({P}_{i}\)). Here, the symbols \(\overline{T }\) and \(\overline{P }\) denote the means of the observed and predicted values, respectively, and \(n\) is the total number of samples in the analyzed dataset. These metrics are described as follows:

  1. The coefficient of determination (R2) represents the proportion of the variability in the dependent variable that is explained by the independent variables included in the model.

    $${R}^{2}={\left(\frac{{\sum }_{i=1}^{n}({T}_{i}- \overline{T })({P}_{i}-\overline{P })}{\sqrt{\left[{\sum }_{i=1}^{n}{({T}_{i}-\overline{T })}^{2}\right]\left[{\sum }_{i=1}^{n}{({P}_{i}-\overline{P })}^{2}\right]}}\right)}^{2}$$
  2. Root-mean-square error (RMSE) is the square root of the mean of the squared differences between the predicted and observed values. It quantifies the typical magnitude of the errors the model makes when forecasting the target variable.

    $$RMSE=\sqrt{\frac{{\sum }_{i=1}^{n}{({P}_{i}-{T}_{i})}^{2}}{n}}$$
  3. Mean squared error (MSE) calculates the average of the squared differences between predicted and actual values, measuring how well a model’s predictions match the actual data. Lower MSE values indicate better predictive accuracy and a closer fit to the observed data.

    $$MSE= \frac{1}{n}{\sum }_{i=1}^{n}{({P}_{i}-{T}_{i})}^{2}$$
  4. Nash–Sutcliffe efficiency (NSE) assesses how well a model’s predictions match observed values, considering the variability of the observed data. Higher NSE values indicate better model performance, with 1 indicating a perfect match.

    $$NSE=1-\frac{{\sum }_{i=1}^{n}{({P}_{i}-{T}_{i})}^{2}}{{\sum }_{i=1}^{n}{({T}_{i}-\overline{T })}^{2}}$$
  5. MDAPE (mean directional absolute percentage error) expresses the average percentage difference between the predicted and actual values, considering the direction of the errors (underestimation or overestimation).

    $$RAE=\frac{{\sum }_{i=1}^{n}\left|{P}_{i}-{T}_{i}\right|}{{\sum }_{i=1}^{n}\left|{T}_{i}-\overline{T }\right|}$$
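The formulas above can be evaluated directly with NumPy. The sketch below follows the expressions as written, with R2 computed as the squared correlation between observed and predicted values; the last entry implements the final formula, which the source labels RAE. The function name `evaluate` and the sample values are hypothetical and are not data from the study.

```python
import numpy as np

def evaluate(T, P):
    """Compute the listed metrics from observed (T) and predicted (P) values."""
    T = np.asarray(T, dtype=float)
    P = np.asarray(P, dtype=float)
    err = P - T
    mse = float(np.mean(err ** 2))
    # Squared correlation between observed and predicted values.
    num = np.sum((T - T.mean()) * (P - P.mean()))
    den = np.sqrt(np.sum((T - T.mean()) ** 2) * np.sum((P - P.mean()) ** 2))
    return {
        "R2": float((num / den) ** 2),
        "RMSE": float(np.sqrt(mse)),
        "MSE": mse,
        "NSE": float(1.0 - np.sum(err ** 2) / np.sum((T - T.mean()) ** 2)),
        "RAE": float(np.sum(np.abs(err)) / np.sum(np.abs(T - T.mean()))),
    }

# Illustrative values only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = evaluate(observed, predicted)
```

Note that RMSE is the square root of MSE by construction, so the two always rank models in the same order.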

The following discussion comprehensively analyzes the model’s performance in predicting CL based on Table 3:

  • NB (single model): A minimum R2 value of 0.963 is reported for this model. Its high error values (\(RMSE=2.147\), \(MSE=4.610\), and \(MDAPE=7.482\)) indicate the low accuracy of this traditional model, especially in the testing phase. Its comparatively low NSE values of 0.966, 0.958, and 0.949 in the training, validation, and testing phases confirm the high variability of the estimated data.

  • NBMG (NB + MGO): This model reaches high R2 values of 0.986, 0.980, and 0.974 in the training, validation, and testing phases, and its error values are roughly half those of the single NB model. These results indicate the superior performance of MGO in enhancing the CL prediction capability of NB.

  • NBHH (NB + HHO): This model, with marginally lower R2 values (by less than 1%) and higher error values (by 20% on average), performs worse than NBMG. However, the MGO algorithm has notably enhanced the NB’s prediction accuracy.

Table 3 The result of developed models for NB

Figure 2 illustrates the trends in error values (RMSE, MSE) and R2 for the three models developed in this study. The comparison reveals a consistent decrease in R2 values from training to testing across all models, indicating that the models fit the training data somewhat better than they generalize to unseen data. Notably, all R2 columns of NBMG are higher than those of NB but similar in height to those of NBHH. In terms of RMSE and MSE, the NBMG model, particularly during the training phase, shows significantly lower error values than the other models. As detailed in Table 3 and depicted in Fig. 2, the NBMG model showed the best performance in predicting CL values, with an R2 of 0.986, an RMSE of 1.129 kW, and an MSE of 1.275 kW².

Fig. 2 The comparison of parameters

Figure 3 shows a scatter plot of the relationship between predicted and measured CL samples across the three phases (training, validation, and testing). The distribution of sample points in the plot is characterized by two main metrics: RMSE, which reflects the dispersion of the points, and R2, which reflects how closely the points follow a straight line. A high R2 value combined with a low RMSE value means that the predicted values closely align with the measured values and the points lie near the centerline (\(X = Y\)). To facilitate interpretation, two dashed lines mark 15% overestimation and underestimation. The NBMG and NBHH hybrid models stand out: with the lowest RMSE values and highest R2 values, they surpass the single NB model. The NBHH model is somewhat weaker than the NBMG model and has some data points with overestimation exceeding 15%.

Fig. 3 The scatter plot for developed hybrid models

Figure 4 uses a line plot to compare the variation in error values across the developed models. The range of errors for NBMG is approximately half that of NBHH, underscoring the advantage of the MGO algorithm. Furthermore, for NBMG, the error rate during the training phase is only half that observed in the other two phases, suggesting that NBMG performs best during the training phase. This observation is corroborated by Fig. 5, which illustrates the normal distribution of errors for NBMG, displaying a narrow bell-shaped curve indicative of a high concentration of errors near 0%.

Fig. 4. Line plot of the error rate percentage for the hybrid models

Fig. 5. The normal distribution plot of errors among the developed models

Figure 6 presents Taylor diagrams of the performance of the three predictive models, NB, NBMG, and NBHH. These diagrams summarize the agreement between observed and predicted CL using RMSE, correlation coefficients (CC), and normalized standard deviations. Notably, the NBMG model, which combines the NB model with the MGO optimizer, emerges as the best predictive model: its results lie closest to the ideal benchmark given by the experimental data. This alignment shows that the NBMG model captures the patterns of the cooling load more effectively than the other models under consideration.
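The three statistics that place a model on a Taylor diagram can be computed as in the sketch below. It assumes the standard construction, in which the RMSE shown is the centered one (the means are removed first) and standard deviations are population values normalized by the observed standard deviation; whether the figures in this study follow exactly that convention is an assumption. The function name and sample values are hypothetical.

```python
import math

def taylor_stats(observed, predicted):
    """Correlation coefficient, normalized standard deviation, and normalized
    centered RMSE of `predicted` relative to `observed`."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Population standard deviations of both series.
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    cc = cov / (std_o * std_p)
    # Centered RMSE: RMSE after subtracting each series' own mean.
    crmse = math.sqrt(
        sum(((p - mean_p) - (o - mean_o)) ** 2 for o, p in zip(observed, predicted)) / n
    )
    return {"cc": cc, "norm_std": std_p / std_o, "norm_crmse": crmse / std_o}

# Made-up values for illustration only; not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 41.0]

stats = taylor_stats(observed, predicted)
print(stats)
```

The diagram works because these quantities obey the law of cosines, norm_crmse² = 1 + norm_std² − 2·norm_std·cc, so a model with correlation near 1 and normalized standard deviation near 1 sits next to the reference point.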

Fig. 6. The Taylor diagram for the developed models

Figure 7 shows the kernel-smoothed distribution of the CL prediction errors of the three models (NB, NBMG, and NBHH) across the training, validation, and testing phases. The NB model displayed the highest errors during the testing phase, whereas the NBMG model showed the lowest errors, and the NBMG hybrid model was consistently favored across all phases. In the testing phase of the NB model, errors ranged widely from −25 to 30. In contrast, the NBMG model, which performed best during the training phase, had errors concentrated mostly within a narrower range of −15 to 15. This tighter error distribution underscores the higher predictive accuracy of the NBMG model compared with the NB model.
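A kernel-smoothed error distribution of this kind can be approximated with a Gaussian kernel density estimate. The sketch below is illustrative only: the kernel, the normal-reference bandwidth rule, the function name, and the sample errors are all assumptions, since the smoothing settings used for the figure are not stated.

```python
import math
from statistics import stdev

def gaussian_kde(errors, bandwidth=None):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(errors)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumed choice of bandwidth).
        bandwidth = 1.06 * stdev(errors) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of one Gaussian bump per observed error, scaled to integrate to 1.
        return norm * sum(math.exp(-0.5 * ((x - e) / bandwidth) ** 2) for e in errors)

    return density

# Made-up prediction errors for illustration only; not data from the study.
errors = [-12.0, -4.0, -1.0, 0.0, 2.0, 3.0, 9.0]
density = gaussian_kde(errors)

# Evaluate on a grid from -60 to 60 and check that the curve integrates to about 1.
step = 0.5
grid = [i * step for i in range(-120, 121)]
area = sum(density(x) for x in grid) * step

print("density at 0:", density(0.0))
print("approximate area under curve:", area)
```

Plotting `density` over the grid for each model and phase would give curves comparable to those described above, with a narrower curve indicating errors concentrated in a smaller range.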

Fig. 7. The kernel smooth plot of errors among the developed models


Accurate building cooling load forecasting is vital for optimizing HVAC systems, reducing costs, and enhancing energy efficiency. However, it remains challenging due to the complex interplay of building characteristics and meteorological data. Prior studies emphasize the effectiveness of machine learning in building energy forecasting, favoring nonlinear approaches, yet Naive Bayes, a foundational machine learning algorithm, had remained unexplored in this context. This study therefore evaluated three Naive Bayes-based models: a single model (NB), one optimized with the Mountain Gazelle Optimizer (MGO), denoted NBMG, and one optimized with the horse herd optimization (HHO) algorithm, denoted NBHH. The findings show that the NBMG model consistently outperformed its counterparts, reducing prediction errors by an average of 20% and achieving a maximum R2 value of 0.982 for cooling load prediction. This highlights the potential of machine learning, as exemplified by NBMG, to improve the precision of energy consumption forecasts. Such forecasts in turn support decision-makers in energy conservation and retrofit strategies, contributing to sustainable building operations and reduced environmental impact.

Availability of data and materials

Data can be shared upon request.


  1. Leitao J, Gil P, Ribeiro B, Cardoso A (2020) A survey on home energy management. IEEE Access 8:5699–5722

    Article  Google Scholar 

  2. Gong H, Rallabandi V, McIntyre ML, Hossain E, Ionel DM (2021) Peak reduction and long term load forecasting for large residential communities including smart homes with energy storage. IEEE Access 9:19345–19355

    Article  Google Scholar 

  3. Hannan MA, Faisal M, Ker PJ, Mun LH, Parvin K, Mahlia TMI, Blaabjerg F (2018) A review of Internet of energy based building energy management systems: issues and recommendations, Ieee. Access 6:38997–39014

    Article  Google Scholar 

  4. Sadeghian O, Moradzadeh A, Mohammadi-Ivatloo B, Abapour M, Anvari-Moghaddam A, Lim JS, Marquez FPG (2021) A comprehensive review on energy saving options and saving potential in low voltage electricity distribution networks: building and public lighting. Sustain Cities Soc 72:103064

    Article  Google Scholar 

  5. Sadeghian O, Moradzadeh A, Mohammadi-Ivatloo B, Abapour M, Garcia Marquez FP (2020) Generation units maintenance in combined heat and power integrated systems using the mixed integer quadratic programming approach. Energies (Basel). 13:2840

    Article  Google Scholar 

  6. Nami H, Anvari-Moghaddam A, Arabkoohsar A (2020) Application of CCHPs in a centralized domestic heating, cooling and power network—thermodynamic and economic implications. Sustain Cities Soc 60:102151

    Article  Google Scholar 

  7. Chen Q, Xia M, Lu T, Jiang X, Liu W, Sun Q (2019) Short-term load forecasting based on deep learning for end-user transformer subject to volatile electric heating loads. IEEE Access 7:162697–162707

    Article  Google Scholar 

  8. Yao Y, Lian Z, Liu S, Hou Z (2004) Hourly cooling load prediction by a combined forecasting model based on analytic hierarchy process. Int J Therm Sci 43:1107–1118

    Article  Google Scholar 

  9. Probst O (2004) Cooling load of buildings and code compliance. Appl Energy 77:171–186

    Article  ADS  Google Scholar 

  10. Bojić M, Yik F (2005) Cooling energy evaluation for high-rise residential buildings in Hong Kong. Energy Build 37:345–351

    Article  Google Scholar 

  11. Ansari FA, Mokhtar AS, Abbas KA, Adam NM (2005) A simple approach for building cooling load estimation. Am J Environ Sci 1:209–212

    Article  Google Scholar 

  12. Shahzad MW, Burhan M, Ybyraiymkul D, Oh SJ, Ng KC (2019) An improved indirect evaporative cooler experimental investigation. Appl Energy. 256:113934.

    Article  Google Scholar 

  13. Moradzadeh A, Moayyed H, Zakeri S, Mohammadi-Ivatloo B, Aguiar AP (2021) Deep learning-assisted short-term load forecasting for sustainable management of energy in microgrid. Inventions 6:15

    Article  Google Scholar 

  14. Chen S, Zhang X, Wei S, Yang T, Guan J, Yang W, Qu L, Xu Y (2019) An energy planning oriented method for analyzing spatial-temporal characteristics of electric loads for heating/cooling in district buildings with a case study of one university campus. Sustain Cities Soc 51:101629

    Article  Google Scholar 

  15. Tsanas A, Goulermas JY, Vartela V, Tsiapras D, Theodorakis G, Fisher AC, Sfirakis P (2009) The Windkessel model revisited: a qualitative analysis of the circulatory system. Med Eng Phys 31:581–588

    Article  PubMed  Google Scholar 

  16. Chou J-S, Bui D-K (2014) Modeling heating and cooling loads by artificial intelligence for energy-efficient building design. Energy Build 82:437–446

    Article  Google Scholar 

  17. Bojic M, Yik F, Wan K, Burnett J (2000) Investigations of cooling loads in high-rise residential buildings in Hong Kong, in: Thermal Sciences 2000. Proceedings of the International Thermal Science Seminar. Volume 1, Begel House Inc

  18. Chou SK, Chang WL (1997) Large building cooling load and energy use estimation. Int J Energy Res 21:169–183

    Article  Google Scholar 

  19. Sodha MS, Kaur B, Kumar A, Bansal NK (1986) Comparison of the admittance and Fourier methods for predicting heating/cooling loads. Sol Energy (United Kingdom) 36

  20. Mui KW, Wong LT (2007) Cooling load calculations in subtropical climate. Build Environ 42:2498–2504

    Article  Google Scholar 

  21. Shin M, Do SL (2016) Prediction of cooling energy use in buildings using an enthalpy-based cooling degree days method in a hot and humid climate. Energy Build 110:57–70

    Article  Google Scholar 

  22. Yun K, Luck R, Mago PJ, Cho H (2012) Building hourly thermal load prediction using an indexed ARX model. Energy Build 54:225–233

    Article  Google Scholar 

  23. Korolija I, Zhang Y, Marjanovic-Halburd L, Hanby VI (2013) Regression models for predicting UK office building energy consumption from heating and cooling demands. Energy Build 59:214–227

    Article  Google Scholar 

  24. Deb C, Eang LS, Yang J, Santamouris M (2016) Forecasting diurnal cooling energy load for institutional buildings using artificial neural networks. Energy Build 121:284–297

    Article  Google Scholar 

  25. Gunay B, Shen W, Newsham G (2017) Inverse blackbox modeling of the heating and cooling load in office buildings. Energy Build 142:200–210

    Article  Google Scholar 

  26. Kavaklioglu K (2011) Modeling and prediction of Turkey’s electricity consumption using support vector regression. Appl Energy 88:368–375

    Article  ADS  Google Scholar 

  27. Li Q, Meng Q, Cai J, Yoshino H, Mochida A (2009) Applying support vector machine to predict hourly cooling load in the building. Appl Energy 86:2249–2256

    Article  ADS  Google Scholar 

  28. Moradzadeh A, Sadeghian O, Pourhossein K, Mohammadi-Ivatloo B, Anvari-Moghaddam A (2020) Improving residential load disaggregation for sustainable development of energy via principal component analysis. Sustainability 12:3158

    Article  Google Scholar 

  29. Zhao J, Liu X (2018) A hybrid method of dynamic cooling and heating load forecasting for office buildings based on artificial intelligence and regression analysis. Energy Build 174:293–308

    Article  Google Scholar 

  30. Roy SS, Roy R, Balas VE (2018) Estimating heating load in buildings using multivariate adaptive regression splines, extreme learning machine, a hybrid model of MARS and ELM. Renew Sustain Energy Rev 82:4256–4268

    Article  Google Scholar 

  31. Moradzadeh A, Zeinal-Kheiri S, Mohammadi-Ivatloo B, Abapour M, Anvari-Moghaddam A (2020) Support vector machine-assisted improvement residential load disaggregation, in: 2020 28th Iranian Conference on Electrical Engineering (ICEE). IEEE 1–6

  32. Luo XJ, Oyedele LO, Ajayi AO, Akinade OO (2020) Comparative study of machine learning-based multi-objective prediction framework for multiple building energy loads. Sustain Cities Soc 61:102283

    Article  Google Scholar 

  33. Moradzadeh A, Zakeri S, Shoaran M, Mohammadi-Ivatloo B, Mohammadi F (2020) Short-term load forecasting of microgrid via hybrid support vector regression and long short-term memory algorithms. Sustainability 12:7076

    Article  Google Scholar 

  34. Ding Y, Su H, Kong X, Zhang Z (2020) Ultra-short-term building cooling load prediction model based on feature set construction and ensemble machine learning. IEEE Access 8:178733–178745

    Article  Google Scholar 

  35. Wang Z, Hong T, Piette MA (2019) Data fusion in predicting internal heat gains for office buildings through a deep learning approach. Appl Energy 240:386–398

    Article  ADS  Google Scholar 

  36. Roy SS, Samui P, Nagtode I, Jain H, Shivaramakrishnan V, Mohammadi-Ivatloo B (2020) Forecasting heating and cooling loads of buildings: a comparative performance analysis. J Ambient Intell Humaniz Comput 11:1253–1264

    Article  Google Scholar 

  37. Song J, Xue G, Pan X, Ma Y, Li H (2020) Hourly heat load prediction model based on temporal convolutional neural network. IEEE Access 8:16726–16741

    Article  Google Scholar 

  38. Yu Z, Haghighat F, Fung BCM, Yoshino H (2010) A decision tree method for building energy demand modeling. Energy Build 42:1637–1646

    Article  Google Scholar 

  39. Ahmad T, Chen H (2018) Short and medium-term forecasting of cooling and heating load demand in building environment with data-mining based approaches. Energy Build 166:460–476

    Article  Google Scholar 

  40. Moradzadeh A, Mansour-Saatloo A, Mohammadi-Ivatloo B, Anvari-Moghaddam A (2020) Performance evaluation of two machine learning techniques in heating and cooling loads forecasting of residential buildings. Appl Sci 10:3829

    Article  CAS  Google Scholar 

  41. Geysen D, De Somer O, Johansson C, Brage J, Vanhoudt D (2018) Operational thermal load forecasting in district heating networks using machine learning and expert advice. Energy Build 162:144–153

    Article  Google Scholar 

  42. Cui B, Fan C, Munk J, Mao N, Xiao F, Dong J, Kuruganti T (2019) A hybrid building thermal modeling approach for predicting temperatures in typical, detached, two-story houses. Appl Energy 236:101–116

    Article  ADS  Google Scholar 

  43. Wang R, Lu S, Feng W (2020) A novel improved model for building energy consumption prediction based on model integration. Appl Energy 262:114561

    Article  Google Scholar 

  44. Chen Q, M Kum Ja, Burhan M, Akhtar FH, Shahzad MW, Ybyraiymkul D, Ng KC (2021) A hybrid indirect evaporative cooling-mechanical vapor compression process for energy-efficient air conditioning. Energy Convers Manag. 248:114798.

    Article  Google Scholar 

  45. Wong SL, Wan KKW, Lam TNT (2010) Artificial neural networks for energy analysis of office buildings with daylighting. Appl Energy 87:551–557

    Article  ADS  Google Scholar 

  46. Paudel S, Elmtiri M, Kling WL, Le Corre O, Lacarrière B (2014) Pseudo dynamic transitional modeling of building heating energy demand using artificial neural network. Energy Build 70:81–93

    Article  Google Scholar 

  47. Schiavon S, Lee KH, Bauman F, Webster T (2010) Influence of raised floor on zone design cooling load in commercial buildings. Energy Build 42:1182–1191

    Article  Google Scholar 

  48. Fan C, Wang J, Gang W, Li S (2019) Assessment of deep recurrent neural network-based strategies for short-term building energy predictions. Appl Energy 236:700–710

    Article  ADS  Google Scholar 

  49. Zhong H, Wang J, Jia H, Mu Y, Lv S (2019) Vector field-based support vector regression for building energy consumption prediction. Appl Energy 242:403–414

    Article  ADS  Google Scholar 

  50. B.S.A.J. khiavi; B.N.E.K.A.R.T.K. hadi Sadaghat (2023) The utilization of a Naïve Bayes model for predicting the energy consumption of buildings. J Art Intel Syst Modelling 01.

  51. Zhou G, Moayedi H, Bahiraei M, Lyu Z (2020) Employing artificial bee colony and particle swarm techniques for optimizing a neural network in prediction of heating and cooling loads of residential buildings. J Clean Prod 254:120082

    Article  Google Scholar 

  52. Pessenlehner W, Mahdavi A (2023) Building morphology, transparence, and energy performance, na

    Google Scholar 

  53. Hastie T, Tibshirani R, Friedman JH, Friedman JH (2009) The elements of statistical learning: data mining, inference, and prediction, Springer, New York City

  54. Piryonesi SM, El-Diraby TE (2020) Role of data analytics in infrastructure asset management: overcoming data size and quality problems. J Transportation Eng Part B: Pavements 146:4020022

    Article  Google Scholar 

  55. Abdollahzadeh B, Gharehchopogh FS, Khodadadi N, Mirjalili S (2022) Mountain gazelle optimizer: a new nature-inspired metaheuristic algorithm for global optimization problems. Adv Eng Softw 174:103282

    Article  Google Scholar 

  56. MiarNaeimi F, Azizyan G, Rashki M (2021) Horse herd optimization algorithm: a nature-inspired algorithm for high-dimensional optimization problems. Knowl Based Syst 213:106711

    Article  Google Scholar 

Download references


I would like to take this opportunity to acknowledge that there are no individuals or organizations that require acknowledgment for their contributions to this work.


This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information



The author contributed to the study’s conception and design. Data collection, simulation, and analysis were performed by “YX.”

Corresponding author

Correspondence to Ying Xu.

Ethics declarations

Competing interests

The author declares no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Xu, Y. Research on cooling load estimation through optimal hybrid models based on Naive Bayes. J. Eng. Appl. Sci. 71, 75 (2024).
