Research on cooling load estimation through optimal hybrid models based on Naive Bayes

Cooling load estimation is crucial for energy conservation in cooling systems, with applications like advanced air-conditioning control and chiller optimization. Traditional methods include energy simulation and regression analysis, but artificial intelligence outperforms them. Artificial intelligence models autonomously capture complex patterns, adapt, and scale with more data. They excel at predicting cooling loads influenced by various factors, like weather, building materials, and occupancy, leading to dynamic, responsive predictions and energy optimization. Traditional methods simplify real-world complexities, highlighting artificial intelligence's role in precise cooling load forecasting for energy-efficient building management. This study evaluates Naive Bayes-based models for estimating building cooling load consumption. These models encompass a single model, one optimized with the Mountain Gazelle Optimizer, and another optimized with the horse herd optimization algorithm. The training dataset consists of 70% of the data, which incorporates eight input variables related to the geometric and glazing characteristics of the buildings. A further 15% of the dataset is used for validation, and the remaining 15% is used for testing. Based on analysis through evaluation metrics, among the three candidate models, Naive Bayes optimized with the Mountain Gazelle Optimizer (NBMG) demonstrates remarkable accuracy and stability, reducing prediction errors by an average of 18% and 31% compared to the other two models (NB and NBHH) and achieving a maximum R² value of 0.983 for cooling load prediction.


Introduction
In the contemporary era, the escalating demand for energy, primarily from residential and commercial sectors, poses challenges in efficiently managing industries like transportation and construction while striving to conserve energy [1,2]. Recent studies emphasize the substantial contribution of a growing population to energy consumption in residential buildings [3,4]. Efficiently managing a building's energy consumption requires a thorough understanding of its performance, starting with the identification of energy sources and usage patterns. Key energy resources in buildings include district heating supply, electricity, and natural gas, with applications such as heating, ventilation, and air-conditioning (HVAC) systems, lighting, elevators, hot water, and kitchen equipment consuming this energy [1]. Among these, HVAC systems, important for residential infrastructure, significantly impact cooling load (CL) and heating load (HL), constituting around 40% of energy consumption in office buildings [5,6]. Improving energy efficiency in urban residential buildings and employing dynamic load prediction in construction management are crucial measures to enhance HVAC system performance and conserve energy [7]. Forecasting dynamic air-conditioning loads is essential for HVAC system design, enabling adjustments to initiation times, curbing peak demand, optimizing costs, and improving energy utilization in cooling storage systems [8]. Accurately predicting building cooling loads is challenging due to various influencing factors, including optical and thermal characteristics and meteorological data [9-11].
Achieving sustainability in thermal management relies on efficiently separating latent and sensible loads in the cooling process. An effective strategy involves integrating an indirect evaporative cooler (IEC) with a dehumidification system, providing both enhanced cooling efficiency and a sustainable solution to rising energy demands. The improved IEC, featuring three significant modifications, becomes a cornerstone in this approach, pushing the coefficient of performance (COP) for cooling to an impressive 78. The dehumidification component, operating at a COP of approximately 4-5, complements the cooling-only COP, resulting in an overall COP of 7-8 [12].
Efforts to create energy-efficient buildings and enhance energy conservation are necessary in managing energy demand and resources. A primary strategy involves early predictions of HL and CL in residential structures. Accurate forecasting requires data on building specifications and local weather conditions [13]. Climatic elements such as temperature, wind speed, solar radiation, atmospheric pressure, and humidity significantly influence the prediction of building cooling and heating loads. Factors like relative compactness, roof dimensions, wall and glazing areas, roof height, and overall surface area should be considered when assessing a building's load [14]. Building energy simulation tools play a crucial role in designing energy-efficient buildings, allowing for performance maximization and comparisons between buildings. Simulation outcomes have demonstrated high accuracy in replicating real-world measurements [15]. Although time-intensive and requiring proficient users, simulation software effectively assesses the influence of building design factors. In some cases, contemporary techniques like statistical analysis, artificial neural networks, and machine learning are adopted to predict cooling and heating loads and analyze the impact of different parameters [16].
HVAC system optimization involves three main categories: simulation, regression analysis, and artificial intelligence (AI). Simulation tools like DOE-2 [17], ESP-r [9], TRNSYS [10], and EnergyPlus [11,18] are utilized for cooling load estimation when comprehensive building data is available. However, challenges arise in accurately measuring various parameters, and simplifying building models demands significant time and resources [19]. Simulation software is also of limited use in real-time applications like online prediction or optimal operational control [20]. Regression analysis, known for its ease of use and computational efficiency, is preferred for diverse building types [21], employing both linear and nonlinear techniques [22,23]. Additionally, research emphasizes the efficacy of machine learning (ML) and AI in building energy forecasting, favoring nonlinear approaches [24,25]. Building cooling load prediction commonly involves key factors such as outdoor temperature, relative humidity, solar irradiation, and indoor occupancy schedules [26,27]. Feature extraction methods, including engineering, statistical, and structural approaches, help condense raw data into informative formats, addressing the complexity introduced by historical data [21].
Numerous data mining methods have been applied to predict residential building energy requirements, including principal component analysis (PCA) [28], extreme learning machine (ELM) [29,30], support vector machines (SVM) [31-33], k-means [34], deep learning [32,33,35-37], decision trees (DT) [38], various regression approaches, artificial neural networks [16,39,40], and hybrid models [41-44]. Researchers have employed diverse methodologies to forecast heating and cooling loads and energy demand in various building contexts. For instance, one study [45] predicted building heating load using the multilayer perceptron (MLP) method with meteorological data, while another [46] simultaneously predicted both cooling and heating loads with meteorological and date data inputs. Another study [16] examined a building's energy performance using machine learning techniques, including general linear regression, artificial neural networks, decision trees, support vector regression (SVR), and ensemble inference models for cooling and heating load forecasting. The impact of structural and interior design factors on cooling loads was explored through diverse regression models [47], and HVAC system energy demand was estimated from cooling and heating load requirements using different regression models. Commercial buildings' cooling load and electric demand were forecasted for short-term and ultrashort-term management [48], enhancing energy efficiency through a hybrid SVR approach. Additionally, the SVR method was applied [49] to project cooling loads in a large coastal office building in China, introducing a novel vector-based SVR model for increased robustness and forecasting precision [50].
Naive Bayes is a fundamental probabilistic machine learning algorithm widely employed in various fields, including natural language processing, spam filtering, and classification tasks. It is rooted in Bayes' theorem and assumes conditional independence between features, which is where the "naive" in its name originates. This simplifying assumption enables Naive Bayes to efficiently estimate the probability of a data point belonging to a particular class. Despite its simplicity, Naive Bayes often exhibits impressive classification performance, especially when dealing with high-dimensional and large datasets. To date, no article has used Naive Bayes as the prediction model for the CL of buildings. In this study, the prediction performance of a single Naive Bayes model is compared with that of two optimized counterparts, one optimized with the Mountain Gazelle Optimizer (MGO) and the other with the horse herd optimization algorithm (HHO). The following sections present a description of the model and the selected optimizers and a comparative analysis of the developed models.

Data collection
The main goal of this study is to forecast the cooling load (CL) in buildings. This is achieved by using experimental data extracted from energy consumption patterns documented in previous studies [51,52]. Table 1 reports the statistical properties (minimum, maximum, average, and standard deviation) of the variables included in the training of the developed prediction models and the output. Input parameters include relative compactness (indicating the building's surface area-to-volume ratio), surface area, roof area, wall area, orientation, overall height, glazing area (encompassing glazing, frame, and sash components), and the distribution of glazing area. Cooling load is the output variable.
Figure 1 shows the correlation among the variables examined in this study. Overall height and relative compactness exhibit the most substantial positive correlation with the cooling load. In contrast, roof area and surface area emerge as the variables with the most pronounced negative influence on the cooling load. The figure thus highlights both the interrelationships between the variables and the varying degree to which each variable affects the cooling load.
Table 1 The statistical properties of the input variables of NB [51,52]
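The 70/15/15 partition into training, validation, and testing subsets described in the abstract can be sketched as follows. This is a minimal illustration only: the random shuffle, the fixed seed, the function name `split_dataset`, and the 100-sample placeholder dataset are assumptions made for the example and are not details reported in this study.

```python
import random

def split_dataset(samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle samples and split them into training, validation, and test sets.

    The 70/15/15 fractions follow the text; the shuffle and seed are assumptions.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    n_val = round(val_frac * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains (about 15%)
    return train, val, test

# Placeholder dataset of 100 sample indices (hypothetical size).
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Fixing the seed keeps the three subsets reproducible across runs, which matters when several models are compared on the same partition.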

Naive Bayes (NB)
The Naive Bayes (NB) classifier is a probabilistic model founded on Bayes' theorem, which simplifies modeling by assuming independence among input variables. Its prediction accuracy can improve substantially when it is combined with kernel density approximations, as highlighted in [53,54].
The NB classifier combines the Naive Bayes probability model with a decision rule. It relies on the maximum a posteriori (MAP) decision rule, a well-established method for identifying the most probable hypothesis from a given set of options. The corresponding classifier, a Bayes classifier, assigns a class label $y = C_k$ for some $k$ in $1, \dots, K$, thereby categorizing data points into predefined classes:
$$ y = \underset{k \in \{1, \dots, K\}}{\arg\max} \; p(C_k) \prod_{i=1}^{n} p(x_i \mid C_k) \qquad (1) $$
In Eq. (1), the variable $y$ represents the predicted class label assigned by the Naive Bayes classifier. The term $C_k$ denotes a specific class, where $k$ ranges from 1 to $K$, the total number of classes. The variable $n$ represents the total number of input features or variables, and $x_i$ refers to the $i$-th input feature or variable. The term $p(C_k)$ represents the prior probability of class $C_k$, while $p(x_i \mid C_k)$ denotes the conditional probability of observing $x_i$ given the class $C_k$.
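The MAP decision rule described above can be illustrated with a short sketch. It works in log space, which avoids numerical underflow when many conditional probabilities are multiplied. The Gaussian form of the conditional probability, the function names, and all numbers in the example are assumptions made for illustration; the text does not state which likelihood model the developed NB models use.

```python
import math

def gaussian_log_pdf(x, mean, var):
    """Log density of a univariate normal distribution (assumed likelihood)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def map_predict(x, priors, means, variances):
    """Return the class index k maximizing log p(C_k) + sum_i log p(x_i | C_k)."""
    best_k, best_score = None, -math.inf
    for k, prior in enumerate(priors):
        score = math.log(prior)
        for x_i, mu, var in zip(x, means[k], variances[k]):
            score += gaussian_log_pdf(x_i, mu, var)
        if score > best_score:
            best_k, best_score = k, score
    return best_k

# Hypothetical two-class, two-feature example (all numbers are illustrative).
priors = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
print(map_predict([0.2, -0.1], priors, means, variances))  # 0
print(map_predict([2.8, 3.1], priors, means, variances))   # 1
```

Because the logarithm is monotonic, maximizing the summed log terms selects the same class as maximizing the product of the prior and the conditional probabilities.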

Mountain gazelle optimizer (MGO)
The MGO algorithm is inspired by the behavior of mountain gazelles, which are grouped into bachelor herds, maternity herds, and solitary, territorial males.It aims to find optimal solutions by designating adult male gazelles in herd territories as global optima.Mathematically defined, the algorithm balances exploitation and exploration, gradually moving toward optimal solutions using four specified exploration mechanisms [55].
Territorial solitary males
Mature mountain gazelles establish solitary territories and vigorously defend them from other males seeking access to females. Equation (2) models these territories. In Eq. (2), $male_{gzl}$ is the position vector of the adult male, which represents the best overall solution. The variables $ri_1$ and $ri_2$ are random integers that take a value of either 1 or 2 [55]. $YH$ denotes the coefficient vector of the young male herd, determined using Eq. (3). Similarly, $F$ is computed using Eq. (4). In each iteration, the coefficient vector $Cof_r$, selected at random, is updated and employed to strengthen the search capability. This coefficient vector is specified using Eq. (5).
Here, $X_{ra}$ denotes a random solution (young male) within the range of $ra$. $M_{pr}$ refers to the average number of search agents, which is equal to $\lceil N/3 \rceil$, and $N$ is the total number of gazelles, while $r_1$ and $r_2$ are random values in [0, 1].
Equation (4) incorporates several variables associated with the problem's dimensions. $N_1$ is a random number drawn from a standard normal distribution, and exp denotes the exponential function. $Iter$ is the current iteration number, and $MaxIter$ is the total number of iterations.
Additionally, $r_3$, $r_4$, and $rand$ are random numbers from 0 to 1 [55]. $N_2$, $N_3$, and $N_4$ denote random numbers drawn from a normal distribution whose size matches the dimensions of the problem.
Maternity herds
Maternity herds hold a crucial position within the mountain gazelles' life cycle since they are principally responsible for producing strong male gazelles. Furthermore, male gazelles may actively participate in the delivery of the offspring and confront younger males attempting to mate with females. This behavior is expressed mathematically in Eq. (7).
Here, $YH$ is the impact factor vector of the young males, which is determined using Eq. (3). $Cof_{2,r}$ and $Cof_{3,r}$ are random coefficient vectors determined independently using Eq. (5). $ri_3$ and $ri_4$ are random integers that take a value of either 1 or 2. $male_{gzl}$ denotes the best global solution (adult male) in the current iteration. Finally, $X_{rand}$ is the position vector of a gazelle chosen at random from the entire herd.
Bachelor male herds
Male gazelles create territories after they reach adulthood and begin to pursue mates, a period marked by intense competition between young and adult males for territory control and access to females, as mathematically captured in Eq. (8).
where $X(t)$ is the gazelle's position vector in the current iteration. The variables $ri_5$ and $ri_6$ are random integers that take a value of either 1 or 2. $male_{gzl}$ is the position vector of the male gazelle (the best solution), and $r_6$ is a random number from 0 to 1.
Migration to search for food
Equation (10), which describes how mountain gazelles forage for food, takes into account their extraordinary sprinting and leaping ability.
where $ul$ and $ll$ represent the upper and lower limits of the problem, respectively. Furthermore, $r_7$ is a random number in [0, 1]. The pseudo-code of MGO is as follows:
    % MGO setting
    Inputs: The population size N and maximum number of iterations I
    Outputs: Gazelle's location and fitness potential
    % Initialization
    Create a random population X_i (i = 1, 2, ..., N)
    Calculate the gazelles' fitness levels
    While (the stopping condition is not met) do
        For (each gazelle X_i) do
            % Alone male realm
            Calculate TSM using Eq. (2)
            % Mother and child herd
            Calculate MH using Eq. (7)
            % Young male herd
            Calculate YMH using Eq. (8)
            % Migration to search for food
            Calculate MSF using Eq. (10)
            Calculate the fitness values of TSM, MH, YMH, and MSF and then add them to the habitat
        End for
        Sort the entire population in ascending order
        Update the best gazelle
        Save the N best gazelles in the max number of population
    End while
    Return X_BestGazelle, best fitness
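The loop structure of the pseudo-code can be sketched in a few lines. This is a deliberately reduced illustration, not the MGO algorithm of [55]: only the migration candidate is generated, read here as a uniform draw between the lower and upper limits scaled by a random number in [0, 1], which is an interpretation of the description of Eq. (10). The TSM, MH, and YMH candidates are omitted because their equations are not reproduced in this text. The function names, the sphere test function, and all parameter values are hypothetical.

```python
import random

def migrate(lower, upper, rng):
    """Migration-style step: a new position drawn uniformly inside the bounds."""
    return [lo + (hi - lo) * rng.random() for lo, hi in zip(lower, upper)]

def mgo_skeleton(fitness, lower, upper, n_gazelles=10, max_iter=50, seed=0):
    """Loop structure of the pseudo-code with only the migration candidate.

    TSM, MH, and YMH are omitted because their equations are not given here.
    """
    rng = random.Random(seed)
    population = [migrate(lower, upper, rng) for _ in range(n_gazelles)]
    for _ in range(max_iter):
        candidates = [migrate(lower, upper, rng) for _ in population]
        habitat = population + candidates  # add new candidates to the habitat
        habitat.sort(key=fitness)          # ascending: lower fitness is better
        population = habitat[:n_gazelles]  # keep the N best gazelles
    best = population[0]
    return best, fitness(best)

def sphere(x):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(v * v for v in x)

best, best_fit = mgo_skeleton(sphere, lower=[-5.0, -5.0], upper=[5.0, 5.0])
print(best, best_fit)
```

With only the migration step, the sketch reduces to an elitist random search; the remaining three mechanisms are what give the full algorithm its exploitation behavior around the best gazelle.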

Horse herd optimization algorithm (HOA)
The HOA is based on how horses behave in the wild [56]. It models six specific behaviors: grazing, hierarchy, imitation, sociability, roaming, and defense mechanisms. These behaviors are the foundation of HOA, directing the movement of horses in each cycle, as detailed in Eq. (11), where $X_m^{Iter,A}$ denotes the position of the $m$-th horse, $A$ represents the age range, $Iter$ is the current iteration, and $\vec{V}_m^{Iter,A}$ indicates the velocity vector of the horse. Horses typically live between 25 and 30 years, exhibiting various behaviors throughout their lifespan. These behaviors are categorized into $\delta$ (0-5 years), $\gamma$ (5-10 years), $\beta$ (10-15 years), and $\alpha$ (older than 15 years) groups. To assign the ages, a comprehensive matrix of responses is sorted by how well the horses perform. The top 10% form group $\alpha$, the next 20% belong to group $\beta$, and the remaining 30% and 40% are categorized as groups $\gamma$ and $\delta$, respectively. Motion vectors for horses of each age group and each iteration of the algorithm are established following these behavioral patterns.
To derive the global matrix, Eqs. (13) and (14) are utilized, and a relationship between positions ($X$) and their respective cost values ($C(X)$) is established.
Here, $m$ indicates the number of horses, and $d$ is the number of dimensions of the problem. After that, the global matrix is sorted according to its final column, which holds the costs, and the horses' ages are determined from this ordering. Separate velocity equations are then defined for horses under 5 years, between 5 and 10 years, between 10 and 15 years, and 15 years or older.
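The ranking-based age assignment described above can be sketched as follows. The 10/20/30/40% shares come from the text; treating a lower cost as better, rounding the group sizes, and the function name `assign_age_groups` are assumptions made for this illustration.

```python
def assign_age_groups(costs):
    """Rank horses by cost (lowest is best) and label them alpha/beta/gamma/delta.

    The 10/20/30/40% shares follow the text; rounding of group sizes is an assumption.
    """
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    m = len(costs)
    n_alpha = round(0.10 * m)
    n_beta = round(0.20 * m)
    n_gamma = round(0.30 * m)
    groups = [None] * m
    for rank, horse in enumerate(order):
        if rank < n_alpha:
            groups[horse] = "alpha"
        elif rank < n_alpha + n_beta:
            groups[horse] = "beta"
        elif rank < n_alpha + n_beta + n_gamma:
            groups[horse] = "gamma"
        else:
            groups[horse] = "delta"  # remaining horses (about 40%)
    return groups

# Hypothetical costs for ten horses (illustrative values only).
costs = [5.0, 1.0, 9.0, 3.0, 7.0, 2.0, 8.0, 4.0, 6.0, 10.0]
groups = assign_age_groups(costs)
print(groups)
```

In this example the horse with cost 1.0 is the single alpha horse, and the four highest-cost horses fall into the delta group.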

Hyperparameter results
External configurations referred to as hyperparameters, such as alpha and binarize, are important in shaping a model's behavior. Unlike model parameters, hyperparameters are set in advance and are not learned from the data. Model performance depends strongly on the fine-tuning of hyperparameters, a process that demands both experimentation and the application of optimization techniques. Table 2 outlines the hyperparameter values for the NBMG and NBHH models, which helps readers understand and reproduce the model setups.

Prediction performance analysis
The predictive effectiveness of the constructed models was assessed with five distinct metrics, which rely on the actual observed values ($T_i$) and the corresponding predicted values ($P_i$). Here, the symbols $\bar{T}$ and $\bar{P}$ denote the means of the observed and predicted values, respectively, and $n$ is the total number of samples in the analyzed dataset. The metrics are described as follows:
(1) The coefficient of determination (R²) numerically represents the portion of the variability in the dependent variable that can be explained by the independent variables integrated into the model.
(2) Root-mean-square error (RMSE) is the square root of the mean of the squared differences between the predicted and observed values. It quantifies the typical magnitude of the errors the model makes when forecasting the target variable.
(3) Mean squared error (MSE) calculates the average of the squared differences between predicted and actual values, measuring how well a model's predictions match the actual data. Lower MSE values indicate better predictive accuracy and a closer fit to the observed data.
(4) Nash-Sutcliffe efficiency (NSE) assesses how well a model's predictions match observed values, considering the variability of the observed data. Higher NSE values indicate better model performance, with 1 indicating a perfect match.
(5) MDAPE (mean directional absolute percentage error) expresses the average percentage difference between the predicted and actual values, considering the direction of the errors (underestimation or overestimation).
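Four of the metrics listed above can be computed as in the following sketch. The formulas are the commonly used definitions and are assumptions here, since the equations themselves are not reproduced in this text: R² is taken as the squared Pearson correlation between observed and predicted values, and NSE as one minus the ratio of the squared error sum to the variance sum of the observations. MDAPE is left out because its exact formula is not given. The function name `evaluate` and the sample values are hypothetical.

```python
import math

def evaluate(observed, predicted):
    """Return R2 (squared Pearson correlation), RMSE, MSE, and NSE."""
    n = len(observed)
    t_mean = sum(observed) / n
    p_mean = sum(predicted) / n
    sse = sum((t - p) ** 2 for t, p in zip(observed, predicted))
    sst = sum((t - t_mean) ** 2 for t in observed)
    spp = sum((p - p_mean) ** 2 for p in predicted)
    cov = sum((t - t_mean) * (p - p_mean) for t, p in zip(observed, predicted))
    mse = sse / n
    return {
        "R2": cov ** 2 / (sst * spp),
        "RMSE": math.sqrt(mse),
        "MSE": mse,
        "NSE": 1.0 - sse / sst,
    }

# Hypothetical observed and predicted cooling loads (illustrative values only).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = evaluate(observed, predicted)
print(metrics)
```

Under these definitions RMSE is always the square root of MSE, so the two metrics rank models identically and differ only in their units.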
The following discussion analyzes the models' performance in predicting CL based on Table 3:
• NB: An R² value of 0.963 is reported for this model. High error values (RMSE = 2.147, MSE = 4.610, and MDAPE = 7.482) indicate the low accuracy of this traditional model, especially in the testing phase. Low NSE values of 0.966, 0.958, and 0.949 in the training, validation, and testing phases confirm the high variability of the estimated data.
• NBMG (NB + MGO): High R² values of 0.986, 0.980, and 0.974 in the training, validation, and testing phases and low error values, which are roughly half those of the NB single model, indicate the superior optimization performance of MGO in enhancing the CL prediction capability of NB.
• NBHH (NB + HHO): This model, with a marginally lower R² (by less than 1%) and higher error values (on average 20%), performs worse than NBMG. However, the MGO algorithm has notably enhanced the NB's prediction accuracy.
Figure 2 illustrates the trends in error values (RMSE, MSE) and R² for the three models developed in this study. The comparison reveals a consistent decrease in R² values from training to testing across all models, indicating a weakness in the generalization ability of the models. Notably, all R² bars for NBMG are higher than those of NB but similar in height to those of NBHH. In terms of RMSE and MSE, the NBMG model, particularly during the training phase, showed significantly lower error values than the other models. As detailed in Table 3 and depicted in Fig. 2, the NBMG model showed the best performance in predicting CL values, with an R² of 0.986, RMSE of 1.129 kW, and MSE of 1.275 kW².
Figure 3 presents a scatter plot of the relationship between predicted and measured samples for the CL. The samples are shown for three distinct phases, each offering insight into the model's performance. The distribution of sample points in the plot is characterized by two main metrics: RMSE, which describes the dispersion within the figure, and R², which assesses the degree of collinearity among the sample points. A high R² value combined with a low RMSE value indicates that the predicted values closely align with the measured values, lying close to the center line (X = Y). To facilitate interpretation, two dashed lines are drawn on the plot, marking 15% overestimation and underestimation. The NBMG and NBHH hybrid models emerge as standout performers. These models, with the lowest RMSE values and highest R² values, surpass the NB single model. The NBHH model is somewhat weaker than the NBMG model and produces some data points with overestimation exceeding 15%.
Figure 4 uses a line plot to compare the variation in error values across the three developed models. The range of errors for NBMG is approximately half that of NBHH, underscoring the advantage of the MGO algorithm. Furthermore, in the case of NBMG, the error rate during the training phase is only half that observed in the other two phases, suggesting that MGO exhibits superior prediction performance during the training phase compared to the other models. This observation is corroborated by Fig. 5, which illustrates the normal distribution of errors for MGO, displaying a narrow bell-shaped curve indicative of a high concentration of errors near 0%.
Figure 6 presents Taylor diagrams that depict the performance of the employed predictive models, namely NB, NBMG, and NBHH. These diagrams serve as statistical summaries, integrating both observed and predicted CL and incorporating essential metrics such as RMSE, correlation coefficients (CC), and normalized standard deviations. The figure provides a comprehensive overview of the model performances. Notably, the NBMG model, a combination of the NB model and the MGO optimizer, emerges as the optimal predictive model. The outcomes of this

Conclusions
Accurate building cooling load forecasting is vital for optimizing HVAC systems, reducing costs, and enhancing energy efficiency. However, it remains challenging due to the complex interplay of building characteristics and meteorological data. Prior studies emphasize the effectiveness of machine learning in building energy forecasting, favoring nonlinear approaches. Naive Bayes, a foundational machine learning algorithm, was unexplored in this context. The Naive Bayes-based models evaluated here encompassed a single model, one optimized with the Mountain Gazelle Optimizer (MGO), and another optimized with the horse herd optimization (HHO) algorithm. The research findings underscore the exceptional performance of the NBMG model, which consistently outperformed its counterparts, reducing prediction errors by an average of 20% and achieving a maximum R² value of 0.982 for cooling load prediction. This highlights the substantial potential of machine learning, as NBMG exemplifies, to significantly enhance the precision of energy consumption forecasts. Consequently, it empowers decision-makers in energy conservation and retrofit strategies, contributing to the overarching goals of sustainable building operations and reduced environmental impact.

Fig. 2 The comparison of parameters
Fig. 3 The scatter plot for developed hybrid models
Fig. 5 The normal distribution plot of errors among the developed models
Fig. 6 The Taylor diagram for developed models
Table 2 The results of hyperparameters for NB
Table 3 The result of developed models for NB