### Support vector regression (SVR)

SVR is a machine learning method for non-linear regression whose training objective is convex, so it guarantees a globally optimal solution. It works by mapping samples from the input space into a high-dimensional feature space through a nonlinear transformation. The network architecture of the SVR model is shown in Fig. 1. The regression function of the SVR model can be defined as follows [29]:

$$F={\omega}^T\varphi (x)+b$$

(1)

where *F* is the forecasted output, *ω* is the weight vector, *b* is the bias, and *φ*(*x*) is the nonlinear mapping of the input vector *x* into the high-dimensional feature space.

The *ω* and *b* coefficients can be calculated by minimizing the risk function *R*(*F*) given by:

$$R(F)=\frac{1}{2}{\left\Vert \omega \right\Vert}^2+C\frac{1}{n}\sum \limits_{i=1}^n{L}_{\varepsilon}\left({y}_i,{F}_i\right)$$

(2)

where *C* represents the penalty parameter controlling the trade-off between model complexity and training error, *ε* is the width of the insensitive tube within which errors are not penalized, and *L*_{ε}(*y*_{i}, *F*_{i}) is the *ε*-insensitive loss function.

Equation (2) can be transformed into the following form [29]:

$$\operatorname{minimize}\ \frac{1}{2}{\left\Vert \omega \right\Vert}^2+C\sum \limits_{i=1}^n\left({\xi}_i+{\xi}_i^{\ast}\right)$$

$$\text{subject to}\ \ {y}_i-{\omega}^T\varphi \left({x}_i\right)-b\le \varepsilon +{\xi}_i,\quad {\omega}^T\varphi \left({x}_i\right)+b-{y}_i\le \varepsilon +{\xi}_i^{\ast},\quad {\xi}_i,{\xi}_i^{\ast}\ge 0$$

(3)

where *ξ*_{i} and *ξ*_{i}^{∗} are nonnegative slack variables measuring deviations above and below the *ε*-insensitive tube.

As described below, the optimization problem in (3) is easier to solve when expressed in its dual formulation.

$$\operatorname{maximize}\left[-\frac{1}{2}\sum \limits_{i,j=1}^n\left({\alpha}_i-{\alpha}_i^{\ast}\right)\left({\alpha}_j-{\alpha}_j^{\ast}\right)\left(\varphi \left({x}_i\right)\cdot \varphi \left({x}_j\right)\right)-\varepsilon \sum \limits_{i=1}^n\left({\alpha}_i+{\alpha}_i^{\ast}\right)+\sum \limits_{i=1}^n{y}_i\left({\alpha}_i-{\alpha}_i^{\ast}\right)\right]$$

(4)

where *α* and *α*^{∗} are nonnegative Lagrange multipliers.

Solving the dual maximization problem in (4) yields the SVR function *F* [27]:

$$F\left(x,{\alpha}_i,{\alpha}_i^{\ast}\right)=\sum \limits_{i=1}^n\left({\alpha}_i-{\alpha}_i^{\ast}\right)k\left({x}_i,x\right)+b$$

(5)

where *k* is a kernel function, which can be linear, polynomial, sigmoid, or radial basis. The radial basis function (RBF) is commonly used due to its simplicity and reliability [30]. This function can be expressed as follows:

$$k\left({x}_i,{x}_j\right)=\exp \left(-\frac{{\left\Vert {x}_i-{x}_j\right\Vert}^2}{2{\gamma}^2}\right)$$

(6)

where *γ* represents the bandwidth of the RBF kernel.
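Equations (5) and (6) can be sketched in code. The snippet below is a minimal illustration of the prediction step only (not the dual-problem fitting procedure), assuming the dual coefficients *α*_{i} − *α*_{i}^{∗} and the bias *b* have already been obtained; the function names are illustrative:

```python
import numpy as np

def rbf_kernel(x_i, x_j, gamma=1.0):
    # Eq. (6): k(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 * gamma^2))
    diff = np.asarray(x_i) - np.asarray(x_j)
    return np.exp(-np.sum(diff ** 2) / (2.0 * gamma ** 2))

def svr_predict(x, support_vectors, alpha_diff, b, gamma=1.0):
    # Eq. (5): F(x) = sum_i (alpha_i - alpha_i*) * k(x_i, x) + b
    k = np.array([rbf_kernel(sv, x, gamma) for sv in support_vectors])
    return float(alpha_diff @ k + b)
```

Note that when all dual coefficients are zero, the prediction reduces to the bias *b*, consistent with Eq. (5).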

### Bald eagle search (BES) algorithm

BES is a novel nature-inspired optimization algorithm developed by Alsattar et al. [28]. It mimics the social behavior of bald eagles, which are known for their clever hunting techniques. The BES algorithm is mathematically modeled in three stages, as shown in Fig. 2: (i) selecting space, (ii) searching in space, and (iii) swooping. In the first stage, the eagle selects the space containing the most prey. In the second stage, the eagle moves within this space to search for prey. In the third stage, the eagle swoops from the best position identified in the second stage towards the prey [29]. The mathematical modeling of the BES algorithm can be summarized as follows:

#### Selecting space

The selection of the appropriate space can be expressed by Eq. (7) [28]:

$${P}_{new,i}={P}_{best}+\alpha \times r\left({P}_{mean}-{P}_i\right)$$

(7)

where *P*_{best} is the best search space identified by the bald eagles, *α* is a control parameter in the interval [1.5, 2], *r* is a random number in the range [0, 1], *P*_{mean} is the mean position of the eagles after the previous search (indicating that all information from the preceding points has been used), and *P*_{i} is the current position of the *i*-th eagle.
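As a minimal sketch, the selecting-space update of Eq. (7) can be written as follows (function and parameter names are illustrative):

```python
import numpy as np

def select_space(p_best, p_mean, p_i, alpha=2.0, rng=None):
    # Eq. (7): P_new = P_best + alpha * r * (P_mean - P_i), with r ~ U(0, 1)
    if rng is None:
        rng = np.random.default_rng()
    r = rng.random()
    return p_best + alpha * r * (p_mean - p_i)
```

When an eagle already sits at the mean position (*P*_{i} = *P*_{mean}), the update returns *P*_{best} exactly, so the population is pulled towards the best space found so far.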

#### Searching in space

In this stage, the eagle updates its position based on Eq. (8) [28]:

$${P}_{new,i}={P}_i+y(i)\times \left({P}_i-{P}_{i+1}\right)+x(i)\times \left({P}_i-{P}_{mean}\right)$$

(8)

where

$$x(i)= xr(i)/\max \left(\left| xr\right|\right)$$

(9)

$$y(i)= yr(i)/\max \left(\left| yr\right|\right)$$

(10)

$$xr(i)=r(i)\times \sin \left(\theta (i)\right)$$

(11)

$$yr(i)=r(i)\times \cos \left(\theta (i)\right)$$

(12)

$$\theta (i)=a\times \pi \times \mathit{\operatorname{rand}}$$

(13)

$$r(i)=\theta (i)\times R\times \mathit{\operatorname{rand}}$$

(14)

where *a* and *R* are coefficients that take values in the ranges [5, 10] and [0.5, 2], respectively.
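The searching-in-space stage (Eqs. (8)–(14)) can be sketched as below. The spiral direction components are computed for the whole population at once; pairing the last eagle with the first for the *P*_{i+1} term is an implementation choice not fixed by the text:

```python
import numpy as np

def search_directions(n, a=10.0, R=1.5, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    theta = a * np.pi * rng.random(n)                # Eq. (13)
    r = theta * R * rng.random(n)                    # Eq. (14)
    xr = r * np.sin(theta)                           # Eq. (11)
    yr = r * np.cos(theta)                           # Eq. (12)
    x = xr / np.max(np.abs(xr))                      # Eq. (9)
    y = yr / np.max(np.abs(yr))                      # Eq. (10)
    return x, y

def search_in_space(P, p_mean, x, y):
    # Eq. (8): P_new_i = P_i + y_i * (P_i - P_{i+1}) + x_i * (P_i - P_mean)
    P_next = np.roll(P, -1, axis=0)                  # P_{i+1}, wrapped for the last eagle
    return P + y[:, None] * (P - P_next) + x[:, None] * (P - p_mean)
```

Because of the normalization in Eqs. (9)–(10), the direction components *x*(*i*) and *y*(*i*) always lie in [−1, 1].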

#### Swooping

The swooping strategy of eagles can be described by [28]:

$${P}_{new,i}=\mathit{\operatorname{rand}}\times {P}_{best}+{x}_1(i)\times \left({P}_i-{c}_1\times {P}_{mean}\right)+{y}_1(i)\times \left({P}_i-{c}_2\times {P}_{best}\right)$$

(15)

where *c*_{1}, *c*_{2} ∈ [1, 2], and the direction components are given by:

$${x}_1(i)= xr(i)/\max \left(\left| xr\right|\right)$$

(16)

$${y}_1(i)= yr(i)/\max \left(\left| yr\right|\right)$$

(17)

$$xr(i)=r(i)\times \sinh \left(\theta (i)\right)$$

(18)

$$yr(i)=r(i)\times \cosh \left(\theta (i)\right)$$

(19)

$$\theta (i)=a\times \pi \times \mathit{\operatorname{rand}}$$

(20)
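A minimal sketch of the swooping update (Eqs. (15)–(20)) follows. Since the text does not redefine *r*(*i*) for this stage, the form of Eq. (14) is reused here, which is an assumption:

```python
import numpy as np

def swoop(P, p_best, p_mean, a=10.0, R=1.5, c1=2.0, c2=2.0, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    n = len(P)
    theta = a * np.pi * rng.random(n)                # Eq. (20)
    r = theta * R * rng.random(n)                    # r(i) as in Eq. (14) (assumed)
    xr = r * np.sinh(theta)                          # Eq. (18)
    yr = r * np.cosh(theta)                          # Eq. (19)
    x1 = xr / np.max(np.abs(xr))                     # Eq. (16)
    y1 = yr / np.max(np.abs(yr))                     # Eq. (17)
    # Eq. (15): move from a randomized P_best towards the prey
    return (rng.random() * p_best
            + x1[:, None] * (P - c1 * p_mean)
            + y1[:, None] * (P - c2 * p_best))
```

The hyperbolic functions in Eqs. (18)–(19) make the swoop trajectories steeper than the sine/cosine spirals of the search stage, concentrating the population around the best-found position.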

Based on the aforementioned stages, an initial, randomly generated population of candidate solutions is improved over successive iterations until the search converges towards the global optimum.