A design of new wind power forecasting approach based on IVMD-WSA-IC-LSTM model

Wind power forecasting (WPF) can reduce the adverse impact of connecting wind power to the grid. Based on the characteristics of wind power data, an algorithm combining improved variational mode decomposition (IVMD) with a long short-term memory (LSTM) network is proposed to predict wind power, with the LSTM hyperparameters searched by the Whale Swarm Algorithm with Iterative Counter (WSA-IC). Firstly, through correlation analysis, 10 different features of the wind power data are screened, and the two features most strongly correlated with wind power are selected as inputs of the model. Secondly, IVMD calculates the maximum envelope kurtosis to determine the best decomposition parameters of the variational mode decomposition (VMD), and the original wind power and wind speed sequences are decomposed into intrinsic mode functions (IMFs) with different time scales. Finally, because the LSTM hyperparameters are difficult to optimize and optimal solutions are hard to obtain, the WSA-IC algorithm is used to optimize the key hyperparameters, and the IVMD-WSA-IC-LSTM forecasting model is established to obtain short-term wind power forecasts. The algorithm is tested with data from China Longyuan Power Group Corporation Limited. Compared with other common forecasting approaches on the same data, the mean absolute error (MAE) of the proposed approach is reduced to 0.007859, the mean square error (MSE) is reduced to 0.00011, and the coefficient of determination is improved to 0.998828, indicating higher forecasting accuracy.


Introduction
Wind power, as a new energy source, has developed rapidly in recent years [1]. Wind power plays an important role in our country and is widely used. However, the randomness, intermittency, and uncertainty of wind power adversely affect the stability and safety of the power grid. Accurate wind power forecasting benefits the efficient operation, safety analysis, and energy trading of the power grid [2].
In response to the low accuracy and poor anti-interference ability of traditional power forecasting algorithms, researchers have proposed artificial intelligence-based WPF algorithms. Xu et al. [3] propose a new WPF approach that mines numerical weather prediction (NWP) data. WPF approaches based on LSTM neural networks often achieve better accuracy than other artificial intelligence approaches [4,5]. LSTM effectively addresses the inability of recurrent neural networks (RNNs) to capture long-term dependence, but it has high complexity, a long training time, and multiple hyperparameters that are difficult to select. Yang et al. [6] proposed a bidirectional long short-term memory (Bi-LSTM) network to predict multiple attributes of a product. Delgado et al. [7] present an LSTM-based recurrent neural network variant for power forecasting. Sun et al. [8] propose a wind power prediction approach that combines an LSTM network, wavelet decomposition (WT), and principal component analysis (PCA).
Pu et al. [9] propose an ultra-short-term WPF model that combines PSO with an LSTM neural network, which improves the precision of ultra-short-term WPF. However, particle swarm optimization (PSO) suffers from a slow convergence rate and a tendency to fall into local optima. Other optimization algorithms, such as Drosophila Optimization (DO), Whale Swarm Optimization (WSO), and Bat Swarm Optimization (BSO), have since emerged. Peng et al. [10] use an improved DO algorithm to build a wind speed forecasting model that obtains wind power data with higher forecasting accuracy by expanding the parameter search range and improving the convergence speed of the algorithm. However, the original data is characterized by rapid variation of wave crests and troughs, which leads to some defects in the forecasting model. To address such data problems, Shi et al. [11] use wavelet decomposition (WD) and an artificial neural network (ANN) for forecasting, which improves forecasting performance, but the wavelet packet decomposition in Zheng et al. [12] requires the wavelet function to be set manually, and noise interference may occur if the decomposition is set improperly. Cheng et al. [13] proposed a short-term WPF model that combines empirical mode decomposition (EMD) [14] with a radial basis function neural network (RBFNN) [15], so that the decomposed wind power component series show strong regularity and the forecasting accuracy improves. Wang et al. [16,17] adopt the empirical mode decomposition (EMD) method to better handle modal aliasing and susceptibility to noise interference, but the total number of modes K is difficult to determine, which affects the accuracy of the model. These methods solve the nonlinear problem of wind power forecasting to some extent, but they still suffer from poor generalization ability, long training time, and a tendency to over-fit or under-fit.
Konstantin et al. [18] propose an LSTM-based RNN model to predict wind power, which makes more accurate forecasts by exploiting long-term correlation in time series and overcomes the inability of other neural networks to learn long-term dependencies. Deng et al. [19] proposed a method that optimizes the key hyperparameters of LSTM neural networks by combining them with the cuckoo algorithm. Chang et al. [20] propose a sparrow search algorithm to optimize the learning rate and number of neurons in LSTM and CNN-LSTM networks. However, it is difficult to select the optimal hyperparameters of an LSTM neural network model, and intelligent optimization algorithms such as the cuckoo algorithm and the sparrow algorithm need niche parameters to be set for different problems, so it is difficult to obtain global optimal solutions. The Whale Swarm Algorithm (WSA) is a meta-heuristic algorithm that belongs to the family of swarm intelligence algorithms [21]. The Whale Swarm Algorithm with Iterative Counter (WSA-IC) does not need niche parameters and can effectively jump out of local optima by identifying extreme points based on stability thresholds and fitness thresholds during the iterative process.
In this paper, an IVMD-WSA-IC-LSTM WPF algorithm is proposed. The maximum envelope kurtosis is calculated by IVMD to determine the optimal decomposition parameter K of VMD, and WSA-IC, with its improved iteration rule for multimodal optimization, is used to obtain the optimal hyperparameters and global optimal solution of LSTM to improve WPF accuracy. Compared with other forecasting algorithms, the forecasting error is smaller, the mean absolute error is lower, and the coefficient of determination and the forecasting accuracy are higher.
The first part of the paper introduces the VMD algorithm, which is improved in the second part. The third part introduces the improved whale swarm optimization algorithm used to optimize the hyperparameters of the LSTM neural network, and the fourth part tests the algorithm.

VMD Algorithm
VMD decomposes the original signal into a number of intrinsic mode functions (IMFs), each with a different center frequency and a finite bandwidth [22]. By constructing and solving a variational problem, the optimal solution of each IMF and its center frequency are obtained by iteration [23].
Firstly, the variational problem [24] is constructed. Assuming that the original signal f is decomposed into K components, each decomposed signal is constrained to be a modal component of finite bandwidth around a center frequency. The constrained variational expression is shown in Eq. (1):
$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \tag{1}$$
where $u_k$ is the kth mode signal; f is the time series; K is the number of modes; * is the convolution operator; $\omega_k$ is the center frequency of the kth mode component; δ(t) is the Dirac function; and $e^{-j\omega_k t}$ is the phase of the signal rotating with time.
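Secondly, to make the constrained variational model easier to solve, the variational problem is transformed into an unconstrained problem by forming the augmented Lagrangian expression [25]:
$$L\left(\{u_k\},\{\omega_k\},\lambda\right) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \tag{2}$$
where α is the quadratic penalty factor and λ is the Lagrangian multiplier.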
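Finally, the alternating direction method of multipliers is used to search for the saddle point of Eq. (2), which is the optimal solution of Eq. (1). The expressions of $\hat{u}_k^{n+1}$, $\omega_k^{n+1}$, and $\hat{\lambda}^{n+1}$ are obtained by alternately updating:
$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha\left(\omega - \omega_k\right)^2}$$
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}$$
$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)$$
The iteration stops when
$$\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon$$
or when n reaches N, where a hat denotes the Fourier transform, τ is the noise tolerance (update step of the Lagrangian multiplier), ε is the convergence criterion, and N is the maximum number of iterations.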
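The alternating updates above can be sketched in a few lines of NumPy. This is a simplified illustration, not the implementation used in the paper: it skips the mirror extension of the signal that full VMD implementations usually apply, and the function name `vmd_sketch`, the default values `alpha=2000`, `tau=0`, and the uniform initialization of the center frequencies are assumptions made for the example.

```python
import numpy as np

def vmd_sketch(f, K, alpha=2000.0, tau=0.0, eps=1e-7, max_iter=500):
    """Simplified VMD of a real 1-D signal; returns (modes, center_freqs)."""
    f = np.asarray(f, dtype=float)
    T = f.size
    freqs = np.fft.fftfreq(T)              # cycles per sample, in [-0.5, 0.5)
    pos = freqs >= 0                       # work on the non-negative half spectrum
    f_hat = np.where(pos, np.fft.fft(f), 0.0)
    u_hat = np.zeros((K, T), dtype=complex)
    omega = 0.5 * np.arange(K) / K         # assumed uniform initial center frequencies
    lam = np.zeros(T, dtype=complex)       # Lagrangian multiplier in the frequency domain
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Wiener-filter-like update of mode k around its center frequency
            u_hat[k] = (f_hat - others + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Center frequency = center of gravity of the mode's power spectrum
            power = np.abs(u_hat[k, pos]) ** 2
            omega[k] = np.sum(freqs[pos] * power) / (np.sum(power) + 1e-12)
        # Dual ascent on the reconstruction constraint (no effect when tau = 0)
        lam = lam + tau * (f_hat - u_hat.sum(axis=0))
        change = np.sum(np.abs(u_hat - u_prev) ** 2, axis=1)
        norm = np.sum(np.abs(u_prev) ** 2, axis=1) + 1e-12
        if np.sum(change / norm) < eps:
            break
    # Back to the time domain; the factor 2 restores the discarded negative half
    modes = 2.0 * np.real(np.fft.ifft(u_hat, axis=1))
    return modes, omega
```

For a zero-mean test signal made of two well-separated tones, the returned center frequencies should settle near the two tone frequencies, and the sum of the modes should approximate the input.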

IVMD algorithm
Since VMD decomposes the original data sequence from low frequency to high frequency one mode at a time, the low-frequency IMF sequences are more likely to reflect the overall trend of the data. When the envelope kurtosis is maximal, the frequency of the IMF sequence is the highest, and no further VMD decomposition is necessary. Therefore, finding the maximum envelope kurtosis means finding the optimal number of layers K. The maximum envelope kurtosis is obtained as follows. Assuming that the number of decomposition layers of VMD is K, the envelope of each IMF is calculated as
$$\tilde{x}_i^{K}(t) = \left| x_i^{K}(t) + j\,\mathcal{H}\left[ x_i^{K}(t) \right] \right|$$
where $\tilde{x}_i^{K}$ is the absolute value obtained by the Hilbert transformation $\mathcal{H}$, and $x_i^{K}(t)$ is the ith IMF when the number of decomposition layers is K.
The envelope kurtosis of the ith IMF is
$$ek_i = \frac{\mu_4}{\sigma^4} \tag{9}$$
where $\mu_4$ is the fourth-order central moment of $\tilde{x}_i^{K}$ and σ is its standard deviation. The envelope kurtosis $ek_i$ (i = 1, 2, ..., K) of each IMF is calculated from Eq. (9), so the local maximum envelope kurtosis for a given K and the global maximum over all K are
$$ek_{\max}^{K} = \max_{1 \le i \le K} ek_i, \qquad ek_{\max}^{g} = \max_{K} ek_{\max}^{K} \tag{10}$$
The local maximum value $ek_{\max}^{K}$ is calculated for each K = 2, …, k from Eq. (9), the global maximum envelope kurtosis $ek_{\max}^{g}$ is calculated by Eq. (10), and the trend diagram of the maximum envelope kurtosis is drawn. The value of K corresponding to the global maximum is the optimal number of VMD decomposition layers.
WSA is a new meta-heuristic algorithm belonging to the family of swarm intelligence algorithms [26-30]. It solves optimization problems by imitating behaviors such as searching and hunting within whale populations, which use ultrasonic waves as the information carrier.
The position update formula of the whale swarm algorithm is:
$$x_i^{t+1} = x_i^{t} + \beta \left( y_i^{t} - x_i^{t} \right)$$
where β is a random number in $[0, \rho_0 e^{-\eta d_{X,Y}}]$; η is the ultrasonic attenuation factor; $d_{X,Y}$ is the distance between whales X and Y; $\rho_0$ is the initial ultrasonic intensity; $y_i^{t}$ is the ith position element of whale Y at step t; and $x_i^{t}$ and $x_i^{t+1}$ are the ith position elements of whale X at steps t and t + 1, respectively.
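A minimal NumPy sketch of this update for a minimization problem is given below. The helper names `better_and_nearest` and `wsa_move`, and the default values of `rho0` and `eta`, are illustrative assumptions rather than settings reported in the paper.

```python
import numpy as np

def better_and_nearest(i, positions, fitness):
    """Index of the nearest whale with better (lower) fitness than whale i, or None."""
    better = np.flatnonzero(fitness < fitness[i])
    if better.size == 0:
        return None
    dists = np.linalg.norm(positions[better] - positions[i], axis=1)
    return int(better[np.argmin(dists)])

def wsa_move(x, y, rho0=2.0, eta=0.0, rng=None):
    """Move whale x toward guiding whale y with a random step per dimension."""
    rng = np.random.default_rng() if rng is None else rng
    d_xy = np.linalg.norm(y - x)
    # beta is drawn independently per dimension from [0, rho0 * exp(-eta * d_xy)]
    beta = rng.uniform(0.0, rho0 * np.exp(-eta * d_xy), size=x.shape)
    return x + beta * (y - x)
```

With `rho0` above 1 the step can overshoot the guiding whale, which lets the moving whale explore the region around it rather than only the segment between the two positions.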
WSA-IC improves the iteration rule of WSA and avoids being trapped in local optima, without requiring different attenuation coefficient values for different problems and without introducing any niche parameters. WSA-IC can effectively identify and jump out of extrema that have already been found during the iterative process, which helps to find the global optimal solution.

Improvement of WSA iteration rules
The WSA position update rule is improved as follows: a replica of the current whale is generated, and the replica moves under the guidance of its "better and nearest" whale. If the replica does not reach a better position than the original whale, the original whale remains in its original position; otherwise, the original whale moves to the replica's position. Without introducing any niche parameters [31], this strategy effectively generates multiple subgroups, which enhances the local optimization ability of the algorithm and helps to find multiple optimal solutions.

Identify and jump out of found extreme points
Two new parameters, the stability threshold $T_s$ and the fitness threshold $T_f$, are introduced for identifying and jumping out of extreme points that have already been found. Firstly, the preset $T_s$ is used to determine whether a whale has reached a steady state (i.e., an extreme point has been found); an iteration counter c is kept for each whale to record the number of iterations in which its position has not been updated. Secondly, the global optimal solution is identified by $T_f$. In each iteration, the counters of the whales that did not update their positions are checked. If a whale has reached the steady state ($X.c = T_s$), the comparison between $T_f$ and $|f_{gbest} - f(X)|$ determines whether the whale is added to the current global optimal solution set, and the whale is then reinitialized to jump out of the extreme point. Otherwise, its iteration counter is incremented by 1.
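The replica rule and the counter logic can be combined into a compact loop. The sketch below is a simplified reading of the description, for minimization: the exact way $T_f$ updates the solution set, the attenuation factor fixed at 0, the uniform reinitialization, and the final pass over whales that are still searching are all assumptions, and `wsa_ic` is a hypothetical name.

```python
import numpy as np

def wsa_ic(f, init, bounds, T_s=20, T_f=1e-6, n_iter=200, rho0=2.0, seed=0):
    """Simplified WSA-IC minimizer; returns (best_value, global_optimum_set)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = np.array(init, dtype=float)
    fit = np.array([f(x) for x in X])
    counter = np.zeros(len(X), dtype=int)
    best = {"f": np.inf, "set": []}

    def record(i):
        # Assumed rule: T_f decides whether whale i replaces or joins the set
        if fit[i] < best["f"] - T_f:
            best["f"], best["set"] = fit[i], [X[i].copy()]
        elif abs(best["f"] - fit[i]) <= T_f:
            best["set"].append(X[i].copy())

    for _ in range(n_iter):
        for i in range(len(X)):
            better = np.flatnonzero(fit < fit[i])
            if better.size:
                d = np.linalg.norm(X[better] - X[i], axis=1)
                y = X[better[np.argmin(d)]]          # "better and nearest" whale
                step = rng.uniform(0.0, rho0, X[i].shape)
                replica = np.clip(X[i] + step * (y - X[i]), lo, hi)
                f_rep = f(replica)
                if f_rep < fit[i]:                   # accept only an improving replica
                    X[i] = replica
                    fit[i] = f_rep
                    counter[i] = 0
                    continue
            if counter[i] < T_s:                     # not yet in steady state
                counter[i] += 1
                continue
            record(i)                                # steady state: extreme point found
            X[i] = rng.uniform(lo, hi, X[i].shape)   # jump out by reinitializing
            fit[i] = f(X[i])
            counter[i] = 0
    for i in range(len(X)):                          # assumed final pass
        record(i)
    return best["f"], best["set"]
```

On a function with several equal minima, whales that stagnate at different minima are recorded one after another, so the returned set can hold more than one solution.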

LSTM forecasting model
The LSTM network is an improvement of the recurrent neural network (RNN) that effectively alleviates the vanishing and exploding gradient problems of the original RNN by adding a gating structure; it is therefore known as a gated recurrent neural network. The LSTM network can capture important information at a specific time and retain it over a certain time interval. The memory cell c stores information for longer than short-term memory but for much shorter than long-term memory. The structure of the LSTM recurrent unit is shown in Fig. 1, and the calculation formulas are as follows:
$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right)$$
$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right)$$
$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right)$$
$$\tilde{c}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$h_t = o_t \odot \tanh\left(c_t\right)$$
where $i_t$, $f_t$, and $o_t$ are the input, forget, and output gates; ⊙ is the element-wise product of vectors; $c_{t-1}$ is the memory cell at the previous time; $\tilde{c}_t$ is the candidate state; tanh is the activation function; σ(·) is the Sigmoid activation function, whose output interval is (0, 1); $x_t$ is the input at the current time; $h_{t-1}$ is the external state at the previous time; and W, U, and b are network parameters.
Fig. 1 The structure of the LSTM cycle unit
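As an illustration of the standard LSTM recurrence, one time step can be written directly in NumPy. The stacked weight layout (input, forget, output, candidate) and the function name `lstm_step` are choices made for this sketch only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. Shapes: W (4H, D), U (4H, H), b (4H,), stacked as [i, f, o, c~]."""
    H = h_prev.size
    z = W @ x_t + U @ h_prev + b
    i_t = sigmoid(z[0:H])               # input gate
    f_t = sigmoid(z[H:2 * H])           # forget gate
    o_t = sigmoid(z[2 * H:3 * H])       # output gate
    c_tilde = np.tanh(z[3 * H:4 * H])   # candidate state
    c_t = f_t * c_prev + i_t * c_tilde  # new memory cell
    h_t = o_t * np.tanh(c_t)            # new external state
    return h_t, c_t
```

With all weights set to zero, every gate equals 0.5 and the candidate state is 0, so the cell simply halves the previous memory; this makes a convenient sanity check.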

IVMD-WSA-IC-LSTM forecasting model
The accuracy of WPF is affected by the setting of the LSTM hyperparameters. First, a WSA-IC-LSTM forecasting model is designed in which the learning rate $l_r$, the learning rate schedule $r_f$, the number of hidden layer neurons $U_h$, and the number of batch samples $b_s$ of the LSTM model are the hyperparameters searched by the WSA-IC algorithm, with the MSE taken as the fitness value of the objective function. During training, the hyperparameters that give the smallest MSE are searched for. Using WSA-IC for the hyperparameter search avoids the randomness caused by manual parameter adjustment, and the optimization effect is stable.
The flow of component forecasting based on WSA-IC-LSTM is shown in Fig. 2. The steps of WSA-IC-LSTM are as follows:
Step 1: Initialize the whale population, including initializing the whale positions and setting the iteration counters to zero;
Step 2: Initialize the optimization parameters, i.e., hyperparameters such as $l_r$, $r_f$, $U_{h1}$, $U_{h2}$, and $b_s$;
Step 3: Input the training set, train the LSTM network, and calculate the training error MSE;
Step 4: Judge whether the MSE meets the termination condition. If it does, save the current optimal hyperparameters and execute Step 5; otherwise, return to Step 2;
Step 5: Set the optimal hyperparameters of the LSTM for model training;
Step 6: Output the LSTM forecast in preparation for subsequent work.
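One way to connect a whale position to the LSTM hyperparameters in these steps is to decode a vector in the unit hypercube into a parameter dictionary and score it by MSE. The search ranges below are hypothetical placeholders, not values from the paper, and `train_and_validate` stands for any user-supplied routine that trains an LSTM with the given hyperparameters and returns true and predicted values.

```python
import numpy as np

# Hypothetical search ranges, for illustration only
SEARCH_SPACE = {
    "l_r": (1e-4, 1e-2),   # learning rate
    "r_f": (0.1, 0.9),     # learning rate schedule factor
    "U_h1": (16, 128),     # neurons in the first hidden layer
    "U_h2": (16, 128),     # neurons in the second hidden layer
    "b_s": (16, 256),      # number of batch samples
}
INTEGER_KEYS = {"U_h1", "U_h2", "b_s"}

def decode(position):
    """Map a whale position in [0, 1]^5 to a hyperparameter dictionary."""
    params = {}
    clipped = np.clip(position, 0.0, 1.0)
    for p, (name, (lo, hi)) in zip(clipped, SEARCH_SPACE.items()):
        value = lo + p * (hi - lo)
        params[name] = int(round(value)) if name in INTEGER_KEYS else float(value)
    return params

def fitness(position, train_and_validate):
    """MSE fitness of one whale; train_and_validate(params) -> (y_true, y_pred)."""
    y_true, y_pred = train_and_validate(decode(position))
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))
```

Searching in a normalized space keeps the distance between whales meaningful even though the hyperparameters have very different scales.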
Then a combined forecasting method based on IVMD-WSA-IC-LSTM is proposed:
Step 1: Use IVMD to decompose the preprocessed data into K IMF components;
Step 2: Normalize each component;
Step 3: Initialize the boundary threshold, fitness threshold, stability threshold, distance threshold, and maximum number of evaluations of the WSA-IC algorithm, and initialize the size of the whale population;
Step 4: Construct the LSTM wind power forecasting model, determine the parameter optimization range, and generate a whale population. The learning rate $l_r$, the learning rate schedule $r_f$, the numbers of neurons in the hidden layers $U_{h1}$ and $U_{h2}$, and the number of batch samples $b_s$ are taken as the hyperparameter variables, and each IMF component is divided into a training set and a test set;
Step 5: Define the fitness, using the mean square error (MSE) of the LSTM model output values as the fitness value;
Step 6: Send each IMF component to the LSTM network for training, and use the WSA-IC algorithm to find the optimal hyperparameters at the minimum MSE. The optimal hyperparameters of the LSTM are set for training, and the predicted value of each component is output;
Step 7: Superimpose the forecasting results of the IMF components to obtain the final power forecasting value.
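The normalize, predict, and superimpose portion of these steps (Steps 2, 6, and 7) can be sketched independently of the forecasting model. Min-max normalization is an assumption here, since the normalization method is not stated, and `fit_predict` is a hypothetical callable standing in for the tuned LSTM of each component.

```python
import numpy as np

def forecast_by_components(components, fit_predict):
    """Sum per-component forecasts.

    components: array of shape (K, T) holding the IMF series.
    fit_predict: callable mapping a normalized 1-D series to a normalized forecast array.
    """
    total = None
    for imf in np.asarray(components, dtype=float):
        lo, hi = imf.min(), imf.max()
        scale = (hi - lo) if hi > lo else 1.0   # guard against a constant component
        normalized = (imf - lo) / scale
        pred = np.asarray(fit_predict(normalized), dtype=float)
        restored = pred * scale + lo            # undo the normalization
        total = restored if total is None else total + restored
    return total
```

For example, with a persistence predictor `lambda s: s[-1:]`, the combined forecast is simply the sum of the last values of all components.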

Algorithm implementation and testing
Experimental platform: Windows 10 operating system, 16 GB memory, and a GTX 1650 Super graphics card, with Python 3.8 used for programming. The 245-day data of wind turbine No. 1 from the Baidu KDD CUP 2022 data set are selected, sampled every 10 min, giving 35,280 sampling points in total (245 days × 144 points per day). Forecasting uses a rolling multi-step scheme. With 10 min as the sampling interval, the time window (input sequence length) N for wind power forecasting is set to 24, and the forecasting step length (output sequence length) l is set to 10.
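A rolling multi-step setup with an input length of 24 and an output length of 10 corresponds to sliding-window samples like the ones built below. The function name `make_windows` and the stride of one sampling point are assumptions for this sketch.

```python
import numpy as np

def make_windows(series, n_in=24, n_out=10):
    """Build (inputs, targets) pairs by sliding a window one step at a time."""
    series = np.asarray(series, dtype=float)
    n_samples = series.shape[0] - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series is shorter than n_in + n_out")
    X = np.stack([series[i:i + n_in] for i in range(n_samples)])
    Y = np.stack([series[i + n_in:i + n_in + n_out] for i in range(n_samples)])
    return X, Y
```

Each row of `X` holds 24 consecutive 10-min points (4 h of history), and the matching row of `Y` holds the next 10 points (100 min ahead).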

IVMD decomposition
The active power and wind speed series decomposed by IVMD are the 245 days of data from the above data set, sampled every 10 min, and the maximum envelope kurtosis method is adopted for analysis. K is taken from 2 to 9. By calculating the local maximum envelope kurtosis under each value of K, the K value corresponding to the global maximum envelope kurtosis is found, as shown in Table 1.
According to Table 1, the local maximum envelope kurtosis at K = 6 is the global maximum envelope kurtosis, so the optimal number of VMD layers is determined to be K = 6.
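The selection of K by maximum envelope kurtosis can be sketched as follows. The envelope is computed from an FFT-based analytic signal, kurtosis is taken as the fourth central moment divided by the squared variance, and `decompose` is a hypothetical callable (for example, a VMD routine) that returns an array of K modes; none of these names come from the paper.

```python
import numpy as np

def envelope(x):
    """Envelope |x + j H[x]| via the FFT-based analytic signal."""
    x = np.asarray(x, dtype=float)
    n = x.size
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * h))

def envelope_kurtosis(x):
    """Fourth central moment of the envelope divided by its squared variance."""
    e = envelope(x)
    centered = e - e.mean()
    var = np.mean(centered ** 2)
    return float(np.mean(centered ** 4) / (var ** 2 + 1e-12))

def select_K(signal, decompose, k_values=range(2, 10)):
    """Return (best_K, table) where table maps K to its local maximum envelope kurtosis."""
    table = {K: max(envelope_kurtosis(m) for m in decompose(signal, K)) for K in k_values}
    best_K = max(table, key=table.get)   # K with the global maximum
    return best_K, table
```

The returned table plays the role of the per-K listing of local maxima, and `best_K` is the entry holding the global maximum.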
Next, the input parameters of the VMD, namely the penalty factor α, noise tolerance τ, direct current component parameter DC, convergence tolerance tol, center frequency initialization parameter init, and number of layers K, are set as shown in Table 2.
The VMD decomposition was performed on the power and wind speed series separately. For ease of observation, 2000 sampling points were extracted to illustrate the decomposition results. The power IVMD decomposition results are shown in Fig. 3. For the power subsequences (IMF1-IMF6) and wind speed subsequences (IMF1-IMF6) obtained by IVMD decomposition, the figure shows obvious differences in mode frequency among the subsequences, which effectively avoids mode aliasing and end effects. New subsequences are constructed by recombining the components at corresponding positions, the reconstructed sequences are used to forecast each component, and finally the component forecasts are superimposed to obtain the optimized forecasting result.

Analysis of new WPF model
To verify the forecasting performance of the proposed algorithm, the BP neural network, RNN, LSTM network, WSA-IC-optimized LSTM network (WSA-IC-LSTM), and IVMD-LSTM forecasting algorithms are introduced for comparison. Among them, the BP, RNN, LSTM, and IVMD-LSTM forecasting algorithms use the same hyperparameters. The IVMD-LSTM and IVMD-WSA-IC-LSTM algorithms use the same VMD decomposition settings, with the parameter settings shown in the following table. The WSA-IC-LSTM forecasting algorithm finds its optimal hyperparameters through WSA-IC. The hyperparameter settings for the BP, RNN, LSTM, and WSA-IC-LSTM forecasting models are shown in Table 4. The error analysis indicators for the forecasting results of the four models are shown in Table 5. Here, the mean absolute error (MAE) is the average absolute error between the real values and the predicted values, the mean squared error (MSE) is the average of the squared residuals, and the coefficient of determination $R^2$ measures the proportion of the variance of the real values that is explained by the predictions:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$
where n is the number of samples, $y_i$ is the test value of wind power, $\hat{y}_i$ is the predicted value of wind power, and $\bar{y}$ is the mean of the test values.
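For reference, the three indicators in their usual textbook form can be computed with a few NumPy operations; the function name `forecast_metrics` is chosen for this sketch.

```python
import numpy as np

def forecast_metrics(y_true, y_pred):
    """Return MAE, MSE, and R^2 for test values y_true and forecasts y_pred."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)                          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": float(np.mean(err ** 2)),
        "R2": float(1.0 - ss_res / ss_tot),
    }
```

A forecast that misses one of four points by 1, such as `forecast_metrics([1, 2, 3, 4], [1, 2, 3, 5])`, gives an MAE of 0.25, an MSE of 0.25, and an $R^2$ of 0.8.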

Fig. 2 The flow of component forecasting based on WSA-IC-LSTM
Li and Xiang, Journal of Engineering and Applied Science (2023) 70:91

Table 1 The local maximum envelope kurtosis under different K

Table 2 The number of layers of the VMD

Table 4 The optimal hyper parameters

Table 5 The comparison diagrams of the forecasting results