Enhanced fault diagnosis of wind energy conversion systems using ensemble learning based on sine cosine algorithm

Abstract

This paper investigates the problem of incipient fault detection and diagnosis (FDD) in wind energy conversion systems (WECS) using an effective approach called the ensemble learning-sine cosine optimization algorithm (EL-SCOA). The strategy involves two primary steps: first, a sine-cosine algorithm is used to extract and optimize the features so that only the most descriptive ones are retained. Second, to further improve classification accuracy, the newly gathered dataset is fed to an ensemble learning paradigm that merges the benefits of boosting and bagging techniques with an artificial neural network classifier. The essential goal of the developed proposal is to discriminate between the diverse operating conditions (one healthy and six faulty). Three frequent fault types that can affect the system behavior, namely short-circuit, open-circuit, and wear-out faults, are considered and injected at diverse locations and sides (grid and generator sides) in order to evaluate the performance of the proposed technique against conventional FDD methods. The diagnosis performance is analyzed in terms of accuracy, recall, precision, and computation time. The obtained results demonstrate the efficiency of the suggested diagnostic paradigm compared to conventional FDD techniques (an accuracy rate of 98.35% is achieved).

Introduction

Year after year over the last few decades, wind energy has become one of the most promising, inexhaustible, green, clean, non-polluting, and sustainable energy sources. From 2011 to 2020, its installed capacity expanded from 220 GW to 733 GW [1]. Unfortunately, this kind of energy is affected by various failures due to system complexity, which leads to a loss of efficiency and reliability. Generators [2, 3], gearboxes [4, 5], power converters [6, 7], and blades [8, 9] are typically the most commonly affected components. In this regard, several methods and techniques have been investigated in the literature to ensure the safety, integrity, and performance of such systems [6, 10]. The study in [11] proposed two hybrid numerical weather prediction models and an artificial neural network model for wind power forecasting over extremely complicated terrain: the first model predicts the energy output of each wind turbine directly, while the second model first forecasts the wind speed before converting it to power using a fitted power curve. Using an artificial neural network (ANN)-based distribution static compensator (DSTATCOM), the authors of [12] proposed a new control strategy to enhance power quality in WECS. Mansouri et al. [13] employed a detection and diagnosis strategy for diverse incipient faults of the WECS under various states. In [14], the authors proposed an advanced fault detection and diagnosis (FDD) approach for wind energy conversion (WEC) systems based on reduced Gaussian process regression-based random forest (RGPR-RF). Meanwhile, the scientific community has been closely monitoring ensemble learning (EL) approaches, which combine several machine learning models to create the best possible predictive model. The success of the ensemble model can be attributed to a variety of factors, including statistical, computational, and representation learning [15], bias-variance decomposition [16], and strength-correlation [17]. Numerous surveys in the literature concentrate on ensemble learning, covering classification problems [18,19,20,21], regression problems [22, 23], and clustering problems [24]. Indeed, an effective neural network-based ensemble technique was employed in [25]; the authors used bagging, boosting, and random subspace combination approaches together with an ensemble classifier constructed using neural network techniques. The work in [26] employed the benefits of the support vector machine, K-nearest neighbor, and the decision tree in an improved ensemble learning (EL)-based intelligent fault diagnosis paradigm that aims to guarantee the high efficiency of grid-connected photovoltaic (GCPV) systems. The initial step in data mining is known as preprocessing [27], and it entails cleaning and arranging the dataset to suit the input requirements of the subsequent stages.

Accordingly, one potential pre-processing step is feature selection (FS), a method for keeping a subset of features from a dataset that can accurately represent the data without outliers or redundancies [27]. Numerous applications, such as data classification [28,29,30], data clustering [31,32,33], image processing [34,35,36], and text categorization [37, 38], have deployed the FS technique. To address the FS problem, several distinct variants of the sine cosine optimization algorithm (SCOA) have been proposed [39,40,41,42,43,44,45,46]. Besides, various optimization algorithms, for instance, the genetic algorithm (GA) [47, 48], the backtracking search algorithm (BSA) [49], the coral reef optimization (CRO) [50], the particle swarm optimization (PSO) [51], and the fruit fly optimization algorithm (FOA) [52], have been introduced to select appropriate parameters for artificial intelligence (AI) methodologies.

This work proposes an improved and effective ensemble learning approach for fault detection and diagnosis in wind energy conversion systems. The contribution of this paper is threefold: first, the data are pre-processed. Second, a sine-cosine optimization algorithm is applied in order to avoid redundant features and to select only the most relevant ones from the entire feature set. Finally, the selected features are fed to an ensemble learning algorithm to improve the classification performance and enhance the model's ability to distinguish between the diverse operating modes of the WECS. Accordingly, we injected frequent and potentially damaging types of failures, namely wear-out, open-circuit, and short-circuit faults, at different sides and locations (grid and generator sides) in order to examine the reliability of the developed strategy compared to state-of-the-art methods, including the artificial neural network (ANN), K-nearest neighbor (KNN), cascade forward neural network (CFNN), feed forward neural network (FFNN), generalized regression neural network (GRNN), and support vector machine (SVM). The rest of this paper is arranged as follows:

The suggested ensemble learning-based sine-cosine optimization algorithm strategy is presented in the “Methods” section, where the concepts of each employed technique are described. The proposed technique is tested on a wind energy conversion system in the “Results and discussion” section, where the obtained results are analyzed and summarized. The “Conclusion” section offers concluding remarks.

Methods

EL-SCOA approach

The evolved strategy involves three major steps: data processing and treatment, feature optimization and selection, and fault detection and diagnosis (FDD). The main goal of the suggested technique, called the ensemble learning-based sine-cosine optimization algorithm (EL-SCOA), is to improve the fault diagnosis capabilities and efficiency of WECS. Unlike conventional diagnosis methods, which use the raw data directly, the established proposal extracts and selects the most descriptive and informative features from the original dataset and feeds them as inputs to the classifier for diagnosis purposes. The classifier uses bagging and boosting algorithms as ensemble techniques and an ANN as a baseline classifier in order to identify, classify, and discriminate between the various states that may occur in the WECS.

The block diagram that illustrates the important steps of the evolved approach for FDD purposes is shown in Fig. 1.

Fig. 1

The steps of the proposed approach for fault diagnosis


Algorithm 1. EL-SCO Algorithm

The EL-SCOA procedure is divided into two major phases: training and testing. The detailed descriptions are given in Algorithm 1.
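
Algorithm 1 appears only as a figure in the article. As an illustrative, non-authoritative Python sketch of the training/testing flow described above (a 50/50 split, SCOA-based feature selection learned on the training set, then an ensemble classifier), one could organize the pipeline as follows; scoa_select_features and build_ensemble are hypothetical placeholders for the components detailed in the rest of this section.

# Illustrative EL-SCOA pipeline sketch (not the authors' code).
# scoa_select_features and build_ensemble are hypothetical helpers standing
# in for the SCOA feature selector and the bagging/boosting ANN ensemble.
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def el_scoa_pipeline(X, y, scoa_select_features, build_ensemble, seed=0):
    # 50% training / 50% testing, as used in the paper's experiments
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.5, random_state=seed, stratify=y)

    # Phase 1: SCOA returns a boolean mask over the measured variables
    mask = scoa_select_features(X_tr, y_tr)

    # Phase 2: train the ensemble classifier on the selected features only
    model = build_ensemble()
    model.fit(X_tr[:, mask], y_tr)

    # Testing phase: diagnose unseen samples and report accuracy
    y_hat = model.predict(X_te[:, mask])
    return model, mask, accuracy_score(y_te, y_hat)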

Theoretical framework of the artificial neural network (ANN)

Artificial NNs are computational models inspired by the networks of the human biological brain, and they have received considerable attention to date [13, 48, 53]. The ANN uses a network pattern to generate decisions. The input, hidden, and output layers are the three levels that make up the ANN structure, and each layer is made up of a set of nodes. The output layer provides the network response after the information has been received by the input layer and processed by the hidden layer. The number of neurons in the input layer corresponds to the number of inputs; similarly, the number of output-layer neurons corresponds to the number of ANN outputs. In our work, the ANN classifier is trained using WECS measurement variables (\({x}_{1}...{x}_{m}, m = 12\)) as inputs and (\(N=7\)) labels as their corresponding desired outputs, as depicted in Fig. 2. The number of neurons in the hidden layer, however, is determined experimentally through various experiments in which this number is varied (10 hidden layers are employed in this study). A simple ANN architecture can provide accurate predictions, in contrast to a more complex one. Moreover, a weighted connection \({w}_{ij}\) links every pair of neurons in successive layers. Each neuron processes the incoming information through an activation function (\(f\)) and transfers it to the neurons in the next layer. The most frequently employed function is the sigmoid activation function, since it is a nonlinear and differentiable function [54]. This logistic function, with a range of 0 to 1, has the following formula:

Fig. 2

Structure of the ANN with \(m\) WECS inputs and their \(N\) corresponding output labels

$$f=\frac{1}{1+{e}^{-x}}$$
(1)

The equations for the weights, the weight adjustment, the prediction error, and the neural network output are given in [55].
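
As a minimal sketch of the forward pass just described (weighted connections between successive layers followed by the sigmoid of Eq. (1)), the following Python fragment uses the paper's layer sizes (m = 12 inputs and N = 7 outputs); the 16 hidden units and the random weights are illustrative assumptions, not trained values.

# Minimal one-hidden-layer forward pass with the sigmoid activation of Eq. (1).
# Layer sizes follow the paper (12 inputs, 7 outputs); hidden size and weights
# are illustrative placeholders only.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))          # Eq. (1)

rng = np.random.default_rng(0)
m, hidden, N = 12, 16, 7                      # inputs, hidden units, outputs
W1, b1 = rng.normal(size=(m, hidden)), np.zeros(hidden)
W2, b2 = rng.normal(size=(hidden, N)), np.zeros(N)

x = rng.normal(size=m)                        # one WECS measurement vector
h = sigmoid(x @ W1 + b1)                      # hidden layer: weighted sum + f
scores = h @ W2 + b2                          # output layer: one score per class
print("predicted class:", int(np.argmax(scores)))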

Ensemble learning theory

The ensemble learning methodology combines several individual models in order to generate one optimal predictive model, thereby improving the performance and the classification results of FDD techniques. Generally, the boosting and bagging strategies are the most well known and most widely used in the literature.

Boosting strategy

The boosting methodology, often known as a sequential ensemble [56], is used in ensemble models to improve the generalization of weak learners [57]. In boosting, the predictors are created sequentially instead of independently: each subsequent predictor learns from the errors of its predecessors, so the resulting predictions become progressively more accurate.
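
To make the sequential re-weighting idea concrete, the following Python sketch implements a generic AdaBoost-style boosting loop for binary labels in {-1, +1} with decision stumps as weak learners; it is a textbook illustration of the principle, not the specific boosting configuration used in the paper.

# Generic AdaBoost-style boosting sketch (binary labels in {-1, +1}).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost(X, y, rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)                      # start with uniform sample weights
    learners, alphas = [], []
    for _ in range(rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(np.sum(w[pred != y]), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)    # confidence of this learner
        w *= np.exp(-alpha * y * pred)           # up-weight misclassified samples
        w /= w.sum()
        learners.append(stump)
        alphas.append(alpha)
    return learners, np.array(alphas)

def boost_predict(learners, alphas, X):
    votes = sum(a * clf.predict(X) for clf, a in zip(learners, alphas))
    return np.sign(votes)                        # weighted vote of all learners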

Bagging strategy

One of the most common techniques for generating ensemble-based algorithms is bagging [58], also called bootstrap aggregating, which is deployed to enhance the performance of an ensemble classifier. Its objective is to generate a series of independent bootstrap samples with the same size and distribution as the raw dataset, and to build from them an ensemble predictor that is more accurate than a single predictor trained on the raw dataset. Bagging thus involves two tasks: the first is the creation of the bagged samples and the transfer of each bag of observations to a base model, and the second is the merging of the predictions of the various base predictors. The way these outputs are combined depends on the task: majority voting is used for classification problems and averaging for regression problems in order to create the ensemble output.
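
The following Python sketch illustrates the two bagging tasks just described, bootstrap sampling (same size, with replacement) and majority voting, using an MLP as the base classifier; the number of estimators and the network size are illustrative assumptions.

# Bagging sketch: bootstrap resampling plus majority voting over an ANN (MLP)
# base classifier; sizes and parameters are illustrative only.
import numpy as np
from sklearn.base import clone
from sklearn.neural_network import MLPClassifier

def bag(X, y, n_estimators=10, seed=0):
    rng = np.random.default_rng(seed)
    base = MLPClassifier(hidden_layer_sizes=(50,), max_iter=500)
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(y), size=len(y))   # bootstrap: same size, with replacement
        models.append(clone(base).fit(X[idx], y[idx]))
    return models

def bag_predict(models, X):
    # assumes non-negative integer class labels; averaging would replace the
    # vote for regression problems
    preds = np.stack([m.predict(X) for m in models])       # (n_estimators, n_samples)
    return np.array([np.bincount(col).argmax() for col in preds.T.astype(int)])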

Concept of the sine-cosine optimization algorithm (SCOA)

The SCOA is a swarm-based optimization methodology first suggested by Mirjalili in 2016 [57, 58]. It is based on the periodic behavior of the sine and cosine functions and is motivated by transcendental function theory. Like other optimization techniques, SCOA performs the optimization through mathematical rules: a set of randomly initialized candidate solutions fluctuates toward or away from the best position found so far (the optimal solution). Dynamic, randomized parameters balance search exploration and exploitation across the different optimization milestones [59]. The two-phase position update equation used by SCOA is given as follows:

$${p}_{ij}^{k+1}=\left\{\begin{array}{ll}{p}_{ij}^{k}+{r}_{1}\times \mathrm{sin}\left({r}_{2}\right)\times \left|{r}_{3}\times {P}_{j}^{k}-{p}_{ij}^{k}\right|, & {r}_{4}<0.5\\ {p}_{ij}^{k}+{r}_{1}\times \mathrm{cos}\left({r}_{2}\right)\times \left|{r}_{3}\times {P}_{j}^{k}-{p}_{ij}^{k}\right|, & {r}_{4}\ge 0.5\end{array}\right.$$
(2)

where \({p}_{ij}^{k+1}\) denotes the position of the \(i\)th individual in the \(j\)th dimension at the \((k+1)\)th iteration, and \({P}_{j}^{k}\) denotes the global best position in the \(j\)th dimension at the \(k\)th iteration. The parameter \({r}_{1}\) decreases linearly with the iterative process and is used to ensure the balance between exploration and exploitation. It is defined as

$${r}_{1}=\sigma -k\frac{\sigma }{\overline{k} }$$
(3)

where \(\overline{k}\) indicates the maximum number of iterations and \(\sigma\) is a constant. \({r}_{2}\) is a random number uniformly distributed in \(\left[0, 2\pi \right]\), \({r}_{3}\) is a random number uniformly distributed in \(\left[0, 2\right]\), and \({r}_{4}\) is a random number in \(\left[0, 1\right]\) that switches with equal probability between the sine and cosine functions. When \({r}_{3}>1\), the exchange of information between \({P}_{j}^{k}\) and \({p}_{ij}^{k}\) is amplified, whereas when \({r}_{3}<1\) their mutual influence is reduced. Figure 3 depicts the SCOA search mode diagram.
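
A minimal Python sketch of the update rules of Eqs. (2) and (3) is given below; the population size, number of iterations, and the value of \(\sigma\) are illustrative, and the sphere function stands in for the actual feature-selection fitness. For feature selection, the continuous positions would typically be thresholded into a binary mask over the measured variables.

# Sketch of the SCOA position update of Eqs. (2)-(3); fitness function,
# population size, and sigma are placeholders chosen for illustration.
import numpy as np

def scoa(fitness, dim, n_sol=10, max_iter=100, sigma=2.0, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.random((n_sol, dim))                    # random initial solutions
    best = pop[np.argmin([fitness(p) for p in pop])].copy()
    for k in range(max_iter):
        r1 = sigma - k * sigma / max_iter             # Eq. (3): linear decrease
        for p in pop:
            r2 = rng.uniform(0, 2 * np.pi, dim)
            r3 = rng.uniform(0, 2, dim)
            r4 = rng.random(dim)
            step = np.where(r4 < 0.5,
                            r1 * np.sin(r2) * np.abs(r3 * best - p),
                            r1 * np.cos(r2) * np.abs(r3 * best - p))
            p += step                                  # Eq. (2), in place
        cand = pop[np.argmin([fitness(p) for p in pop])]
        if fitness(cand) < fitness(best):
            best = cand.copy()
    return best

# Example: minimize the sphere function in 6 dimensions
print(scoa(lambda v: np.sum(v ** 2), dim=6))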

Fig. 3

Diagram of the SCOA search mode

System description

In this research, a variable-speed wind turbine based on a squirrel cage induction generator (SCIG) is considered, as displayed in Fig. 4.

Fig. 4

Variable speed wind turbine based on SCIG and converter topology

The employed system model comprises two major sub-systems: the squirrel cage induction generator (SCIG), which is monitored and controlled by the stator-side AC/DC converter, and the grid-side DC/AC converter. This structure permits operation over a continuously variable speed range. Additionally, regardless of the machine's rotation speed, the generator output is converted into the required direct current and voltage. Furthermore, in this structure, the generator-side converter is based on Insulated Gate Bipolar Transistors (IGBTs), with the same configuration as the grid-side converter. Table 1 lists the properties of the wind turbine.

Table 1 Properties of wind turbine

The power conversion topology used in the wind chain comprises two stages: the generator-side converter and the grid-side converter. Each converter has three arms, and each arm is made up of a high-side and a low-side IGBT, as shown in Fig. 5.

Fig. 5

Variable speed wind turbine based on SCIG and converter topology

Results and discussion

Data collection

In this study, we use data obtained from a healthy WECS, into which several faulty scenarios are then injected: open-circuit (OC), short-circuit (SC), and wear-out (WO) faults. In other words, we initially considered how the system behaves in a healthy condition, and then we injected each faulty scenario independently, considering how each failure affects the system's behavior. The transitional regime that appears when switching from a healthy state to a faulty condition is not taken into account. These faults are thoroughly described in Table 2. An internal resistance of two ohms is used to represent the final fault (the WO fault). Each operating mode is described by 2000 ten-time-lagged samples recorded over a one-second duration at a sampling frequency of 20 kHz.
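
One possible reading of this description, shown purely as an assumption, is that the 20,000 raw samples recorded per second are decimated with a lag of ten, leaving 2000 samples per operating mode; the sketch below illustrates that interpretation with a placeholder signal.

# Hedged sketch: one reading of "2000 ten-time-lagged samples over one second
# at 20 kHz" is that every 10th raw sample is kept (20000 / 10 = 2000).
# This is an assumption about the preprocessing, not the authors' code.
import numpy as np

fs, duration, lag = 20_000, 1.0, 10             # sampling frequency, seconds, lag
t = np.arange(int(fs * duration)) / fs          # 20000 raw time stamps
raw = np.sin(2 * np.pi * 50 * t)                # placeholder 50 Hz signal
lagged = raw[::lag]                             # keep every 10th sample
print(lagged.shape)                             # (2000,)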

Table 2 Description of the diverse labeled failures injected in the WEC system

Seven operating modes of the WECS are used in this study, including one healthy case (designated as C1) and six different faulty states (C2–C7). The twelve measurement variables used to represent these seven scenarios are detailed in Table 3.

Table 3 Labeling, description and ranges of the measured and monitored system variables

The generator variables \({i}_{\mathrm{sd}}, {i}_{\mathrm{sq}}\) and the grid variables \({i}_{\mathrm{sd}}, {i}_{\mathrm{sq}}, {i}_{\mathrm{sar}}, {i}_{\mathrm{sbr}}\) can be obtained using the Park transformation with angle \(\theta\) (rad):

$$\left[\begin{array}{c}{i}_{sd}\\ {i}_{sq}\end{array}\right]=\frac{2}{3}\left[\begin{array}{ccc}\mathrm{cos}\,\theta & \mathrm{cos}\left(\theta -\frac{2\pi }{3}\right) & \mathrm{cos}\left(\theta +\frac{2\pi }{3}\right)\\ \mathrm{sin}\,\theta & \mathrm{sin}\left(\theta -\frac{2\pi }{3}\right) & \mathrm{sin}\left(\theta +\frac{2\pi }{3}\right)\end{array}\right]\left[\begin{array}{c}{i}_{sa}\\ {i}_{sb}\\ {i}_{sc}\end{array}\right]$$
(4)
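
The following Python fragment is a direct transcription of Eq. (4), mapping the three measured phase currents to the (d, q) frame for a given angle \(\theta\); the balanced-current example is only for checking the transform.

# Park transformation of Eq. (4): (i_sa, i_sb, i_sc) -> (i_sd, i_sq).
import numpy as np

def park(i_abc, theta):
    c, s = np.cos, np.sin
    T = (2.0 / 3.0) * np.array([
        [c(theta), c(theta - 2 * np.pi / 3), c(theta + 2 * np.pi / 3)],
        [s(theta), s(theta - 2 * np.pi / 3), s(theta + 2 * np.pi / 3)],
    ])
    return T @ np.asarray(i_abc)                 # [i_sd, i_sq]

# Example: balanced three-phase currents at angle theta = 0.3 rad
theta = 0.3
i_abc = [np.cos(theta), np.cos(theta - 2*np.pi/3), np.cos(theta + 2*np.pi/3)]
print(park(i_abc, theta))                        # approximately [1, 0]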

Table 4 depicts the distinct operating scenarios. Half of the observations (50%) were used for training and the other half for testing.

Table 4 Construction of database for fault detection and diagnosis system

Certain electrical and mechanical variables under various faulty situations are displayed in Figs. 6, 7, 8, 9 and 10.

Evaluation metrics

Different metrics, often known as performance or evaluation metrics, are used to fully assess the effectiveness and quality of the model. These metrics enable us to evaluate how well the model performs on the supplied data and, in this manner, to improve its performance by tuning the hyperparameters. The adopted criteria are accuracy (%), which denotes the proportion of correctly predicted samples over the total number of observations; recall (%), which denotes the proportion of positive samples of the relevant class that are correctly predicted; precision (%), which denotes the number of correctly predicted positive samples divided by the total number of predicted positive observations; and computation time (CT (s)), which represents the time required to run the algorithm.

Fig. 6

Mechanical torque for different operating modes

Fig. 7

Generator speed for different operating modes

Fig. 8

Generator current for different operating modes

Fig. 9

Bus voltage for different operating modes

Fig. 10

Grid current for different operating modes

$$\mathrm{Accuracy}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}$$
(5)
$$\mathrm{Recall}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$$
(6)
$$\mathrm{Precision}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$$
(7)

where \(\mathrm{TP}\) (true positives) is the number of observations that are correctly identified, \(\mathrm{TN}\) (true negatives) is the number of observations that are correctly dismissed, \(\mathrm{FP}\) (false positives) is the number of observations that are incorrectly identified, and \(\mathrm{FN}\) (false negatives) is the number of observations that are incorrectly dismissed.
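
The following Python sketch computes the per-class accuracy, recall, and precision of Eqs. (5)–(7) from a confusion matrix; the small three-class matrix shown is purely illustrative and is not taken from Table 7.

# Per-class accuracy, recall, and precision from TP/TN/FP/FN counts, Eqs. (5)-(7).
import numpy as np

def per_class_metrics(cm):
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp                     # predicted as class c but actually not c
    fn = cm.sum(axis=1) - tp                     # class c predicted as something else
    tn = total - tp - fp - fn
    accuracy = (tp + tn) / total                 # Eq. (5)
    recall = tp / (tp + fn)                      # Eq. (6)
    precision = tp / (tp + fp)                   # Eq. (7)
    return accuracy, recall, precision

cm = [[95, 5, 0],
      [3, 90, 7],
      [0, 2, 98]]                                # rows: true class, columns: predicted class
print(per_class_metrics(cm))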

Discussion

In order to demonstrate the effectiveness of the presented approach in terms of diagnostic recall, precision, accuracy, and computation time, a number of methods, including CFNN, FFNN, ANN, GRNN, KNN, and SVM, have been employed. The different existing methods are modeled and tested in MATLAB. To evaluate the overall effectiveness of the compared strategies, the accuracy was computed using 10-fold cross-validation. For the FFNN, CFNN, GRNN, and ANN, 10 hidden layers with 50 hidden neurons in each layer were chosen, and a sigmoid function is used in the hidden layers to introduce non-linearity. The \(K\) and \(C\) parameters of the SVM are set to the values yielding the lowest RMSE, and the \(K\) value for KNN is equal to 3. The maximum number of iterations for the SCOA is 100, and the number of candidate solutions is 10.
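
As a hedged illustration of this evaluation protocol, the sketch below runs 10-fold cross-validated accuracy for a few of the baseline classifiers using scikit-learn; the parameters stated in the text (k = 3 for KNN, a sigmoid/logistic hidden activation, 50 hidden neurons) are reused where possible, while everything else is an assumption rather than the authors' exact MATLAB setup.

# Hedged sketch of the comparison protocol: 10-fold cross-validated accuracy
# for a few baseline classifiers; hyperparameters follow the text where given,
# the rest are assumptions.
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

def compare(X, y):
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    models = {
        "KNN": KNeighborsClassifier(n_neighbors=3),
        "SVM": SVC(),     # C and kernel would be tuned (lowest-RMSE setting in the paper)
        "ANN": MLPClassifier(hidden_layer_sizes=(50,), activation="logistic",
                             max_iter=1000),
    }
    return {name: cross_val_score(m, X, y, cv=cv).mean() for name, m in models.items()}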

The comparison in Table 5 shows that the proposed EL strategy, which combines the bagging and boosting strategies with the ANN classifier, performs much better than the ANN and the other methodologies in terms of accuracy (98.88%). Although the suggested EL approach produces good classification accuracy compared to conventional techniques, it still suffers from a difficult training phase and a high time complexity. To deal with this drawback, we employed the sine-cosine optimization algorithm (SCOA) to select the most descriptive features and thereby reduce the computation time, which is a significant challenge in the fault diagnosis domain, as well as to accelerate the learning and classification tasks. As a result, the computation time is significantly decreased from 23.74 s to 12.00 s, with only a minor reduction in accuracy (0.53%). The poor KNN and SVM classification outcomes are attributable to the direct use of raw data, which confirms the benefit of selecting the most significant features before performing the classification task. Six of the 12 features were selected by the developed EL-SCOA strategy, as shown in Table 6.

Table 5 Performance evaluations of various classification methods
Table 6 The selected features and the performance evaluations of the evolved classification strategies

Table 7 illustrates the testing classification outcomes of the different classes using a confusion matrix (CM) to further demonstrate the effectiveness of the evolved methodology. This matrix displays both the samples that were correctly assigned to the healthy condition (C1) and the various faulty operating states (C2 to C7) and the samples that were incorrectly assigned. Specifically, the X and Y axes represent the true classes and the predicted conditions, respectively.

Table 7 Confusion matrix for the EL-based SCOA in the testing phase

Table 7 demonstrates that the EL-based SCOA strategy correctly identifies all 2000 observations (2000 true positives) for operating modes C2, C3, and C7, indicating that these modes are classified with no misclassification. However, some misclassification occurs for the healthy state (C1) and faulty modes 3 (C4), 4 (C5), and 5 (C6): 142 observations from the healthy class are classified as class C5, 9 observations from class C4 as class C5, and 4 samples from class C6 as class C1.

Conclusions

This paper developed an enhanced fault detection and diagnosis approach, called the ensemble learning-based sine-cosine optimization algorithm (EL-SCOA), for wind energy conversion (WEC) systems. In the presented methodology, the sine-cosine algorithm is used to optimize, select, and extract the most informative features from the raw data, and the selected features are then fed to the classification technique for diagnosis purposes. The classification method incorporates bagging and boosting as ensemble methods and an ANN as a baseline classifier. The proposed paradigm discriminates between the various operating states induced by short-circuit, open-circuit, and wear-out faults introduced at various locations and sides (generator and grid sides). Compared to other existing methods, including ANN, KNN, CFNN, FFNN, GRNN, and SVM, the experimental outcomes show that the suggested strategy performs very well. The effectiveness of the suggested technique encourages us to further examine its computation time and memory storage in future research. In order to simultaneously improve diagnosis accuracy and decrease WEC system execution time, a strategy that combines data size reduction with the aforementioned technique will be proposed.

Availability of data and materials

Data will be made available on request.

Abbreviations

SCA:

Sine cosine algorithm

SCOA:

Sine cosine optimization algorithm

EL:

Ensemble learning

WECS:

Wind energy conversion system

WT:

Wind turbine

FDD:

Fault detection and diagnosis

FE:

Feature extraction

FS:

Feature selection

ANN:

Artificial neural network

NN:

Neural network

RNN:

Recurrent NN

FFNN:

Feed-forward NN

CFNN:

Cascade forward NN

GRNN:

Generalized regression NN

RF:

Random forest

KNN:

K-nearest neighbors

SC:

Short circuit

OC:

Open circuit

WO:

Wear out

SVM:

Support vector machine

CT:

Computation time

CM:

Confusion matrix

References

  1. Murgas B, Henao A, Guzman L (2021) Evaluation of investments in wind energy projects, under uncertainty. state of the art review. Appl Sci 11(21):10213

  2. Singh G, Sundaram K (2022) Methods to improve wind turbine generator bearing temperature imbalance for onshore wind turbines. Wind Eng 46(1):150–159

  3. Xu Y, Nascimento NMM, de Sousa PHF, Nogueira FG, Torrico BC, Han T, Jia C, Rebouças filho PP (2021) Multi-sensor edge computing architecture for identification of failures short-circuits in wind turbine generators. Appl Soft Comput 101:107053

  4. López-Uruñuela FJ, Fernandez-Diaz B, Pagano F, López-Ortega A, Pinedo B, Bayón R, Aguirrebeitia J (2021) Broad review of “white etching crack” failure in wind turbine gearbox bearings: Main factors and experimental investigations. Int J Fatigue 145:106091

  5. Wang L, Zhang Z, Long H, Xu J, Liu R (2016) Wind turbine gearbox failure identification with deep neural networks. IEEE Transact Industrial Inform 13(3):1360–1368

  6. Kouadri A, Hajji M, Harkat M-F, Abodayeh K, Mansouri M, Nounou H, Nounou M (2020) Hidden markov model based principal component analysis for intelligent fault diagnosis of wind energy converter systems. Renew Energ 150:598–606

  7. Xiao C, Liu Z, Zhang T, Zhang X (2021) Deep learning method for fault detection of wind turbine converter. Appl Sci 11(3):1280

  8. Ravikumar K, Subbiah R, Ranganathan N, Bensingh J, Kader A, Nayak SK (2020) A review on fatigue damages in the wind turbines: Challenges in determining and reducing fatigue failures in wind turbine blades. Wind Eng 44(4):434–451

  9. Mishnaevsky L Jr (2022) Root causes and mechanisms of failure of wind turbine blades: Overview. Materials. 15:2959

  10. Fezai R, Dhibi K, Mansouri M, Trabelsi M, Hajji M, Bouzrara K, Nounou H, Nounou M (2020) Effective random forest-based fault detection and diagnosis for wind energy conversion systems. IEEE Sensors J 21(5):6914–6921

  11. Donadio L, Fang J, Porté-Agel F (2021) Numerical weather prediction and artificial neural network coupling for wind energy forecast. Energies 14(2):338

  12. Irfan MM, Malaji S, Patsa C, Rangarajan SS, Hussain SS (2022) Control of dstatcom using ann-bp algorithm for the grid connected wind energy system. Energies 15(19):6988

  13. Mansouri M, Dhibi K, Nounou H, Nounou M (2022) An effective fault diagnosis technique for wind energy conversion systems based on an improved particle swarm optimization. Sustainability 14(18):11195

  14. Mansouri M, Fezai R, Trabelsi M, Nounou H, Nounou M, Bouzrara K (2021) Reduced gaussian process regression based random forest approach for fault diagnosis of wind energy conversion systems. IET Renew Power Gener 15(15):3612–3621

  15. Dietterich TG (2000) Ensemble methods in machine learning. Multiple Classifier Systems: First International Workshop, MCS 2000 Cagliari, Italy, June 21–23, 2000 Proceedings 1. Springer, Cagliari, Italy, pp 1–15

  16. Kohavi R, Wolpert DH et al (1996) Bias plus variance decomposition for zero-one loss functions. In: ICML, vol 96. Citeseer, pp 275–283

  17. Breiman L (2001) Random forests. Machine Learn 45:5–32

  18. Zhao J, Gao X, Yang (2005). A survey of neural network ensembles, in: 2005 international conference on neural networks and brain, vol 1. IEEE, Beijing, pp 438–442

  19. Rokach L (2010) Ensemble-based classifiers. Artif Intell Rev 33:1–39

  20. Gopika D, Azhagusundari B (2014) An analysis on ensemble methods in classification tasks

  21. Yang P, Hwayang Y, Zhou BB, Zomaya AY (2010) A review of ensemble methods in bioinformatics. Current Bioinformatics. 5(4):296–308

  22. Mendes-Moreira J, Soares C, Jorge AM, Sousa JFD (2012) Ensemble approaches for regression: a survey. ACM Comput Surveys (csur) 45(1):1–40

  23. Ren Y, Suganthan P, Srikanth N (2015) Ensemble methods for wind and solar power forecasting—a state-of-the-art review. Renew Sustain Energ Rev 50:82–91

  24. Vega-Pons S, Ruiz-Shulcloper J (2011) A survey of clustering ensemble algorithms. Int J Pattern Recogn Artif Intell 25(03):337–372

  25. Dhibi K, Mansouri M, Bouzrara K, Nounou H, Nounou M (2022) Reduced neural network based ensemble approach for fault detection and diagnosis of wind energy converter systems. Renew Energ. 194:778–787

  26. Dhibi K, Mansouri M, Bouzrara K, Nounou H, Nounou M (2021) An enhanced ensemble learning-based fault detection and diagnosis for grid-connected pv systems. IEEE Access 9:155622–155633

  27. Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Machine Learn Res. 3:1157–1182

  28. Hua J, Tembe WD, Dougherty ER (2009) Performance of feature-selection methods in the classification of high-dimension data. Pattern Recogn 42(3):409–424

  29. Gómez-Verdejo V, Verleysen M, Fleury J (2009) Information-theoretic feature selection for functional data classification. Neurocomputing 72(16–18):3580–3589

  30. Al-Abdallah RZ, Jaradat AS, Doush IA, Jaradat YA (2017) A binary classifier based on firefly algorithm. Jordan J Comput Inform Technol (JJCIT) 3(3):172–185

  31. Liu H, Yu L (2005) Toward integrating feature selection algorithms for classification and clustering. IEEE Transact Knowledge Data Eng 17(4):491–502

  32. Boutemedjet S, Bouguila N, Ziou D (2008) A hybrid feature extraction selection approach for high-dimensional non-gaussian data clustering. IEEE Transact Pattern Analysis Machine Intell 31(8):1429–1443

  33. ElMustafa S, Jaradat A, Doush IA, Mansour N (2017) Community detection using intelligent water drops optimisation algorithm. Int J Reason-based Intell Syst 9(1):52–65

  34. Huang K, Aviyente S (2008) Wavelet feature selection for image classification. IEEE Transact Image Process 17(9):1709–1720

  35. Chen B, Chen L, Chen Y (2013) Efficient ant colony optimization for image feature selection. Sign Process 93(6):1566–1576

  36. Sawalha R, Doush IA (2012) Face recognition using harmony search-based selected features. Int J Hybrid Inform Technol 5(2):1–16

  37. Shang W, Huang H, Zhu H, Lin Y, Qu Y, Wang Z (2007) A novel feature selection algorithm for text categorization. Exp Syst Appl 33(1):1–5

  38. Zheng Z, Wu X, Srihari R (2004) Feature selection for text categorization on imbalanced data. ACM Sigkdd Explor Newslett 6(1):80–89

  39. Neggaz N, Ewees AA, Abd Elaziz M, Mafarja M (2020) Boosting salp swarm algorithm by sine cosine algorithm and disrupt operator for feature selection. Exp Syst Appl 145:113103

  40. Sindhu R, Ngadiran R, Yacob YM, Zahri NAH, Hariharan M (2017) Sine–cosine algorithm for feature selection with elitism strategy and new updating mechanism. Neural Comput Appl 28:2947–2958

  41. Eid MM, El-kenawy E-SM, Ibrahim A (2021) A binary sine cosine modified whale optimization algorithm for feature selection. 2021 National Computing Colleges Conference (NCCC). IEEE, Taif, Saudi Arabia, pp 1–6

  42. Hussain K, Neggaz N, Zhu W, Houssein EH (2021) An efficient hybrid sine-cosine harris hawks optimization for low and high-dimensional feature selection. Exp Syst Appl 176:114778

  43. Abd Elaziz ME, Ewees AA, Oliva D, Duan P, Xiong S (2017) A hybrid method of sine cosine algorithm and differential evolution for feature selection. In: Neural Information Processing: 24th International Conference, ICONIP 2017, Guangzhou, China, November 14–18, 2017, Proceedings, Part V. Springer, pp 145–155

  44. Abualigah L, Dulaimi AJ (2021) A novel feature selection method for data mining tasks using hybrid sine cosine algorithm and genetic algorithm. Cluster Comput 24:2161–2176

  45. Abd Elaziz M, Oliva D, Xiong S (2017) An improved opposition-based sine cosine algorithm for global optimization. Expert Syst Appl 90:484–500

  46. Sindhu R, Ngadiran R, Yacob YM, Hanin Zahri NA, Hariharan M, Polat K (2019) A hybrid SCA inspired BBO for feature selection problems. Math Problems Eng 2019

  47. Dhunny A, Timmons D, Allam Z, Lollchund M, Cunden T (2020) An economic assessment of near-shore wind farm development using a weather research forecast-based genetic algorithm model. Energy 201:117541

  48. Hichri A, Hajji M, Mansouri M, Abodayeh K, Bouzrara K, Nounou H, Nounou M (2022) Genetic-algorithm-based neural network for fault detection and diagnosis: Application to grid-connected photovoltaic systems. Sustainability 14(17):10518

  49. Kartite J, Cherkaoui M (2017) Improved backtracking search algorithm for renewable energy system. Energy Procedia 141:126–130

  50. Salcedo-Sanz S, Gallo-Marazuela D, Pastor-Sánchez A, CarroCalvo L, Portilla-Figueras A, Prieto L (2014) Offshore wind farm design with the coral reefs optimization algorithm. Renew Energy. 63:109–115

  51. He Z, Chen Y, Shang Z, Li C, Li L, Xu M (2019) A novel wind speed forecasting model based on moving window and multi-objective particle swarm optimization algorithm. Appl Math Model 76:717–740

  52. Zhang Q, Qian H, Chen Y, Lei D (2020) A short-term traffic forecasting model based on echo state network optimized by improved fruit fly optimization algorithm. Neurocomputing 416:117–124

  53. Hajji M, Yahyaoui Z, Mansouri M, Nounou H, Nounou M (2023) Fault detection and diagnosis in grid-connected pv systems under irradiance variations. Energ Rep 9:4005–4017

  54. Hippert HS, Pedreira CE, Souza RC (2001) Neural networks for short term load forecasting: a review and evaluation. IEEE Transact Power Syst 16(1):44–55

  55. Jamii J, Mansouri M, Trabelsi M, Mimouni MF, Shatanawi W (2022) Effective artificial neural network-based wind power generation and load demand forecasting for optimum energy management. Front Energy Res. 10:898413

  56. Ganaie MA, Hu M, Malik A, Tanveer M, Suganthan P (2022) Ensemble deep learning: a review. Eng Appl Artificial Intell 115:105151

  57. Zhang W, Jiang J, Shao Y, Cui B (2020) Snapshot boosting: a fast ensemble framework for deep neural networks. Sci China Inform Sci 63:1–12

  58. Breiman L (1996) Bagging predictors. Machine Learn 24:123–140

  59. Shang C, Zhou T-T, Liu S (2022) Optimization of complex engineering problems using modified sine cosine algorithm. Sci Rep 12(1):20528

Acknowledgements

The publication is the result of the Qatar National Research Fund (QNRF) research grant.

Funding

Funding provided by the Qatar National Library.

Author information

Contributions

KA wrote the original draft and worked on the software. KD worked on the software. MM and MH defined the methodology, reviewed, and edited the manuscript. Kais KB and HN supervised the work. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Majdi Mansouri.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Attouri, K., Dhibi, K., Mansouri, M. et al. Enhanced fault diagnosis of wind energy conversion systems using ensemble learning based on sine cosine algorithm. J. Eng. Appl. Sci. 70, 56 (2023). https://doi.org/10.1186/s44147-023-00227-3
