Soft computing techniques to predict the compressive strength of groundnut shell ash-blended concrete

Sathiparan, Navaratnarajah; Jeyananthan, Pratheeba

doi:10.1186/s44147-023-00302-9

Research
Open access
Published: 06 November 2023

Journal of Engineering and Applied Science volume 70, Article number: 134 (2023)

Abstract

Using groundnut shell ash (GSA) as a component in concrete mixtures is a viable approach to achieving sustainability in building practices. Such concrete can mitigate the high CO₂ emissions and embodied energy that are primarily attributed to the heavy use of cement in conventional construction materials. When GSA is utilized as a partial replacement for cement, the strength characteristics of concrete are influenced not only by the quantity of GSA replacement but also by several other factors, including cement content, water-to-cement ratio, coarse aggregate content, fine aggregate content, and curing duration. This work develops predictive models for the compressive strength (CS) of GSA mixed concrete using machine learning (ML) methods. The models were constructed with 297 data sets obtained from published literature. The input variables were cement content, GSA content, fine aggregate content, coarse aggregate content, water content, and curing duration. The output variable was the CS of concrete. Seven ML algorithms were compared using statistical performance indicators to identify the most precise and reliable model for predicting the CS of GSA mixed concrete: linear regression, full quadratic model, artificial neural network, boosted decision tree regression, random forest regression, k-nearest neighbors, and support vector regression. Among these, the random forest regression model showed the best performance in predicting the CS of GSA mixed concrete. For the training data, R² is 0.91 with an RMSE of 2.48 MPa; for the test data, R² is 0.89 with an RMSE of 2.42 MPa. The sensitivity analysis of the random forest regression model indicates that the cement content primarily drives the CS, followed by the curing period and GSA content. This work systematically evaluates the CS of GSA mixed concrete, contributing to the existing body of knowledge and practical implementation in this domain.

Introduction

Cement manufacturing is a significant contributor to the emission of CO₂ into the atmosphere, accounting for around 8% of global anthropogenic CO₂ emissions. Annual cement production exceeds 4 billion tonnes [48], and each tonne of cement releases approximately 900 kg of CO₂ [7]. These emissions arise from burning fossil fuels to generate the heat needed for cement manufacturing and from the thermal breakdown of calcium carbonate during clinker production. About 30–40% of the emissions from this process come from fuel combustion, while 60–70% come from decarbonation [10, 16, 17, 60]. Despite its high CO₂ emissions, cement is necessary for most building materials, including concrete.

The construction industry increasingly needs alternative cementitious materials that can serve as viable substitutes for cement in building applications [57]. This is driven by the urgent need to reduce CO₂ emissions and embodied energy, since these factors play a crucial role in mitigating global warming in the long term [42]. Building materials often use a diverse range of supplementary cementitious materials [8, 9]. These materials encompass metakaolin [55], silica fume [20], volcanic pozzolanas [22], granulated blast furnace slag [38], and limestone [61]. Using these industrial by-products as a financially viable substitute for cement does not impair the mechanical properties or durability of the construction materials. Nevertheless, the availability of these industrial by-products is expected to diminish, and it is already limited, especially in less developed nations [9]. The construction sector is therefore very interested in using agro-waste as a cement substitute. Agro-wastes that have been widely used as cement substitutes include sugarcane bagasse ash [21], rice husk ash [26, 34, 53], and sawdust ash [5]. The existing literature indicates that construction materials incorporating agro-waste meet the minimum standards given in regional building codes. Moreover, using these agricultural by-products in the manufacturing of building materials reduces ecological harm [59]. Most agricultural waste is otherwise unprocessed, unused, and often indiscriminately burned, dumped, or landfilled [45].

One of these agricultural wastes is groundnut shells, a by-product of groundnut (peanut) processing. Global peanut production peaked at around 47 million tonnes in 2020. China was the largest producer, accounting for 40% (or 18 million tonnes) of world peanut production [49, 52]. About 21–29% of the weight of the peanut is in the shell [11, 13]. Thus, the peanut industry generates about 11 million tonnes of peanut shell waste yearly [40]. A significant quantity of peanut shells is utilized as biomass for energy. However, a greater volume of discarded peanut shells is disposed of with ordinary waste. Using groundnut shells and their derivatives as construction materials is a viable solution to mitigate the environmental challenges linked to cement consumption and the management of groundnut shell waste. Numerous studies have been conducted on the use of GSA as a potential alternative to cement in concrete and cement mortar. Additionally, GSA has been investigated as a stabilizing agent for soil, road foundation, and masonry blocks. Its application as a precursor in the development of geopolymer materials has also been explored.

When GSA is utilized as a partial substitute for cement, the strength characteristics of concrete are influenced not only by the quantity of GSA replacement but also by several other factors, including cement content, W/C ratio, coarse aggregate content, fine aggregate content, and curing length. Therefore, it is crucial to examine the impact of these factors on the CS of GSA blended concrete and put forward a method for forecasting the CS of GSA blended concrete. Nevertheless, a prediction model for the CS of concrete with GSA has not yet been developed.

In recent years, engineers and academics have become increasingly interested in using ML techniques to predict the characteristics of building materials [14, 33, 50, 51]. The properties of GSA mixed concrete are sensitive to the mixing proportions and are influenced by many variables, which makes ML approaches well suited to predicting these properties. More advanced techniques have been suggested as a way to minimize reliance on laboratory testing, and engineers should be equipped with essential tools and mathematical equations to predict the outcomes of tests [49, 56]. ML techniques can provide alternative approaches for both linear and nonlinear problems in which conventional mathematical models fail to define precisely the interdependencies among the variables involved [15, 62].

The primary objective of the current study is to use ML approaches to forecast the CS of GSA mixed concrete. Mix design parameters are used to develop predictive models for CS, so that these models can be used in the construction industry without the need for prior theoretical understanding. To determine the most precise and dependable model for predicting the CS of GSA blended concrete, a statistical evaluation was conducted using seven distinct machine-learning techniques: linear regression, full quadratic model, artificial neural network, boosted decision tree regression, random forest regression, k-nearest neighbors, and support vector regression. The proposed models provide a means to enhance the accuracy of predicting the CS of GSA mixed concrete.

Methods

The approach used in the present work encompasses a sequence of procedures, visually shown as a flowchart in Fig. 1. The primary procedures include the following actions:

Generating and gathering information about GSA mixed concrete from existing scholarly sources.
The predictors for the models include cement content (CC), groundnut shell ash content (GSA), fine aggregate (FA), coarse aggregate (CA), quantity of water content (WC), and curing duration (t). The target value of the models is the CS of concrete.
The acquired data is randomly divided into two groups, with two-thirds of the total data assigned to the training group and the remaining one-third allocated to the testing group.
Developing predictive models using LR, FQ, ANN, BDT, RFR, KNN, and SVR machine learning algorithms.
Assessing the proposed models based on the following performance indicators: R², RMSE, MAE, SI, a20-index, and OBJ.
Performing a sensitivity analysis with SHAP to identify the most influential parameters for forecasting the CS of GSA mixed concrete.

Data collection

The database used in this work was constructed using seventeen previously published experimental investigations, as shown in Table 1. These studies were employed to create 297 experimental datasets for the database. All datasets included in this study were generated via samples that adhered to internationally recognized standards for casting and testing. The parameters used in this database were as follows:

Cement content, CC (in kg/m³)
Groundnut shell ash content, GSA (in kg/m³)
Fine aggregate content, FA (in kg/m³)
Coarse aggregate content, CA (in kg/m³)
Water content, WC (in kg/m³)
Curing period, t (in days)
Compressive strength, CS (in MPa)

Table 1 The experimental data set collected from published literature

The collected data were randomly partitioned into two groups with the RAND function. The first subset of 198 data sets (two-thirds of the total) was used for model development. The remaining 99 data sets (one-third of the total) were used to verify the models constructed from the first group.
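The random two-thirds to one-third partition described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the study reports using the RAND function, whereas the sketch uses Python's `random` module, and the function name `split_two_thirds`, the seed, and the placeholder records are assumptions made for the example.

```python
import random

def split_two_thirds(records, seed=0):
    """Shuffle the records and return (train, test) with two-thirds in train."""
    rng = random.Random(seed)          # fixed seed (an assumption) for repeatability
    shuffled = list(records)           # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * 2 / 3)
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder indices standing in for the 297 collected data sets.
records = list(range(297))
train, test = split_two_thirds(records)
print(len(train), len(test))
```

With 297 records, the split yields 198 training and 99 test records, matching the counts stated in the text.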

Machine learning modeling

Linear regression

LR is a supervised ML approach used to determine the linear association between a dependent parameter and a set of independent parameters. The model postulates a linear association between the input parameters and the single output parameter. The technique seeks the linear equation that best predicts the value of the dependent variable from the independent variables. The LR model in Eq. (1) was used to predict the CS of GSA blended concrete.

$$CS = {\alpha }_{0}+{\alpha }_{1}(CC)+{\alpha }_{2}(GSA)+{\alpha }_{3}(FA)+{\alpha }_{4}(CA)+{\alpha }_{5}(WC)+{\alpha }_{6}(t)$$

(1)

where α₀ to α₆ are model parameters.

Full quadratic (FQ) model

The full quadratic regression model represents the association between the independent and the dependent parameters as a polynomial of degree two in the independent parameters. Polynomial regression is a kind of linear regression (it remains linear in its coefficients) that uses a polynomial equation to represent data showing a nonlinear relationship between the dependent and independent parameters. Equation (2) gives the full quadratic formula, which relates CS to the linear, squared, and pairwise interaction terms of the independent parameters [27].

$$CS = {\beta }_{0}+{\beta }_{1}(CC)+{\beta }_{2}(GSA)+{\beta }_{3}(FA)+{\beta }_{4}(CA)+{\beta }_{5}(WC)+{\beta }_{6}(t)+{\beta }_{7}{(CC)}^{2}+{\beta }_{8}{(GSA)}^{2}+{\beta }_{9}{(FA)}^{2}+{\beta }_{10}{(CA)}^{2}+{\beta }_{11}{(WC)}^{2}+{\beta }_{12}{(t)}^{2}+{\beta }_{13}(CC)(GSA)+{\beta }_{14}(CC)(FA)+{\beta }_{15}(CC)(CA)+{\beta }_{16}(CC)(WC)+{\beta }_{17}(CC)(t)+{\beta }_{18}(GSA)(FA)+{\beta }_{19}(GSA)(CA)+{\beta }_{20}(GSA)(WC)+{\beta }_{21}(GSA)(t)+{\beta }_{22}(FA)(CA)+{\beta }_{23}(FA)(WC)+{\beta }_{24}(FA)(t)+{\beta }_{25}(CA)(WC)+{\beta }_{26}(CA)(t)+{\beta }_{27}(WC)(t)$$

(2)

where β₀ to β₂₇ are model parameters.
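To make the structure of Eq. (2) concrete, the sketch below expands one six-variable mix into the 28 regressors that multiply β₀ to β₂₇: one constant, six linear terms, six squared terms, and fifteen pairwise products. The function name `full_quadratic_terms` and the sample inputs are illustrative assumptions; fitting the β coefficients themselves would still require a least-squares solver.

```python
from itertools import combinations

# Input order assumed to follow Eq. (2): CC, GSA, FA, CA, WC, t.
NAMES = ["CC", "GSA", "FA", "CA", "WC", "t"]

def full_quadratic_terms(x):
    """Return the regressors of Eq. (2) for one mix, in the order of beta_0..beta_27."""
    if len(x) != len(NAMES):
        raise ValueError("expected six inputs: CC, GSA, FA, CA, WC, t")
    linear = list(x)
    squares = [v * v for v in x]
    # combinations() yields pairs in the same order as the interaction
    # terms of Eq. (2): CC*GSA, CC*FA, ..., CA*t, WC*t.
    interactions = [a * b for a, b in combinations(x, 2)]
    return [1.0] + linear + squares + interactions

terms = full_quadratic_terms([1, 2, 3, 4, 5, 6])   # arbitrary demo values
print(len(terms))
```

The count of 1 + 6 + 6 + 15 = 28 terms agrees with the 28 parameters β₀ to β₂₇ in Eq. (2).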

Artificial neural network (ANN) model

ANNs are computer models that draw inspiration from the functioning of biological neural networks. ANNs consist of linked processing nodes, often called neurons, which can acquire knowledge and identify patterns within incoming data. ANNs are used for pattern recognition, data classification, and making predictions. ANNs have self-learning capabilities and can provide better results as more data become available [24, 25]. The output of each neuron is calculated by taking the weighted sum of its inputs and applying a nonlinear function to it. Given the absence of a standardized approach for constructing the network architecture, the number of hidden layers and neurons was determined by a parameter optimization technique [23]. To mitigate overfitting, the model architecture was kept simple by using a single hidden layer with three neurons. These values were chosen after several tests and cross-validation.
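The forward pass of a network with a single hidden layer of three neurons can be sketched as below. The tanh activation, the linear output neuron, the pre-scaled inputs, and all weight values are assumptions chosen for illustration; the source does not report the activation function or the trained weights.

```python
import math

def ann_predict(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer of tanh neurons followed by a linear output neuron."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical, untrained weights: 3 hidden neurons x 6 inputs.
W_HIDDEN = [
    [0.2, -0.1, 0.0, 0.1, -0.3, 0.4],
    [-0.5, 0.3, 0.1, 0.0, 0.2, 0.1],
    [0.1, 0.1, -0.2, 0.3, 0.0, -0.1],
]
B_HIDDEN = [0.0, 0.1, -0.1]
W_OUT = [1.5, -0.7, 0.9]
B_OUT = 0.25

# Inputs (CC, GSA, FA, CA, WC, t) assumed already scaled to roughly [0, 1].
x_scaled = [0.5, 0.1, 0.4, 0.6, 0.3, 0.2]
y = ann_predict(x_scaled, W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
print(y)
```

Such a network has only 3 × 6 + 3 + 3 + 1 = 25 trainable parameters, which is one way to see why a small architecture limits overfitting on a database of a few hundred samples.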

Random forest regression

RFR is an ensemble ML technique that creates several decision trees during the training phase. In regression problems, the resulting prediction is the mean of the predictions of the individual trees [6]. The method is widely used because it is easy to apply and versatile, addressing both classification and regression tasks effectively.

Boosted decision tree

The BDT is an ML methodology that integrates numerous decision trees to enhance the precision of predictive outcomes [47]. It works by training each new tree to emphasize the training instances that were previously mis-modeled. This is done by fitting the residual of the trees that preceded it. The main difference between random forest regression and a boosted decision tree is that in boosting, each tree depends on the prior trees, while in random forests, each tree is independent of the others [41].
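The contrast between the two tree ensembles can be sketched on a toy one-feature problem. This is a simplified illustration under stated assumptions, not the configuration used in the study: each "tree" is a one-split stump, the forest uses bootstrap resampling only (a full random forest also subsamples features at each split), the booster uses squared-error residual fitting with an assumed learning rate, and the curing-age data are made up.

```python
import random

def fit_stump(xs, ys):
    """Least-squares regression stump (a single split) on one feature."""
    mean = sum(ys) / len(ys)
    best = (sum((y - mean) ** 2 for y in ys), None, mean, mean)
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, threshold, lm, rm)
    _, threshold, lm, rm = best
    if threshold is None:              # no split improved on the mean
        return lambda x: mean
    return lambda x: lm if x <= threshold else rm

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Independent stumps on bootstrap samples; the prediction is their average."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(t(x) for t in trees) / len(trees)

def fit_boosted(xs, ys, n_trees=20, learning_rate=0.5):
    """Sequential stumps; each one is fitted to the residuals left by its predecessors."""
    base = sum(ys) / len(ys)
    trees = []
    preds = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(t(x) for t in trees)

# Made-up data: curing age (days) against strength (MPa).
xs = [7, 14, 28, 56, 90]
ys = [15.0, 20.0, 25.0, 28.0, 30.0]
forest = fit_forest(xs, ys)
boosted = fit_boosted(xs, ys)
print(forest(28), boosted(28))
```

In `fit_forest` the loop iterations do not depend on one another, whereas in `fit_boosted` each iteration needs the predictions accumulated so far, which is the dependence the paragraph above describes.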

K-nearest neighbors

KNN is a nonparametric approach used in supervised learning to address classification and regression challenges [30]. For a given query point, the algorithm finds the k nearest training instances in the data set; in regression, the prediction is the average of their target values. The main difference between KNN and ANN is that KNN is a simple algorithm that relies on the proximity of data points to make predictions, while ANNs are more complex models that learn to recognize patterns in data through training.
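A minimal sketch of KNN regression is given below. The Euclidean distance, the unweighted average, the value k = 3, and the toy data are assumptions for illustration; the source does not state the distance metric or k used. In practice the six mix variables would need to be scaled first, since they have very different magnitudes.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to the query."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    return sum(y for _, y in by_distance[:k]) / k

# Toy two-feature data, made up for the example.
train_x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
train_y = [10.0, 20.0, 30.0, 100.0]
prediction = knn_predict(train_x, train_y, [0.0, 0.0], k=3)
print(prediction)
```

The distant fourth point does not influence the prediction, which shows how KNN relies only on local proximity rather than on a fitted global function.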

Support vector regression

SVR is a supervised ML algorithm specifically designed to address regression problems. It aims to identify a mathematical function that effectively models the association between input and output variables while minimizing the overall error [30]. SVR tolerates deviations within a predefined margin around the fitted function, so that errors inside this margin are not penalized. Using kernel functions, SVR maps the input data into a high-dimensional feature space in which a linear model can be fitted. SVR is robust to outliers and can handle nonlinear and high-dimensional data.
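The tolerance band that SVR allows is commonly expressed through the epsilon-insensitive loss, sketched below. The margin value epsilon = 1.0 and the sample strengths are assumptions for illustration; the source does not report the margin or kernel settings used.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=1.0):
    """Per-sample loss: zero inside the +/- epsilon band, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]

# Made-up strengths in MPa: errors of 0.5, 2.0, and 3.0 MPa.
losses = epsilon_insensitive_loss([30.0, 30.0, 30.0], [30.5, 32.0, 27.0])
print(losses)
```

Because the penalty grows only linearly outside the band, a single large error influences the fit less than it would under a squared-error loss, which is one reason SVR is described as robust to outliers.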

Performance indicators

The created models are evaluated with several metrics: R², RMSE, MAE, scatter index (SI), a20-index, and OBJ. The a20-index has the benefit of a tangible engineering interpretation: it is the fraction of samples whose predicted values lie within a 20% deviation from the experimental values, so it equals one for an ideal prediction model. Equations 3, 4, 5, 6, 7 and 8 are used to calculate each specified criterion.

$${R}^{2}={\left(\frac{{\sum }_{i}\left({P}_{i}-\overline{P }\right)\left({E}_{i}-\overline{E }\right)}{\sqrt{\sum_{i}{\left({P}_{i}-\overline{P }\right)}^{2}}\sqrt{\sum_{i}{\left({E}_{i}-\overline{E }\right)}^{2}}}\right)}^{2}$$

(3)

$$RMSE= \sqrt{\frac{\sum_{i=1}^{N}{\left({E}_{i}-{P}_{i}\right)}^{2}}{N}}$$

(4)

$$MAE= \frac{\sum_{i=1}^{N}\left|{E}_{i}-{P}_{i}\right|}{N}$$

(5)

$$SI= \frac{RMSE}{\overline{E} }$$

(6)

$$a{20}_{index}=\frac{N20}{N}$$

(7)

$$OBJ= \left(\frac{{n}_{tr}}{N}\times \frac{{RMSE}_{tr}+{MAE}_{tr}}{{R}_{tr}^{2}+1}\right)+\left(\frac{{n}_{te}}{N}\times \frac{{RMSE}_{te}+{MAE}_{te}}{{R}_{te}^{2}+1}\right)$$

(8)

where

$P_i$: Predicted CS

$E_i$: Experimental CS

$\overline{P}$: Mean of predicted CS

$\overline{E}$: Mean of experimental CS

N: Total number of data sets

N20: Number of data sets whose predicted-to-measured CS ratio lies between 0.8 and 1.2

n_tr: Number of training data sets

n_te: Number of test data sets

The R² value and the a20-index typically range from zero to one, with a value of one being the optimal outcome. The RMSE, MAE, and OBJ values range from zero to infinity, and lower values are better, with zero being the optimal outcome. For the SI metric, values below 0.1, from 0.1 to 0.2, from 0.2 to 0.3, and above 0.3 denote excellent, good, fair, and poor model performance, respectively [1, 3, 27].
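The six indicators of Eqs. (3) to (8) can be computed with a few lines of code. The sketch below is an illustration under stated assumptions: the function names are hypothetical, N in Eq. (8) is taken as n_tr + n_te, and the four-sample data are made up. The OBJ call reuses the RFR training and test values reported in this paper (R², RMSE, MAE, and the 198/99 split); its result has not been checked against Table 3.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def r_squared(e, p):
    """Eq. (3): squared Pearson correlation between experimental and predicted CS."""
    me, mp = _mean(e), _mean(p)
    cov = sum((pi - mp) * (ei - me) for ei, pi in zip(e, p))
    ss_p = sum((pi - mp) ** 2 for pi in p)
    ss_e = sum((ei - me) ** 2 for ei in e)
    return cov ** 2 / (ss_p * ss_e)

def rmse(e, p):
    """Eq. (4): root-mean-squared error."""
    return math.sqrt(sum((ei - pi) ** 2 for ei, pi in zip(e, p)) / len(e))

def mae(e, p):
    """Eq. (5): mean absolute error."""
    return sum(abs(ei - pi) for ei, pi in zip(e, p)) / len(e)

def scatter_index(e, p):
    """Eq. (6): RMSE normalized by the mean experimental value."""
    return rmse(e, p) / _mean(e)

def a20_index(e, p):
    """Eq. (7): share of samples with predicted/experimental ratio in [0.8, 1.2]."""
    n20 = sum(1 for ei, pi in zip(e, p) if 0.8 <= pi / ei <= 1.2)
    return n20 / len(e)

def obj(n_tr, rmse_tr, mae_tr, r2_tr, n_te, rmse_te, mae_te, r2_te):
    """Eq. (8), with N assumed to be n_tr + n_te."""
    n = n_tr + n_te
    train_part = (n_tr / n) * (rmse_tr + mae_tr) / (r2_tr + 1)
    test_part = (n_te / n) * (rmse_te + mae_te) / (r2_te + 1)
    return train_part + test_part

# Made-up values in MPa, for illustration only.
experimental = [20.0, 25.0, 30.0, 40.0]
predicted = [22.0, 24.0, 33.0, 30.0]
print(r_squared(experimental, predicted), rmse(experimental, predicted))
print(mae(experimental, predicted), scatter_index(experimental, predicted))
print(a20_index(experimental, predicted))

# OBJ from the RFR values reported in this paper.
rfr_obj = obj(198, 2.48, 1.80, 0.91, 99, 2.42, 1.83, 0.89)
print(rfr_obj)
```

Note that Eq. (3) is a squared correlation, so it stays between zero and one even for a biased model; RMSE, MAE, and the a20-index are the indicators that reveal such bias.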

Results and discussion

Statistical analysis

Statistical analyses were conducted to evaluate the connection among the factors listed. The relationship between CS and the independent variables is shown in Fig. 2. The findings suggest that a reasonable association exists between CS and both the cement content and the GSA (groundnut shell ash) content, as seen in Fig. 3. However, the relationship of fine aggregate, coarse aggregate, water, and curing time with CS was weak. The statistical analysis findings are summarized in Table 2.

Table 2 The statistical analysis of the dataset

Machine learning model results

Figure 4 illustrates GSA blended concrete’s predicted vs. measured CS values for all seven machine-learning models discussed. Table 3 summarizes the performance indicators for each model.

Table 3 Performance indicators for various ML models

The LR model is a fundamental mathematical model used to predict the CS of concrete. The fitted LR model is given in Eq. (9). Figure 4a illustrates the correlation between the predicted and observed CS. The training dataset has an R² value of 0.608 and an RMSE of 5.11 MPa, while the testing dataset has an R² value of 0.643 and an RMSE of 4.29 MPa. Based on these R² and RMSE values, the performance of the LR model is unsatisfactory. Its results are among the least effective, mainly owing to its simplistic mathematical formulation. In the training dataset, 55% of the predicted-to-observed CS ratios fall between 0.8 and 1.2, that is, within the −20 to 20% error lines.

$$CS = -37.78 + 0.119(CC) + 0.024(GSA) + 0.016(FA) + 0.023(CA)-0.117(WC) + 0.135(t)$$

(9)
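Equation (9) can be evaluated directly for a given mix, as in the sketch below. The coefficients are those of Eq. (9); the function name `cs_linear` and the sample mix proportions are hypothetical and are not taken from the paper's database. Given the reported R² of roughly 0.6, the result should be read as a rough estimate only.

```python
def cs_linear(cc, gsa, fa, ca, wc, t):
    """Eq. (9): linear regression estimate of CS in MPa.

    cc, gsa, fa, ca, wc are contents in kg/m^3; t is the curing period in days.
    """
    return (-37.78 + 0.119 * cc + 0.024 * gsa + 0.016 * fa
            + 0.023 * ca - 0.117 * wc + 0.135 * t)

# Hypothetical mix, chosen only to illustrate the calculation.
cs = cs_linear(cc=350, gsa=35, fa=700, ca=1100, wc=180, t=28)
print(round(cs, 2))
```

Because the model is linear, each additional day of curing adds the same 0.135 MPa regardless of age, which illustrates the simplistic formulation mentioned above.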

Due to its more flexible mathematical formulation, the FQ model is one of the most successful mathematical models. It combines a constant, linear terms, pairwise interaction (product) terms, and quadratic terms. The formula for the FQ model predicting the CS of GSA mixed concrete is shown in Eq. (10). Figure 4b shows the relationship between the predicted and measured CS of the FQ model. The R² and RMSE for the training data were 0.865 and 3.00 MPa, respectively, while for the test data, they were 0.766 and 3.48 MPa, respectively. In the training data set, 69% of the predicted-to-observed CS ratios fall between 0.8 and 1.2, that is, within the −20 to 20% error lines. Although the FQ model performs better than the LR model, its prediction accuracy is still lower than that of the RFR and BDT models.

$$CS = 154{,}115.4-176.7(CC)-175.8(GSA)+47.4(FA)-342.8(CA)+903.7(WC)-1.5(t)+0.3{(CC)}^{2}+0.3{(GSA)}^{2}-0.006{(FA)}^{2}+0.1{(CA)}^{2}+0.02{(WC)}^{2}-0.005{(t)}^{2}+0.6(CC)(GSA)-0.2(CC)(FA)+0.2(CC)(CA)-0.6(CC)(WC)+0.007(CC)(t)-0.2(GSA)(FA)+0.2(GSA)(CA)-0.6(GSA)(WC)+0.006(GSA)(t)+0.1(FA)(CA)-0.3(FA)(WC)+0.001(FA)(t)-0.4(CA)(WC)+0.0008(CA)(t)-0.01(WC)(t)$$

(10)

Other ML models except KNN show better performance indicator values than the LR and FQ models. The RFR model shows an R² closer to unity and lower RMSE, MAE, and SI values than the other ML models. For the predicted-to-observed CS ratio, 81% of the data falls between 0.8 and 1.2 in the training data set, which is 8 percentage points higher than the next-best model (SVR). The RFR and BDT models predict the CS of concrete comparatively well and are ranked first and second, respectively. The ANN and FQ models perform similarly and are ranked third and fourth, respectively, followed by the SVR, KNN, and LR models. The RFR model has the most points inside the 20% error envelope, with 81% of the data, followed by SVR and BDT at 73% and 72%, respectively. Overall, the RFR model is the best option for predicting the CS of GSA blended concrete.

Performance of machine learning models

Figure 5 displays the prediction error for all the examined machine-learning models, calculated as the difference between the predicted and observed CS. The chart shows that many data points are outliers, suggesting a higher level of inaccuracy for those points. This may arise from inaccuracies in the experimental measurement of CS or from discrepancies among the laboratory tests reported in the literature. The BDT, FQ, and RFR models show the narrowest error ranges (highest minus lowest error) of 17.56, 18.03, and 18.93 MPa, respectively. The LR and KNN models show the widest error ranges of 35.95 and 31.07 MPa, respectively.

The BDT and KNN models have errors evenly distributed on both sides, meaning that the predicted values are underestimated in some instances and overestimated in others. For the other models, however, the majority of predictions are underestimates. Furthermore, except for the BDT model, all models exhibit negative skewness. The SVR model has the largest skewness magnitude (−1.54), followed by the KNN model (−0.86). Using a variety of statistical indicators and graphical illustrations to assess prediction models provides a more thorough analysis than any single measure.

Figure 6 depicts the Taylor diagram, a graphical tool for evaluating and comparing the predictive performance of several ML models. The diagram illustrates the degree of alignment between each model and the reference data, as measured by correlation, standard deviation, and RMSE, and so shows the comparative proficiency of each model with respect to the reference [58]. The closer a model's pentagram marker lies to the reference point, the more accurate the model is in forecasting CS. Among all ML models, the RFR model exhibits the highest level of accuracy, while the LR model provides the lowest. According to these criteria, the ML models may be ranked in the following order from highest to lowest performance: RFR > BDT > ANN > FQ > SVR > KNN > LR. This ranking agrees with the performance indicator values reported above.
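For context, the geometry of a Taylor diagram rests on a standard identity that links the three plotted statistics. The symbols below are introduced here for illustration and do not appear in the original: $\sigma_P$ and $\sigma_E$ are the standard deviations of the predicted and experimental CS, $R$ is their correlation coefficient, and $E'$ is the centered root-mean-square difference.

$$E'^{2} = \sigma_{P}^{2} + \sigma_{E}^{2} - 2\,\sigma_{P}\,\sigma_{E}\,R$$

This has the form of the law of cosines, which is why a single point can encode all three quantities. The centered difference $E'$ coincides with the RMSE of Eq. (4) only when the mean bias is zero, since $RMSE^{2} = E'^{2} + {(\overline{P} - \overline{E})}^{2}$.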

Sensitivity analysis

Nonlinear and complicated models such as ANN or RFR often exhibit black box behavior due to their intricate nature [54]. SHAP is advantageous for examining intricate machine-learning models encompassing a diverse range of parameters [43, 63]. The random forest regression (RFR) model, which demonstrated superior performance in predicting CS, was therefore analyzed with the SHAP (SHapley Additive exPlanations) method to gain insight into its predictions.

Figure 7 depicts the mean SHAP values of the input variables (features) for the CS predictions of the random forest regression (RFR) model. The cement content exhibits the highest SHAP value, indicating its significant effect on the prediction of CS. The fine aggregate content exhibits the lowest SHAP value, suggesting a relatively small impact on the prediction of CS.

Figure 8 displays the SHAP summary plot for the CS predictions of the RFR model. The color gradient represents the spectrum of feature values, while the x-axis denotes the SHAP value, that is, the feature's contribution to the predicted CS. Red dots represent high feature values, which for cement content correspond to high SHAP values. A notable finding is a maximum positive SHAP value of 16 for cement content, indicating that, within the range examined, cement content can raise the predicted CS by 16 MPa above the average value. Conversely, a SHAP value of −16 at the far-left (negative) end indicates that a low cement content can lower the predicted CS by 16 MPa below the mean value. The results indicate that the game-theoretic SHAP approach enhances the understanding of the proposed ML models. Additionally, these findings demonstrate that the predictive accuracies of the models are both reasonable and satisfactory.
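SHAP values are based on Shapley values from cooperative game theory. The sketch below computes exact Shapley values for a small toy model by averaging each feature's marginal contribution over all feature orderings. It is a simplified illustration, not the procedure used in the study: absent features are replaced by a single baseline vector (SHAP implementations typically average over a background data set and use faster tree-specific algorithms), and the toy model, feature choice, and numbers are hypothetical.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)       # start with every feature "absent"
        previous = model(current)
        for i in order:
            current[i] = x[i]          # switch feature i to its actual value
            value = model(current)
            phi[i] += value - previous # marginal contribution in this ordering
            previous = value
    return [total / len(orders) for total in phi]

# Hypothetical toy model of CS with inputs (CC, t, GSA).
def toy_model(z):
    return 0.1 * z[0] + 0.01 * z[0] * z[1] + 0.2 * z[2]

x = [350.0, 28.0, 35.0]
baseline = [300.0, 7.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that lets a SHAP value be read in MPa above or below the average prediction.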

Conclusions

This paper presents a way to use ML algorithms to assess the CS of GSA mixed concrete. To train and assess the models, a total of 297 experimental data sets were collected from published literature. Several predictors were constructed and trained, including linear regression, full quadratic model, artificial neural network, random forest regression, boosted decision tree regression, k-nearest neighbors, and support vector regression. The following conclusions can be drawn from the study:

Statistical analysis shows that CS is moderately correlated with cement content, GSA content, and curing time. However, it correlates poorly with fine aggregate content, coarse aggregate content, and water content.
Among the several ML models evaluated in this study, random forest regression (RFR) demonstrated superior performance in predicting CS, achieving an R² value of 0.91 and RMSE of 2.48 MPa for the training dataset and an R² value of 0.89 and an RMSE of 2.42 MPa for the testing dataset.
The RFR model also ranked first on the other statistical indicators, such as MAE, SI, and OBJ. For the training dataset, the mean absolute error (MAE) and scatter index (SI) were 1.80 MPa and 0.15, respectively. Similarly, for the testing dataset, the MAE and SI were 1.83 MPa and 0.15, respectively. The RFR model exhibited the greatest a20-index, with values of 81% and 80% for the training and test datasets, respectively.
The feature significance analysis using SHAP demonstrates that the parameter with the greatest influence on the prediction of CS is the cement content. This is followed by the curing duration and GSA content, which exhibit significant relevance in the prediction model.

This study systematically evaluates the prediction of the CS of GSA mixed concrete, contributing to the existing body of knowledge and practical implementation in this domain. Augmenting the ML models with more data has the potential to enhance their performance, so it is vital to maintain a comprehensive data collection. Accurate predictive models may assist researchers and designers in selecting optimal input variables and making informed decisions about the mix parameters to employ in developing sustainable concrete with desired attributes.

Availability of data and materials

Data can be made available on request by interested parties.

Abbreviations

ANN: Artificial neural network
BDT: Boosted decision tree
CC: Cement content
CA: Coarse aggregate
CS: Compressive strength
FA: Fine aggregate
FQ: Full quadratic
GSA: Groundnut shell ash
LR: Linear regression
KNN: K-nearest neighbors
MAE: Mean absolute error
ML: Machine learning
OBJ: Objective function value
RFR: Random forest regression
RMSE: Root-mean-squared error
R²: Coefficient of determination
SHAP: SHapley Additive exPlanations
SI: Scatter index
SVR: Support vector regression
t: Curing period
WC: Water content
W/C: Water to cement ratio

References

Abdalla AA, Salih Mohammed A (2022) Theoretical models to evaluate the effect of SiO2 and CaO contents on the long-term compressive strength of cement mortar modified with cement kiln dust (CKD). Arch Civil Mech Eng 22(3):105
Abro A, Kumar A, Keerio M, Bheel N (2021) An investigation on compressive strength of concrete blended with groundnut shell ash. Neutron 20(2):123–127
Ahmed HU, Abdalla AA, Mohammed AS, Mohammed AA (2022) Mathematical modeling techniques to predict the compressive strength of high-strength concrete incorporated metakaolin with multiple mix proportions. Clean Mater 5:100132
Alabadan B, Olutoye M, Zakariya M (2005) Partial replacement of ordinary portland cement (OPC) with bambara groundnut shell ash (BGSA) in concrete. Leonardo Electron J Pract Technol 6:43–48
Assiamah S, Agyeman S, Adinkrah-Appiah K, Danso H (2022) Utilization of sawdust ash as cement replacement for landcrete interlocking blocks production and mortarless construction. Case Stud Construct Mater 16:e00945
Belgiu M, Drăguţ L (2016) Random forest in remote sensing: a review of applications and future directions. ISPRS J Photogramm Remote Sens 114:24–31
Benhelal E, Zahedi G, Shamsaei E, Bahadori A (2013) Global strategies and potentials to curb CO2 emissions in cement industry. J Clean Product 51:142–161
Buari TA, Olutoge FA, Ayinnuola GM, Okeyinka OM, Adeleke JS (2019) Short term durability study of groundnut shell ash blended self consolidating high performance concrete in sulphate and acid environments. Asian J Civil Eng 20(5):649–658
Chandra Paul S, Mbewe PBK, Kong SY, Šavija B (2019) Agricultural solid waste as source of supplementary cementitious materials in developing countries. Materials (Basel, Switzerland) 12(7):1112
Claisse PA (2016) Chapter 17 - Introduction to cement and concrete. In: Claisse PA (ed) Civil Engineering Materials. Butterworth-Heinemann, Boston, pp 155–162
Davis JP, Dean LL (2016) Chapter 11 - Peanut composition, flavor and nutrition. In: Stalker HT, Wilson RF (eds) Peanuts. AOCS Press, Urbana, pp 289–345
Dharani D, Selvan A (2017) Durability studies on concrete by using groundnut shell ash as mineral admixture. Int J Innov Res SciTechnol 3(10):168–172
Duc PA, Dharanipriya P, Velmurugan BK, Shanmugavadivu M (2019) Groundnut shell - a beneficial bio-waste. Biocatal Agric Biotechnol 20:101206
Feng D-C, Liu Z-T, Wang X-D, Chen Y, Chang J-Q, Wei D-F, Jiang Z-M (2020) Machine learning-based compressive strength prediction for concrete: an adaptive boosting approach. Construct Build Mater 230:117000
Gao W, Karbasi M, Derakhsh AM, Jalili A (2019) Development of a novel soft-computing framework for the simulation aims: a case study. Eng Comput 35(1):315–322
Greenspec (2022) Environmental impacts of concrete. Green building design. Retrieved April 21, 2022, from https://www.greenspec.co.uk/building-design/environmental-impacts-of-concrete/
Habert G (2014) Chapter 10 - Assessing the environmental impact of conventional and ‘green’ cement production. In: Pacheco-Torgal F, Cabeza LF, Labrincha J, de Magalhães A (eds) Eco-efficient Construction and Building Materials. Woodhead Publishing, pp 199–238
Ige J, Anifowose M, Amototo I, Adeyemi A, Olawuyi M (2017) Influence of groundnut shell ash (GSA) and calcium chloride (CaCl2) on strength of concrete. Int J Eng Tome 15(4):209–214
Ikumapayi CM, Arum C, Alaneme KK (2021) Reactivity and hydration behavior in groundnut shell ash based pozzolanic concrete. Mater Today 38:508–513
Imam A, Kumar V, Srivastava V (2018) Review study towards effect of silica fume on the fresh and hardened properties of concrete. Adv Concrete Construct 6(2):145–157
Jahanzaib Khalil M, Aslam M, Ahmad S (2021) Utilization of sugarcane bagasse ash as cement replacement for the production of sustainable concrete – a review. Construct Build Mater 270:121371
Játiva A, Ruales E, Etxeberria M (2021) Volcanic ash as a sustainable binder material: an extensive review. Materials 14(5):1302
Jeyananthan P (2022) Prolonged viral shedding prediction on non-hospitalized, uncomplicated SARS-CoV-2 patients using their transcriptome data. Comput Methods Programs Biomed Update 2:100070
Jeyananthan P (2023) Role of different types of RNA molecules in the severity prediction of SARS-CoV-2 patients. Pathol Res Pract 242:154311
Jeyananthan P (2023) SARS-CoV-2 diagnosis using transcriptome data: a machine learning approach. SN Comput Sci 4(3):218
Jittin V, Bahurudeen A, Ajinkya SD (2020) Utilisation of rice husk ash for cleaner production of different construction products. J Clean Product 263:121578
Kakasor Ismael Jaf D, Ismael Abdulrahman P, Salih Mohammed A, Kurda R, Qaidi SMA, Asteris PG (2023) Machine learning techniques and multi-scale models to evaluate the impact of silicon dioxide (SiO2) and calcium oxide (CaO) in fly ash on the compressive strength of green concrete. Construct Build Mater 400:132604
Kanchidurai S, Nanthini T, Jai Shankar P (2017) Experimental studies on sisal fibre reinforced concrete with groundnut shell ash. ARPN J Eng Appl Sci 12(21):5914–5920
Karthikeyan N, Saravanan M, Deepika M (2018) Performance of groundnut shell ash as partial replacement of cement in concrete. Int J Sci Res Dev 6(4):525–528
Kenyhercz MW, Passalacqua NV (2016) Chapter 9 - Missing data imputation methods and their performance with biodistance analyses. In: Pilloud MA, Hefner MA (eds) Biological Distance Analysis. Academic Press, San Diego, pp 181–194
Krishnan C, Nizar N (2016) Groundnut shell ash as partial replacement of cement in concrete. IJRDO-J Mech Civil Eng 2(2):39–48
Lakshmi N, Sagar P (2017) Study on partial replacement of groundnut shell ash with cement. Chall J Concrete Res Lett 8(3):84–90
Marani A, Nehdi ML (2020) Machine learning prediction of compressive strength for phase change materials integrated cementitious composites. Construct Build Mater 265:120286
Mayooran S, Ragavan S, Sathiparan N (2017) Comparative study on open air burnt low- and high-carbon rice husk ash as partial cement replacement in cement block production. J Build Eng 13:137–145
Mujedu K, Adebara S (2016) The use of groundnut shell ash as a partial replacement for cement in concrete production. Int J Sci Eng Environ Technol 1(3):32–39
Nwofor TC, Sule S (2012) Stability of groundnut shell ash (GSA)/ordinary portland cement (OPC) concrete in Nigeria. Adv Appl Sci Res 3:2283–2287
Ogork E, Uche O, Elinwa A (2014) A study on groundnut husk ash (GHA) - concrete under acid attack. Int J Modern Eng Res 4(7):30–35
Özbay E, Erdemir M, Durmuş Hİ (2016) Utilization and efficiency of ground granulated blast furnace slag on concrete properties – a review. Construct Build Mater 105:423–434
Pandi K, Ganesan K, Manickavalli M (2018) Studies on the partial replacement of fine aggregate with groundnut shell ash in concrete. Int J Curr Eng Sci Res 5(10):1–5
Perea-Moreno M-A, Manzano-Agugliaro F, Hernandez-Escobedo Q, Perea-Moreno A-J (2018) Peanut shell for energy: properties and its potential to respect the environment. Sustainability 10(9)
Plaia A, Buscemi S, Fürnkranz J, Mencía EL (2022) Comparing boosting and bagging for decision trees of rankings. J Classif 39(1):78–99
Poorveekan K, Ath KMS, Anburuvel A, Sathiparan N (2021) Investigation of the engineering properties of cementless stabilized earth blocks with alkali-activated eggshell and rice husk ash as a binder. Construct Build Mater 277:122371
Quan Tran V, Quoc Dang V, Si Ho L (2022) Evaluating compressive strength of concrete made with recycled concrete aggregates using machine learning approach. Construct Build Mater 323:126578
Raheem S, Oladiran G, Olutoge F, Odewumi T (2013) Strength properties of groundnut shell ash (GSA) blended concrete. J Civil Eng Construct Technol 4(9):275–284
Sadh PK, Duhan S, Duhan JS (2018) Agro-industrial wastes and their utilization using solid state fermentation: a review. Bioresour Bioprocess 5(1):1
Samuel V (2020) Groundnut shell ash: a local construction material in concrete production. Fane-Fane Int Multidiscip J 5(1):1–15
Sarailidis G, Wagener T, Pianosi F (2023) Integrating scientific knowledge into machine learning using interactive decision trees. Comput Geosci 170:105248
Sathiparan N (2021) Utilization prospects of eggshell powder in sustainable construction material – a review. Construct Build Mater 293:123465
Sathiparan N, Anburuvel A, Selvam VV (2023) Utilization of agro-waste groundnut shell and its derivatives in sustainable construction and building materials – a review. J Build Eng 66:105866
Sathiparan N, Jeyananthan P (2023) Predicting compressive strength of cement-stabilized earth blocks using machine learning models incorporating cement content, ultrasonic pulse velocity, and electrical resistivity. Nondestruct Testing Eval:1–25. https://doi.org/10.1080/10589759.2023.2240940
Sathiparan N, Jeyananthan P (2023) Prediction of masonry prism strength using machine learning technique: effect of dimension and strength parameters. Mater Today Commun 35:106282
Sathiparan N, Jeyananthan P, Subramaniam DN (2023) Effect of aggregate size, aggregate to cement ratio and compaction energy on ultrasonic pulse velocity of pervious concrete: prediction by an analytical model and machine learning techniques. Asian J Civil Eng
Seevaratnam V, Uthayakumar D, Sathiparan N (2020) Influence of rice husk ash on characteristics of earth cement blocks. MRS Adv 5(54):2793–2805
Shah SFA, Chen B, Zahid M, Ahmad MR (2022) Compressive strength prediction of one-part alkali activated material enabled by interpretable machine learning. Construct Build Mater 360:129534
Siddique R, Klaus J (2009) Influence of metakaolin on the properties of mortar and concrete: a review. Appl Clay Sci 43(3):392–400
Subramaniam DN, Jeyananthan P, Sathiparan N (2023) Soft computing techniques to predict the electrical resistivity of pervious concrete. Asian J Civil Eng. https://doi.org/10.1007/s42107-023-00806-y
Sundaralingam K, Peiris A, Anburuvel A, Sathiparan N (2022) Quarry dust as river sand replacement in cement masonry blocks: Effect on mechanical and durability characteristics. Materialia 21:101324
Taylor KE (2001) Summarizing multiple aspects of model performance in a single diagram. J Geophys Res 106(D7):7183–7192
Thanushan K, Sathiparan N (2022) Mechanical performance and durability of banana fibre and coconut coir reinforced cement stabilized soil blocks. Materialia 21:101309
Theconstructor (2022) Manufacture of cement- materials and manufacturing process of Portland cement. Building Technology Guide. Retrieved April 21, 2022, from https://theconstructor.org/building/manufacture-of-cement/13709/
Wang D, Shi C, Farzadnia N, Shi Z, Jia H (2018) A review on effects of limestone powder on the properties of concrete. Construct Build Mater 192:153–166
Wijekoon SH, Shajeefpiranath T, Subramaniam DN, Sathiparan N (2023) A mathematical model to predict the porosity and compressive strength of pervious concrete based on the aggregate size, aggregate-to-cement ratio and compaction effort. Asian J Civil Eng
Zhang J, Niu W, Yang Y, Hou D, Dong B (2022) Machine learning prediction models for compressive strength of calcined sludge-cement composites. Construct Build Mater 346:128442

Acknowledgements

None.

Funding

Not applicable.

Author information

Authors and Affiliations

Department of Civil Engineering, Faculty of Engineering, University of Jaffna, Ariviyal Nagar, Kilinochchi, Jaffna, Sri Lanka
Navaratnarajah Sathiparan
Department of Computer Engineering, Faculty of Engineering, University of Jaffna, Jaffna, Sri Lanka
Pratheeba Jeyananthan

Contributions

Study conception and design, NS and PJ; data acquisition, NS; analysis and interpretation of results, PJ; draft manuscript preparation, NS and PJ; and manuscript review and editing, NS.

Corresponding author

Correspondence to Navaratnarajah Sathiparan.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Sathiparan, N., Jeyananthan, P. Soft computing techniques to predict the compressive strength of groundnut shell ash-blended concrete. J. Eng. Appl. Sci. 70, 134 (2023). https://doi.org/10.1186/s44147-023-00302-9

Received: 10 September 2023
Accepted: 09 October 2023
Published: 06 November 2023
DOI: https://doi.org/10.1186/s44147-023-00302-9
