The unconfined compressive strength estimation of rocks using a novel hybridization technique based on the regulated Gaussian processor

The unconfined compressive strength (UCS) of rocks is a crucial factor in geotechnical engineering, playing a central role in civil engineering undertakings such as tunnel construction, mining operations, and foundation design. Accurate UCS prediction is therefore paramount for upholding the safety and stability of these projects. This article introduces a fresh methodology for UCS prediction by combining Gaussian process regression (GPR) with two recent optimization techniques: sand cat swarm optimization (SCSO) and the equilibrium slime mould algorithm (ESMA). Conventional techniques for UCS prediction frequently encounter obstacles such as slow convergence and the risk of becoming trapped in local minima. In this investigation, GPR serves as the foundational predictive model owing to its adeptness at managing nonlinear associations within the dataset, and its fusion with these optimizers is intended to raise both the precision and the speed of UCS prediction. An extensive collection of rock samples, each accompanied by UCS measurements, is used to assess the suggested methodology, and the efficacy of the resulting GPSC and GPES models is compared with the conventional GPR technique. The findings reveal that incorporating the SCSO and ESMA optimizers into GPR brings a noteworthy enhancement in UCS prediction accuracy and expedites convergence. Notably, the GPSC model exhibits exceptional performance, evidenced by an R² value of 0.995 and a minimal RMSE value of 1.913. These findings emphasize the GPSC model's potential as a highly promising tool for experts in engineering and geology, presenting a sturdy and dependable method for UCS prediction that is of immense value in augmenting the safety and efficiency of civil engineering endeavors.


Background
At the heart of engineering initiatives lies the densification of loose soils, an indispensable endeavor that amplifies the mass per unit area for constructions like earth dams and highway embankments. Compaction transcends mere strength augmentation; it fortifies the soil's resilience, elevates its load-bearing capability, and steadies embankment inclines to mitigate settlement issues [1]. In addition to bolstering strength, compaction offers a multitude of benefits, encompassing improvements in volume, porosity, density, permeability, and impermeability. These improvements collectively elevate soil quality, augmenting its ability to sustain structural loads. The UCS, a fundamental component of geomechanical models, is pivotal in mechanical rock behavior [2,3]. UCS signifies the highest compressive stress a rock can withstand under controlled, uniaxial loading prior to failure. The field of rock mechanics, amalgamating theoretical foundations with practical implementations, elucidates the response of rocks to diverse stress conditions [4,5]. The ramifications of rock failure bear significance in areas such as the production of solid materials and the stability of wellbores, especially within the context of petroleum operations. UCS data derived from subsurface formations holds utmost importance in drilling activities: this wealth of information informs the intricacies of bit hydraulics, determines the ideal mud weights for drilling, manages drilling costs, and elevates drilling efficacy [6].
The unconfined compression test (UCT) adheres to a standardized procedure endorsed by both the American Society for Testing and Materials (ASTM) and the International Society for Rock Mechanics (ISRM) [7,8,9]. Conducting direct UCS measurements in the laboratory consumes significant time and financial resources and requires the careful preparation of core samples. Meeting the latter requirement becomes arduous when working with frail, thinly layered, or heavily fractured rock formations. Numerous researchers advocate embracing indirect methodologies for UCS prediction to tackle the aforementioned difficulties linked to core sample preparation and testing. These examinations are rapid, easy to perform, portable, and cost-efficient. Indirect testing techniques, such as the point load index test (Is(50)), the Brazilian tensile strength test (BTS), and the ultrasonic test (Vp), are commonly utilized for forecasting UCS. These evaluations place less rigorous requirements on sample preparation in contrast to the UCS test, and the associated index tests, especially when coupled with engineering knowledge, can provide a valuable initial assessment of UCS [3,4,10,11].
Laboratory trials on extracted core samples, offering insights into genuine stress conditions and mechanical traits, establish the groundwork for directly appraising mechanical rock properties [8]. These assessments encompass a spectrum of tests, including uniaxial and triaxial compressive strength assessments, scratch trials, Schmidt hammer examinations, and point load tests. Collectively, these methodologies set the standard for property assessment [12]. Nevertheless, acquiring a continuous UCS profile along wellbores encounters challenges in procuring representative core samples, including substantial costs and time-intensive procedures. To overcome this limitation, indirect methodologies have been formulated to bridge gaps by establishing connections between rock properties and petrophysical well-log data [13]. The significance of UCS transcends rocks and encompasses various materials, including soils and industrial byproducts, exerting a substantial impact on foundation design, slope stability analysis, and structural resilience. In pavement materials, UCS assumes a pivotal role in influencing both the structural integrity and operational performance of pavements [7].
Nonetheless, the determination of a material's UCS involves addressing a plethora of variables, encompassing physicochemical characteristics, varieties of cementitious additives, and the duration of curing. These factors mandate carefully designed laboratory investigations and specialized equipment [14,15]. The validity of these evaluations pivots on the pursuit of precision, as evidenced by the specifications of the employed specimens [16,17]. The exploration of alternative methodologies to determine the UCS of stabilized materials, such as pond ashes, arises from the demanding nature of these assessments, the resource-intensive requirements, and the complexity of acquiring representative samples [18,19].

Literature review
Momeni's research [20] presents a PSO-based model for predicting the UCS of granite and limestone. This model outperforms traditional methods in accuracy, validated through experimentation with 66 sample sets. Inputs such as point load index, rebound number, P-wave velocity, and dry density contribute to its high predictive performance, which is particularly sensitive to dry density and rebound number. However, its universal applicability beyond granite and limestone is cautioned, indicating a need for further refinement. Jahed Armaghani's study [21] introduces an adaptive neuro-fuzzy inference system (ANFIS) for forecasting the UCS and Young's modulus (E) of granite, surpassing conventional methods like multiple regression analysis (MRA) and artificial neural networks (ANN) in accuracy. ANFIS achieves exceptional R² values of 0.985 for UCS and 0.990 for E, with low root-mean-square error (RMSE) and high variance accounted for (VAF) percentages, minimizing uncertainties in rock engineering projects. Armaghani's subsequent study [22] focuses on sandstone samples from Malaysia, employing a hybrid ICA-ANN model to predict UCS. This model demonstrates high accuracy, indicating significant progress in geotechnical research and offering practical applicability in estimating UCS for sandstone.
Asteris's research [23] enhances Schmidt hammer rebound number analysis by integrating N- and L-type measurements into a comprehensive database. Models utilizing backpropagating neural networks (BPNN), genetic programming (GP), and third-order linear equations achieve superior prediction accuracy, with the BPNN 1-7-1 model notably precise. The study proposes the a-20 index as a preferable performance indicator over the Pearson correlation factor (R), advancing predictive modeling for rock characterization. Armaghani's study [24] compares nonlinear prediction models for estimating the UCS of granitic rocks, highlighting ANFIS as the superior predictor. However, caution is advised regarding universal application, with suitability emphasized primarily for similar rock types. Soft computing methods like ANFIS exhibit consistent superiority over ANN and NLMR, showcasing their potency in UCS estimation based on rock index properties. Yagiz's research [25] explores the correlation between Schmidt hardness rebound values and the UCS of rock, offering empirical equations for preliminary design assessments. Context-specific equations are recommended due to variations across geological formations, ensuring accurate estimations tailored to specific conditions. Yagiz's subsequent research [26] investigates the impact of cycling integer variations in the slake durability index test on intact rock behavior. ANN proves effective in estimating crucial rock properties like UCS and modulus of elasticity (E), highlighting its utility in material characterization and geotechnical engineering research.

Objective
This study introduces an innovative machine-learning methodology aimed at achieving precise and optimal predictive results in geotechnical engineering applications. The hybridization approach implemented here focuses on enhancing the performance of Gaussian process regression (GPR) models to generate dependable outcomes. By incorporating two advanced and efficient optimizers, namely sand cat swarm optimization (SCSO) and the equilibrium slime mould algorithm (ESMA), the resulting hybrid models surpass the performance of conventional methods, marking a significant advancement. The evaluation of model results encompassed established performance metrics such as R² and RMSE, which play a crucial role in mitigating potential biases and providing a more precise understanding of the models' effectiveness. The increased precision achieved by the hybrid models can support well-informed decision-making in geotechnical engineering projects, thus reducing the risks associated with inaccurate estimates of unconfined compressive strength (UCS). The utilization of GPR as a foundational predictive model for handling nonlinear associations within rock datasets offers several key advantages. GPR's inherent flexibility enables it to capture intricate nonlinear relationships, providing more accurate predictions compared to linear regression techniques.
Additionally, GPR offers uncertainty quantification, crucial in geological studies where data may be uncertain or noisy, facilitating better decision-making through probabilistic predictions. Its robustness to small datasets mitigates overfitting issues commonly encountered with traditional regression methods, while its interpretability offers insights into spatial correlations, which is vital for geological exploration. Moreover, GPR's adaptability to varying scales ensures accurate predictions across datasets with different units, making it a potent tool for predictive modeling in geotechnical engineering and related fields. The selection of SCSO and ESMA for hybridization with GPR in UCS prediction is motivated by their effectiveness in addressing challenges such as gradual convergence and avoiding local minima. SCSO utilizes swarm-based search inspired by the hunting behavior of sand cats for efficient exploration of solution spaces, while ESMA, inspired by slime mould behavior, dynamically adjusts search parameters to navigate complex landscapes effectively. Combining the strengths of SCSO and ESMA with GPR equips the hybrid models to tackle such challenges, ensuring more robust and reliable UCS predictions in geotechnical engineering applications.

Gaussian process regression (GPR)
The probabilistic regression approach of GPR begins with a training dataset, denoted as D = {(y_w, x_w), w = 1, 2, 3, ..., W}, consisting of W pairs of vector inputs x_w ∈ R^L and noisy scalar outputs y_w. Using this training dataset, GPR constructs a model capable of effectively extrapolating the output distribution to novel input locations. The uncertainty in the output noise is presumed to stem from external factors such as truncation or observational errors; this noise is additive, characterized by a zero-mean, stationary, and normally distributed nature [27].
GPR employs a Gaussian process (GP) to depict the latent variables f, with x functioning as an index for these variables. The objective is to confine the examination to functions whose values exhibit Gaussian correlation. This is accomplished by assigning a consistent Gaussian distribution to any finite set {f(x_1), ..., f(x_k)} with distinct indices, which equates to introducing a GP prior over functions within a Bayesian framework. By defining the mean function v(x) and the covariance function k(x, x′), functions can be conveniently described. This method simplifies predicting function values for new inputs, even with limited training data. The variance σ²_noise is utilized to represent the model's noise.
Here E[·] represents the expectation. Typically, the mean function is chosen to be 0, with the primary focus on the unobserved region of the input space. The behavior of the process is then solely influenced by the covariance function, which, by definition, is symmetric positive semi-definite when evaluated for any pair of input-space points [28]. The covariance function typically encompasses multiple hyperparameters that dictate the prior distribution of f(x). The squared exponential covariance function is a frequently employed choice [29].
In this context, ‖·‖ represents a norm defined on the input space. It is essential to highlight that as the distance between input pairs x and x′ increases, the covariance function diminishes rapidly, signifying weaker correlations between f(x) and f(x′). There are three hyperparameters at play: q_1 determines the upper limit for covariance, q_2 is a strictly positive hyperparameter dictating the rate at which correlation diminishes with increasing point separation, and q_3 serves as an additional hyperparameter, representing the unknown variance σ²_noise in Eq. (1), even though it is not explicitly mentioned in Eq. (2). These hyperparameters are assembled into a vector q, which is treated as the realization of a random vector Q. The realization that provides the closest fit to the dataset is selected for generating predictions using the training data. When the hyperparameters are assumed known, a joint Gaussian distribution can be derived for the vector of training latent variables f and the vector of test latent variables f*. The symmetric covariance matrix K is generated by computing the covariance between the i-th variable in the group denoted by the first subscript and the j-th variable in the group represented by the second subscript (where * is utilized as an abbreviation for f*). This computation involves the covariance function k(·, ·) from Eq. (4) and the associated hyperparameters [30]. Figure 1 presents the GPR flowchart.
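As a concrete illustration, the GPR formulation above, a squared exponential covariance with hyperparameters for signal variance, length scale, and noise, can be sketched with scikit-learn. This is a hedged example on toy data, not the study's implementation; the hyperparameter names (length_scale, alpha, n_restarts_optimizer) mirror those reported later in Table 2, and the toy function is purely illustrative.

```python
# Minimal GPR sketch with a squared-exponential (RBF) kernel.
# q1 ~ signal variance (ConstantKernel), q2 ~ length_scale,
# q3 ~ noise variance sigma^2_noise (passed as alpha).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Toy 1-D training set: noisy observations of a smooth function
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(40, 1))
y = np.sin(X).ravel() + 0.05 * rng.standard_normal(40)

kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)
gpr = GaussianProcessRegressor(kernel=kernel, alpha=0.05**2,
                               n_restarts_optimizer=5, random_state=0)
gpr.fit(X, y)  # hyperparameters are tuned by maximizing the marginal likelihood

# Probabilistic prediction: posterior mean and standard deviation at a new input
X_new = np.array([[2.5]])
mean, std = gpr.predict(X_new, return_std=True)
print(mean, std)  # mean should lie near sin(2.5), with an uncertainty estimate
```

The `return_std=True` flag exposes the uncertainty quantification discussed in the Objective section: each prediction is a distribution, not a point estimate.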

Sand cat swarm optimization (SCSO)
SCSO is a swarm-based algorithm that takes its inspiration from the hunting tactics of sand cats. The primary stages of SCSO can be outlined as follows.
❖ Step 1: Initiating the algorithm. Within the optimization problem, each sand cat corresponds to a 1 × Dim array (where Dim signifies the number of decision variables), as illustrated in Fig. 2 [31]. Per this diagram, each Pos value must fall within the specified upper and lower bounds. Initially, an initialization matrix of size n × Dim is generated, considering the problem's dimension [32,33]. The associated solution is regarded as the output value, with each subsequent iteration replacing it with a superior value. If no improved values are attained during the current iteration, the solutions remain unchanged and unsorted.
❖ Step 2: Exploration phase (hunting for prey). Sand cats possess the capability to detect low frequencies below 2 kHz, and SCSO leverages this keen sense of hearing [34]. The auditory acuity of the sand cat is expressed by Eq. (5) and denoted as R_G.
In Eq. (6), parameter R serves as the control factor for regulating the exploration and exploitation phases of the algorithm, and the value of s_M is set to 2. During the search, the sand cat stumbles upon a new position randomly within the sensitivity range. The sensitivity range (r) undergoes random variations to prevent getting stuck in local solutions.
In this equation, the parameter R_G serves as a guide for the sensitivity range, denoted as r, and each sand cat's position is represented by P_i. The sand cat seeks the prey's location relative to the best candidate position (P_bc), the current position (P_c(t)), and the sensitivity range (r), as described in Eq. (8).
❖ Step 3: The exploitation phase (prey attack). When simulating the sand cat's attack, the distance between the sand cat and its prey is expressed by Eq. (8) [35]. In the attack modeling procedure, it is posited that the sensitivity range forms a circular area, and a random angle (α), chosen through the roulette wheel selection function, determines the direction of the sand cat's movement. The sensitivity range spans −1 to 1 for the random selection of α within the [0°, 360°] range. As illustrated in Fig. 2, this circular motion allows the sand cat to move in various peripheral directions and thus reach the hunting location more swiftly. The process of attacking prey is defined by Eq. (10) [36].
❖ Step 4: Executing the SCSO algorithm. As previously stated, control of the exploration and exploitation phases is governed by R and R_G. As per Eq. (10), R takes on a random value within the range [−4, 4] owing to the decrease of R_G from 2 to 0, as indicated in Eq. (9). Hence, in accordance with Eq. (10), when the value of |R| is less than or equal to 1, the sand cat engages in attacking the prey; otherwise, it searches for the prey within the global domain.
In line with Eq. (11), the sand cat's position is updated during both the exploration and exploitation phases.
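The four steps above can be sketched as a compact NumPy routine. This is a hedged reconstruction that follows the SCSO update rules as commonly published in the cited literature; the exact constants and ranges of the original implementation may differ slightly from this sketch, and the sphere objective is only a placeholder.

```python
# Hedged SCSO sketch on a sphere objective (minimum 0 at the origin).
import numpy as np

def scso(obj, dim=5, n_cats=20, max_iter=100, lb=-5.0, ub=5.0, seed=1):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(lb, ub, size=(n_cats, dim))   # Step 1: n x Dim initialization matrix
    fitness = np.apply_along_axis(obj, 1, pos)
    best = pos[fitness.argmin()].copy()             # best candidate position P_bc
    for t in range(max_iter):
        r_G = 2.0 - 2.0 * t / max_iter              # auditory guide, decreases 2 -> 0
        for i in range(n_cats):
            R = 2.0 * r_G * rng.random() - r_G      # exploration/exploitation control factor
            r = r_G * rng.random()                  # sensitivity range
            if abs(R) <= 1:                         # Step 3: exploitation (attack the prey)
                alpha = np.deg2rad(rng.uniform(0.0, 360.0))   # random angle in [0, 360] deg
                dist = np.abs(rng.random(dim) * best - pos[i])
                pos[i] = best - r * dist * np.cos(alpha)
            else:                                   # Step 2: exploration (global search)
                pos[i] = r * (best - rng.random(dim) * pos[i])
            pos[i] = np.clip(pos[i], lb, ub)
        fitness = np.apply_along_axis(obj, 1, pos)
        if fitness.min() < obj(best):               # greedy replacement of the best solution
            best = pos[fitness.argmin()].copy()
    return best, obj(best)

best, val = scso(lambda x: np.sum(x**2))
print(val)  # fitness shrinks toward the minimum as r_G decays
```

The decaying r_G is what moves the swarm from global exploration to local exploitation, mirroring the role of R and R_G described in Step 4.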

Equilibrium slime mould algorithm (ESMA)
The foraging behavior of slime mould presents a promising source of inspiration for developing effective and efficient optimization methods [37].The starting position vector of each slime mold is randomly initialized through a randomization process.
The positioning model for the i-th slime mould, represented as X_i (i = 1, 2, ..., N), in the next iteration (t + 1) is established using SMA as follows:

The X_Gbest denotes the position associated with the global best fitness achieved across iterations one to t. Additionally, the variables r_1 and r_2 correspond to random values within the range [0, 1].
To eradicate and disseminate the slime mould, a probability denoted by z is utilized; within this study, z is a constant value of 0.03 [38]. Equation (14) sorts the fitness values in ascending order, and Eq. (15) is employed to calculate the vector U.
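A minimal sketch of the SMA-style update underlying ESMA follows. This is a simplified baseline in which the equilibrium-pool refinement that distinguishes ESMA is omitted; the quantities z, U, P_i, step_a, and step_b follow the descriptions in this section, while the decay schedules for a and b are a common SMA choice and may differ from the paper's Eq. (19).

```python
# Simplified SMA-style sketch (ESMA's equilibrium pool omitted), sphere objective.
import numpy as np

def sma(obj, dim=5, n=20, max_iter=100, lb=-5.0, ub=5.0, z=0.03, seed=2):
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n, dim))
    fit = np.apply_along_axis(obj, 1, X)
    gbest, f_gbest = X[fit.argmin()].copy(), fit.min()
    for t in range(1, max_iter + 1):
        a = np.arctanh(1 - t / (max_iter + 1))        # bound for step_a, shrinks toward 0
        b = 1 - t / max_iter                          # bound for step_b, shrinks toward 0
        f_lbest, f_lworst = fit.min(), fit.max()      # local best/worst of this iteration
        # Weight vector U: larger for better-ranked slime moulds
        U = 1 + rng.random((n, dim)) * np.log10(
            (f_lbest - fit[:, None]) / (f_lbest - f_lworst + 1e-12) + 1)
        for i in range(n):
            if rng.random() < z:                      # redistribute with probability z
                X[i] = rng.uniform(lb, ub, dim)
            else:
                P_i = np.tanh(abs(fit[i] - f_gbest))  # trajectory-selection probability
                step_a = rng.uniform(-a, a, dim)      # uniform in [-a, a]
                step_b = rng.uniform(-b, b, dim)      # uniform in [-b, b]
                A, B = X[rng.integers(n)], X[rng.integers(n)]
                if rng.random() < P_i:                # move relative to the global best
                    X[i] = gbest + step_a * (U[i] * A - B)
                else:                                 # contract the current position
                    X[i] = step_b * X[i]
            X[i] = np.clip(X[i], lb, ub)
        fit = np.apply_along_axis(obj, 1, X)
        if fit.min() < f_gbest:
            f_gbest, gbest = fit.min(), X[fit.argmin()].copy()
    return gbest, f_gbest

best_pos, val = sma(lambda x: np.sum(x**2))
print(val)
```

The shrinking bounds a and b play the same role as r_G in SCSO: wide, exploratory steps early and tight, exploitative steps near convergence.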
A random number, r_3, uniformly distributed within the range [0, 1], is utilized. The local worst and best fitness values acquired during the current iteration are denoted by f_Lworst and f_Lbest, respectively, and are calculated by Eqs. (15) and (16). Below is the formula defining the variable P_i, which represents the probability of selecting the trajectory of the i-th slime mould: for each i = 1, 2, ..., N, the fitness value of the i-th slime mould in X_i is given by f(X_i). The global best fitness value from the first iteration up to the current iteration is represented by f_Gbest. The magnitude of the step size, step_a, is drawn from a uniform distribution ranging from −a to a; similarly, the step size step_b is drawn from a uniform distribution ranging from −b to b. The values of a and b are determined by Eq. (19) as a function of the current iteration t and the maximum iteration T.

Data collection
The characteristics of the study variables are succinctly summarized in Table 1, with references provided for further scrutiny and validation [1,39,40]. Acquiring the inputs requisite for predicting rocks' unconfined compressive strength (UCS) necessitates a meticulous approach, and the following procedures were executed to attain the essential dataset:
1. Bulk density (BD): Bulk density values were ascertained through laboratory-based measurements; rock samples representative of the study scope were subjected to a meticulous determination of mass and volume.
2. Brazilian tensile strength (BTS): Measuring Brazilian tensile strength requires specialized laboratory equipment; rock specimens were carefully prepared and subjected to tensile stress, with the resulting values recorded.
3. Dry density (DD): Dry density data were obtained through systematic laboratory assessments, involving accurate determination of the mass and volume of rock samples after desiccation.
4. P-wave velocity (Vp): P-wave velocity, a pivotal parameter, was derived from seismic investigations; field-based seismic surveys were conducted, or existing seismic data conforming to the requisite rock types were employed.
5. Shear strength (SRn): Laboratory-based tests were performed to establish shear strength properties, encompassing the application of controlled stress conditions on rock specimens with meticulous recording of the outcomes.
6. Point load index (Is(50)): The Is(50) values were determined via standardized point load tests, in which rock samples were loaded until failure occurred, with precise measurement of the strength values.
7. Unconfined compressive strength (UCS): The ultimate target variable, UCS, was obtained through rigorous unconfined compression tests, in which axial load was applied until rock sample failure was observed, allowing precise determination of UCS values.
The data acquisition process adhered to stringent quality control measures to ensure the dataset's integrity, consistency, and accuracy (Fig. 3). This comprehensive dataset, comprising diverse rock types and conditions, serves as the foundation for the subsequent development of a robust machine-learning model for UCS prediction. It is noteworthy that the dataset utilized in this study is presented in Table 5 in the Appendix.
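The six-input/one-target structure of the dataset can be illustrated as a small feature table. The column names follow the variables listed above, but the numeric values here are placeholders for illustration only; the study's actual measurements appear in Table 5 in the Appendix.

```python
# Illustrative assembly of the model's input/output table (values are placeholders).
import pandas as pd

data = pd.DataFrame({
    "BD":   [2.61, 2.55, 2.70],        # bulk density
    "BTS":  [7.2, 5.9, 9.4],           # Brazilian tensile strength (MPa)
    "DD":   [2.58, 2.49, 2.66],        # dry density
    "Vp":   [4120.0, 3550.0, 4890.0],  # P-wave velocity (m/s)
    "SRn":  [38.0, 31.0, 45.0],        # shear strength (the study's SRn notation)
    "Is50": [3.1, 2.4, 4.2],           # point load index Is(50) (MPa)
    "UCS":  [88.0, 61.0, 112.0],       # target: unconfined compressive strength (MPa)
})
X = data.drop(columns="UCS")           # six predictors
y = data["UCS"]                        # one target
print(X.shape, y.shape)
```

A table in this shape is what the GPR-based models consume: one row per rock sample, six index-test predictors, one measured UCS target.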

Performance evaluators
In this section, an overview is provided of the range of metrics employed to assess the performance of the hybrid models, with a specific emphasis on their ability to quantify errors and correlations. These metrics serve as valuable tools for evaluating the efficacy of hybrid models in diverse applications. The metrics under discussion include root-mean-squared error (RMSE), the coefficient of determination (R²), mean-squared error (MSE), mean relative absolute error (MRAE), and the ratio of RMSE to standard deviation (RSR). Collectively, they form a comprehensive toolkit for evaluating and understanding the accuracy and reliability of hybrid models in real-world scenarios [41].
The accuracy and reliability of the proposed models (GPSC and GPES) in predicting UCS were validated using a variety of metrics and criteria beyond R² and RMSE values. Specifically, a range of supplementary metrics, including MSE, MRAE, and RSR, was employed to comprehensively assess the performance of the models. By utilizing this array of metrics, a holistic understanding of the predictive capabilities of the models was obtained, ensuring a thorough evaluation of their accuracy and reliability.
• Coefficient of determination (R²)
• Root-mean-square error (RMSE)
• Mean square error (MSE)
• Mean relative absolute error (MRAE)
• The ratio of RMSE to standard deviation (RSR)

Fig. 3 The scatter plot between input and output

The variables in these formulas can be articulated as follows:
• The predicted value is denoted as b_i.
• m̄ and b̄ represent the average measured and average predicted values, respectively.
• The measured (recorded) value is indicated as m_i.
• n signifies the sample size.
• t represents the critical value from the t-distribution, which depends on the selected confidence level and degrees of freedom.
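The metrics above can be implemented directly in the notation of the text (m_i measured, b_i predicted, n samples). Note that exact definitions vary slightly across papers; this sketch uses common conventions, with RSR taken as RMSE divided by the population standard deviation of the measured values.

```python
# Direct implementations of the evaluation metrics listed above.
import numpy as np

def metrics(m, b):
    m, b = np.asarray(m, float), np.asarray(b, float)
    mse = np.mean((m - b) ** 2)                                   # mean squared error
    rmse = np.sqrt(mse)                                           # root-mean-squared error
    r2 = 1 - np.sum((m - b) ** 2) / np.sum((m - m.mean()) ** 2)   # coefficient of determination
    mrae = np.mean(np.abs(m - b) / np.abs(m))                     # mean relative absolute error
    rsr = rmse / np.std(m)                                        # RMSE / std of measured values
    return {"R2": r2, "RMSE": rmse, "MSE": mse, "MRAE": mrae, "RSR": rsr}

measured = [50.0, 60.0, 80.0, 100.0]
predicted = [52.0, 58.0, 79.0, 103.0]
print(metrics(measured, predicted))
```

As the text notes, R² approaching 1 and the error metrics approaching 0 indicate an accurate model; RSR additionally normalizes RMSE by the spread of the observations, making it comparable across datasets.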

Results of hyperparameters and convergence curves
In contrast to parameters, hyperparameters represent predefined specifications that are not inherently deduced from the dataset. These external configurations, incorporating elements such as learning rates and regularization strengths, are instrumental in delineating the behavioral characteristics of a model within the context of machine learning. Achieving optimal model performance relies significantly on the fundamental task of fine-tuning hyperparameters, necessitating rigorous experimentation and the utilization of sophisticated optimization methodologies. Table 2 outlines the hyperparameter values linked with the GPSC and GPES models, focusing particularly on n_restarts, length_scale, and alpha. As an illustration, the alpha hyperparameter value was 0.2 for GPSC and 0.26 for GPES. Figure 4 depicts the evolution of RMSE across iterations: the horizontal axis signifies the iteration number, ranging from 0 to 200, while the vertical axis denotes the corresponding RMSE values, which fall within the range of 0 to 10. The curves started with a higher RMSE and progressively decreased with each subsequent iteration, finally converging to a lower RMSE value by the 200th iteration. Analyzing the convergence curves, the GPSC model commenced with an initial RMSE of 9.1, whereas the GPES model started slightly lower at RMSE = 8.1. Throughout the convergence process, both models demonstrated consistent reductions in RMSE, ultimately reaching values below 4 by the 200th iteration. In summary, while both models exhibited improvement, the GPSC model outperformed with a final RMSE value of 2.6.

Comparison of models' performance
In this section, a comparative analysis of the results produced by the proposed models was undertaken, employing two frameworks: single models and hybrid models. More precisely, the hybrid variations are formulated by combining GPR with sand cat swarm optimization (GPSC) and with the equilibrium slime mould algorithm (GPES). For the models under consideration, 70% of the UCS inputs were designated for the training phase, with the remaining 30% split into 15% for validation and 15% for testing purposes.
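The 70/15/15 partition described above can be reproduced with two successive splits. The proportions match the text; the placeholder arrays and the random_state value are illustrative assumptions, not the study's configuration.

```python
# Sketch of the 70% train / 15% validation / 15% test split.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(200).reshape(100, 2)   # placeholder features (100 samples)
y = np.arange(100)                   # placeholder target

# First split: 70% train, 30% held out
X_tr, X_hold, y_tr, y_hold = train_test_split(X, y, test_size=0.30, random_state=42)
# Second split: divide the 30% evenly into 15% validation and 15% test
X_val, X_te, y_val, y_te = train_test_split(X_hold, y_hold, test_size=0.50, random_state=42)
print(len(X_tr), len(X_val), len(X_te))  # 70 15 15
```

Keeping a separate validation set, as the study does, allows hyperparameter and optimizer choices to be made without touching the final test data.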
To comprehensively evaluate the outcomes and guarantee impartial results, a suite of metrics, including R², RMSE, MSE, MRAE, and RSR, was utilized. In the context of the R² metric, values nearing 1 signify excellent results; conversely, for the error indicators, values approaching 0 indicate precise outcomes. Table 3 presents the outcomes as assessed by the models using the specified metrics. The performance of the GPR model is considered subpar, as indicated by its RMSE values of 4.145 in training and 6.313 in testing. Nevertheless, the integration of optimizers has resulted in a notable enhancement in the precision of the GPR model. Among the hybrid models, the GPSC model stands out with the highest accuracy, achieving an R² value of 0.995 and an RMSE value of 1.913 in training. Furthermore, the GPES model has produced moderate results, achieving an R² value of 0.988 in training. While both optimizers have improved the performance of the conventional GPR, the GPSC has yielded exceptionally accurate results.
The detailed performance information provided in the upcoming figures can be consulted for a more comprehensive evaluation of the models' predictive capability regarding UCS. Figure 5 illustrates a scatter plot representing the models' performance, taking into account their R² and RMSE values. The scatter plot includes circular markers of different colors, denoting the training, testing, and validation phases. These markers are distributed around a central line, symbolizing an ideal outcome with an R² value of 1. The GPR model exhibits noticeable data dispersion, highlighting its limited accuracy.
In contrast, the GPSC and GPES models display improved performance compared to the standalone GPR model. The data points for GPSC are closely grouped around the central line, suggesting a more favorable outcome; nonetheless, somewhat broader dispersion can be observed in the case of GPES.
Figure 6 demonstrates the relationship between the predicted and measured values of the GPR-based models. In this illustration, the black lines represent the measured values; when the predicted values align precisely with these measured values, it signifies the model's accuracy.
As shown in Fig. 6, there is a noticeable difference between the data produced by the GPR model and the measured values, especially during the testing phase. Conversely, the GPSC model demonstrates remarkably accurate results, with a nearly perfect alignment between its predicted and measured values, particularly during the testing phase. The GPES model, on the other hand, shows a notable lack of precision between sample numbers 20 and 40, which renders it less accurate than the GPSC model.
Evaluating the error percentage of the proposed models can provide further insights into their performance and aid in identifying the most optimal one. Figure 7 illustrates the normal distribution of errors in the models. The GPR model exhibits the highest frequency of errors within the range of −80 to 80%. The GPSC model, by contrast, demonstrates a pronounced skewness, with a substantial frequency of errors clustering near 0%, while the GPES model exhibits a moderate performance relative to the others, with errors from around −10 to 10%. Among these models, the GPSC model distinguishes itself for its precision and consistently reliable results.
To obtain a thorough grasp of the errors across the models, refer to the scatter interval illustrated in Fig. 8. The GPSC model's accuracy is revealed by its closely grouped data points, predominantly falling within the error range of −10 to 10%. This stands in contrast to the dispersion noticed in the other models and highlights the dependability of the GPSC model's predictions, as the clustering of data points within a relatively narrow error range signifies consistent and precise performance. Conversely, the dispersion observed in the error distributions of the other models indicates a broader variability in their predictions. Table 4 elucidates the findings derived from a comprehensive array of previous studies focused on predicting UCS within the field. These collective outcomes furnish an expansive groundwork against which the findings of the present study can be thoroughly evaluated and compared. Regarding this table, several models, such as ANN and RF, have been used to predict the UCS of rock samples. Among all these models, the RF model from a study by Hoque et al. [42] registered the lowest error value while also registering the lowest R² value. On the other hand, the study of Sharma and Singh [43] employing Model-I simultaneously registered the highest R² and RMSE values. In contrast to those studies, the GPSC model in the present study achieved a high R² value of 0.995 and a low RMSE value of 1.913.

Conclusions
The prediction of the unconfined compressive strength (UCS) of rocks through machine learning has emerged as a promising and impactful area of research in geotechnical engineering. It has significant implications for a wide range of civil engineering projects, including tunneling, mining, and foundation design, where an accurate assessment of UCS is crucial for ensuring the safety and stability of structures. The study emphasizes the central role of machine learning, and of hybrid models in particular, in enhancing the accuracy and reliability of UCS predictions. Conventional methods struggle to capture the complex relationships in rock data, leading to slow convergence and local-minima issues. Machine learning techniques, such as Gaussian process regression (GPR) coupled with advanced optimizers, namely sand cat swarm optimization (SCSO, yielding the GPSC model) and the equilibrium slime mould algorithm (ESMA, yielding the GPES model), effectively tackle these challenges. Furthermore, evaluating these models with several performance metrics, including R², RMSE, MSE, MRAE, and RSR, provides a comprehensive picture of their capabilities. Based on the results, the following conclusions can be drawn:

• The integration of the SCSO and ESMA optimization algorithms into the standalone GPR model led to notable improvements in R² values: SCSO contributed a 2.1% improvement and ESMA a 1.3% increase. This underscores the effectiveness of incorporating these optimization techniques into the GPR framework to boost the accuracy of UCS predictions.

• Among all the models assessed, the GPR model exhibited the least favorable performance, recording the highest error value (an RMSE of 6.313 during validation) and the lowest R² (0.955 during validation). Taken together, these metrics indicate that the standalone GPR model is the least accurate at predicting UCS values.

• The findings clearly highlight the outstanding precision of the GPSC model relative to the GPR and GPES models, as reflected in its remarkable performance: an RMSE of 1.913 and the highest R² value of 0.995.

• The prediction models for estimating UCS were developed using the parameters under investigation. Comparing the experimental results with the predictions generated by the proposed models shows that the models forecast UCS values with a notable level of accuracy. These results strongly suggest that the proposed models are effective for UCS prediction and hold promise for diverse applications in geotechnical engineering. However, their applicability across varying ground conditions may be limited by the specificity of the dataset used in their development. Furthermore, the complexity introduced by using multiple models, and the sensitivity to the choice of meta-heuristic algorithm, warrant careful consideration when evaluating their interpretability and practical utility.
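The hybridization scheme described in the conclusions — a metaheuristic tuning a GPR model's kernel hyperparameters against validation error — can be sketched as follows. Since the paper's SCSO and ESMA implementations are not reproduced here, a simple random search stands in as a placeholder optimizer; the GP uses a zero-mean posterior with a squared-exponential kernel, and all data below are synthetic.

```python
import numpy as np

def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between 1-D input arrays."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gpr_predict(x_train, y_train, x_test, length_scale, noise=1e-6):
    """Posterior mean of a zero-mean Gaussian process regression."""
    k = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_star = rbf_kernel(x_test, x_train, length_scale)
    return k_star @ np.linalg.solve(k, y_train)

def tune_length_scale(x_tr, y_tr, x_va, y_va, rng, n_iter=50):
    """Random search over the kernel length-scale. This is only a stand-in for
    SCSO/ESMA, which minimise the same validation-RMSE objective."""
    best_ls, best_rmse = None, np.inf
    for _ in range(n_iter):
        ls = 10 ** rng.uniform(-1, 1)  # candidate length-scale in [0.1, 10]
        pred = gpr_predict(x_tr, y_tr, x_va, ls)
        rmse = np.sqrt(np.mean((pred - y_va) ** 2))
        if rmse < best_rmse:
            best_ls, best_rmse = ls, rmse
    return best_ls, best_rmse

# Synthetic 1-D illustration (not rock data): fit a sine curve
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 40)
y = np.sin(x)
x_tr, y_tr = x[::2], y[::2]      # training split
x_va, y_va = x[1::2], y[1::2]    # validation split
ls, rmse = tune_length_scale(x_tr, y_tr, x_va, y_va, rng)
```

Swapping the random search for a population-based optimizer such as SCSO or ESMA changes only how candidate length-scales are proposed; the GPR surrogate and the validation-RMSE objective stay the same.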

Fig. 2 Update mechanism of the sand cat position across iterations. a Iteration i. b Iteration i + 1

Fig. 4 Convergence of the developed hybrid models

Fig. 5 The scatter plots for the developed hybrid models

Fig. 6 The comparison of predicted and measured values

Fig. 7 Normal distribution of errors in the models

Fig. 8 Scatter intervals of errors for the developed models

Table 2 Results of hyperparameter tuning

Table 3 Results of the developed GPR-based models

Table 4 Comparison between the present study and published studies