Developing operating speed models for elevated multilane urban arterials using artificial neural networks

Abstract

Operating speed models help assess and evaluate geometric design consistency along successive road segments. The development of operating speed models has mostly focused on rural two-lane two-way highways, where horizontal curvature plays a dominant role in speed prediction. The need to enhance the prediction power of operating speed models and their ability to capture more complex relationships within an urban setting motivated this investigation. This research investigates the use of artificial neural networks (ANN) to develop operating speed models for multilane urban elevated arterial roads. Variables investigated in this study included geometric/operational features of a road segment in addition to the residual impact of the characteristics of upstream segments. A data collection exercise was undertaken on two major urban elevated arterial roads in the Greater Cairo Region, Egypt: the 6th of October and Saft Al-Laban corridors. Speed data was extracted from the Google Distance Matrix Application Programming Interface and validated using test vehicle speed data. A regression-based modeling exercise was undertaken in the preliminary investigation phase to serve as a benchmark for the intended machine learning modeling exercise. Results showed that the prediction power of the developed ANN models, which capture the residual effect of upstream speeds, outperformed that of the regression-based ones. The best-performing model used operating speeds of two upstream segments in addition to geometric/operational features of the segment under investigation to predict the segment operating speed (the model reported a MAPE of 6.7%). Outputs of this model were used in a design consistency evaluation and a potential transferability exercise to further investigate the model’s practicality.

Introduction

Road safety and design consistency have been of prominent concern in the last few decades. The hefty price that comes with post-construction evaluations and design improvements has motivated efforts toward pre-construction design consistency evaluation (during the design phase). The speed profile along a given road is a key measure in road safety and design consistency assessments. Variations in operating speeds are directly influenced by variations in road geometry. Limiting expected variations in operating speeds, by design, has become a worldwide necessity for sustainable road development. Thus, understanding the impact of road geometry on operating speeds and enhancing the prediction power of operating speed models have been of crucial concern to researchers as well as practitioners.

Immense efforts have been dedicated to modeling operating speeds, most of which were focused on freeways and two-lane two-way highways [1]. Studying the relationships between geometric/operational features of a given segment (such as horizontal curve radii, cross-section elements, and posted speeds) and its operating speed, using ordinary least-square regression techniques, has been the core of many developed operating speed models. Such modeling efforts have significantly contributed to the understanding of this complex relationship.

The need for enhancing the prediction power of operating speed models and modeling more complex relationships within an urban setting has motivated this investigation. Three main ideas were of interest to investigate: (1) widening the scope of modeling inputs to include road features of upstream segments; (2) generalizing the modeling attempt through adopting a case study of an elevated urban arterial and not the conventional two-way two-lane case study, and (3) using a machine learning modeling approach in addition to the conventional regression-based approach to enable the comparative assessment of both approaches. In the following section, examples of previous modeling efforts are described to set the stage for the current investigation.

Background

Developing operating speed models has been the focus of several research endeavors. Ordinary least-square regression has been the conventional modeling approach for a long time, with elements of horizontal curves serving as the key input to most of the developed models. On the other side, the use of machine learning in operating speed modeling has been gaining recent interest. Models were developed using artificial neural networks (ANN) in an attempt to compare their performance to that of conventional regression-based models. This section presents selected models from both sides.

Regression-based operating speed models

Table 1 presents a summary of previous regression-based operating speed models developed to predict the driver speed on two-lane two-way highways or arterials. For two-lane two-way highways, the models developed were mostly for curves with a few developed for tangents. Predictors usually represent the geometric characteristics of the alignment, such as horizontal curve radius (or its variations), deflection angle, length of curve, vertical grade of curve, approach tangent length, and operating speed of the previous element (approaching speed). Predictors of arterial roads on the other hand are mainly based on speeds such as posted speed limit and inferred design speed or based on urban segment characteristics such as access density, land uses, number of lanes, and the presence of sidewalks. This highlights that predicting operating speeds on arterial segments is more complicated than two-lane two-way highways due to various factors affecting speed other than the geometry of the alignment.

Table 1 Previously developed regression-based operating speed models

Artificial neural networks models

McFadden et al. [16] conducted a comparative study by developing two ANN models using the same predictors from the same dataset collected by Krammes et al. [3]. The accuracy of ANN models was examined by comparing the coefficients of determination (R2). The performance of ANN model 1 was compared to the performance of the regression model shown previously in Eq. 4 where ANN model 1 reported R2 of 0.761 compared to 0.81 (regression based). The performance of ANN model 2 was compared with the performance of the regression model shown previously in Eq. 5 where ANN model 2 reported R2 of 0.79 compared to 0.84 (regression based).

Semeida [17] investigated the accuracy of ANN compared to regression by developing models using the same inputs for two-way multilane highways in Egypt. Posted speed limit and geometric features of road segments were the independent variables for the regression-based model to predict operating speed on any road segment. The model with the best accuracy reported an RMSE of 10.32. Semeida [17] stated that although the regression model had an acceptable accuracy, the model did not consider all the predictors. Therefore, he used a multilayer perceptron feedforward ANN to develop a model that considers all the independent variables of the dataset as predictors. The ANN model reported RMSE = 2.9 and R2 = 0.982 on the training set, and RMSE = 4.12 and R2 = 0.84 on the validation set.

In a separate study, Semeida [18] developed two regression models (one model for passenger cars and the other for trucks) and two ANN models to predict the operating speed on horizontal curves for multilane rural highways in Egypt. Only two independent variables were included in the regression-based model (median width and deflection angle). Four other variables (shoulder width, curve length, curve radius, and superelevation rate) were significantly correlated with the operating speed but were not included in the regression models. Therefore, two ANN models were developed to examine the effect of using all correlated variables as inputs. The developed ANN models reported RMSE values of 5.77 and 4.3 and R2 values of 0.932 and 0.95, for passenger cars and trucks, respectively.

Based on the presented literature, it is apparent that using ANN in modeling operating speeds is still evolving. Most of the reviewed models were developed for freeways and two-lane two-way highways. Almost all considered predictors are segment specific (curvature radii, deflection angles, cross-section elements, posted speed, etc.), which implies that changes in a driver’s speed are only triggered by immediate changes in road features. This observation motivated this research to investigate the residual impact of upstream segments’ features on operating speeds. In other words, are a driver’s speed-related decisions a function only of the features of the current segment, or is there some built-up knowledge that comes with travelling on upstream road segments?

The objective of this research is to investigate the performance of ANN in modeling operating speeds on multilane elevated urban arterial roads. In doing so, the accuracy of the developed ANN model is compared to that of a benchmark model developed using multiple linear regression (MLR). The effect of adding the operational and geometric features of the upstream segments as variables on the accuracy of the developed model was investigated. In addition, to facilitate speed data collection, the use of speed data collected from Google Maps (through its application programming interface (API)) was investigated.

Methods

To realize the objective of this research, four main steps were adopted according to the research methodology chart, given in Fig. 1. First, a thorough review was performed to understand and select the main factors (variables) affecting operating speeds on multilane elevated urban arterial roads. Two corridors were selected for the case study: 6th of October Bridge and Saft Al-Laban Corridor, in Greater Cairo, Egypt. An experiment was set up to collect geometric and speed data from the two corridors using a GPS device in a test vehicle. The second step was concerned with data collection. While geometric features data was collected by creating AutoCAD Civil 3D® alignments, speed data was collected from Google API® distance matrix and then validated using field measured speeds.

Fig. 1 Research methodology

The third step was concerned with model development. Two types of operating speed models were developed: MLR and ANN models. All models were developed to predict operating speeds of a roadway segment given its geometric/operational features in addition to features from upstream segments. A stepwise regression technique was used to develop MLR operating speed models. On the other hand, several ANN architectures were investigated. While the architectures varied in number of inputs, number of hidden layers, number of neurons per layer, and associated functions, they all had one neuron in the output layer representing segment operating speed. Evaluation of the investigated ANN models was based on prediction accuracy indicators. Finally, a design consistency evaluation exercise was undertaken to validate the performance of the developed ANN model.

Case study data collection

Two elevated multilane urban arterials were selected for this study: 6th of October and Saft Al-Laban Corridors as shown in Fig. 2. Both corridors were chosen to ensure a wide range of sharp and smooth curves (with radii between 70 and 1400 m), and similar posted speeds of 60 km/h on most of the sections and 40 km/h on sharp alignment sections. Operating speeds data was collected during free-flow conditions to examine the effect of the geometric characteristics on vehicle speeds. Characteristics of the selected corridors are described in Table 2.

Fig. 2 Case study routes: (a) 6th of October Corridor and (b) Saft Al-Laban Corridor

Table 2 Summary of geometric features for the case study

A mobile phone equipped with a GPS-based tracker was used in a test vehicle to collect latitude and longitude coordinates along the two corridors. A horizontal alignment was developed for each direction of the two corridors using the vehicle’s position inside the lane. The recorded coordinates were used in AutoCAD Civil 3D® to develop horizontal alignments. The horizontal geometric features (tangent lengths, curve lengths, and curve radii) were determined. The alignments were calibrated using Google Earth® to ensure that they were based on coordinates that can be used with Google Maps APIs. The alignments were divided (segmented) into tangents and curves to detect the impact of a certain geometric element on the following element. Geometric features were collected for each segment: radius (R), curve length (CL), and tangent length (TL).

GPS-based speeds were measured during free-flow conditions between 3:00 AM and 4:00 AM, through one test run for each moving direction. This time slot was selected to ensure free-flow conditions, as the case-study corridors are heavily used during daytime. Nonetheless, the corridors are fully lit, which is believed to limit the impact of darkness on drivers’ behavior.

Since operating speed estimation assumes normally distributed spot speeds and requires a sample of at least 30 vehicles, the single test run was not enough to secure reliable operating speed data. Alternatively, travel time data provided by Google Cloud Services was used in this research to estimate operating speeds on different road segments. The data is publicly accessible through Google APIs®. While the test vehicle data represents speed data from only one run, Google-based data relies on a larger pool of vehicles in addition to a historical data component. It is important to note that the adequacy of using Google API data in capturing speed data has been investigated repeatedly in the literature [19]. Speeds were captured using a purpose-written Python code that builds a distance matrix. The inputs for the code were the final segment coordinates, and the outputs were travelled distance (segment length) and travel time under free-flow conditions. Since speeds were estimated under free-flow conditions, they were considered operating speeds. The accuracy of the Google-based speed estimates was evaluated by comparing them to the ground truth speed data of the 6th of October Corridor segments. The difference was reasonable: MAE and MAPE were found to be 4.2 km/h and 5%, respectively, which can be considered the difference between the single-run speed (collected from the geo-tracker) and multiple-vehicle speeds (collected by Google).
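The conversion from distance and travel time to speed, and the MAE/MAPE comparison against the test-vehicle run, can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes each segment comes back as a Distance Matrix-style element with `distance.value` in metres and `duration.value` in seconds, and all numeric values are hypothetical.

```python
def speed_kmh(element):
    """Convert one distance/duration element (metres, seconds) to km/h."""
    metres = element["distance"]["value"]
    seconds = element["duration"]["value"]
    return (metres / 1000.0) / (seconds / 3600.0)


def mae(observed, estimated):
    """Mean absolute error, in the units of the inputs (km/h here)."""
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / len(observed)


def mape(observed, estimated):
    """Mean absolute percentage error relative to the observed values."""
    ratios = [abs(o - e) / o for o, e in zip(observed, estimated)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical element: a 168 m segment traversed in 9 s.
sample_element = {"distance": {"value": 168}, "duration": {"value": 9}}

# Hypothetical speeds for three segments (km/h).
google_speeds = [speed_kmh(sample_element), 60.0, 54.0]
gps_speeds = [64.0, 63.0, 57.0]

print(round(mae(gps_speeds, google_speeds), 2), "km/h")
print(round(mape(gps_speeds, google_speeds), 2), "%")
```

Here the GPS run is treated as the reference, so percentage errors are taken relative to the GPS speeds; the source does not state which series was used as the denominator.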

Data descriptive analysis

Collected geometry and speed data were used to construct two datasets to be used in the development of two types of operating speed models. Dataset #1 was for developing operating speed models on curves only (Model #1), and Dataset #2 was for developing operating speed models on all segments (curves and tangents) (Model #2).

Dataset #1 consisted of 60 horizontal curves extracted from the developed alignments for both corridors. Table 3 shows the description of the Dataset #1 variables.

Table 3 Description of dataset #1 variables

Spearman correlation analysis was adopted to analyze the correlation between the independent variables and the dependent variable in Dataset #1, using SPSS® analysis tool. The reason for using Spearman correlation is that it relaxes the normality assumption of Pearson correlation. Table 4 shows the results of the correlation analysis, where operating speed on curve (n) (V85C(n)) is highly correlated with NL, PS, R, V85T, TL, and V85C(n-1).
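For readers unfamiliar with the measure, Spearman correlation is the Pearson correlation of the ranks of the data, which is why it does not depend on normality. The study used SPSS; the sketch below is only an illustrative stand-in with hypothetical radius and speed values.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical curve radii (m) and operating speeds (km/h).
radii = [80, 120, 180, 300, 550, 900]
speeds = [48, 55, 53, 62, 66, 71]
rho = spearman(radii, speeds)
print(round(rho, 3))
```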

Table 4 Spearman correlation analysis for dataset # 1

The analysis also shows that there is high correlation between the independent variables themselves, which indicates a potential presence of multicollinearity between the predictors. To ensure that the multicollinearity will not affect the regression model, the variance inflation factor (VIF) was calculated for each independent variable. All predictors reported 1 < VIF < 10, which illustrates that although there is multicollinearity between the predictors, it is not significant enough to affect the accuracy of the MLR model.
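As an illustration of the check, the VIF of each predictor equals the corresponding diagonal entry of the inverse of the predictors' correlation matrix (equivalently, 1/(1 − R²) from regressing that predictor on the others). The sketch below uses this identity in plain Python; it is not the SPSS procedure used in the study, and the two predictor columns are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (fails if singular)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(len(columns))]


# Two hypothetical predictor columns with a correlation of 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif_values = vif([x1, x2])
print([round(v, 2) for v in vif_values])
```

With a pairwise correlation of 0.8 both predictors get a VIF of about 2.78, which would fall inside the 1 to 10 band used above.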

On the other hand, Dataset #2 was constructed using data on all segments (tangents and curves) from both corridors. It consisted of 122 segments (60 curves and 62 tangents). A categorical approach was adopted to differentiate between segments with respect to curvature characteristics; segments were classified into 10 categories, described below:

  • Category #1 (C1): Curves with radii < 100 m

  • Category #2 to Category #9 (C2 to C9): Curves with radii of 100 to 550 m with an increment of 50 m

  • Category #10 (C10): Tangents and curves with radii > 550 m

Moreover, to investigate the effect of upstream segments’ geometric and operational features on the accuracy of the prediction model, the dataset included the characteristics of four previous segments. Table 5 below shows the shape of Dataset #2, the symbol of each variable, and its range of measurements to be used in the second model that predicts the speed of all segments.

Table 5 Variables for dataset #2
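The idea of attaching upstream-segment features to each record can be sketched as below. The field names (`L`, `C`, `PS`, `V85`) mirror the symbols used in the text, but the dictionary layout, the function name `build_lagged_rows`, and the sample values are assumptions made for illustration; segments are assumed to be listed in travel order for one direction of one corridor.

```python
def build_lagged_rows(segments, lags=2):
    """Return one record per segment that has `lags` upstream neighbours."""
    rows = []
    for n in range(lags, len(segments)):
        current = segments[n]
        row = {
            "L_n": current["L"],      # segment length (m)
            "C_n": current["C"],      # radius category (1..10)
            "PS_n": current["PS"],    # posted speed (km/h)
            "V85_n": current["V85"],  # target: operating speed (km/h)
        }
        for k in range(1, lags + 1):
            upstream = segments[n - k]
            row[f"L_n-{k}"] = upstream["L"]
            row[f"C_n-{k}"] = upstream["C"]
            row[f"V85_n-{k}"] = upstream["V85"]
        rows.append(row)
    return rows


# Hypothetical segments in travel order.
segments = [
    {"L": 103, "C": 5, "PS": 60, "V85": 74},
    {"L": 175, "C": 10, "PS": 60, "V85": 70},
    {"L": 106, "C": 4, "PS": 60, "V85": 66},
    {"L": 140, "C": 10, "PS": 60, "V85": 69},
]
rows = build_lagged_rows(segments, lags=2)
print(rows[0])
```

Setting `lags=4` would reproduce the widest variant described above, at the cost of losing the first four segments of each direction as usable records.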

Results of the correlation analysis, presented in Table 6, illustrate that the speed on segment (n) (V85 (n)) is significantly correlated with some features of segments n, n-1, n-2, n-3, and n-4 (PSn, L (n-1), V85 (n-1), C (n-1), V85 (n-2), V85 (n-3), V85 (n-4)). The correlation analysis also shows that there is multicollinearity among the independent variables.

Table 6 Spearman correlation analysis for dataset # 2

To evaluate the effect of multicollinearity on regression Model #2, the variance inflation factor (VIF) was calculated for the Dataset #2 predictors. The VIF for all independent variables was between 1 and 10 (1 < VIF < 10), which indicated that there is multicollinearity among the independent variables, but it does not significantly affect the regression models.

Model development

This research investigated two modeling techniques: multiple linear regression (MLR) and feedforward artificial neural networks (ANN). Each of these modeling techniques was used to develop two types of models. Model #1 predicts speed on curves only, using predictors from Dataset #1. Model #2 predicts speed on any segment (tangent or curve), using predictors from Dataset #2. The MLR models were developed using SPSS®, while the ANN models were developed using a tailored Python script. The dataset used in developing the ANN models was divided into training and testing data (with a ratio of 70:30). The predicting power of each modeling technique was assessed by comparing the prediction accuracy of each model.
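A 70:30 split of the kind described can be sketched as follows. The source does not say whether the records were shuffled or which random seed was used, so the shuffling, the seed, and the function name `split_rows` are assumptions.

```python
import random


def split_rows(rows, train_share=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into training and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_share * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Dataset #2 has 122 segments; indices stand in for the actual records.
train, test = split_rows(range(122))
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, which matters when comparing several architectures on the same testing subset.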

MLR-Model #1 speed on curves

A stepwise procedure was used to model operating speed on curves using Dataset #1, as shown in Eq. 30, with MAE = 7.87 km/h, MAPE = 11.66%, and R2 = 0.37. The positive sign of the coefficient for the radius (R) is logical, as the speed on the curve (\(V_{85C}\)) increases with the increase in curve radius. Moreover, the positive sign of the coefficient for \(V_{85T}\) indicates that the higher the speed on the tangent before the curve, the higher the expected speed on the curve. Table 7 presents the significance of each predictor.

Table 7 MLR Models statistical significance
$$V_{85C}=36.6+0.015\,R+0.341\,V_{85T}$$
(30)
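As a quick illustration of Eq. 30, the function below evaluates it for the 6th of October curve described in the Results section (R = 180 m, V85T = 68 km/h). The function name is hypothetical; the coefficients are those of Eq. 30.

```python
def mlr_curve_speed(radius_m, v85_tangent_kmh):
    """Eq. 30: V85C = 36.6 + 0.015 R + 0.341 V85T, in km/h."""
    return 36.6 + 0.015 * radius_m + 0.341 * v85_tangent_kmh


predicted = mlr_curve_speed(180, 68)
print(round(predicted, 1), "km/h")
```

Because the model is linear, each additional metre of radius adds a constant 0.015 km/h regardless of how sharp the curve is.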

MLR-Model #2 speed on all segments

For MLR-Model #2, the significantly correlated variables from Dataset #2 were used to develop the best regression model using a stepwise regression approach. The best model is shown in Eq. 31 and Table 7. The model is significant at a 95% confidence level with a coefficient of determination (R2) of 0.31, MAE = 6.17 km/h, and MAPE = 9.03%.

$$V_{85(n)}=14.313+0.201\,V_{85(n-1)}+0.255\,V_{85(n-2)}+0.814\,C_{(n-1)}+0.303\,PS$$
(31)

The developed model reflects the impact of the residual speed from previous segments. The operating speeds of the previous two segments (\(V_{85(n-1)}\) and \(V_{85(n-2)}\)), the radius category of the previous segment (\(C_{(n-1)}\)), and the posted speed on the road (\(PS\)) were all of significant impact.
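As a worked illustration, substituting the values of the segment used in the sensitivity analysis of the Results section (V85 (n-1) = 70 km/h, V85 (n-2) = 74 km/h, C (n-1) = 10, PS = 60 km/h) into Eq. 31 gives:

$$V_{85(n)}=14.313+0.201(70)+0.255(74)+0.814(10)+0.303(60)=73.573\approx 73.6\ \text{km/h}$$

The two upstream speeds together contribute about 33 km/h of this estimate, compared with about 18 km/h from the posted speed and about 8 km/h from the upstream radius category.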

ANN-Model #1 speed on curves

ANN is commonly used to capture nondeterministic relationships between predictors and dependent variables. In the case of Model #1, the correlation analysis showed that there is high correlation between V85C(n) as the dependent variable and NL, PS, R, V85T, TL, and V85C(n-1) as independent variables. However, the MLR model considered R and V85T only as significant predictors. Thus, a feedforward ANN was used to model the operating speed considering all predictors of Dataset #1.

Since ANN-Model #1 was developed to predict speed on curves V85C(n), tangent length (TL), approaching speed (speed on tangent) V85T, and speed on the previous curve V85C(n-1) are considered the geometric and operational properties of upstream segments. To determine the effect of adding the upstream characteristics on the accuracy of the operating speed model, a stepwise criterion was adopted in developing ANN-Model #1. The effect of adding each predictor was measured by developing four separate ANN models, shown in Table 8. The models’ architectures were developed through a comprehensive trial-and-error approach. The dataset was divided into training and testing subsets with a ratio of 70:30.

Table 8 ANN-Model #1 developed trials

The effect of adding each predictor was measured by calculating MAE and MAPE for each model, as shown in Table 9. ANN M 1–4 performed best among the developed models (MAPE of 8%), which illustrates that adding the upstream segments’ features increased the accuracy of the operating speed prediction model. Extensive efforts were exerted to fine-tune the ANN model architecture to achieve the best possible performance. ANN M 1–4 had two hidden layers, with 7 and 4 neurons, respectively, and a tanh activation function. Figure 3 presents the ANN M 1–4 architecture and learning curve.
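To make the stated architecture concrete, the sketch below builds a feedforward pass with seven inputs, hidden layers of 7 and 4 tanh neurons, and one output neuron. It is not the trained model: the weights are random placeholders, the linear output neuron and the input scaling are assumptions, and the function names are hypothetical.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random placeholder weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, layer, activation=None):
    """Weighted sum plus bias per neuron, with an optional activation."""
    weights, biases = layer
    sums = [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]
    return [activation(s) for s in sums] if activation else sums


def forward(inputs, layers):
    """tanh on hidden layers; the linear output neuron is an assumption."""
    hidden = inputs
    for layer in layers[:-1]:
        hidden = dense(hidden, layer, math.tanh)
    return dense(hidden, layers[-1])[0]


rng = random.Random(0)
sizes = [7, 7, 4, 1]  # 7 inputs, hidden layers of 7 and 4 neurons, 1 output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

# Seven hypothetical inputs already scaled to roughly [-1, 1], in the order
# NL, PS, CL, R, V85T, TL, V85C(n-1).
x = [0.5, 0.6, 0.1, -0.4, 0.7, -0.2, 0.6]
y = forward(x, layers)
print(n_params)
```

This architecture has 93 trainable parameters, which is worth keeping in mind given that Dataset #1 contains only 60 curves.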

Table 9 Model #1 prediction errors
Fig. 3 ANN M 1–4 architecture and learning curve

ANN-Model #2 speed on all segments

The developed MLR model considered four predictors only which were posted speed PS, category of the previous segment C(n-1), approaching speed V85 (n-1), and speed on segment n-2 V85 (n-2), although there were other predictors which were significantly correlated to the operating speed on segment n V85 (n). ANN-Model #2 was developed to consider all predictors of Dataset #2. To understand the contribution of data from previous segments, the model development process was conducted by developing five ANN models, as shown in Table 10.

Table 10 ANN-Model #2 developed trials

Table 11 presents MAE and MAPE for each of the developed models. The results show that although models ANN M 2–4 and ANN M 2–5 have more predictors, including the operating speed and category of segments (n-3) and (n-4), their MAE and MAPE values were higher than those of ANN M 2–3. The best-performing model was ANN M 2–3, with MAE = 4.65 km/h, MAPE = 6.7%, and R2 = 0.56. The model uses input data (predictors) from segment (n) back to segment (n-2). The model has two hidden layers, with 16 neurons each, and a tanh activation function. Figure 4 presents the ANN M 2–3 architecture and learning curve. On the other hand, using data from the more distant segments (n-3) and (n-4) negatively affected the accuracy of the model, as seen in the MAE and MAPE values of the 4th and 5th models. This implies that drivers’ choice of speed was influenced by their previous experience (represented by their speed) on the previous two segments only, (n-1) and (n-2), and that their speed on the segments before that, (n-3) and (n-4), was not a contributing factor.

Table 11 Model #2 prediction errors
Fig. 4 ANN M2-3 architecture and learning curve

Results and discussion

Model #1: Speed on curves

ANN performance surpassed the performance of MLR as R2 and MAPE for ANN M1-4 were 0.5 and 8% respectively compared to 0.37 and 11.66% for MLR-Model #1. To further investigate the robustness of ANN M1-4, an error distribution analysis was conducted on the testing dataset. As shown in Fig. 5, the distribution of errors takes a normal shape with 40% of the error values in the range of 0 to 5 km/h.

Fig. 5 Error distribution for ANN M 1–4

The ability of the developed ANN M1-4 to capture the sensitivity of operating speeds to changes in radius was examined on a curved segment from the 6th of October corridor. The segment has the following characteristics: NL = 3 lanes, PS = 60 km/h, CL = 168 m, R = 180 m, V85T = 68 km/h, TL = 113 m, and V85C(n-1) = 63 km/h. All other input parameters were fixed, and the curve radius was changed incrementally to observe the change in predicted operating speed. It can be noticed from Fig. 6 that the rate of increase in the predicted operating speed decreases with the increase in curve radius. More specifically, the predicted operating speed is highly sensitive to the change in curve radius from R = 80 m to R = 480 m (10 km/h increase). A reduced sensitivity is then observed for curves with 480 m < R < 1280 m.

Fig. 6 Sensitivity of the predicted operating speed to change in curve radius

Model #2: Speed on all segments

ANN M2-3 delivered the best performance, with R2 and MAPE of 0.56 and 6.7%, respectively, compared to 0.31 and 9.03% for MLR-Model #2. The testing data for ANN M2-3 was examined to determine the distribution of errors. The results shown in Fig. 7 depict a normal distribution of errors with a peak in the error range of 0 to 5 km/h.

Fig. 7 Error distribution for ANN-Model #2 V85 estimation

For further understanding of the behavior of the developed ANN M2-3, a sensitivity analysis was performed on a segment from 6th of October Corridor with the following features (Ln = 106 m, Cn = 4, PSn = 60 km/h, L (n-1) = 175 m, V85 (n-1) = 70 km/h, C (n-1) = 10, L (n-2) = 103 m, V85 (n-2) = 74 km/h, C (n-2) = 5). The effect of segment category Cn on the operating speed was examined by changing the category of the selected segment from 1 to 10, while the other variables were fixed. It can be noticed from Fig. 8 that the model sensitivity to changes in segment category is limited (less than 5 km/h) and more pronounced in categories 1 to 6.

Fig. 8 ANN-Model #2 sensitivity to the change in segment (n) category

The conducted sensitivity analysis shows that sensitivity to segment curvature differs between Model #1 and Model #2. Model #1 predicts operating speeds only on curves, and hence, the dataset used for training is specific to curved segments. The narrow data range allowed for higher sensitivity to curve radii. On the other hand, Model #2 is a generalized version that predicts operating speeds on any segment (tangents or curves). This comes at the expense of a wider range of data, resulting in diluted sensitivity to the impact of curve radii.

Validation of the usability of ANN-Model #2

Assessment of model usability in design consistency evaluation

Design consistency is one of the major concerns in geometric design of roads as it is considered one of the main factors that control the quality of the trip. Lamm et al. [20] developed three main criteria to evaluate the design consistency of highways; each evaluation criterion depends on evaluating the consistency of a specific measure through the successive road segments. The safety criterion (II) depends on evaluating the operating speed. In this research, the developed models predict operating speeds for arterial roads; thus, safety criterion (II) was used to evaluate the design consistency of 6th of October and Saft Al-Laban Corridors. The safety criteria developed by Lamm et al. [20] were for two-lane, two-way highways, where a good design had a change in operating speed between two successive elements of less than 10 km/h, a fair design had a change of operating speed between two successive elements between 10 km/h and 20 km/h, and a poor design had a change of operating speed of more than 20 km/h. In this study, the evaluation ranges of criterion II were modified to fit the arterial roads based on the design speed ratio between highways and arterial roads as shown in Table 12.
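The speed-difference rating between successive elements can be sketched as below. The default thresholds are the two-lane highway values quoted above from Lamm et al. [20] (10 and 20 km/h); the modified arterial ranges of Table 12 are not reproduced in the text, so they would have to be passed in as parameters. The treatment of differences exactly equal to a threshold and the function names are assumptions.

```python
def rate_transition(v85_a, v85_b, good_below=10.0, poor_above=20.0):
    """Rate the speed change between two successive elements (km/h)."""
    diff = abs(v85_a - v85_b)
    if diff < good_below:
        return "good"
    if diff <= poor_above:
        return "fair"
    return "poor"


def rate_profile(speeds, **thresholds):
    """Rate every pair of successive elements along a speed profile."""
    return [rate_transition(a, b, **thresholds)
            for a, b in zip(speeds, speeds[1:])]


# Hypothetical operating speed profile along five successive elements.
profile = [72, 68, 55, 60, 38]
ratings = rate_profile(profile)
print(ratings)
```

Calling `rate_profile(profile, good_below=..., poor_above=...)` with the Table 12 limits would apply the modified arterial ranges instead.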

Table 12 Modified ranges for operating speed-based evaluation for elevated urban arterials

A comparative design consistency evaluation was conducted on each movement direction of the case study roads using the proposed modified ranges of Safety Criterion II presented in Table 12. The first consistency evaluation was conducted for the case study road segments using the operating speed data obtained from the Google Maps Distance Matrix API as a reference for the comparative evaluation. The second evaluation was conducted for the same segments using the operating speeds (V85) predicted by the trained ANN M 2–3. The purpose was to compare the design consistency ratings obtained from the ground truth (raw) operating speed values with those obtained from operating speeds inferred by the ANN model, in order to test how different the two results were and whether the model errors resulted in different consistency ratings.

The results of the comparative design consistency evaluations are presented in Table 13 for 6th of October Corridor and Table 14 for Saft Al Laban Corridor and summarized in Table 15. Comparing the two evaluations across both corridors showed that 71% of the segments had the same evaluation category, while 29% had different evaluation ratings. The segments with different consistency ratings were 14 tangents and 15 curves with a wide range of characteristics, which indicates that there is no pattern/bias in the observed differences. The differences could be explained by two main reasons. The first is the accuracy of the developed ANN model, which reflects its prediction power. The majority of evaluation differences (97%) are one-category differences, and the remaining 3% are two-category differences. Given that the MAE of ANN M 2–3 was 4.65 km/h and that each evaluation category spans 7 km/h, most of the one-category differences can be attributed to the MAE of the prediction model itself. The second reason, identified through visual observations, is localized operational conditions that are not captured by the developed model and could affect operating speeds, such as radars, cameras, or deteriorated pavement condition in specific parts of some segments.
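A comparison of this kind can be tallied as sketched below, assuming the three rating labels are ordered good, fair, poor. The rating lists are hypothetical. Note that the shares here are fractions of all segments, whereas the 97% and 3% figures above are shares of the differing segments only.

```python
ORDER = {"good": 0, "fair": 1, "poor": 2}


def compare_ratings(reference, predicted):
    """Share of segments whose two ratings agree or differ by 1 or 2 categories."""
    gaps = [abs(ORDER[r] - ORDER[p]) for r, p in zip(reference, predicted)]
    n = len(gaps)
    return {
        "same": sum(g == 0 for g in gaps) / n,
        "one_category_off": sum(g == 1 for g in gaps) / n,
        "two_categories_off": sum(g == 2 for g in gaps) / n,
    }


# Hypothetical ratings from reference speeds and from model-predicted speeds.
reference = ["good", "good", "fair", "poor", "fair"]
predicted = ["good", "fair", "fair", "good", "fair"]
summary = compare_ratings(reference, predicted)
print(summary)
```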

Table 13 Comparative consistency evaluation results for 6th of October Corridor
Table 14 Comparative consistency evaluation results for Saft Al Laban Corridor
Table 15 Summary of consistency evaluation results

Assessment of model transferability

To further assess the developed ANN-Model #2 transferability, a validation exercise was undertaken to evaluate the potential application of ANN-Model #2 on other elevated urban arterials with similar characteristics. A test was conducted on both directions of a 1.7-km stretch of an elevated 6-lane divided urban arterial in Cairo, Egypt (Rod El Farag Corridor). Figure 9 and Table 16 display the geometric/operational characteristics of the considered portion of the corridor. Such data was used as model inputs to predict operating speeds on each segment of the corridor using ANN-Model #2.

Fig. 9 Assessed route for validation

Table 16 Descriptive statistics of assessed route

The performance of the developed ANN-Model #2 was tested by comparing the model predicted speeds to speeds obtained from Google Distance Matrix API (representing ground truth data). Table 17 depicts speed estimates and prediction errors. These results are for 10 out of the 14 original segments, since the first two segments from both directions were excluded from the prediction exercise. The operating speed on these excluded segments was assumed as the posted speed and used as input in predicting the operating speed of the following segment.
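The seeding step described above can be sketched as a sequential prediction along one direction. Several points are assumptions rather than details from the source: the MLR Eq. 31 is used only as a stand-in predictor because the trained ANN weights are not available here, predicted speeds (not observed ones) are fed forward as upstream inputs, and the posted speed, the radius categories, and the seven-segment layout are hypothetical.

```python
def eq31(v_prev1, v_prev2, c_prev1, posted):
    """MLR Eq. 31, used here only as a stand-in for the ANN predictor."""
    return (14.313 + 0.201 * v_prev1 + 0.255 * v_prev2
            + 0.814 * c_prev1 + 0.303 * posted)


def predict_direction(segments, posted, predictor=eq31):
    """Seed the first two segments with the posted speed, then predict in order."""
    speeds = [posted, posted]
    for n in range(2, len(segments)):
        upstream_category = segments[n - 1]["C"]
        speeds.append(
            predictor(speeds[n - 1], speeds[n - 2], upstream_category, posted)
        )
    return speeds[2:]  # the two seeded segments are excluded from the results


# Hypothetical radius categories for seven segments in one direction.
direction = [{"C": c} for c in (10, 10, 6, 10, 4, 10, 8)]
predictions = predict_direction(direction, posted=60)
print([round(v, 1) for v in predictions])
```

With seven segments per direction and two seeded segments in each, five predictions per direction remain, which matches the 10 evaluated segments out of 14.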

Table 17 Speed estimates and prediction errors of assessed route
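The seeding step described above can be sketched as a simple loop over the segments of one direction. The sketch assumes that the predicted speed of each segment is fed forward as an upstream input for the next one, which the text does not state explicitly. The predictor signature and the stand-in function are hypothetical placeholders and do not represent the trained ANN.

```python
def predict_profile(segment_features, posted_speed, predictor, n_seed=2):
    """Return a speed profile for one direction of travel.

    The first n_seed segments are assigned the posted speed; each later
    segment is predicted from its own features and the two upstream speeds.
    """
    speeds = [posted_speed] * n_seed
    for features in segment_features[n_seed:]:
        # predictor(features of n, V85 of n-1, V85 of n-2) -> V85 of n
        speeds.append(predictor(features, speeds[-1], speeds[-2]))
    return speeds


def stand_in_predictor(features, v_prev1, v_prev2):
    """Placeholder only: mean of upstream speeds minus an assumed reduction."""
    return (v_prev1 + v_prev2) / 2.0 - features[0]


# Seven segments per direction; the single feature is a made-up speed reduction.
features = [[0.0], [0.0], [2.0], [0.0], [4.0], [0.0], [0.0]]
profile = predict_profile(features, posted_speed=60.0, predictor=stand_in_predictor)
print(profile)
```

With seven segments per direction and two seeded segments, five speeds are predicted per direction, which gives the 10 evaluated segments.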

Prediction errors ranged between 1% and 7% for all segments except the first, where the error rose to 13%. The average prediction error was 4.6% (around 3 km/h). These results highlight the potential transferability of the developed model, as it was able to reasonably predict operating speed patterns on a new urban arterial.
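For reference, the error measures quoted here can be computed as in the following sketch, which assumes that the percentage error of a segment is the absolute difference between predicted and observed speed divided by the observed speed. The function names and speed values are illustrative only and are not the entries of Table 17.

```python
def absolute_percentage_errors(observed, predicted):
    """Per-segment absolute percentage error, relative to the observed speed."""
    return [abs(p - o) / o * 100.0 for o, p in zip(observed, predicted)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Made-up speeds in km/h for four segments.
observed = [60.0, 64.0, 58.0, 62.0]
predicted = [63.0, 62.4, 58.0, 65.1]

ape = absolute_percentage_errors(observed, predicted)
mape = mean(ape)  # mean absolute percentage error, in percent
mae = mean([abs(p - o) for o, p in zip(observed, predicted)])  # in km/h
print(round(mape, 3), round(mae, 3))
```

Reporting both measures is useful because the percentage error depends on the speed level: the same error in km/h gives a larger percentage on a slower segment.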

Conclusions

This research focused on the development of operating speed prediction models in Egypt using two modeling approaches, multiple linear regression and artificial neural networks, based on speed data extracted from the Google Distance Matrix API. Two distinct models were developed: Model #1, which predicts operating speed on curves only (V85C), and Model #2, which predicts operating speed on any segment, tangent or curve (V85(n)). No separate model was developed for speeds on tangents, since these largely depend on the length of the tangent: on short tangents, speeds depend on those of the preceding and succeeding curves. The variability in drivers' choice of speed mainly depends on the geometry of the horizontal curves experienced by the driver. To our knowledge, this is the first research attempt to investigate (1) the residual impact of the operating speeds of upstream segments (segments n-1, n-2, n-3, and n-4) on the operating speed of a given segment (n) and (2) the feasibility of developing a single operating speed model that can predict operating speeds on both curves and tangents with an acceptable level of accuracy.

A preliminary analysis was conducted on the collected data to identify correlation patterns between different variables. Seven independent variables were investigated as potential inputs to Model #1, namely, number of lanes (NL), posted speed (PS), curve length (CL), curve radius (R), speed on tangent (V85T), tangent length (TL), and operating speed on the previous curve (V85C(n-1)). A stepwise regression modeling exercise was undertaken. The resulting model used only two independent variables (R and V85T) and reported an R² of 0.37 and a MAPE of 11.66%. In contrast, the best-performing ANN model used all seven variables and reported an R² of 0.5 and a MAPE of 8%.

As for Model #2, fifteen independent variables capturing geometric/operational features of the modeled segment n and upstream segments n-1, n-2, n-3, and n-4 were investigated. The developed regression model used four independent variables: posted speed, operating speeds on segments n-1 and n-2, and curvature category of segment n-1. The model reported an R² of 0.31 and a MAPE of 9.03%. In contrast, the best-performing ANN model used all inputs for segments n, n-1, and n-2, reporting an R² of 0.56 and a MAPE of 6.7%. The performance of this model was further evaluated in a design consistency evaluation exercise.
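To make the input structure of Model #2 concrete, the sketch below assembles one input row per segment from its own length and category, the posted speed, and the length, operating speed, and category of each upstream segment. It assumes that the fifteen variables are the ones listed under Abbreviations, that a single posted speed applies to the whole route, and that the category is coded as an integer. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    length: float    # L, segment length (unit as used in the data set)
    category: int    # C, segment category (integer coding is assumed)
    v85: float       # V85, operating speed in km/h


def build_inputs(segments, posted_speed, n_upstream=2):
    """Build (rows, targets) using n_upstream preceding segments per row."""
    rows, targets = [], []
    for n in range(n_upstream, len(segments)):
        current = segments[n]
        row = [current.length, float(current.category), posted_speed]
        for k in range(1, n_upstream + 1):
            upstream = segments[n - k]
            row += [upstream.length, upstream.v85, float(upstream.category)]
        rows.append(row)
        targets.append(current.v85)  # V85(n) is the quantity being predicted
    return rows, targets


# Made-up route of six segments, not data from the study.
route = [Segment(100.0 + i, i % 3, 60.0 + i) for i in range(6)]
rows, targets = build_inputs(route, posted_speed=60.0, n_upstream=2)
print(len(rows), len(rows[0]))
```

Under these assumptions, two upstream segments give 9 inputs per row and four upstream segments give 15, which matches the number of variables investigated for Model #2.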

The outputs of this investigation highlight the potential of ANN to enhance the prediction power of operating speed models. The ability of ANN to consolidate inputs from more than one segment, widening the spatial extent of influence, was shown to effectively boost the model prediction accuracy. Moreover, the capacity to include all types of segments (curves and tangents) in one model makes the model easier to apply. Furthermore, the study assessed the potential transferability of the developed model by applying it to a new elevated urban arterial. Promising results were reported, with an average prediction error of 4.6%. However, it is important to acknowledge that the developed models serve as proof-of-concept models, as they are based on data consolidated from two elevated urban corridors. For model generalization, data from multiple elevated urban corridors should be considered.

The models can be valuable to road authorities during the design phase to check for any design inconsistencies that could be avoided. It is not common to test the operating speed profiles of drivers during the design phase, especially on high-speed urban arterials.

While this study addressed specific research objectives related to developing operating speed models for urban arterial roads using ANN, it has raised several research questions. Future research in this direction could include the following:

  1. Extending the database used to develop the operating speed model to incorporate more variability in road geometry (specifically curve radii, lane widths, and speed limits).

  2. Incorporating other variables that may impact operating speeds, such as pavement condition.

  3. Training and testing the models on data collected under other conditions, such as different lighting conditions and access points on the road segments.

  4. Correlating the operating speed behavior of drivers with crash occurrences on the road segments.

  5. Incorporating time-dependent variables such as traffic conditions, to widen the scope of the research beyond free-flow conditions.

  6. Using the results of this study to provide recommendations to road authorities on suggested speed limits, road signage requirements, and traffic management procedures that guide drivers' speed choice to suit the alignment driven.

Availability of data and materials

All the materials, including but not limited to the descriptive analysis, tables, figures, statistical models, and equations, are included in the manuscript. In addition, all the relevant raw data, Excel sheets, and SPSS files are freely available from the corresponding author on reasonable request to any researchers who wish to use them for noncommercial purposes, provided that the confidentiality and anonymity of the collected data are preserved.

Abbreviations

ANN: Artificial neural networks

MLR: Multiple linear regression

API: Application programming interface

Rmax: Maximum curve radius

Rmin: Minimum curve radius

NL: Number of lanes

PS: Posted speed

CL: Curve length

R: Curve radius

V85T: Speed on tangent

TL: Tangent length

V85C(n-1): Operating speed on previous curve (n-1)

V85C(n): Operating speed on curve (n)

V85(n): Operating speed on segment n

Ln: Length of segment n

Cn: Category of segment n

L(n-1): Length of segment n-1

V85(n-1): Operating speed on segment n-1

C(n-1): Category of segment n-1

L(n-2): Length of segment n-2

V85(n-2): Operating speed on segment n-2

C(n-2): Category of segment n-2

L(n-3): Length of segment n-3

V85(n-3): Operating speed on segment n-3

C(n-3): Category of segment n-3

L(n-4): Length of segment n-4

V85(n-4): Operating speed on segment n-4

C(n-4): Category of segment n-4

ANN M 1-n: Artificial neural network model #1, trial #n

ANN M 2-n: Artificial neural network model #2, trial #n

G: Good

F: Fair

P: Poor

References

  1. Adu-Gyamfi Y & Zhao M. (2018). Traffic speed prediction for urban arterial roads using deep neural networks. In American Society of Civil Engineers (pp. 85–96)

  2. Lamm R, Choueiri EM and Mailaender T. (1990). Comparison of operating speeds on dry and wet pavements of two-lane rural highways

  3. KrammesRA, BrackettRQ, Shafer MA, Ottesen JL, Anderson IB, Fink KL & Messer CJ. (1995). Horizontal alignment design consistency for rural two-lane highways (no. FHWA-RD-94–034). Federal Highway Administration, United States

  4. Voigt A (1996). An evaluation of alternative horizontal curve design approaches for rural two-lane highways

  5. Gong H, Stamatiadis N (2008) Operating speed prediction models for horizontal curves on rural four-lane highways. Transport Res Record 2075(1):1–7. https://doi.org/10.3141/2075-01

    Article  Google Scholar 

  6. Zuriaga AM, García AG, Torregrosa FJ, D’Attoma P (2010) Modeling operating speed and deceleration on two-lane rural roads with Global Positioning System Data. Transport Res Record 2171(1):11–20. https://doi.org/10.3141/2171-02

    Article  Google Scholar 

  7. Passetti KA, Fambro DB (1999) Operating speeds on curves with and without spiral transitions. Transport Res Rec J Transport Res Board 1658(1):9–16. https://doi.org/10.3141/1658-02

    Article  Google Scholar 

  8. Abbas SK, Adnan MA, Endut IR (2011) Exploration of 85th percentile operating speed model on horizontal curve: a case study for two-lane rural highways. Procedia Soc Behav Sci 16:352–363. https://doi.org/10.1016/j.sbspro.2011.04.456

    Article  Google Scholar 

  9. Mahmoud H, Said D & Radwan L. (2015). Three-dimensional modelling of operating speeds on horizontal curves for two-lane rural highways. In Proceedings of 94th Transportation Research Board Annual Meeting

  10. Islam M, Seneviratne P (1994) Evaluation of design consistency of two-lane rural highways. Institute Transport Eng J 64(2):28–31

    Google Scholar 

  11. Bird RN, Hashim IH (2005) Operating speed and geometry relationships for rural single carriageways in the UK. In 3rd International Symposium on Highway Geometric Design. Transportation Research Board, Chicago

  12. Hashim IH, Abdel-Wahed TA, Moustafa Y (2016) Toward an operating speed profile model for rural two-lane roads in Egypt. J Traffic Transport Eng (English edition) 3(1):82–88

    Article  Google Scholar 

  13. Fitzpatrick K, Shamburger CB, Krammes RA, Fambro DB (1997) Operating speed on suburban arterial curves. Transp Res Rec 1579(1):89–96

    Article  Google Scholar 

  14. Fitzpatrick K, Carlson P, Brewer M, Wooldridge MD (2003) Design speed, operating speed, and posted speed limit practices. In 82nd annual meeting of the transportation research board, Washington, DC

  15. Wang J, Dixon KK, Li H, Hunter M (2006) Operating-speed model for low-speed urban tangent streets based on in-vehicle global positioning system data. Transp Res Rec 1961(1):24–33

    Article  Google Scholar 

  16. McFadden J, Yang WT, Durrans S (2001) Application of artificial neural networks to predict speeds on two-lane rural highways. Transp Res Rec 1751(1):9–17

    Article  Google Scholar 

  17. Semeida AM (2013) Impact of highway geometry and posted speed on operating speed at multi-lane highways in Egypt. J Adv Res 4(6):515–523

    Article  Google Scholar 

  18. Semeida AM (2014) Application of artificial neural networks for operating speed prediction at horizontal curves: a case study in Egypt. J Modern Transport 22(1):20–29

    Article  Google Scholar 

  19. Alsobky A, Mousa R (2020) Estimating free flow speed using Google Maps API: accuracy, limitations, and applications. Advances in Transportation Studies 50:49–64

    Google Scholar 

  20. Lamm R, Psarianos B & Mailaender T. (1999). Highway design and traffic safety engineering handbook

Download references

Acknowledgements

Not applicable.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Contributions

All the authors confirm contribution to the paper as follows: study conception and design, DS and HT; data collection, FR; analysis and interpretation of results, FR, DS, and HT; draft manuscript preparation, FR, DS, and HT; and manuscript review, DS and HT. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Dalia Said.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Said, D., Reyad, F. & Talaat, H. Developing operating speed models for elevated multilane urban arterials using artificial neural networks. J. Eng. Appl. Sci. 70, 123 (2023). https://doi.org/10.1186/s44147-023-00288-4

  • DOI: https://doi.org/10.1186/s44147-023-00288-4

Keywords