Roadway safety research indicates that drivers’ behavior, demographics, and the local environment are correlated with risk perception and roadway crashes. This research examines these issues in an Egyptian context by addressing three groups: private car drivers, truck drivers, and public transportation drivers. A Driver Behavior Questionnaire (DBQ) was developed to capture information about drivers’ behavior, personal characteristics, risk perception, and involvement in crashes. Risk perception was captured subjectively by showing participants visual scenarios representing specific local conditions and asking them to rate each situation from a safety perspective.
Results indicated that the human factor, in particular the failure to keep a safe following distance, was a major cause of crashes. The analyzed data were used to predict expected crash frequency based on personal attributes, such as age, driving experience, personality traits, and driving behavior, using negative binomial models. The study suggests that the DBQ technique, combined with risk perception scenarios, can be used to understand drivers’ characteristics and behaviors and to collect information on the crashes they experience.
In practice, the study findings could provide a series of recommendations to the local authorities: introducing a traffic management and noise control act, raising awareness of driving etiquette, setting and enforcing driving-hours regulations, and considering specific training programs for beginner drivers.
World Health Organization (WHO) statistics for road crashes worldwide show that the number of deaths due to road crashes ranges between 1.25 and 1.35 million per year and that road crashes are the leading cause of death among young people. The number of road crashes has decreased in developed countries due to several interventions. However, this is not the case in developing countries, which account for 90% of worldwide road crash fatalities. The USA rate of road fatalities was 1.3 persons per 10,000 vehicles in 2016. At the same time, the rate in Egypt was 11 road fatalities per 10,000 vehicles. An equally startling statistic is that there are 4 deaths in Egypt per 100 km of road, while the rates in the UK and USA are 0.47 and 0.92 people, respectively. These statistics indicate an alarming number of fatalities in Egypt, resulting in a heavy toll on Egyptian welfare and the economy at large.
Accordingly, there is a strong need to study the relationship between drivers’ behavior, risk perception, and roadway crashes. This research, therefore, tackles roadway crash issues in an Egyptian context by addressing three groups (general drivers, truck drivers, and public transportation drivers) to find ways to enhance the safety of Egyptian roads and to improve Egyptian driving behaviors. The Driver Behavior Questionnaire (DBQ) technique is adopted to capture information about drivers, their behavior, and risk perception. The collected data were then used to predict the expected crash frequency using negative binomial models.
This research endeavors to investigate and quantify the impact of human behavior on road crashes in Egypt and to better understand driver conduct relevant to traffic safety. Specifically, the objective is to map the relationship between drivers’ demographic characteristics, their history of traffic rule violations and crashes, and their level of risk perception while driving along Egyptian roads. Because the authorities lack an accurate and representative roadway crash database, this research attempts to connect these dots by means of a survey and driver interviews, and ultimately to model the association of these variables as a step towards improving traffic safety in Egypt. The problem is compounded by the scarcity of research on the types of and reasons for negative driver behavior in Egypt.
With the above objectives in mind, this paper is structured as follows. The background studies section summarizes relevant studies on risk perception, drivers’ behavior, driver demographics, and personality traits and their relation to roadway crashes. It also discusses techniques used to study drivers’ behavior, including the Driver Behavior Questionnaire (DBQ), its structure, fields of interest, and different ways to collect data. Then, the “Methods” section presents the research methodology and the steps followed to study drivers’ negative behaviors, risk perception, and their relationship to traffic crashes and violations with respect to demographic factors. The results are then presented in descriptive analysis and crash prediction modeling subsections. A specific section is devoted to a discussion of the modeling results and key findings. Finally, the paper wraps up with conclusions, limitations, and potential for future research.
According to the background studies, about 90–95% of traffic incidents result from human actions. Thus, it is reasonable to infer that a crash is more likely to be the fault of the driver than of the vehicle. Dahlen et al. (2012) showed that aggressiveness in driving increases the risk of crashes and physical injuries. The study associated anger, which interferes with judgment and coordination behind the wheel, with lower driving performance and a higher likelihood of a crash. Aggressive behavior is the intention to harm or injure other drivers or pedestrians, emotionally or physically. Alonso et al. (2019) found that the perception of anger, aggressiveness, and risky behavior varies with participants’ sociodemographic characteristics, and that people’s attitudes and behaviors towards road safety reflect their perception. Tao et al. (2017) found that personality traits and driving experience played a role in predicting the risk of traffic crashes. Regev et al. (2018) studied the relationship between exposure, age, gender, and time of driving in the UK. This research showed that both low and high exposure, that is, time behind the wheel, are associated with high risk. Elshamly et al. (2017) found that fatigue due to long driving hours and lack of sleep is the likeliest cause of truck crashes in Egypt. These findings highlight the important role played by human factors in the risk of crash involvement among drivers. Vanlaar et al. (2006) validated an empirical model of drivers’ perception of the causes of road accidents and of the differences in perception between participants. Data were collected in 23 countries through face-to-face interviews in which participants rated 15 causes of road accidents on a six-point ordinal scale. The model showed no relevant differences between participants from the 23 countries. Driving under the influence of alcohol or drugs was perceived as the most significant cause of road accidents, followed by using mobile phones.
Several methods have been used to examine driver behavior and characteristics, including GPS tracking devices, the Mobile-Sensor-Platform for Intelligent Recognition of Aggressive Driving (MIROAD), and virtual reality systems. Other methods involve the use of the Driver Behavior Questionnaire (DBQ) technique. This research adopts the DBQ method to collect demographic information, driving behavior, and drivers’ crash history, and to estimate the level of risk perception.
Reason et al. developed the Driver Behavior Questionnaire (DBQ) to measure drivers’ actions behind the wheel. The DBQ is one of the most widely used instruments for measuring self-reported driving behaviors. The DBQ method has been used for studies in China, Canada, Denmark, Latvia, Qatar, and other countries. The content and structure of the studies varied with the study scope, objective, and participants.
Despite the popularity of the DBQ, this research is the first attempt to implement it with Egyptian drivers. The survey is divided into four sections: (A) demographic characteristics, (B) the driver’s traffic violations and crashes, (C) driver behavior, and (D) risk perception.
As stated earlier, the research objective is to study drivers’ negative behaviors, risk perception, and their relationship to traffic crashes. In the absence of advanced technologies (such as simulators) for conducting in-depth behavioral studies by simulating real driving, this study adopted the DBQ technique and developed a survey form to collect the required data on demographics, crash history, violation information, behaviors, perception, and personality traits. The data were then analyzed descriptively and statistically, and different statistical models were derived, tested, and compared. The best model was then selected to predict the number of crashes from certain variables.
Survey form design
As survey design is critical to achieving the study’s objectives, a comparative literature review was conducted to determine how previous researchers designed their DBQ instruments. The research team then summarized the most relevant studies to capture the most significant reported variables, categorized by demographics, crash history, violation information, behaviors, perception, and personality traits, as illustrated in Table 1. The review identified 58 variables from previous studies (Table 1), each categorized as very significant (VS); correlated, but not very significant (C); or neither significant nor correlated (NS). Although the research team used the most significant variables from state-of-the-art research as a starting point, knowledge of the local context and exploratory interviews with drivers highlighted the need to consider additional factors and questions, such as driving under stress, which seemed relevant in the Egyptian context. A 35-question instrument was developed, of which 17 questions focused on driver behavior, while the rest were designed to capture the other factors associated with roadway safety, such as the driver’s characteristics, previous traffic violations, and crash history. Most of the questions used a 5-point Likert-type scale (1 = never, 2 = rarely, 3 = sometimes, 4 = most of the time, and 5 = always). The questionnaire was divided into four sections, each with a set of related variables, as detailed in the following subsections:
Questions in the demographic characteristics section covered age, gender, driving experience, number of daily driving hours and trips, and education level, to study the relationship between each factor and traffic crashes. Collecting participants’ age and gender is important for studying the relationship between drivers’ age group, their driving behavior, and the occurrence of a traffic crash.
The education level and driving experience are used to study their effect on risky driving behavior, traffic violations, traffic crashes, and risk perception. As the literature reveals that driving experience plays an important role in road safety, the instrument also addressed the number of hours of daily driving and the number of trips per day to measure drivers’ exposure to potential roadway incidents.
Traffic violations and crash history
This section of the questionnaire was designed to elicit information about the driver’s traffic violation and crash history, specifically the cause and number of crashes in the previous 3 years. This information serves as a basis for the statistical and descriptive analysis of the relationship between traffic crashes, driving behavior, risk perception, and demographic factors.
The driver behavior section has 17 questions measuring participants’ risky and aggressive behavior while driving; such behavior could include speeding, tailgating, distracted driving, and failure to wear a seatbelt. Other questions investigated the participant’s reaction when experiencing aggressive or inappropriate behavior from other drivers or roadway users.
The participant’s level of risk perception was captured using image-based scenarios. The research team surveyed several selected roads to collect real-life footage of various combinations of traffic conditions, driver population, day and night driving, and so on. Participants were then shown selected scenes from real-life situations captured on different roads in and around Cairo. These images depicted traffic violations, aggressive behavior, traffic safety, and road geometry concerns in 10 scenarios typical of driving on Egyptian roads (for example, overloading a vehicle, picking up or dropping off passengers along an urban highway, pedestrians crossing in front of traffic on the highway, and heavy trucks driving in the left lane). The objective of this part was to investigate the relationship of driver risk perception, aggressive behavior, and crash history with relevant demographic factors. Participants evaluated each scenario from the standpoint of their own safety perception and awareness while driving by rating it on a scale from 1 to 5, with 1 being very safe and 5 being very dangerous.
Data collection process
The study collected data through online questionnaires and field survey forms. It was expected that the significant parameters for drivers in Egypt might differ from those found in similar studies in other countries. Such differences relate to road conditions, driver behavior, safety warrants, culture and habits, traffic laws, and enforcement.
The sample size calculation indicated a minimum sample size of 385, assuming a population size N exceeding 1 million, a margin of error e of 0.05, and a confidence level of 95%, for which the Z score is 1.96.
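The calculation can be reproduced with a short sketch. It assumes Cochran’s formula with a finite population correction and a maximum-variability proportion p = 0.5; the source does not state the proportion or the exact formula it used, so both are assumptions, and the function name is illustrative.

```python
import math

def cochran_sample_size(z, e, p=0.5, population=None):
    """Minimum sample size for estimating a proportion.

    z: Z score for the confidence level (1.96 for 95%)
    e: margin of error (e.g. 0.05)
    p: assumed proportion (0.5 is the most conservative choice; an assumption here)
    population: optional population size N for the finite population correction
    """
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

print(cochran_sample_size(1.96, 0.05))                        # 385
print(cochran_sample_size(1.96, 0.05, population=1_000_000))  # 385
print(cochran_sample_size(2.576, 0.05))                       # 664
```

Under these assumptions, the required sample barely changes once the population is large, and a 99% confidence level (Z of about 2.576) at the same margin of error needs roughly 664 responses, fewer than the 824 valid responses reported for the survey.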
The data collection process started with a pilot survey on a limited scale to ensure the terminology was clear. A second trial involving 35 participants proved the soundness of the questionnaire. Then, the survey was published on multiple communication channels in early 2019. Researchers conducted personal interviews with drivers at factory loading stations, public transportation main terminals, and waiting areas around key attractions (e.g., shopping malls, cinemas, hospitals), resulting in 883 completed interviews with 515 private car drivers, 82 taxi drivers, 110 public bus drivers, 124 truck drivers, and 52 public transit drivers. After eliminating surveys with incomplete responses, the researchers had data from 824 participants.
Data collection stopped at this number, as it exceeded the minimum sample size. Because the 824 valid responses are well above that minimum, the confidence level can be increased to 99%. The survey targeted drivers across Egypt. With roughly 8.6 million registered vehicles in Egypt and a number of drivers that may reach three times the number of registered vehicles, it is practically impossible to survey the entire population of drivers. The required sample size typically approaches a fixed value beyond a certain population size. In addition, due to geographical constraints in reaching truck, taxi, and public transportation drivers, the field interviews only captured drivers from Cairo and Giza. These drivers were all male, which is expected to result in an over-representation of males in the sample. Another point to consider is that the sampling frame is biased by the online communication channels used in data collection. These issues are included in the limitations section at the end of the paper.
It is worth noting that approval and consent were part of the survey design, and the research and questionnaire did not use any personal data; participation was voluntary and anonymous. Confidentiality and the scientific value of the data were emphasized, highlighting that the data would be used only for research purposes. This was done to encourage participants to answer all questions sincerely, as some drivers were afraid to participate, thinking that the data might be shared with traffic police. The data were then collected and initially explored through descriptive analysis to cluster the participants according to driver categories. Each category was described separately and illustrated by graphs and figures showing the distributions of answers among survey variables, as shown in the results section. The data were integrated into a logical format for further processing using Statistical Package for the Social Sciences (SPSS®) version 22 and Minitab. The analyzed data and variables were then used to produce predictive models that estimate the number of likely crashes based on the drivers’ characteristics; these are presented in the modeling section, after the descriptive analysis results are discussed in the following section.
Results and discussion
The researchers first analyzed the data and variables using descriptive analysis to identify the significant variables, as shown below.
The descriptive analysis of the questionnaire covers participants’ reported crashes and their causes, the driving behavior questions and the number of crashes related to these behaviors, and the risk perception ratings and percentages for each scenario, based on the participants’ opinions. Age, gender, driving experience, education, and the number of daily trips for each driver are also presented.
The demographic analysis results show that 74.4% of the participants were male (including the bus, taxi, and truck drivers, who made up over 30% of the sample), and 70.7% had a university undergraduate or post-graduate degree. The results also show that 57.6% had five or more years of driving experience. Of these, 52.5% were taxi, truck, or microbus drivers. Figure 1 indicates that the majority of participants (57.28%) were involved in one to three crashes in the last 3 years. The figure breaks this down by the following variables, with the dominant category in brackets: age [26–40 years], gender [male], driving experience [> 10 years], and education level [university and post-graduate degree]. As shown in Fig. 1a, the greatest percentage of participants in the 1–3 crashes category were 26–40 years old. More than 30% of this category have more than 10 years of driving experience. As shown in Fig. 1c, 189 participants, or 48.8%, with 10 or more years of experience were taxi, truck, or microbus drivers. Also, 188 of these 189 participants (99.47%) usually drove 3–8 h per day. Therefore, these drivers had longer exposure times, increasing the probability of being involved in crashes.
In this sample, 74.4% are males, as can be concluded from Fig. 1b, and more than 40% of the participants involved in the 1–3 crashes category were males; females accounted for only 12% in this category. It should be noted that 25.6% of the sample was made up of female drivers, representing 211 participants. A total of 71 of these participants held a post-graduate degree, and the rest had graduated from a university with an undergraduate degree.
The average crash frequency for the demographic variables that were deemed significant, with all other variables held fixed, is shown in Fig. 2. In the demographic dimension, the age category was inversely related to the number of crashes. As shown in Fig. 2b, there was a relationship between exposure and the number of traffic crashes for public transportation and truck drivers who drove at least 3 h a day. The same is true of the general drivers; those who drove more than 3 h per day tended to have more crashes.
Number and reasons for crashes
The participants were asked to respond to questions on the number and cause of vehicular crashes they experienced in the previous 3 years. The number of crashes ranged widely from 0 to 16, with a reported mean of 2.04 and a median of 1.00 crashes per participant. The reasons behind the reported crashes are summarized in Fig. 3. Participants reported that they believed their crashes were caused primarily by tailgating (18.7%), which is the failure to keep a sufficiently safe distance between their car and the car in front, followed by sudden swerving of their car or the car in front (16.36%). The third most likely cause was distracted driving (14.71%), caused by mobile phones or eating.
Driver behavior data
Driver behavior data show that 56.4% of the participants exceeded the posted speed limit, while 15.0% of the respondents said they always overtake from the right-hand side. In addition, 40.1% said they use phones while driving, and 61.7% said they express anger or aggressiveness by flashing the headlight beam or honking the horn. Only 35.5% of the drivers kept a safe following distance of more than 18.0 m when driving at a speed of 80 km/h. Finally, the survey results indicated that a substantial 25.48% of the participants drove in the opposite direction of traffic. While the authors acknowledge that this percentage is considerably higher than in any other country, it is noteworthy that 45% of respondents are public transportation and truck drivers. Most of those drivers received only primary or preparatory education, and they do not always abide by driving rules. Therefore, this result was not a complete surprise, especially in the Greater Cairo Region, where researchers interviewed these drivers. Figure 4 shows the variability in the average crash frequency for the different levels of driving behavior. For example, drivers who regularly use the horn or the high beam aggressively (Fig. 4a), tend to drive in the opposite direction (Fig. 4b), or tend to tailgate the vehicle in front (Fig. 4c) are more likely to be involved in crashes. In other words, hostility on the highway is more likely to result in a roadway crash. It was also found that seatbelt use was inversely related to the average number of crashes (Fig. 4d); that is, drivers who tend to use seatbelts were less likely to have been involved in a crash in the last 3 years.
As previously discussed in the DBQ setup, respondents were asked to rate specific scenes from real-life situations captured on different roads based on their perception of the risky behavior in the scenario. This section reports the scenes deemed most dangerous by the various types of drivers. The scenes depicted traffic violations, aggressive behavior, traffic safety issues, and road geometry concerns in 10 scenarios typically witnessed on Egyptian roads. The 10 captured scenarios were: a typical cross-section with no depicted risk, improper pavement marking or variable median width, illegal pickup or drop-off by public transportation, illegal or unsafe loading of heavy trucks, night driving with no lights, trucks driving in the fast (left) lane, illegal pedestrian crossing on busy highways, illegal pickup or drop-off on highways, dangerous means of transportation such as riding on top of goods on trucks, and driving against traffic.
The participants rated the situation on a scale from 1 to 5, 1 being very safe and 5 being very dangerous. The respondents were grouped into three different driver categories: truck drivers, public transportation (bus and taxi) drivers, and passenger car drivers. The differences in the results between the three categories, presented in Fig. 5, provide interesting insights into how different groups of drivers perceive risk in different ways.
The results showed that 93.5% of the truck drivers rated a pedestrian illegally crossing a road with high-speed traffic as the most dangerous situation. Illegal picking up or dropping off on the highway came second (86% rated this situation as a very dangerous act). The dangerous means of transportation for passengers and cargo came third, with 82% of the truck drivers rating it as a very hazardous act. These three situations were perceived as the highest risk, as they are probably the main reasons for heavy vehicle crashes on Egyptian roads and pose the greatest danger to truck drivers. The truck drivers did not perceive trucks driving in the fast lane, driving in the opposite direction of traffic, or illegal or dangerous truck loading as dangerous acts.
Public transportation drivers had a slightly different view. A total of 92.9% agreed with the truck drivers that illegal pedestrian crossing was the most dangerous situation, followed by unsafe means of transportation (85%) and trucks driving in the fast or left lane (82.4%). These three situations are representative of the risky situations that public transportation drivers may encounter on roads in Egypt, and they increase the probability that these drivers will be involved in a crash. The public transportation drivers found it very dangerous for trucks to travel in the left lane, but not more dangerous than illegal pedestrian crossing. They also rated illegal picking up and dropping off as a normal scenario because they, unlike truck drivers, do this regularly.
Passenger car drivers agreed with drivers of public transportation on the first and second most dangerous situations: illegal pedestrian crossing (94.6%) and dangerous means of transportation (90.9%). Passenger car drivers reported that the third most dangerous situation was truck drivers traveling in the fast lane (87.2%). This represents one of the most severe types of crashes on Egypt’s highways, in which heavy vehicles are involved with a private car and/or pedestrians. Regardless of the class of drivers, it was clear that participants rated the exposure of vulnerable pedestrians crossing illegally as the most dangerous situation (93.9%), followed by dangerous means of transport (86.6%) and truck drivers driving in the left lane (79.3%).
It is noteworthy that 25% of the participants said they might drive in the wrong direction to reach their destination faster and have done this at least once. Moreover, only 36% of the participants felt that driving against traffic was a very dangerous act. This reflects the general perception in Egypt that this is an acceptable way to drive given traffic conditions.
Modeling procedure for the probability of crashes
After the data were initially explored, statistical analysis and modeling were needed to characterize the drivers’ behavior. Because of the nature of the collected data, regression techniques were utilized. In SPSS, the data and variables were analyzed to produce predictive models that estimate the number of crashes likely to occur, based on the demographic factors, driver behavior, and risk perception. The dependent variable is the number of traffic crashes, and the independent variables, which are categorical, are the driver behavior, risk perception, and demographic variables. Regression models were applied to each cluster of variables to investigate the critical factors affecting the probability of crash occurrence within that cluster. For example, the demographic variables were examined against the number of crashes to determine how those variables affected the probability of crash occurrence. Each cluster was studied separately. Then, all clusters were investigated together against the number of crashes.
Various regression analyses were conducted: linear regression (LR), negative binomial regression (NBR), and Poisson regression (PR). A series of tests and analyses was carried out to identify the most suitable model for the data. Crashes are count data and are usually modeled using Poisson and negative binomial regression models. Rare-event count data such as crash occurrences are often well described by the Poisson distribution (Washington et al. 2020). However, the Poisson distribution requires the mean of the count data to equal its variance. This is not the case in this research: the variance is significantly larger than the mean, which implies that the data are over-dispersed. In many cases, over-dispersed count data are successfully modeled using the negative binomial distribution (Washington et al. 2020).
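The over-dispersion argument can be illustrated with a minimal sketch. The crash counts below are made-up illustrative values, not the survey data, and the moment-based dispersion estimate assumes the common NB2 form Var(Y) = mu + alpha * mu^2; statistical packages typically estimate alpha by maximum likelihood instead.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by sample mean (close to 1 for Poisson-like data)."""
    return variance(counts) / mean(counts)

def nb2_alpha_moments(counts):
    """Method-of-moments estimate of alpha in Var(Y) = mu + alpha * mu**2.

    Returns 0.0 when the data are not over-dispersed.
    """
    mu = mean(counts)
    return max(0.0, (variance(counts) - mu) / mu ** 2)

# Hypothetical crash counts per driver (illustrative only, not the survey data).
example_counts = [0, 0, 1, 1, 1, 2, 2, 3, 4, 16]

ratio = dispersion_ratio(example_counts)
alpha = nb2_alpha_moments(example_counts)
print(f"variance/mean = {ratio:.2f}, alpha = {alpha:.2f}")
if ratio > 1:
    print("Counts look over-dispersed; a negative binomial model may fit better than Poisson.")
```

A ratio well above 1 is the informal signal that the Poisson equal-mean-and-variance requirement is violated; a formal choice between the two models would still rest on fit statistics.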
In the first step, data and codes were reviewed again to investigate the data distribution. Each of the collected variables is categorical, except for the car crash count, which is measured as the number of crashes. Three tests were run, PR, NBR, and LR, to assess the predictive capability of the variables. According to the initial results of the three models, where all the constructs are included, the NBR appears to be most appropriate in that it was able to extract a higher number of predictors. The variables were clustered in three dimensions: (1) driver behavior, (2) risk perception, and (3) demographic variables.
Model A: demographic variables model
The correlation analysis for the demographic variables is shown in Table 2. It indicates a strong positive correlation between age and driving experience. The two variables most positively correlated with the number of crashes are the number of driving hours and the number of daily trips. Age is negatively correlated with the number of crashes. The internal consistency of the demographic variables was assessed with Cronbach’s alpha reliability test. The Cronbach’s alpha scores for the initial and final trials of the demographic variables are shown in Table 4a. Using all the variables describing the drivers’ demographic characteristics results in a low reliability, with an alpha of 0.319. Thus, it is necessary to progressively drop variables from the model until an acceptable level of reliability is attained. Dropping the gender, trip purpose, and education variables resulted in an acceptable alpha coefficient of 0.735. The model parameter estimation is presented in Table 5a, indicating that, at a 5% significance level, only the age, number of daily driving hours, and number of daily trips variables can be retained to explain the predicted crash frequency.
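The reliability check described above can be sketched as follows. This is a minimal illustration of the standard Cronbach’s alpha formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of the total score), with an "alpha if item dropped" helper that mirrors the progressive removal of variables. The function names and toy scores are hypothetical; they are not the survey data or the SPSS procedure itself.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of k items, each a list of respondent scores."""
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def alpha_if_item_dropped(items):
    """Alpha recomputed with each item left out in turn (needs at least 3 items)."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]

# Toy scores: three items answered by four respondents (illustrative only).
toy_items = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
    [1, 3, 2, 4],
]
print(cronbach_alpha(toy_items))
print(alpha_if_item_dropped(toy_items))
```

Comparing the overall alpha with the leave-one-out values shows which variable, when removed, raises internal consistency the most, which is the kind of step the paper describes when dropping variables to reach an acceptable alpha.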
Model B: driver behavior variables model
The correlation analysis for the drivers’ behavior variables, shown in Table 3, indicates a strong positive correlation between speeding and illegal overtaking, mobile phone use, and changing lanes. In particular, the strong positive correlation of speeding with wrong overtaking and frequent lane changes indicates an aggressive driving attitude. As with Model A above, the Cronbach’s alpha reliability test initially produced an alpha coefficient of 0.532 when all the variables describing the driver’s behavior were included. Dropping the variables for the impact of drugs, running red lights, and changing lanes illegally resulted in an acceptable reliability of 0.720, as shown in Table 4b.
The model parameter estimation is presented in Table 5b. It indicates that, at a 5% significance level, only the exceeding the speed limit and driving in the opposite direction variables among the driver behavior factors can be considered to explain the predicted crash frequency. In addition, at a significance level of 1% or more, the variables for keeping a safe following distance, using a seatbelt while driving, and honking the horn or using the high beam are significant. The resulting significant model is shown in Table 5b, together with the goodness-of-fit tests that indicate how strongly and precisely the model predicts the number of crashes.
Model C: combined demographics and drivers’ behavior model
Combining all the variables from both the demographic and behavior dimensions results in a mixed model that estimates the predicted crash frequency as a function of both dimensions. The model parameter estimation is presented in Table 5c, indicating that, at a 5% significance level or less, the following variables are significant: age, number of daily driving hours, number of daily trips, honking the horn/using the high beam, and driving in the opposite direction. The variable that contributes the most to reducing the predicted frequency of crashes is age, indicating that older drivers are less likely to be involved in a crash. Of course, this effect levels off at a certain age and depends on the participant’s age group. On the other hand, the behavioral components of the model contribute more to the predicted crash frequency than the demographic ones.
Model D: adjusted demographics and drivers’ behavior model
Using only the significant and logical variables retained from the previous models, and taking into account the Cronbach’s alpha reliability measures and the correlation matrix, results in a mixed model that estimates the predicted crash frequency as a function of those variables alone. The model parameter estimates are presented in Table 5d and indicate that, at a significance level of 5% or less, the following variables are significant: age, number of daily driving hours, number of daily trips, honking the horn/using high beams, driving in the opposite direction, and keeping a safe following distance.
Age and keeping a safe following distance are the variables that contribute the most to reducing the predicted crash frequency. On the other hand, driving in the opposite direction contributes more to the predicted crash frequency than the other variables.
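To make the direction of these effects concrete, the sketch below shows how a negative binomial model with a log link turns a set of coefficients into a predicted crash frequency, and how each coefficient can be read as an incidence rate ratio. All coefficient values, variable codings, and function names here are hypothetical placeholders chosen for illustration; they are not the estimates in Table 5d.

```python
import math

# Hypothetical coefficients for illustration only; these are NOT the
# estimates reported in Table 5d.
HYPOTHETICAL_INTERCEPT = 0.50
HYPOTHETICAL_COEFFICIENTS = {
    "age_group": -0.20,                  # negative: fewer predicted crashes
    "daily_driving_hours": 0.05,
    "safe_following_distance": -0.15,    # negative: fewer predicted crashes
    "driving_opposite_direction": 0.30,  # positive: more predicted crashes
}

def predicted_crash_frequency(driver: dict) -> float:
    """Log-link prediction: exp(intercept + sum of coefficient * value)."""
    linear_predictor = HYPOTHETICAL_INTERCEPT + sum(
        HYPOTHETICAL_COEFFICIENTS[name] * value for name, value in driver.items()
    )
    return math.exp(linear_predictor)

def incidence_rate_ratio(name: str) -> float:
    """Multiplicative change in expected crashes per one-unit increase."""
    return math.exp(HYPOTHETICAL_COEFFICIENTS[name])

# Hypothetical driver profile using assumed codings.
driver = {
    "age_group": 2,
    "daily_driving_hours": 8,
    "safe_following_distance": 1,
    "driving_opposite_direction": 0,
}
print(f"predicted crashes: {predicted_crash_frequency(driver):.3f}")
for name in HYPOTHETICAL_COEFFICIENTS:
    print(f"{name}: IRR = {incidence_rate_ratio(name):.3f}")
```

An incidence rate ratio below 1 corresponds to a variable that reduces the predicted crash frequency, and a ratio above 1 to a variable that increases it.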
The parameter estimates presented in the modeling tables indicate that all four models were significant. The model that best represented the sample was identified by comparing the models on well-known criteria, starting with the chi-square goodness of fit (reported as R2). The chi-square test checks goodness of fit against the null hypothesis, for example to determine whether two or more categorical random variables, such as age and accidents, are independent of each other. It is also used to compare the log-likelihoods of regression models under the null hypothesis, so it can be used to compare models and identify the one that best fits the data. Two further criteria, Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC), evaluate how well a model fits the data it was estimated from while penalizing the number of parameters. AIC is used to compare candidate models and determine which is the best fit for the data. BIC is related to the estimated probability of a model being true, so a lower BIC means that a model is considered more likely to be the true model. Both criteria rest on various assumptions and asymptotic approximations.
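For reference, the two information criteria can be written in a few lines. The sketch below uses the standard textbook definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of estimated parameters, n the sample size, and ln L the maximized log-likelihood. The log-likelihoods, parameter counts, and sample size in the example are hypothetical values, not results from the study.

```python
import math

def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical candidate models: (maximized log-likelihood, parameter count).
candidates = {
    "small model": (-1000.0, 4),
    "large model": (-995.0, 12),
}
n_obs = 500  # hypothetical sample size

for name, (loglik, k) in candidates.items():
    print(f"{name}: AIC = {aic(loglik, k):.1f}, BIC = {bic(loglik, k, n_obs):.1f}")
```

In both cases a lower value is preferred. Because the BIC penalty grows with ln(n), it penalizes extra parameters more heavily than AIC for all but very small samples.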
Model goodness of fit
All the models were assessed with the goodness-of-fit (R2) test. The results showed that Model D, the adjusted demographics and drivers’ behavior model, best represented the sample: its Omnibus Test (likelihood ratio chi-square) value was 99.235 with 6 degrees of freedom and a significance of p < 0.001. For goodness of fit, the deviance (720.037) over the degrees of freedom (816) was reported with R2 = 0.912, and the Pearson chi-square (976.801) over the degrees of freedom (816) with R2 = 1.097.
Bayesian Information Criterion (BIC)
All the models were compared using the Bayesian Information Criterion (BIC). Model D had the lowest BIC, 3430.237, followed by Model A (3453.663), Model B (3455.174), and Model C (3484.213), which means that Model D was the preferred model under this criterion.
Akaike Information Criterion (AIC)
Comparing the models on the Akaike Information Criterion (AIC), Model D, the adjusted demographics and drivers’ behavior model, had the lowest and therefore best value, 3392.524. Model C and Model B were both reported at 3422.174, and Model A at 3430.093. The Consistent Akaike’s Information Criterion (CAIC) for Model D, 3438.237, was also the lowest among all models.
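The comparison can be reproduced directly from the values reported in the text. The sketch below selects the model with the lowest BIC and AIC, and then, assuming the standard definitions AIC = -2 ln L + 2k, BIC = -2 ln L + k ln(n), and CAIC = -2 ln L + k (ln(n) + 1), backs out the parameter count k and sample size n implied by the Model D figures. The back-calculation is an illustrative consistency check under those assumed definitions, not a result stated by the authors.

```python
import math

# Criterion values as reported in the text for the four models.
bic = {"A": 3453.663, "B": 3455.174, "C": 3484.213, "D": 3430.237}
aic = {"A": 3430.093, "B": 3422.174, "C": 3422.174, "D": 3392.524}
caic_d = 3438.237  # CAIC reported for Model D

# Lower is better for both criteria.
best_by_bic = min(bic, key=bic.get)
best_by_aic = min(aic, key=aic.get)
print("Preferred by BIC:", best_by_bic)
print("Preferred by AIC:", best_by_aic)

# Gap between each model and the best one (0 for the best model).
delta_bic = {m: round(v - bic[best_by_bic], 3) for m, v in bic.items()}
print("BIC differences:", delta_bic)

# Consistency check for Model D under the assumed standard definitions:
#   CAIC - BIC = k, and BIC - (AIC - 2k) = k * ln(n).
k = caic_d - bic["D"]
neg2_loglik = aic["D"] - 2 * k
n = math.exp((bic["D"] - neg2_loglik) / k)
print(f"Implied parameters k = {k:.1f}, implied sample size n = {n:.0f}")
```

Under these assumptions the Model D figures imply about 8 estimated parameters and a sample of about 824, which would be consistent with the 816 degrees of freedom reported in the goodness-of-fit results.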
Testing the models showed that Model D is preferred by both the BIC and the AIC. On the goodness-of-fit McFadden pseudo R2 value, Model D (R2 = 0.912) was also found to be a better fit for the data because it falls within the accepted values (0.4 and 0.9). In conclusion, the best model found in this research is Model D, as defined by the following equation:
The DBQ technique, combined with risk perception scenarios, can be used as an enabling tool to understand drivers’ characteristics and behaviors and collect information on the crashes they experience, especially in cases where a structured periodic crash database is largely missing.
The key conclusions of this research can be summarized as follows:
Participants stated that their crashes were primarily attributed to tailgating and failure to keep a safe gap, while the modeling results added that horn honking, use of high beams, and driving toward oncoming traffic are also aggressive behaviors contributing to the predicted number of crashes.
Regardless of driver type (private car, public transportation, or truck drivers), all participants said that the most dangerous situation was pedestrians illegally crossing a busy highway. Unsafe transport of cargo and passengers and trucks illegally driving in the left or fast lane were considered the second and third most hazardous.
The variables that contribute the most to reducing the predicted frequency of crashes are age and safe following distance.
Behavioral components of the model exhibit a greater contribution to predicted crash frequency than do the demographic ones.
In practice, this research has the potential to support the Ministry of Transport and the traffic police responsible for law enforcement in the following directions:
Introducing an Anti-Car-Honking Ordinance and a Traffic Management and Noise Control Act to enforce traffic control, maintain traffic order, and ensure traffic safety.
Raising awareness of driving etiquette rules to avoid the “flashing to dazzle” effect, and considering the inclusion of informative material in the driving test exam.
Setting and enforcing driving-hours regulations, as research links fatigue to crashes and the modeling results herein clearly support this, especially since most participants who reported long driving hours were truck drivers, followed by public transportation (bus and taxi) drivers.
Considering specific education and training programs for beginning drivers, including behind-the-wheel driver education, to address tailgating, driving in the opposite direction, and seatbelt issues. Additionally, considering improvement schools for young offenders, as a non-trivial number of respondents were involved in 1–3 accidents in the last 3 years, and the number of crashes decreases as the age group increases.
Although the survey was performed with special care to avoid response patterns as much as possible, one of the biggest limitations of this study is its reliance on self-reported online data, which can be affected by social desirability bias or poor understanding of the questionnaire. Geographical constraints also arose during data collection, as the field interviews captured only drivers from Cairo and Giza. The passenger car drivers recruited via social media were limited in age range and educational category, with university and post-graduate degree holders predominating, so this part of the sample appears biased toward higher educational levels. In addition, we believe that males were over-represented in the sample because of the field interviews: those conducted with truck, taxi, and public transportation drivers resulted in males making up 74% of the total sample. However, no official statistics report the number or percentage of female drivers in Egypt. As this bias could not be controlled, and given time and effort limitations, the research had to accept it and mitigate it as shown in the modeling section. The collected data was also limited to 2019. In addition, the risk perception data could not be modeled because of unidentified technical problems. Finally, the number of collisions reported by each participant could not be verified because of constraints such as the anonymity agreement and the unavailability of government data.
Negative binomial regression was used for the models after it compared favorably with Poisson regression. Four models were presented, and based on the model testing and comparison, only one was recommended for use in the formula predicting the number of crashes.
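One common reason for preferring negative binomial over Poisson regression is overdispersion: Poisson regression assumes the variance of the counts equals their mean, whereas the negative binomial variance is mu + alpha * mu^2 with a dispersion parameter alpha. The sketch below shows a simple descriptive check on hypothetical crash counts. It is only an illustration of the idea and is not the comparison procedure used in the study, which the text does not detail.

```python
import statistics

# Hypothetical self-reported crash counts per driver (illustration only).
crash_counts = [0, 0, 0, 1, 0, 2, 0, 0, 3, 1, 0, 0, 5, 0, 1]

mean_crashes = statistics.mean(crash_counts)
variance_crashes = statistics.variance(crash_counts)  # sample variance

# Poisson implies variance == mean, so an index well above 1
# suggests overdispersion.
dispersion_index = variance_crashes / mean_crashes

# Method-of-moments estimate of alpha from variance = mu + alpha * mu^2.
alpha_estimate = (variance_crashes - mean_crashes) / mean_crashes ** 2

print(f"mean = {mean_crashes:.3f}, variance = {variance_crashes:.3f}")
print(f"dispersion index = {dispersion_index:.2f}")
print(f"moment estimate of alpha = {alpha_estimate:.2f}")
```

A formal comparison would typically fit both models and compare them with a likelihood ratio test or with information criteria such as AIC and BIC.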
Opportunities for future research could include the following:
Harnessing the potential of emerging tools, such as driving simulation, and initiating programs such as naturalistic driving studies, even at a modest scale.
Adopting a structural equation model (SEM) to extend the modeling effort presented in this paper by quantitatively studying multivariable relationships between measurement variables and latent variables.
Availability of data and materials
All the materials, including but not limited to the descriptive analysis, tables, figures, statistical models, and equations, are included in the manuscript. In addition, all the relevant raw data, Excel sheets, questionnaire forms, collected data, and SPSS files are available from the corresponding author on reasonable request to any researchers who wish to use them for non-commercial purposes, provided the confidentiality and anonymity of the collected data are preserved.
World Health Organization
Egyptian Central Agency for Public Mobilization and Statistics
Driver Behavior Questionnaire
Mobile Intelligent Recognition of Aggressive Driving
Linear regression analysis
Negative binomial regression analysis
Poisson regression analysis
Structural equation model
World Health Organization. Global status report on road safety 2018: summary. No. WHO/NMH/NVI/18.20. World Health Organization, 2018.
Janstrup KH (2017) Road Safety Annual Report 2017. Technical University of Denmark: Lyngby, Denmark.
Central Agency for Public Mobilization and Statistics. CAPMAS (2018), DDI-EGY-CAPMAS-Road-2018.
Dahlen ER, Edwards BD, Tubré T, Zyphur MJ, Warren CR (2012) Taking a look behind the wheel: An investigation into the personality predictors of aggressive driving. Accid Anal Prev 45:1–9. https://doi.org/10.1016/j.aap.2011.11.012
Alonso F, Esteban C, Montoro L, Serge A (2019) Conceptualization of aggressive driving behaviors through a perception of aggressive driving scale (PAD). Transp Res F: Traffic Psychol Behav 60:415–426. https://doi.org/10.1016/j.trf.2018.10.032
Tao D, Zhang R, Qu X (2017) The role of personality traits and driving experience in self-reported risky driving behaviors and accident risk among Chinese drivers. Accid Anal Prev 99(Pt A):228–235. https://doi.org/10.1016/j.aap.2016.12.009
Zhang H, Qu W, Ge Y, Sun X, Zhang K (2017) Effect of personality traits, age and sex on aggressive driving: psychometric adaptation of the Driver Aggression Indicators Scale in China. Accid Anal Prev 103:29–36. https://doi.org/10.1016/j.aap.2017.03.016
Martinussen LM, Hakamies-Blomqvist L, Møller M, Özkan T, Lajunen T (2013) Age, gender, mileage and the DBQ: the validity of the Driver Behavior Questionnaire in different driver groups. Accid Anal Prev 52:228–236. https://doi.org/10.1016/j.aap.2012.12.036
Perepjolkina V, Renge V (2011) Drivers’ Age, Gender, Driving Experience, and Aggressiveness as Predictors of Aggressive Driving Behaviour. Signum Temporis 4(1):62–72. https://doi.org/10.2478/v10195-011-0045-2
Sümer N, Lajunen T, Özkan T (2002) Sürücü davranislarinin kaza riskindeki rolü: ihlaller ve hatalar (The role of driver behaviour in accident risk: violations and errors). In: International Traffic and Road Safety Congress & Fair
Mesken J, Lajunen T, Summala H (2002) Interpersonal violations, speeding violations and their relation to accident involvement in Finland. Ergonomics 45(7):469–483. https://doi.org/10.1080/00140130210129682
Gueho L, Granie MA, Abric JC (2014) French validation of a new version of the Driver Behavior Questionnaire (DBQ) for drivers of all ages and level of experiences. Accid Anal Prev 63:41–48. https://doi.org/10.1016/j.aap.2013.10.024
Şimşekoğlu Ö, Nordfjærn T, Rundmo T (2012) Traffic risk perception, road safety attitudes, and behaviors among road users: a comparison of Turkey and Norway. J Risk Res 15(7):787–800. https://doi.org/10.1080/13669877.2012.657221
Washington S, Karlaftis MG, Mannering F, Anastasopoulos P (2020) Statistical and econometric methods for transportation data analysis. Chapman and Hall/ CRC press. https://doi.org/10.1201/9780429244018
All the authors confirm contribution to the paper as follows: study conception and design: DS and HA. Data collection: Sayed. Analysis and interpretation of results: IS, DS, and HA. Draft manuscript preparation: IS, DS, and HA. Manuscript review: DS and HA. All authors reviewed the results and approved the final version of the manuscript.
IS is a senior highway engineer and is interested in road safety, driver behavior, risk perception, crash analysis, autonomous vehicles, and smart city research. Sayed is currently working at RAK Municipality. He is responsible for planning and designing the strategic Megaprojects of the Emirate of Ras Al Khaimah. This is after working for Parsons Corporation for more than 5 years. Sayed was part of the designing team responsible for the roads and infrastructure design of some of Dubai’s signature projects like EXPO 2020, Palm Deira, Dubai One Way system, Dubai Design District, Health care city, and a lot more.
HA is an Associate Professor at Cairo University, Traffic and Highway Engineering, and also a Director of Urban Transport Technologies at SETS. He completed his Ph.D. in ITS from the University of Toronto. Much of his professional and academic experience has been accumulated in Egypt, the Middle East, and in Canada in Traffic management, Transportation planning, operations, modeling, and optimization; ITS specifications, technical requirements, and functional testing; Smart Mobility Systems Concepts, Vision, and Strategy; Data analytics and visualization, data-driven innovation in transportation and spatial data management; and Roadway safety audits, operational reviews, speed management and traffic calming measures.
DS is an Associate Professor at Cairo University. She completed her Ph.D. at Carleton University, Canada in 2008. She has taken part and led in several research projects in Canada and Egypt related to Traffic Safety on Highways, Driver Behaviour and its Relation to Geometric Design of Highways, and Using New Technologies for Capturing Driver Behaviour Parameters. She has received several prestigious scholarships and awards during her studies including awards by the Transportation Association of Canada, and the National Science and Engineering Research Council of Canada. She is also a Professional Engineer and is involved in several strategic transportation projects.
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.