CN109034264A

CN109034264A - CSP-CNN model for traffic accident severity prediction and its modeling method

Info

Publication number: CN109034264A
Application number: CN201810930337.0A
Authority: CN
Inventors: 李彤; 郑明�; 朱锐
Original assignee: Yunnan University YNU
Current assignee: Yunnan University YNU
Priority date: 2018-08-15
Filing date: 2018-08-15
Publication date: 2018-12-18
Anticipated expiration: 2038-08-15
Also published as: CN109034264B

Abstract

The invention discloses a CSP-CNN model for traffic accident severity prediction and its modeling method. The CSP-CNN model includes a model input layer, which takes as input the set of gray-scale images converted from traffic accident data and passes it to the convolutional layers for convolution; the feature vector extracted by the last convolutional layer is obtained and fed to the fully connected layer. The fully connected layer applies a flatten operation to the input feature vector, converts it into a one-dimensional vector and then processes it linearly; the fully connected layer includes 3 hidden units and outputs 3 linear results to the model output layer. The model output layer defines 3 traffic accident severity levels and predicts traffic accident severity using the Softmax activation function. The invention fully considers the spatio-temporal relationships, combination relationships and deeper intrinsic relationships among traffic accident features when predicting traffic accident severity.

Description

CSP-CNN model for traffic accident severity prediction and its modeling method

Technical field

The invention belongs to the technical field of data mining, and in particular relates to a deep-learning-based traffic accident severity prediction model and its modeling method.

Background art

Every year more than 1.25 million people worldwide lose their lives in road traffic accidents, and a further 20 to 50 million suffer non-fatal injuries, many of which result in disability. Road traffic injuries cause considerable economic losses to individuals, families and entire nations; road traffic crashes cost most countries 3% of their GDP.

Accident severity prediction is one of the important steps of incident management. It provides important information for emergency personnel to assess the severity of an accident, assess its potential impact and implement effective incident management procedures. Because correctly predicting the severity of traffic accidents provides extremely important help in saving the lives of those involved, the prediction of traffic accident severity can be regarded as a major challenge in the current field of intelligent transportation systems.

Current traffic accident severity prediction methods fall into two broad classes: statistical learning methods and deep learning methods. In recent years, deep learning, as a new machine learning approach, has attracted great attention from researchers and industry; deep learning theory has been applied to interpret text, images and sound and is widely used in fields such as text, image and speech recognition. Neural network technology, as an efficient deep learning technique, is widely used for traffic prediction problems because of its ability to process multi-dimensional data, its flexible implementation, its generality and its strong predictive power.

In traffic accident severity prediction, Mehmet Metin Kunt et al., "Prediction for Traffic Accident Severity: Comparing the Artificial Neural Network, Genetic Algorithm, Combined Genetic Algorithm and Pattern Search Methods", Transport, 2011, 26, (4), 353-366, predicted the severity of traffic accidents from 12 accident-related parameters using a genetic algorithm (GA), pattern search and an artificial neural network (ANN) with a multilayer perceptron (MLP) structure. The models were built on a data set of 1000 traffic accidents that occurred on the Tehran-Ghom freeway in 2007, and the best-fitting model was selected according to the R value, root mean square error (RMSE), mean absolute error (MAE) and sum of squared errors (SSE). The experimental results showed that the MLP reached the highest R value, about 0.87, indicating that the MLP provided the best prediction.

Zeng, Q. and Huang, H., "A Stable and Optimized Neural Network Model for Crash Injury Severity Prediction", Accident Analysis & Prevention, 2014, 73, 351-358, proposed a convex combination (CC) method to train a neural network (NN) model for traffic accident severity prediction quickly and stably, and an improved NN pruning for function approximation (N2PFA) method to optimize the network structure. They compared these with NNs trained by the conventional back-propagation (BP) method and with an ordered logit (OL) model. The results showed that the CC method outperforms the BP method in convergence ability and training speed. Compared with the fully connected NN, the optimized NN contains far fewer network nodes and achieves almost the same classification accuracy. Both fit and predict better than the OL model, which again shows that neural networks are superior to statistical models in predicting traffic accident severity.

Sameen et al., "Severity Prediction of Traffic Accidents with Recurrent Neural Networks", Applied Sciences, 2017, 7, (6), used a recurrent neural network (LSTM-RNN) to analyze 1130 traffic accidents that occurred on the North-South Expressway in Malaysia between 2009 and 2015 and applied it to traffic accident severity prediction. Their experimental results showed that the LSTM-RNN model outperformed the MLP and Bayesian logistic regression (BLR) models: the validation accuracy of the LSTM-RNN model was 71.77%, while the MLP and BLR models reached 65.48% and 58.30%, respectively.

Today CNNs have become a research hotspot in many scientific fields. A CNN is a fast and effective feed-forward neural network that is widely used in computer vision, image recognition and speech recognition, with remarkable results. In feature extraction a CNN has the following characteristics: first, the convolutional layers of a CNN are locally connected rather than fully connected, meaning that an output neuron is connected only to locally adjacent input neurons; second, another layer structure of a CNN, the pooling layer, selectively keeps only the salient features from the receptive field, which greatly reduces the number of model parameters; third, fully connected layers are used only in the final stage of a CNN. The factors influencing traffic accident severity mainly comprise five groups of features: road surface features, accident features, vehicle features, driver features and environmental factors. However, none of the related work above considers or mines in detail the spatial, combination and deeper intrinsic relationships among these features that influence traffic accident casualty severity.

Summary of the invention

The purpose of the present invention is to provide a traffic accident severity prediction model and its modeling method: the traffic accident data set is converted into gray-scale images according to the importance of the traffic accident features, a deep-learning-based CSP-CNN model for traffic accident severity prediction is constructed, and the spatial, combination and deeper intrinsic relationships among the features of traffic accident casualty severity are extracted to predict that severity.

The technical solution adopted by the invention is a CSP-CNN model for traffic accident severity prediction, consisting of the following four parts: a model input layer, convolutional layers, a fully connected layer and a model output layer;

The model input layer takes the traffic accident gray-scale image set as input and provides the input for the convolutional layers;

The convolutional layers extract abstract features of the traffic accident data set from the input gray-scale images of the traffic accident data set;

The fully connected layer converts the feature vector of the traffic accident data set, extracted and learned by the last convolutional layer, into a one-dimensional vector, processes this one-dimensional vector linearly and outputs the linear result;

The model output layer predicts traffic accident severity from the output of the fully connected layer using the Softmax activation function;

There are 4 convolutional layers, each configured with 256 filters, kernel size = 3, stride = 1 and zero-padding parameter pad = 1;

The fully connected layer includes 1 flatten layer and 128 hidden units;

The model output layer, i.e. the softmax fully connected layer, includes 3 hidden units.
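To make the layer sizes concrete, the following sketch works out the feature-map size and the number of trainable parameters implied by the configuration above. It is a minimal illustration, assuming a 5 × 5 single-channel input (the size that results from the example with 5 parent features given later in the description) and one bias per filter or unit; the function names are hypothetical.

```python
def conv_output_size(n: int, kernel: int = 3, stride: int = 1, pad: int = 1) -> int:
    """Spatial size after one convolution: (n + 2*pad - kernel) // stride + 1."""
    return (n + 2 * pad - kernel) // stride + 1


def describe_csp_cnn(input_size: int = 5, conv_layers: int = 4, filters: int = 256,
                     kernel: int = 3, hidden_units: int = 128, classes: int = 3):
    """Return (flatten length, parameter count per layer) for the stated configuration."""
    size, channels, params = input_size, 1, {}
    for i in range(1, conv_layers + 1):
        # kernel*kernel weights per input channel per filter, plus one bias per filter
        params[f"conv{i}"] = kernel * kernel * channels * filters + filters
        size = conv_output_size(size, kernel)  # unchanged: pad=1 offsets kernel=3 at stride=1
        channels = filters
    flatten_length = size * size * channels
    params["dense"] = flatten_length * hidden_units + hidden_units
    params["softmax"] = hidden_units * classes + classes
    return flatten_length, params


flatten_length, params = describe_csp_cnn()
print(flatten_length, params, sum(params.values()))
```

Because padding 1 compensates for the 3 × 3 kernel at stride 1, the spatial size is preserved through all four convolutional layers, so the flatten layer sees 5 × 5 × 256 values under the stated assumption.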

The modeling method of the CSP-CNN model for traffic accident severity prediction comprises the following specific steps:

Step 1: Convert the traffic accident data into a traffic accident gray-scale image set based on the importance of the traffic accident features, and feed it to the model input layer. The input of the traffic accident severity prediction model CSP-CNN is expressed mathematically as follows:

$$x = \{x_d\}_{d=1}^{N}, \qquad x_d = \begin{bmatrix} P_{11} & \cdots & P_{1M} \\ \vdots & \ddots & \vdots \\ P_{M1} & \cdots & P_{MM} \end{bmatrix}, \qquad M = \max(PC, CC) \tag{1}$$

where d is the index into the traffic accident data set x, N is the total number of records in x, PC is the number of parent features of x, CC is the largest number of sub-features under any parent feature of x, max(PC, CC) is the larger of PC and CC, and P_{MM} is the pixel in row M, column M of the gray-scale image pixel matrix x_d of the traffic accident data set;

Step 2: Convolution: the input provided by the model input layer is convolved using the ReLU activation function, which is:

$$g(h) = \max(0, h) \tag{2}$$

where h is the input of a convolutional neuron;

The convolution is computed as:

$$a_{k,l} = g(h_{k,l}), \qquad h_{k,l} = \sum_{c=1}^{C} \sum_{e=1}^{F} \sum_{f=1}^{F} w_{c,e,f}\, p_{c,k+e,l+f} + w_b \tag{3}$$

where a_{k,l} is the element in row k, column l of the feature map of the convolutional layer; e and f range over [1, F]; C is the number of channels, equal to the number of filters of the convolutional layer; F is the size of the filter, whose width and height are equal; w_{c,e,f} is the weight in row e, column f of the filter for the c-th channel; p_{c,k,l} is the pixel in row k, column l of the c-th channel of the input gray-scale image; p_{c,k+e,l+f} is the pixel in row k+e, column l+f of the c-th channel of the input gray-scale image; w_b is the bias term of the filter and is randomly initialized each time the model is run;

h_{k,l} is the input of the convolutional neuron;

Step 3: Fully connected layer: the feature vector extracted and learned by the last convolutional layer is converted into a one-dimensional vector by the flatten operation, using the following formula, and serves as the input of the fully connected layer:

$$a^{flatten} = \mathrm{flatten}([a_1, a_2, \ldots, a_c]), \quad c \in [1, C] \tag{4}$$

where a^{flatten} is the resulting one-dimensional vector, i.e. the feature map of the fully connected layer after flattening; [a_1, a_2, ..., a_c] is the output of the last convolutional layer, i.e. the feature vector [Feature Map 1, Feature Map 2, ..., Feature Map c] extracted and learned by the last convolutional layer;

The fully connected layer is computed as:

$$y^{fl} = w_{fl}\, a^{flatten} + b_{fl} \tag{5}$$

where y^{fl} is the linear output of the fully connected layer, w_{fl} is the weight of the fully connected layer and b_{fl} is the bias term of the fully connected layer;

Step 4: Traffic accident severity prediction: the traffic accident severity levels are set to three classes, namely slight traffic accident, serious traffic accident and fatal traffic accident. Based on the linear output of the fully connected layer, the model output layer predicts traffic accident severity using the Softmax activation function and outputs a probability for each of the set severity levels; the level with the highest probability is the predicted traffic accident severity;

Step 5: Train the CSP-CNN model for traffic accident severity prediction and determine the hyperparameter combination of the CSP-CNN model.
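Steps 3 and 4 can be illustrated with a short NumPy sketch: flatten the feature maps, apply one linear layer, then a softmax layer, and pick the most probable class. This is only a sketch under assumptions: the weight arrays are placeholders supplied by the caller, the hidden layer is kept purely linear as the text describes, and the names `predict_severity` and `SEVERITY_LEVELS` are hypothetical.

```python
import numpy as np

SEVERITY_LEVELS = ("slight", "serious", "fatal")


def softmax(z):
    """Numerically stable softmax over a 1-D vector of scores."""
    shifted = z - np.max(z)
    exp = np.exp(shifted)
    return exp / exp.sum()


def predict_severity(feature_maps, w_fl, b_fl, w_out, b_out):
    """Flatten -> linear fully connected layer -> softmax output layer."""
    a_flatten = feature_maps.reshape(-1)     # flatten operation of Step 3
    hidden = w_fl @ a_flatten + b_fl         # linear output of the fully connected layer
    probs = softmax(w_out @ hidden + b_out)  # one probability per severity level
    return SEVERITY_LEVELS[int(np.argmax(probs))], probs


# Toy usage with tiny placeholder shapes (not the patent's 6400 -> 128 -> 3 sizes).
rng = np.random.default_rng(0)
maps = rng.normal(size=(2, 2, 4))
label, probs = predict_severity(maps, rng.normal(size=(8, 16)), np.zeros(8),
                                rng.normal(size=(3, 8)), np.zeros(3))
print(label, probs)
```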

In Step 1, the conversion of traffic accident data into a traffic accident gray-scale image set based on the importance of the traffic accident features is carried out as follows:

Step 1: Obtain the feature matrix FM of the preprocessed traffic accident data set;

Step 2: Allocate k threads according to the total number of records in the original traffic accident data set; in each thread, convert the corresponding feature vector FV in the feature matrix FM of the traffic accident data set into a gray-scale image;

Step 3: Store the gray-scale image grayImage converted from the feature vector FV in each thread in the gray-scale image linked list grayImageList, and return the gray-scale image grayImage.
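The per-record, thread-based conversion described above might be organized as in the following sketch. Note the substitution: instead of allocating one thread per record (k threads), it uses a bounded thread pool, which is a common practical variant; `convert_all`, `to_gray_image` and `max_workers` are hypothetical names, and the conversion function itself is passed in by the caller.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_all(feature_matrix, to_gray_image, max_workers=4):
    """Convert every feature vector FV in FM to a gray-scale image, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input records
        gray_image_list = list(pool.map(to_gray_image, feature_matrix))
    return gray_image_list


# Toy usage: the "conversion" here just wraps each value in a 1 x 1 image.
images = convert_all([[0.1, 0.2], [0.3, 0.4]], lambda fv: [[[v]] for v in fv])
print(len(images))
```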

The traffic accident data set in Step 1 is preprocessed as follows:

(1) Delete incomplete, erroneous and duplicate traffic accident records, and delete the sub-features that have no bearing on traffic accident casualty severity;

(2) Normalize the traffic accident data set: remove the units of the data and convert the data into dimensionless pure numbers. The traffic accident data set x is normalized using the statistical Z-score normalization method so that the resulting data conform to the standard normal distribution. The conversion function of Z-score normalization is:

$$x^{*} = \frac{x - u}{\sigma} \tag{6}$$

where x is a datum under a single feature, x^{*} is its normalized value, u is the mean of all data under that feature and σ is the standard deviation of all data under that feature. Each feature in the traffic accident data set x is processed in turn.
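A per-feature Z-score normalization can be sketched in a few lines of NumPy. The sketch assumes the data set is a numeric array with one column per feature, uses the population standard deviation, and leaves constant columns at zero to avoid division by zero; these choices, and the name `z_score_normalize`, are assumptions rather than details given in the text.

```python
import numpy as np


def z_score_normalize(data):
    """Normalize each column (feature) to zero mean and unit standard deviation."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)            # u: mean of each feature
    std = data.std(axis=0)              # sigma: population standard deviation
    std = np.where(std == 0, 1.0, std)  # constant features map to 0 instead of NaN
    return (data - mean) / std


normalized = z_score_normalize([[1.0, 10.0, 7.0],
                                [2.0, 20.0, 7.0],
                                [3.0, 30.0, 7.0]])
print(normalized)
```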

The feature matrix FM of the traffic accident data set in Step 1 is obtained as follows:

Step 1.1. Determine all parent features fp of a record in the original traffic accident data set according to whether they are related to traffic accident severity prediction:

$$fp = \{fp_1, \ldots, fp_m\} \tag{7}$$

where m is the number of parent features of a record in the original traffic accident data set;

Step 1.2. Determine all sub-features fc, confirmed by data preprocessing, of a record in the original traffic accident data set:

$$fc = \{fc_{1,1}, \ldots, fc_{i,j}\} \tag{8}$$

where i ∈ [1, m], j ∈ [1, n], and fc_{i,j} is the j-th sub-feature of a record in the original traffic accident data set, whose parent feature is fp_i, subject to fc_{i,j} ∈ fp_i and fp_i ∩ fp_j = ∅ for i ≠ j, i.e. every sub-feature belongs to one and only one parent feature; the number of sub-features of the i-th parent feature is denoted Np_i = |fp_i|;

Step 1.3. Determine the importance weight vector wc of all sub-features of a record in the original traffic accident data set:

$$wc = (w_{1,1}, \ldots, w_{i,j}) \tag{9}$$

where w_{i,j} is the importance weight of the j-th sub-feature of a record in the original traffic accident data set, and this sub-feature belongs to parent feature fp_i;

Step 1.4. Determine the feature vector FV of a record in the traffic accident data set, which represents the features of that record as a triple:

$$FV = \langle fp, fc, wc \rangle \tag{10}$$

Step 1.5. Determine the feature matrix FM of the traffic accident data set, which represents the features of all records in the data set as a set of feature vectors:

$$FM = \{FV_1, \ldots, FV_k\}, \quad FM \in \mathbb{R}^{k \times n} \tag{11}$$

where k is the total number of records in the original traffic accident data set, n is the number of sub-features of a record in the original traffic accident data set, and R^{k×n} denotes a k × n matrix.
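One possible in-memory representation of the triple FV = <fp, fc, wc> is sketched below. The layout (dictionaries keyed by the index pair (i, j)) and the class and method names are assumptions chosen for illustration; the text only specifies that a feature vector is a triple and that FM collects one such vector per record.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Index = Tuple[int, int]  # (parent index i, sub-feature index j), 1-based as in the text


@dataclass
class FeatureVector:
    fp: List[str]            # parent feature names fp_1 .. fp_m
    fc: Dict[Index, float]   # sub-feature values fc_{i,j} after normalization
    wc: Dict[Index, float]   # importance weights w_{i,j}

    def children_of(self, i: int) -> List[int]:
        """Indices j of the sub-features that belong to parent feature fp_i."""
        return sorted(j for (p, j) in self.fc if p == i)

    def parent_weight(self, i: int) -> float:
        """Sum of the importance weights of the sub-features under fp_i."""
        return sum(w for (p, _j), w in self.wc.items() if p == i)


# The feature matrix FM is then simply a list with one FeatureVector per record.
fv = FeatureVector(fp=["accident", "road surface"],
                   fc={(1, 1): 0.5, (1, 2): -0.2, (2, 3): 1.0},
                   wc={(1, 1): 0.3, (1, 2): 0.2, (2, 3): 0.5})
fm = [fv]
print(fv.children_of(1), fv.parent_weight(1))
```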

In Step 2, the feature vector FV of a record in the traffic accident data set is converted into a gray-scale image by the following steps:

Step 2.1. Classify the features of the original traffic accident data set: according to the feature vector FV of a record in the traffic accident data set, assign its n sub-features to the corresponding m parent features, and initialize the importance weight vector wc of all sub-features fc;

Step 2.2. Search all parent features fp, find the parent feature containing the largest number of sub-features, and return the number of sub-features of that parent feature;

Step 2.3. Compare the returned number of sub-features with m, define the larger of the two as max_dim, and then initialize an all-zero matrix Mat^{max_dim×max_dim} as the final storage unit for the traffic accident data;

Step 2.4. Fill the all-zero matrix Mat^{max_dim×max_dim} according to the original traffic accident data set;

Step 2.5. Call the Reshape function of the image-processing library to add a channel to the matrix Mat^{max_dim×max_dim}, converting it into the gray-scale image grayImage.
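Steps 2.2, 2.3 and 2.5 amount to sizing a square zero matrix and appending a channel axis. A minimal NumPy sketch follows; it assumes NumPy's `reshape` stands in for the image-processing function named in the text, and `init_storage` and `to_gray_image` are hypothetical helper names.

```python
import numpy as np


def init_storage(children_per_parent):
    """Create the all-zero max_dim x max_dim matrix from the sub-feature counts per parent."""
    m = len(children_per_parent)                 # number of parent features
    max_dim = max(m, max(children_per_parent))   # larger of m and the largest child count
    return np.zeros((max_dim, max_dim))


def to_gray_image(mat):
    """Add a single channel axis so the matrix becomes a (height, width, 1) gray-scale image."""
    return mat.reshape(mat.shape[0], mat.shape[1], 1)


mat = init_storage([5, 1, 2, 1, 3])   # sub-feature counts of the 5 parent features
gray_image = to_gray_image(mat)
print(mat.shape, gray_image.shape)
```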

In Step 2.4, the all-zero matrix Mat^{max_dim×max_dim} is filled according to the original traffic accident data set by the following specific steps:

Step 2.4.1. Descending sort of parent features: sort all parent features fp in descending order of their weights wp, where the weight wp_i of a parent feature equals the sum of the importance weights w_{i,j} of all sub-features under it, i.e. wp_i = Σ w_{i,j};

Step 2.4.2. Row filling of Mat^{max_dim×max_dim}: according to the weight wp of each parent feature, fill the descending-sorted parent features fp outwards from the middle row towards the top and bottom, in descending order, following the principle "upper greater than lower";

Step 2.4.3. Descending sort of sub-features: sort the sub-features under each parent feature in descending order of their importance weights w_{i,j};

Step 2.4.4. Column filling of Mat^{max_dim×max_dim}: in the corresponding row of the all-zero matrix Mat^{max_dim×max_dim}, i.e. within each parent feature, fill the descending-sorted sub-features into the columns following the principle "left greater than right";

Step 2.4.5. Leave the remaining elements of the all-zero matrix Mat^{max_dim×max_dim} unchanged at "0" to obtain the final result matrix.

The number of parent features of a record in the original traffic accident data set is m = 5; the number of sub-features of a record in the original traffic accident data set is n = 12;

fp = {accident features_1, road surface features_2, environmental factors_3, vehicle features_4, driver features_5};

fc = {easting_{1,1}, northing_{1,2}, 1st road class_{1,3}, accident time_{1,4}, number of vehicles involved_{1,5}, road surface condition_{2,6}, lighting conditions_{3,7}, weather conditions_{3,8}, vehicle type_{4,9}, casualty class_{5,10}, casualty sex_{5,11}, casualty age_{5,12}}.

In Step 1.3, the importance weight vector wc of all sub-features of a record in the original traffic accident data set is obtained by running 1000 iterations of the XGBoost method on the 12 sub-features of the traffic accident data set.

The hyperparameter combination of the CSP-CNN model in Step 5 is: batch size = 128; loss function categorical crossentropy; optimizer Adagrad (a gradient descent optimizer); learning rate 0.001; error term 1e-07; convolution kernels initialized with the Glorot normal initializer (glorot_normal).
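To show what the named loss and optimizer compute, the sketch below implements categorical cross-entropy and a single Adagrad update in NumPy with the stated learning rate. It reads the "error term" 1e-07 as the small epsilon constant used for numerical stability, which is an assumption, and it starts the squared-gradient accumulator at zero; framework implementations may differ in such details. The function names are hypothetical.

```python
import numpy as np

LEARNING_RATE = 0.001
EPSILON = 1e-07  # assumed meaning of the "error term" in the text


def categorical_crossentropy(y_true, y_pred):
    """Mean cross-entropy between one-hot labels and predicted class probabilities."""
    y_pred = np.clip(y_pred, EPSILON, 1.0 - EPSILON)  # avoid log(0)
    return float(-np.sum(y_true * np.log(y_pred), axis=-1).mean())


def adagrad_step(param, grad, accumulator):
    """One Adagrad update: scale the step by the root of the accumulated squared gradients."""
    accumulator = accumulator + grad ** 2
    param = param - LEARNING_RATE * grad / (np.sqrt(accumulator) + EPSILON)
    return param, accumulator


loss = categorical_crossentropy(np.array([[0.0, 0.0, 1.0]]), np.array([[0.1, 0.2, 0.7]]))
weights, acc = adagrad_step(np.array([1.0]), np.array([2.0]), np.zeros(1))
print(loss, weights, acc)
```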

The beneficial effects of the invention are:

(1) a method for converting traffic accident data into a gray-scale image set based on the importance of the traffic accident features is proposed for the first time;

(2) traffic accident data are, for the first time, converted into a gray-scale image set and used as the input of a deep learning model to predict traffic accident severity;

(3) the method for converting traffic accident data into a gray-scale image set based on the importance of the traffic accident features, together with the proposed CSP-CNN model, fully considers the spatio-temporal relationships, combination relationships and deeper intrinsic relationships among traffic accident features when predicting traffic accident severity.

Brief description of the drawings

In order to explain the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a structural diagram of the CSP-CNN model for traffic accident severity prediction;

Fig. 2 is the casualty severity distribution of the traffic accident data set of Leeds, UK, 2009-2016;

Fig. 3 is the feature importance distribution of the traffic accident data set;

Fig. 4 is a schematic diagram of the process of converting the feature vector of a record in the traffic accident data set into a gray-scale image;

Fig. 5 is the accuracy of the CSP-CNN model at different depths;

Fig. 6 is the accuracy of the different models in the experiments;

Fig. 7 shows local perception of the traffic accident data set;

Fig. 8 (a) is a line chart of the precision, recall and F1 score of the different models' predictions on the slight traffic accident test set;

Fig. 8 (b) is a line chart of the precision, recall and F1 score of the different models' predictions on the serious traffic accident test set;

Fig. 8 (c) is a line chart of the precision, recall and F1 score of the different models' predictions on the fatal traffic accident test set.

Detailed description of the embodiments

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the invention without creative effort fall within the scope of protection of the invention.

To predict the severity of traffic accident casualties, the traffic accident data set with its feature information must be considered comprehensively. The factors known to influence traffic accident severity mainly comprise the following five parent features: road surface features, accident features, vehicle features, driver features and environmental factors. Based on these five parent features influencing traffic accident casualty severity, the method of converting the traffic accident data set into gray-scale images and the CSP-CNN model are described in detail below.

The conversion of traffic accident data into a gray-scale image set from the feature vectors FV of the records in the traffic accident data set, based on the importance of the traffic accident features, is carried out as follows:

Step 1: Obtain the feature matrix FM of the traffic accident data set;

Step 2: Allocate k threads according to the total number of records in the original traffic accident data set; in each thread, convert the corresponding feature vector FV in the feature matrix FM of the traffic accident data set into a gray-scale image;

Step 3: Store the gray-scale image grayImage converted from the feature vector FV in each thread in the gray-scale image linked list grayImageList, and return the gray-scale image grayImage.

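The feature matrix FM of the traffic accident data set is obtained as follows:

Step 1.1. Determine all parent features fp of a record in the original traffic accident data set:

$$fp = \{fp_1, \ldots, fp_m\} \tag{7}$$

Step 1.2. Determine all sub-features fc of a record in the original traffic accident data set according to whether they are related to traffic accident severity prediction:

$$fc = \{fc_{1,1}, \ldots, fc_{i,j}\} \tag{8}$$

where i ∈ [1, m], j ∈ [1, n], and fc_{i,j} is the j-th sub-feature of a record in the original traffic accident data set, whose parent feature is fp_i, subject to fc_{i,j} ∈ fp_i and fp_i ∩ fp_j = ∅ for i ≠ j, i.e. every sub-feature belongs to one and only one parent feature; the number of sub-features of the i-th parent feature is denoted Np_i = |fp_i|;

Step 1.3. Determine the importance weight vector wc of all sub-features:

$$wc = (w_{1,1}, \ldots, w_{i,j}) \tag{9}$$

Step 1.4. Determine the feature vector FV of a record in the traffic accident data set:

$$FV = \langle fp, fc, wc \rangle \tag{10}$$

Step 1.5. Determine the feature matrix FM of the traffic accident data set:

$$FM = \{FV_1, \ldots, FV_k\}, \quad FM \in \mathbb{R}^{k \times n} \tag{11}$$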

The feature vector FV of a record in the traffic accident data set is converted into a gray-scale image by the following steps:

Step 1. Classify the features of the original traffic accident data set: according to the feature vector FV of a record in the traffic accident data set, assign its n sub-features to the corresponding m parent features, and initialize the importance weight vector wc of all sub-features fc;

Step 2. Search all parent features fp, find the parent feature containing the largest number of sub-features, and return the number of sub-features of that parent feature;

Step 3. Compare the returned number of sub-features with m, define the larger of the two as max_dim, and then initialize an all-zero matrix Mat^{max_dim×max_dim} as the final storage unit for the traffic accident data;

Step 4. Fill the all-zero matrix Mat^{max_dim×max_dim} according to the original traffic accident data set;

Step 5. Call the Reshape function of the image-processing library to add a channel to the matrix Mat^{max_dim×max_dim}, converting it into the gray-scale image grayImage.

The all-zero matrix Mat^{max_dim×max_dim} is filled according to the original traffic accident data set by the following steps:

Step 4.1. Descending sort of parent features: sort all parent features fp in descending order of their weights wp, where the weight wp_i of a parent feature equals the sum of the importance weights w_{i,j} of all sub-features under it, i.e. wp_i = Σ w_{i,j};

Step 4.2. Row filling of Mat^{max_dim×max_dim}: according to the weight wp of each parent feature, fill the descending-sorted parent features fp outwards from the middle row towards the top and bottom, in descending order, following the principle "upper greater than lower". If the number of rows of the all-zero matrix Mat^{max_dim×max_dim} is odd, the parent feature with the largest weight is placed in the middle row, the parent feature with the second-largest weight in the row directly above the middle row, and the parent feature with the third-largest weight in the second row above the middle row; once the rows above the middle row have been filled, filling continues from the row directly below the middle row, again in descending order. If the number of rows of the all-zero matrix Mat^{max_dim×max_dim} is even, the parent feature with the largest weight is placed in the upper of the two middle rows and the parent feature with the second-largest weight in the lower of the two middle rows;

Step 4.3. Descending sort of sub-features: the sub-features contained in each parent feature are still unordered at this point; therefore, after the above steps are completed, the sub-features under each parent feature must also be sorted in descending order of their importance weights w_{i,j};

Step 4.4. Column filling of Mat^{max_dim×max_dim}: in the corresponding row of the all-zero matrix Mat^{max_dim×max_dim}, i.e. within each parent feature, fill the descending-sorted sub-features into the columns following the principle "left greater than right". For example, the parent feature with the second-largest weight may have 3 sub-features and is placed in the second row of the all-zero matrix Mat^{max_dim×max_dim}; its three sub-features are then placed in cells (2,3), (2,2) and (2,4) of the matrix, respectively;

Step 4.5. Leave the remaining elements of the all-zero matrix Mat^{max_dim×max_dim} unchanged at "0" to obtain the final result matrix.
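The filling procedure of Steps 4.1 to 4.5 can be sketched as follows. The row order follows the odd-sized rule stated above (middle row, then upwards, then downwards), and the column order follows the worked example (centre, then left, then right). Two points are assumptions because the text does not spell them out: how rows beyond the two middle ones are ordered when max_dim is even, and how columns are ordered when max_dim is even. All function names are hypothetical.

```python
import numpy as np


def row_order(size):
    """0-based row indices in placement order, heaviest parent feature first."""
    if size % 2 == 1:
        mid = size // 2
        return [mid] + list(range(mid - 1, -1, -1)) + list(range(mid + 1, size))
    upper, lower = size // 2 - 1, size // 2
    # Even size: upper middle, lower middle; the continuation is an assumption.
    return [upper, lower] + list(range(upper - 1, -1, -1)) + list(range(lower + 1, size))


def column_order(size):
    """0-based column indices: centre first, then alternating left and right."""
    centre = (size - 1) // 2
    order = [centre]
    for offset in range(1, size):
        for col in (centre - offset, centre + offset):
            if 0 <= col < size:
                order.append(col)
    return order


def fill_matrix(children):
    """children maps each parent feature to a list of (value, importance weight) pairs."""
    max_dim = max(len(children), max(len(c) for c in children.values()))
    mat = np.zeros((max_dim, max_dim))
    # Step 4.1: parents in descending order of wp_i = sum of their sub-feature weights
    parents = sorted(children, key=lambda p: sum(w for _, w in children[p]), reverse=True)
    for row, parent in zip(row_order(max_dim), parents):           # Step 4.2
        ranked = sorted(children[parent], key=lambda vw: vw[1], reverse=True)  # Step 4.3
        for col, (value, _w) in zip(column_order(max_dim), ranked):  # Step 4.4
            mat[row, col] = value
    return mat                                                      # Step 4.5: rest stays 0


example = fill_matrix({"A": [(1.0, 0.5), (2.0, 0.3), (3.0, 0.1)],
                       "B": [(4.0, 0.05)],
                       "C": [(5.0, 0.2), (6.0, 0.25)]})
print(example)
```

For a 5 × 5 matrix this places the second-heaviest parent feature in the second row and its three heaviest sub-features in columns 3, 2 and 4 (1-based), matching the example in Step 4.4.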

In the feature vector FV of a record in the traffic accident data set of the invention, m = 5 and n = 12:

fc = {easting_{1,1}, northing_{1,2}, 1st road class_{1,3}, accident time_{1,4}, number of vehicles involved_{1,5}, road surface condition_{2,6}, lighting conditions_{3,7}, weather conditions_{3,8}, vehicle type_{4,9}, casualty class_{5,10}, casualty sex_{5,11}, casualty age_{5,12}};

wc = (0.165774538_{1,1}, 0.171530785_{1,2}, 0.082228259_{1,3}, 0.047771472_{1,4}, 0.060763375_{1,5}, 0.048847406_{2,6}, 0.041826936_{3,7}, 0.04354843_{3,8}, 0.126314657_{4,9}, 0.067057589_{5,10}, 0.049116389_{5,11}, 0.095220163_{5,12}).
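As a worked example, the parent-feature weights wp_i = Σ w_{i,j} can be computed directly from the wc values listed above. The dictionary layout and the variable names are illustrative assumptions; the numbers are copied from wc.

```python
# (parent index i, sub-feature index j) -> importance weight w_{i,j}, copied from wc
WC = {
    (1, 1): 0.165774538, (1, 2): 0.171530785, (1, 3): 0.082228259,
    (1, 4): 0.047771472, (1, 5): 0.060763375,
    (2, 6): 0.048847406,
    (3, 7): 0.041826936, (3, 8): 0.04354843,
    (4, 9): 0.126314657,
    (5, 10): 0.067057589, (5, 11): 0.049116389, (5, 12): 0.095220163,
}


def parent_weights(wc):
    """Sum the sub-feature weights under each parent feature."""
    wp = {}
    for (i, _j), w in wc.items():
        wp[i] = wp.get(i, 0.0) + w
    return wp


wp = parent_weights(WC)
ranking = sorted(wp, key=wp.get, reverse=True)   # parent indices, heaviest first
child_counts = {i: sum(1 for (p, _j) in WC if p == i) for i in wp}
max_dim = max(len(wp), max(child_counts.values()))
print(wp, ranking, max_dim)
```

With these weights the accident features rank first (about 0.528), followed by the driver features (about 0.211), vehicle features (about 0.126), environmental factors (about 0.085) and road surface features (about 0.049); the largest parent feature has 5 sub-features, so max_dim = 5.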

Establishing the CSP-CNN model for traffic accident severity prediction

Because a CNN has its own distinctive way of extracting the key features of an image, it shows strong learning ability in image recognition and understanding. Compared with other deep learning models, a CNN has two distinctive properties: local connectivity and weight sharing. Local connectivity means that each neuron is connected only to a region of the input neurons known as its receptive field, and weight sharing means that the neurons extracting image features share the same filter. Together, these two properties mean that a CNN has fewer parameters than other deep learning models.
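The effect of local connectivity and weight sharing on the parameter count can be illustrated with a small calculation. The comparison below assumes a 5 × 5 single-channel input mapped to 256 feature maps of the same size, once by a 3 × 3 convolution and once by a hypothetical fully connected layer producing the same number of outputs; the helper names are illustrative.

```python
def conv_layer_params(in_channels: int, filters: int, kernel: int) -> int:
    """Shared weights: one kernel x kernel block per input channel per filter, plus biases."""
    return kernel * kernel * in_channels * filters + filters


def dense_layer_params(inputs: int, outputs: int) -> int:
    """No sharing: one weight per input-output pair, plus one bias per output."""
    return inputs * outputs + outputs


height = width = 5
conv_count = conv_layer_params(in_channels=1, filters=256, kernel=3)
dense_count = dense_layer_params(height * width * 1, height * width * 256)
print(conv_count, dense_count, dense_count / conv_count)
```

Under these assumptions the convolutional layer needs 2560 parameters against 166400 for the dense equivalent, a factor of 65.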

To suit the traffic setting, the CSP-CNN model is proposed here. Compared with a traditional CNN model it is specific in the following respects: (1) The model input is different: the traffic accident image fed to the CSP-CNN model has only one channel, i.e. it is a gray-scale image, which is essentially a pixel matrix whose pixel values range from 0 to the normalized values of the traffic accident features. By contrast, in CNN models for image recognition and classification the input image usually has three channels, i.e. RGB, and pixel values range from 0 to 255. (2) The 12 sub-features are grouped under the 5 parent features chosen here (accident features, road surface features, environmental factors, vehicle features and driver features), so the matrix obtained by conversion has dimension 5x5; down-sampling it would destroy the already scarce feature information on traffic accident casualty severity, and CSP-CNN therefore omits the down-sampling operation (pooling layer) of traditional CNN models. (3) The model output is different: in the traffic setting the output of CSP-CNN is a prediction of traffic accident casualty severity, whereas in image recognition and classification the output of a CNN is an image class label.

As shown in Figure 1, with kernel size = 3, stride = 1 and zero-padding parameter pad = 1, the structure of the CSP-CNN model for traffic accident severity prediction contains four main parts: the model input, the convolutional layers, the fully connected layer and the model output layer.

First, the input of CSP-CNN is the set of traffic accident grayscale images obtained by converting the traffic accident data according to the importance of the traffic accident features; it contains the 5 parent features and 12 sub-features of a traffic accident. Accordingly, the mathematical form of the model input is expressed as follows:

where d is the index into the traffic accident data set x, N is the total number of records in the data set, PC is the number of parent features of the data set, CC is the largest number of sub-features under any parent feature, and P_MM is the pixel in row M, column M of the grayscale pixel matrix x_d of the traffic accident data set.

The convolutional layer is the core layer of CSP-CNN; its purpose is to extract abstract features from the traffic accident data set. To describe the computation of the convolutional layer clearly, each pixel of the traffic accident grayscale image is first numbered: P_{c,k,l} denotes the pixel in row k, column l of the c-th channel of the input image. Each filter weight is then numbered: w_{c,e,f} denotes the weight in row e, column f of the c-th channel of the filter. Finally, the convolution is computed using the Rectified Linear Unit (ReLU) activation function:

The ReLU activation function is: g(h) = max(0, h) (2)

where h is the input to the neuron. The convolution is computed as:
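Equation (3) is not reproduced in this text. A form consistent with the symbol definitions that follow is sketched here as an assumption, not as the original formula:

```latex
a_{k,l} = g\left( \sum_{c=1}^{C} \sum_{e=1}^{F} \sum_{f=1}^{F} w_{c,e,f}\, p_{c,k+e,l+f} + w_b \right) \tag{3}
```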

where a_{k,l} is the element in row k, column l of the feature map; C is the number of channels, which equals the number of filters of the convolutional layer; F is the filter size (width or height, which are equal); p_{c,k+e,l+f} is the pixel in row k+e, column l+f of the c-th channel of the input grayscale image; e and f range over [1, F]; w_b is the bias term of the filter, randomly initialized each time the model is run; and the argument of g is the input h to the convolutional neuron.

Each convolutional layer can have multiple filters, and convolving each filter with the original traffic accident image yields one feature map. Therefore, the number of channels of the feature maps after convolution equals the number of filters in the convolutional layer.
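The convolution described above can be illustrated for one single-channel image and one filter. This is a minimal pure-Python sketch under stated assumptions: stride 1, zero padding of 1, 0-based offsets instead of the 1-based e and f of the text, and made-up input values. The function names are hypothetical.

```python
def relu(h: float) -> float:
    """Eq. (2): g(h) = max(0, h)."""
    return max(0.0, h)


def conv2d_single(image, kernel, bias=0.0, pad=1):
    """Convolve one single-channel square image with one square filter (stride 1)."""
    n, F = len(image), len(kernel)
    size = n + 2 * pad
    # Zero-pad the image on every side.
    padded = [[0.0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            padded[r + pad][c + pad] = image[r][c]
    out_n = size - F + 1
    feature_map = []
    for k in range(out_n):
        row = []
        for l in range(out_n):
            # Weighted sum over the filter window plus bias, then ReLU.
            h = bias + sum(kernel[e][f] * padded[k + e][l + f]
                           for e in range(F) for f in range(F))
            row.append(relu(h))
        feature_map.append(row)
    return feature_map


image = [[0.1 * (r + c) for c in range(5)] for r in range(5)]  # toy 5x5 input
kernel = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]   # identity filter
fmap = conv2d_single(image, kernel)
print(len(fmap), len(fmap[0]))
```

A layer with several filters would call `conv2d_single` once per filter, giving one feature map per filter as stated above.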

For the fully connected layers, the feature vector extracted and learned by the last convolutional layer is converted by a flatten operation into a one-dimensional vector, which serves as the input of the first fully connected layer, using the following equation:

a^flatten=flatten ([a₁,a₂,...,a_c]),c∈[1,C]； (4)

where a^flatten is the transformed one-dimensional vector, i.e. the feature map after flattening, and [a₁, a₂, ..., a_c] is the feature vector extracted and learned by the last convolutional layer, [Feature Map 1, Feature Map 2, ..., Feature Map c];

The fully connected layer is computed as follows:
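Equation (5) is not reproduced in this text. Since the claims describe a linear output with weight w_fl and bias term b_fl, one consistent form is given below; the symbol \hat{y} for the linear output is an assumption:

```latex
\hat{y} = w_{fl} \cdot a^{flatten} + b_{fl} \tag{5}
```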

Finally, the output of one fully connected layer serves as the input of the next fully connected layer, and the final output is passed to the output layer. The output layer classifies traffic accident casualty severity using the Softmax activation function. The Softmax function outputs a probability value for each defined class, and the class with the largest probability is the predicted class. The output of the model is the corresponding traffic accident casualty severity level: slight traffic accident, serious traffic accident or fatal traffic accident.
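The Softmax step can be illustrated with a short sketch. The three input scores below are made up, and the names `LEVELS`, `softmax` and `predict_level` are hypothetical; only the rule (one probability per class, largest probability wins) comes from the text.

```python
import math

LEVELS = ["slight", "serious", "fatal"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def predict_level(scores):
    """Return the severity level with the largest probability."""
    probs = softmax(scores)
    return LEVELS[probs.index(max(probs))]


linear_outputs = [2.0, 0.5, -1.0]  # made-up outputs of the last fully connected layer
probs = softmax(linear_outputs)
print(predict_level(linear_outputs))
```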

In addition, Batch Normalization is applied between convolutional layers, between the convolutional layers and the fully connected layer, and between fully connected layers, to accelerate model training and prevent over-fitting.

Experimental results and analysis

The proposed CSP-CNN model is implemented in Python using TensorFlow, the open-source deep learning framework developed by Google, because TensorFlow offers usability, flexibility and efficiency and makes it convenient to define and run a variety of deep learning networks. The hardware is a GPU server with an Intel Xeon E5-2682 v4 (Broadwell) processor at 2.5 GHz and an Nvidia P100 GPU with 12 GiB of video memory, offering 9.3 TFLOPS of single-precision and 4.7 TFLOPS of double-precision floating-point performance. On this server, using the TensorFlow framework, the CSP-CNN model was trained for 100 epochs on 39403 samples (80% of the data set) and validated on 9851 samples (20% of the data set).

(1) Data collection

Traffic accident data from Leeds City Council, England, covering 8 years (2009-2016), are used for this experiment. A total of 21436 accident records were obtained for this period. Each accident record describes one traffic accident in Leeds; when the accident occurred, 15 different sub-features were collected, including the location, the number of casualties and vehicles involved, the road surface, weather conditions and so on. To examine the influence of the various factors on traffic accident casualty severity, casualty severity is divided into three levels: slight, serious and fatal.

(2) Data pre-processing

Before the traffic accident data set is used as input to CSP-CNN, it must be pre-processed. The steps are preliminary data processing, handling of class imbalance, and conversion of the data into images, as detailed below:

1) Preliminary data processing comprises deleting incomplete, erroneous and duplicate traffic accident records, pruning the sub-features affecting traffic accident casualty severity, and normalizing the traffic accident data set. After incomplete, erroneous and duplicate data are deleted, the data set available for training contains 18727 records. The share of each severity level in the whole data set is shown in Fig. 2: 88% of the traffic accidents are slight, 11% are serious and 1% are fatal.

Next, the 15 sub-features of the traffic accident data set are reduced to 12 according to whether they are associated with traffic accident severity prediction. They cover road surface features, accident features, vehicle features, driver features and environmental factors, as shown in Table 1.

Table 1. The 12 sub-features of the traffic accident data set and their descriptions

Because the 12 sub-features of a traffic accident are measured in different units, the data under each feature must be normalized: the unit constraints of the data are removed and the data are converted to dimensionless pure values, so that features of different units or magnitudes can be compared. Normalizing the traffic accident data set also improves the convergence speed and accuracy of the model. The traffic accident data set x is normalized using Z-score normalization (zero-mean normalization), a standardization method from statistics, after which the data conform to a standard normal distribution, i.e. with mean 0 and standard deviation 1. The conversion function is:

where x^* denotes a datum under a single feature, u is the mean of all data under that feature, and σ is the standard deviation of all data under that feature; the calculation is carried out separately for each feature.
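The usual Z-score transform maps each value x of a feature to (x - u) / σ. The sketch below applies that standard form to one feature column. It is an illustration only: the function name `z_score`, the use of the population standard deviation, and the sample values are assumptions.

```python
from statistics import mean, pstdev


def z_score(values):
    """Normalize one feature column to mean 0 and standard deviation 1."""
    u = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(x - u) / sigma for x in values]


ages = [18, 25, 40, 62, 75]  # made-up values for one sub-feature
normalized = z_score(ages)
print(normalized)
```

Each sub-feature column would be passed through `z_score` separately, matching the per-feature calculation described above.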

2) As Fig. 2 shows, fatal and serious traffic accidents account for only a small fraction of all traffic accidents. If the imbalance of the data set is not handled, training will focus on the class that makes up most of the data and neglect the classes that make up little of it, so that the trained model over-fits the majority class and under-fits the minority classes. Sampling offers two general ways of handling imbalanced data: under-sampling and over-sampling. Because under-sampling discards part of the data set, so that the data cannot be fully used, over-sampling is adopted herein to solve the imbalance problem while making full use of the traffic accident data set. The simplest over-sampling method is random over-sampling, which increases the minority-class samples by simply copying samples; however, the information the model learns then tends to be too specific to generalize, i.e. the model over-fits. For this reason, the Borderline-SMOTE2 method, an improvement on the Synthetic Minority Oversampling Technique (SMOTE), is used. With this method, the final traffic accident data set contains 49254 records, with slight, serious and fatal accidents in the ratio 1:1:1, i.e. 16418 records each.
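SMOTE-style methods synthesize a new minority sample on the line segment between a minority sample and one of its nearest minority neighbours. The sketch below shows only that interpolation step for plain SMOTE; it does not implement the borderline sample selection of Borderline-SMOTE2, and the function names and toy points are hypothetical.

```python
import random


def nearest_neighbour(x, others):
    """Closest point to x by squared Euclidean distance."""
    return min(others, key=lambda o: sum((a - b) ** 2 for a, b in zip(x, o)))


def smote_sample(x, neighbour, rng):
    """Synthesize one point between x and a neighbouring minority sample."""
    lam = rng.random()  # interpolation factor in [0, 1)
    return [xi + lam * (ni - xi) for xi, ni in zip(x, neighbour)]


rng = random.Random(0)
minority = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]  # toy minority-class samples
x = minority[0]
nn = nearest_neighbour(x, minority[1:])
synthetic = smote_sample(x, nn, rng)
print(synthetic)
```

Unlike random over-sampling, the synthetic point is not a copy of an existing record, which is the property the text relies on to reduce over-fitting.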

3) To better extract the spatial, combinational and deeper intrinsic relationships among the features of the traffic accident data set, the five parent features of a traffic accident and their sub-features are converted into grayscale image form and used as the input variable of the CSP-CNN model. Exploiting the way a CNN learns from low level to high level and from concrete to abstract, the model can better learn the spatial, combinational and deeper intrinsic relationships among the features, finally yielding a traffic accident casualty severity prediction model. Converting the traffic accident data set into grayscale images mainly involves the following steps: (1) XGBoost is run for 1000 iterations on the 12 traffic accident sub-features to obtain their importance distribution, shown in Fig. 3 and Table 2; (2) the feature importances and the traffic accident data set are given as input to the FM2GI method, which outputs the traffic accident data set in grayscale image form.

Table 2. Importance values of the traffic accident data set features

Fig. 4 illustrates how the feature vector of one record in the traffic accident data set is converted into a grayscale image.

(3) Hyperparameters of CSP-CNN

Using the interfaces provided by scikit-learn, the GridSearchCV and RandomizedSearchCV methods are combined to search the parameter combinations of CSP-CNN for 100 epochs, and the optimal CSP-CNN hyperparameter combination is determined. Using GridSearchCV alone incurs a very high computational cost, while using RandomizedSearchCV alone may find only a locally optimal hyperparameter combination. To make better use of both, RandomizedSearchCV is used for the global search and GridSearchCV for the local search. This reduces the computational cost somewhat while making the search less likely to get stuck in a locally optimal combination, so tuning the hyperparameters with this combined approach gives better results. Models with the various hyperparameter combinations are built and each is evaluated with 5-fold cross-validation, and the combination with the highest accuracy is finally selected. Table 3 shows the hyperparameter combination used by CSP-CNN after this hybrid search.

Table 3. Hyperparameter combination of the CSP-CNN model

(4) CSP-CNN depth analysis

In general, a deep learning model stacks multiple modules and layers, so analysing network depth is important for understanding network behaviour. The depth of a CNN should be neither too large nor too small, so that the CNN can learn more complex relationships while the model still converges. CSP-CNN was tested with depth values assigned from small to large. Table 4 lists the network structures of CSP-CNN at the different depths; experiments with these structures give the training-set and validation-set accuracies at different depths shown in Fig. 5. At depth 5, the training and validation accuracies are 96.24% and 92% respectively. At depth 7, validation accuracy peaks at 93.42%, with a training accuracy of 97.45%. Beyond depth 7, training accuracy keeps rising but validation accuracy gradually falls, which indicates that the CSP-CNN model is starting to over-fit: at depths 9, 11 and 13 the training accuracies are 97.91%, 98.03% and 98.27% and the validation accuracies are 93.36%, 93.34% and 93.23% respectively. The best accuracy is achieved with 4 convolutional layers of 256 filters each, 1 flatten layer, 1 fully connected layer with 128 hidden units and 1 softmax fully connected layer with 3 hidden units; the training and validation accuracies of this model are 97.45% and 93.42% respectively. The CSP-CNN model of depth 7 is therefore used in the experiments herein.
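To give a sense of the size of the depth-7 structure, its trainable weights can be counted by hand. The sketch below assumes a 5x5 single-channel input whose spatial size is preserved by the zero padding, 3x3 kernels, and one bias per filter or unit; Batch Normalization parameters are left out. The helper names are hypothetical.

```python
def conv_params(in_channels: int, filters: int, kernel: int = 3) -> int:
    """Filter weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs: int, units: int) -> int:
    """Weights plus one bias per unit."""
    return inputs * units + units


channels, total = 1, 0
for _ in range(4):                    # 4 convolutional layers, 256 filters each
    total += conv_params(channels, 256)
    channels = 256
flat = 5 * 5 * channels               # length of the flattened vector
total += dense_params(flat, 128)      # fully connected layer, 128 hidden units
total += dense_params(128, 3)         # softmax layer, 3 severity levels
print(flat, total)
```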

Table 4. CSP-CNN models at different depths

(5) Comparison of experimental results with other models

To demonstrate the effectiveness of the proposed CSP-CNN model, it is compared with 6 statistical models and 3 deep learning models. The 6 statistical models are: k-nearest neighbours (KNN), a non-parametric method for classification and regression; decision tree (DT), which decomposes a complex decision into a combination of several simpler decisions, in the hope that the final solution obtained in this way approximates the desired solution; the naive Bayes classifier (NBC), a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between features; logistic regression (LR), which measures the relationship between a categorical dependent variable and one or more independent variables by estimating probabilities with the logistic function (i.e. the cumulative logistic distribution); gradient boosting (GB), a statistical learning technique for regression and classification problems that produces a prediction model in the form of an ensemble of weak prediction models, an idea that originated in an observation by Leo Breiman; and support vector machines (SVMs, also called support vector networks), supervised learning models with associated learning algorithms that analyse data for classification and regression analysis. The 3 deep learning methods are: neural networks (NNs), or connectionist systems, computing systems loosely inspired by the biological neural networks that constitute animal brains, which represent the traditional neural network and attempt to learn features through hidden layers; the long short-term memory recurrent neural network (LSTM-RNN), an extension of the RNN that has become popular because its architecture can handle long-term memory and avoids the vanishing gradient problem suffered by traditional RNNs; and one-dimensional convolution (Conv1D), a form of convolution in convolutional neural networks (CNNs) that is commonly used for sequence models and natural language processing.

Further, the 6 statistical models above are implemented through the interfaces provided by scikit-learn, with their parameters left at the default values. The neural network model has 4 hidden layers with 245 neurons each and 1 softmax fully connected layer; its activation function is ReLU and its optimizer is stochastic gradient descent (SGD), and the initial parameters of each layer are uniform. The long short-term memory recurrent neural network contains one LSTM layer with 128 hidden units and three hidden layers with 128, 256 and 512 neural units respectively; its last layer is a softmax fully connected layer, and its optimizer is SGD with learning rate = 0.01, decay = 0.9 and momentum = 0.8. Conv1D has 4 hidden layers with 256 hidden neural units each; its last layer is a softmax fully connected layer, its activation function is ReLU and its optimizer is Adam.

Table 5 and Fig. 6 show the training-set and validation-set accuracies obtained on the traffic accident data set by the 6 statistical models, the 3 deep learning models and CSP-CNN. On the test set, the proposed CSP-CNN model is more accurate than the other statistical and deep learning models, which shows that CSP-CNN generalizes well to new traffic accident data. On the training set, CSP-CNN is not the most accurate, but the DT model with the highest training accuracy clearly over-fits, whereas CSP-CNN, which ranks second on the training set, does not. One possible reason is that statistical models, which treat each record as a vector, assume that there is no local correlation between the features of the traffic accident data set and so ignore the spatial, combinational and deeper intrinsic relationships among them. Likewise, the other deep learning models cannot, given their structure, analyse the spatial relationships among the features, even though these features are strongly correlated and intrinsically related. The CSP-CNN model proposed herein, by contrast, has local perception and can fully extract the spatial, combinational and deeper intrinsic relationships among the features, as illustrated in Fig. 7.

Fig. 7 shows the pixel-matrix form of a traffic accident image. As can be seen from Fig. 7, through a filter (convolution kernel) of a given size, the CSP-CNN model can, on the one hand, extract traffic accident features according to the different importance of the sub-features (such as the 12 traffic accident sub-features in Fig. 7). On the other hand, it makes full use of local perception and does not assume that the features are unrelated: it can extract features formed by combining sub-features that are spatially and intrinsically related. For example, in Fig. 7 the filter, under its sliding window, learns to extract the traffic accident feature formed by combining the sub-features lighting conditions, weather conditions, casualty class, casualty age, number of vehicles involved, easting and northing. This example shows clearly how the proposed CSP-CNN model extracts traffic accident features rich in spatial, combinational and deeper intrinsic relationships.

Table 5. Accuracy of the different models

The essential purpose of predicting traffic accident casualty severity is to provide timely medical rescue to the people involved in a traffic accident so as to reduce casualties, and to notify the relevant emergency decision-making departments promptly so as to avoid greater property loss. To this end, the predicted casualty severity is further analysed at three levels: slight, serious and fatal traffic accidents. Because accuracy is not the only indicator of a model's predictive ability, and in order to take the practical application scenario into account, precision, recall and F1 score are introduced to analyse the traffic accident test set. Precision is calculated as follows:

Precision = TP / (TP + FP)

where TP (True Positive) denotes true positives, i.e. the true class is positive and the predicted class is positive; FP (False Positive) denotes false positives, i.e. the true class is negative and the predicted class is positive.

Recall is calculated as follows:

Recall = TP / (TP + FN)

where FN (False Negative) denotes false negatives, i.e. the true class is positive and the predicted class is negative.

The F1 score is calculated as follows:

F1 = 2 × Precision × Recall / (Precision + Recall)
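For a three-class problem such as this one, the three indicators are typically computed per severity level by treating that level as the positive class and the other two as negative. The sketch below follows the standard definitions; the labels and predictions are made up for illustration, and the function name is hypothetical.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1 with `label` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


y_true = ["slight", "slight", "serious", "fatal", "serious", "slight"]
y_pred = ["slight", "serious", "serious", "fatal", "slight", "slight"]
for level in ("slight", "serious", "fatal"):
    print(level, per_class_metrics(y_true, y_pred, level))
```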

Table 6 and Fig. 8 give the precision, recall and F1 score of the different models on the traffic accident test set for slight, serious and fatal traffic accidents.

Table 6. Precision, recall and F1 score of the different models by traffic accident casualty severity

Table 6 and Fig. 8 show that, on the slight traffic accident test set, the precision of the CSP-CNN model is the highest of all models, while the highest recall belongs to the statistical model GB. On the serious traffic accident test set, both the precision and the recall of CSP-CNN are higher than those of the other nine models. On the fatal traffic accident test set, CSP-CNN, NN and Conv1D tie for first place in precision and recall.

In practical terms, some error in precision can be tolerated when predicting slight traffic accidents, because a slight accident is very unlikely to cause heavy casualties or heavy property loss. For serious and fatal traffic accidents, however, the precision requirement is higher: even a slightly inaccurate prediction may mean that the appropriate emergency medical support and the decisions of the relevant emergency departments are not provided, ultimately leading to major casualties and property loss. From this practical standpoint, the CSP-CNN model performs better than the other models.

In general, precision and recall affect each other: when precision is high, recall tends to be low, and when recall is low, precision tends to be high; ideally both are high. For a fair and objective comparison, it is common to evaluate a model with the F1 score, a composite indicator closely tied to both precision and recall. The results show that on the slight and serious traffic accident test sets the F1 score of the proposed CSP-CNN model is higher than that of the other models, and on the fatal traffic accident test set the F1 score of CSP-CNN ties for first place with NN and Conv1D.

In summary, whether judged by overall prediction accuracy or, taking the specific application scenario into account, by performance on traffic accidents of each severity level, the CSP-CNN model proposed herein outperforms the other models.

A deep learning method, the CSP-CNN model, is proposed herein to predict traffic accident severity. Unlike earlier approaches that attend only to the shallow structure of traffic accident data, the proposed method can successfully uncover feature representations of traffic accident severity, such as the nonlinear spatio-temporal relationships, combinational relationships and deeper intrinsic relationships among traffic accident features. Experiments were carried out on 8 years of traffic accident data (2009-2016) from Leeds City Council, comparing the proposed CSP-CNN model with the NBC, KNN, LR, DT, GB, SVM, Conv1D, NN and LSTM-RNN models; the results show that the proposed model performs better than these models.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims

1. A traffic accident severity prediction CSP-CNN model, characterized in that it consists of the following four parts: a model input layer, convolutional layers, a fully connected layer and a model output layer;

the convolutional layers are used to extract abstract features of the traffic accident data set from the input traffic accident grayscale images;

the fully connected layer is used to convert the feature vector of the traffic accident data set extracted and learned by the last convolutional layer into a one-dimensional vector, perform linear processing on the one-dimensional vector, and output the linear processing result;

the model output layer is used to predict traffic accident severity from the output of the fully connected layer using the Softmax activation function;

wherein there are 4 convolutional layers, each provided with 256 filters, convolution kernel size kernel size = 3, stride = 1 and zero-padding parameter pad = 1;

the fully connected layer comprises 1 flatten layer and 128 hidden units.

2. A modeling method for the traffic accident severity prediction CSP-CNN model according to claim 1, characterized in that the specific steps are as follows:

Step 1: based on the importance of the traffic accident features, convert the traffic accident data into a set of traffic accident grayscale images and input it to the model input layer; the mathematical form of the input of the traffic accident severity prediction model CSP-CNN is expressed as follows:

where d is the index into the traffic accident data set x; N is the total number of records in x; PC is the number of parent features of x; CC is the largest number of sub-features under any parent feature of x; max(PC, CC) is the larger of PC and CC; and P_MM is the pixel in row M, column M of the grayscale pixel matrix x_d of the traffic accident data set;

Step 2: convolution: the input provided by the model input layer is convolved using the ReLU activation function, which is:

g(h) = max(0, h); (2)

where h is the input to the convolutional neuron;

the convolution is computed as:

where a_{k,l} is the element in row k, column l of the feature map of the convolutional layer, and e and f range over [1, F]; C is the number of channels, equal to the number of filters of the convolutional layer; F is the filter size, the width and height of the filter being equal; w_{c,e,f} is the weight in row e, column f of the c-th channel of the filter; p_{c,k,l} is the pixel in row k, column l of the c-th channel of the input grayscale image; p_{c,k+e,l+f} is the pixel in row k+e, column l+f of the c-th channel of the input grayscale image; w_b is the bias term of the filter, randomly initialized each time the model is run; and the argument of g is the input h to the convolutional neuron;

Step 3: fully connected layer: using the following equation, the feature vector extracted and learned by the last convolutional layer is converted by a flatten operation into a one-dimensional vector that serves as the input of the fully connected layer:

a^flatten = flatten([a₁, a₂, ..., a_c]), c ∈ [1, C]; (4)

where a^flatten is the transformed one-dimensional vector, i.e. the feature map of the fully connected layer after flattening; [a₁, a₂, ..., a_c] is the output of the last convolutional layer, i.e. the feature vector extracted and learned by the last convolutional layer, [Feature Map 1, Feature Map 2, ..., Feature Map c];

the fully connected layer is computed as follows:

where the left-hand side is the linear output of the fully connected layer, w_fl is the weight of the fully connected layer, and b_fl is the bias term of the fully connected layer;

Step 4: traffic accident severity prediction: the traffic accident severity levels are set to three classes, namely slight traffic accident, serious traffic accident and fatal traffic accident; the model output layer predicts traffic accident severity from the output of the fully connected layer using the Softmax activation function, outputs a probability value for each defined traffic accident level, and takes the level with the largest probability value as the predicted traffic accident severity;

Step 5: train the traffic accident severity prediction CSP-CNN model and determine the hyperparameter combination of the CSP-CNN model.

3. The modeling method for the traffic accident severity prediction CSP-CNN model according to claim 2, characterized in that the conversion in Step 1 of the traffic accident data into a set of traffic accident grayscale images based on the importance of the traffic accident features is realized as follows:

Step 1: obtain the feature matrix FM of the pre-processed traffic accident data set;

Step 2: allocate k threads according to the total number of records in the original traffic accident data set and, in each thread, convert the corresponding feature vector FV in the feature matrix FM of the traffic accident data set into a grayscale image;

Step 3: store the grayscale image grayImage converted from the feature vector FV by each thread in the grayscale image linked list grayImageList, and return the grayscale image grayImage.

4. The modeling method for the traffic accident severity prediction CSP-CNN model according to claim 3, characterized in that the traffic accident data set in Step 1 is pre-processed as follows:

(1) incomplete, erroneous and duplicate traffic accident records are deleted, and the sub-features affecting traffic accident casualty severity are pruned;

(2) the traffic accident data set is normalized, removing the unit constraints of the data and converting the data to dimensionless pure values: the traffic accident data set x is normalized using Z-score normalization, a standardization method from statistics, so that the data conform to a standard normal distribution; the conversion function of Z-score normalization is:

where x^* denotes a datum under a single feature, u is the mean of all data under that feature, and σ is the standard deviation of all data under that feature; each feature in the traffic accident data set x is calculated in turn.

5. The modeling method for the traffic accident severity prediction CSP-CNN model according to claim 3, characterized in that the feature matrix FM of the traffic accident data set is obtained in Step 1 by the following specific steps:

Step 1.1. determine all parent features fp of a record in the original traffic accident data set according to whether they are relevant to traffic accident severity prediction:

fp = {fp₁, ..., fp_m}; (7)

Step 1.2. determine all sub-features fc of a record in the original traffic accident data set obtained from data pre-processing:

where i ∈ [1, m] and j ∈ [1, n]; fc_{i,j} is the j-th sub-feature of a record in the original traffic accident data set, whose parent feature is fp_i; and the following holds, where i ≠ j: each sub-feature belongs to one and only one parent feature; the number of sub-features of the i-th parent feature is denoted Np_i = |fp_i|;

Step 1.3. determine the importance weight vector wc of all sub-features of a data record in the original traffic accident data set:

wc = (w_{1,1}, ..., w_{i,j}); (9)

where w_{i,j} denotes the importance weight of the j-th sub-feature of a data record in the original traffic accident data set, and this sub-feature belongs to the parent feature fp_i;

Step 1.4. determine the feature vector FV of a data record in the traffic accident data set; it is the representation of the features of that record and is a triple:

FV = <fp, fc, wc>; (10)

Step 1.5. determine the feature matrix FM of the traffic accident data set; it is the representation of the features of all records in the traffic accident data set and is a set of feature vectors:

FM = {FV_1, ..., FV_k}, and FM ∈ R^{k×n}; (11)

where k is the total number of records in the original traffic accident data set, n is the number of sub-features of a data record in the original traffic accident data set, and R^{k×n} denotes a k × n matrix.
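
One possible in-memory representation of the triple FV and of the matrix FM is sketched below. The class, the method names, and the example parent and sub-feature names are all hypothetical and serve only to illustrate the structure defined in steps 1.1 to 1.5.

```python
from dataclasses import dataclass

@dataclass
class FeatureVector:
    fp: list   # parent features fp_1..fp_m (names)
    fc: dict   # sub-feature name -> (parent name, value)
    wc: dict   # sub-feature name -> importance weight

    def is_consistent(self):
        # Every sub-feature names one known parent and has a weight.
        parents_known = all(parent in self.fp for parent, _ in self.fc.values())
        weights_match = set(self.wc) == set(self.fc)
        return parents_known and weights_match

def build_fm(vectors):
    # FM: k rows (records) by n columns (sub-feature values, fixed order).
    names = list(vectors[0].fc)
    return [[fv.fc[name][1] for name in names] for fv in vectors]

fv1 = FeatureVector(
    fp=["environment", "casualty"],
    fc={"weather": ("environment", 0.2), "age": ("casualty", 0.7)},
    wc={"weather": 0.4, "age": 0.6},
)
fv2 = FeatureVector(
    fp=["environment", "casualty"],
    fc={"weather": ("environment", 0.9), "age": ("casualty", 0.1)},
    wc={"weather": 0.4, "age": 0.6},
)
fm = build_fm([fv1, fv2])
```

Storing the parent name next to each sub-feature value makes the "exactly one parent per sub-feature" condition hold by construction.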

6. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 2, characterized in that converting the feature vector FV of a data record in the traffic accident data set into a grayscale image in step 2 specifically comprises the following steps:

Step 2.1. classify the features of the original traffic accident data set: according to the feature vector FV of a data record in the traffic accident data set, assign its n sub-features to the corresponding m parent features, and initialize the importance weight vector wc of all sub-features fc;

Step 2.2. search all parent features fp, find the parent feature that contains the most sub-features, and return the number of sub-features of that parent feature;

Step 2.3. compare the returned number of sub-features with m, define the larger of the two as max_dim, and then initialize an all-zero matrix Mat^{max_dim×max_dim} as the final storage unit for the traffic accident data set;

Step 2.4. fill the all-zero matrix Mat^{max_dim×max_dim} according to the original traffic accident data set;

Step 2.5. call the image-processing Reshape function to add one channel to the matrix Mat^{max_dim×max_dim}, converting it into the grayscale image grayImage.
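
The sizing of the matrix and the final channel expansion can be sketched as below. This assumes NumPy's `reshape` as the Reshape function and a channel-last layout, neither of which is stated in the claim; the function names are hypothetical. The sub-feature counts 5, 1, 2, 1, 3 are read off the parent indices listed in claim 8.

```python
import numpy as np

def empty_storage(subfeatures_per_parent):
    # subfeatures_per_parent: sub-feature count of each of the m parents.
    m = len(subfeatures_per_parent)
    # Steps 2.2 and 2.3: the larger of the largest count and m gives max_dim.
    max_dim = max(max(subfeatures_per_parent), m)
    return np.zeros((max_dim, max_dim), dtype=np.float32)

def to_gray_image(mat):
    # Step 2.5: add a trailing channel axis, (H, W) -> (H, W, 1).
    return mat.reshape(mat.shape[0], mat.shape[1], 1)

mat = empty_storage([5, 1, 2, 1, 3])
gray_image = to_gray_image(mat)
```

With m = 5 parents and at most 5 sub-features per parent, the storage matrix is 5 × 5 and the resulting single-channel image has shape (5, 5, 1).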

7. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 5, characterized in that the specific steps for filling the all-zero matrix Mat^{max_dim×max_dim} according to the original traffic accident data set in step 2.4 are as follows:

Step 2.4.1. sort the parent features in descending order: sort all parent features fp in descending order of their weights wp, where the weight wp_i of a parent feature equals the sum of the importance weights w_{i,j} of all sub-features under it, i.e., wp_i = Σ_j w_{i,j};

Step 2.4.2. fill the rows of Mat^{max_dim×max_dim}: according to the weight wp of each parent feature, place the parent features fp, sorted in descending order, starting from the middle row and extending toward the upper and lower sides, following the principle "upper greater than lower";

Step 2.4.3. sort the sub-features in descending order: sort the sub-features under each parent feature in descending order of their importance weights w_{i,j};

Step 2.4.4. fill the columns of Mat^{max_dim×max_dim}: in the corresponding row of the all-zero matrix Mat^{max_dim×max_dim}, i.e., within each parent feature, fill the columns with the sub-features under it, sorted in descending order, following the principle "left greater than right";

Step 2.4.5. keep the remaining elements of the all-zero matrix Mat^{max_dim×max_dim} unchanged at 0 to obtain the final result matrix.
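
The filling procedure of steps 2.4.1 to 2.4.5 can be sketched as follows. Two points are interpretations rather than statements of the claim: the row order is taken as middle row first, then alternately one row above and one row below, and the column filling is taken to start at the leftmost column. All function and variable names are hypothetical.

```python
def center_out_rows(dim):
    # Row order: middle row first, then alternately one above, one below.
    center = (dim - 1) // 2
    order = [center]
    for step in range(1, dim):
        for row in (center - step, center + step):
            if 0 <= row < dim:
                order.append(row)
    return order

def fill_matrix(groups, dim):
    # groups: parent name -> list of (sub-feature value, importance weight).
    mat = [[0.0] * dim for _ in range(dim)]
    # Step 2.4.1: parent weight = sum of sub-feature weights, descending.
    parents = sorted(groups, key=lambda p: sum(w for _, w in groups[p]),
                     reverse=True)
    # Step 2.4.2: heaviest parent in the middle row, the rest spread outward.
    for row, parent in zip(center_out_rows(dim), parents):
        # Steps 2.4.3 and 2.4.4: sub-features by descending weight,
        # written left to right.
        ranked = sorted(groups[parent], key=lambda vw: vw[1], reverse=True)
        for col, (value, _) in enumerate(ranked):
            mat[row][col] = value
    # Step 2.4.5: cells never written keep the value 0.
    return mat

groups = {
    "A": [(0.5, 0.1), (0.9, 0.3)],
    "B": [(0.2, 0.6)],
    "C": [(0.7, 0.05)],
}
mat = fill_matrix(groups, dim=3)
```

In this toy example parent B has the largest summed weight and lands in the middle row, A goes above it, and C below it; within A the sub-feature with weight 0.3 is placed to the left of the one with weight 0.1.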

8. The modeling method of the traffic accident severity prediction CSP-CNN model according to any one of claims 5 to 7, characterized in that the number of parent features of a data record in the original traffic accident data set is m = 5, and the number of sub-features of a data record in the original traffic accident data set is n = 12;

fc = {easting position_{1,1}, northing position_{1,2}, first road class_{1,3}, accident time_{1,4}, number of vehicles involved in the accident_{1,5}, road surface condition_{2,6}, lighting condition_{3,7}, weather condition_{3,8}, vehicle type_{4,9}, casualty class_{5,10}, casualty sex_{5,11}, casualty age_{5,12}}.

9. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 5, characterized in that the importance weight vector wc of all sub-features of a data record in the original traffic accident data set, determined in step 1.3, is obtained by running 1000 iterations of the XGBoost method on the 12 sub-features of the traffic accident data set.

10. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 2, characterized in that the hyperparameter combination of the CSP-CNN model in step 5 is: batch size = 128, loss function Categorical Crossentropy, optimizer gradient descent optimizer, learning rate 0.001, error term 1e-07, and convolution kernels initialized with the Glorot normal distribution initialization method.