CN109034264A - CSP-CNN model for traffic accident severity prediction and its modeling method - Google Patents

CSP-CNN model for traffic accident severity prediction and its modeling method

Info

Publication number
CN109034264A
Authority
CN
China
Prior art keywords
traffic accident
feature
data
traffic
parent feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810930337.0A
Other languages
Chinese (zh)
Other versions
CN109034264B (en)
Inventor
李彤
郑明�
朱锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan University YNU
Original Assignee
Yunnan University YNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan University YNU
Priority to CN201810930337.0A
Publication of CN109034264A
Application granted
Publication of CN109034264B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a CSP-CNN model for traffic accident severity prediction and a modeling method for it. The CSP-CNN model comprises a model input layer, which receives the set of grayscale images converted from traffic accident data and feeds it to the convolutional layers for convolution; the feature vector extracted by the last convolutional layer is then passed to the fully connected layer. The fully connected layer applies a flatten operation to the input feature vector, converting it into a one-dimensional vector, and then performs linear processing; it contains 3 hidden units and outputs 3 linear results to the model output layer. The model output layer defines 3 traffic accident severity levels and predicts severity with the Softmax activation function. The invention takes into account the spatio-temporal relationships, combination relationships and deeper intrinsic relationships among traffic accident features when predicting traffic accident severity.

Description

CSP-CNN model for traffic accident severity prediction and its modeling method
Technical field
The invention belongs to the field of data mining technology, and in particular relates to a deep-learning-based traffic accident severity prediction model and its modeling method.
Background art
Every year, more than 1.25 million people worldwide lose their lives in road traffic accidents, and a further 20 to 50 million suffer non-fatal injuries, many of which result in disability. Road traffic injuries cause considerable economic losses to individuals, families and entire countries; road traffic crashes cost most countries 3% of their GDP.
Accident severity prediction is one of the important steps in accident management. It gives emergency personnel important information for assessing the severity of an accident, evaluating its potential impact, and implementing effective accident management procedures. Because correctly predicting traffic accident severity can greatly help to save lives in those accidents, the severity prediction problem can be regarded as a major challenge in the field of intelligent transportation systems.
Traffic accident severity prediction methods currently fall into two broad classes: statistical learning methods and deep learning methods. In recent years, deep learning, as a new machine learning approach, has attracted great attention from researchers and industry; deep learning theory is used to interpret text, images and sound, and has been widely applied in fields such as text, image and speech recognition. Neural network technology, an efficient deep learning technique, is widely used for traffic prediction problems because of its ability to handle multidimensional data, its flexibility of implementation, its generality and its strong predictive power.
In traffic accident severity prediction, Mehmet Metin Kunt et al., in "Prediction for Traffic Accident Severity: Comparing the Artificial Neural Network, Genetic Algorithm, Combined Genetic Algorithm and Pattern Search Methods", Transport, 2011, 26, (4), 353-366, predicted the severity of traffic accidents from 12 accident-related parameters using a genetic algorithm (GA), pattern search, and a multilayer perceptron (MLP) artificial neural network (ANN). The models were built on a data set of 1,000 traffic accidents that occurred on the Tehran-Ghom freeway in 2007, and the best-fit model was selected according to the R value, root mean square error (RMSE), mean absolute error (MAE) and sum of squared errors (SSE). The experimental results showed that the MLP achieved the highest R value, about 0.87, indicating that the MLP gave the best predictions.
Zeng, Q. and Huang, H., in "A Stable and Optimized Neural Network Model for Crash Injury Severity Prediction", Accident Analysis & Prevention, 2014, 73, 351-358, proposed a convex combination (CC) method to train a neural network (NN) model for traffic accident severity prediction quickly and stably, and a modified NN pruning for function approximation (N2PFA) method to optimize the network structure. They compared these with an NN trained by the conventional back-propagation (BP) method and with an ordered logit (OL) model. The results showed that the CC method outperforms the BP method in convergence ability and training speed. Compared with the fully connected NN, the optimized NN contains far fewer network nodes and has almost the same classification accuracy. Both have better fitting and predictive performance than the OL model, which again indicates that neural networks are superior to statistical models for predicting traffic accident severity.
Sameen et al., in "Severity Prediction of Traffic Accidents with Recurrent Neural Networks", Applied Sciences, 2017, 7, (6), used a recurrent neural network (LSTM-RNN) to analyze 1,130 traffic accidents that occurred on the North-South Expressway in Malaysia between 2009 and 2015 and applied it to traffic accident severity prediction. Their experimental results showed that the LSTM-RNN model outperformed the MLP and Bayesian logistic regression (BLR) models: the validation accuracy of the LSTM-RNN model was 71.77%, while the MLP and BLR models reached 65.48% and 58.30%, respectively.
Today, CNNs have become a research hotspot in many scientific fields. A CNN is a fast and effective feed-forward neural network that is widely used in computer vision, image recognition and speech recognition, with notable results. For feature extraction, a CNN has the following characteristics: first, the convolutional layers of a CNN are locally connected rather than fully connected, meaning that an output neuron is connected only to locally adjacent input neurons; second, another layer structure in a CNN, the pooling layer, selects only the salient features from the receptive field, which greatly reduces the number of model parameters; third, fully connected layers are used only in the final stage of the CNN. The factors that influence traffic accident severity mainly comprise five groups of features: road surface features, accident features, vehicle features, driver features and environmental factors. However, the related work above does not consider or explore in detail the spatial, combination and deeper intrinsic relationships among these features that influence traffic accident casualty severity.
Summary of the invention
The object of the present invention is to provide a traffic accident severity prediction model and a modeling method for it: the traffic accident data set is converted into grayscale images according to the importance of the traffic accident features, a deep-learning-based traffic accident severity prediction model, CSP-CNN, is constructed, and the spatial, combination and deeper intrinsic relationships among the features affecting casualty severity are extracted in order to predict traffic accident casualty severity.
The technical solution adopted by the invention is a CSP-CNN model for traffic accident severity prediction, consisting of the following four parts: a model input layer, convolutional layers, a fully connected layer and a model output layer;
The model input layer receives the set of traffic accident grayscale images and provides the input to the convolutional layers;
The convolutional layers extract abstract features of the traffic accident data set from the input grayscale images;
The fully connected layer converts the feature vector extracted and learned by the last convolutional layer into a one-dimensional vector, performs linear processing on that vector, and outputs the linear result;
The model output layer applies the Softmax activation function to the output of the fully connected layer to predict traffic accident severity;
There are 4 convolutional layers, each with 256 filters, kernel size = 3, stride = 1 and zero-padding parameter pad = 1;
The fully connected layer comprises 1 flatten layer and 128 hidden units;
The model output layer, i.e. the softmax fully connected layer, comprises 3 hidden units.
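As a rough check on the layer sizes listed above, the following Python sketch traces the tensor shape through the network. The 5×5 input size is an assumption taken from the embodiment described later (5 parent features, at most 5 sub-features under one parent feature); the function names are illustrative and not part of the patent.

```python
def conv_output_size(size: int, kernel: int = 3, stride: int = 1, pad: int = 1) -> int:
    """Spatial size after one convolution: floor((size - kernel + 2*pad) / stride) + 1."""
    return (size - kernel + 2 * pad) // stride + 1


def trace_shapes(input_size: int = 5, conv_layers: int = 4, filters: int = 256):
    """Return (layer name, output shape) pairs for the layer layout described above."""
    shapes = [("input", (input_size, input_size, 1))]
    size = input_size
    for i in range(1, conv_layers + 1):
        size = conv_output_size(size)
        shapes.append((f"conv{i}", (size, size, filters)))
    shapes.append(("flatten", (size * size * filters,)))
    shapes.append(("dense", (128,)))     # 128 hidden units
    shapes.append(("softmax", (3,)))     # 3 severity levels
    return shapes


for name, shape in trace_shapes():
    print(name, shape)
```

With kernel size 3, stride 1 and pad 1 the spatial size is preserved, so under the 5×5 assumption the flatten layer yields 5·5·256 = 6400 values.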
The modeling method of the CSP-CNN traffic accident severity prediction model comprises the following steps:
Step 1: based on the importance of the traffic accident features, convert the traffic accident data into a set of traffic accident grayscale images and feed it to the model input layer. The input of the CSP-CNN model is expressed mathematically as:
$$x_d = \begin{bmatrix} P_{11} & \cdots & P_{1M} \\ \vdots & \ddots & \vdots \\ P_{M1} & \cdots & P_{MM} \end{bmatrix}, \quad M = \max(PC, CC), \quad d \in [1, N]; \tag{1}$$
where d is the index into the traffic accident data set x, N is the total number of records in the data set x, PC is the number of parent features of the data set x, CC is the largest number of sub-features under any parent feature of the data set x, max(PC, CC) is the larger of PC and CC, and $P_{MM}$ is the pixel in row M, column M of the grayscale pixel matrix $x_d$ of the traffic accident data set;
Step 2: convolution: the input provided by the model input layer is convolved using the ReLU activation function, which is defined as:
$$g(h) = \max(0, h); \tag{2}$$
where h is the input of the convolutional neuron;
The convolution is computed as:
$$a_{k,l} = g\left(\sum_{c=1}^{C} \sum_{e=1}^{F} \sum_{f=1}^{F} w_{c,e,f} \, p_{c,k+e,l+f} + w_b\right); \tag{3}$$
where $a_{k,l}$ denotes the element in row k, column l of the convolutional layer's feature map; e and f range over [1, F]; C is the number of channels, equal to the number of filters of the convolutional layer; F is the filter size, the width and height of the filter being equal; $w_{c,e,f}$ denotes the weight in row e, column f of the filter for the c-th channel; $p_{c,k,l}$ denotes the pixel in row k, column l of the c-th channel of the input image; $p_{c,k+e,l+f}$ denotes the pixel in row k+e, column l+f of the c-th channel of the input image; $w_b$ denotes the bias term of the filter, which is randomly initialized on each model run;
The argument of g, $\sum_{c} \sum_{e} \sum_{f} w_{c,e,f} \, p_{c,k+e,l+f} + w_b$, is the input h of the convolutional neuron;
Step 3: fully connected layer: the feature vector extracted and learned by the last convolutional layer is converted by the flatten operation into a one-dimensional vector, which serves as the input of the fully connected layer:
$$a_{flatten} = \mathrm{flatten}([a_1, a_2, \ldots, a_c]), \quad c \in [1, C]; \tag{4}$$
where $a_{flatten}$ denotes the resulting one-dimensional vector, i.e. the feature map of the fully connected layer after flattening; $[a_1, a_2, \ldots, a_c]$ is the output of the last convolutional layer, i.e. the feature vector [Feature Map$_1$, Feature Map$_2$, …, Feature Map$_c$] extracted and learned by the last convolutional layer;
The fully connected layer is computed as:
$$y_{fl} = w_{fl} \cdot a_{flatten} + b_{fl}; \tag{5}$$
where $y_{fl}$ denotes the linear output of the fully connected layer, $w_{fl}$ denotes the weights of the fully connected layer, and $b_{fl}$ denotes the bias term of the fully connected layer;
Step 4: traffic accident severity prediction: the severity levels are set to three classes: slight traffic accident, serious traffic accident and fatal traffic accident. Based on the output of the fully connected layer, the model output layer predicts severity with the Softmax activation function and outputs a probability for each of the defined levels; the level with the highest probability is the predicted traffic accident severity;
Step 5: train the CSP-CNN traffic accident severity prediction model and determine the combination of CSP-CNN hyperparameters.
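Steps 2 to 4 can be illustrated with a small pure-Python forward pass. This is a minimal sketch under simplifying assumptions, not the patented implementation: it uses a single filter on a single-channel image, 0-based indices, no padding, one linear layer feeding the softmax directly, and made-up weights. The names `conv2d_relu`, `dense` and `softmax` are hypothetical.

```python
import math


def relu(h):
    """Activation g(h) = max(0, h), equation (2)."""
    return max(0.0, h)


def conv2d_relu(image, kernel, bias):
    """Unpadded convolution of a single-channel image with one F x F filter, then ReLU."""
    F = len(kernel)
    out_size = len(image) - F + 1
    fmap = []
    for k in range(out_size):
        row = []
        for l in range(out_size):
            # h is the neuron input: weighted sum over the F x F window plus bias
            h = bias + sum(kernel[e][f] * image[k + e][l + f]
                           for e in range(F) for f in range(F))
            row.append(relu(h))
        fmap.append(row)
    return fmap


def flatten(feature_maps):
    """Concatenate all feature maps into one 1-D vector, as in equation (4)."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def dense(vector, weights, biases):
    """Linear layer: one output per weight row, w . a + b."""
    return [sum(w * a for w, a in zip(w_row, vector)) + b
            for w_row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over the severity scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]


image = [[0.0, 0.2, 0.0], [0.5, 1.0, 0.3], [0.0, 0.1, 0.0]]   # toy 3x3 grayscale input
kernel = [[1.0, 0.0], [0.0, -1.0]]                             # toy 2x2 filter
a_flat = flatten([conv2d_relu(image, kernel, bias=0.0)])
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
probs = softmax(dense(a_flat, weights, biases=[0.0, 0.0, 0.0]))
LEVELS = ["slight", "serious", "fatal"]
predicted = LEVELS[probs.index(max(probs))]   # level with the highest probability
print(a_flat, probs, predicted)
```

The final line mirrors step 4: the class with the largest softmax probability is taken as the predicted severity.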
In step 1, the conversion of traffic accident data into a set of grayscale images based on the importance of the traffic accident features is carried out as follows:
Step 1: obtain the feature matrix FM of the preprocessed traffic accident data set;
Step 2: allocate k threads according to the total number of records in the original traffic accident data set; in each thread, convert the corresponding feature vector FV in the feature matrix FM into a grayscale image;
Step 3: store the grayscale image grayImage obtained in each thread from its feature vector FV in the grayscale image linked list grayImageList, and return the grayscale image grayImage.
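The three steps above amount to mapping a conversion function over the rows of FM in parallel and collecting the results. A minimal sketch with Python's standard thread pool follows. `to_gray_image` is a hypothetical placeholder for the conversion in step 2, and a bounded worker count is used instead of one thread per record, which is an assumption made for practicality.

```python
from concurrent.futures import ThreadPoolExecutor


def to_gray_image(feature_vector):
    """Placeholder conversion: wraps the values in a one-row 'image'.
    The actual matrix-filling conversion is described elsewhere in the text."""
    return [list(feature_vector)]


def convert_all(feature_matrix, max_workers=4):
    """Convert every feature vector FV in FM to an image, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        gray_image_list = list(pool.map(to_gray_image, feature_matrix))
    return gray_image_list


FM = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # toy feature matrix with 3 records
gray_image_list = convert_all(FM)
print(len(gray_image_list))
```

`pool.map` returns results in the order of the inputs, so the i-th image corresponds to the i-th record regardless of which thread finishes first.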
In step 1, the traffic accident data set is preprocessed as follows:
(1) delete incomplete, erroneous and duplicate traffic accident records, and filter the sub-features according to whether they influence traffic accident casualty severity;
(2) normalize the traffic accident data set to remove the units of the data and convert the data into dimensionless pure numbers: the data set x is normalized with the statistical standardization method Z-score normalization, so that each feature has zero mean and unit standard deviation, as in a standard normal distribution. The Z-score conversion function is:
$$x^* = \frac{x - u}{\sigma}; \tag{6}$$
where x is a data value under a single feature and $x^*$ is its normalized value, u is the mean of all data under that feature, and σ is the standard deviation of all data under that feature; each feature in the traffic accident data set x is processed in turn.
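The normalization in (2) can be sketched per feature column as follows. The sample values are made up, and the use of the population standard deviation is an assumption, since the text does not say whether the population or sample form is meant.

```python
from statistics import mean, pstdev


def z_score(column):
    """Normalize one feature column to zero mean and unit standard deviation."""
    u = mean(column)
    sigma = pstdev(column)            # population standard deviation (assumption)
    if sigma == 0:
        return [0.0 for _ in column]  # constant feature: nothing to scale
    return [(x - u) / sigma for x in column]


ages = [20, 30, 40, 50, 60]           # made-up values for a single feature
normalized = z_score(ages)
print(normalized)
```

Applying `z_score` to each feature column in turn reproduces the "each feature is processed in turn" rule above.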
In step 1, the feature matrix FM of the traffic accident data set is obtained as follows:
Step 1.1. Determine all parent features fp of a record in the original traffic accident data set according to whether they are relevant to traffic accident severity prediction:
$$fp = \{fp_1, \ldots, fp_m\}; \tag{7}$$
where m is the number of parent features of a record in the original traffic accident data set;
Step 1.2. Determine all sub-features fc of a record in the original traffic accident data set, as confirmed by data preprocessing:
$$fc = \{fc_{1,1}, \ldots, fc_{i,j}\}; \tag{8}$$
where $i \in [1, m]$, $j \in [1, n]$, and $fc_{i,j}$ denotes the j-th sub-feature of a record in the original traffic accident data set, whose parent feature is $fp_i$, subject to $fc_{i,j} \in fp_i$ and $fp_i \cap fp_j = \emptyset$ for $i \neq j$, i.e. each sub-feature belongs to exactly one parent feature; the number of sub-features of the i-th parent feature is denoted $Np_i = |fp_i|$;
Step 1.3. Determine the importance weight vector wc of all sub-features of a record in the original traffic accident data set:
$$wc = (w_{1,1}, \ldots, w_{i,j}); \tag{9}$$
where $w_{i,j}$ denotes the importance weight of the j-th sub-feature of a record in the original traffic accident data set, this sub-feature belonging to parent feature $fp_i$;
Step 1.4. Determine the feature vector FV of a record in the traffic accident data set, which is the representation of the features of that record and is a triple:
$$FV = \langle fp, fc, wc \rangle; \tag{10}$$
Step 1.5. Determine the feature matrix FM of the traffic accident data set, which is the representation of the features of all records in the data set and is a set of feature vectors:
$$FM = \{FV_1, \ldots, FV_k\}, \quad FM \in R^{k \times n}; \tag{11}$$
where k is the total number of records in the original traffic accident data set, n is the number of sub-features of a record in the original traffic accident data set, and $R^{k \times n}$ denotes a k × n matrix.
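To make the triple FV = <fp, fc, wc> concrete, here is one possible in-memory representation, with a check of the "exactly one parent feature" constraint from step 1.2. The class name, field layout and toy weights are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class FeatureVector:
    """FV = <fp, fc, wc> for one accident record (illustrative layout)."""
    fp: list   # parent feature names
    fc: dict   # parent feature name -> list of its sub-feature names
    wc: dict   # sub-feature name -> importance weight

    def validate(self):
        """Check that every sub-feature has exactly one parent and one weight."""
        seen = set()
        for parent in self.fp:
            for child in self.fc.get(parent, []):
                if child in seen:
                    raise ValueError(f"{child} has more than one parent feature")
                seen.add(child)
        if seen != set(self.wc):
            raise ValueError("every sub-feature needs exactly one weight")
        return len(seen)   # n, the number of sub-features


fv = FeatureVector(
    fp=["road", "environment"],
    fc={"road": ["road surface"], "environment": ["lighting", "weather"]},
    wc={"road surface": 0.4, "lighting": 0.3, "weather": 0.3},   # made-up weights
)
print(fv.validate())
```

A feature matrix FM would then simply be a list of k such objects, one per record.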
In step 2, converting the feature vector FV of a record in the traffic accident data set into a grayscale image comprises the following steps:
Step 2.1. Group the features of the original traffic accident data set: according to the feature vector FV of a record, assign its n sub-features to the corresponding m parent features, and initialize the importance weight vector wc of all sub-features fc;
Step 2.2. Search all parent features fp, find the parent feature with the largest number of sub-features, and return the number of sub-features of that parent feature;
Step 2.3. Compare the returned number of sub-features with m and define the larger of the two as max_dim; then initialize an all-zero matrix $Mat_{max\_dim \times max\_dim}$ as the final storage unit for the traffic accident data;
Step 2.4. Fill the all-zero matrix $Mat_{max\_dim \times max\_dim}$ from the original traffic accident data set;
Step 2.5. Call the reshape function of the image-processing library to add a channel dimension to the matrix $Mat_{max\_dim \times max\_dim}$, converting it into the grayscale image grayImage.
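Steps 2.2, 2.3 and 2.5 can be sketched in a few lines; the filling in step 2.4 is not covered here. The sub-feature counts per parent feature (5, 1, 2, 1, 3) are read off the embodiment's feature list; plain nested lists stand in for an array library, where adding the channel would typically be a reshape. The function names are hypothetical.

```python
def empty_image_matrix(sub_feature_counts):
    """Return (max_dim, all-zero max_dim x max_dim matrix).

    sub_feature_counts: number of sub-features under each parent feature.
    """
    m = len(sub_feature_counts)                  # number of parent features
    max_dim = max(m, max(sub_feature_counts))    # step 2.3
    return max_dim, [[0.0] * max_dim for _ in range(max_dim)]


def add_channel(matrix):
    """Turn an H x W matrix into an H x W x 1 grayscale image (step 2.5)."""
    return [[[value] for value in row] for row in matrix]


max_dim, mat = empty_image_matrix([5, 1, 2, 1, 3])
gray_image = add_channel(mat)
print(max_dim, len(gray_image), len(gray_image[0]), len(gray_image[0][0]))
```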
In step 2.4, the all-zero matrix $Mat_{max\_dim \times max\_dim}$ is filled from the original traffic accident data set as follows:
Step 2.4.1. Sort the parent features in descending order: all parent features fp are sorted in descending order of their weight wp, where the weight $wp_i$ of a parent feature equals the sum of the importance weights $w_{i,j}$ of all its sub-features, i.e. $wp_i = \sum_j w_{i,j}$;
Step 2.4.2. Row filling of $Mat_{max\_dim \times max\_dim}$: according to the weight wp of each parent feature, the parent features fp sorted in descending order are placed starting from the middle row and spreading towards the top and bottom, in descending order, following the principle "upper greater than lower";
Step 2.4.3. Sort the sub-features in descending order: under each parent feature, the sub-features are sorted in descending order of their importance weight $w_{i,j}$;
Step 2.4.4. Column filling of $Mat_{max\_dim \times max\_dim}$: in the corresponding row of the matrix, i.e. within each parent feature, the sub-features sorted in descending order are placed column by column following the principle "left greater than right";
Step 2.4.5. Leave the remaining elements of $Mat_{max\_dim \times max\_dim}$ at "0" to obtain the final result matrix.
The number of parent features of a record in the original traffic accident data set is m = 5; the number of sub-features of a record in the original traffic accident data set is n = 12;
$fp = \{\text{accident features}_{1}, \text{road surface features}_{2}, \text{environmental factors}_{3}, \text{vehicle features}_{4}, \text{driver features}_{5}\}$;
$fc = \{\text{easting}_{1,1}, \text{northing}_{1,2}, \text{1st road class}_{1,3}, \text{accident time}_{1,4}, \text{number of vehicles involved}_{1,5}, \text{road surface condition}_{2,6}, \text{lighting conditions}_{3,7}, \text{weather conditions}_{3,8}, \text{type of vehicle}_{4,9}, \text{casualty class}_{5,10}, \text{sex of casualty}_{5,11}, \text{age of casualty}_{5,12}\}$.
In step 1.3, the importance weight vector wc of all sub-features of a record in the original traffic accident data set is obtained by running 1000 iterations of the XGBoost method over the 12 sub-features of the traffic accident data set.
In step 5, the CSP-CNN hyperparameter combination is: batch size = 128; loss function: categorical cross-entropy (categorical_crossentropy); optimizer: Adagrad (a gradient-descent optimizer); learning rate 0.001; error term 1e-07; convolution kernels initialized with the Glorot normal initializer (glorot_normal).
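For readers unfamiliar with the loss and optimizer named above, the following pure-Python sketch shows what one categorical cross-entropy evaluation and one Adagrad update look like with the stated learning rate. Reading the "error term" as Adagrad's epsilon, starting the accumulator at zero, and clipping probabilities before the logarithm are assumptions; deep learning frameworks may differ in these details.

```python
import math

LEARNING_RATE = 0.001   # from the text
EPSILON = 1e-07         # "error term" in the text, read here as Adagrad's epsilon


def categorical_crossentropy(y_true, y_pred):
    """Loss for one sample: -sum(t * log(p)) over the severity classes.
    Probabilities are clipped at EPSILON to avoid log(0) (illustrative choice)."""
    return -sum(t * math.log(max(p, EPSILON)) for t, p in zip(y_true, y_pred))


def adagrad_step(params, grads, accum):
    """One Adagrad update; accum holds the running sum of squared gradients."""
    new_params, new_accum = [], []
    for p, g, a in zip(params, grads, accum):
        a = a + g * g
        new_params.append(p - LEARNING_RATE * g / (math.sqrt(a) + EPSILON))
        new_accum.append(a)
    return new_params, new_accum


loss = categorical_crossentropy([0, 1, 0], [0.2, 0.7, 0.1])   # true class: second level
params, accum = adagrad_step([0.5, -0.3], [0.2, -0.1], [0.0, 0.0])
print(loss, params, accum)
```

Because the accumulated squared gradient only grows, each parameter's effective step size shrinks over training, which is the characteristic behaviour of Adagrad.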
The beneficial effects of the present invention are:
(1) a method of converting traffic accident data into a set of grayscale images based on the importance of the traffic accident features is proposed for the first time;
(2) traffic accident data are, for the first time, converted into a set of grayscale images and used as the input of a deep learning model to predict traffic accident severity;
(3) the method of converting traffic accident data into a set of grayscale images based on feature importance, together with the proposed CSP-CNN model, takes into account the spatio-temporal relationships, combination relationships and deeper intrinsic relationships among traffic accident features when predicting traffic accident severity.
Brief description of the drawings
To explain the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a structural diagram of the CSP-CNN traffic accident severity prediction model;
Fig. 2 shows the distribution of casualty severity in the traffic accident data set of Leeds, UK, 2009-2016;
Fig. 3 shows the distribution of feature importance in the traffic accident data set;
Fig. 4 is a schematic diagram of converting the feature vector of a record in the traffic accident data set into a grayscale image;
Fig. 5 shows the accuracy of the CSP-CNN model at different depths;
Fig. 6 shows the accuracy in the experiments with different models;
Fig. 7 shows local perception of the traffic accident data set;
Fig. 8(a) shows line charts of the precision, recall and F1 score of the different models on the slight traffic accident test set;
Fig. 8(b) shows line charts of the precision, recall and F1 score of the different models on the serious traffic accident test set;
Fig. 8(c) shows line charts of the precision, recall and F1 score of the different models on the fatal traffic accident test set.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
To predict traffic accident casualty severity, the traffic accident data set and its feature information must be considered comprehensively. The factors known to influence traffic accident severity mainly comprise five parent features: road surface features, accident features, vehicle features, driver features and environmental factors. Based on these five parent features, the method of converting the traffic accident data set into grayscale images and the CSP-CNN model are described in detail below.
Based on the importance of the traffic accident features, the feature vectors FV of the records in the traffic accident data set are converted into a set of grayscale images as follows:
Step 1: obtain the feature matrix FM of the traffic accident data set;
Step 2: allocate k threads according to the total number of records in the original traffic accident data set; in each thread, convert the corresponding feature vector FV in the feature matrix FM into a grayscale image;
Step 3: store the grayscale image grayImage obtained in each thread from its feature vector FV in the grayscale image linked list grayImageList, and return the grayscale image grayImage.
The feature matrix FM of the traffic accident data set is obtained as follows:
Step 1.1. Determine all parent features fp of a record in the original traffic accident data set according to whether they are relevant to traffic accident severity prediction:
$$fp = \{fp_1, \ldots, fp_m\}; \tag{7}$$
where m is the number of parent features of a record in the original traffic accident data set;
Step 1.2. Determine all sub-features fc of a record in the original traffic accident data set according to whether they are relevant to traffic accident severity prediction:
$$fc = \{fc_{1,1}, \ldots, fc_{i,j}\}; \tag{8}$$
where $i \in [1, m]$, $j \in [1, n]$, and $fc_{i,j}$ denotes the j-th sub-feature of a record in the original traffic accident data set, whose parent feature is $fp_i$, subject to $fc_{i,j} \in fp_i$ and $fp_i \cap fp_j = \emptyset$ for $i \neq j$, i.e. each sub-feature belongs to exactly one parent feature; the number of sub-features of the i-th parent feature is denoted $Np_i = |fp_i|$;
Step 1.3. Determine the importance weight vector wc of all sub-features of a record in the original traffic accident data set:
$$wc = (w_{1,1}, \ldots, w_{i,j}); \tag{9}$$
where $w_{i,j}$ denotes the importance weight of the j-th sub-feature of a record in the original traffic accident data set, this sub-feature belonging to parent feature $fp_i$;
Step 1.4. Determine the feature vector FV of a record in the traffic accident data set, which is the representation of the features of that record and is a triple:
$$FV = \langle fp, fc, wc \rangle; \tag{10}$$
Step 1.5. Determine the feature matrix FM of the traffic accident data set, which is the representation of the features of all records in the data set and is a set of feature vectors:
$$FM = \{FV_1, \ldots, FV_k\}, \quad FM \in R^{k \times n}; \tag{11}$$
where k is the total number of records in the original traffic accident data set, n is the number of sub-features of a record in the original traffic accident data set, and $R^{k \times n}$ denotes a k × n matrix.
Converting the feature vector FV of a record in the traffic accident data set into a grayscale image comprises the following steps:
Step 1. Group the features of the original traffic accident data set: according to the feature vector FV of a record, assign its n sub-features to the corresponding m parent features, and initialize the importance weight vector wc of all sub-features fc;
Step 2. Search all parent features fp, find the parent feature with the largest number of sub-features, and return the number of sub-features of that parent feature;
Step 3. Compare the returned number of sub-features with m and define the larger of the two as max_dim; then initialize an all-zero matrix $Mat_{max\_dim \times max\_dim}$ as the final storage unit for the traffic accident data;
Step 4. Fill the all-zero matrix $Mat_{max\_dim \times max\_dim}$ from the original traffic accident data set;
Step 5. Call the reshape function of the image-processing library to add a channel dimension to the matrix $Mat_{max\_dim \times max\_dim}$, converting it into the grayscale image grayImage.
The all-zero matrix $Mat_{max\_dim \times max\_dim}$ is filled from the original traffic accident data set as follows:
Step 4.1. Sort the parent features in descending order: all parent features fp are sorted in descending order of their weight wp, where the weight $wp_i$ of a parent feature equals the sum of the importance weights $w_{i,j}$ of all its sub-features, i.e. $wp_i = \sum_j w_{i,j}$;
Step 4.2. Row filling of $Mat_{max\_dim \times max\_dim}$: according to the weight wp of each parent feature, the parent features fp sorted in descending order are placed starting from the middle row and spreading towards the top and bottom, in descending order, following the principle "upper greater than lower". If the number of rows of $Mat_{max\_dim \times max\_dim}$ is odd, the parent feature with the largest weight is placed in the middle row, the parent feature with the second-largest weight in the row directly above the middle row, and the parent feature with the third-largest weight in the second row above the middle row; once the rows above the middle row are filled, filling continues from the row directly below the middle row, again in descending order. If the number of rows of $Mat_{max\_dim \times max\_dim}$ is even, the parent feature with the largest weight is placed in the upper of the two middle rows and the parent feature with the second-largest weight in the lower of the two middle rows;
Step 4.3. Sort the sub-features in descending order: the sub-features contained in each parent feature are still unordered at this point, so after the steps above they must in turn be sorted in descending order of their importance weight $w_{i,j}$ under each parent feature;
Step 4.4. Column filling of $Mat_{max\_dim \times max\_dim}$: in the corresponding row of the matrix, i.e. within each parent feature, the sub-features sorted in descending order are placed column by column following the principle "left greater than right". For example, if the parent feature with the second-largest weight has 3 sub-features and is placed in the second row of $Mat_{max\_dim \times max\_dim}$, these three sub-features are placed in cells (2,3), (2,2) and (2,4) of the matrix, respectively;
Step 4.5. Leave the remaining elements of $Mat_{max\_dim \times max\_dim}$ at "0" to obtain the final result matrix.
In the feature vector FV of one record of the traffic accident data set of the invention, m = 5 and n = 12:
fp = {accident characteristics$_1$, road surface characteristics$_2$, environmental factors$_3$, vehicle characteristics$_4$, driver characteristics$_5$};
fc = {easting$_{1,1}$, northing$_{1,2}$, first road class$_{1,3}$, accident time$_{1,4}$, number of vehicles involved$_{1,5}$, road surface condition$_{2,6}$, lighting condition$_{3,7}$, weather condition$_{3,8}$, vehicle type$_{4,9}$, casualty class$_{5,10}$, casualty sex$_{5,11}$, casualty age$_{5,12}$};
wc = (0.165774538$_{1,1}$, 0.171530785$_{1,2}$, 0.082228259$_{1,3}$, 0.047771472$_{1,4}$, 0.060763375$_{1,5}$, 0.048847406$_{2,6}$, 0.041826936$_{3,7}$, 0.04354843$_{3,8}$, 0.126314657$_{4,9}$, 0.067057589$_{5,10}$, 0.049116389$_{5,11}$, 0.095220163$_{5,12}$).
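The filling steps 4.1 to 4.5 can be sketched in Python using the example weights listed above. This is a minimal illustration, not the patented FM2GI implementation: the identifiers (`PARENTS`, `WEIGHTS`, `row_order`, `col_order`, `fill_matrix`) are hypothetical, the alternating center, left, right column order is an assumption generalized from the three-sub-feature example, and the continuation of the even-row case beyond the two middle rows is likewise assumed.

```python
# Parent feature -> sub-features, and sub-feature importance weights (values from the example).
PARENTS = {
    "accident": ["easting", "northing", "road_class", "time", "num_vehicles"],
    "road": ["surface"],
    "environment": ["lighting", "weather"],
    "vehicle": ["vehicle_type"],
    "driver": ["casualty_class", "casualty_sex", "casualty_age"],
}
WEIGHTS = {
    "easting": 0.165774538, "northing": 0.171530785, "road_class": 0.082228259,
    "time": 0.047771472, "num_vehicles": 0.060763375, "surface": 0.048847406,
    "lighting": 0.041826936, "weather": 0.04354843, "vehicle_type": 0.126314657,
    "casualty_class": 0.067057589, "casualty_sex": 0.049116389, "casualty_age": 0.095220163,
}


def row_order(n):
    """Step 4.2 (0-based rows): middle row, then the rows above it, then the rows below it."""
    mid = (n - 1) // 2  # odd n: the middle row; even n: the upper of the two middle rows
    if n % 2 == 1:
        return [mid] + list(range(mid - 1, -1, -1)) + list(range(mid + 1, n))
    # Even n: upper middle row, lower middle row, then (assumed) rows above, then rows below.
    return [mid, mid + 1] + list(range(mid - 1, -1, -1)) + list(range(mid + 2, n))


def col_order(n):
    """Step 4.4 (0-based columns): center first, then alternately left and right (assumed)."""
    mid = (n - 1) // 2
    order = [mid]
    for step in range(1, n):
        for col in (mid - step, mid + step):  # left before right
            if 0 <= col < n:
                order.append(col)
    return order


def fill_matrix(values, weights=WEIGHTS, parents=PARENTS):
    """Place one record's sub-feature values into a max_dim x max_dim matrix."""
    dim = max(len(parents), max(len(subs) for subs in parents.values()))
    mat = [[0.0] * dim for _ in range(dim)]  # step 3: all-zero matrix
    # Step 4.1: parent weight = sum of the weights of its sub-features.
    wp = {p: sum(weights[s] for s in subs) for p, subs in parents.items()}
    ranked_parents = sorted(parents, key=wp.get, reverse=True)
    for row, parent in zip(row_order(dim), ranked_parents):
        # Step 4.3: sort the sub-features of this parent by descending weight.
        subs = sorted(parents[parent], key=weights.get, reverse=True)
        for col, sub in zip(col_order(dim), subs):
            mat[row][col] = values[sub]
    return mat  # step 4.5: untouched cells stay 0


# Demonstration: use the weights themselves as cell values to make the layout visible.
layout = fill_matrix(WEIGHTS)
```

With these weights the driver characteristics are the second-largest parent feature, so casualty age, casualty class and casualty sex land in cells (2,3), (2,2) and (2,4) in the 1-based numbering of step 4.4.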
Establishing the CSP-CNN model for traffic accident severity prediction
A CNN has its own distinctive way of extracting the key features of an image, which gives it strong learning ability in image recognition and understanding. Compared with other deep learning models, a CNN has two distinctive properties: local connectivity and weight sharing. Local connectivity means that each neuron is connected only to one region of the input neurons, called its receptive field; weight sharing means that the neurons extracting image features share the same filter. Together, these two properties mean that a CNN has fewer parameters than other deep learning models.
To suit the traffic setting, the CSP-CNN model is proposed here. It differs from a conventional CNN model in the following respects: (1) The model input is different: the traffic accident image fed to CSP-CNN has only one channel, i.e. it is a gray-level image, essentially a pixel matrix whose values range from 0 to the normalized values of the traffic accident features. By contrast, in CNN models for image recognition and classification the input image usually has three channels (RGB) and pixel values range from 0 to 255. (2) The 12 sub-features are grouped under the 5 parent features chosen here (accident characteristics, road surface characteristics, environmental factors, vehicle characteristics and driver characteristics), so the matrix obtained after conversion is 5x5. Down-sampling such a small matrix would destroy the already scarce feature information about traffic casualty severity, so CSP-CNN omits the down-sampling operation (pooling layer) of conventional CNN models. (3) The model output is different: in the traffic setting CSP-CNN outputs a prediction of traffic casualty severity, whereas in image recognition and classification a CNN outputs an image class label.
As shown in Fig. 1, with convolution kernel size kernel size = 3, stride = 1 and zero-padding parameter pad = 1, the structure of the traffic accident severity prediction CSP-CNN model comprises four main parts: the model input, the convolutional layers, the fully connected layers and the model output layer.
First, the input of CSP-CNN is the set of gray-level images obtained by converting the traffic accident data according to the importance of the traffic accident features; it contains the 5 parent features and 12 sub-features of a traffic accident. Accordingly, the model input is expressed mathematically as:
$$x_d = \begin{bmatrix} P_{11} & \cdots & P_{1M} \\ \vdots & \ddots & \vdots \\ P_{M1} & \cdots & P_{MM} \end{bmatrix}, \quad d \in [1, N], \quad M = \max(PC, CC) \qquad (1)$$
where d is the index into the traffic accident data set x, N is the total number of records in the data set, PC is the number of parent features of the data set, CC is the largest number of sub-features under any parent feature, and $P_{MM}$ is the pixel in row M, column M of the gray-level image pixel matrix $x_d$.
The convolutional layer is the core layer of CSP-CNN; its purpose is to extract abstract features from the traffic accident data set. To describe the computation of the convolutional layer clearly, first number each pixel of the gray-level image: $p_{c,k,l}$ denotes the pixel in row k, column l of channel c of the input image. Then number each weight of the filter: $w_{c,e,f}$ denotes the weight in row e, column f of the filter for channel c. Finally, compute the convolution using the Rectified Linear Unit (ReLU) activation function.
The ReLU activation function is: $g(h) = \max(0, h)$ (2)
where h is the input of the neuron. The convolution formula is:
$$a_{k,l} = g\Big(\sum_{c=1}^{C}\sum_{e=1}^{F}\sum_{f=1}^{F} w_{c,e,f}\, p_{c,k+e,l+f} + w_b\Big) \qquad (3)$$
where $a_{k,l}$ is the element in row k, column l of the feature map; C is the number of channels, which equals the number of filters of the convolutional layer; F is the size of the filter (its width or height, which are equal); $p_{c,k+e,l+f}$ is the pixel in row k+e, column l+f of channel c of the input image; e and f range over [1, F]; $w_b$ is the bias term of the filter and is randomly initialized each time the model is run; and the sum inside $g(\cdot)$ is the input h of the convolutional neuron.
Each convolutional layer can have multiple filters, and convolving each filter with the original traffic accident image yields one feature map. The number of channels of the feature maps after convolution therefore equals the number of filters in the convolutional layer.
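The convolution with ReLU described above can be sketched for a single-channel image and a single filter, using the settings stated earlier (kernel size 3, stride 1, zero padding 1). This is a minimal pure-Python illustration, not the model's TensorFlow implementation; the function names `relu` and `conv2d_relu` are hypothetical.

```python
def relu(h):
    """ReLU activation: g(h) = max(0, h)."""
    return max(0.0, h)


def conv2d_relu(image, kernel, bias=0.0):
    """Convolve a square single-channel image with one square filter.

    Uses stride 1 and zero padding of kernel_size // 2, so for a 3x3 filter
    (pad = 1) the output feature map has the same size as the input.
    """
    n, f = len(image), len(kernel)
    pad = f // 2
    # Build the zero-padded copy of the image.
    padded = [[0.0] * (n + 2 * pad) for _ in range(n + 2 * pad)]
    for r in range(n):
        for c in range(n):
            padded[r + pad][c + pad] = image[r][c]
    feature_map = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            # Weighted sum over the FxF window plus the bias, then ReLU.
            h = bias + sum(
                kernel[e][g] * padded[k + e][l + g]
                for e in range(f)
                for g in range(f)
            )
            feature_map[k][l] = relu(h)
    return feature_map


# A 5x5 image of ones and a 3x3 filter of ones: each output counts the
# number of in-bounds pixels under the window.
ones_image = [[1.0] * 5 for _ in range(5)]
ones_kernel = [[1.0] * 3 for _ in range(3)]
feature_map = conv2d_relu(ones_image, ones_kernel)
```

Because the padding keeps the 5x5 size at every layer, stacking several such layers without pooling never shrinks the already small traffic accident image.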
Fully connected layers: the feature maps extracted and learned by the last convolutional layer are transformed by the flatten operation into a one-dimensional vector, which serves as the input of the first fully connected layer:
$$a_{flatten} = \mathrm{flatten}([a_1, a_2, \ldots, a_c]), \quad c \in [1, C] \qquad (4)$$
where $a_{flatten}$ is the resulting one-dimensional vector, i.e. the feature maps after flattening, and $[a_1, a_2, \ldots, a_c]$ are the feature maps [Feature Map 1, Feature Map 2, ..., Feature Map c] extracted and learned by the last convolutional layer;
The fully connected layer is computed as:
$$\hat{y} = w_{fl}\, a_{flatten} + b_{fl} \qquad (5)$$
where $\hat{y}$ is the linear output of the fully connected layer, $w_{fl}$ is the weight of the fully connected layer, and $b_{fl}$ is the bias term of the fully connected layer;
Finally, the output of each fully connected layer serves as the input of the next fully connected layer, and the final output $\hat{y}$ is passed to the output layer. The output layer classifies traffic casualty severity with the Softmax activation function, which outputs a probability for each predefined class; the class with the largest probability is the predicted class. The model output is the corresponding traffic casualty severity level: slight, serious or fatal traffic accident.
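The path from the last convolutional layer to the predicted severity level (flatten, linear layer, softmax, arg-max) can be sketched as follows. This is a minimal pure-Python illustration with hypothetical names (`flatten`, `dense`, `softmax`, `predict`, `LEVELS`) and toy weights; it is not the trained model.

```python
import math

LEVELS = ["slight", "serious", "fatal"]  # the three severity classes


def flatten(feature_maps):
    """Concatenate all feature maps (lists of rows) into one 1-D vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def dense(vector, weights, biases):
    """Linear layer: one weighted sum plus bias per output unit."""
    return [
        sum(w * x for w, x in zip(unit_weights, vector)) + b
        for unit_weights, b in zip(weights, biases)
    ]


def softmax(logits):
    """Turn logits into probabilities; subtracting the max avoids overflow."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(feature_maps, weights, biases):
    """Return the most probable severity level and all class probabilities."""
    probs = softmax(dense(flatten(feature_maps), weights, biases))
    return LEVELS[probs.index(max(probs))], probs


# Toy example: one 2x2 feature map and three output units with hand-picked weights.
toy_maps = [[[1.0, 2.0], [3.0, 4.0]]]
toy_weights = [[0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
toy_biases = [0.0, 0.0, 0.0]
label, probabilities = predict(toy_maps, toy_weights, toy_biases)
```

In the toy example the logits are 0, 1 and 4, so the third class receives the largest probability.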
In addition, Batch Normalization is applied between convolutional layers, between the convolutional layers and the fully connected layers, and between fully connected layers, in order to speed up model training and prevent overfitting.
Experimental results and analysis
The proposed CSP-CNN model is implemented in Python with TensorFlow, the open-source deep learning framework developed by Google, because TensorFlow is usable, flexible and efficient and makes it convenient to define and run a variety of deep learning networks. The hardware is a GPU server with an Intel Xeon E5-2682 v4 (Broadwell) processor at 2.5 GHz and an Nvidia P100 GPU with 12 GiB of video memory, offering 9.3 TFLOPS of single-precision and 4.7 TFLOPS of double-precision floating-point performance. On this server, using the TensorFlow framework, the CSP-CNN model was trained for 100 epochs on 39403 samples (80% of the data set) and validated on 9851 samples (20% of the data set).
(1) Data collection
Traffic accident data from Leeds City Council, England, covering 8 years (2009-2016) are used for this experiment. A total of 21436 accident records were obtained for this period. Each record describes one traffic accident in Leeds; for each accident, 15 different sub-features were collected, including the location, the number of people and vehicles involved, the road surface, the weather conditions and so on. To examine the influence of the various factors on traffic casualty severity, casualty severity is divided into three levels: slight, serious and fatal.
(2) Data preprocessing
Before the traffic accident data set is fed to CSP-CNN as input, it must be preprocessed. The steps are initial data processing, handling of class imbalance, and conversion of the data into images, as detailed below:
1) Initial data processing includes deleting incomplete, erroneous and duplicate traffic accident records, removing sub-features that do not affect traffic casualty severity, and normalizing the traffic accident data set. After the incomplete, erroneous and duplicate records are deleted, the data set available for training contains 18727 records. The share of each severity level in the total data set is shown in Fig. 2: 88% of the traffic accidents are slight accidents, 11% are serious accidents and 1% are fatal accidents.
Next, the 15 different sub-features of the traffic accident data set are reduced to 12 according to whether they are associated with traffic accident severity prediction; they cover road surface characteristics, accident characteristics, vehicle characteristics, driver characteristics and environmental factors, as shown in Table 1.
Table 1 The 12 sub-features of the traffic accident data set and their descriptions
Because the 12 sub-features of a traffic accident are measured in different units, the data under each feature must be normalized to remove the units and turn the data into dimensionless pure numbers, so that features of different units or magnitudes can be compared. Normalizing the traffic accident data set also improves the convergence speed and accuracy of the model. The traffic accident data set x is normalized with the Z-score normalization (zero-mean normalization) method from statistics, after which the data have mean 0 and standard deviation 1, as in the standard normal distribution. The transformation function is:
$$x^* = \frac{x - u}{\sigma}$$
where x is a data value under a single feature and $x^*$ is its normalized value, u is the mean of all data under that feature, and σ is the standard deviation of all data under that feature; the calculation is carried out separately for each feature;
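The per-feature Z-score normalization can be sketched as follows. This is a minimal illustration with a hypothetical function name `z_score`; it assumes the population standard deviation, since the text does not say whether the population or sample form is used.

```python
from statistics import fmean, pstdev


def z_score(column):
    """Normalize one feature column to mean 0 and standard deviation 1."""
    u = fmean(column)
    sigma = pstdev(column)  # population standard deviation (an assumption)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - u) / sigma for x in column]


# Toy column with mean 5 and population standard deviation 2.
normalized = z_score([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

Each of the 12 sub-feature columns would be passed through such a function independently, so that features with large raw magnitudes, such as grid coordinates, do not dominate the image.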
2) As Fig. 2 shows, fatal and serious traffic accidents account for only a small fraction of all traffic accidents. If the imbalance of the traffic accident data set is not handled, model training will focus on the classes that make up a large share of the data and neglect those that make up a small share, so that the trained model ends up overfitting the larger classes and underfitting the smaller ones. Sampling methods generally handle imbalanced data in one of two ways: undersampling or oversampling. Because undersampling discards part of the data set and so cannot make full use of it, oversampling is used here to solve the imbalance problem. The simplest oversampling method is random oversampling, which enlarges the minority classes by simply copying samples; however, the information the model learns this way tends to be too specific to generalize, i.e. the model overfits during training. For this reason, Borderline-SMOTE2, an improvement on the Synthetic Minority Oversampling Technique (SMOTE), is used to solve this problem. With this method the final traffic accident data set contains 49254 records, with slight, serious and fatal accidents in a 1:1:1 ratio, i.e. 16418 records each.
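To illustrate how SMOTE-style oversampling differs from copying samples, the following sketch shows only the basic interpolation step: a synthetic sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. It is a simplified stand-in, not Borderline-SMOTE2, which additionally decides which borderline samples to interpolate from; that selection is not reproduced here. The names `nearest_neighbours` and `smote_like` are hypothetical.

```python
import random


def nearest_neighbours(point, others, k):
    """Return the k points in `others` closest to `point` (squared Euclidean distance)."""
    return sorted(
        others,
        key=lambda other: sum((a - b) ** 2 for a, b in zip(point, other)),
    )[:k]


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by linear interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbour = rng.choice(nearest_neighbours(base, others, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Four minority samples on the unit square; synthetic points stay inside it.
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_like(minority_samples, n_new=6)
```

Because each synthetic point is a convex combination of two real minority samples, it is a new point rather than a duplicate, which is why this family of methods is less prone to the overfitting caused by random oversampling.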
3) To better extract the spatial, combinational and deeper intrinsic relationships among the features of the traffic accident data set, the data are converted into gray-level images according to the five parent features of a traffic accident and their sub-features, and these images are used as the input of the CSP-CNN model. The CNN's ability to learn from low level to high level and from concrete to abstract is used to better learn the spatial, combinational and deeper intrinsic relationships among the features, finally yielding a traffic casualty severity prediction model. Converting the traffic accident data set into gray-level images mainly involves the following steps: (1) run XGBoost for 1000 iterations on the 12 traffic accident sub-features to obtain the importance distribution, shown in Fig. 3 and Table 2; (2) pass the feature importances and the traffic accident data set to the FM2GI method, which outputs the gray-level image form of the traffic accident data set.
Table 2 Importance values of the traffic accident data set features
Fig. 4 illustrates how the feature vector of one record in the traffic accident data set is converted into a gray-level image.
(3) Hyperparameters of CSP-CNN
Using the interfaces provided by scikit-learn, the GridSearchCV and RandomizedSearchCV methods are combined to search the hyperparameter combinations of CSP-CNN with 100 epochs, which determines the optimal CSP-CNN hyperparameter combination. Using GridSearchCV alone has a very high computational cost, while using RandomizedSearchCV alone may find only a locally optimal combination. To make better use of both, RandomizedSearchCV is used for the global search of the optimal hyperparameter combination and GridSearchCV for the local search. This reduces the computational cost somewhat and makes it less likely that the search gets stuck in a locally optimal combination; tuning the hyperparameters with this combined approach gives better results. Models with various hyperparameter combinations are built and each is evaluated with 5-fold cross-validation, and the combination with the highest accuracy is selected. Table 3 shows the hyperparameter combination used by CSP-CNN after searching with this hybrid method.
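The two-stage idea (a coarse random search over the whole space, followed by a small grid around the best random candidate) can be sketched without any library. Everything here is hypothetical: `score` is a toy stand-in for the mean 5-fold cross-validation accuracy of a model, and the searched hyperparameters (`lr`, `units`) and their ranges are illustrative, not the ones in Table 3.

```python
import random


def score(lr, units):
    """Hypothetical objective standing in for mean 5-fold validation accuracy."""
    return 1.0 - 1000.0 * (lr - 0.01) ** 2 - ((units - 128) / 512) ** 2


def random_search(n_trials, seed=0):
    """Stage 1: sample combinations from the whole space, keep the best one."""
    rng = random.Random(seed)
    trials = [
        (10 ** rng.uniform(-4, -1), rng.choice([32, 64, 128, 256, 512]))
        for _ in range(n_trials)
    ]
    return max(trials, key=lambda combo: score(*combo))


def grid_refine(best):
    """Stage 2: exhaustive search on a small grid centred on the stage-1 winner."""
    lr, units = best
    grid = [
        (lr * factor, candidate_units)
        for factor in (0.5, 1.0, 2.0)
        for candidate_units in (units // 2, units, units * 2)
    ]
    return max(grid, key=lambda combo: score(*combo))


coarse = random_search(n_trials=20)
fine = grid_refine(coarse)
```

Since the local grid contains the stage-1 winner itself, the refined combination can never score worse than the coarse one, while the grid stays small enough to keep the exhaustive step cheap.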
Table 3 Hyperparameter combination of the CSP-CNN model
(4) CSP-CNN depth analysis
Deep learning models generally stack multiple modules and layers, so analyzing network depth is important for understanding network behavior. In general, the depth of a CNN should be neither too large nor too small, so that the CNN can learn more complex relationships while the model still converges. CSP-CNN models of increasing depth were tested. Table 4 lists the network structures of CSP-CNN at the different depths, and Fig. 5 shows the training and validation accuracy obtained with those structures. At depth 5, the training and validation accuracy are 96.24% and 92% respectively. At depth 7, the validation accuracy reaches its maximum of 93.42%, with a training accuracy of 97.45%. Beyond depth 7, the training accuracy keeps rising but the validation accuracy gradually falls, which shows that the model begins to overfit: at depths 9, 11 and 13 the training accuracy is 97.91%, 98.03% and 98.27% and the validation accuracy is 93.36%, 93.34% and 93.23% respectively. The best accuracy is obtained with 4 convolutional layers of 256 filters each, 1 flatten layer, 1 fully connected layer with 128 hidden units and 1 softmax fully connected layer with 3 hidden units; the training and validation accuracy of this model are 97.45% and 93.42% respectively. The CSP-CNN model of depth 7 is therefore used for the experiments here.
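As a worked check of the depth-7 structure, the following sketch computes the feature-map size after each convolution and the number of trainable weights and biases. It assumes a 5x5 single-channel input, as stated for the gray-level image, and it leaves out Batch Normalization parameters; the function names are hypothetical.

```python
def conv_output_size(size, kernel=3, stride=1, pad=1):
    """Spatial size after one convolution: (size + 2*pad - kernel) // stride + 1."""
    return (size + 2 * pad - kernel) // stride + 1


def count_parameters(input_size=5, in_channels=1, conv_layers=4, filters=256,
                     kernel=3, dense_units=128, classes=3):
    """Return (final spatial size, flattened length, weights + biases in total)."""
    size, channels, total = input_size, in_channels, 0
    for _ in range(conv_layers):
        # Each filter has kernel*kernel*channels weights and one bias.
        total += kernel * kernel * channels * filters + filters
        size = conv_output_size(size, kernel)
        channels = filters
    flat = size * size * channels
    total += flat * dense_units + dense_units  # fully connected layer
    total += dense_units * classes + classes   # softmax output layer
    return size, flat, total


final_size, flat_length, n_params = count_parameters()
```

With kernel size 3, stride 1 and pad 1 the spatial size stays 5 through all four convolutional layers, so the flatten layer produces 5 x 5 x 256 = 6400 values, which is consistent with the decision to omit pooling.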
Table 4 CSP-CNN models at different depths
(5) Comparison of experimental results with other models
To demonstrate the effectiveness of the proposed CSP-CNN model, this experiment compares it with 6 statistical models and 3 deep learning models. The 6 statistical models are: k-nearest neighbors (KNN), a non-parametric method for classification and regression; decision tree (DT), which decomposes a complex decision into a combination of several simpler decisions, in the hope that the final solution obtained this way approximates the intended one; the naive Bayes classifier (NBC), a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between features; logistic regression (LR), which measures the relationship between a categorical dependent variable and one or more independent variables by estimating probabilities with the logistic function (i.e. the cumulative logistic distribution); gradient boosting (GB), a statistical learning technique for regression and classification problems that builds a prediction model as an ensemble of weak prediction models, an idea that originates from an observation by Leo Breiman; and support vector machines (SVMs, also called support vector networks), supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. The 3 deep learning methods are: neural networks (NNs), or connectionist systems, computing systems loosely inspired by the biological neural networks that make up animal brains, which here represent the conventional neural network and learn features through hidden layers; the long short-term memory recurrent neural network (LSTM-RNN), an extension of the RNN that has become popular because its architecture can handle long-term memory and avoids the vanishing gradient problem from which conventional RNNs suffer; and one-dimensional convolution (Conv1D), a convolutional form of convolutional neural networks (CNNs) commonly used for sequence models and natural language processing.
Furthermore, the 6 statistical models are implemented through the interfaces provided by scikit-learn, with all parameters left at their defaults. The neural network model has 4 hidden layers with 245 neurons each and 1 softmax fully connected layer; its activation function is relu, its optimizer is stochastic gradient descent (SGD), and the parameters of each layer are initialized uniformly. The long short-term memory recurrent neural network contains one LSTM layer with 128 hidden units and three hidden layers with 128, 256 and 512 neural units respectively; its last layer is a softmax fully connected layer and its optimizer is SGD with learning rate = 0.01, decay = 0.9 and momentum = 0.8. Conv1D has 4 hidden layers with 256 hidden neural units each; its last layer is a softmax fully connected layer, its activation function is relu and its optimizer is Adam.
Table 5 and Fig. 6 show the training and validation accuracy obtained on the traffic accident data set by the 6 statistical models, the 3 deep learning models and CSP-CNN. The results show that, on test-set accuracy, the proposed CSP-CNN model outperforms the other statistical and deep learning models, which indicates that CSP-CNN generalizes well to new traffic accident data. CSP-CNN does not have the highest training accuracy, but the DT model, which does, clearly overfits, whereas CSP-CNN, with the second-highest training accuracy, does not. One possible reason is that statistical models, when handling traffic accident data vectors, assume there is no local correlation among the features of the data set and so ignore the spatial, combinational and deeper intrinsic relationships among them. Likewise, the other deep learning models cannot analyze, from the standpoint of model structure, the spatial relationships among the traffic accident features, even though these features are strongly correlated and intrinsically related. The CSP-CNN model proposed here, by contrast, perceives locally and can fully extract the spatial, combinational and deeper intrinsic relationships among the features, as Fig. 7 briefly illustrates. Fig. 7 shows the pixel-matrix form of a traffic accident image. As Fig. 7 shows, with filters (convolution kernels) of a specific size the CSP-CNN model can, on the one hand, extract the corresponding traffic accident features according to the different importance of the sub-features (such as the 12 traffic accident sub-features in Fig. 7). On the other hand, it makes full use of local perception and does not assume that the features are unrelated, so it can extract features formed by combining sub-features that are spatially and intrinsically related. For example, in Fig. 7 the filter, as it slides its window, learns to extract the traffic accident feature formed by combining the sub-features lighting condition, weather condition, casualty age, casualty class, number of vehicles involved, easting and northing. This example shows clearly how the proposed CSP-CNN model extracts traffic accident features rich in spatial, combinational and deeper intrinsic relationships.
Table 5 Accuracy of the different models in the experiments
The essential purpose of predicting traffic casualty severity is to provide timely medical rescue to the people involved in a traffic accident, reduce casualties, notify the relevant emergency decision-making departments in time, and avoid greater property loss. For this purpose, the predicted traffic casualty severity is further analyzed at three levels: slight, serious and fatal traffic accidents. Because accuracy is not the only indicator of a model's predictive ability, and in order to take the practical application scenario of the model into account, precision, recall and F1 Score are introduced to analyze the traffic accident test set. Precision is calculated as:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
where TP (True Positive) is the number of true positives, i.e. samples whose true class is positive and whose predicted class is also positive; FP (False Positive) is the number of false positives, i.e. samples whose true class is negative but whose predicted class is positive.
Recall is calculated as:
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
where FN (False Negative) is the number of false negatives, i.e. samples whose true class is positive but whose predicted class is negative.
F1 Score is calculated as:
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
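Precision, recall and F1 Score can be computed per severity level by treating that level as the positive class and the other two as negative. The sketch below is a minimal illustration with a hypothetical function name and made-up labels; it is not the evaluation code or data behind Table 6.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Precision, recall and F1 Score with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up labels for six accidents.
true_levels = ["slight", "slight", "serious", "fatal", "serious", "slight"]
pred_levels = ["slight", "serious", "serious", "fatal", "slight", "slight"]
serious_scores = per_class_metrics(true_levels, pred_levels, "serious")
fatal_scores = per_class_metrics(true_levels, pred_levels, "fatal")
```

In this toy data the serious class has one true positive, one false positive and one false negative, so all three of its metrics equal 0.5, while the single fatal accident is predicted correctly and scores 1.0 throughout.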
Table 6 and Fig. 8 show the precision, recall and F1 Score of the different models on the traffic accident test set for slight, serious and fatal traffic accidents.
Table 6 Precision, recall and F1 Score of the different models at each traffic casualty severity level
Table 6 and Fig. 8 show that, on the slight traffic accident test set, the precision of the CSP-CNN model is the highest of all models, while the highest recall belongs to the statistical model GB. On the serious traffic accident test set, both the precision and the recall of CSP-CNN are the highest among the ten models. On the fatal traffic accident test set, CSP-CNN, NN and Conv1D share first place in both precision and recall. Considering the practical scenario, some error in precision can be tolerated for slight traffic accidents, because a slight accident is very unlikely to cause heavy casualties or heavy property loss. For serious and fatal traffic accidents, however, the requirement on prediction precision must be higher, because even a slightly inaccurate prediction may mean that the corresponding emergency medical support and the decisions of the relevant emergency departments are not provided, ultimately leading to heavy casualties and property loss. Analyzed from this practical angle, the CSP-CNN model performs better than the other models. In general, precision and recall affect each other: when precision is high, recall tends to be low, and when recall is high, precision tends to be low. Ideally both are high. To be fair and objective, it is common practice to evaluate a model with F1 Score, a composite indicator closely related to both precision and recall. The results show that on the slight and serious traffic accident test sets the F1 Score of the proposed CSP-CNN model is higher than that of the other models, and on the fatal traffic accident test set the F1 Score of the CSP-CNN model shares first place with NN and Conv1D.
In summary, whether judged by overall prediction accuracy or by the analysis of traffic accidents at different severity levels in the specific application scenario, the proposed CSP-CNN model outperforms the other models.
A deep learning method, the CSP-CNN model, is proposed here to predict traffic accident severity. Unlike earlier work, which considered only the simple structure of traffic accident data, the proposed method can capture feature representations of traffic accident severity, such as the nonlinear spatio-temporal relationships, combinational relationships and deeper intrinsic relationships among traffic accident features. Experiments were carried out on the Leeds City Council traffic accident data set covering the 8 years 2009-2016, comparing the proposed CSP-CNN model with the NBC, KNN, LR, DT, GB, SVM, Conv1D, NN and LSTM-RNN models; the results show that the proposed model performs better than these other models.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims (10)

1. A CSP-CNN model for traffic accident severity prediction, characterized in that it consists of the following four parts: a model input layer, convolutional layers, a fully connected layer and a model output layer;
The model input layer is used to input the traffic accident data gray-level image set and to provide input to the convolutional layers;
The convolutional layers are used to extract abstract features of the traffic accident data set from the input traffic accident data set gray-level images;
The fully connected layer is used to convert the feature vectors of the traffic accident data set extracted and learned by the last convolutional layer into a one-dimensional vector, apply linear processing to that one-dimensional vector, and output the result of the linear processing;
The model output layer is used to predict traffic accident severity from the output of the fully connected layer using the Softmax activation function;
Wherein there are 4 convolutional layers, each with 256 filters, convolution kernel size kernel size = 3, stride = 1 and zero-padding parameter pad = 1;
The fully connected layer comprises 1 flatten layer and 128 hidden units;
The model output layer, i.e. the softmax fully connected layer, comprises 3 hidden units.
2. A modeling method for the traffic accident severity prediction CSP-CNN model according to claim 1, characterized in that the specific steps are as follows:
Step 1: Convert the traffic accident data into a traffic accident data gray-level image set according to the importance of the traffic accident features and feed it to the model input layer; the input of the traffic accident severity prediction model CSP-CNN is expressed mathematically as:
$$x_d = \begin{bmatrix} P_{11} & \cdots & P_{1M} \\ \vdots & \ddots & \vdots \\ P_{M1} & \cdots & P_{MM} \end{bmatrix}, \quad d \in [1, N], \quad M = \max(PC, CC) \qquad (1)$$
where d is the index into the traffic accident data set x, N is the total number of records in the traffic accident data set x, PC is the number of parent features of the traffic accident data set x, CC is the largest number of sub-features under any parent feature of the traffic accident data set x, max(PC, CC) is the larger of PC and CC, and $P_{MM}$ is the pixel in row M, column M of the gray-level image pixel matrix $x_d$ of the traffic accident data set;
Step 2: Convolution: apply convolution with the ReLU activation function to the input provided by the model input layer; the ReLU activation function is:
$g(h) = \max(0, h)$; (2)
where h is the input of the convolutional neuron;
The convolution formula is:
$$a_{k,l} = g\Big(\sum_{c=1}^{C}\sum_{e=1}^{F}\sum_{f=1}^{F} w_{c,e,f}\, p_{c,k+e,l+f} + w_b\Big) \qquad (3)$$
where $a_{k,l}$ is the element in row k, column l of the feature map of the convolutional layer, and e and f range over [1, F]; C is the number of channels, which equals the number of filters of the convolutional layer; F is the size of the filter, whose width and height are equal; $w_{c,e,f}$ is the weight in row e, column f of the filter for channel c; $p_{c,k,l}$ is the pixel in row k, column l of channel c of the input image; $p_{c,k+e,l+f}$ is the pixel in row k+e, column l+f of channel c of the input image; $w_b$ is the bias term of the filter and is randomly initialized each time the model is run; and the sum inside $g(\cdot)$ is the input h of the convolutional neuron;
Step 3: fully connected layer calculation: the feature vectors extracted and learned by the last convolutional layer are converted by the flatten operation into a one-dimensional vector that serves as the input of the fully connected layer, using the following equation:
$a_{flatten} = \mathrm{flatten}([a_1, a_2, \ldots, a_c]),\ c \in [1, C]$; (4)
Wherein, $a_{flatten}$ is the resulting one-dimensional vector, i.e. the Feature Map of the fully connected layer after flatten; $[a_1, a_2, \ldots, a_c]$ is the output of the last convolutional layer, i.e. the feature vectors [Feature Map$_1$, Feature Map$_2$, …, Feature Map$_c$] extracted and learned by the last convolutional layer;
The calculation formula of the fully connected layer is as follows:
$y_{fl} = w_{fl} \cdot a_{flatten} + b_{fl}$; (5)
Wherein: $y_{fl}$ is the linear output of the fully connected layer, $w_{fl}$ is the weight of the fully connected layer, and $b_{fl}$ is the bias term of the fully connected layer;
Step 4: traffic accident severity prediction: the traffic accident severity level is set to one of three classes: minor traffic accident, serious traffic accident, or fatal traffic accident. Based on the output of the fully connected layer, the model output layer uses the Softmax activation function to predict the traffic accident severity and outputs a probability value for each of the set severity levels; the level with the largest probability value is taken as the predicted traffic accident severity;
Step 5: train the traffic accident severity prediction CSP-CNN model and determine the CSP-CNN model hyperparameter combination.
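Steps 2 to 4 can be sketched as a single forward pass in NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes one convolutional layer with stride 1 and no padding, and the names `conv_feature_map`, `forward`, `filters`, and `biases` are hypothetical. Indices are zero-based, so the offsets e and f run over 0..F-1 rather than 1..F.

```python
import numpy as np


def relu(h):
    # Equation (2): g(h) = max(0, h)
    return np.maximum(0.0, h)


def conv_feature_map(p, w, w_b):
    """Equation (3) for one filter (assumed: stride 1, no padding).

    p: input image of shape (C, H, W); w: filter of shape (C, F, F);
    w_b: scalar bias. Returns a feature map of shape (H-F+1, W-F+1).
    """
    _, height, width = p.shape
    size = w.shape[1]
    out = np.empty((height - size + 1, width - size + 1))
    for k in range(out.shape[0]):
        for l in range(out.shape[1]):
            # h is the input of the convolutional neuron
            h = np.sum(w * p[:, k:k + size, l:l + size]) + w_b
            out[k, l] = relu(h)
    return out


def softmax(z):
    # Subtract the maximum for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()


def forward(p, filters, biases, w_fl, b_fl):
    """Return (class probabilities, index of the most probable class)."""
    maps = [conv_feature_map(p, w, b) for w, b in zip(filters, biases)]
    a_flatten = np.concatenate([a.ravel() for a in maps])  # equation (4)
    linear_out = w_fl @ a_flatten + b_fl                   # fully connected layer
    probs = softmax(linear_out)                            # Step 4
    return probs, int(np.argmax(probs))
```

With three output units, the returned index 0, 1, or 2 would stand for one of the three severity classes; which index maps to which class is a labeling choice this sketch does not fix.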
3. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 2, characterized in that converting the traffic accident data into the traffic accident grayscale image set based on the importance of the traffic accident features in Step 1 is implemented as follows:
Step 1: obtain the feature matrix FM of the preprocessed traffic accident data set;
Step 2: allocate k threads according to the total number of records in the original traffic accident data set; in each thread, convert the corresponding feature vector FV in the feature matrix FM of the traffic accident data set into a grayscale image;
Step 3: store the grayscale image grayImage converted from the feature vector FV by each thread in the grayscale image linked list grayImageList, and return the grayscale image grayImage.
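The thread-per-record conversion described above can be sketched with Python's standard thread pool. This is only an illustration: `feature_vector_to_gray_image` is a hypothetical stand-in that zero-pads a vector to a square and is not the importance-based conversion of the claims, and `convert_feature_matrix` is likewise an assumed name.

```python
import math
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def feature_vector_to_gray_image(fv):
    """Stand-in conversion (assumption): zero-pad the values to the
    smallest square that holds them and add a trailing channel axis."""
    values = np.asarray(fv, dtype=float)
    side = math.isqrt(max(values.size - 1, 0)) + 1
    padded = np.zeros(side * side)
    padded[:values.size] = values
    return padded.reshape(side, side, 1)


def convert_feature_matrix(fm, k):
    """Convert every feature vector in fm using up to k worker threads."""
    with ThreadPoolExecutor(max_workers=k) as pool:
        # pool.map yields results in the same order as the input rows
        gray_image_list = list(pool.map(feature_vector_to_gray_image, fm))
    return gray_image_list
```

In CPython, threads mainly pay off when the per-record work releases the global interpreter lock, so a process pool may be the better fit for a pure-Python conversion.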
4. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 3, characterized in that the preprocessing of the traffic accident data set in Step 1 comprises the following steps:
(1) delete incomplete, erroneous, and duplicate traffic accident data, and delete child features according to their influence on the severity of traffic casualties;
(2) normalize the traffic accident data set to remove the unit constraints of the data and convert the data into dimensionless pure values: the traffic accident data set x is normalized using the statistical standardization method Z-score Normalization so that the data conform to the standard normal distribution; the conversion function of Z-score Normalization is:
$x^{*} = \dfrac{x - u}{\sigma}$; (6)
Wherein, $x^{*}$ is the normalized value of a data item x under a single feature, u is the mean of all data under that feature, and σ is the standard deviation of all data under that feature; each feature in the traffic accident data set x is processed in turn.
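The per-feature Z-score normalization can be sketched as follows. The function name `z_score_normalize` is hypothetical, the population standard deviation is assumed, and mapping a constant feature to 0 instead of dividing by zero is an added safeguard that the claim does not mention.

```python
import numpy as np


def z_score_normalize(x):
    """Normalize each column (feature) of x to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    u = x.mean(axis=0)        # mean of each feature
    sigma = x.std(axis=0)     # population standard deviation of each feature
    # Assumption: a constant feature (sigma == 0) is mapped to 0
    safe_sigma = np.where(sigma == 0, 1.0, sigma)
    return (x - u) / safe_sigma
```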
5. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 3, characterized in that obtaining the feature matrix FM of the traffic accident data set in Step 1 comprises the following specific steps:
Step 1.1. according to whether they are relevant to traffic accident severity prediction, determine all parent features fp of a data record in the original traffic accident data set:
$fp = \{fp_1, \ldots, fp_m\}$; (7)
Wherein, m is the number of parent features of a data record in the original traffic accident data set;
Step 1.2. determine all child features fc of a data record in the original traffic accident data set, as confirmed by data preprocessing:
$fc = \{fc_{1,1}, \ldots, fc_{i,j}\}$; (8)
Wherein, $i \in [1, m]$, $j \in [1, n]$; $fc_{i,j}$ is the j-th child feature of a data record in the original traffic accident data set, whose parent feature is $fp_i$, and the following holds: $\bigcup_{i=1}^{m} fp_i = fc$ and $fp_i \cap fp_j = \varnothing$, where $i \neq j$, i.e. each child feature belongs to one and only one parent feature; the number of child features of the i-th parent feature is denoted $Np_i = |fp_i|$;
Step 1.3. determine the importance weight vector wc of all child features of a data record in the original traffic accident data set:
$wc = (w_{1,1}, \ldots, w_{i,j})$; (9)
Wherein, $w_{i,j}$ is the importance weight of the j-th child feature of a data record in the original traffic accident data set, and this child feature belongs to parent feature $fp_i$;
Step 1.4. determine the feature vector FV of a data record in the traffic accident data set, which expresses the features of that record as a triple:
$FV = \langle fp, fc, wc \rangle$; (10)
Step 1.5. determine the feature matrix FM of the traffic accident data set, which expresses the features of all records in the traffic accident data set as a set of feature vectors:
$FM = \{FV_1, \ldots, FV_k\}$, and $FM \in R^{k \times n}$; (11)
Wherein, k is the total number of records in the original traffic accident data set, n is the number of child features of a data record in the original traffic accident data set, and $R^{k \times n}$ denotes a k × n matrix.
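The requirement that each child feature belongs to one and only one parent feature can be checked mechanically. The sketch below is illustrative only: the function name `child_counts` and the dictionary layout (parent name mapped to a list of child names) are assumptions, not part of the claims.

```python
def child_counts(children_by_parent, all_children):
    """Check the one-parent-per-child rule and return the number of
    child features under each parent feature."""
    owner = {}
    for parent, children in children_by_parent.items():
        for child in children:
            if child in owner:
                raise ValueError(
                    f"{child!r} is listed under both {owner[child]!r} and {parent!r}"
                )
            owner[child] = parent
    if set(owner) != set(all_children):
        raise ValueError("parent features do not cover exactly the child features")
    return {parent: len(children) for parent, children in children_by_parent.items()}
```

The returned counts correspond to the per-parent child feature numbers, and the largest of them is the quantity later compared with m when sizing the image.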
6. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 2, characterized in that converting the feature vector FV of a data record in the traffic accident data set into a grayscale image in Step 2 specifically comprises the following steps:
Step 2.1. classify the features of the original traffic accident data set: according to the feature vector FV of a data record in the traffic accident data set, assign its n child features to the corresponding m parent features, and initialize the importance weight vector wc of all child features fc;
Step 2.2. search all parent features fp, find the parent feature containing the largest number of child features, and return the number of child features of that parent feature;
Step 2.3. compare the returned number of child features with m, define the larger of the two as max_dim, and then initialize an all-zero matrix $Mat_{max\_dim \times max\_dim}$ as the final storage unit for the traffic accident data set;
Step 2.4. fill the all-zero matrix $Mat_{max\_dim \times max\_dim}$ according to the original traffic accident data set;
Step 2.5. call the image-processing Reshape function to add one channel to the matrix $Mat_{max\_dim \times max\_dim}$, converting it into the grayscale image grayImage.
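The sizing and channel steps (Steps 2.2, 2.3, and 2.5) can be sketched as follows. This assumes the Reshape function behaves like NumPy's `reshape` and that the channel is appended as the last axis; the names `init_storage` and `to_gray_image` are hypothetical.

```python
import numpy as np


def init_storage(children_by_parent):
    """Steps 2.2 and 2.3: size the square all-zero matrix."""
    m = len(children_by_parent)
    largest = max(len(children) for children in children_by_parent.values())
    max_dim = max(largest, m)
    return np.zeros((max_dim, max_dim))


def to_gray_image(mat):
    """Step 2.5: add a single channel axis (channel-last is an assumption)."""
    return mat.reshape(mat.shape[0], mat.shape[1], 1)
```

A record with 2 parent features, one of which has 3 child features, would therefore be stored in a 3 × 3 matrix and emitted as a 3 × 3 × 1 image.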
7. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 5, characterized in that filling the all-zero matrix $Mat_{max\_dim \times max\_dim}$ according to the original traffic accident data set in Step 2.4 comprises the following specific steps:
Step 2.4.1. descending arrangement of parent features: sort all parent features fp in descending order of their weights wp, where the weight $wp_i$ of a parent feature equals the sum of the importance weights $w_{i,j}$ of all child features under it, i.e. $wp_i = \sum_j w_{i,j}$;
Step 2.4.2. row filling of $Mat_{max\_dim \times max\_dim}$: according to the weight wp of each parent feature, place the descending-sorted parent features fp starting from the middle row and extending toward the upper and lower sides in descending order, following the principle "upper greater than lower";
Step 2.4.3. descending arrangement of child features: sort the child features under each parent feature in descending order of their importance weights $w_{i,j}$;
Step 2.4.4. column filling of $Mat_{max\_dim \times max\_dim}$: in the corresponding row of the all-zero matrix $Mat_{max\_dim \times max\_dim}$, i.e. within each parent feature, fill the columns with its descending-sorted child features following the principle "left greater than right";
Step 2.4.5. keep the remaining elements of the all-zero matrix $Mat_{max\_dim \times max\_dim}$ unchanged at "0" to obtain the final result matrix.
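One possible reading of the filling procedure is sketched below. The interpretation is an assumption: the heaviest item is placed in the middle, the next one immediately above (or to the left), the next one immediately below (or to the right), and so on outward; for an even dimension the middle index is taken as (size - 1) // 2. The function names and the dictionary-based inputs are hypothetical.

```python
import numpy as np


def center_out_positions(size):
    """Indices ordered from the middle outward, upper/left side first."""
    mid = (size - 1) // 2
    order = [mid]
    step = 1
    while len(order) < size:
        for pos in (mid - step, mid + step):
            if 0 <= pos < size:
                order.append(pos)
        step += 1
    return order


def fill_matrix(values, weights, children_by_parent):
    """values / weights: child name -> data value / importance weight."""
    m = len(children_by_parent)
    max_dim = max(m, max(len(c) for c in children_by_parent.values()))
    mat = np.zeros((max_dim, max_dim))  # unfilled cells stay 0 (Step 2.4.5)
    # Step 2.4.1: parent weight is the sum of its child weights
    parent_weight = {
        parent: sum(weights[child] for child in children)
        for parent, children in children_by_parent.items()
    }
    parents = sorted(children_by_parent, key=parent_weight.get, reverse=True)
    positions = center_out_positions(max_dim)
    # Step 2.4.2: heaviest parent in the middle row, then outward
    for row, parent in zip(positions, parents):
        # Step 2.4.3: children of this parent in descending weight order
        children = sorted(children_by_parent[parent], key=weights.get, reverse=True)
        # Step 2.4.4: heaviest child in the middle column, then outward
        for col, child in zip(positions, children):
            mat[row, col] = values[child]
    return mat
```

Under this reading the most important features cluster around the center of the image, where a small convolution filter sees them together.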
8. The modeling method of the traffic accident severity prediction CSP-CNN model according to any one of claims 5 to 7, characterized in that the number of parent features of a data record in the original traffic accident data set is m = 5, and the number of child features of a data record in the original traffic accident data set is n = 12;
$fp = \{\text{accident characteristics}_1, \text{road surface characteristics}_2, \text{environmental factors}_3, \text{vehicle characteristics}_4, \text{driver characteristics}_5\}$;
$fc = \{\text{east position}_{1,1}, \text{north position}_{1,2}, \text{1st road class}_{1,3}, \text{accident time}_{1,4}, \text{number of vehicles involved in the accident}_{1,5}, \text{road surface condition}_{2,6}, \text{lighting condition}_{3,7}, \text{weather condition}_{3,8}, \text{vehicle type}_{4,9}, \text{casualty class}_{5,10}, \text{casualty sex}_{5,11}, \text{casualty age}_{5,12}\}$.
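The counts in this claim can be checked by writing the grouping out as data. The English feature names below are paraphrases of the translated claim text and the variable names are hypothetical; the image dimension is computed as the larger of the parent feature count and the largest child feature count, as in Step 2.3.

```python
# Parent feature -> child features, following the subscripts in the claim
children_by_parent = {
    "accident characteristics": [
        "east position",
        "north position",
        "1st road class",
        "accident time",
        "number of vehicles involved",
    ],
    "road surface characteristics": ["road surface condition"],
    "environmental factors": ["lighting condition", "weather condition"],
    "vehicle characteristics": ["vehicle type"],
    "driver characteristics": ["casualty class", "casualty sex", "casualty age"],
}

m = len(children_by_parent)                                   # parent features
n = sum(len(c) for c in children_by_parent.values())          # child features
largest = max(len(c) for c in children_by_parent.values())    # biggest group
max_dim = max(m, largest)                                     # image side length
unused_cells = max_dim * max_dim - n                          # cells left at 0
```

With these groups each record becomes a 5 × 5 single-channel image in which 12 cells carry feature values and the remaining 13 stay zero.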
9. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 5, characterized in that the importance weight vector wc of all child features of a data record in the original traffic accident data set, determined in Step 1.3, is obtained by running 1000 iterations of the XGBoost method on the 12 child features of the traffic accident data set.
10. The modeling method of the traffic accident severity prediction CSP-CNN model according to claim 2, characterized in that the CSP-CNN model hyperparameter combination in Step 5 is: batch size = 128; loss function Categorical Crossentropy; optimizer: gradient descent optimizer; learning rate 0.001; error term 1e-07; convolution kernels initialized with the Glorot normal distribution initialization method.
CN201810930337.0A 2018-08-15 2018-08-15 CSP-CNN model for predicting severity of traffic accident and modeling method thereof Expired - Fee Related CN109034264B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810930337.0A CN109034264B (en) 2018-08-15 2018-08-15 CSP-CNN model for predicting severity of traffic accident and modeling method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810930337.0A CN109034264B (en) 2018-08-15 2018-08-15 CSP-CNN model for predicting severity of traffic accident and modeling method thereof

Publications (2)

Publication Number Publication Date
CN109034264A true CN109034264A (en) 2018-12-18
CN109034264B CN109034264B (en) 2021-11-19

Family

ID=64631661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810930337.0A Expired - Fee Related CN109034264B (en) 2018-08-15 2018-08-15 CSP-CNN model for predicting severity of traffic accident and modeling method thereof

Country Status (1)

Country Link
CN (1) CN109034264B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070203866A1 (en) * 2006-02-27 2007-08-30 Kidd Scott D Method and apparatus for obtaining and using impact severity triage data
US20080275747A1 (en) * 2007-04-20 2008-11-06 Kabushiki Kaisha Toshiba Incident/accident report analysis apparatus and method
CN105512624A (en) * 2015-12-01 2016-04-20 天津中科智能识别产业技术研究院有限公司 Smile face recognition method and device for human face image
CN106250840A (en) * 2016-07-27 2016-12-21 中国科学院自动化研究所 Face based on degree of depth study opens closed state detection method
CN106338517A (en) * 2016-09-23 2017-01-18 江苏大学 Intelligent judgment method for fruit freshness based on coordination of visual information and olfactory information
CN106709511A (en) * 2016-12-08 2017-05-24 华中师范大学 Urban rail transit panoramic monitoring video fault detection method based on depth learning
CN106778657A (en) * 2016-12-28 2017-05-31 南京邮电大学 Neonatal pain expression classification method based on convolutional neural networks
CN107194323A (en) * 2017-04-28 2017-09-22 阿里巴巴集团控股有限公司 Car damage identification image acquiring method, device, server and terminal device
CN107203778A (en) * 2017-05-05 2017-09-26 平安科技(深圳)有限公司 PVR intensity grade detecting system and method
CN107292267A (en) * 2017-06-21 2017-10-24 北京市威富安防科技有限公司 Photo fraud convolutional neural networks training method and human face in-vivo detection method
CN107609464A (en) * 2017-07-24 2018-01-19 南京邮电大学 A kind of real-time high-precision human face quick detection method

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
JIAN BO YANG et al.: "Deep Convolutional Neural Networks On Multichannel Time Series For Human Activity Recognition", 《PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI 2015)》 *
LI-YEN CHANG et al.: "Analysis of traffic injury severity: An application of non-parametric classification tree techniques", 《ACCIDENT ANALYSIS AND PREVENTION》 *
MAHER IBRAHIM SAMEEN et al.: "Applications of Deep Learning in Severity Prediction of Traffic Accidents", 《GCEC 2017》 *
MAHER IBRAHIM SAMEEN et al.: "Severity Prediction of Traffic Accidents with Recurrent Neural Networks", 《APPL SCI 2017》 *
胡骥 (Hu Ji) et al.: "Analysis of factors influencing traffic accident severity based on ordered Logit and Probit models", 《Journal of Safety and Environment》 *
马壮林 (Ma Zhuanglin) et al.: "Research on a prediction model for the severity of highway tunnel traffic accidents", 《China Safety Science Journal》 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109697852A (en) * 2019-01-23 2019-04-30 吉林大学 Urban road congestion degree prediction technique based on timing traffic events
CN110516228A (en) * 2019-07-04 2019-11-29 湖南星汉数智科技有限公司 Name entity recognition method, device, computer installation and computer readable storage medium
CN110569839A (en) * 2019-08-09 2019-12-13 河海大学常州校区 Bank card number identification method based on CTPN and CRNN
CN110569839B (en) * 2019-08-09 2023-05-16 河海大学常州校区 Bank card number identification method based on CTPN and CRNN
CN110648537A (en) * 2019-09-28 2020-01-03 安徽百诚慧通科技有限公司 Traffic accident correlation analysis method based on Haen's law
CN111009323A (en) * 2019-11-12 2020-04-14 河北工业大学 KNN-ANN-based prediction method for subdural hematoma injury
CN111009323B (en) * 2019-11-12 2023-11-10 河北工业大学 KNN-ANN-based subdural hematoma damage prediction method
CN111401828A (en) * 2020-02-28 2020-07-10 上海近屿智能科技有限公司 Dynamic intelligent interviewing method, device and equipment for strengthening sorting and computer storage medium
CN111339978A (en) * 2020-03-02 2020-06-26 北京建筑大学 Method for recognizing traffic index time series mode by using convolutional neural network model
CN112131794A (en) * 2020-09-25 2020-12-25 天津大学 Hydraulic structure multi-effect optimization prediction and visualization method based on LSTM network
WO2022099608A1 (en) * 2020-11-13 2022-05-19 金序能 Method for acquiring attribute category of traffic accident on highway
CN112839034B (en) * 2020-12-29 2022-08-05 湖北大学 Network intrusion detection method based on CNN-GRU hierarchical neural network
CN112839034A (en) * 2020-12-29 2021-05-25 湖北大学 Network intrusion detection method based on CNN-GRU hierarchical neural network
CN112860854A (en) * 2021-01-29 2021-05-28 深圳蓝贝科技有限公司 Online monitoring and fault repairing system and method for vending machine
CN112804253B (en) * 2021-02-04 2022-07-12 湖南大学 Network flow classification detection method, system and storage medium
CN112804253A (en) * 2021-02-04 2021-05-14 湖南大学 Network flow classification detection method, system and storage medium
CN113344254A (en) * 2021-05-20 2021-09-03 山西省交通新技术发展有限公司 Method for predicting traffic flow of expressway service area based on LSTM-LightGBM-KNN
CN114582126A (en) * 2022-03-04 2022-06-03 深圳市综合交通与市政工程设计研究总院有限公司 Intelligent management and control method and system suitable for ultra-long tunnel traffic and giving consideration to efficiency safety
US11443623B1 (en) 2022-04-11 2022-09-13 King Fahd University Of Petroleum And Minerals Real time traffic crash severity prediction tool

Also Published As

Publication number Publication date
CN109034264B (en) 2021-11-19

Similar Documents

Publication Publication Date Title
CN109034264A (en) Traffic accident seriousness predicts CSP-CNN model and its modeling method
Zheng et al. Traffic accident’s severity prediction: A deep-learning approach-based CNN network
He et al. Mining transition rules of cellular automata for simulating urban expansion by using the deep learning techniques
CN109034448B (en) Trajectory prediction method based on vehicle trajectory semantic analysis and deep belief network
Pradhan A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using GIS
Wang et al. A band selection method for airborne hyperspectral image based on chaotic binary coded gravitational search algorithm
Montalbo et al. Classification of fish species with augmented data using deep convolutional neural network
CN110321862B (en) Pedestrian re-identification method based on compact ternary loss
CN104966106A (en) Biological age step-by-step predication method based on support vector machine
Hatim et al. Addressing challenges and demands of intelligent seasonal rainfall forecasting using artificial intelligence approach
CN116469561A (en) Breast cancer survival prediction method based on deep learning
CN116307103A (en) Traffic accident prediction method based on hard parameter sharing multitask learning
CN117077005B (en) Optimization method and system for urban micro-update potential
Shanthi et al. Gender specific classification of road accident patterns through data mining techniques
Rajeshwari et al. Dermatology disease prediction based on firefly optimization of ANFIS classifier
Nour et al. Road traffic accidents injury data analytics
CN110335160A (en) A kind of medical treatment migratory behaviour prediction technique and system for improving Bi-GRU based on grouping and attention
AlAfandy et al. Artificial neural networks optimization and convolution neural networks to classifying images in remote sensing: A review
Zhou et al. Advanced CRITIC–GRA–GMM model with multiple restart simulation for assuaging decision uncertainty: An application to transport safety engineering for OECD members
CN111708865B (en) Technology forecasting and patent early warning analysis method based on improved XGboost algorithm
Xu et al. MM-UrbanFAC: Urban functional area classification model based on multimodal machine learning
CN114049522A (en) Garbage classification system based on deep learning
Romalt et al. Prediction of Cardio Vascular Disease by Deep Learning and Machine Learning-A Combined Data Science Approach
Sankaran et al. An automated prediction of remote sensing data of Queensland-Australia for flood and wildfire susceptibility using BISSOA-DBMLA scheme
Jawla et al. Crime Forecasting using Folium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20211119