CN111967712A - Traffic risk prediction method based on complex network theory - Google Patents

Traffic risk prediction method based on complex network theory

Info

Publication number
CN111967712A
CN111967712A CN202010649490.3A CN202010649490A CN111967712A CN 111967712 A CN111967712 A CN 111967712A CN 202010649490 A CN202010649490 A CN 202010649490A CN 111967712 A CN111967712 A CN 111967712A
Authority
CN
China
Prior art keywords
traffic
network
grid
model
road
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010649490.3A
Other languages
Chinese (zh)
Other versions
CN111967712B (en)
Inventor
李大庆
郑参
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202010649490.3A
Publication of CN111967712A
Application granted
Publication of CN111967712B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/10 Geometric CAD
    • G06F30/18 Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G06Q50/40
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0133 Traffic data processing for classifying traffic situation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention provides a traffic risk prediction method based on complex network theory, comprising the following steps: step A: dividing grids based on empirical data and constructing a double-layer traffic network model; step B: extracting and screening features based on complex network theory; step C: predicting risk based on ensemble learning theory; step D: evaluating and verifying the model. Through these steps, both the functional and the structural dimensions of the traffic system are considered together, providing scientific and reliable technical and theoretical support for identifying traffic risks, and supporting risk diagnosis of the traffic system, the formulation of targeted management and control measures, and the improvement of traffic operation reliability. The method is systematic, highly portable and easy to operate, and addresses the difficulty of identifying and predicting risks in a complex traffic system.

Description

Traffic risk prediction method based on complex network theory
Technical Field
The invention provides a traffic risk prediction method based on a complex network theory, and relates to the technical fields of risk analysis, network science and the like.
Background
Risk refers to an event that may occur and that, if it occurs, can impede the development of a system or even cause it to fail; it is also defined as the uncertainty of whether an event occurs. Risk exists objectively in a system: precautionary measures can prevent or reduce the loss it causes, but the risk itself cannot be eliminated. In a complex system, risks often occur suddenly, spread widely and are highly destructive, which makes system risks hard to identify, predict and prevent and poses new challenges for research on risk management, control and prevention. Because the losses caused by system risks can seriously affect people's lives and even the operation of society, it is necessary to predict risks in complex systems accurately with scientific and reasonable methods. The traffic system plays an important role in travel and urban operation. In recent years, with the rapid development of the mobile internet and in-vehicle technology, the traffic system has become highly complex in both structure and function. In a complex and changing environment and under changing demand, the traffic system faces man-made and natural risk events such as traffic accidents, road closures for construction, rainstorms and snow disasters. These events often cause traffic congestion. Because the traffic system evolves in space and time, a risk event can spread through the system after it occurs, adding substantial extra cost to residents' travel and wasting social resources.
In current research on risk identification and prediction for traffic systems, the main approaches include model-based analysis, both qualitative and quantitative. In particular, the structure and function of the system are described using a Process Flow Diagram (PFD) and grey correlation analysis, and risk is identified and predicted by analysing and quantifying how system deviations arise and how strongly the influencing factors are correlated. In addition, with the arrival of the big-data era and the development of its technologies, knowledge-based analysis methods have emerged, mainly causal relationship models, machine learning models and deep learning models. These methods are based on empirical data generated by the traffic system, such as traffic flow and vehicle speed; they build a historical data set and apply a model to it to discover and reveal unknown relationships and patterns in the data, and thereby identify and predict the risk state of the traffic system. These methods predict traffic system risk only from the state of the system, using known models and data. They do not dynamically consider, at the network level, the correlations and evolution patterns among risks in the traffic system, and they have difficulty explaining the internal mechanism by which traffic system risk forms.
Therefore, for traffic systems with high structural and functional complexity, the invention combines complex network theory with machine learning to identify and predict traffic system risk. It offers a new perspective and a new method for research on risk identification, prediction, management and control in traffic systems, deepens the understanding of risk in traffic systems, and is of significance for ensuring their healthy and stable operation.
Disclosure of Invention
(I) Objects of the invention
The invention mainly addresses risk identification and prediction in the setting of a complex system with a network structure. Conventional methods analyse traffic system risk mainly from the system's function and cannot identify and predict it well. In view of the high complexity and spatio-temporal evolution of the traffic system, the invention takes the perspective of complex networks, considers both the function and the structure of the traffic system, and provides a traffic risk prediction method based on complex network theory. The method can effectively identify and predict traffic system risks, and supports risk diagnosis, the formulation of targeted management and control measures, and the improvement of traffic operation reliability.
(II) Technical scheme
To achieve the above object, the invention adopts the following technical scheme: a traffic risk prediction method based on complex network theory.
The traffic risk prediction method based on complex network theory of the invention comprises the following steps:
step A: dividing grids based on empirical data to construct a double-layer traffic network model;
step B: extracting and screening features based on complex network theory;
step C: predicting risk based on ensemble learning theory;
step D: evaluating and verifying the model.
Through the above steps, risk prediction for the traffic system can be achieved. The method is systematic, highly portable and easy to operate, and addresses the difficulty of identifying and predicting risks in a complex traffic system.
Step A, "dividing grids based on empirical data to construct a double-layer traffic network model", is carried out as follows. First, basic road information for the study area is acquired, consisting mainly of two parts: road information of the traffic network and longitude and latitude information of road intersections. According to the extent and size of the study area and the longitude and latitude of road sections and intersections, the area is divided into N × M grid areas, which are then labelled. Second, at the microscopic level, for each grid area a grid traffic congestion network model is constructed from actual traffic data using complex network theory and methods, with intersections inside the grid as nodes, road sections as edges and the relative speed of each road section as the edge weight. At the macroscopic level, each grid area is taken as a node, the existence of congested roads between two grids is the criterion for whether an edge connects them, and the number of congested roads between the grids is the edge weight; a grid node traffic network model is constructed using complex network theory and methods. The specific steps are as follows:
step A1: dividing grid areas based on geographic information;
step A2: preprocessing speed data to obtain the relative speed matrix $V_{ratio}$;
step A3: constructing the grid traffic congestion network model $G_1(N_1, L_1)$;
step A4: constructing the grid node traffic network model $G_2(N_2, L_2)$;
In step A1, "dividing grid areas based on geographic information" is carried out as follows. First, Python is used to extract, from geographic information system (MapInfo) files, the traffic network model and road information needed to divide the grid areas. The extracted information mainly includes the vehicle speed on each road at each moment, the longitude and latitude of intersections, and the network topology of the traffic system under study. To obtain intersection coordinates, the invention uses Python to call the Baidu Maps application programming interface (API) and traverses the road network sequentially, matching the road network topology against intersection names. Roads for which coordinate lookup fails because intersection names differ between Baidu Maps and MapInfo are handled separately, yielding an accurate and standardised longitude and latitude data set for the road network. Second, the area S and the longitude and latitude ranges of the study area are calculated from the road and intersection information obtained, and the number of grids N × M is chosen according to the actual conditions of the study area, so that each grid has area S/(N × M). Finally, for each grid area, the intersections that fall inside it are determined from the longitude and latitude of every intersection in the traffic network and recorded;
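The last part of step A1, assigning intersections to grid cells, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `assign_to_grid`, the input layout (a dict of intersection name to longitude and latitude) and the equal-width split of the coordinate ranges are all hypothetical choices.

```python
from collections import defaultdict


def assign_to_grid(intersections, n_rows, n_cols):
    """Map each intersection to a (row, col) grid cell.

    intersections: dict of name -> (longitude, latitude); hypothetical layout.
    n_rows, n_cols: number of grid rows (latitude) and columns (longitude).
    """
    lons = [lon for lon, _ in intersections.values()]
    lats = [lat for _, lat in intersections.values()]
    lon_min, lon_max = min(lons), max(lons)
    lat_min, lat_max = min(lats), max(lats)
    # Equal-width cells; fall back to 1.0 if the study area has zero extent.
    lon_step = (lon_max - lon_min) / n_cols or 1.0
    lat_step = (lat_max - lat_min) / n_rows or 1.0

    cells = defaultdict(list)
    for name, (lon, lat) in intersections.items():
        # Points on the upper boundary are clamped into the last cell.
        col = min(int((lon - lon_min) / lon_step), n_cols - 1)
        row = min(int((lat - lat_min) / lat_step), n_rows - 1)
        cells[(row, col)].append(name)
    return dict(cells)
```

Equal-width cells in degrees are only approximately equal in area; a production version would likely project the coordinates first.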
The "preprocessing speed data to obtain the relative speed matrix $V_{ratio}$" described in step A2 is carried out as follows. First, based on actual traffic operation data from vehicle Global Positioning System (GPS) devices, at any moment $t_i$ the speeds of all R roads are written, in road order, as a vector $V_i = (v_1, v_2, \ldots, v_R)$. This is repeated for all T moments, and the velocity vectors $V_i$ of all moments are combined into an initial velocity matrix $V_{T \times R}$.
Second, when speed information is collected with floating car technology, the speed of every area at every moment cannot be fully collected and retained because of limitations of network communication and human and natural factors. The original velocity matrix $V_{T \times R}$ therefore contains missing values (recorded as 0), and speed compensation is required: the missing values, i.e. the elements equal to 0, must be found and compensated. To compensate the speed of road $R_j$ at moment $t_i$, first find the set $\Gamma(R_j)$ of roads neighbouring $R_j$ in the road network G(N, L) and check whether the roads in this set have speed records at that moment. If at least one element of the set has a speed record, take the mean of the non-zero elements of the set:
$$v_j^{comp}(t_i) = \frac{1}{J} \sum_{R_k \in \Gamma(R_j),\ v_k(t_i) \neq 0} v_k(t_i)$$
In the above formula, $v_j^{comp}(t_i)$ is the speed compensation value of the speed-missing road $R_j$ at moment $t_i$, the sum runs over the non-zero element values in the neighbouring road set $\Gamma(R_j)$ of $R_j$, and J is the number of non-zero elements in $\Gamma(R_j)$;
If none of the roads neighbouring $R_j$ has a speed record, the speed of $R_j$ is left at 0 in this pass. After each compensation the original velocity matrix $V_{T \times R}$ is updated to the compensated matrix. The process is repeated at each moment until all 0 values in the velocity matrix have been compensated, giving the completed velocity matrix $V'_{T \times R}$.
After road speed compensation of the original absolute velocity matrix $V_{T \times R}$ is complete, the compensated matrix is normalised to obtain relative speeds, because roads of different grades have different speed ranges and a unified criterion is needed. For any road $R_j$, extract from the velocity matrix $V'_{T \times R}$ the speed vector of the road over all moments, $(v_j(t_1), v_j(t_2), \ldots, v_j(t_T))$, and the maximum speed limit of the road section, $v_j^{max}$. Dividing each element of the speed vector by $v_j^{max}$ gives the normalised speed $v_{j\_ratio}(t_i)$, and hence the normalised velocity matrix $V_{ratio}$, as follows:
$$v_{j\_ratio}(t_i) = \frac{v_j(t_i)}{v_j^{max}}, \qquad V_{ratio} = \left(v_{j\_ratio}(t_i)\right)_{T \times R}$$
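The compensation and normalisation of step A2 can be sketched in plain Python. The sketch assumes the speed matrix is a list of T rows of R speeds, that a missing value is stored as 0, and that road adjacency is given as a dict of road index to neighbouring road indices; the names `compensate_speeds` and `relative_speeds` are hypothetical.

```python
def compensate_speeds(speeds, neighbors):
    """Fill missing speeds (stored as 0) with the mean of the non-zero
    speeds on neighbouring roads at the same moment.

    speeds: list of T rows, each a list of R road speeds.
    neighbors: dict of road index -> list of neighbouring road indices.
    """
    filled = [row[:] for row in speeds]  # leave the input untouched
    for row in filled:
        changed = True
        while changed:  # repeat until no further zero can be filled
            changed = False
            for j, v in enumerate(row):
                if v != 0:
                    continue
                known = [row[k] for k in neighbors.get(j, []) if row[k] != 0]
                if known:
                    # Update in place so later roads can use this value.
                    row[j] = sum(known) / len(known)
                    changed = True
    return filled


def relative_speeds(filled, speed_limits):
    """Divide each road's speed by that road's maximum speed limit."""
    return [[v / speed_limits[j] for j, v in enumerate(row)] for row in filled]
```

A zero that has no recorded speed anywhere in its connected neighbourhood stays 0 in this sketch; how the original method treats that case is not stated.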
The "constructing the grid traffic congestion network model $G_1(N_1, L_1)$" described in step A3 is carried out as follows. For each grid area divided in step A, first, software tools such as Python and MapInfo are used to extract, from the actual map data of the grid area, the structural information between roads and the road intersections it contains. Second, a suitable geographic coverage of the traffic network is selected according to the needs of the study, for example the traffic network within Beijing's Fifth Ring Road. Then, following the complex network approach, road intersections in each grid area are abstracted as nodes, roads in the grid area's traffic network as edges between nodes, and the relative speed of each road as the edge weight, which establishes a grid traffic congestion network in each grid area. Because most roads in the traffic network carry traffic in both directions and are therefore directional, the grid traffic congestion network constructed by this method is a directed weighted network;
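A directed weighted network of this kind can be held in a nested dictionary. The sketch below is one possible representation, not the one used in the patent: the tuple layout `(road_id, from_intersection, to_intersection)` and the function name `build_grid_network` are assumptions made for illustration.

```python
def build_grid_network(roads, rel_speed, grid_nodes):
    """Build the directed weighted network of one grid area.

    roads: list of (road_id, from_intersection, to_intersection).
    rel_speed: dict of road_id -> relative speed at one moment.
    grid_nodes: set of intersections that lie inside the grid area.
    Returns a dict u -> {v: weight} with the relative speed as edge weight.
    """
    graph = {node: {} for node in grid_nodes}
    for road_id, u, v in roads:
        # Keep only roads whose two end points are both inside the grid.
        if u in grid_nodes and v in grid_nodes:
            graph[u][v] = rel_speed[road_id]
    return graph
```

A two-way road appears as two entries, one per direction, each with its own relative speed.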
The "constructing the grid node traffic network model $G_2(N_2, L_2)$" described in step A4 is carried out as follows. First, a traffic network model of the roads crossing between grids is constructed from the intersections contained in each grid area and the road topology of the traffic network of the whole study area (the whole network); that is, starting from the whole network, the road topology contained inside the grid areas is deleted. Second, the number of congested roads between each pair of grid areas is counted and recorded. Finally, using complex network theory and methods and the above information, grid areas are abstracted as nodes, the existence of congested roads between two grids determines whether an edge connects them, and the number of congested roads between the grids is the edge weight, which establishes the grid node traffic network model.
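Counting congested roads between grids, as step A4 describes, might look like the sketch below. Two points are assumptions rather than statements from the text: a road counts as congested when its relative speed is at or below a threshold `q` (the rule given later in step B1), and edges between grids are kept directed. The name `build_grid_node_network` is hypothetical.

```python
from collections import Counter


def build_grid_node_network(roads, rel_speed, cell_of, q):
    """Build the macroscopic network whose nodes are grid areas.

    roads: list of (road_id, from_intersection, to_intersection).
    rel_speed: dict of road_id -> relative speed at one moment.
    cell_of: dict of intersection -> grid cell label.
    q: congestion threshold (assumed rule: congested if speed <= q).
    Returns a dict (cell_u, cell_v) -> number of congested roads.
    """
    weights = Counter()
    for road_id, u, v in roads:
        cu, cv = cell_of[u], cell_of[v]
        # Roads inside a single grid are dropped; only crossings count.
        if cu != cv and rel_speed[road_id] <= q:
            weights[(cu, cv)] += 1
    return dict(weights)
```

A pair of grids with no congested crossing road has no entry, which matches the rule that such grids are not connected.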
Step B, "extracting and screening features based on complex network theory", is carried out as follows. First, for the grid traffic congestion network and the grid node traffic network (together called the double-layer traffic network) at each moment $t_i$, a seepage (percolation) threshold q(t) is set, and its value is determined by seepage analysis of the double-layer traffic network. Second, at the seepage threshold q(t) of each moment, the features of each grid area are extracted from each grid traffic congestion network and from the nodes (grids) of the grid node traffic network using complex network theory and methods. These structural and functional features include the largest congested cluster, node betweenness, mean node degree, average speed of the grid congestion network and the number of congested roads among first-order neighbours. On this basis the extracted features are screened with a machine learning method: features that contribute strongly to traffic risk identification and prediction are selected to build a high-quality sample feature set, improving the effect and efficiency of risk identification and prediction as far as possible. Meanwhile, each grid area at moment t is labelled according to the proportion of congested roads in that grid area at moment t + Δt. The specific steps are as follows:
step B1: analyzing seepage of a traffic network;
step B2: extracting risk features based on a complex network;
step B3: screening risk characteristics based on machine learning;
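The labelling rule in the step B overview (a grid at moment t is labelled by the share of congested roads in it at moment t + Δt) can be sketched as below. The cut-off `risk_ratio` that turns the share into a 0/1 label is a hypothetical parameter; the text does not give its value. The function name and data layout are also assumptions.

```python
def label_grids(rel_speed_by_time, roads_in_cell, q, delta, risk_ratio=0.5):
    """Label each (moment, grid) pair as low risk (0) or high risk (1).

    rel_speed_by_time: list over moments of dict road_id -> relative speed.
    roads_in_cell: dict of grid label -> list of road ids inside the grid.
    q: congestion threshold (congested if relative speed <= q).
    delta: look-ahead in time steps (the Δt of the text).
    risk_ratio: hypothetical cut-off on the congested share.
    """
    labels = {}
    for t in range(len(rel_speed_by_time) - delta):
        future = rel_speed_by_time[t + delta]
        for cell, road_ids in roads_in_cell.items():
            congested = sum(1 for r in road_ids if future[r] <= q)
            share = congested / len(road_ids) if road_ids else 0.0
            labels[(t, cell)] = 1 if share >= risk_ratio else 0
    return labels
```

The last `delta` moments receive no label because their future state is not observed.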
In step B1, "traffic network seepage analysis" is carried out as follows. Seepage (percolation) theory is applied to the double-layer traffic network. First, for the traffic network at each moment a control variable, the seepage threshold q(t), is given, so that each road in the traffic network is in one of two states: unblocked, i.e. $v_{i\_ratio}(t) > q(t)$, or congested, i.e. $v_{i\_ratio}(t) \le q(t)$. Unblocked edges are deleted from the original network and congested edges are kept; the remaining network is the traffic network in the congested state at moment t, called the congested network for short. At each moment every value of q(t) corresponds to one congested network. As q(t) decreases, only more severely congested roads remain: more edges are deleted and the congested network becomes sparser. A suitable seepage threshold q(t) is therefore selected, namely the one at which the urban traffic network carries the most congestion information, and the traffic congestion risk at the current moment is identified and predicted at that threshold;
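The edge-deletion rule of step B1 can be illustrated with a small sketch that keeps the congested edges for a given q and measures the largest connected cluster of the result. Treating clusters as weakly connected (edge direction ignored) and using a union-find structure are assumptions of this sketch, and it only tabulates cluster size against q; the text does not state the exact rule for picking the final threshold.

```python
def congested_edges(edges, q):
    """Keep only congested edges: relative speed at or below q.

    edges: list of (u, v, relative_speed).
    """
    return [(u, v) for u, v, w in edges if w <= q]


def largest_cluster_size(edge_list):
    """Number of nodes in the largest weakly connected cluster."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edge_list:
        parent[find(u)] = find(v)  # union the two clusters
    sizes = {}
    for node in list(parent):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values(), default=0)


def percolation_curve(edges, thresholds):
    """Largest congested cluster size for each candidate threshold q."""
    return {q: largest_cluster_size(congested_edges(edges, q)) for q in thresholds}
```

Plotting this curve for one moment shows how quickly the congested network fragments as q is lowered.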
The "risk feature extraction based on complex network" described in step B2 is carried out as follows. In this step, the grid traffic congestion network and the grid node traffic network are constructed at the seepage threshold q(t) of each moment. From the viewpoint of statistical physics, complex network theory and methods are applied to preliminarily extract microscopic and macroscopic features of the grid areas of the double-layer traffic network at each moment, from the two perspectives of structure and function. First, at the microscopic level, each grid traffic congestion network is taken as the object of study and the microscopic features of each grid area are calculated at the critical seepage threshold of each moment. The grid traffic congestion network has different features at different moments, and the congested network inside a grid area changes dynamically in space as time evolves, so the grid traffic congestion network has spatio-temporal characteristics. Second, at the macroscopic level, the nodes (grid areas) of the constructed grid node traffic network model are taken as the objects of study and the macroscopic features of each grid area (node) are calculated at each moment, as shown in fig. 2. Examples of microscopic features are the largest congested cluster of the grid traffic congestion network, mean node betweenness, mean node degree, mean clustering coefficient, average speed and growth rate of the congested network; examples of macroscopic features are the average path length, node strength, node betweenness, node degree and growth rate of nodes in the grid node traffic network;
The invention thus provides a way of extracting features from the perspective of complex networks, with the grid features above given as examples. For an actual traffic system, features can be preliminarily extracted in a targeted manner from its structure and its function according to its actual background and conditions, so as to build a sample feature set and construct an initial feature matrix $M_f$.
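A few of the microscopic features named above can be computed directly from the nested-dictionary network. The sketch covers only three of them (mean node degree in the congested network, average relative speed, number of congested roads); the feature names, the function `grid_features` and the choice to count degree as in-degree plus out-degree are assumptions for illustration.

```python
def grid_features(graph, q):
    """Compute a small feature vector for one grid area at one moment.

    graph: dict u -> {v: relative_speed} (directed weighted network).
    q: congestion threshold (congested if relative speed <= q).
    """
    nodes = set(graph)
    for nbrs in graph.values():
        nodes.update(nbrs)
    weights = [w for nbrs in graph.values() for w in nbrs.values()]

    # Degree in the congested network, counted as in-degree + out-degree.
    degree = {n: 0 for n in nodes}
    congested = 0
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            if w <= q:
                degree[u] += 1
                degree[v] += 1
                congested += 1

    return {
        "mean_degree": sum(degree.values()) / len(degree) if degree else 0.0,
        "mean_speed": sum(weights) / len(weights) if weights else 0.0,
        "congested_roads": congested,
    }
```

Stacking one such dictionary per grid and per moment gives rows of an initial feature matrix.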
The "risk feature screening based on machine learning" described in step B3 is carried out as follows. Step B2 extracts the functional and structural features of the grid areas at each moment using complex network knowledge and constructs the initial feature matrix $M_f$. To improve the accuracy and precision of risk identification and prediction in the traffic system, this step applies machine learning methods to perform feature selection on the preliminarily constructed sample feature set, screening out a high-quality sample feature set and improving the effect of risk identification and prediction as far as possible. Screening the structural and functional features of the traffic system, keeping the important ones and removing the irrelevant ones, also alleviates the curse of dimensionality, makes the learning task easier, reduces over-fitting and strengthens the generalisation ability of the machine learning model. In view of the highly complex spatio-temporal evolution of the traffic system, and in order to optimise for a given learner, the invention performs feature selection with the classical wrapper method LVW (Las Vegas Wrapper), as shown in fig. 3. The specific steps are as follows:
(1) set the initial optimal error E to infinity, the current optimal feature subset to the complete attribute set A, and the repetition count t to 0;
(2) randomly generate a feature subset A', and calculate the classifier error E' when this feature subset is used;
(3) if E' is smaller than E, let A = A' and E = E', and repeat steps (2) and (3); otherwise let t = t + 1, and exit the loop when t is greater than or equal to the stop control parameter T;
During the calculation, the LVW method directly uses the performance of the learner that will finally be used as the evaluation criterion for feature subsets, so that the selected feature subset is tailored to, and most favourable for, the given learner. This screens out a high-quality sample feature set, from which the feature matrix $M_f'$ is constructed;
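The LVW loop in steps (1) to (3) can be sketched as follows. The error function is supplied by the caller, since the text ties it to whichever learner is finally used. Two details are assumptions of this sketch: each feature is included in a random subset with probability 0.5, and the stall counter is reset to 0 after every improvement, as in common textbook descriptions of LVW. The name `lvw` is hypothetical.

```python
import random


def lvw(features, error_of, max_stall, seed=0):
    """Las Vegas Wrapper feature selection (minimal sketch).

    features: list of feature names (the complete attribute set A).
    error_of: callable taking a feature subset and returning the
              learner's error on it (E' in the text).
    max_stall: stop control parameter T.
    """
    rng = random.Random(seed)
    best_subset = list(features)   # A starts as the complete set
    best_error = float("inf")      # E starts at infinity
    stall = 0                      # t in the text
    while stall < max_stall:
        candidate = [f for f in features if rng.random() < 0.5]
        if not candidate:          # an empty subset cannot be evaluated
            stall += 1
            continue
        err = error_of(candidate)
        if err < best_error:
            best_subset, best_error = candidate, err
            stall = 0              # assumed reset after an improvement
        else:
            stall += 1
    return best_subset, best_error
```

In practice `error_of` would wrap a cross-validated evaluation of the final classifier, which makes each iteration expensive; `max_stall` bounds the total cost.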
Step C, "risk identification and prediction based on ensemble learning theory", is carried out as follows. To identify and predict congestion risk in the traffic system accurately so that it can be controlled effectively, first, an ensemble learning model is constructed using machine learning and the related mathematics. Second, to eliminate the influence on the model of inconsistent dimensions (units and scales) among the feature vectors, a feature scaling method is used to standardise the data feature set $M_f'$, giving the standard sample feature matrix $M_f^{std}$. Finally, so that the model learns as much as possible about the characteristics of risk in the traffic system, the standard sample feature matrix $M_f^{std}$ is divided into a training set and a test set in a given proportion (a : b); the ensemble learning model is trained on the training set, and the trained model is then used to identify and predict risk in the grid areas of the traffic system at the current moment. The specific steps are as follows:
step C1: constructing an ensemble learning model;
step C2: carrying out risk identification and prediction by using an ensemble learning model;
In step C1, "constructing an ensemble learning model" is carried out as follows. The invention aims to learn a more stable and better-performing model from historical risk data of the traffic system. An ensemble learning model generally learns better than a single classifier, so to make up for the shortcomings of a single classifier the invention introduces ensemble learning theory and constructs an ensemble learning model to identify and predict traffic system risk. Ensemble learning combines several weakly supervised models to obtain a better and more comprehensive strongly supervised model; the underlying idea is that even if one weak classifier makes a wrong prediction, the other weak classifiers can correct it. The current mainstream ensemble learning frameworks are Bagging, Boosting and Stacking. The invention uses the Bagging framework and the associated theory to construct a random forest model for identifying and predicting traffic system risk, as shown in fig. 4. The implementation steps are as follows:
(1) assume a data set $D = \{x_{i1}, x_{i2}, \ldots, x_{in}, y_i\}$ $(i \in [1, m])$ with n features; sampling with replacement generates a sampling space of size $(m \times n)^{m \times n}$;
(2) build the base learners (decision trees): for each sample $d_j = \{x_{i1}, x_{i2}, \ldots, x_{ik}, y_i\}$ $(i \in [1, m])$, where k < n, generate a decision tree and record the result $h_j(x)$ of each decision tree;
(3) train T times to obtain $H(x) = \phi(h_1(x), h_2(x), \ldots, h_T(x))$, where $\phi$ is a combination rule such as absolute majority voting, relative majority voting or weighted voting;
The above process constructs a special binary classifier, the random forest model, which identifies and predicts risk in the traffic system. Here the classification function is a sign function whose output values 0 and 1 represent low risk and high risk in a grid area respectively:
$$f(x_i) = \begin{cases} 0, & \text{low risk} \\ 1, & \text{high risk} \end{cases}$$
In the above formula, $f(x_i)$ is the risk state of the i-th grid area, where 0 represents low risk and 1 represents high risk;
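The bagging procedure in steps (1) to (3) can be illustrated with a deliberately small sketch. To stay short it uses depth-one decision trees (stumps) as base learners rather than full decision trees, and relative majority voting as the combination rule; both are simplifications chosen for this example. All function names are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature_ids):
    """Depth-one tree: pick the (feature, threshold) split with the
    fewest training errors among the allowed features."""
    best = None
    for f in feature_ids:
        for thr in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    _, f, thr, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= thr else r_lab


def train_forest(rows, labels, n_trees, k, seed=0):
    """Bagging: each tree sees a bootstrap sample and k random features."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]   # sample with replacement
        feats = rng.sample(range(n_features), k)         # k < n features
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feats))
    return forest


def predict(forest, x):
    """Relative majority vote over the trees: 0 = low risk, 1 = high risk."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]
```

An odd number of trees avoids tied votes in the binary case. A real application would more likely rely on an established random forest implementation than on hand-written trees.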
When an ensemble learning model is constructed with ensemble learning theory to identify and predict traffic system risk, a suitable ensemble learning framework and model can also be selected according to the distribution of the data samples, so as to further improve the effect of risk identification and prediction;
In step C2, "risk identification and prediction using the ensemble learning model" is performed as follows. In this step, based on the high-quality sample feature set extracted and screened in step B, i.e. the feature matrix
Figure BDA0002574365680000102
risks in the traffic system are identified and predicted using the ensemble learning model constructed in step C1. Because differences in scale (dimension) between features in the historical sample data set affect the performance of the ensemble learning model, the sample feature set must first be feature-scaled before the model is used for risk identification and prediction. Scaling eliminates the influence of differing feature scales on model accuracy, improves the convergence rate of the model, and yields the standard sample feature matrix
Figure BDA0002574365680000103
The mainstream feature scaling methods in machine learning include min-max normalization, mean normalization, standardization and scaling to unit length. For the sample feature set of a traffic system
Figure BDA0002574365680000104
a suitable method can be selected from these mainstream feature scaling methods in practice, according to the condition of the actual traffic system, the characteristics of the feature set and the characteristics of the machine learning method applied, so as to maximize the accuracy and precision of risk identification and prediction in the traffic system;
After the features of the sample data set have been scaled, this step uses the standard sample feature matrix of the traffic system
Figure BDA0002574365680000105
to identify and predict risks in the traffic system with the ensemble learning model constructed in step C1. Because the ensemble learning model must learn the characteristics of risk in this process, the invention takes the standard sample feature set
Figure BDA0002574365680000106
and randomly divides it into a training set and a test set in a given ratio (a:b). The training set is used to train the random forest model so that it learns the characteristics of risk as fully as possible, and the test set is used to test how well the model has been trained.
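Two of the feature scaling methods named in step C2 can be sketched as follows. This is a minimal illustration: the function names standardize, min_max and scale_matrix are hypothetical, and using the population standard deviation for standardization is an assumption, since the source does not specify which estimator is used.

```python
from statistics import mean, pstdev

def standardize(column):
    """Standardization: subtract the mean and divide by the (population) std."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0] * len(column)          # constant feature carries no information
    return [(v - mu) / sigma for v in column]

def min_max(column):
    """Min-max normalization: map the feature linearly onto [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]

def scale_matrix(matrix, scaler):
    """Apply a column-wise scaler to a samples-by-features matrix (list of rows)."""
    columns = [scaler(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]

# Toy feature matrix (invented): column 0 is a speed, column 1 a road count.
features = [[30.0, 2.0], [60.0, 4.0], [90.0, 6.0]]
scaled = scale_matrix(features, min_max)
```

After scaling, both columns lie on the same range, so a feature measured in km/h no longer dominates one measured as a count.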
The model evaluation and verification in step D is performed as follows. When the ensemble learning model constructed in step C is used to identify and predict risk in the traffic system, the performance of the model must be evaluated accurately and scientifically. First, evaluation indexes are selected based on the condition of the actual traffic system and the final goal of the invention, for example accuracy, precision, recall and the F1 value, all of which are computed from the confusion matrix. Second, to prevent over-fitting and to evaluate the generalization ability of the model accurately, the ensemble learning model is evaluated by cross-validation, which further improves the rigor and reliability of the evaluation. The step comprises the following substeps:
step D1: selecting a model evaluation index;
step D2: evaluating and analyzing the model;
The "selecting model evaluation index" in step D1 is performed as follows. The invention identifies and predicts risk in the traffic system, and its final goal is to use the ensemble learning model to identify that risk accurately and scientifically. In essence this is an anomaly detection problem in machine learning, whose main characteristic is class imbalance: the number of normal samples is large and the number of risk samples is small. Accuracy alone therefore cannot objectively reflect model performance. For this risk identification scenario, the model is evaluated with two indexes, accuracy and recall, whose formulas are as follows:
Figure BDA0002574365680000111
Figure BDA0002574365680000112
In the formulas, Accuracy denotes the accuracy and recall denotes the recall; TP is the number of positive cases correctly predicted, TN is the number of negative cases correctly predicted, FP is the number of negative cases predicted as positive, and FN is the number of positive cases predicted as negative;
Mispredicted real risk units are of greater concern in the traffic system: if a real congestion risk is not identified, it can damage the traffic system severely once it occurs, so recall deserves more attention. At the same time, normal samples should be predicted as normal, so that the error rate on normal samples stays low and managers of the traffic system can control the real risks as accurately as possible under limited resources. For these reasons, accuracy and recall are both used as evaluation indexes of the model;
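The two indexes can be computed directly from the confusion-matrix counts defined above. The sketch below uses the standard definitions, Accuracy = (TP + TN) / (TP + TN + FP + FN) and recall = TP / (TP + FN); the function names and the toy labels are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, with 1 = high risk (positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

# Imbalanced toy example: 2 risk samples among 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)

# A model that always predicts "low risk" reaches the same accuracy with zero recall.
always_low = confusion_counts(y_true, [0] * 10)
```

In this example both predictors score 0.8 accuracy, but the always-low predictor has recall 0, which is why accuracy alone is misleading on imbalanced risk data.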
The "evaluation analysis of the model" in step D2 is performed as follows. To prevent over-fitting and to evaluate the generalization ability of the model accurately, the ensemble learning model is evaluated with a cross-validation method from machine learning, which further improves the rigor and reliability of the evaluation. The classical methods include the leave-one-out method, K-fold cross-validation and the bootstrap method. The invention uses the bootstrap method, with the following steps:
(1) Randomly select one sample at a time from a data set containing N samples and use it as a training sample;
(2) Put the selected sample back into the original data set, and repeat this sampling with replacement N times to generate a data set of the same size as the original; this new data set is the training set;
(3) After N draws, approximately
Figure BDA0002574365680000121
of the samples in the original data set will not appear in the new data set; these samples are taken as the validation set;
(4) Repeat the above steps M times to train M models and obtain M values of each evaluation index; the mean of these values is the performance evaluation value of the model.
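The bootstrap procedure above can be sketched as follows. For context, the probability that a given sample is never drawn in N draws with replacement is (1 - 1/N)^N, which tends to 1/e ≈ 0.368 as N grows; this is a standard result and is offered here only as background, not as a reading of the figure above. The function names and the fit_and_score callback are hypothetical.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement (training set);
    the indices never drawn form the validation set."""
    train_idx = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(train_idx)
    val_idx = [i for i in range(n_samples) if i not in drawn]
    return train_idx, val_idx

def bootstrap_evaluate(n_samples, fit_and_score, rounds=10, seed=0):
    """Repeat the split `rounds` times and average the metric returned by
    fit_and_score(train_idx, val_idx), a user-supplied callback."""
    rng = random.Random(seed)
    scores = []
    for _ in range(rounds):
        train_idx, val_idx = bootstrap_split(n_samples, rng)
        if val_idx:                          # skip the rare round with no held-out samples
            scores.append(fit_and_score(train_idx, val_idx))
    return sum(scores) / len(scores) if scores else float("nan")

train_idx, val_idx = bootstrap_split(1000, random.Random(0))
held_out_fraction = len(val_idx) / 1000
```

In practice fit_and_score would train the random forest on the rows in train_idx and return, for example, the recall measured on the rows in val_idx.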
Through the above steps, the invention combines complex network theory with ensemble learning and, from the perspective of complex networks, considers both the functional and the structural dimensions of the traffic system, providing scientific and reliable technical and theoretical support for identifying traffic risks. The method can identify and predict the risk of a traffic system efficiently and accurately, and supports risk diagnosis, the design of targeted management and control measures, and the improvement of traffic operation reliability.
(III) advantages and effects
The invention provides a traffic risk prediction method based on a complex network theory, which has the following advantages:
(1) Global perspective: the traffic network model is constructed at both the micro level and the macro level to extract its functional and structural features, which improves the accuracy of traffic risk prediction and is significant for understanding the risk evolution mechanism of the traffic system and improving its reliability;
(2) Timeliness: the invention can monitor the traffic state and predict future risk in real time, supporting the formulation and implementation of risk control strategies for the traffic system and thereby helping the system operate healthily and stably;
(3) Expandability: the risk prediction method provided by the invention can be extended to risk identification and prediction in other types of complex systems, such as biological, communication and financial systems;
(4) The method of the invention is scientific, practical to implement, and has wide popularization and application value.
Drawings
Fig. 1 is a flow chart of the traffic risk prediction method of the present invention.
Fig. 2 is the traffic risk feature hierarchy of the present invention.
Fig. 3 is a logic diagram of the wrapper feature selection process of the present invention.
Fig. 4 is the architecture diagram of the random forest model of the present invention.
Fig. 5 is a trend chart of the evaluation indexes of the random forest model of the present invention.
The numbers, symbols and codes in the figures are explained as follows:
S: the area of the region of interest;
V_i: the speed vector of the R roads at time t_i;
Figure BDA0002574365680000131
an initial velocity matrix;
Figure BDA0002574365680000132
compensating the normalized speed matrix;
G1(N1, L1): the grid traffic congestion network model;
G2(N2, L2): the grid node traffic network model;
q(t): the percolation (seepage) threshold of the traffic network at time t;
V_i_ratio: the normalized velocity vector;
M_f: the initial feature matrix;
Figure BDA0002574365680000133
the screened high-quality characteristic matrix;
Figure BDA0002574365680000141
a high-quality feature matrix after feature scaling;
f(x_i): the risk status of the i-th grid area;
Accuracy: the model accuracy;
recall: the model recall;
TP: the number of positive cases correctly predicted;
TN: the number of negative cases correctly predicted;
FP: the number of negative cases predicted as positive;
FN: the number of positive cases predicted as negative.
Detailed Description
In order to make the technical problems and technical solutions to be solved by the present invention clearer, the following detailed description is made with reference to the accompanying drawings and specific embodiments. It is to be understood that the embodiments described herein are for purposes of illustration and explanation only and are not intended to limit the invention.
The invention is further described with reference to the following description and embodiments in conjunction with the accompanying drawings.
The actual traffic system data used in this embodiment were provided by QF technology company and consist of real-time floating-car speed data for each road section of all roads within the Fifth Ring area of Beijing over a certain time span. The data have a fine time granularity of 1 minute and cover 0:00 to 23:59, for a total of 1440 moments per day. The data of October 20, 2015 are used for research and analysis in this embodiment.
The traffic risk prediction method based on the complex network theory of the embodiment of the invention is shown in figure 1, and the specific implementation steps are as follows:
Step A: dividing grids based on empirical data and constructing a double-layer traffic network model;
Step B: extracting and screening features based on complex network theory;
Step C: predicting risk based on ensemble learning theory;
Step D: evaluating and verifying the model.
Through the steps, the purpose of risk prediction of the traffic system can be achieved, the method is strong in systematicness, high in transportability and easy to operate, and the problem that risks in a complex traffic system are difficult to identify and predict is solved.
Step A, "dividing grids based on empirical data and constructing a double-layer traffic network model", is performed as follows. First, basic road information for the study area is acquired; it consists mainly of two parts, the road information of the traffic network and the longitude and latitude of the road intersections. According to the area of the study region and the longitude and latitude of the road sections and intersections, the region is divided into N×M grid areas, and each grid area is labeled. Second, at the micro level, a grid traffic congestion network model is constructed for each grid area from the actual traffic data using complex network theory, with the intersections inside the grid as nodes, the road sections as edges and the relative speed of each road section as the edge weight. At the macro level, a grid node traffic network model is constructed using complex network theory, with each grid area as a node, the existence of congested roads between two grids as the criterion for connecting them by an edge, and the number of congested roads between the grids as the edge weight. The step comprises the following substeps:
Step A1: dividing grid areas based on geographic information;
step A2: preprocessing speed data to obtain relative speed matrix
Figure BDA0002574365680000151
Step A3: construction of grid traffic congestion network model G1(N1,L1);
Step A4: construction of mesh node traffic network model G2(N2,L2);
In step A1, "dividing grid areas based on geographic information" is performed as follows. First, the traffic network model and the road information required for grid division are extracted from the Mapinfo files using Python; the extracted information mainly includes the vehicle speed of each road at each moment, the longitude and latitude of the intersections, and the network topology of the Beijing Fifth Ring traffic system. Second, from the road information and the intersection coordinates, the area S within the Beijing Fifth Ring is calculated to be 667 square kilometers, with a longitude range of 116.20 to 116.56 and a latitude range of 39.76 to 40.03. Based on the actual conditions within the Fifth Ring, the number of grids is set to 2500, so that each grid cell measures approximately 516 m × 516 m. Finally, for each grid area, the intersections that fall inside it are determined from the longitude and latitude of each intersection in the traffic network and recorded.
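The grid assignment in step A1 can be sketched as follows. The bounding box, the area of 667 square kilometers and the count of 2500 cells come from the text; laying the 2500 cells out as 50 rows by 50 columns, the row-major numbering, and the function name grid_id are assumptions made for this sketch.

```python
import math

LON_MIN, LON_MAX = 116.20, 116.56
LAT_MIN, LAT_MAX = 39.76, 40.03
ROWS = COLS = 50                      # assumption: 2500 cells arranged as 50 x 50

def grid_id(lon, lat):
    """Return the row-major index of the grid cell containing an intersection."""
    if not (LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX):
        raise ValueError("point lies outside the study area")
    col = min(int((lon - LON_MIN) / (LON_MAX - LON_MIN) * COLS), COLS - 1)
    row = min(int((lat - LAT_MIN) / (LAT_MAX - LAT_MIN) * ROWS), ROWS - 1)
    return row * COLS + col

# Side length of one cell if 667 km^2 is split into 2500 equal squares, in metres.
cell_side_m = math.sqrt(667e6 / 2500)
```

The computed side length is a little over 516 m, which agrees with the grid size stated above.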
In step A2, "preprocessing speed data to obtain the relative speed matrix
Figure BDA0002574365680000161
" is performed as follows. First, actual traffic operation data are acquired from vehicle-mounted Global Positioning System (GPS) devices. At any time t_i, the speeds of all R roads are expressed, in road order, as a vector V_i = (v_1, v_2, …, v_R). This process is repeated for all T moments, and the velocity vectors V_i of all moments are finally combined into the initial velocity matrix
Figure BDA0002574365680000162
Second, when the speed information of the Beijing Fifth Ring traffic system is collected with floating-car technology, the speed of every area at every moment cannot be completely collected and retained, owing to limitations of network communication and to human and natural factors. The original speed information therefore needs speed compensation; that is, the original speed matrix
Figure BDA0002574365680000163
contains some missing values (recorded as 0), so the missing speed values in the velocity matrix
Figure BDA0002574365680000164
i.e. the elements equal to 0, must be located and compensated. To compensate the speed of road R_j at time t_i, first find in the road network G(N, L) the set of roads neighboring R_j
Figure BDA0002574365680000165
and check whether the roads in this set have speed records at that moment. If at least one element of the set has a speed record, the compensated speed is the mean of the recorded (nonzero) elements of the set, according to the following formula:
Figure BDA0002574365680000166
In the above formula,
Figure BDA0002574365680000167
denotes the speed compensation value of the speed-missing road R_j at time t_i,
Figure BDA0002574365680000168
denotes the sum of the nonzero elements in the set of neighboring roads
Figure BDA0002574365680000169
of the speed-missing road R_j, and J denotes the number of nonzero elements in the set of neighboring roads
Figure BDA00025743656800001610
of R_j.
If none of the roads neighboring R_j has a speed record, the speed of R_j is compensated to 0. After each compensation, the original velocity matrix
Figure BDA00025743656800001611
is updated to the compensated matrix
Figure BDA00025743656800001612
The above process is repeated at each moment until all 0 values in the velocity matrix have been compensated, giving the completed velocity matrix
Figure BDA00025743656800001613
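The neighbor-mean compensation for a single moment can be sketched as below. The sketch assumes the speeds of one moment are held in a dictionary keyed by road, that 0 marks a missing value as stated above, and that passes are repeated until no further value can be filled; the function name and toy data are hypothetical.

```python
def compensate_speeds(speeds, neighbors):
    """Fill missing speeds (0) with the mean of the nonzero speeds of adjacent roads.
    speeds: dict road -> speed at one moment (0 means missing).
    neighbors: dict road -> list of adjacent roads.
    Already-compensated values are reused, and passes repeat until nothing changes."""
    filled = dict(speeds)
    changed = True
    while changed:
        changed = False
        for road in filled:
            if filled[road] == 0:
                known = [filled[n] for n in neighbors.get(road, ())
                         if filled.get(n, 0) != 0]
                if known:                     # at least one neighbor has a record
                    filled[road] = sum(known) / len(known)
                    changed = True
    return filled

# Toy example (invented): r2 borders r1 and r3; r4 borders only r2; r5 is isolated.
speeds = {"r1": 40.0, "r2": 0, "r3": 60.0, "r4": 0, "r5": 0}
neighbors = {"r2": ["r1", "r3"], "r4": ["r2"], "r5": []}
completed = compensate_speeds(speeds, neighbors)
```

Road r4 has no recorded neighbor at first, but it can be filled once r2 has been compensated, which is why the matrix is updated after each compensation. A road with no usable neighbors, such as r5, stays at 0.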
After road speed compensation has been completed for the original absolute velocity matrix
Figure BDA0002574365680000171
the compensated speed matrix must be normalized to relative speeds so that roads of different grades are judged by a uniform standard. For any road R_j, from the velocity matrix
Figure BDA0002574365680000172
extract the speed vector of the road at all moments
Figure BDA0002574365680000173
and the maximum speed limit of the road section
Figure BDA0002574365680000174
Dividing the velocity vector
Figure BDA0002574365680000175
by the maximum speed limit
Figure BDA0002574365680000176
gives the normalized velocity
Figure BDA0002574365680000177
and hence the normalized velocity matrix
Figure BDA0002574365680000178
as follows:
Figure BDA0002574365680000179
The "construction of the grid traffic congestion network model G1(N1, L1)" in step A3 is performed as follows. For each grid area divided in step A1, the structural information between roads and the intersection information contained in the grid area are first extracted from the actual map data of the Beijing Fifth Ring using software tools such as Python and Mapinfo. Second, the Beijing Fifth Ring traffic network is selected. Then, following the complex network approach, the road intersections in each grid area are abstracted as network nodes, the roads of the traffic network in the grid area are abstracted as edges between the nodes, and the relative speed of each road is used as the edge weight, so that a grid traffic congestion network is established in each grid area. Because most roads of the Beijing Fifth Ring traffic network are two-way and therefore directional, the grid traffic congestion network constructed here is a directed weighted network.
The "construction of the grid node traffic network model G2(N2, L2)" in step A4 is performed as follows. First, an inter-grid traffic network model is constructed from the intersection information contained in each grid area and the road topology of the whole Beijing Fifth Ring traffic network (the whole network); that is, the road topology contained inside the grid areas is deleted from the whole network. Second, the number of congested roads between each pair of grid areas is counted and recorded. Finally, using complex network theory, each grid area is abstracted as a node, the existence of congested roads between two grids determines whether they are connected by an edge, and the number of congested roads between the grids is used as the edge weight, which establishes the grid node traffic network model.
Step B, "extracting and screening features based on complex network theory", is performed as follows. First, for the grid traffic congestion network and the grid node traffic network (together referred to as the double-layer traffic network) at each time t_i, a percolation (seepage) threshold q(t) is set for percolation analysis; the percolation analysis of the double-layer traffic network determines q(t) = 0.5. Second, for each grid traffic congestion network and for each node (grid) of the grid node traffic network at threshold 0.5 at each moment, complex network theory is applied to extract structural and functional features of each grid area, including the largest congested cluster, the mean node betweenness, the mean node degree, the average speed of the grid congestion network, and the number of congested roads among first-order neighbors. On this basis, the features are screened with a machine learning method: the features that contribute most to traffic risk identification and prediction are selected to construct a high-quality sample feature set, which improves both the effect and the efficiency of risk identification and prediction. Meanwhile, the grid area at time t is labeled according to the proportion of congested roads in that grid area at time t+Δt. The specific steps are as follows:
Step B1: percolation (seepage) analysis of the traffic network;
Step B2: risk feature extraction based on complex networks;
Step B3: risk feature screening based on machine learning;
The "percolation (seepage) analysis of the traffic network" in step B1 is performed as follows. Percolation theory is applied to the double-layer traffic network. First, a control variable, the percolation threshold q(t), is set for the traffic network at each moment, so that every road in the network is in one of two states: unblocked (v_i_ratio(t) > q(t)) or congested (v_i_ratio(t) ≤ q(t)). The unblocked edges are deleted from the original network and the congested edges are kept; the remaining network is the traffic network in the congested state at time t, referred to as the congested network. Each value of q(t) at each moment thus corresponds to one congested network. As q(t) decreases, only more severely congested roads remain, more edges are deleted, and the congested network becomes sparser. The percolation threshold is therefore set to q(t) = 0.5, the stage at which the urban traffic network carries the richest congestion information, and the traffic congestion risk at the current moment is identified and predicted at this threshold;
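The thresholding rule of step B1 can be sketched together with the relative-speed normalization from step A2. This is a minimal illustration: the dictionary layout, the function names and the toy speeds are assumptions, and only the rule itself (keep a road when its relative speed is at or below q) comes from the text.

```python
def relative_speeds(roads):
    """roads: dict edge -> (speed, maximum speed limit). Returns edge -> relative speed."""
    return {edge: speed / limit for edge, (speed, limit) in roads.items()}

def congested_network(rel_speed, q):
    """Keep only congested edges (relative speed <= q); unblocked edges are deleted."""
    return {edge: v for edge, v in rel_speed.items() if v <= q}

# Toy directed roads (invented): (from, to) -> (speed in km/h, speed limit in km/h).
roads = {
    ("a", "b"): (12.0, 60.0),   # relative speed 0.20
    ("b", "c"): (27.0, 60.0),   # relative speed 0.45
    ("c", "d"): (42.0, 60.0),   # relative speed 0.70
    ("d", "a"): (57.0, 60.0),   # relative speed 0.95
}
rel = relative_speeds(roads)
sizes = {q: len(congested_network(rel, q)) for q in (0.3, 0.5, 0.8)}
```

Sweeping q shows the behavior described above: the lower the threshold, the fewer edges survive and the sparser the congested network becomes.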
The "risk feature extraction based on complex networks" in step B2 is performed as follows. With the grid traffic congestion network and the grid node traffic network constructed at each moment for the percolation (seepage) threshold q(t) = 0.5, complex network theory is used, from the viewpoint of statistical physics, to preliminarily extract micro and macro features of the grid areas of the double-layer traffic network at each moment in terms of structure and function. First, at the micro level, each grid traffic congestion network is taken as the research object, and the micro features of each grid area are calculated at the critical percolation threshold at each moment. The grid traffic congestion network has different features at different moments, and the congested network in a grid area changes dynamically in space as time evolves, so the features are spatio-temporal. Second, at the macro level, the nodes (grid areas) of the grid node traffic network model are taken as the research objects, and the macro features of each grid area (node) are calculated at each moment, as shown in fig. 2. The micro features include the largest congested cluster of the grid traffic congestion network, the mean node betweenness, the mean node degree, the mean clustering coefficient, the average speed and the growth rate of the congested network. The macro features include the average path length, node strength, node betweenness, node degree and growth rate of the grid node traffic network.
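Two of the micro features listed above, the largest congested cluster and the mean node degree, can be computed from the edge list of one grid's congested network as sketched below. Measuring cluster size in nodes and ignoring edge direction (weak connectivity) are assumptions of this sketch; the source does not state how the cluster is measured.

```python
from collections import defaultdict, deque

def largest_cluster_size(edges):
    """Number of nodes in the largest connected cluster, ignoring edge direction."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, best = set(), 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:                          # breadth-first search over one cluster
            node = queue.popleft()
            size += 1
            for nb in adjacency[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best

def mean_degree(edges):
    """Mean number of incident congested edges per node (in-degree plus out-degree)."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(degree.values()) / len(degree) if degree else 0.0

# Toy congested network (invented) with two clusters: a-b-c and d-e.
congested_edges = [("a", "b"), ("b", "c"), ("d", "e")]
```

Computing such values for every grid at every moment gives one row of the feature matrix per grid and time step.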
The invention provides a method for extracting features from the perspective of complex networks, and the grid features above are examples. For the actual Beijing Fifth Ring traffic system, features can be preliminarily extracted in a targeted manner according to the actual background and condition of the system, from both its structure and its function, to construct a sample feature set and an initial feature matrix M_f with dimension (8752,40,30), i.e. 8752 samples with 40 features each.
The "risk feature screening based on machine learning" in step B3 is performed as follows. In step B2, the functional and structural features of the grid areas at each moment are extracted using complex network methods, and the initial feature matrix M_f is constructed. To improve the accuracy and precision of risk identification and prediction in the Beijing Fifth Ring traffic system, this step applies machine learning methods to perform feature selection on the preliminary sample feature set and screens out a high-quality sample feature set, improving the effect of risk identification and prediction as far as possible. Screening the structural and functional features, keeping the important ones and removing the irrelevant ones, also alleviates the curse of dimensionality, lowers the difficulty of the learning task, reduces over-fitting and strengthens the generalization ability of the machine learning model. Given the highly complex spatio-temporal evolution of the Beijing Fifth Ring traffic system, and in order to optimize for a given learner, the invention performs feature selection with the classic wrapper method LVW (Las Vegas Wrapper), as shown in fig. 3. The LVW method screens out 10 high-quality features in total, including the point betweenness variance, the edge betweenness variance, the proportion of congested roads in the grid, and the node betweenness of the grid node traffic network, from which the high-quality feature matrix is constructed
Figure BDA0002574365680000201
Its dimension is (8752,10,30), i.e. 8752 samples with 10 high-quality features each.
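The LVW idea, random search over feature subsets scored by the learner's own validation error, can be sketched as follows. This is a simplified illustration, not the procedure of fig. 3: the learner is a 1-nearest-neighbor classifier scored by leave-one-out error instead of the random forest, the search starts from the full feature set, and the names loo_1nn_error and lvw and the toy data are hypothetical.

```python
import random

def loo_1nn_error(rows, labels, feats):
    """Leave-one-out error of a 1-nearest-neighbor classifier restricted to `feats`."""
    errors = 0
    for i, x in enumerate(rows):
        nearest = min((j for j in range(len(rows)) if j != i),
                      key=lambda j: sum((x[f] - rows[j][f]) ** 2 for f in feats))
        errors += labels[nearest] != labels[i]
    return errors / len(rows)

def lvw(rows, labels, max_stall=50, seed=0):
    """Las Vegas Wrapper: sample random feature subsets; accept a subset if it lowers
    the error, or ties the error with fewer features. Stop after max_stall
    consecutive rejections."""
    rng = random.Random(seed)
    n = len(rows[0])
    best_feats = list(range(n))
    best_err = loo_1nn_error(rows, labels, best_feats)
    stall = 0
    while stall < max_stall:
        candidate = sorted(rng.sample(range(n), rng.randint(1, n)))
        err = loo_1nn_error(rows, labels, candidate)
        if err < best_err or (err == best_err and len(candidate) < len(best_feats)):
            best_feats, best_err, stall = candidate, err, 0
        else:
            stall += 1
    return best_feats, best_err

# Toy data (invented): feature 0 separates the classes, features 1 and 2 are noise.
rows = [[0.1, 5.0, 1.0], [0.2, 1.0, 9.0], [0.3, 8.0, 4.0], [0.2, 3.0, 7.0],
        [0.8, 4.0, 2.0], [0.9, 9.0, 8.0], [0.7, 2.0, 5.0], [0.8, 7.0, 3.0]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
selected, selected_err = lvw(rows, labels)
```

By construction the returned subset never has a higher validation error than the full feature set; which subset is returned depends on the random draws.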
Step C, "risk identification and prediction based on ensemble learning theory", is performed as follows. To identify and predict the congestion risk in the Beijing Fifth Ring traffic system accurately, so that it can be controlled effectively, an ensemble learning model is first constructed using machine learning and the related mathematics. Second, to eliminate the influence of non-uniform scales among the feature vectors on the model, a feature scaling method is applied to the data feature set
Figure BDA0002574365680000202
to standardize it, giving the standard sample feature matrix
Figure BDA0002574365680000203
with dimension (8752,10,30). Finally, so that the model learns as much as possible about the characteristics of risk in the Beijing Fifth Ring traffic system, the standard sample feature matrix
Figure BDA0002574365680000204
is divided into a training set and a test set in a 7:3 ratio, i.e. 6126 samples in the training set and 2626 samples in the test set. The ensemble learning model is trained on the training set, and the trained model is then used to identify and predict the risk of the grid areas of the traffic system at the current moment. The specific steps are as follows:
step C1: constructing an ensemble learning model;
step C2: carrying out risk identification and prediction by using an ensemble learning model;
The "constructing an ensemble learning model" in step C1 is implemented as follows. The invention aims to learn a more stable, better-performing model from historical risk data of the Beijing Fifth Ring traffic system, and an ensemble learning model generally learns more robustly than a single classifier. Ensemble learning combines several weakly supervised models into a better and more comprehensive strongly supervised model; the underlying idea is that even if one weak classifier makes a wrong prediction, the other weak classifiers can correct the error. The current mainstream ensemble learning frameworks are Bagging, Boosting and Stacking.
Through the above process, a binary classifier, namely a random forest model, is constructed to identify and predict risks in the Beijing Fifth Ring traffic system. Its classification function is a sign-type function whose output values 0 and 1 represent low risk and high risk in a grid area, respectively, as follows:
Figure BDA0002574365680000211
In the above formula, f(x_i) represents the risk status of the i-th grid area, where 0 denotes a low congestion risk and 1 denotes a high congestion risk.
More generally, when an ensemble learning model is constructed to identify and predict risks of the Beijing Fifth Ring traffic system, a suitable ensemble learning framework and model can be selected according to the distribution characteristics of the data samples, which further improves the risk identification and prediction effect.
In step C2, the method for risk identification and prediction using ensemble learning model includes: in this step, based on the feature set of the high-quality sample extracted and screened in the step C, i.e. the feature matrix
Figure BDA0002574365680000212
And (4) identifying and predicting risks in the traffic system by using the ensemble learning model constructed in the step C1. Because the difference between characteristic dimensions in the historical sample data set can affect the performance of the ensemble learning model, when the model is used for risk identification and prediction, firstly, the sample characteristic set of a research object needs to be subjected to characteristic scaling, the influence of different dimensions among characteristic vectors on the model precision is eliminated, the convergence rate of the model is improved, and a standard sample characteristic matrix is obtained
Figure BDA0002574365680000213
The mainstream feature scaling methods in machine learning include min-max normalization, mean normalization, standardization, and scaling to unit length, all of which can be applied to the sample feature set of the traffic system
Figure BDA0002574365680000214
Among these mainstream feature scaling methods, the standardization method is selected according to the actual conditions of the traffic system within Beijing's Fifth Ring Road, the characteristics of the data feature set, and the machine learning method applied, so as to maximize the accuracy and precision of risk identification and prediction in the traffic system.
After the feature scaling is carried out on the sample data set in the five-ring road traffic system in Beijing City, in the step, the standard sample feature matrix based on the traffic system
Figure BDA0002574365680000221
risks in the traffic system are identified and predicted using the random forest model constructed in step C1. In this process the random forest model needs to learn the characteristics of the risks, so in this embodiment the standard sample feature set
Figure BDA0002574365680000222
is randomly divided into a training set and a test set at a ratio of 7:3, with 6126 samples in the training set and 2626 samples in the test set; the training set is used to train the random forest model so that it learns the characteristics of congestion risk as fully as possible.
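The scaling and splitting described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes z-score standardization (subtract the column mean, divide by the column standard deviation) and a shuffled 7:3 cut, and the function names `standardize` and `split_train_test` are hypothetical.

```python
import random
import statistics


def standardize(rows):
    """Z-score each column: (x - mean) / std; constant columns map to 0."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


def split_train_test(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it at train_fraction."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

With 8752 samples and a fraction of 0.7, the cut falls at 6126, which reproduces the 6126/2626 split reported in this embodiment.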
The model evaluation and verification in step D is carried out as follows: when the ensemble learning model constructed in step C is used to identify and predict risks in the traffic system, its performance must be evaluated accurately and scientifically. In this step, first, evaluation indexes are selected according to the actual traffic system and the final goal of the invention, for example accuracy, precision, recall, and the F1 value, all of which are calculated from the confusion matrix; second, to prevent over-fitting and to evaluate the generalization ability of the model accurately, the ensemble learning model is evaluated using cross-validation, further improving the soundness and reliability of the evaluation. The step comprises the following substeps:
step D1: selecting a model evaluation index;
step D2: evaluating and analyzing the model;
The "selection of model evaluation index" described in step D1 is carried out as follows: the invention identifies and predicts risks in a traffic system, and its final aim is to identify those risks accurately and scientifically with an ensemble learning model, which is essentially an anomaly detection problem in machine learning. Because the scenario faced by the invention is a risk identification and detection problem, the model is evaluated with two indexes, recall and accuracy, given by the following formulas:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
In the formulas, Accuracy denotes the accuracy and Recall denotes the recall; TP is the number of positive cases correctly predicted as positive, TN is the number of negative cases correctly predicted as negative, FP is the number of negative cases wrongly predicted as positive, and FN is the number of positive cases wrongly predicted as negative.
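As a small worked example of these two indexes, the sketch below counts TP, TN, FP, and FN from label lists and applies the standard definitions of accuracy and recall. The function names are illustrative, and the label 1 is assumed to mean "high risk" as in the classification function above.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = high risk, 0 = low risk)."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 0:
            tn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of all samples that were predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Share of truly positive samples that were found; 0.0 if there are none."""
    return tp / (tp + fn) if tp + fn else 0.0
```

For example, true labels `[1, 1, 0, 0, 1]` and predictions `[1, 0, 0, 1, 1]` give TP = 2, TN = 1, FP = 1, FN = 1, so the accuracy is 3/5 and the recall is 2/3.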
The fewer truly risky units in the road traffic system within Beijing's Fifth Ring Road that are mispredicted, the better: if a real congestion risk is not identified, the traffic system will be severely damaged once the congestion occurs, so the recall rate deserves more attention. Meanwhile, to ensure that normal units are correctly predicted as normal, to reduce the error rate on normal samples, and to let traffic managers control the real risks in the traffic system as precisely as possible under limited resources, accuracy is introduced as a further evaluation index of the model. When the random forest model from ensemble learning is used to identify and predict the congestion risk of the road traffic system within Beijing's Fifth Ring Road, the accuracy is 89.83% and the recall is 86.74%; both are at a high level, indicating good model performance.
The "evaluation analysis of the model" described in step D2 is carried out as follows: in this step, to prevent over-fitting and to evaluate the generalization ability of the model accurately, the ensemble learning model is evaluated using cross-validation from machine learning, further improving the soundness and reliability of the evaluation. The classical cross-validation methods are mainly the leave-one-out method, K-fold cross-validation, and the bootstrap (self-service) method; the invention uses the bootstrap method, with the following steps:
(1) randomly select one sample at a time from the data set containing 8752 samples and use it as a training sample;
(2) put the sample selected in (1) back into the original data set, and repeat this sampling with replacement 8752 times to generate a data set of the same size as the original data set; this new data set is the training set;
(3) after the 8752 draws, 3221 samples of the original data set do not appear in the new data set; these samples are used as the validation set;
(4) repeat the above steps 10 times to train 10 models and obtain their evaluation index values, then average these values to obtain the performance evaluation value of the model.
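The four steps above can be sketched as follows. This is an illustrative outline under stated assumptions: `fit_and_score` is a hypothetical callback that trains a model on the training indices and returns one evaluation index (for example recall) on the validation indices; the patent does not specify such an interface.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; unseen indices form the validation set."""
    train_idx = [rng.randrange(n_samples) for _ in range(n_samples)]
    seen = set(train_idx)
    val_idx = [i for i in range(n_samples) if i not in seen]
    return train_idx, val_idx


def bootstrap_evaluate(n_samples, fit_and_score, rounds=10, seed=0):
    """Average the score returned by fit_and_score over several bootstrap rounds."""
    rng = random.Random(seed)
    scores = []
    for _ in range(rounds):
        train_idx, val_idx = bootstrap_split(n_samples, rng)
        scores.append(fit_and_score(train_idx, val_idx))
    return sum(scores) / len(scores)
```

With 8752 samples, the validation set produced by `bootstrap_split` typically holds a little over a third of the samples, in line with the 3221 reported in step (3).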
As shown in fig. 5, the random forest model is used to identify and predict the congestion risk of the road traffic system within Beijing's Fifth Ring Road, and the model is cross-validated 10 times with the bootstrap method. The average accuracy is about 92.84% and the average recall is about 92.45%, both at a high level, which indicates that the model generalizes well, performs well, and can accurately and reliably identify and predict the congestion risk in the road traffic system within Beijing's Fifth Ring Road, providing a strong guarantee for the safe, stable, and healthy operation of the traffic system.
Matters not described in detail in the invention are within the common knowledge of those skilled in the art.
The above description is only a part of the embodiments of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention.

Claims (5)

1. A traffic risk prediction method based on a complex network theory is characterized in that: the method comprises the following steps:
step A: dividing grids based on empirical data to construct a double-layer traffic network model;
step B: extracting and screening features based on a complex network theory;
step C: performing risk prediction based on an ensemble learning theory;
step D: evaluating and verifying the model.
2. The traffic risk prediction method based on the complex network theory as claimed in claim 1, wherein: the construction of the double-layer traffic network model by dividing grids based on empirical data in step A is carried out as follows: first, basic information on the roads in the research area is acquired, comprising two parts, namely traffic network road information and the longitude and latitude of road intersections; according to the area and extent of the research area and the longitude and latitude of the road sections and intersections, the research area is divided into N × M grid areas, and the grid areas are labeled; second, on the microscopic level, for each grid area, a grid traffic congestion network model is constructed from actual traffic data using complex network theory and methods, with the intersections in the grid as nodes, the road sections as edges, and the relative speed of each road section as the edge weight; on the macroscopic level, each grid area is taken as a node, the existence of congested roads between two grids is the criterion for whether an edge connects them, and the number of congested roads between the grids is the edge weight, so that a grid node traffic network model is constructed using complex network theory and methods; the specific steps are as follows:
step A1: dividing grid areas based on geographic information;
step A2: preprocessing speed data to obtain relative speed matrix
Figure FDA0002574365670000011
Step A3: construction of grid traffic congestion network model G1(N1,L1);
Step A4: construction of mesh node traffic network model G2(N2,L2);
wherein, in step A1, the grid areas are divided based on geographic information as follows: first, the traffic network model and road information needed for dividing the grid areas are extracted from a geographic information system (Mapinfo file) using the programming software Python; the extracted information includes the vehicle speed on each road at each moment, the longitude and latitude of the intersections, and the network topology of the traffic system under study; Python is used to call the Baidu Map application programming interface (API), and the road network topology and the intersection names are matched by sequential traversal to obtain the longitude and latitude of each intersection; roads and intersections whose longitude and latitude could not be obtained because their names differ between Baidu Map and Mapinfo are then processed, so that an accurate, standard data set of road network longitude and latitude for the traffic system is obtained; second, the area S and the longitude and latitude range of the research area are calculated from the obtained road information and intersection coordinates, and the number of grids is set to N × M according to the actual background of the research area, so that each grid has area S/(N × M); finally, for each grid area, the intersections falling inside it are determined from the longitude and latitude of each intersection in the traffic network and recorded;
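The final part of step A1, assigning intersections to grid cells, can be sketched as below. This is a simplified illustration that assumes the research area is a longitude/latitude bounding box cut into equal steps of longitude and latitude, which only approximates equal-area cells; the names `grid_index` and `assign_intersections` and the `bounds` tuple layout are hypothetical.

```python
def grid_index(lon, lat, bounds, n_rows, n_cols):
    """Return (row, col) of the grid cell containing a point, or None if outside."""
    lon_min, lon_max, lat_min, lat_max = bounds
    if not (lon_min <= lon <= lon_max and lat_min <= lat <= lat_max):
        return None
    # Points on the upper boundary are clamped into the last row or column.
    col = min(int((lon - lon_min) / (lon_max - lon_min) * n_cols), n_cols - 1)
    row = min(int((lat - lat_min) / (lat_max - lat_min) * n_rows), n_rows - 1)
    return row, col


def assign_intersections(intersections, bounds, n_rows, n_cols):
    """Map each grid cell to the list of intersection names that fall inside it."""
    cells = {}
    for name, (lon, lat) in intersections.items():
        cell = grid_index(lon, lat, bounds, n_rows, n_cols)
        if cell is not None:
            cells.setdefault(cell, []).append(name)
    return cells
```

Intersections outside the bounding box are simply skipped here; a real pipeline would need to decide how to treat them.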
wherein, the speed data preprocessing described in the step A2 is used to obtain the relative speed matrix
Figure FDA0002574365670000021
The specific steps are as follows: in this step, first, from the actual traffic operation data collected by the vehicle-mounted Global Positioning System (GPS), the speeds of all R roads at an arbitrary moment t_i are obtained and, following the order of the roads, expressed in vector form V_i = (v_1, v_2, …, v_R); this process is then repeated for all T moments, and the velocity vectors V_i of all moments are combined to generate an initial velocity matrix
Figure FDA0002574365670000022
second, when the speed information of the traffic system is collected using floating car technology, the speed of every area at every moment cannot be completely collected and retained, owing to the limits of network communication technology and to human and natural factors; the original speed information of the traffic system therefore needs speed compensation, that is, the original velocity matrix
Figure FDA0002574365670000023
has some missing values, so it is necessary to find in the velocity matrix
Figure FDA0002574365670000024
the missing speed values, i.e. the elements whose value is 0, and compensate them; to compensate the speed of road R_j at moment t_i, first find in the road network G(N, L) the set of roads neighboring R_j
Figure FDA0002574365670000025
and check whether the roads in this set have speed records at that moment; if at least one element of the set has a speed record, take the mean of the recorded (non-zero) elements of the set, according to the following formula:
Figure FDA0002574365670000026
in the above formula,
Figure FDA0002574365670000027
denotes the speed compensation value of the speed-missing road R_j at moment t_i,
Figure FDA0002574365670000028
denotes the sum of the non-zero element values in the neighboring-road set
Figure FDA0002574365670000029
of the speed-missing road R_j, and J denotes the number of non-zero elements in the neighboring-road set
Figure FDA00025743656700000210
of the speed-missing road R_j;
if none of the roads neighboring R_j has a speed record, the speed of road R_j is compensated as 0; after each compensation, the original velocity matrix
Figure FDA00025743656700000211
is updated to the compensated matrix
Figure FDA00025743656700000212
and the above process is repeated for each moment until all 0 values in the velocity matrix have been compensated, giving the completed velocity matrix
Figure FDA00025743656700000213
after road speed compensation of the original absolute velocity matrix
Figure FDA0002574365670000031
is complete, the compensated velocity matrix is normalized to obtain relative speeds, because roads of different grades differ and a unified judgment standard is needed; for any road R_j, from the velocity matrix
Figure FDA0002574365670000032
the speed vector of that road at all moments is extracted, denoted
Figure FDA0002574365670000033
and the maximum speed limit of the road section is extracted, denoted
Figure FDA0002574365670000034
then the speed vector
Figure FDA0002574365670000035
is divided by the maximum speed limit
Figure FDA0002574365670000036
to obtain the normalized speed
Figure FDA0002574365670000037
which yields the normalized velocity matrix
Figure FDA0002574365670000038
as follows:
Figure FDA0002574365670000039
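The compensation and normalization of step A2 can be sketched for a single moment as follows. This is a minimal illustration under assumptions: speeds are held in a dictionary keyed by road, a value of 0 marks a missing record, and only one compensation pass is shown, whereas the claim repeats the process until the matrix is complete. The names `compensate_speeds`, `relative_speeds`, and `speed_limits` are hypothetical.

```python
def compensate_speeds(speeds, neighbors):
    """Fill missing speeds (value 0) for one moment.

    speeds: dict road -> speed at this moment (0 means missing).
    neighbors: dict road -> list of adjacent roads.
    Each missing speed becomes the mean of the non-zero neighbor speeds,
    or stays 0 when no neighbor has a record.
    """
    filled = dict(speeds)
    for road, v in speeds.items():
        if v != 0:
            continue
        known = [speeds[n] for n in neighbors.get(road, []) if speeds.get(n, 0) != 0]
        filled[road] = sum(known) / len(known) if known else 0.0
    return filled


def relative_speeds(speeds, speed_limits):
    """Divide each road's speed by that road's maximum speed limit."""
    return {road: v / speed_limits[road] for road, v in speeds.items()}
```

For example, a road with no record whose two neighbors report 40 and 60 is compensated to 50; with a speed limit of 100 its relative speed is 0.5.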
wherein, the construction of the grid traffic congestion network model G1(N1, L1) in step A3 is carried out as follows: for each grid area divided in step A, first, the structural information among roads and the road intersection information contained in each grid area are extracted from the actual map data of that grid area using the Python and Mapinfo software tools; second, an appropriate geographical coverage is selected according to the needs of the actual study, and, following the complex network method, the road intersections in each grid area are abstracted as nodes of the network, the roads of the traffic network in the grid area as edges connecting the nodes, and the relative speed of each road as the weight of its edge, thereby establishing a grid traffic congestion network in each grid area; in addition, most roads in the traffic network carry traffic in both directions and are therefore directional, so the constructed grid traffic congestion network is a directed weighted network;
wherein, the construction of the grid node traffic network model G2(N2, L2) in step A4 is carried out as follows: first, an inter-grid intersection traffic network model is constructed from the intersection information contained in the grid areas and the traffic network of the whole research area, i.e. the road topology of the whole network, by deleting from the whole network the road topology contained inside the grid areas; second, the number of congested roads between grid areas is counted and recorded; finally, based on this information and using complex network theory and methods, each grid area is abstracted as a node, the existence of congested roads between two grids as an edge, and the number of congested roads between the grids as the edge weight, thereby establishing the grid node traffic network model.
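The two layers built in steps A3 and A4 can be sketched together as below. This is an illustrative outline only: it assumes each road is a directed segment with a relative speed, that a road counts as congested when its relative speed is at most a threshold `q`, and that inter-grid edges are kept per ordered pair of grids. The function name `build_two_layer` and its data layout are hypothetical.

```python
from collections import defaultdict


def build_two_layer(roads, node_grid, q):
    """Split directed roads into intra-grid edges and inter-grid congestion counts.

    roads: list of (u, v, relative_speed) directed road segments.
    node_grid: dict intersection -> grid id.
    q: threshold; an inter-grid road is congested when relative_speed <= q.
    Returns (intra, inter):
      intra[grid] = list of (u, v, relative_speed) edges inside that grid;
      inter[(g1, g2)] = number of congested roads running from grid g1 to g2.
    """
    intra = defaultdict(list)
    inter = defaultdict(int)
    for u, v, w in roads:
        gu, gv = node_grid[u], node_grid[v]
        if gu == gv:
            intra[gu].append((u, v, w))   # microscopic layer: weighted directed edge
        elif w <= q:
            inter[(gu, gv)] += 1          # macroscopic layer: edge weight = road count
    return dict(intra), dict(inter)
```

A pair of grids with no congested road between them gets no entry in `inter`, which matches the rule that an edge exists only when congested roads exist between the grids.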
3. The traffic risk prediction method based on the complex network theory as claimed in claim 1, wherein: the feature extraction and screening based on the complex network theory in step B is carried out as follows: first, for the grid traffic congestion network and the grid node traffic network at each moment t_i, referred to together as the double-layer traffic network, a percolation (seepage) threshold q(t) is set and determined through percolation analysis of the double-layer traffic network; second, taking each grid traffic congestion network and each node (i.e. grid) of the grid node traffic network at each moment under the threshold q(t) as research objects, the features of each grid area are extracted using complex network theory and methods, including the structural and functional features of the largest congested sub-cluster, the mean node betweenness, the mean node degree, the average speed of the grid congestion network, and the number of first-order neighbor congested roads; on this basis, a machine learning method is applied to screen the extracted features, selecting those that contribute most to traffic risk identification and prediction and constructing a high-quality sample feature set, so as to improve the effect and efficiency of traffic risk identification and prediction as far as possible; meanwhile, each grid area at moment t is labeled according to the proportion of congested roads in that grid area at moment t + Δt; the specific steps are as follows:
step B1: percolation (seepage) analysis of the traffic network;
step B2: extracting risk features based on a complex network;
step B3: screening risk characteristics based on machine learning;
wherein, in step B1, the percolation (seepage) analysis of the traffic network is carried out as follows: percolation theory is applied to the double-layer traffic network; first, a control variable, the percolation threshold q(t), is given for the traffic network at each moment, so that each road in the traffic network is in one of two states: the unobstructed state, i.e. v_i_ratio(t) > q(t), or the congested state, i.e. v_i_ratio(t) ≤ q(t); the unobstructed edges are deleted from the original network and the congested edges are kept, and the remaining network is the traffic network in the congested state at moment t, referred to as the congested network; each value of q(t) at each moment corresponds to one congested network, and as q(t) decreases the remaining network contains only increasingly congested roads, i.e. more edges fail and the network becomes sparser; an appropriate percolation threshold q(t) is therefore selected, namely the one at which the urban traffic network carries the richest congestion information, in order to identify and predict the traffic congestion risk at the current moment;
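The thresholding in step B1 can be sketched as follows. This is a minimal illustration, assuming roads are given as (u, v, relative speed) triples and that cluster size is measured on the congested network with edge direction ignored; the claim does not fix how clusters are measured, and the names `congested_network` and `largest_cluster_size` are hypothetical.

```python
def congested_network(edges, q):
    """Keep only the roads whose relative speed is at most the threshold q."""
    return [(u, v) for u, v, ratio in edges if ratio <= q]


def largest_cluster_size(nodes, edges):
    """Node count of the largest connected cluster, ignoring edge direction.

    Nodes with no remaining edge are not counted as clusters.
    """
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen or not adj[start]:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:                      # depth-first traversal of one cluster
            node = stack.pop()
            size += 1
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best
```

Sweeping `q` over a range of values and recording the largest cluster size for each is one simple way to look for the threshold at which the congested network is most informative; how the patent picks that threshold is not spelled out in this section.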
wherein, in step B2, the extraction of risk features based on the complex network is carried out as follows: the grid traffic congestion network and the grid node traffic network at each moment are constructed under the percolation (seepage) threshold q(t), and the microscopic and macroscopic features of the grid areas of the double-layer traffic network at each moment are preliminarily extracted from the viewpoints of structure and function, using complex network theory and methods from statistical physics; first, on the microscopic level, each grid traffic congestion network is taken as the research object, and the microscopic features of each grid area are calculated at the critical percolation threshold of each moment; the grid traffic congestion network has different features at different moments, and the congestion network within a grid area evolves spatially over time, so these features are spatio-temporal; the microscopic features are: the largest congested sub-cluster of the grid traffic congestion network, the mean node betweenness, the mean node degree, the mean clustering coefficient, and the average speed of the congestion network, together with their growth rates at each moment; second, on the macroscopic level, for the constructed grid node traffic network model, each node (i.e. grid area) is taken as the research object and its macroscopic features are calculated, which are: the average path length, node strength, node betweenness, and node degree of the grid node traffic network, together with their growth rates;
the above provides a method of extracting features from the complex network perspective, taking the features of a grid as an example; depending on the actual background and conditions of the traffic system, its features can be preliminarily extracted in a targeted way from the two aspects of structure and function, thereby constructing a sample feature set and an initial feature matrix M_f;
wherein, the risk feature screening based on machine learning in step B3 is carried out as follows: in step B2 the functional and structural features of the grid areas at each moment are extracted based on complex network theory and an initial feature matrix M_f is constructed; to improve the accuracy and precision of risk identification and prediction in the traffic system, in this step machine learning methods are used to perform feature selection on the preliminarily constructed sample feature set, screening out a high-quality sample feature set so as to improve the effect of risk identification and prediction as far as possible; screening the structural and functional features of the traffic system keeps the important features and removes irrelevant ones, which alleviates the curse of dimensionality, reduces the difficulty of the learning task, reduces over-fitting, and enhances the generalization ability of the machine learning model; in view of the highly complex spatio-temporal evolution of the traffic system, and with the aim of optimizing a given learner, features are selected in wrapper fashion using the classic LVW (Las Vegas Wrapper) method, with the following specific steps:
(1) set the initial optimal error E to infinity, the current optimal feature subset to the full attribute set A, and the repetition count t to 0;
(2) randomly generate a feature subset A' and calculate the error E' of the classifier when this feature subset is used;
(3) if E' is smaller than E, set A = A' and E = E' and repeat steps (2) and (3); otherwise set t = t + 1; exit the loop when t is greater than or equal to the stop control parameter T;
in the calculation process, the LVW method directly takes the performance of the learner finally to be used as the evaluation criterion for feature subsets, selecting the feature subset that is tailored to the given learner and most beneficial to its performance; a high-quality sample feature set is thus screened out and the feature matrix is constructed
Figure FDA0002574365670000051
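The LVW loop in steps (1) to (3) can be sketched as follows. This is an illustration under assumptions: `error_of` is a hypothetical callback that trains the chosen learner on a feature subset and returns its error, each feature is included in a candidate subset with probability 0.5, and empty candidates are counted as failed attempts. None of these details is given in the claim.

```python
import random


def lvw(features, error_of, max_stall, seed=0):
    """Las Vegas Wrapper: random subset search scored by the learner's own error.

    features: list of feature names (the full attribute set A).
    error_of: callback subset -> error of the learner on that subset (hypothetical).
    max_stall: stop control parameter T (number of non-improving draws allowed).
    """
    rng = random.Random(seed)
    best_subset = list(features)        # step (1): start from the full set
    best_error = float("inf")
    stall = 0
    while stall < max_stall:
        # step (2): draw a random candidate subset A'
        candidate = [f for f in features if rng.random() < 0.5]
        if not candidate:
            stall += 1
            continue
        err = error_of(candidate)
        # step (3): keep the candidate only if it strictly lowers the error
        if err < best_error:
            best_subset, best_error = candidate, err
        else:
            stall += 1
    return best_subset, best_error
```

Because the search is random, the returned subset depends on the seed and on `max_stall`; a larger `max_stall` gives the search more chances at the cost of more learner trainings.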
4. The traffic risk prediction method based on the complex network theory as claimed in claim 1, wherein: in step C, risk identification and prediction are performed based on ensemble learning theory, which includes the following steps: in order to accurately identify and predict the congestion risk in the traffic system and effectively control the congestion risk, the method comprises the steps of firstly constructing an integrated learning model by using machine learning and relevant mathematical knowledge; secondly, in order to eliminate the influence of non-uniform dimension among the feature vectors on the model, a feature scaling method is used for data feature set
Figure FDA0002574365670000061
Carrying out standardization processing to obtain a standard sample feature matrix
Figure FDA0002574365670000062
Finally, in order to ensure that the model learns the characteristic knowledge of the risks in the traffic system as much as possible, the standard sample characteristic matrix is subjected to
Figure FDA0002574365670000063
Dividing the traffic system into a training set and a test set according to a preset proportion (a: b), training an ensemble learning model by using training set data, and then identifying and predicting risks of a grid area of the traffic system at the current moment by using the trained ensemble learning model; the specific steps of the process are as follows:
step C1: constructing an ensemble learning model;
step C2: carrying out risk identification and prediction by using an ensemble learning model;
wherein, in the step C1, the ensemble learning model is constructed by the following specific steps: a random forest model is constructed by using a Bagging framework and an integrated learning related theoretical method to identify and predict risks of a traffic system, and the method comprises the following implementation steps:
(1) assume a data set D = {x_i1, x_i2, …, x_in, y_i} (i ∈ [1, m]) with N features; sampling with replacement generates the sampling space (m*n)^(m*n);
(2) construct the base learners, namely decision trees: for each sample d_j = {x_i1, x_i2, …, x_ik, y_i} (i ∈ [1, m]), where K < M, generate a decision tree and record the result h_j(x) of each decision tree;
(3) train T times so that
Figure FDA0002574365670000064
where φ(x) is a voting model, which may be absolute majority voting, relative majority voting, or weighted voting;
a special binary classifier, namely a random forest model, is constructed through the above process to identify and predict risks in the traffic system; in this process the classification function is a sign function whose output values are 0 and 1, representing low risk and high risk in a grid area respectively, as follows:
Figure FDA0002574365670000065
in the above formula, f(x_i) indicates the risk status of the i-th grid area, where 0 represents low risk and 1 represents high risk;
meanwhile, an ensemble learning model is constructed by applying an ensemble learning theory to identify and predict risks of the traffic system, and a proper ensemble learning framework and model can be selected according to the distribution characteristics of data samples to identify and predict the risks, so that the effects of identifying and predicting the risks of the traffic system are further improved;
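The Bagging idea behind step C1 can be sketched as follows. This is a deliberately simplified stand-in and not a full random forest: each base learner is a one-level decision stump rather than a full decision tree, trained on a bootstrap sample with a random feature subset, and predictions are combined by relative majority voting. All function names are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold) pair with the fewest training errors.

    The stump predicts 1 when the feature value exceeds the threshold.
    """
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            errors = sum((row[f] > threshold) != bool(y) for row, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, f, threshold)
    _, f, threshold = best
    return lambda row: int(row[f] > threshold)


def train_bagging(rows, labels, n_learners, n_features, seed=0):
    """Train n_learners stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    m, n = len(rows), len(rows[0])
    learners = []
    for _ in range(n_learners):
        idx = [rng.randrange(m) for _ in range(m)]   # sampling with replacement
        feats = rng.sample(range(n), n_features)     # random feature subset
        learners.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return learners


def predict(learners, row):
    """Relative majority vote over the base learners (0 = low risk, 1 = high risk)."""
    votes = Counter(h(row) for h in learners)
    return votes.most_common(1)[0][0]
```

Using an odd number of learners avoids ties in the binary vote; replacing the stump with a full decision tree would bring the sketch closer to the random forest described in the claim.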
wherein, in step C2, risk identification and prediction using the ensemble learning model is carried out as follows: in this step, based on the high-quality sample feature set extracted and screened in step B, i.e. the feature matrix
Figure FDA0002574365670000071
risks in the traffic system are identified and predicted using the ensemble learning model constructed in step C1; because differing dimensions (units and scales) among the features in the historical sample data set can affect the performance of the ensemble learning model, the sample feature set must first be feature-scaled before the model is used for risk identification and prediction; this eliminates the influence of differing dimensions among feature vectors on model precision, improves the convergence rate of the model, and yields a standard sample feature matrix
Figure FDA0002574365670000072
the mainstream feature scaling methods in machine learning include min-max normalization, mean normalization, standardization, and scaling to unit length, any of which can be applied to the sample feature set of the traffic system
Figure FDA0002574365670000073
for feature scaling;
after feature scaling of the sample data set of the traffic system, in this step, based on the standard sample feature matrix of the traffic system
Figure FDA0002574365670000074
risks in the traffic system are identified and predicted using the ensemble learning model constructed in step C; in this process the ensemble learning model needs to learn the characteristics of the risks, so the standard sample feature set
Figure FDA0002574365670000075
is randomly divided into a training set and a test set according to a preset ratio a:b, wherein the training set is used to train the random forest model so that it learns the characteristics of the risks as fully as possible, and the test set is used to test the training effect of the model.
5. The traffic risk prediction method based on the complex network theory as claimed in claim 1, wherein: the model evaluation and verification in step D is carried out as follows: when the ensemble learning model constructed in step C is used to identify and predict risks in the traffic system, in order to evaluate the performance of the model accurately and scientifically, in this step, first, evaluation indexes are selected according to the actual traffic system and the final goal, for example accuracy, precision, recall, and the F1 value, all of which are calculated from the confusion matrix; second, to prevent over-fitting and to evaluate the generalization ability of the model accurately, the ensemble learning model is evaluated using cross-validation in this step, further improving the soundness and reliability of the model evaluation; the step comprises the following substeps:
step D1: selecting a model evaluation index;
step D2: evaluating and analyzing the model;
wherein, the selection of the model evaluation index in step D1 is carried out as follows: the method identifies and predicts risks in a traffic system, and its final aim is to identify those risks accurately and scientifically with an ensemble learning model, which is essentially an anomaly detection problem in machine learning; such a problem has imbalanced data classes, i.e. the sample size of normal data is large while the sample size of risk data is small, so accuracy alone cannot objectively reflect the quality of the model; because the scenario faced is a risk identification and detection problem, the model is evaluated with two indexes, recall and accuracy, given by the following formulas:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
in the formulas, Accuracy denotes accuracy and Recall denotes recall; TP is the number of positive cases correctly predicted as positive; TN is the number of negative cases correctly predicted as negative; FP is the number of negative cases incorrectly predicted as positive; FN is the number of positive cases incorrectly predicted as negative;
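A short sketch of the two indexes, using the standard definitions Accuracy = (TP + TN) / (TP + TN + FP + FN) and Recall = TP / (TP + FN). The function names and the toy labels (95 normal samples, 5 risk samples) are assumptions chosen to illustrate the class imbalance issue; they are not data from the patent.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for binary labels."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # risk correctly flagged
        elif pred != positive and truth != positive:
            tn += 1  # normal correctly passed
        elif pred == positive and truth != positive:
            fp += 1  # normal wrongly flagged as risk
        else:
            fn += 1  # risk missed
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def recall(tp, fn):
    positives = tp + fn
    return tp / positives if positives else 0.0


# Hypothetical imbalanced data: 95 normal samples (0) and 5 risk samples (1).
y_true = [0] * 95 + [1] * 5
# A trivial model that always predicts "normal".
y_pred = [0] * 100

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, tn, fp, fn)
rec = recall(tp, fn)
```

In this toy case the accuracy is 0.95 although every risk sample is missed and the recall is 0, which is why accuracy alone is a poor indicator on imbalanced risk data.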
the evaluation and analysis of the model in step D2 is specifically performed as follows: in this step, in order to prevent over-fitting and to accurately evaluate the generalization ability of the model, the ensemble learning model is evaluated using a cross-validation method from machine learning, which further improves the scientific rigor and reliability of the model evaluation; the classical cross-validation methods are the leave-one-out method, K-fold cross-validation and the bootstrap method; the bootstrap method is used here for cross-validation, and its steps are as follows:
(1) each time, randomly selecting one sample from the data set containing N samples and using it as a training sample;
(2) putting the sample selected in step (1) back into the original data set, and repeating this sampling with replacement N times to generate a data set of the same size as the original data set, the new data set being the training set;
(3) after N draws, approximately $\lim_{N \to \infty} \left(1 - \frac{1}{N}\right)^{N} = \frac{1}{e} \approx 36.8\%$ of the samples in the original data set will not appear in the new data set; these samples, which do not appear in the new data set, are therefore used as the validation set;
(4) repeating steps (1) to (3) M times, training M models and obtaining M values of each evaluation index, and then taking the average as the performance evaluation value of the model.
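The bootstrap procedure in steps (1) to (4) can be sketched as below. This is an illustrative sketch under stated assumptions: `bootstrap_split`, `bootstrap_evaluate`, and the `majority_baseline` scorer are hypothetical names, and the majority-label scorer merely stands in for training the ensemble model, which the claim does not specify in code.

```python
import random


def bootstrap_split(n_samples, rng):
    """Steps (1)-(3): draw N indices with replacement for training;
    indices never drawn form the validation set."""
    train_idx = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(train_idx)
    val_idx = [i for i in range(n_samples) if i not in drawn]
    return train_idx, val_idx


def bootstrap_evaluate(data, train_and_score, m_rounds=10, seed=0):
    """Step (4): repeat M times and average the evaluation index."""
    rng = random.Random(seed)
    scores = []
    for _ in range(m_rounds):
        train_idx, val_idx = bootstrap_split(len(data), rng)
        if not val_idx:
            continue  # degenerate round: every sample was drawn
        train = [data[i] for i in train_idx]
        val = [data[i] for i in val_idx]
        scores.append(train_and_score(train, val))
    if not scores:
        raise ValueError("no round produced a validation set")
    return sum(scores) / len(scores)


def majority_baseline(train, val):
    """Hypothetical stand-in for model training: predict the most
    common training label and return accuracy on the validation set."""
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)
    return sum(1 for _, label in val if label == majority) / len(val)


# Hypothetical data: 1000 samples, every tenth one labelled as risk (1).
data = [(i, 1 if i % 10 == 0 else 0) for i in range(1000)]
mean_score = bootstrap_evaluate(data, majority_baseline, m_rounds=5, seed=1)

# Fraction of samples left out of one bootstrap training set.
train_idx, val_idx = bootstrap_split(1000, random.Random(1))
oob_fraction = len(val_idx) / 1000
```

For large N the left-out fraction approaches 1/e, so roughly a third of the data is available for validation in each round without a separate hold-out set.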
CN202010649490.3A 2020-07-08 2020-07-08 Traffic risk prediction method based on complex network theory Active CN111967712B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010649490.3A CN111967712B (en) 2020-07-08 2020-07-08 Traffic risk prediction method based on complex network theory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010649490.3A CN111967712B (en) 2020-07-08 2020-07-08 Traffic risk prediction method based on complex network theory

Publications (2)

Publication Number Publication Date
CN111967712A (en) 2020-11-20
CN111967712B (en) 2023-04-07

Family

ID=73361398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010649490.3A Active CN111967712B (en) 2020-07-08 2020-07-08 Traffic risk prediction method based on complex network theory

Country Status (1)

Country Link
CN (1) CN111967712B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112989374A (en) * 2021-03-09 2021-06-18 闪捷信息科技有限公司 Data security risk identification method and device based on complex network analysis
CN112991743A (en) * 2021-04-22 2021-06-18 泰瑞数创科技(北京)有限公司 Real-time traffic risk AI prediction method based on driving path and system thereof
CN113034913A (en) * 2021-03-22 2021-06-25 平安国际智慧城市科技股份有限公司 Traffic congestion prediction method, device, equipment and storage medium
CN115985089A (en) * 2022-12-01 2023-04-18 西部科学城智能网联汽车创新中心(重庆)有限公司 Method and device for perceiving vulnerable traffic participants based on cloud
CN116307737A (en) * 2023-05-06 2023-06-23 交通运输部水运科学研究所 Dangerous cargo container security risk prediction method based on port berth congestion degree

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109583494A (en) * 2018-11-28 2019-04-05 重庆邮电大学 The feature extraction and prediction technique of dynamic network link based on structure Sub-Image Feature
CN110211378A (en) * 2019-05-29 2019-09-06 北京航空航天大学 A kind of urban transportation health indicator system appraisal procedure based on Complex Networks Theory
US20190355244A1 (en) * 2018-09-18 2019-11-21 Beihang University Method for anticipating tipping point of traffic resilience based on percolation analysis
CN111081016A (en) * 2019-12-18 2020-04-28 北京航空航天大学 Urban traffic abnormity identification method based on complex network theory

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
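JIANXI GAO et al.: 《Recent Progress on the Resilience of Complex Networks》, 《ENERGIES》 *
LIMIAO ZHANG et al.: 《Scale-free resilience of real traffic jams》, 《PNAS》 *
GAO Ziyou et al.: "Research on complex network theory and the complexity of urban traffic systems", 《Journal of Transportation Systems Engineering and Information Technology》 *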

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112989374A (en) * 2021-03-09 2021-06-18 闪捷信息科技有限公司 Data security risk identification method and device based on complex network analysis
CN113034913A (en) * 2021-03-22 2021-06-25 平安国际智慧城市科技股份有限公司 Traffic congestion prediction method, device, equipment and storage medium
CN112991743A (en) * 2021-04-22 2021-06-18 泰瑞数创科技(北京)有限公司 Real-time traffic risk AI prediction method based on driving path and system thereof
CN115985089A (en) * 2022-12-01 2023-04-18 西部科学城智能网联汽车创新中心(重庆)有限公司 Method and device for perceiving vulnerable traffic participants based on cloud
CN115985089B (en) * 2022-12-01 2024-03-19 西部科学城智能网联汽车创新中心(重庆)有限公司 Method and device for perceiving weak traffic participants based on cloud
CN116307737A (en) * 2023-05-06 2023-06-23 交通运输部水运科学研究所 Dangerous cargo container security risk prediction method based on port berth congestion degree
CN116307737B (en) * 2023-05-06 2023-07-18 交通运输部水运科学研究所 Dangerous cargo container security risk prediction method based on port berth congestion degree

Also Published As

Publication number Publication date
CN111967712B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN111967712B (en) Traffic risk prediction method based on complex network theory
CN111081016B (en) Urban traffic abnormity identification method based on complex network theory
Liu et al. Modeling different urban growth patterns based on the evolution of urban form: A case study from Huangpi, Central China
CN104503874A (en) Hard disk failure prediction method for cloud computing platform
CN109840660A (en) A kind of vehicular characteristics data processing method and vehicle risk prediction model training method
CN114547827B (en) Infrastructure group running state evaluation method, electronic device and storage medium
CN114330812A (en) Landslide disaster risk assessment method based on machine learning
KR20090093174A (en) Method and system for evaluating ground-water pollution vulnerability and risk assessment
Liu et al. A comprehensive risk analysis of transportation networks affected by rainfall‐induced multihazards
CN114036841A (en) Landslide incidence prediction method and system based on semi-supervised support vector machine model
Wang et al. Design and implementation of early warning system based on educational big data
Liu et al. Traffic dynamics exploration and incident detection using spatiotemporal graphical modeling
Pampoore-Thampi et al. Mining GIS data to predict urban sprawl
CN113191642B (en) Regional landslide sensitivity analysis method based on optimal combination strategy
Soldan et al. Short-term forecast of EV charging stations occupancy probability using big data streaming analysis
Tang et al. Flood forecasting based on machine learning pattern recognition and dynamic migration of parameters
Kovačević et al. Sampling and machine learning methods for a rapid earthquake loss assessment system
Momeni et al. Pattern‐based calibration of cellular automata by genetic algorithm and Shannon relative entropy
CN117540303A (en) Landslide susceptibility assessment method and system based on cross semi-supervised machine learning algorithm
CN117275215A (en) Urban road congestion space-time prediction method based on graph process neural network
CN116720098A (en) Abnormal behavior sensitive student behavior time sequence modeling and academic early warning method
CN115829150A (en) Accumulated water prediction system
Weifeng et al. On rural typologies with neural network method: Case study on Xining region
Dong et al. Short-term traffic flow forecasting of road network based on spatial-temporal characteristics of traffic flow
CN112465189A (en) Method for predicting number of court settlement plans based on time-space correlation analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant