Background Art
With growing traffic demand and worsening traffic congestion, traditional methods can no longer solve the congestion problem. Intelligent transportation has therefore developed rapidly in recent years, and traffic congestion detection, as a part of the Advanced Traveler Information System (ATIS), has attracted wide interest among researchers. Traffic congestion is divided into periodic and sporadic traffic congestion. Compared with periodic congestion, sporadic congestion has a larger influence on travelers, because the delay it causes is unexpected and seriously interferes with people's routing decisions. Since traffic accidents usually cause sporadic traffic congestion, accident detection was naturally regarded as a necessary prerequisite for congestion detection; early research therefore focused on traffic accident detection and judged sporadic congestion from its results. Since the 1970s, many monitoring-based automatic incident detection (AID) algorithms have been proposed. Subsequently, it was proposed to detect traffic accidents from a statistical point of view, using Bayesian and standard normal deviate algorithms. Other researchers have used pattern recognition algorithms for detection, such as the California algorithm. The California algorithm is one of the earliest detection algorithms; although its accuracy is quite limited, it has been improved in subsequent research. In addition, data-driven artificial neural network (ANN) algorithms have also been applied to incident detection.
However, traffic accident detection cannot completely replace sporadic traffic congestion detection, as traffic accidents are only one of the many causes of sporadic traffic congestion. Although the causes of sporadic traffic congestion vary, their effects on the distribution of the various traffic state variables are similar. Therefore, the real-time traffic state can serve as an indicator of sporadic traffic congestion, and whether sporadic congestion exists can be judged from changes in the traffic state. Knowing the distribution and patterns of traffic state data such as traffic flow and speed under normal conditions is thus the key to detecting sporadic traffic congestion: once the normal data are determined, the difference between the real data and the normal data indicates sporadic congestion. To extract the normal distribution and patterns of traffic state data, many existing studies focus on setting a threshold, that is, an initial preset value is given, and when the observed value of the traffic data is greater than the preset value, sporadic traffic congestion is considered to have occurred. Conventional sporadic congestion detection therefore typically includes four parts: data acquisition, data preprocessing, threshold design for the normal traffic state in a single time mode/dimension, and sporadic congestion detection. The main flow is shown in fig. 1. First, the traffic state data (generally, road density data) of a road section are collected. Then, the original data are preprocessed, including screening abnormal values and filling lost data; if the data come from detectors, data with a low detection rate are eliminated. Next, based on historical experience or statistical methods, with time as the independent variable and the traffic state data as the dependent variable, the upper threshold of the traffic state data under normal conditions is set. Finally, the acquired traffic state data are compared with the normal threshold to detect the time and place of sporadic traffic congestion.
However, the periodic travel habits of commuters not only produce morning and evening peaks and off-peak periods, but also give the traffic state a week-mode characteristic: traffic on working days is markedly higher than on weekends. A single threshold determined from traffic state data such as traffic flow and speed may not accurately represent their normal distribution and patterns. Furthermore, such a method is overly simple: the traffic data of adjacent road sections and location points are correlated, which means that traffic data have strong spatial correlation. Therefore, a model that can exploit the day-mode and week-mode distributions and the spatial correlation of traffic data is needed to detect sporadic traffic congestion.
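The conventional single-threshold flow described above can be illustrated with a minimal sketch. It assumes density data sampled at 5-minute intervals (288 slots per day) and sets the upper threshold of each time slot to the historical mean plus `k` standard deviations. The function name `threshold_detect`, the multiplier `k`, and the synthetic data are hypothetical choices for illustration, not values taken from any cited method.

```python
import numpy as np

def threshold_detect(history, observed, k=3.0):
    """Flag time slots whose observed value exceeds a per-slot upper threshold.

    history:  array (days, slots), past data assumed to reflect normal traffic
    observed: array (slots,), the day being checked
    k:        hypothetical multiplier on the per-slot standard deviation
    """
    mean = history.mean(axis=0)
    std = history.std(axis=0)
    upper = mean + k * std  # upper limit of the normal range for each time slot
    return observed > upper

rng = np.random.default_rng(0)
slots = 288  # 5-minute intervals in one day
history = 20.0 + rng.normal(0.0, 1.0, size=(30, slots))  # synthetic normal days
observed = np.full(slots, 20.0)
observed[100:110] = 40.0  # injected density surge standing in for congestion
flags = threshold_detect(history, observed)
print(np.flatnonzero(flags))
```

Because the threshold depends only on the time of day, this sketch ignores week-mode patterns and spatial correlation, which is the limitation discussed in the text.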
Early threshold-based detection methods mostly take a single traffic data series or data vector as the research object. A vector can represent at most one aspect of the traffic data, such as the day-pattern characteristic, the week-pattern characteristic, or the spatial characteristic, and cannot consider these characteristics jointly. In recent years, many researchers have begun to express the spatio-temporal correlation of traffic data with matrices, i.e., traffic data are constructed as matrix models rather than as simple vectors. The detection method based on Bayesian robust principal component analysis (BRPCA) assumes that sporadic traffic events are sparsely distributed in time and space and that the normal traffic data distribution is low-rank, because traffic observations have been shown to exhibit strongly correlated, periodic temporal and spatial patterns, and the observations of adjacent upstream and downstream road sections are strongly correlated. Experimental results of the BRPCA-based method demonstrate that outliers such as delays can be detected and tracked well through the low-rank and sparse structure of traffic data. In addition, the research on sporadic traffic congestion detection in urban road networks published in 2014 by Berk Anbaroglu et al. also proposes to make full use of the spatio-temporal correlation of traffic state variables: a clustering analysis method groups traffic states that are similar in time and space, and the occurrence time, location, and magnitude of sporadic traffic congestion are then determined. They demonstrated that detection accuracy can be improved by making full use of the spatio-temporal correlation of traffic conditions. In summary, the general trend shows that detection accuracy can be greatly improved by combining multiple kinds of traffic state data and fully exploiting their spatio-temporal correlation.
However, a matrix (a second-order tensor) can exploit only two mode characteristics at the same time, whereas traffic data are naturally multi-mode data, so the data model needs to be improved to make full use of the multi-mode characteristics. For the processing of incomplete traffic data, the inventors demonstrated in 2013 that a tensor model can take full advantage of the latent multi-mode characteristics of traffic data and is therefore superior to many matrix-based models. The inventors therefore combine the low-rank characteristic of traffic data with the advantages of the tensor model and construct a robust tensor decomposition model to extract the low-rank and sparse parts of the traffic data, thereby obtaining the spatio-temporal position information of sporadic traffic congestion.
Once the model has been selected, how to make full use of the various kinds of traffic data is also important for congestion detection. Most existing studies are based on a single traffic state variable, but sporadic congestion affects the traffic state in many ways, and the changes in different state variables correspond to different types of congestion. To improve the accuracy of sporadic traffic congestion detection, multiple traffic state variables should therefore be combined. Accordingly, on the basis of the robust decomposition model for traffic tensor data, the inventors couple multiple traffic state variables and realize the coupled decomposition of multiple traffic variables.
Disclosure of Invention
In view of the above, the present invention provides a sporadic traffic congestion detection method based on coupled robust tensor decomposition, so as to solve the problems of traffic congestion detection in the prior art.
To achieve this purpose, the invention adopts the following technical solution:
A sporadic traffic congestion detection method based on coupled robust tensor decomposition comprises the following steps:
collecting multiple kinds of traffic variable data;
constructing a tensor model from the multiple kinds of traffic variable data;
performing filling preprocessing on the lost data in the multiple kinds of traffic variable data;
constructing a coupled robust decomposition model of the multiple kinds of traffic variable data;
and detecting sporadic traffic congestion according to the coupled robust decomposition model.
Further, constructing a tensor model from the multiple kinds of traffic variable data comprises:
according to the characteristics of the traffic state parameters, constructing a fourth-order tensor model of the form
$$\mathcal{X} \in \mathbb{R}^{N \times M \times W \times T}$$
where N denotes the number of detectors, M denotes the number of weeks, W denotes the 7 days of a week or the 5 working days, and T denotes the time of day.
Further, constructing the coupled robust decomposition model of the multiple kinds of traffic variable data includes:
performing coupled robust tensor decomposition on the multiple kinds of traffic state variable data; and identifying the sparse part obtained by the coupled robust tensor decomposition as the spatio-temporal position and magnitude information of sporadic traffic congestion.
Further, the coupled robust tensor decomposition of the multiple kinds of traffic state variable data takes the form:
$$\mathcal{Y}_i = \mathcal{L}_i + \mathcal{Z} \circledast \mathcal{S}_i + \mathcal{E}_i, \quad i = 1, 2, 3$$
wherein $\mathcal{Y}_i$ represents the tensor model constructed from the measured or observed data of the $i$-th traffic variable; $\mathcal{L}_i$ represents the low-rank part; $\mathcal{Z} \circledast \mathcal{S}_i$ represents the sparse part, with $\circledast$ denoting the element-wise product; $\mathcal{E}_i$ represents the white noise of the data as a whole; and $\mathcal{Z}$ is the sparse distribution common to the three kinds of data.
Further, the factor matrices obtained by CP decomposition of the low-rank part obey a normal distribution, for $1 \le r \le R$ and $1 \le n \le N$, wherein $R$ represents the CP rank and $N$ represents the order of the tensor.
Further, each element of $\mathcal{Z}$ takes the value 0 or 1 and obeys a Bernoulli distribution; its conjugate prior obeys a Beta distribution; each element of $\mathcal{S}_i$ obeys a normal distribution; and the white noise $\mathcal{E}_i$ obeys a normal distribution.
Further, the sparse part obtained by the coupled robust tensor decomposition is identified as the spatio-temporal position and magnitude information of sporadic traffic congestion, wherein $\mathcal{Z}$ marks the spatio-temporal positions at which sporadic traffic congestion occurs, and $\mathcal{S}_i$ gives the magnitude of the sporadic traffic congestion at the corresponding spatio-temporal positions.
Advantageous Effects:
The method first collects multiple kinds of traffic variable data; then, according to the characteristics of the various traffic data, constructs them into tensor models of the same size and order; then performs filling preprocessing on the lost data; then constructs a Bayesian robust tensor decomposition model with an adaptive rank and describes the sparse distribution shared by the different kinds of traffic data from a probabilistic perspective, thereby constructing a coupled robust decomposition model of multiple kinds of traffic data; and finally designs a solving method to realize fast, high-accuracy detection of sporadic traffic congestion.
Detailed Description
The embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
To address the shortcomings of the traditional methods, the invention provides a sporadic congestion detection method based on coupled Bayesian robust tensor decomposition, whose technical route is shown in fig. 2. First, multiple kinds of traffic variable data (traffic flow, road density, speed, and the like) are collected. Then, according to the characteristics of the various traffic data, the data are constructed into tensor models of the same size and order. Next, filling preprocessing is performed on the lost data. A Bayesian robust tensor decomposition model with an adaptive rank is then constructed, and the sparse distribution shared by the different kinds of traffic data is described from a probabilistic perspective, thereby constructing a coupled robust decomposition model of multiple kinds of traffic data. Finally, a solving method is designed to realize fast, high-accuracy detection of sporadic traffic congestion.
(1) Traffic data collection
The invention uses daily traffic flow data, road density data, and speed data from the California highway database PeMS (obtained from http://pems.dot.ca.gov/), covering April 1, 2014 to March 2, 2015. The time interval of the data is 5 min, and the road density is the time-based density.
(2) Tensor model construction of multiple traffic state variable data
According to the characteristics of the traffic state parameters, a tensor model can generally be established as
$$\mathcal{X} \in \mathbb{R}^{N \times M \times W \times T}$$
where N denotes the number of detectors, M denotes the number of weeks, W denotes the 7 days of a week or the 5 working days, and T denotes the time of day. The fourth-order tensor model of the three kinds of traffic data is shown in fig. 3. When only the data of one detector are considered, the detector dimension is 1, that is, the model becomes a third-order tensor $\mathcal{X} \in \mathbb{R}^{M \times W \times T}$. Such tensor models can adequately characterize and exploit the spatio-temporal multi-mode correlations of traffic data.
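As an illustration of this construction, the following minimal sketch reshapes per-detector time series into a tensor indexed by detector, week, day of week, and time of day. It assumes that each series is in chronological order, starts on the first day of a week, and has no gaps; the function name `build_traffic_tensor` and the mode order (N, M, W, T) are assumptions made for the example.

```python
import numpy as np

def build_traffic_tensor(series, n_weeks, days_per_week, slots_per_day):
    """Reshape (N, M*W*T) chronological series into an (N, M, W, T) tensor."""
    series = np.asarray(series, dtype=float)
    n_detectors, length = series.shape
    expected = n_weeks * days_per_week * slots_per_day
    if length != expected:
        raise ValueError(f"expected {expected} samples per detector, got {length}")
    # Chronological order means time of day varies fastest, then day, then week.
    return series.reshape(n_detectors, n_weeks, days_per_week, slots_per_day)

N, M, W, T = 2, 48, 7, 288  # 2 detectors, 48 weeks, 7 days, 5-minute slots
flat = np.arange(N * M * W * T, dtype=float).reshape(N, -1)  # placeholder data
X = build_traffic_tensor(flat, M, W, T)
print(X.shape)
```

With this layout, `X[n, m, w, t]` is the value recorded by detector `n` in week `m`, on day `w` of that week, at time slot `t`. Building one such tensor per traffic variable (flow, density, speed) gives tensors of the same size and order.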
(3) Lost data preprocessing
Owing to uncontrollable factors such as weather and facility failures, the acquired traffic data are usually incomplete, which affects subsequent data analysis. To address this problem, the inventors adopt the traffic data tensor completion method they proposed in 2013 to perform filling preprocessing on the lost data.
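The tensor completion method itself is not reproduced here. As a deliberately simpler stand-in that only shows where this step sits in the pipeline, the sketch below fills each missing entry (marked as NaN) with the mean of the same detector, day of week, and time slot over the other weeks. The function name `fill_missing_by_week_mean` and the NaN convention are assumptions of this example.

```python
import numpy as np

def fill_missing_by_week_mean(X):
    """Fill NaN entries of an (N, M, W, T) tensor with the mean over the week mode.

    This is a simple baseline, not the tensor completion method referred to in
    the text. Entries missing in every week remain NaN.
    """
    X = np.asarray(X, dtype=float)
    week_mean = np.nanmean(X, axis=1, keepdims=True)  # shape (N, 1, W, T)
    return np.where(np.isnan(X), week_mean, X)

# One detector, three weeks, one day, two time slots; one value is lost.
X = np.array([[[[10.0, 1.0]], [[np.nan, 2.0]], [[14.0, 3.0]]]])
filled = fill_missing_by_week_mean(X)
print(filled[0, 1, 0, 0])
```

A low-rank tensor completion method would instead exploit all modes jointly, which matters when data are missing for long consecutive periods.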
(4) Coupled robust tensor decomposition of multiple kinds of traffic state variable data
After the various traffic data are constructed into tensor models, a suitable robust tensor decomposition model and method must be constructed to obtain the low-rank part (normal traffic data distribution) and the sparse part (sporadic congestion distribution) of the traffic data, and at the same time to extract the coupled distribution among the various kinds of data.
First, a suitable robust tensor decomposition model that can be generalized to the coupled case needs to be studied. The general form of the robust tensor decomposition model is shown in fig. 4, where "Observed tensor" denotes the tensor model constructed from measured or observed data, "Low-rank" denotes the low-rank part obtained by decomposition, "Sparse" denotes the sparse part obtained by decomposition, and "Noise" denotes normally distributed white noise. To make the model more interpretable, the inventors design and adopt a Bayesian robust tensor decomposition model that describes the meaning of the traffic data and of the parts obtained by its decomposition from a probabilistic perspective. An adaptive rank increase and decrease method is also provided to obtain the rank of the low-rank part quickly.
Then, the coupling relationship between the various kinds of traffic state data is considered: when sporadic traffic congestion occurs, different traffic state data share the same abnormal sparse distribution. On the basis of robust tensor decomposition, the inventors express the sparse part as the element-wise product of a 0-1 indicator tensor and a normally distributed tensor, and thereby construct the coupled robust tensor decomposition model shown in formula (1):
$$\mathcal{Y}_i = \mathcal{L}_i + \mathcal{Z} \circledast \mathcal{S}_i + \mathcal{E}_i, \quad i = 1, 2, 3 \tag{1}$$
where $\mathcal{Y}_i$ represents the tensor model constructed from the measured or observed data of the $i$-th traffic variable; $\mathcal{L}_i$ represents the low-rank part (normal traffic distribution), obtained by CP (CANDECOMP/PARAFAC) decomposition; $\mathcal{Z} \circledast \mathcal{S}_i$ represents the sparse part (sporadic traffic congestion), with $\circledast$ denoting the element-wise product; $\mathcal{E}_i$ represents the white noise of the data as a whole; and $\mathcal{Z}$ is the sparse distribution common to all three kinds of data. Prior assignment and posterior solution are then carried out for the probability distribution of each part. For each kind of data, according to natural laws, it is assumed that:
the factor matrices obtained from the low-rank part by CP decomposition obey a normal distribution, for $1 \le r \le R$ and $1 \le n \le N$, where $R$ represents the CP rank and $N$ represents the order of the tensor;
each element of $\mathcal{Z}$ takes the value 0 or 1 and obeys a Bernoulli distribution;
its conjugate prior obeys a Beta distribution;
each element of $\mathcal{S}_i$ obeys a normal distribution;
the white noise $\mathcal{E}_i$ obeys a normal distribution.
The other hyperparameters follow the Gamma distribution, which is the conjugate prior of the normal distribution: $\nu \sim \mathrm{Gamma}(c_0, d_0)$, $\gamma \sim \mathrm{Gamma}(e_0, f_0)$. Notably, to extract the low-rank characteristic of the data, the invention adopts a multiplicative gamma process to describe the superdiagonal parameters $\lambda_r$ of the CP decomposition model, i.e. $\lambda_r \sim \mathcal{N}(0, \tau_r^{-1})$, $\tau_r = \prod_{l=1}^{r} \delta_l$, $\delta_l \sim \mathrm{Gamma}(a_0, 1)$, $a_0 > 1$, $r = 1, \dots, R$. As the rank index $r$ increases, $\tau_r$ increases rapidly and $\tau_r^{-1}$ gradually approaches 0, so that $\lambda_r$ approaches 0, which ensures the low-rank characteristic of the data.
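To make the structure of the coupled model concrete, the following sketch draws synthetic data from a generative process of the same shape: each traffic variable has its own CP low-rank part, its own normally distributed magnitudes, and its own noise, while a single 0-1 indicator tensor is shared by all variables. The CP weights are drawn from a multiplicative gamma process in its standard form (weight variance equal to the inverse of a cumulative product of Gamma draws). All numeric settings (`a0`, the sparsity rate `pi`, the standard deviations, the tensor shape) are hypothetical, and the third-order shape corresponds to a single detector.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mgp_weights(R, a0=3.0):
    """Multiplicative gamma process: tau_r = prod_{l<=r} delta_l, lam_r ~ N(0, 1/tau_r)."""
    delta = rng.gamma(shape=a0, scale=1.0, size=R)  # delta_l ~ Gamma(a0, 1), a0 > 1
    tau = np.cumprod(delta)                         # tends to grow with r
    lam = rng.normal(0.0, 1.0 / np.sqrt(tau))       # later weights shrink toward 0
    return lam, tau

def cp_reconstruct(lam, factors):
    """Weighted sum of R rank-one tensors (third-order case)."""
    A, B, C = factors
    return np.einsum("r,ir,jr,kr->ijk", lam, A, B, C)

shape, R, n_vars = (4, 5, 24), 6, 3  # weeks x days x slots, CP rank, variables
pi = 0.05                            # hypothetical probability of an anomaly
Z = rng.random(shape) < pi           # 0-1 indicator shared by all variables

observations, sparse_parts = [], []
for _ in range(n_vars):
    lam, tau = sample_mgp_weights(R)
    factors = [rng.normal(size=(dim, R)) for dim in shape]
    low_rank = cp_reconstruct(lam, factors)
    S = rng.normal(0.0, 5.0, size=shape)  # magnitudes specific to this variable
    E = rng.normal(0.0, 0.1, size=shape)  # white noise
    sparse_parts.append(Z * S)
    observations.append(low_rank + Z * S + E)
```

Every variable's sparse part has exactly the same support `Z`, which is the coupling assumption: a sporadic congestion event disturbs flow, density, and speed at the same spatio-temporal positions, though with different magnitudes.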
After the robust tensor decomposition model is constructed, what remains is to solve the posterior distribution of each part of the model. First, the overall joint probability is expressed as formula (2); then, the posterior distributions are obtained by using the linear operation properties of the Gaussian distribution and the Gibbs sampling method, and the updating process is expressed as formulas (4) to (10).
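As a small illustration of what one Gibbs sampling step looks like, the sketch below shows the textbook conjugate update for a Bernoulli rate with a Beta prior: given the current 0-1 indicators, the rate is redrawn from a Beta distribution whose parameters are the prior parameters plus the counts of ones and zeros. This is a generic conjugacy result, not a reproduction of formulas (4) to (10); the function name and the prior parameters `a0`, `b0` are hypothetical.

```python
import numpy as np

def sample_sparsity_rate(Z, a0, b0, rng):
    """Draw pi from its Beta posterior, given Z ~ Bernoulli(pi) and pi ~ Beta(a0, b0)."""
    Z = np.asarray(Z)
    ones = int(Z.sum())
    zeros = Z.size - ones
    return rng.beta(a0 + ones, b0 + zeros)

rng = np.random.default_rng(1)
Z = np.zeros(1000, dtype=int)
Z[:50] = 1  # 5% of the entries are currently flagged as anomalous
draws = np.array([sample_sparsity_rate(Z, 1.0, 1.0, rng) for _ in range(2000)])
print(draws.mean())
```

In a full sampler, a step of this kind alternates with updates of the indicators, the factor matrices, the sparse magnitudes, and the precision hyperparameters, each drawn from its own conditional posterior.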
The posterior distribution of each variable is:
where a threshold for increasing or decreasing the rank can be set on $\lambda_r$: when all $\lambda_r$ are larger than the threshold, the rank is automatically increased; conversely, when any $\lambda_r$ is smaller than the threshold, the rank is decreased by 1 and the rank-one tensor corresponding to that $\lambda_r$ is discarded. An adaptive rank is thus obtained while the low-rank characteristic is preserved.
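The rank increase and decrease rule can be sketched as follows. The sketch compares the magnitude of each CP weight with the threshold, drops the weakest component when any weight falls below it, and appends a new component when all weights exceed it. Using the absolute value of the weights, dropping only the single weakest component, and the way the new component is initialized are assumptions of this example.

```python
import numpy as np

def adapt_rank(lam, factors, threshold, rng):
    """Grow or shrink the CP rank by one, based on the weight magnitudes.

    lam:     array (R,), CP weights
    factors: list of arrays (I_n, R), one factor matrix per mode
    """
    mag = np.abs(lam)
    if np.all(mag > threshold):
        # Every component is significant: add one more rank-one term.
        lam = np.append(lam, mag.mean())  # hypothetical initial weight
        factors = [np.hstack([F, rng.normal(size=(F.shape[0], 1))]) for F in factors]
    elif lam.size > 1:
        # At least one weight is below the threshold: discard the weakest term.
        keep = np.arange(lam.size) != int(np.argmin(mag))
        lam = lam[keep]
        factors = [F[:, keep] for F in factors]
    return lam, factors

rng = np.random.default_rng(2)
factors = [np.ones((4, 3)), np.ones((5, 3))]
grown_lam, grown_factors = adapt_rank(np.array([1.0, 0.8, 0.5]), factors, 0.1, rng)
pruned_lam, pruned_factors = adapt_rank(np.array([1.0, 0.05, 0.5]), factors, 0.1, rng)
print(grown_lam.size, pruned_lam.size)
```

Calling such a rule between sampling sweeps lets the rank settle without fixing it in advance.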
(5) Identifying the sparse part obtained by the coupled robust tensor decomposition as the spatio-temporal position and magnitude information of sporadic traffic congestion
As shown in fig. 5, taking the coupling of two kinds of traffic data as an example, the method of the present invention can extract the low-rank part and the sparse part of the traffic data. The shared 0-1 sparse indicator tensor $\mathcal{Z}$ is used to identify the spatio-temporal positions at which sporadic traffic congestion occurs, and the sparse magnitude tensor $\mathcal{S}_i$ of each kind of data gives the magnitude of the sporadic traffic congestion at the corresponding spatio-temporal positions.
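Reading the detection result out of the sparse part can be sketched as follows, assuming the decomposition has already produced a 0-1 indicator tensor and a magnitude tensor of the same shape. The function name `extract_congestion_events` and the toy tensors are hypothetical.

```python
import numpy as np

def extract_congestion_events(Z, S):
    """Return (index, magnitude) pairs for the entries flagged by the indicator Z."""
    Z = np.asarray(Z).astype(bool)
    S = np.asarray(S, dtype=float)
    positions = np.argwhere(Z)  # row-major order, same order as S[Z]
    magnitudes = S[Z]
    return [(tuple(int(v) for v in p), float(m)) for p, m in zip(positions, magnitudes)]

# Toy third-order example: (week, day, time slot) for a single detector.
Z = np.zeros((2, 3, 4), dtype=int)
S = np.zeros((2, 3, 4))
Z[0, 1, 0], S[0, 1, 0] = 1, 7.5
Z[1, 2, 3], S[1, 2, 3] = 1, -2.0
events = extract_congestion_events(Z, S)
print(events)
```

Each index maps back to a week, day, and time slot (and to a detector in the fourth-order case), which gives the time and place of the event; the sign and size of the magnitude depend on the traffic variable, for example a drop in speed or a rise in density.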
(6) Effect of the experiment
After the model was established, the inventors conducted experiments on real data. The historical records of non-periodic traffic congestion events provided by the traffic management department are used as the ground truth, and the ratio of the number of sporadic congestion events correctly detected by a method to the number of ground-truth events is used as the detection accuracy. The experiments compare the traditional congestion detection method Standard Normal Deviate (SND), the Coupled BRPCA method proposed in 2014, the RSTD method proposed in 2015, and the method proposed by the invention. The parameters of the compared methods are tuned so that the false detection rates of all methods are essentially equal (a false positive means that the method reports sporadic congestion although no sporadic congestion event occurs in the ground truth). When the all-day data of 48 weeks from one detector are used, the results are shown in table 1: the method of the present invention achieves the best effect, followed by BRPCA. When two adjacent detectors are considered, the detection results are shown in table 2: the detection rate decreases overall, but the tensor-based methods (RSTD and coupled BRPTF) achieve better results, whereas the effect of the BRPCA method is significantly lower than with a single detector. The analysis suggests the reason is that a ramp lies between the two detectors (when an abnormal event occurs, some drivers may choose to exit the expressway), which reduces the spatio-temporal influence of traffic congestion between the two adjacent detectors.
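The accuracy measure described above can be written down as a short sketch. Events are represented as hashable identifiers (here, hypothetical detector and time-slot pairs); the detection rate is the share of ground-truth events that were detected, and false positives are detections with no matching ground-truth event. Because the text does not state the denominator of the false detection rate, the sketch reports the false positive count only.

```python
def detection_rate(detected, truth):
    """Fraction of ground-truth events that the method detected."""
    detected, truth = set(detected), set(truth)
    if not truth:
        raise ValueError("ground truth must contain at least one event")
    return len(detected & truth) / len(truth)

def false_positive_count(detected, truth):
    """Number of detections that match no ground-truth event."""
    return len(set(detected) - set(truth))

# Hypothetical events: (detector id, time-slot index).
truth = {("d1", 3), ("d1", 10), ("d1", 15), ("d1", 30)}
detected = {("d1", 3), ("d1", 10), ("d1", 20)}
rate = detection_rate(detected, truth)
false_positives = false_positive_count(detected, truth)
print(rate, false_positives)
```

Tuning every method to a similar false positive level before comparing detection rates, as the experiments do, keeps a method from looking better merely by flagging more entries.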
Detailed analysis of the 7-day experimental results reveals that, for BRPCA, RSTD, and the method of the invention, and particularly for the latter two, most of the undetected congestion occurred on weekends. A plausible explanation is that, owing to commuting demand, the traffic data of working days are more similar in their day patterns, i.e., the rank of the normal traffic distribution is lower; if working days are considered separately, the detection accuracy may therefore be higher, and for commuters the most important concern is abnormal congestion on working days. To test this conjecture, experiments were carried out on a data set containing only the working days of the 48 weeks. The results are shown in tables 3 and 4, where table 3 gives the results for one detector and table 4 the results for two adjacent detectors. The detection accuracy of the tensor-based methods is greatly improved compared with the 7-day case, and the method provided by the invention has a clear advantage.
Table 1. Detection accuracy of different methods using the all-day data of 48 weeks from one detector
Table 2. Detection accuracy of different methods using the all-day data of 48 weeks from two adjacent detectors
Table 3. Detection accuracy of different methods using the working-day data of 48 weeks from one detector
Table 4. Detection accuracy of different methods using the working-day data of 48 weeks from two adjacent detectors
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above embodiments further explain the object, technical solution and advantageous effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention, and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.