CN111768635A - A Coupling Robust Tensor Decomposition-Based Approach for Occasional Traffic Congestion Detection

Info

Publication number: CN111768635A
Application number: CN202010257048.6A
Authority: CN
Inventors: 李琴; 丁璠; 谭春华; 伍元凯; 叶林辉; 陈晓轩
Original assignee: Southeast University
Current assignee: Southeast University
Priority date: 2020-04-02
Filing date: 2020-04-02
Publication date: 2020-10-13
Anticipated expiration: 2040-04-02
Also published as: CN111768635B

Abstract

The invention discloses a method for detecting occasional traffic congestion based on coupled robust tensor decomposition. The method first collects data on multiple traffic variables; then, according to the characteristics of each type of traffic data, constructs the data into tensor models of the same size and the same order; then fills in the missing data as a preprocessing step; then constructs a Bayesian robust tensor decomposition model with adaptive rank and describes, from a probabilistic perspective, the sparse distribution shared by the different types of traffic data, thereby building a coupled robust decomposition model for multiple types of traffic data; and finally designs a solution method to achieve fast, high-precision detection of occasional traffic congestion.

Description

A Coupling Robust Tensor Decomposition-Based Approach for Occasional Traffic Congestion Detection

Technical field:

The invention relates to the field of intelligent transportation, and in particular to a method for detecting occasional traffic congestion based on coupled robust tensor decomposition.

Background art:

As traffic demand grows and congestion intensifies, traditional methods can no longer solve the congestion problem. Intelligent transportation has therefore developed rapidly in recent years, and because traffic congestion detection is part of the Advanced Traveler Information System (ATIS), research on it has attracted broad interest from scholars. Traffic congestion is divided into periodic congestion and occasional congestion. Of the two, occasional congestion affects travelers more, because the delays it causes are unexpected and seriously disturb route-choice decisions. Since traffic accidents usually lead to occasional congestion, accident detection was long taken for granted as a necessary prerequisite for congestion detection; early research therefore focused on traffic accident detection and used its results to infer occasional congestion. Since the 1970s, many surveillance-based traffic accident detection (AID) algorithms have been proposed. Subsequently, Bayesian and standard normal deviate algorithms were proposed to detect traffic accidents from a statistical point of view. Other researchers used pattern recognition algorithms such as the California algorithm, which is one of the earliest detection algorithms; although its accuracy is limited, it has been improved in later studies. In addition, data-driven artificial neural network (ANN) algorithms have also been applied to incident detection.

However, traffic accidents cannot fully stand in for occasional traffic congestion, because accidents are only one of its many causes. Although the causes vary, their effects on the distribution of the various traffic states are similar. The real-time traffic state can therefore serve as an indicator of occasional traffic congestion, and changes in the traffic state can be used to judge whether such congestion exists. Knowing the distribution and patterns of traffic state data such as flow and speed under normal conditions is thus the key to detecting occasional traffic congestion: once the normal-state data are determined, the differences between the actual data and the normal data indicate occasional traffic congestion. To extract the normal distribution and patterns of traffic state data, many existing studies focus on setting a threshold: given a preset value, occasional traffic congestion is deemed to have occurred when the observed traffic data exceed it.

Traditional occasional traffic congestion detection therefore generally comprises four parts: data collection, data preprocessing, threshold design for normal traffic state data in a single time mode/dimension, and occasional congestion detection. The main flow is shown in Fig. 1. First, road-section traffic state data (usually road density data) are collected. The raw data are then preprocessed, which includes filtering out outliers and filling in missing data; for detector data, records with a low detection rate must also be removed. Next, based on historical experience or statistical methods, with time as the independent variable and the traffic state data as the dependent variable, an upper threshold is set for the traffic state data under normal conditions. Finally, the collected traffic state data are compared with the normal threshold to detect when and where occasional traffic congestion occurs. However, the periodic travel habits of commuters not only create the distinction between morning/evening peaks and off-peak periods but also give the traffic state a weekly pattern, with traffic on working days clearly heavier than at weekends. A single threshold determined from traffic state data such as flow and speed may not accurately represent their normal distribution and patterns. In addition, the traffic data of adjacent routes and locations are interrelated, which means that traffic data exhibit strong spatial correlation. A model that can make full use of the day-to-day and week-to-week distribution of traffic data and of their spatial correlation is therefore needed to detect occasional traffic congestion.
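The traditional threshold-based flow described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited methods: the rule "mean plus k standard deviations per time-of-day slot", the multiplier `k`, the function name `threshold_detect` and the synthetic data are all assumptions chosen only to show the idea of a single upper threshold indexed by time.

```python
import numpy as np

def threshold_detect(history, observed, k=3.0):
    """Flag time slots whose observed value exceeds mean + k * std of the history.

    history:  array of shape (days, T), past values for each time-of-day slot
    observed: array of shape (T,), the day being checked
    k:        hypothetical multiplier that sets the upper threshold
    """
    mean = history.mean(axis=0)
    std = history.std(axis=0)
    upper = mean + k * std          # one upper limit per time-of-day slot
    return observed > upper

# Synthetic example: 30 days of 5-minute density data (288 slots per day).
rng = np.random.default_rng(0)
history = 20.0 + rng.normal(0.0, 1.0, size=(30, 288))
observed = np.full(288, 20.0)
observed[100:110] = 40.0            # injected congestion in slots 100..109
flags = threshold_detect(history, observed)
```

Because the threshold depends on the time slot only, such a rule cannot distinguish working days from weekends or use neighbouring detectors, which is the limitation the paragraph above points out.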

Most early threshold-based detection methods take a single traffic datum or a data vector as the object of study. A vector can express at most one aspect of the traffic data, such as the daily pattern, the weekly pattern or the spatial characteristics, and cannot consider these characteristics jointly. In recent years, many researchers have begun to use matrices to express the spatiotemporal correlation of traffic data, that is, to build the traffic data into a matrix model rather than a simple vector. Among these, the BRPCA-based detection method assumes that occasional traffic events are sparsely distributed in time and space and that the distribution of normal traffic data is low-rank, because its authors showed that traffic observations have strongly correlated temporal and spatial patterns: they are periodic, and observations on adjacent upstream and downstream road sections are strongly correlated. Experimental results for the BRPCA method show that, by exploiting the low rank and sparsity of traffic data, anomalies such as delays can be detected and tracked well. In addition, in their 2014 study on detecting occasional traffic congestion in urban road networks, Berk Anbaroglu et al. also noted the need to make full use of the spatiotemporal correlation of traffic state variables; they used cluster analysis to group traffic states that are similar in time and space and then determined the time, location and magnitude of occasional traffic congestion. They showed that making full use of the spatiotemporal correlation of traffic states improves detection accuracy. In summary, the general trend is that combining multiple kinds of traffic state data and fully exploiting their spatiotemporal correlation can greatly improve detection accuracy. However, a second-order matrix can exploit only two modes at a time, whereas traffic data are naturally multi-mode; to make full use of this multi-mode nature, the data model still needs to be improved. For incomplete traffic data, the inventors showed in 2013 that a tensor model can fully exploit the latent multi-mode nature of traffic data and thereby outperform many matrix-based models. The inventors therefore combine the low-rank property of traffic data with the advantages of the tensor model and construct a robust tensor decomposition model that extracts the low-rank and sparse parts of the traffic data, thereby obtaining the spatiotemporal location of occasional traffic congestion.

Once the model has been chosen, making full use of the various kinds of traffic data is also important for congestion detection. Most existing studies are based on a single traffic state variable, but occasional traffic congestion affects the traffic state in many ways, and different patterns of state change correspond to different kinds of congestion. To improve the accuracy of occasional traffic congestion detection, multiple traffic state variables should therefore be combined. Accordingly, building on the robust decomposition model for traffic tensor data, the inventors couple multiple traffic state variables and realize a coupled decomposition of multiple traffic variables.

Summary of the invention

In view of this, the purpose of the invention is to provide a method for detecting occasional traffic congestion based on coupled robust tensor decomposition, so as to solve the problems of traffic congestion detection in the prior art.

To achieve this purpose, the invention adopts the following technical solution:

A method for detecting occasional traffic congestion based on coupled robust tensor decomposition, the method comprising:

collecting data on multiple traffic variables;

constructing a tensor model from the data of the multiple traffic variables;

filling in the missing data in the data of the multiple traffic variables as a preprocessing step;

constructing a coupled robust decomposition model of the data of the multiple traffic variables;

detecting occasional traffic congestion according to the coupled robust decomposition model.

Further, constructing the tensor model from the data of the multiple traffic variables comprises:

according to the characteristics of the traffic state parameters, constructing a tensor model of the following form:

[formula image]

where N denotes the number of detectors, M denotes the number of weeks, W denotes the number of days per week (7 for all days, or 5 for working days only), and T denotes the time of day.

Further, constructing the coupled robust decomposition model of the data of the multiple traffic variables comprises:

performing coupled robust tensor decomposition on the data of multiple traffic state variables; and identifying the sparse part obtained by the coupled robust tensor decomposition as the spatiotemporal location and magnitude information of occasional traffic congestion.

Further, the function for performing the coupled robust tensor decomposition on the data of multiple traffic state variables is:

[formula image]

wherein [formula image] denotes the tensor model constructed from the measured or observed data; [formula image] denotes the low-rank part; [formula image] denotes the sparse part; ε_i denotes the overall white noise of the data; and [formula image] is the sparse distribution shared by the three types of data;
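For orientation, the structure described in words in this document (observed tensor = low-rank part + sparse part + white noise, with the sparse part written as the element-wise product of a 0-1 identification tensor and a normally distributed tensor, and a sparse distribution shared across data types) can be sketched as follows. Only ε_i is named in the text; the symbols Y_i, L_i, B, S_i, and the choice of B as the shared factor, are assumptions made here for illustration and may differ from the original formula.

```latex
% Assumed notation (only \varepsilon_i appears in the text):
%   \mathcal{Y}_i : observed tensor of data type i
%   \mathcal{L}_i : low-rank part of data type i
%   \mathcal{B}   : 0-1 identification tensor, assumed shared by all data types
%   \mathcal{S}_i : normally distributed magnitudes for data type i
%   \odot         : element-wise (Hadamard) product
\mathcal{Y}_i = \mathcal{L}_i + \mathcal{B} \odot \mathcal{S}_i + \varepsilon_i,
\qquad i = 1, 2, 3
```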

Further, the factor matrices obtained by CP decomposition of the low-rank part obey a normal distribution [formula image], 1 < r < R; 1 < n < N, where R denotes the CP rank and N denotes the order of the tensor;

Further, each element [formula image] of the sparse part [formula image] takes the value 0 or 1 and obeys a Bernoulli distribution [formula image]; its conjugate prior [formula image] obeys a Beta distribution [formula image]; each element of the sparse part [formula image] obeys a normal distribution [formula image]; and the white noise obeys a normal distribution [formula image].
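To make these distributional assumptions concrete, the following Python sketch draws synthetic data from a model of this general shape: normal CP factors for the low-rank part, a Bernoulli 0-1 tensor with a Beta-distributed probability, normal magnitudes and normal white noise. All hyperparameter values, the use of a single Beta-distributed probability for the whole tensor, the third-order shape and the function names are assumptions for illustration; this is a simulation of the priors, not the patented inference procedure.

```python
import numpy as np

def cp_reconstruct(weights, factors):
    """Sum of R rank-one tensors for the third-order case:
    sum_r weights[r] * outer(A[:, r], B[:, r], C[:, r])."""
    A, B, C = factors
    return np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)

def sample_coupled(shape=(8, 5, 24), rank=2, n_types=3, seed=0):
    rng = np.random.default_rng(seed)
    pi = rng.beta(1.0, 20.0)                      # Beta prior (hypothetical hyperparameters)
    indicator = rng.binomial(1, pi, size=shape)   # 0-1 tensor shared by all data types
    observed, sparse = [], []
    for _ in range(n_types):
        # normally distributed CP factor matrices, one per tensor mode
        factors = [rng.normal(size=(dim, rank)) for dim in shape]
        low_rank = cp_reconstruct(np.ones(rank), factors)
        magnitude = rng.normal(0.0, 5.0, size=shape)   # normal magnitudes
        noise = rng.normal(0.0, 0.1, size=shape)       # normal white noise
        s = indicator * magnitude                      # element-wise product
        observed.append(low_rank + s + noise)
        sparse.append(s)
    return indicator, observed, sparse

indicator, observed, sparse = sample_coupled()
```

Every data type gets its own low-rank part and magnitudes, while the positions of the non-zero sparse entries coincide across types, which is the coupling the text describes.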

Further, in identifying the sparse part obtained by the coupled robust tensor decomposition as the spatiotemporal location and magnitude information of occasional traffic congestion, the sparse part [formula image] identifies the spatiotemporal locations at which occasional traffic congestion occurs, and the sparse part [formula image] represents the magnitude of the occasional traffic congestion at the spatiotemporal locations corresponding to [formula image].

Beneficial effects:

The invention first collects data on multiple traffic variables; then, according to the characteristics of each type of traffic data, constructs the data into tensor models of the same size and the same order; then fills in the missing data as a preprocessing step; then constructs a Bayesian robust tensor decomposition model with adaptive rank and describes, from a probabilistic perspective, the sparse distribution shared by the different types of traffic data, thereby building a coupled robust decomposition model for multiple types of traffic data; and finally designs a solution method to achieve fast, high-precision detection of occasional traffic congestion.

Description of the drawings

Fig. 1 is a general flow chart of occasional traffic congestion detection methods in the prior art;

Fig. 2 is the flow of occasional congestion detection based on coupled Bayesian robust tensor decomposition proposed by the invention;

Fig. 3 is a reference diagram of the equally sized tensor models of the various traffic variable data in an embodiment of the invention;

Fig. 4 is a reference diagram of the general form of the robust tensor decomposition model in an embodiment of the invention;

Fig. 5 is a reference diagram of the framework of the occasional traffic congestion detection model based on coupled robust tensor decomposition in an embodiment of the invention.

Detailed description

The embodiments of the invention are described in further detail below with reference to the accompanying drawings.

To address the shortcomings of the traditional methods, the invention proposes an occasional congestion detection method based on coupled Bayesian robust tensor decomposition. Its technical route is shown in Fig. 2. First, data on multiple traffic variables (traffic flow, road density, speed, etc.) are collected. Then, according to the characteristics of each type of traffic data, the data are constructed into tensor models of the same size and the same order. Next, the missing data are filled in as a preprocessing step. A Bayesian robust tensor decomposition model with adaptive rank is then constructed, and the sparse distribution shared by the different types of traffic data is described from a probabilistic perspective, thereby building a coupled robust decomposition model for multiple types of traffic data. Finally, a solution method is designed to achieve fast, high-precision detection of occasional traffic congestion.

(1) Traffic data collection

The invention obtains traffic flow data, road density data and speed data for the period from April 1, 2014 to March 2, 2015 from the California freeway database PeMS (http://pems.dot.ca.gov/). The sampling interval is 5 min, and the road density is the time density.

(2) Construction of tensor models for the data of multiple traffic state variables

According to the characteristics of the traffic state parameters, the tensor model can generally be constructed as a fourth-order tensor of size N × M × W × T,

where N denotes the number of detectors, M denotes the number of weeks, W denotes the number of days per week (7 for all days, or 5 for working days only), and T denotes the time of day. The fourth-order tensor models of the three types of traffic data are shown in Fig. 3. When the data of only one detector are considered, the detector dimension is 1 and the model reduces to a third-order tensor. Such a tensor model can fully represent and exploit the spatiotemporal multi-mode correlation of traffic data.
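As an illustration of this construction, the sketch below reshapes per-detector time series into a detector × week × day × time-slot array with NumPy. The function name `build_tensor`, the assumption that each series is complete and in chronological order starting on the first day of a week, and the value of 288 slots per day (24 h at the 5-minute interval mentioned in step (1)) are illustrative assumptions.

```python
import numpy as np

def build_tensor(series, n_detectors, n_weeks, days_per_week=7, slots_per_day=288):
    """Reshape per-detector time series into an N x M x W x T array.

    series: array of shape (n_detectors, n_weeks * days_per_week * slots_per_day),
            in chronological order for each detector.
    """
    expected = n_weeks * days_per_week * slots_per_day
    if series.shape != (n_detectors, expected):
        raise ValueError(f"expected shape {(n_detectors, expected)}, got {series.shape}")
    return series.reshape(n_detectors, n_weeks, days_per_week, slots_per_day)

# Two detectors, 48 weeks, all 7 days, 5-minute slots (synthetic values).
flow = np.arange(2 * 48 * 7 * 288, dtype=float).reshape(2, -1)
X = build_tensor(flow, n_detectors=2, n_weeks=48)
```

Applying the same function to flow, density and speed yields tensors of identical size and order, as required before coupling them.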

(3) Preprocessing of missing data

Because of force majeure factors such as weather and equipment failures, the collected traffic data are usually incomplete, which affects subsequent analysis. To address this, the inventors apply the traffic data tensor filling method they proposed in 2013 to fill in the missing data as a preprocessing step.
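The 2013 tensor filling method is not detailed in this text. Purely as a stand-in to show where the filling step sits in the pipeline, the sketch below replaces missing entries (encoded as NaN, an assumption) with the average of the same detector, weekday and time slot over the other weeks. This simple historical-average rule is not the inventors' method.

```python
import numpy as np

def fill_with_weekly_mean(X):
    """Replace NaNs in an N x M x W x T array by the mean over the week mode (axis 1)."""
    weekly_mean = np.nanmean(X, axis=1, keepdims=True)   # mean over weeks, ignoring NaNs
    return np.where(np.isnan(X), weekly_mean, X)

# One detector, 4 weeks; every entry of week m holds the value m + 1.
X = np.empty((1, 4, 7, 288))
X[:] = np.arange(1.0, 5.0).reshape(1, 4, 1, 1)
X[0, 2, 3, 10] = np.nan                                  # one missing reading
filled = fill_with_weekly_mean(X)
```

A position that is missing in every week would remain NaN under this rule, which is one reason a low-rank tensor completion method is preferable in practice.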

(4) Coupled robust tensor decomposition of the data of multiple traffic state variables

After the various traffic data have been constructed into tensor models, a suitable robust tensor decomposition model and method must be built to obtain their low-rank part (the distribution of normal traffic data) and sparse part (the distribution of occasional traffic congestion), and at the same time to extract the coupled distribution among the multiple types of data.

First, a robust tensor decomposition model that can be extended to the coupled case is required. The general form of the robust tensor decomposition model is shown in Fig. 4, where "observed tensor" denotes the tensor model constructed from the measured or observed data, "lowrank" denotes the low-rank part obtained by the decomposition, "Sparse" denotes the sparse part obtained by the decomposition, and "Noise" denotes the normal white-noise distribution. To make the model more interpretable, the inventors design and adopt a Bayesian robust tensor decomposition model that describes, from a probabilistic perspective, the meaning of the traffic data and of each part obtained by the decomposition. An adaptive rank increase/decrease method is also proposed to obtain the rank of the low-rank part quickly.

Next, the coupling among the multiple types of traffic state data is taken into account: when occasional traffic congestion occurs, the different types of traffic state data share the same sparse distribution of anomalies. On the basis of robust tensor decomposition, the inventors express the sparse part as the element-wise product of a 0-1 identification tensor and a normally distributed tensor. The resulting coupled robust tensor decomposition model is given by formula (1):

[formula image]

wherein [formula image] denotes the tensor model constructed from the measured or observed data; [formula image] denotes the low-rank part (normal distribution), obtained by CP (CANDECOMP/PARAFAC) decomposition; [formula image] denotes the sparse part (occasional traffic congestion); ε_i denotes the overall white noise of the data; and [formula image] is the sparse distribution shared by the three types of data. Prior distributions are then assigned to each part and the posterior distributions are solved. For each type of data, the following are assumed in accordance with natural laws:

the factor matrices obtained by CP decomposition of the low-rank part obey a normal distribution [formula image], 1 < r < R; 1 < n < N, where R denotes the CP rank and N denotes the order of the tensor;

each element [formula image] of [formula image] takes the value 0 or 1 and obeys a Bernoulli distribution [formula image]; its conjugate prior [formula image] obeys a Beta distribution [formula image];

each element of [formula image] obeys a normal distribution [formula image];

the white noise obeys a normal distribution [formula image];

all other hyperparameters obey the Gamma distribution conjugate to the normal distribution: v ~ Γ(c₀, d₀), γ ~ Γ(e₀, f₀). Notably, in order to extract the low-rank characteristics of the data, the invention uses a multi-gamma (multiplicative gamma) process to describe the superdiagonal parameters λ_r of the CP decomposition model, namely [formula image], δ_l ~ Γ(a₀, 1), a₀ > 1, r = 1, …, R. As r increases, the value of τ_r grows rapidly and [formula image] gradually approaches 0, so that λ_r approaches 0, which ensures the low-rank property of the data.
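The shrinkage effect of the gamma process on λ_r can be illustrated numerically. The sketch below assumes the usual multiplicative form τ_r = δ_1 · δ_2 · … · δ_r with independent δ_l ~ Γ(a₀, 1), and assumes that 1/τ_r is the quantity that scales λ_r; the exact formula is not legible in this text, so both points, together with the value a₀ = 3, are assumptions. Under them the expected value of τ_r is a₀^r, which grows geometrically when a₀ > 1.

```python
import numpy as np

def sample_tau(rank, a0=3.0, seed=0):
    """Draw delta_l ~ Gamma(a0, 1) and return (delta, tau),
    where tau[r-1] is the product of delta[0..r-1]."""
    rng = np.random.default_rng(seed)
    delta = rng.gamma(shape=a0, scale=1.0, size=rank)
    tau = np.cumprod(delta)
    return delta, tau

delta, tau = sample_tau(rank=10)
inv_tau = 1.0 / tau   # assumed scale of lambda_r; tends towards 0 as r grows
```

A single draw need not be monotone, because an individual δ_l can fall below 1, but on average the later components are shrunk much more strongly than the earlier ones.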

After the robust tensor decomposition model has been constructed, the posterior distribution of each of its parts must be solved. The overall joint probability is first expressed as in equation (2); the posterior distributions are then obtained using the linear-operation properties of the Gaussian distribution together with Gibbs sampling, with the update process given in equations (4)-(10).

The posterior distribution of each variable is:

(i) [formula image]

(ii) [formula image]

(iii) [formula image]

(iv) [formula image]

(v) [formula image]

(vi) [formula image]

(vii) [formula image]

(viii) [formula image]

Thresholds for increasing and decreasing the rank can be set on λ_r: when every λ_r is greater than a given threshold, the rank is increased automatically; conversely, when one of the λ_r falls below the threshold, the rank is reduced by 1 and the rank-one tensor corresponding to that λ_r is discarded. In this way the rank is determined adaptively while the low-rank property is preserved.
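One step of this rank adaptation rule can be sketched as follows. Several details are assumptions because the text does not specify them: a single threshold is used for both directions, the magnitude |λ_r| is compared with it, the smallest component is the one dropped when several fall below the threshold, and a new component is initialised with weight equal to the threshold and random normal factor columns. The function name `adapt_rank` is hypothetical.

```python
import numpy as np

def adapt_rank(lam, factors, threshold, rng):
    """One rank-adaptation step.

    lam:     array of shape (R,), weights of the R rank-one components
    factors: list of arrays of shape (I_n, R), one factor matrix per tensor mode
    """
    if (np.abs(lam) < threshold).any():
        # At least one weight is below the threshold: drop the smallest component.
        drop = int(np.argmin(np.abs(lam)))
        keep = np.arange(lam.size) != drop
        return lam[keep], [F[:, keep] for F in factors]
    # Every weight is at or above the threshold: add one new component.
    new_lam = np.append(lam, threshold)
    new_factors = [np.hstack([F, rng.normal(size=(F.shape[0], 1))]) for F in factors]
    return new_lam, new_factors

rng = np.random.default_rng(0)
lam = np.array([2.0, 0.01, 1.5])
factors = [np.ones((4, 3)), np.ones((5, 3))]
lam2, factors2 = adapt_rank(lam, factors, threshold=0.1, rng=rng)    # rank 3 -> 2
lam3, factors3 = adapt_rank(lam2, factors2, threshold=0.1, rng=rng)  # rank 2 -> 3
```

In a sampler, such a step would typically be interleaved with the posterior updates so that the rank settles once the weights stabilise.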

(5) Identifying the sparse part obtained by the coupled robust tensor decomposition as the spatiotemporal location and magnitude information of occasional traffic congestion

As shown in Fig. 5, taking the coupling of two types of traffic data as an example, the method of the invention extracts the low-rank part and the sparse part of the traffic data. The sparse part [formula image] identifies the spatiotemporal locations at which occasional traffic congestion occurs, and the sparse part [formula image] represents the magnitude of the occasional traffic congestion at the spatiotemporal locations corresponding to [formula image].
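Reading the detection result off the two sparse factors can be sketched as below: the 0-1 tensor gives the positions of occasional congestion and the second tensor gives the magnitude at each of those positions. The array names `indicator` and `magnitude`, the function name `extract_events` and the (detector, week, day, time slot) index layout are assumptions for illustration.

```python
import numpy as np

def extract_events(indicator, magnitude):
    """Return a list of (index tuple, magnitude) for every cell flagged with 1."""
    positions = np.argwhere(indicator == 1)
    return [(tuple(int(v) for v in p), float(magnitude[tuple(p)])) for p in positions]

# One detector, 2 weeks, 7 days, 288 five-minute slots (synthetic values).
indicator = np.zeros((1, 2, 7, 288), dtype=int)
magnitude = np.zeros((1, 2, 7, 288))
indicator[0, 1, 3, 100] = 1        # congestion in week 1, day 3, slot 100
magnitude[0, 1, 3, 100] = 12.5
events = extract_events(indicator, magnitude)
```

Each index tuple maps directly back to a detector, a calendar day and a time of day, which is how the spatiotemporal location is recovered.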

(6) Experimental results

Based on the model, the inventors conducted experiments on real data. The non-periodic traffic congestion events in the historical records of the traffic management department were taken as the ground truth, and the ratio of the number of occasional congestion events correctly detected by a method to the number of ground-truth events was used as the measure of detection accuracy. The traditional congestion detection method Standard Normal Deviates (SND), the Coupled BRPCA method introduced in 2014, the RSTD method introduced in 2015 and the method proposed by the invention were compared experimentally. The parameters of each method were tuned so that the false detection rates (false positives: the method reports occasional congestion where the ground truth records none) were approximately equal. With the data of one detector for all days of 48 weeks, the results are shown in Table 1: the method of the invention achieves the best results, followed by BRPCA. When two adjacent detectors are considered, the results are shown in Table 2: the detection rate drops overall, but the tensor-based methods (RSTD and coupled BRPTF) still perform well, while BRPCA performs markedly worse than with one detector. The analysis attributes this to the on- and off-ramps between the two detectors (when an abnormal event occurs, some drivers may choose to leave the freeway), which weaken the relationship between the times and locations of congestion at the two adjacent detectors.
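The accuracy measure used in these experiments can be written down directly: detection accuracy is the number of correctly detected events divided by the number of ground-truth events. The sketch below also reports the share of detections that have no matching ground-truth event; how the false detection rate was normalised in the experiments is not stated, so that second quantity, the event identifiers and the function name `detection_metrics` are assumptions.

```python
def detection_metrics(detected, truth):
    """Return (detection_rate, false_share) for two collections of event identifiers.

    detection_rate: correctly detected events / ground-truth events
    false_share:    detections without a ground-truth match / all detections (assumed definition)
    """
    detected, truth = set(detected), set(truth)
    hits = detected & truth
    detection_rate = len(hits) / len(truth) if truth else float("nan")
    false_share = len(detected - truth) / len(detected) if detected else 0.0
    return detection_rate, false_share

# Hypothetical event identifiers: three of four true events found, one false alarm.
rate, false_share = detection_metrics(detected={1, 2, 3, 9}, truth={1, 2, 3, 4})
```

Comparing methods at roughly equal false detection rates, as the experiments do, means tuning each method until the second value matches and then comparing the first.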

A detailed analysis of the 7-day experimental results shows that for BRPCA, RSTD and the method of the invention, and especially for the latter two, most of the undetected congestion falls on weekends. A plausible explanation is that, because of commuting demand, traffic data on working days are more similar in their daily pattern, i.e. the normal distribution has lower rank; detection accuracy might therefore be higher if working days are considered alone, and for commuters abnormal congestion on working days is in any case the main concern. To test this conjecture, experiments were run on the 48-week data set restricted to working days. The results are shown in Tables 3 and 4: Table 3 gives the results for one detector and Table 4 for two adjacent detectors. The detection accuracy of the tensor-based methods is greatly improved relative to the 7-day case, and the advantage of the proposed method is clear.

Table 1. Detection accuracy of the different methods using one detector and all days of 48 weeks

Table 2. Detection accuracy of the different methods using two adjacent detectors and all days of 48 weeks

Table 3. Detection accuracy of the different methods using one detector and the working days of 48 weeks

Table 4. Detection accuracy of the different methods using two adjacent detectors and the working days of 48 weeks

Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate this interchangeability of hardware and software clearly, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be regarded as going beyond the scope of the invention.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The specific embodiments above describe the purpose, technical solutions and beneficial effects of the invention in further detail. It should be understood that they are only specific embodiments of the invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims

1. A method for detecting occasional traffic congestion based on coupled robust tensor decomposition, characterized by comprising the following steps:

collecting data on multiple traffic variables;

constructing a tensor model from the data of the multiple traffic variables;

filling in the missing data in the data of the multiple traffic variables as a preprocessing step;

constructing a coupled robust decomposition model of the data of the multiple traffic variables;

and detecting occasional traffic congestion according to the coupled robust decomposition model.

2. The method for detecting occasional traffic congestion based on coupled robust tensor decomposition according to claim 1, wherein constructing the tensor model from the data of the multiple traffic variables comprises:

according to the characteristics of the traffic state parameters, constructing a tensor model of the following form:

[formula image]

where N denotes the number of detectors, M denotes the number of weeks, W denotes the number of days per week (7 for all days, or 5 for working days only), and T denotes the time of day.

3. The method for detecting occasional traffic congestion based on coupled robust tensor decomposition according to claim 1, wherein constructing the coupled robust decomposition model of the data of the multiple traffic variables comprises:

performing coupled robust tensor decomposition on the data of multiple traffic state variables; and identifying the sparse part obtained by the coupled robust tensor decomposition as the spatiotemporal location and magnitude information of occasional traffic congestion.

4. The method for detecting occasional traffic congestion based on coupled robust tensor decomposition according to claim 3, wherein the function for performing the coupled robust tensor decomposition on the data of multiple traffic state variables is:

[formula image]

wherein [formula image] denotes the tensor model constructed from the measured or observed data; [formula image] denotes the low-rank part; [formula image] denotes the sparse part; ε_i denotes the overall white noise of the data; and [formula image] is the sparse distribution shared by the three types of data.

5. The method according to claim 4, wherein the factor matrices obtained by CP decomposition of the low-rank part obey a normal distribution [formula image], 1 < r < R; 1 < n < N, where R denotes the CP rank and N denotes the order of the tensor.

6. The method according to claim 4, wherein each element [formula image] of the sparse part [formula image] takes the value 0 or 1 and obeys a Bernoulli distribution [formula image]; its conjugate prior [formula image] obeys a Beta distribution [formula image]; each element of the sparse part [formula image] obeys a normal distribution [formula image]; and the white noise obeys a normal distribution [formula image].

7. The method according to claim 3, wherein, in identifying the sparse part obtained by the coupled robust tensor decomposition as the spatiotemporal location and magnitude information of occasional traffic congestion, the sparse part [formula image] identifies the spatiotemporal locations at which occasional traffic congestion occurs, and the sparse part [formula image] represents the magnitude of the occasional traffic congestion at the spatiotemporal locations corresponding to [formula image].