CN103136239B - Transportation data loss recovery method based on tensor reconstruction - Google Patents


Info

Publication number
CN103136239B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201110384954.3A
Other languages
Chinese (zh)
Other versions
CN103136239A (en)
Inventor
谭华春
王武宏
冯广东
冯建帅
成斌
夏红卫
吴艳新
朱湧
阳钟兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT

Landscapes

  • Traffic Control Systems (AREA)

Abstract

The invention discloses a traffic data loss recovery method based on tensor reconstruction. It addresses the low precision of existing recovery methods based on a vector or matrix form and their inability to handle losses spanning several days. The method comprises: (a) arranging the traffic data in multi-dimensional tensor form and representing the lost entries with a marker tensor; (b) unfolding the tensor data along each mode, calculating the correlation of each mode, and obtaining the weight of each mode; and (c) establishing an objective function for recovering the lost values and solving it using the constructed tensor data and the computed mode weights. Because the method is built on a multi-dimensional tensor model, it contains all of the spatio-temporal traffic information, makes full use of the multi-mode correlation, and preserves the original multi-dimensional structure of the traffic data. Its recovery precision is clearly superior to that of the traditional vector- or matrix-based methods, and it handles the extreme case of several lost days well.

Description

Traffic data loss recovery method based on tensor reconstruction
Technical Field
The invention belongs to the field of intelligent traffic, and particularly relates to a traffic data loss recovery method.
Background
Traffic data loss recovery is a significant problem in intelligent traffic systems, and recovering lost traffic data improves how such systems function. For example, traffic information distribution systems and traffic management systems all need complete and accurate traffic data. In practice, however, traffic data is often incomplete because of equipment failure, transmission errors and the like; related research reports loss rates of 16% to 93%. As a result, some intelligent traffic subsystems cannot work normally, and the lost values of the incomplete traffic data need to be estimated.
At present, recovery methods for traffic data loss fall mainly into two types. Vector-based methods arrange the traffic data as a vector and recover the lost values by interpolation or regression. Matrix-based methods arrange the traffic data as a matrix and recover the lost values using matrix reconstruction theory. Both types have limitations. The former can only be used when the loss rate is extremely low, relies solely on information from points near the lost point in a single mode, and has low recovery accuracy. The latter improves recovery accuracy to some extent by exploiting the correlation of the data in two modes, but cannot be used when the loss rate is high. In addition, neither type can fully exploit the multidimensional correlation of traffic data, which severely restricts any improvement in recovery precision, and neither can handle traffic data that is lost for one or even several days.
Missing data recovery based on tensor reconstruction studies the multidimensional correlation characteristics of traffic data, establishes a correlation criterion, and determines the weight of each mode, so that the multi-mode correlation and the spatio-temporal traffic information are fully used to obtain the best estimate of each lost point and complete the traffic data. Compared with the prior art, this approach keeps the original structure of the traffic data, obtains more accurate results, and still performs well when one or more days of data are lost. Some scholars estimate the lost values of traffic matrix data using probabilistic models based on Principal Component Analysis (PCA); of these, the matrix-based traffic data loss recovery method using Bayesian Principal Component Analysis (BPCA) recently adopted by Qu et al. has attracted the most attention. For an introduction to this method, see the paper "A BPCA Based Missing Value Imputing Method for Traffic Flow Volume Data" (Qu et al., 2008 IEEE Intelligent Vehicles Symposium, June 2008) and the paper "PPCA-Based Missing Data Imputation for Traffic Flow Volume: A Systematical Approach" (Qu et al., IEEE Transactions on Intelligent Transportation Systems, September 2009). The core of the method is to capture the principal components and global structure of the matrix traffic data through PCA. However, it is effective only when the loss rate is below 50%, so it cannot meet the need to recover traffic data in the extreme case where one or more days are lost.
Against this background, it is important to develop a recovery method that improves recovery accuracy and adapts to various loss situations.
Disclosure of Invention
In view of the limitations of existing traffic data loss recovery methods, the invention aims to provide a traffic data loss recovery method based on tensor reconstruction that improves the recovery precision of lost values and handles special cases such as random loss of up to 90 percent and loss of one or more days.
The technical scheme adopted by the invention to solve this technical problem is as follows:
A traffic data loss recovery method based on tensor reconstruction comprises the following steps:
A. According to the multiple distribution patterns of the traffic data, construct the traffic data in multi-dimensional tensor form and represent its lost points with markers;
B. Calculate the correlation of the data in each mode and normalize the correlation coefficients of the modes to obtain the weight of each mode;
C. Establish the traffic data loss recovery objective function in tensor form, transform the objective function using tensor reconstruction theory, and combine it with the lost point markers and the mode weights to construct a traffic data recovery model based on tensor reconstruction theory. The model achieves a good recovery effect for both random loss and special loss.
The expression for solving the complete traffic data is as follows:
$$\arg\min:\ \left\| \Omega * (A - \hat{A}) \right\|_F^2 \tag{1}$$
Constraint: $$\max\left(\hat{A}_{i_1 i_2 \ldots i_n}\right) \le C$$
where $A$ represents the original traffic data and $\hat{A}$ represents the recovered traffic data; $\Omega$ is the marker tensor that marks the lost points, with element value 0 where traffic data is lost and 1 elsewhere; and $C$ represents the maximum traffic capacity, which ensures that the recovered values are consistent with actual traffic conditions.
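As an illustration of formula (1), the following minimal Python sketch evaluates the masked squared Frobenius error and checks the capacity constraint. The function names and the tiny example arrays are hypothetical and chosen only for this sketch; they do not come from the patent.

```python
import numpy as np

def masked_sq_error(A, A_hat, omega):
    """Squared Frobenius norm of omega * (A - A_hat), as in formula (1)."""
    return float(np.sum((omega * (A - A_hat)) ** 2))

def satisfies_capacity(A_hat, C):
    """True when the largest recovered value does not exceed the capacity C."""
    return bool(np.max(A_hat) <= C)

# Made-up 2 x 2 example: entry (0, 1) is lost, so omega is 0 there.
A = np.array([[10.0, 20.0], [30.0, 40.0]])
omega = np.array([[1.0, 0.0], [1.0, 1.0]])
A_hat = np.array([[10.0, 25.0], [28.0, 40.0]])

# Only observed entries contribute: (30 - 28)^2 = 4; the lost entry is ignored.
err = masked_sq_error(A, A_hat, omega)
within_capacity = satisfies_capacity(A_hat, C=50.0)
```

Because the marker tensor zeroes out the lost positions, the error measures agreement on observed entries only, which is why the estimate at a lost position is free to differ from the placeholder stored in A.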
The target function is converted by adopting a Lagrange method to obtain:
$$f_\Omega:\ \min_{\hat{A}} \left\| \Omega * (A - \hat{A}) \right\|_F^2 + \frac{\lambda}{2} \left\| \hat{A} - C \right\|_F^2 \tag{2}$$
In this method, to avoid singular value decomposition, a Tucker decomposition is applied to the data to be recovered, $\hat{A}$; Tucker decomposition is one form of tensor decomposition.
$$f_\Omega:\ \min_{S,X,Y,Z} \left\| \Omega * (A - S \times_1 X \times_2 Y \times_3 Z) \right\|_F^2 + \frac{\lambda}{2} \left\| S \times_1 X \times_2 Y \times_3 Z - C \right\|_F^2 \tag{3}$$
where $S \times_1 X \times_2 Y \times_3 Z$ represents the Tucker decomposition and $\lambda$ is the Lagrange coefficient. Tucker decomposition handles the tensor rank problem well: it works with the mode ranks of the tensor, that is, the ranks of the matrices obtained by unfolding the tensor along each mode, and in practical applications a Tucker model expresses the low-rank property of multi-dimensional data more easily.
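The Tucker form in formula (3) can be sketched in Python as a chain of mode products. This is only an illustration under assumptions: the core size (2, 3, 4), the random factors, and the helper names `mode_product` and `tucker_reconstruct` are hypothetical, and the sketch shows reconstruction only, not how the factors are fitted.

```python
import numpy as np

def mode_product(T, M, mode):
    """Mode-`mode` product of tensor T with matrix M of shape (J, T.shape[mode])."""
    moved = np.moveaxis(T, mode, -1)   # bring the chosen mode to the last axis
    out = moved @ M.T                  # contract that axis with the columns of M
    return np.moveaxis(out, -1, mode)  # put the new axis back in place

def tucker_reconstruct(S, X, Y, Z):
    """Compute S x_1 X x_2 Y x_3 Z for a 3-way core S."""
    return mode_product(mode_product(mode_product(S, X, 0), Y, 1), Z, 2)

rng = np.random.default_rng(0)
S = rng.standard_normal((2, 3, 4))   # small core tensor (assumed ranks)
X = rng.standard_normal((16, 2))     # factor for the first mode
Y = rng.standard_normal((12, 3))     # factor for the second mode
Z = rng.standard_normal((24, 4))     # factor for the third mode

A_hat = tucker_reconstruct(S, X, Y, Z)

# Cross-check against a direct einsum contraction of the same expression.
A_ref = np.einsum("abc,ia,jb,kc->ijk", S, X, Y, Z)
```

The full 16 × 12 × 24 tensor is described here by a 2 × 3 × 4 core and three thin factor matrices, which is how a Tucker model expresses low mode rank.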
In addition, the Tucker decomposition splits the multidimensional traffic data into the X, Y and Z modes. Each mode is given a different weight according to a weight distribution algorithm, which optimizes the use of multi-mode information, and the problem is then solved by a first-order (or second-order) optimization method.
Because the ordinary Lagrangian method converges only sublinearly, the invention uses the augmented Lagrange multiplier method for optimization. This method has been applied successfully in matrix recovery algorithms, its convergence has been proved to be Q-linear, and it is markedly faster than the ordinary Lagrangian method.
First, the road traffic capacity constraint in formula (1), $\max(\hat{A}_{i_1 i_2 \ldots i_n}) \le C$, can be suitably relaxed and transformed into the following form:
$$L - C \le 0 \tag{4}$$
where each element of $C$ is the maximum traffic capacity of the road. Formula (1) can then be converted to:
$$\min_{L,S}:\ \|L\|_* + \eta \|S\|_1 \tag{5}$$
Constraints: $A = L + S$, $L - C \le 0$
Its augmented Lagrangian function can be expressed as:
$$L_A(L, S, M, N, \alpha, \beta) = \|L\|_* + \eta \|S\|_1 + \langle M, A - L - S \rangle + \frac{\alpha}{2} \|A - L - S\|_F^2 + \langle N, L - C \rangle + \frac{\beta}{2} \|L - C\|_F^2 \tag{6}$$
where $L$ represents the finally recovered traffic data (the low-rank part), $S$ represents the noise-contaminated data (the sparse part), $A$ is the corrupted traffic data, $\alpha$ and $\beta$ are positive numbers, and $M$ and $N$ are Lagrange multipliers. Compared with the ordinary Lagrangian method, introducing the Lagrange multipliers into the formula greatly improves the convergence rate.
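To make the terms of formula (6) concrete, the sketch below evaluates the augmented Lagrangian for given arguments. It assumes that L, S, A and C are matrices (for example one mode unfolding of the tensor), because NumPy's nuclear norm is defined for 2-D arrays; the function name and the example values are hypothetical, and no update rule for L, S, M or N is implied.

```python
import numpy as np

def augmented_lagrangian(L, S, M, N, A, C, eta, alpha, beta):
    """Evaluate formula (6) term by term for matrix arguments."""
    nuclear = np.linalg.norm(L, ord="nuc")   # ||L||_*, sum of singular values
    l1 = np.abs(S).sum()                     # ||S||_1, entrywise
    r1 = A - L - S                           # residual of A = L + S
    r2 = L - C                               # residual of the capacity term
    return float(
        nuclear
        + eta * l1
        + np.sum(M * r1) + 0.5 * alpha * np.sum(r1 ** 2)
        + np.sum(N * r2) + 0.5 * beta * np.sum(r2 ** 2)
    )

# Made-up example in which both residuals vanish, so only the first two terms remain.
L = np.diag([3.0, 4.0])                  # singular values 3 and 4, nuclear norm 7
S = np.array([[1.0, 0.0], [0.0, 0.0]])   # one sparse entry, l1 norm 1
A = L + S
C = L.copy()
M = np.ones((2, 2))
N = np.ones((2, 2))

value = augmented_lagrangian(L, S, M, N, A, C, eta=0.5, alpha=1.0, beta=1.0)
```

With both residuals equal to zero the multiplier and penalty terms drop out, leaving 7 + 0.5 · 1 = 7.5.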
The step A is as follows:
Traffic data is usually collected as a time series and viewed as a one-dimensional vector or a two-dimensional matrix. However, traffic data has multi-dimensional correlations; for example, high similarity (approximate periodicity) exists from day to day, week to week, and hour to hour. To exploit these correlations as fully as possible when estimating traffic data, the data needs to be constructed as a multi-dimensional tensor.
The traffic data multidimensional tensor data form is determined according to the following expression:
$$A \in \mathbb{R}^{\mathrm{Week} \times \mathrm{Day} \times \mathrm{Hour}} \tag{7}$$
where $A$ represents the traffic data; Week denotes the "week" mode; Day denotes the "day" mode; Hour denotes the "hour" mode.
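Traffic data in which loss has occurred is represented by the following expression:
$$A' = \Omega * A \tag{8}$$
where
$$\Omega_{i_1 i_2 i_3} = \begin{cases} 0, & \text{if the entry } (i_1, i_2, i_3) \text{ is lost} \\ 1, & \text{otherwise} \end{cases}$$
Here $\Omega$ is the marker tensor and $A$ represents the traffic data.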
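A minimal Python sketch of step A follows, under stated assumptions: the input is an hourly series covering whole weeks, the "day" mode is taken to be the 7 days of a week, and lost readings arrive as NaN. None of these conventions, nor the function name, is specified in the patent.

```python
import numpy as np

def build_tensor_and_marker(hourly, n_weeks):
    """Reshape an hourly series into Week x Day x Hour and mark lost entries."""
    A = np.asarray(hourly, dtype=float).reshape(n_weeks, 7, 24)
    omega = (~np.isnan(A)).astype(float)     # 0 where lost, 1 elsewhere
    A_obs = np.where(omega == 1.0, A, 0.0)   # elementwise product with omega, lost entries set to 0
    return A_obs, omega

# Made-up series: two weeks of hourly values with one lost reading.
hourly = np.arange(2 * 7 * 24, dtype=float)
hourly[30] = np.nan                          # week 0, day 1, hour 6

A_obs, omega = build_tensor_and_marker(hourly, n_weeks=2)
```

`np.where` is used instead of a plain multiplication because 0 · NaN is NaN in floating point, so multiplying by the marker alone would not clear the lost entries.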
The step B comprises the following steps:
B1. Unfold the multi-dimensional traffic data along each mode and calculate the correlation of each mode using a similarity coefficient;
B2. Normalize the correlations of the modes to assign a weight to each mode.
The weight of each mode is calculated according to the following expression:
$$w_k = \frac{s_k}{\sum_{j=1}^{n} s_j} \tag{9}$$
where $w_k$ is the weight of the k-th mode ($0 \le w_k \le 1$) and $s_k$ is the similarity coefficient of the k-th mode ($0 \le s_k \le 1$).
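The normalization in formula (9) amounts to dividing each similarity coefficient by the sum over all modes. A short Python sketch, with made-up coefficient values and a hypothetical function name:

```python
import numpy as np

def mode_weights(s):
    """Normalize similarity coefficients so that the weights sum to 1."""
    s = np.asarray(s, dtype=float)
    return s / s.sum()

# Illustrative similarity coefficients for three modes (not from the patent).
w = mode_weights([0.9, 0.6, 0.5])
```

Here the coefficients sum to 2.0, so the weights are 0.45, 0.30 and 0.25; the mode with the strongest internal correlation receives the largest weight.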
The step B1 includes:
The tensor traffic data is unfolded along each mode, and the similarity coefficient of each mode is calculated according to the following expression:
$$s_k = \frac{\sum_{n_k \ge i > j \ge 1} R_k(i, j)}{n_k (n_k - 1)/2} \tag{10}$$
where $R_k(i, j)$ denotes the correlation coefficient matrix of the k-th mode; $n_k$ is the number of data points of the k-th mode; and $s_k$ denotes the similarity of the k-th mode, $0 \le s_k \le 1$.
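One way to read formula (10) is as the average of the pairwise correlation coefficients between the slices of a mode. The sketch below follows that reading, which is an assumption: it takes n_k to be the number of rows of the mode-k unfolding and uses Pearson correlation between rows. It does not enforce 0 ≤ s_k ≤ 1, since the patent does not say how negative correlations are treated.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding: one row per index of the chosen mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def similarity_coefficient(T, mode):
    """Mean of R_k(i, j) over all pairs i > j, as in formula (10)."""
    U = unfold(T, mode)
    R = np.corrcoef(U)                       # n_k x n_k correlation matrix of the rows
    n_k = U.shape[0]
    lower = R[np.tril_indices(n_k, k=-1)]    # entries with i > j
    return float(lower.sum() / (n_k * (n_k - 1) / 2))

# Made-up tensor whose mode-0 slices are scaled copies of one pattern,
# so every pair of slices is perfectly correlated.
rng = np.random.default_rng(0)
base = rng.random((4, 5)) + 1.0
scales = np.array([1.0, 2.0, 3.0])
T = scales[:, None, None] * base

s0 = similarity_coefficient(T, 0)
```

Since the slices differ only by a positive scale factor, every pairwise correlation equals 1 and the similarity coefficient of mode 0 is 1 up to rounding.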
The invention has the beneficial effects that:
The recovery method provided by the invention has the following advantages:
The method runs fast and performs well. It can process traffic data with a loss rate of 99 percent, can handle the extreme case in which one or more days are lost, has good generality, and greatly improves recovery precision compared with existing recovery methods. Actual traffic data of size 16 × 12 × 24 was selected, and comparison experiments were run in Matlab 7.0 on a Pentium(R) D PC. Fig. 4 and Fig. 5 compare the method of the invention with the conventional vector-based recovery method and with recent traffic data loss recovery methods, such as those referred to as ITS 09' and ITS 10'. In terms of recovery accuracy, the conventional vector-based method fails when the loss rate is large and cannot handle the extreme loss case. The recently proposed Zhang method suffers a sharp drop in recovery precision when the loss rate exceeds 50% and recovers poorly in extreme cases. The method of the invention is built on tensor reconstruction theory: its recovery precision is slightly better than that of the Zhang method when the loss rate is below 50% and remains good when the loss rate exceeds 50%. Recovery precision therefore improves to varying degrees at both low and high loss rates, and the method performs better in extreme cases.
In addition, the method of the invention uses a correlation-based weight distribution method, which greatly improves how efficiently the correlation of each mode is used and thereby further improves recovery precision.
Drawings
Fig. 1 is a flowchart of a traffic data loss recovery method based on tensor reconstruction in an embodiment of the present invention;
FIG. 2 is a flowchart of obtaining weights of modes in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a multi-dimensional traffic data tensor form of step A of the invention;
FIG. 4a and FIG. 4b show the random loss of traffic data and the effect of the recovery methods in embodiment 1, where FIG. 4a is the original traffic data and FIG. 4b compares the recovery effect of the Drift method with that of the method of the present invention; RMSE is the root mean square error, and a smaller value indicates a better effect;
FIG. 5 compares the recovery effect of the method of the present invention with that of the Evrim Acar recovery method when traffic data is lost for one or more days in embodiment 2; RMSE is the root mean square error, and a smaller value indicates a better effect.
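The RMSE used in the figures can be sketched as follows. The patent does not state over which entries the error is averaged; this sketch assumes it is evaluated at the lost positions only (where the marker tensor is 0), and the function name and example values are hypothetical.

```python
import numpy as np

def rmse_on_lost(A_true, A_hat, omega):
    """Root mean square error over the entries marked as lost (omega == 0)."""
    lost = omega == 0
    return float(np.sqrt(np.mean((A_true[lost] - A_hat[lost]) ** 2)))

# Made-up example with two lost entries whose errors are 3 and 4.
A_true = np.array([[1.0, 2.0], [3.0, 4.0]])
A_hat = np.array([[1.0, 5.0], [3.0, 0.0]])
omega = np.array([[1.0, 0.0], [1.0, 0.0]])

score = rmse_on_lost(A_true, A_hat, omega)   # sqrt((9 + 16) / 2)
```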
Detailed Description
The invention is further described with reference to the following figures and specific embodiments.
Example 1
This embodiment addresses the random loss of traffic data. As shown in Fig. 1, recovery proceeds in the following three steps:
1. Constructing the traffic data in tensor form and marking the lost points
Since the traffic data of Fig. 4(a) covers 16 days, it can be constructed as tensor traffic data by the following expression:
$$A \in \mathbb{R}^{16 \times 12 \times 24} \tag{11}$$
where $A$ covers 16 days, 24 hours per day, and 12 five-minute intervals per hour. Since $A$ has size 16 × 12 × 24, the marker tensor $\Omega$ also has size 16 × 12 × 24. The traffic data with losses is determined by the following expression:
$$A' = \Omega * A \tag{12}$$
where $A'$ represents the traffic data with losses.
2. Determining weights of modes
For the detailed flow of this step, refer to Fig. 2. It comprises the following steps:
First, the tensor traffic data is unfolded along each mode. Since the traffic data in this embodiment is 3-dimensional, it can be unfolded along 3 modes, giving matrices of size 16 × 288, 12 × 384 and 192 × 24 respectively.
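The unfolding sizes can be checked with a short Python sketch. It uses the common convention of one row per index of the unfolded mode, which is an assumption; under that convention the third unfolding comes out as 24 × 192, the transpose of the 192 × 24 layout given in the text.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding: rows index the chosen mode, columns the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

A = np.zeros((16, 12, 24))                   # placeholder with the embodiment's size
shapes = [unfold(A, k).shape for k in range(3)]
```

Each unfolding keeps all 16 · 12 · 24 = 4608 entries, so the column counts are 4608/16 = 288, 4608/12 = 384 and 4608/24 = 192.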
Then, the similarity coefficient of each of the three mode unfoldings is computed. This step can be carried out with statistical software such as SPSS, and the computation follows the expression:
$$s_k = \frac{\sum_{n_k \ge i > j \ge 1} R_k(i, j)}{n_k (n_k - 1)/2} \tag{13}$$
where $R_k(i, j)$ denotes the correlation coefficient matrix of the k-th mode; $n_k$ is the number of data points of the k-th mode; and $s_k$ denotes the similarity of the k-th mode, $0 \le s_k \le 1$.
Finally, the three mode similarity coefficients are normalized to obtain the weight of each mode.
3. Estimating missing values
Based on the computed mode weights, the proportion that each mode contributes after Tucker decomposition can be assigned in the lost-value objective function. Solving and optimizing the transformed objective function yields the recovered traffic data shown in Fig. 4(b).
The expression for solving the lost values is as follows:
$$f_\Omega:\ \min_{\hat{A}} \left\| \Omega * (A - \hat{A}) \right\|_F^2 + \frac{\lambda}{2} \left\| \hat{A} - C \right\|_F^2 \tag{14}$$
where $A$ represents the original traffic data and $\hat{A}$ represents the recovered traffic data; $\Omega$ is the marker tensor that marks the lost points, with element value 0 where traffic data is lost and 1 elsewhere; and $C$ represents the maximum traffic capacity, which ensures that the recovered values are consistent with actual traffic conditions.
For the case where one or more days of traffic data are lost, the processing of embodiment 2 applies.
Example 2
For traffic data of size 10 × 12 × 24, the tensor form is $A \in \mathbb{R}^{10 \times 12 \times 24}$, and the corresponding marker tensor is $\Omega$. The values of the marker tensor when k days are lost are determined according to the following expression:
where k represents the number of lost days. The traffic tensor data with losses can then be obtained.
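A marker tensor for whole-day loss could be built along the following lines. This is a sketch under assumptions: the first tensor mode is taken to index days, and the caller supplies which days are lost. The patent's own expression for the k-day marker is not reproduced here, and the function name and chosen day indices are hypothetical.

```python
import numpy as np

def day_loss_marker(shape, lost_days):
    """Marker tensor that is 0 on every entry of the lost days and 1 elsewhere."""
    omega = np.ones(shape)
    omega[list(lost_days), :, :] = 0.0       # zero out entire day slices
    return omega

# Illustrative case: k = 2 lost days (indices 3 and 4) out of 10.
omega = day_loss_marker((10, 12, 24), lost_days=[3, 4])
loss_rate = 1.0 - omega.mean()
```

Losing 2 of 10 days removes 2 · 12 · 24 = 576 of the 2880 entries, a loss rate of 20 percent concentrated in complete slices rather than scattered at random.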
The remaining steps are the same as in embodiment 1: the weights of the modes are calculated and substituted into the objective function, and the lost values are finally estimated, giving the recovery effect shown in Fig. 5. The recovery results in Fig. 4 and Fig. 5 were both obtained in a Matlab environment; implementing the method in C++ would greatly reduce the running time, enabling automatic, real-time traffic data loss recovery.
It should be noted that the above disclosure is only specific examples of the present invention, and those skilled in the art can devise various modifications according to the spirit and scope of the present invention.

Claims (1)

1. A traffic data loss recovery method based on tensor reconstruction comprises the following steps:
A. according to the multiple distribution rule of the traffic data, the traffic data is constructed into a multi-dimensional tensor data form, and the lost points of the traffic data are represented by marks, wherein the expression of the multi-dimensional tensor data is as follows:
$$A \in \mathbb{R}^{\mathrm{Week} \times \mathrm{Day} \times \mathrm{Hour}}$$
wherein A is the multidimensional tensor data; Week stands for the "week" mode; Day stands for the "day" mode; Hour stands for the "hour" mode;
the traffic data for which loss occurs is expressed as follows:
$$A' = \Omega * A$$
wherein $\Omega$ is the marker tensor and $A$ represents the traffic data;
B. normalizing the correlation coefficient of each mode by calculating the correlation of each mode data to obtain each mode weight, wherein each mode weight expression is as follows:
$$w_k = \frac{s_k}{\sum_{j=1}^{n} s_j}$$
wherein $w_k$ represents the weight on the k-th mode, $0 \le w_k \le 1$, and $s_k$ denotes the similarity coefficient in the k-th mode, $0 \le s_k \le 1$;
C. Establishing a traffic data loss recovery target function in a tensor form, converting the target function by adopting an augmented Lagrange multiplier method, and constructing a traffic data recovery model based on a tensor reconstruction theory by combining lost point markers and weights of all modes; wherein the converted objective function is as follows:
$$f_\Omega:\ \min_{\hat{A}} \left\| \Omega * (A - \hat{A}) \right\|_F^2 + \frac{\lambda}{2} \left\| \hat{A} - C \right\|_F^2$$
wherein $A$ represents the original traffic data and $\hat{A}$ represents the recovered traffic data; $\Omega$ is the marker tensor that marks the lost points, with element value 0 where traffic data is lost and 1 elsewhere; $C$ represents the maximum traffic capacity;
to avoid singular value decomposition, Tucker decomposition is performed on the data to be recovered, $\hat{A}$, to obtain:
$$f_\Omega:\ \min_{S,X,Y,Z} \left\| \Omega * (A - S \times_1 X \times_2 Y \times_3 Z) \right\|_F^2 + \frac{\lambda}{2} \left\| S \times_1 X \times_2 Y \times_3 Z - C \right\|_F^2$$
wherein $S \times_1 X \times_2 Y \times_3 Z$ represents the Tucker decomposition of the tensor; $\lambda$ is the Lagrange coefficient.
CN201110384954.3A 2011-11-29 2011-11-29 Transportation data loss recovery method based on tensor reconstruction Expired - Fee Related CN103136239B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110384954.3A CN103136239B (en) 2011-11-29 2011-11-29 Transportation data loss recovery method based on tensor reconstruction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110384954.3A CN103136239B (en) 2011-11-29 2011-11-29 Transportation data loss recovery method based on tensor reconstruction

Publications (2)

Publication Number Publication Date
CN103136239A CN103136239A (en) 2013-06-05
CN103136239B true CN103136239B (en) 2015-03-25

Family

ID=48496074

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110384954.3A Expired - Fee Related CN103136239B (en) 2011-11-29 2011-11-29 Transportation data loss recovery method based on tensor reconstruction

Country Status (1)

Country Link
CN (1) CN103136239B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105679022B (en) * 2016-02-04 2019-06-04 北京工业大学 A kind of complementing method of the multi-source traffic data based on low-rank
CN107220211A (en) * 2016-12-14 2017-09-29 北京理工大学 It is a kind of to merge the data re-establishing method that tensor filling and tensor recover
CN107992536B (en) * 2017-11-23 2020-10-30 中山大学 Urban traffic missing data filling method based on tensor decomposition
CN108804392B (en) * 2018-05-30 2021-11-05 福州大学 Traffic data tensor filling method based on space-time constraint
CN109377760A (en) * 2018-11-29 2019-02-22 北京航空航天大学 The detection of loss traffic data and restorative procedure based on iteration tensor algorithm
CN109584557B (en) * 2018-12-14 2021-02-26 北京工业大学 Traffic flow prediction method based on dynamic decomposition mode and matrix filling
CN110780604B (en) * 2019-09-30 2021-01-19 西安交通大学 Space-time signal recovery method based on space-time smoothness and time correlation
CN110766066B (en) * 2019-10-18 2023-06-23 天津理工大学 Tensor heterogeneous integrated vehicle networking missing data estimation method based on FNN
CN111311904B (en) * 2020-01-17 2021-06-11 东南大学 Traffic state estimation method based on floating car data weighted tensor reconstruction
CN111640298A (en) * 2020-05-11 2020-09-08 同济大学 Traffic data filling method, system, storage medium and terminal
CN112101730A (en) * 2020-08-18 2020-12-18 华南理工大学 Fully-distributed power grid-regional heat supply network joint scheduling method considering communication transmission errors

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101620734A (en) * 2009-03-10 2010-01-06 北京中星微电子有限公司 Motion detecting method, motion detecting device, background model establishing method and background model establishing device
WO2010138536A1 (en) * 2009-05-27 2010-12-02 Yin Zhang Method and apparatus for spatio-temporal compressive sensing

Also Published As

Publication number Publication date
CN103136239A (en) 2013-06-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150325

Termination date: 20151129