CN110222041A - Traffic data cleaning method based on tensor recovery - Google Patents
Traffic data cleaning method based on tensor recovery
- Publication number
- CN110222041A (application CN201910433784.XA)
- Authority
- CN
- China
- Prior art keywords
- tensor
- traffic data
- rank
- low
- cleaning method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Algebra (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Quality & Reliability (AREA)
- Pure & Applied Mathematics (AREA)
- Software Systems (AREA)
- Traffic Control Systems (AREA)
Abstract
The present invention relates to a traffic data cleaning method based on tensor recovery, comprising: verifying the low-rank characteristic of the traffic data; when the traffic data has the low-rank characteristic, creating a first objective function for tensor recovery, the first objective function comprising a low-rank part and a sparse part, which respectively represent the true data and the contaminated data in the traffic data; converting the first objective function into a second objective function, the problem described by the second objective function being an NP-hard problem; converting the second objective function into a third objective function; converting the third objective function into a segmentation Lagrangian function; and solving the segmentation Lagrangian function to separate the low-rank part from the sparse part. By cleaning traffic data through tensor recovery, the invention provides more reliable and usable traffic data, helping travelers and traffic managers make better travel and management decisions.
Description
Technical field
The present invention relates to the field of intelligent transportation, and more particularly to a traffic data cleaning method based on tensor recovery.
Background
The growing number of vehicles causes problems such as severe traffic congestion. To address these problems, many kinds of traffic information, including congestion information, traffic flow, and travel time data, are used in intelligent transportation systems (ITS). However, because of bad weather, detector damage, communication failures, and similar causes, the collected data are usually corrupted to some degree, so that the data are contaminated and contain outliers, which hinder the application and analysis of traffic data.
In past research, many data filtering methods have been used to recover and clean traffic data containing outliers, including singular value decomposition, wavelet analysis, immune algorithms, and spectral differences. These filtering methods model traffic flow data as a time series and smooth its waveform as much as possible; through spectral analysis they can exploit the day-to-day correlation of traffic data. Like the day-to-day variation, however, traffic data from different road sections are also correlated. None of the above filtering methods uses this spatial correlation: they use only the day-mode characteristic of traffic data and assume that recoverability depends mainly on the smoothness of the data, while the smoothing threshold is set purely from experience and cannot be learned. In practice, the more mode characteristics are exploited, the more accurate the threshold learned by the model, and the higher the accuracy of data recovery.
Therefore, in view of the above problems, the inventors consider it necessary to propose a method that makes full use of the multi-mode characteristics of traffic data (week mode, day mode, hour mode, etc.) to improve the accuracy of traffic data recovery and cleaning.
Summary of the invention
To solve the above problems, the present invention provides a traffic data cleaning method based on tensor recovery. Cleaning traffic data through tensor recovery provides more accurate and usable traffic data, helping travelers and traffic managers make better travel and management decisions. To this end, the method comprises the following steps:
verifying the low-rank characteristic of the traffic data;
when the traffic data has the low-rank characteristic, creating a first objective function for tensor recovery, the first objective function comprising a low-rank part and a sparse part, which respectively represent the true data and the contaminated data in the traffic data;
converting the first objective function into a second objective function, the problem described by the second objective function being an NP-hard problem;
converting the second objective function into a third objective function;
converting the third objective function into a segmentation Lagrangian function;
solving the segmentation Lagrangian function to separate the low-rank part from the sparse part.
Preferably, the traffic data has a multi-mode characteristic, the modes including one or more of a week mode, a day mode, and an hour mode, and the step of verifying the low-rank characteristic of the traffic data comprises: verifying the low-rank characteristic by calculating the correlation coefficient matrix between the modes of the traffic data.
Preferably, the correlation coefficient matrix is calculated by the following formula:
[formula not reproduced in the source text]
Preferably, the step of creating the first objective function for tensor recovery comprises:
because [relation not reproduced in the source text], creating the first objective function from the original tensor recovery model:
[formulas not reproduced in the source text]
where rank_i denotes the rank of the i-th mode of the tensor, λ_i denotes the weight assigned to the mode-i unfolding of the tensor, ‖·‖_F denotes the Frobenius norm, and the three tensor symbols [not reproduced] denote the low-rank part, the original data tensor, and the sparse part, respectively.
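Preferably, the second objective function is:
[formula not reproduced in the source text]
Preferably, the third objective function is:
[formula not reproduced in the source text]
where the two matrix symbols [not reproduced] are the mode-i unfoldings of the low-rank part and the sparse part, respectively.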
Preferably, the segmentation Lagrangian function is:
[formula not reproduced in the source text]
where Y_i and Z_i are Lagrange multipliers, and α_i, β_i > 0 are penalty parameters.
Preferably, the segmentation Lagrangian function is solved with the alternating direction method of multipliers (ADMM): M_i, N_i, and the remaining variables [symbols not reproduced] are solved iteratively until convergence, which yields the low-rank part and the sparse part.
Preferably, the software module implementing the traffic data cleaning method based on tensor recovery resides in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, or a CD-ROM storage medium.
The traffic data cleaning method based on tensor recovery of the present application comprises: verifying the low-rank characteristic of the traffic data; when the traffic data has the low-rank characteristic, creating a first objective function for tensor recovery, the first objective function comprising a low-rank part and a sparse part, which respectively represent the true data and the contaminated data in the traffic data; converting the first objective function into a second objective function, the problem described by the second objective function being an NP-hard problem; converting the second objective function into a third objective function; converting the third objective function into a segmentation Lagrangian function; and solving the segmentation Lagrangian function to separate the low-rank part from the sparse part. By cleaning traffic data through tensor recovery, the invention provides more reliable and usable traffic data, helping travelers and traffic managers make better travel and management decisions.
Brief description of the drawings
Fig. 1 is a flow chart of the traffic data cleaning method based on tensor recovery according to an embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the drawings and specific embodiments:
The present invention provides a traffic data cleaning method based on tensor recovery. Cleaning traffic data through tensor recovery provides more accurate and usable traffic data, helping travelers and traffic managers make better travel and management decisions.
Currently, a widely used tensor recovery model is shown in formula (1):
[formula (1) not reproduced in the source text]
where rank_i denotes the mode-i rank of the tensor; λ_i denotes the weight assigned to the mode-i unfolding of the tensor; and ‖·‖_F denotes the Frobenius norm. The three tensor symbols [not reproduced] denote the low-rank part tensor, the original data tensor, and the sparse part tensor, respectively.
Fig. 1 is a flow chart of the traffic data cleaning method based on tensor recovery according to an embodiment of the present invention. As shown in Fig. 1, the method comprises steps S101 to S106:
In step S101, the low-rank characteristic of the traffic data is verified.
In one embodiment, the traffic data has a multi-mode characteristic, the modes including one or more of a week mode, a day mode, and an hour mode, and the low-rank characteristic of the traffic data can be verified by calculating the correlation coefficient matrix between the modes of the traffic data.
Preferably, the correlation coefficient matrix can be calculated by a formula that is not reproduced in the source text.
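The patent's formula for this matrix does not survive in the text. As a stand-in, the sketch below assumes the Pearson correlation coefficient between slices of the traffic tensor along one mode (for example, day against day). The tensor layout (week x day x hourly interval), the synthetic data, and the function name `mode_correlation` are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def mode_correlation(tensor, mode):
    """Pearson correlation between the slices of `tensor` along `mode`.

    Assumption: each slice (e.g. one day) is flattened to a vector and
    compared with every other slice; the patent's own formula may differ.
    """
    rows = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    return np.corrcoef(rows)

# Hypothetical flow tensor: 4 weeks x 7 days x 24 hourly intervals.
rng = np.random.default_rng(0)
week_scale = rng.uniform(0.9, 1.1, size=4)
day_scale = rng.uniform(0.8, 1.2, size=7)
hour_profile = 100 + 50 * np.sin(np.linspace(0, 2 * np.pi, 24))
flow = np.einsum("w,d,h->wdh", week_scale, day_scale, hour_profile)

day_corr = mode_correlation(flow, mode=1)  # 7 x 7 matrix, day against day
print(day_corr.round(3))
```

Correlations close to 1 along a mode suggest that the corresponding unfolding of the tensor is close to low rank, which is the property the method relies on.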
In step S102, when the traffic data has the low-rank characteristic, a first objective function for tensor recovery is created. The first objective function comprises a low-rank part and a sparse part, which respectively represent the true data and the contaminated data in the traffic data. As described above, the traffic data has a multi-mode characteristic, and the modes may include one or more of a week mode, a day mode, and an hour mode. The traffic data includes traffic flow and travel time.
Because [relation not reproduced in the source text], the tensor recovery model of formula (1) is equivalent to formula (4), which gives the first objective function:
[formula (4) not reproduced in the source text]
where rank_i denotes the rank of the i-th mode of the tensor, λ_i denotes the weight assigned to the mode-i unfolding of the tensor, ‖·‖_F denotes the Frobenius norm, and the three tensor symbols [not reproduced] denote the low-rank part, the original data tensor, and the sparse part, respectively.
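The rank_i terms are ranks of mode-i unfoldings, so it may help to see what an unfolding is in practice. The following is a minimal sketch, assuming the common convention in which mode i becomes the row index of the unfolded matrix; the helper names `unfold`, `fold`, and `mode_ranks` and the synthetic data are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows index `mode`, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor of the given `shape`."""
    moved = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape(moved), 0, mode)

def mode_ranks(tensor):
    """Rank of every unfolding (the rank_i terms of the model)."""
    return [int(np.linalg.matrix_rank(unfold(tensor, i)))
            for i in range(tensor.ndim)]

rng = np.random.default_rng(1)
# Hypothetical "true" data: a sum of two rank-one patterns (week x day x hour).
a, b, c = rng.random((4, 2)), rng.random((7, 2)), rng.random((24, 2))
low_rank = np.einsum("ir,jr,kr->ijk", a, b, c)
# Hypothetical contamination: a few large outliers.
sparse = np.zeros_like(low_rank)
sparse[rng.random(low_rank.shape) < 0.05] = 5.0
observed = low_rank + sparse

print(mode_ranks(low_rank))  # ranks of the clean part
print(mode_ranks(observed))  # ranks after outliers are added
```

Comparing the two printed lists shows why contaminated data no longer looks low rank, and why separating a sparse part is needed before the low-rank structure can be recovered.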
In step S103, the first objective function is converted into a second objective function; the problem described by the second objective function is an NP-hard problem.
To recover the target tensor [symbol not reproduced], formula (4) is rewritten as formula (5), which gives the second objective function.
In step S104, the second objective function is converted into a third objective function.
Specifically, because the problem described by formula (5) is NP-hard, formula (6) can be used in place of formula (5), which gives the third objective function.
The two matrix symbols [not reproduced] are the mode-i unfoldings of the low-rank part and the sparse part, respectively.
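Formula (6) is not reproduced in this text. A common convex surrogate in the low-rank-plus-sparse literature, assumed here rather than confirmed by the patent, replaces each mode rank with the nuclear norm of the unfolding and the sparsity measure with the l1 norm. The sketch below only evaluates such a relaxed objective; the weight `gamma` and the function names are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding of a tensor into a matrix."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def relaxed_objective(low_rank, sparse, lam, gamma):
    """Weighted nuclear norms of the unfoldings plus an l1 penalty.

    Assumed surrogate: sum_i lam[i] * ||L_(i)||_* + gamma * ||S||_1.
    """
    nuclear = sum(l * np.linalg.norm(unfold(low_rank, i), "nuc")
                  for i, l in enumerate(lam))
    return nuclear + gamma * np.abs(sparse).sum()

low_rank = np.ones((2, 3, 4))      # every unfolding has rank one
sparse = np.zeros((2, 3, 4))
sparse[0, 0, 0] = 2.0              # a single outlier
value = relaxed_objective(low_rank, sparse, lam=[1 / 3] * 3, gamma=0.5)
print(value)
```

Unlike the rank, both norms are convex, which is what makes the relaxed problem tractable for the iterative solver used in the later steps.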
In step S105, the third objective function is converted into a segmentation Lagrangian function.
Specifically, the segmentation Lagrangian function of formula (6) is shown in formula (10):
[formula (10) not reproduced in the source text]
where Y_i and Z_i are Lagrange multipliers, and α_i, β_i > 0 are penalty parameters.
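With multipliers and quadratic penalty terms, this function appears to correspond to what is usually called an augmented Lagrangian. Under the assumption that the relaxed objective uses nuclear and l1 norms (the patent's formula is not reproduced here), minimizing over the auxiliary variables reduces to two standard proximal operators, sketched below with hypothetical function names.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1: shrink every entry toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def singular_value_threshold(m, tau):
    """Proximal operator of tau * ||m||_*: shrink the singular values."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt

shrunk_vector = soft_threshold(np.array([3.0, -1.0, 0.5]), 1.0)
shrunk_matrix = singular_value_threshold(np.diag([3.0, 1.0]), 2.0)
print(shrunk_vector)
print(shrunk_matrix)
```

In an ADMM iteration the threshold `tau` would typically be a norm weight divided by the matching penalty parameter, so larger penalties mean gentler shrinkage per step.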
In step S106, the segmentation Lagrangian function is solved to separate the low-rank part from the sparse part.
Specifically, according to formula (10), M_i, N_i, and the remaining variables [symbols not reproduced] are solved iteratively in turn until convergence, which yields the tensor recovery algorithm based on the alternating direction method of multipliers (ADMM). Here, the two tensor symbols [not reproduced] denote the low-rank part tensor and the sparse part tensor, respectively. Solving for them separates the low-rank part from the sparse part, thereby cleaning the traffic data.
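The patent's algorithm listing is not reproduced in this text, so the following is only an illustrative ADMM sketch of a low-rank plus sparse tensor split, not the patented procedure. It assumes a nuclear-norm and l1 relaxation, eliminates the sparse tensor as S = X - L, and uses a single penalty value for both α_i and β_i; the weights, the penalty heuristic, the synthetic data, and all function names are assumptions.

```python
import numpy as np

def unfold(t, mode):
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def fold(m, mode, shape):
    moved = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(m.reshape(moved), 0, mode)

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def singular_value_threshold(m, tau):
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt

def admm_tensor_recovery(X, n_iter=300, tol=1e-7):
    """Split X into a low-rank tensor L and a sparse tensor S = X - L."""
    shape, n = X.shape, X.ndim
    # Assumed weights: 1 per nuclear norm, 1/sqrt(longest unfolding side) per l1 term.
    gamma = [1.0 / np.sqrt(max(unfold(X, i).shape)) for i in range(n)]
    # Assumed penalty (a robust-PCA style heuristic); X must not be all zeros.
    alpha = beta = X.size / (4.0 * np.abs(X).sum())
    L = np.zeros_like(X)
    Y = [np.zeros_like(unfold(X, i)) for i in range(n)]  # multipliers for M_i = L_(i)
    Z = [np.zeros_like(unfold(X, i)) for i in range(n)]  # multipliers for N_i = S_(i)
    for _ in range(n_iter):
        # Auxiliary variables: low-rank copies M_i and sparse copies N_i.
        M = [singular_value_threshold(unfold(L, i) - Y[i] / alpha, 1.0 / alpha)
             for i in range(n)]
        N = [soft_threshold(unfold(X - L, i) - Z[i] / beta, gamma[i] / beta)
             for i in range(n)]
        # L update: closed-form least-squares average over all modes.
        A = sum(fold(M[i] + Y[i] / alpha, i, shape) for i in range(n))
        B = sum(X - fold(N[i] + Z[i] / beta, i, shape) for i in range(n))
        L_new = (alpha * A + beta * B) / (n * (alpha + beta))
        # Multiplier (dual) updates.
        for i in range(n):
            Y[i] += alpha * (M[i] - unfold(L_new, i))
            Z[i] += beta * (N[i] - unfold(X - L_new, i))
        change = np.linalg.norm(L_new - L) / max(np.linalg.norm(L), 1e-12)
        L = L_new
        if change < tol:
            break
    return L, X - L

# Hypothetical test data: smooth low-rank tensor plus 5% large outliers.
rng = np.random.default_rng(0)
a, b, c = (rng.uniform(0.5, 1.5, (20, 2)) for _ in range(3))
L_true = np.einsum("ir,jr,kr->ijk", a, b, c)
mask = rng.random(L_true.shape) < 0.05
S_true = np.zeros_like(L_true)
S_true[mask] = rng.choice([-5.0, 5.0], size=int(mask.sum()))
X = L_true + S_true

L_hat, S_hat = admm_tensor_recovery(X)
err_before = np.linalg.norm(X - L_true)
err_after = np.linalg.norm(L_hat - L_true)
print(err_before, err_after)
```

Here `L_hat` plays the role of the cleaned traffic data and `S_hat` the removed contamination. Using one shared low-rank tensor with per-mode copies M_i and N_i is a design choice that keeps every update in closed form; other splittings are possible.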
The traffic data cleaning method based on tensor recovery (ADMM-TR) provided by the embodiment of the present invention preprocesses collected traffic data and removes abnormal data, thereby providing more accurate and usable traffic data such as flow and travel time. Travelers and traffic managers can then make better travel and management decisions based on the cleaned data.
Those skilled in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above is only a preferred embodiment of the present invention and does not limit the invention in any other form; any modification or equivalent variation made according to the technical essence of the invention still falls within the scope of protection claimed.
Claims (9)
1. A traffic data cleaning method based on tensor recovery, characterized by comprising the following steps:
verifying the low-rank characteristic of the traffic data;
when the traffic data has the low-rank characteristic, creating a first objective function for tensor recovery, the first objective function comprising a low-rank part and a sparse part, which respectively represent the true data and the contaminated data in the traffic data;
converting the first objective function into a second objective function, the problem described by the second objective function being an NP-hard problem;
converting the second objective function into a third objective function;
converting the third objective function into a segmentation Lagrangian function;
solving the segmentation Lagrangian function to separate the low-rank part from the sparse part.
2. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the traffic data has a multi-mode characteristic, the modes including one or more of a week mode, a day mode, and an hour mode, and the step of verifying the low-rank characteristic of the traffic data comprises:
verifying the low-rank characteristic of the traffic data by calculating the correlation coefficient matrix between the modes of the traffic data.
3. The traffic data cleaning method based on tensor recovery according to claim 2, characterized in that: the correlation coefficient matrix is calculated by the following formula:
[formula not reproduced in the source text]
4. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the step of creating the first objective function for tensor recovery comprises:
because [relation not reproduced in the source text], creating the first objective function from the original tensor recovery model:
[formulas not reproduced in the source text]
where rank_i denotes the rank of the i-th mode of the tensor, λ_i denotes the weight assigned to the mode-i unfolding of the tensor, ‖·‖_F denotes the Frobenius norm, and the three tensor symbols [not reproduced] denote the low-rank part, the original data tensor, and the sparse part, respectively.
5. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the second objective function is:
[formula not reproduced in the source text]
6. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the third objective function is:
[formula not reproduced in the source text]
where the two matrix symbols [not reproduced] are the mode-i unfoldings of the low-rank part and the sparse part, respectively.
7. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the segmentation Lagrangian function is:
[formula not reproduced in the source text]
where Y_i and Z_i are Lagrange multipliers, and α_i, β_i > 0 are penalty parameters.
8. The traffic data cleaning method based on tensor recovery according to claim 7, characterized in that: the segmentation Lagrangian function is solved with the alternating direction method of multipliers (ADMM), in which M_i, N_i, and the remaining variables [symbols not reproduced] are solved iteratively until convergence to obtain the low-rank part and the sparse part.
9. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the software module implementing the traffic data cleaning method based on tensor recovery resides in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, or a CD-ROM storage medium.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910433784.XA CN110222041A (en) | 2019-05-23 | 2019-05-23 | A kind of traffic data cleaning method restored based on tensor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910433784.XA CN110222041A (en) | 2019-05-23 | 2019-05-23 | A kind of traffic data cleaning method restored based on tensor |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110222041A true CN110222041A (en) | 2019-09-10 |
Family
ID=67818296
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910433784.XA Pending CN110222041A (en) | 2019-05-23 | 2019-05-23 | A kind of traffic data cleaning method restored based on tensor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110222041A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111274525A (en) * | 2020-01-19 | 2020-06-12 | 东南大学 | Tensor data recovery method based on multi-linear augmented Lagrange multiplier method |
CN111739551A (en) * | 2020-06-24 | 2020-10-02 | 广东工业大学 | Multichannel cardiopulmonary sound denoising system based on low-rank and sparse tensor decomposition |
CN111768635A (en) * | 2020-04-02 | 2020-10-13 | 东南大学 | Coupling robustness tensor decomposition-based sporadic traffic congestion detection method |
CN113792254A (en) * | 2021-08-17 | 2021-12-14 | 大连理工大学 | Multi-test fMRI data Tucker decomposition method introducing space sparsity constraint |
-
2019
- 2019-05-23 CN CN201910433784.XA patent/CN110222041A/en active Pending
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111274525A (en) * | 2020-01-19 | 2020-06-12 | 东南大学 | Tensor data recovery method based on multi-linear augmented Lagrange multiplier method |
CN111768635A (en) * | 2020-04-02 | 2020-10-13 | 东南大学 | Coupling robustness tensor decomposition-based sporadic traffic congestion detection method |
CN111739551A (en) * | 2020-06-24 | 2020-10-02 | 广东工业大学 | Multichannel cardiopulmonary sound denoising system based on low-rank and sparse tensor decomposition |
CN113792254A (en) * | 2021-08-17 | 2021-12-14 | 大连理工大学 | Multi-test fMRI data Tucker decomposition method introducing space sparsity constraint |
CN113792254B (en) * | 2021-08-17 | 2024-05-28 | 大连理工大学 | Multi-test fMRI data Tucker decomposition method introducing space sparse constraint |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110222041A (en) | A kind of traffic data cleaning method restored based on tensor | |
Helmus et al. | A data driven typology of electric vehicle user types and charging sessions | |
CN111367961B (en) | Time sequence data event prediction method and system based on graph convolution neural network and application thereof | |
CN104523268B (en) | Electroencephalogram signal recognition fuzzy system and method with transfer learning ability | |
CN109002904B (en) | Hospital outpatient quantity prediction method based on Prophet-ARMA | |
CN106951825A (en) | A kind of quality of human face image assessment system and implementation method | |
Aquaro et al. | A Bayesian networks approach to operational risk | |
CN108399453A (en) | A kind of Electric Power Customer Credit Rank Appraisal method and apparatus | |
CN101425138A (en) | Human face aging analogue method based on face super-resolution process | |
CN107688819A (en) | The recognition methods of vehicle and device | |
CN105279964A (en) | Road network traffic data completion method based on low-order algorithm | |
CN108509843A (en) | A kind of face identification method of the Huber constraint sparse codings based on weighting | |
CN105574475A (en) | Common vector dictionary based sparse representation classification method | |
He et al. | A hybrid slantlet denoising least squares support vector regression model for exchange rate prediction | |
CN103530312B (en) | Use the method and system of the ID of many-sided footprint | |
CN111047078A (en) | Traffic characteristic prediction method, system and storage medium | |
CN106056627B (en) | A kind of robust method for tracking target based on local distinctive rarefaction representation | |
CN108647714A (en) | Acquisition methods, terminal device and the medium of negative label weight | |
CN110390012B (en) | Track aggregation method and device, storage medium and electronic equipment | |
CN107004136B (en) | Method and system for the face key point for estimating facial image | |
US20240029556A1 (en) | Short-term traffic flow prediction method based on causal gated-low-pass graph convolutional network | |
CN110633394B (en) | Graph compression method based on feature enhancement | |
CN104952065A (en) | Method for building multilayer detailed skeleton model of garment images | |
CN111325255A (en) | Specific crowd delineating method and device, electronic equipment and storage medium | |
Laude et al. | Optimization of inf-convolution regularized nonconvex composite problems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20190910 |