CN110222041A - A kind of traffic data cleaning method restored based on tensor - Google Patents

A kind of traffic data cleaning method restored based on tensor

Publication number
CN110222041A
Authority
CN
China
Prior art keywords
tensor
traffic data
rank
low
cleaning method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910433784.XA
Other languages
Chinese (zh)
Inventor
谭华春
伍元凯
李琴
冯建帅
陈晓轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN201910433784.XA
Publication of CN110222041A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Quality & Reliability (AREA)
  • Pure & Applied Mathematics (AREA)
  • Software Systems (AREA)
  • Traffic Control Systems (AREA)

Abstract

The present invention relates to a traffic data cleaning method based on tensor recovery, comprising: verifying the low-rank property of the traffic data; where the traffic data has the low-rank property, creating a first objective function for tensor recovery, the first objective function comprising a low-rank part and a sparse part, which respectively represent the true data and the contaminated data in the traffic data; converting the first objective function into a second objective function, the problem described by the second objective function being an NP problem; converting the second objective function into a third objective function; converting the third objective function into a partitioned Lagrangian function; and solving the partitioned Lagrangian function to separate the low-rank part from the sparse part. By cleaning traffic data through tensor recovery, the invention provides more reliable and usable traffic data, helping travelers and traffic managers make sounder travel and management decisions.

Description

Traffic data cleaning method based on tensor recovery
Technical field
The present invention relates to the field of intelligent transportation, and in particular to a traffic data cleaning method based on tensor recovery.
Background art
The growing number of vehicles causes problems such as severe traffic congestion. To address them, many kinds of traffic information, including congestion information, traffic flow and travel time data, are used in intelligent transportation systems (ITS). However, because of bad weather, detector damage, communication failures and the like, the collected data are usually corrupted to some degree: they are contaminated and contain outliers, which hinder the application and analysis of traffic data.
In past research, many data filtering methods have been used to recover and clean traffic data containing outliers, including singular value decomposition, wavelet analysis, immune algorithms and spectral difference methods. These filtering methods model traffic flow data as a time series and smooth its waveform as far as possible; spectral analysis can exploit the correlation in the day-to-day variation of traffic data. Like the daily variation, however, the traffic data of different road sections are also correlated. None of the above filtering methods uses this spatial correlation. They use only the day-mode characteristic of traffic data and assume that recoverability depends mainly on the smoothness of the data, with the smoothing threshold set purely by experience rather than learned. In practice, the more mode characteristics are used and the more the thresholds are learned by the model itself, the more accurate the data recovery.
Therefore, in view of the above problems, the inventors consider it necessary to propose a method that makes full use of the multi-mode characteristics of traffic data (week mode, day mode, hour mode, etc.) to improve the accuracy of traffic data recovery and cleaning.
Summary of the invention
To solve the above problems, the present invention provides a traffic data cleaning method based on tensor recovery. Cleaning traffic data by tensor recovery provides more accurate and usable traffic data and helps travelers and traffic managers make sounder travel and management decisions. To this end, the method comprises the following steps:
Verifying the low-rank property of the traffic data;
Where the traffic data has the low-rank property, creating a first objective function for tensor recovery, the first objective function comprising a low-rank part and a sparse part, which respectively represent the true data and the contaminated data in the traffic data;
Converting the first objective function into a second objective function, the problem described by the second objective function being an NP problem;
Converting the second objective function into a third objective function;
Converting the third objective function into a partitioned Lagrangian function;
Solving the partitioned Lagrangian function to separate the low-rank part from the sparse part.
Preferably, the traffic data has multi-mode characteristics, the modes including one or more of a week mode, a day mode and an hour mode, and the step of verifying the low-rank property of the traffic data comprises:
Verifying the low-rank property of the traffic data by calculating the correlation coefficient matrix between the modes of the traffic data.
Preferably, the correlation coefficient matrix is calculated by the following formula: [formula not reproduced]
Preferably, the step of creating the first objective function for tensor recovery comprises:
Since [formula not reproduced], starting from the original tensor recovery model [formula not reproduced],
the first objective function is created as: [formula not reproduced]
Here rank_i denotes the rank of the mode-i unfolding of the tensor, λ_i denotes the weight given to each unfolding mode, ||·||_F denotes the Frobenius norm, and the remaining symbols denote, respectively, the low-rank part, the original data tensor and the sparse part.
Preferably, the second objective function is: [formula not reproduced]
Preferably, the third objective function is: [formula not reproduced]
Here the two matrix variables are, respectively, the mode-i unfoldings of the two corresponding tensors.
Preferably, the partitioned Lagrangian function is: [formula not reproduced]
Here Y_i and Z_i are the Lagrange multipliers, and the penalty parameters, α_i among them, are greater than 0.
Preferably, the partitioned Lagrangian function is solved using the alternating direction method of multipliers (ADMM): the partitioned Lagrangian function is solved iteratively for its variables, including M_i and N_i, until convergence, which yields the low-rank part and the sparse part.
Preferably, the software module implementing the traffic data cleaning method based on tensor recovery is stored in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, or a CD-ROM storage medium.
The traffic data cleaning method based on tensor recovery of the present application comprises: verifying the low-rank property of the traffic data; where the traffic data has the low-rank property, creating a first objective function for tensor recovery, the first objective function comprising a low-rank part and a sparse part, which respectively represent the true data and the contaminated data in the traffic data; converting the first objective function into a second objective function, the problem described by the second objective function being an NP problem; converting the second objective function into a third objective function; converting the third objective function into a partitioned Lagrangian function; and solving the partitioned Lagrangian function to separate the low-rank part from the sparse part. By cleaning traffic data through tensor recovery, the invention provides more reliable and usable traffic data, helping travelers and traffic managers make sounder travel and management decisions.
Brief description of the drawings
Fig. 1 is a flowchart of the traffic data cleaning method based on tensor recovery according to an embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawing and specific embodiments:
The present invention provides a traffic data cleaning method based on tensor recovery. Cleaning traffic data by tensor recovery provides more accurate and usable traffic data and helps travelers and traffic managers make sounder travel and management decisions.
At present, the widely used tensor recovery model is given by formula (1): [formula not reproduced]
Here rank_i denotes the rank of the mode-i unfolding of the tensor; λ_i denotes the weight given to each unfolding mode; ||·||_F denotes the Frobenius norm. The remaining symbols denote, respectively, the low-rank part tensor, the original data tensor and the sparse part tensor.
Fig. 1 is a flowchart of the traffic data cleaning method based on tensor recovery according to an embodiment of the present invention. As shown in Fig. 1, the method comprises steps S101 to S106:
In step S101, the low-rank property of the traffic data is verified.
In one embodiment, the traffic data has multi-mode characteristics, the modes including one or more of a week mode, a day mode and an hour mode, and the low-rank property of the traffic data is verified by calculating the correlation coefficient matrix between the modes of the traffic data.
Preferably, the correlation coefficient matrix can be calculated by the following formula: [formula not reproduced]
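As an illustration of this kind of check, the following minimal sketch computes a correlation matrix between the slices of one mode and counts how many singular values carry most of the energy of the corresponding unfolding. It assumes the ordinary Pearson correlation coefficient (via NumPy's `corrcoef`); the patent's own formula is not available here and may differ. The function names, the 90% energy threshold and the synthetic flow tensor are all hypothetical.

```python
import numpy as np

def mode_correlation(tensor, mode):
    """Pearson correlation between the slices of `tensor` along `mode` (assumed measure)."""
    # Bring the chosen mode to the front and flatten every slice into one row.
    rows = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    return np.corrcoef(rows)

def energy_rank(matrix, energy=0.9):
    """Number of singular values needed to capture `energy` of the squared spectrum."""
    s = np.linalg.svd(matrix, compute_uv=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(cumulative, energy) + 1)

# Synthetic flow tensor (hypothetical data): 4 weeks x 7 days x 24 hours,
# all days sharing one daily profile up to a scale factor plus small noise.
rng = np.random.default_rng(0)
profile = 100 + 50 * np.sin(np.linspace(0, 2 * np.pi, 24))
flow = profile[None, None, :] * rng.uniform(0.9, 1.1, size=(4, 7, 1))
flow = flow + rng.normal(scale=1.0, size=flow.shape)

day_corr = mode_correlation(flow, mode=1)            # 7 x 7 day-to-day correlations
day_unfold = np.moveaxis(flow, 1, 0).reshape(7, -1)  # day-mode unfolding
print("smallest day-to-day correlation:", day_corr.min())
print("singular values for 90% energy:", energy_rank(day_unfold))
```

High correlations across a mode together with a small energy rank suggest that the data are close to low-rank along that mode, which is the condition the later steps rely on.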
In step S102, where the traffic data has the low-rank property, a first objective function for tensor recovery is created. The first objective function comprises a low-rank part and a sparse part, which respectively represent the true data and the contaminated data in the traffic data. As described above, the traffic data has multi-mode characteristics, and the modes may include one or more of a week mode, a day mode and an hour mode. The traffic data includes traffic flow and travel time.
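To make the multi-mode structure concrete, the sketch below folds a one-dimensional detector series into a tensor with week, day, hour and within-hour interval modes. The layout, the function name `series_to_tensor` and the 5-minute sampling interval are assumptions for illustration; the patent does not specify how the tensor is arranged.

```python
import numpy as np

def series_to_tensor(series, intervals_per_hour, n_weeks):
    """Fold a 1-D detector series into a (week, day, hour, interval) tensor."""
    expected = n_weeks * 7 * 24 * intervals_per_hour
    series = np.asarray(series, dtype=float)
    if series.size != expected:
        raise ValueError(f"expected {expected} samples, got {series.size}")
    # Row-major reshape: consecutive samples fill the interval mode first,
    # then hours, then days, then weeks.
    return series.reshape(n_weeks, 7, 24, intervals_per_hour)

# Two weeks of 5-minute counts; the values are placeholders, not real data.
samples = np.arange(2 * 7 * 24 * 12)
tensor = series_to_tensor(samples, intervals_per_hour=12, n_weeks=2)
print(tensor.shape)
```

With this layout, `tensor[w, d, h, k]` is the k-th sample of hour `h` on day `d` of week `w`, so each mode of the tensor corresponds to one of the temporal patterns named above.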
Since [formula not reproduced], the tensor recovery model of formula (1) is equivalent to formula (4), which gives the first objective function: [formula not reproduced]
Here rank_i denotes the rank of the mode-i unfolding of the tensor, λ_i denotes the weight given to each unfolding mode, ||·||_F denotes the Frobenius norm, and the remaining symbols denote, respectively, the low-rank part, the original data tensor and the sparse part.
In step S103, the first objective function is converted into a second objective function; the problem described by the second objective function is an NP problem.
To recover the low-rank part, formula (4) is rewritten as formula (5), which gives the second objective function: [formula not reproduced]
In step S104, the second objective function is converted into a third objective function.
Specifically, because the problem described by formula (5) is an NP problem, formula (6) can be used in place of formula (5), which gives the third objective function: [formula not reproduced]
Here the two matrix variables are, respectively, the mode-i unfoldings of the two corresponding tensors.
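The mode-i unfolding used here can be sketched in a few lines of NumPy. The helper names `unfold` and `fold` are hypothetical, and the column ordering follows NumPy's row-major layout, which may differ from the convention used in the patent's formulas; the rank of each unfolding is the same under either ordering.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows index the chosen mode, columns all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor of the given `shape`."""
    rest = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape([shape[mode]] + rest), 0, mode)

x = np.arange(24).reshape(2, 3, 4)
for mode in range(x.ndim):
    m = unfold(x, mode)
    # Each unfolding is a matrix with x.shape[mode] rows, and folding restores x.
    assert m.shape == (x.shape[mode], x.size // x.shape[mode])
    assert np.array_equal(fold(m, mode, x.shape), x)
```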
In step S105, the third objective function is converted into a partitioned Lagrangian function.
Specifically, the partitioned Lagrangian function of formula (6) is given by formula (10): [formula not reproduced]
Here Y_i and Z_i are the Lagrange multipliers, and the penalty parameters, α_i among them, are greater than 0;
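When a Lagrangian of this kind is minimized one block of variables at a time, the sub-problems typically have closed-form solutions. The sketch below shows the two standard building blocks, under the assumption (the formulas are not reproduced in this text) that formula (6) uses the usual convex surrogates: the nuclear norm in place of the rank and the l1 norm for the sparse part. The function names are hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1: shrink every entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def singular_value_threshold(matrix, tau):
    """Proximal operator of tau * nuclear norm: shrink the singular values by tau."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt

print(soft_threshold(np.array([-3.0, -0.5, 0.2, 2.0]), 1.0))
print(singular_value_threshold(np.diag([3.0, 1.0]), 2.0))
```

Soft thresholding zeroes small entries, which favors a sparse outlier term, while singular value thresholding zeroes small singular values, which favors a low-rank term.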
In step S106, the partitioned Lagrangian function is solved to separate the low-rank part from the sparse part.
Specifically, according to formula (10), the variables, including M_i and N_i, are solved for iteratively in turn until convergence, which yields the tensor recovery algorithm based on the alternating direction method of multipliers (ADMM) [algorithm listing not reproduced]. The two resulting tensors are the low-rank part tensor and the sparse part tensor, respectively. Solving for them in this way separates the low-rank part from the sparse part and thereby cleans the traffic data.
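The overall iteration can be sketched as follows. This is not the patent's algorithm, whose formulas and listing are not reproduced in this text, but a simplified, commonly used ADMM scheme for low-rank plus sparse tensor decomposition: one auxiliary low-rank tensor per mode, a shared sparse tensor, and scaled dual variables. The variable names, the default weights `gamma` and `rho` (heuristics borrowed from matrix robust PCA) and the synthetic data are all assumptions.

```python
import numpy as np

def unfold(t, mode):
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def fold(m, mode, shape):
    rest = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(m.reshape([shape[mode]] + rest), 0, mode)

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(m, tau):
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt

def robust_tensor_recovery(x, n_iter=200, lambdas=None, gamma=None, rho=None):
    """Split x into low-rank + sparse parts by minimizing
    sum_i lambdas[i] * ||unfold(L_i, i)||_* + gamma * ||S||_1  s.t.  L_i + S = x."""
    x = np.asarray(x, dtype=float)
    n = x.ndim
    lambdas = np.ones(n) if lambdas is None else np.asarray(lambdas, dtype=float)
    if gamma is None:  # heuristic sparse weight (assumption)
        gamma = lambdas.sum() / np.sqrt(x.size / min(x.shape))
    if rho is None:    # heuristic penalty parameter (assumption)
        rho = x.size / (4.0 * np.abs(x).sum())
    low = [np.zeros_like(x) for _ in range(n)]
    dual = [np.zeros_like(x) for _ in range(n)]   # scaled Lagrange multipliers
    sparse = np.zeros_like(x)
    for _ in range(n_iter):
        # Low-rank update: singular value thresholding on each mode unfolding.
        for i in range(n):
            target = unfold(x - sparse - dual[i], i)
            low[i] = fold(svt(target, lambdas[i] / rho), i, x.shape)
        # Sparse update: soft thresholding of the averaged residual.
        avg = sum(x - low[i] - dual[i] for i in range(n)) / n
        sparse = soft_threshold(avg, gamma / (n * rho))
        # Dual update: accumulate the constraint violation L_i + S - x.
        for i in range(n):
            dual[i] = dual[i] + low[i] + sparse - x
    return sum(low) / n, sparse

# Synthetic example (hypothetical data): rank-1 tensor with 5% large outliers.
rng = np.random.default_rng(0)
a, b, c = (rng.uniform(0.5, 1.5, size=20) for _ in range(3))
clean = np.einsum("i,j,k->ijk", a, b, c)
mask = rng.random(clean.shape) < 0.05
noisy = clean + np.where(mask, rng.choice([-5.0, 5.0], size=clean.shape), 0.0)

low_rank, outliers = robust_tensor_recovery(noisy)
err_noisy = np.linalg.norm(noisy - clean) / np.linalg.norm(clean)
err_clean = np.linalg.norm(low_rank - clean) / np.linalg.norm(clean)
print(f"relative error before cleaning: {err_noisy:.3f}, after: {err_clean:.3f}")
```

The returned `low_rank` tensor plays the role of the cleaned traffic data and `outliers` that of the contamination; on real detector data the weights and the number of iterations would need tuning.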
The traffic data cleaning method based on tensor recovery (ADMM-TR) provided by the embodiment of the present invention preprocesses collected traffic data and removes abnormal data, thereby providing more accurate and usable traffic data such as flow and travel time. Travelers and traffic managers can then make sounder travel and management decisions based on the cleaned data.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their function. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module can be stored in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above is only a preferred embodiment of the present invention and does not limit the present invention in any other form; any modification or equivalent variation made according to the technical essence of the present invention still falls within the scope claimed by the present invention.

Claims (9)

1. A traffic data cleaning method based on tensor recovery, characterized by comprising the following steps:
Verifying the low-rank property of the traffic data;
Where the traffic data has the low-rank property, creating a first objective function for tensor recovery, the first objective function comprising a low-rank part and a sparse part, which respectively represent the true data and the contaminated data in the traffic data;
Converting the first objective function into a second objective function, the problem described by the second objective function being an NP problem;
Converting the second objective function into a third objective function;
Converting the third objective function into a partitioned Lagrangian function;
Solving the partitioned Lagrangian function to separate the low-rank part from the sparse part.
2. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the traffic data has multi-mode characteristics, the modes including one or more of a week mode, a day mode and an hour mode, and the step of verifying the low-rank property of the traffic data comprises:
Verifying the low-rank property of the traffic data by calculating the correlation coefficient matrix between the modes of the traffic data.
3. The traffic data cleaning method based on tensor recovery according to claim 2, characterized in that: the correlation coefficient matrix is calculated by the following formula: [formula not reproduced]
4. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the step of creating the first objective function for tensor recovery comprises:
Since [formula not reproduced], starting from the original tensor recovery model [formula not reproduced],
the first objective function is created as: [formula not reproduced]
Here rank_i denotes the rank of the mode-i unfolding of the tensor, λ_i denotes the weight given to each unfolding mode, ||·||_F denotes the Frobenius norm, and the remaining symbols denote, respectively, the low-rank part, the original data tensor and the sparse part.
5. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the second objective function is: [formula not reproduced]
6. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the third objective function is: [formula not reproduced]
Here the two matrix variables are, respectively, the mode-i unfoldings of the two corresponding tensors.
7. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the partitioned Lagrangian function is: [formula not reproduced]
Here Y_i and Z_i are the Lagrange multipliers, and the penalty parameters, α_i among them, are greater than 0.
8. The traffic data cleaning method based on tensor recovery according to claim 7, characterized in that: the partitioned Lagrangian function is solved using the alternating direction method of multipliers (ADMM), the partitioned Lagrangian function being solved iteratively for its variables, including M_i and N_i, until convergence, which yields the low-rank part and the sparse part.
9. The traffic data cleaning method based on tensor recovery according to claim 1, characterized in that: the software module implementing the traffic data cleaning method based on tensor recovery is stored in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, or a CD-ROM storage medium.
CN201910433784.XA 2019-05-23 2019-05-23 A kind of traffic data cleaning method restored based on tensor Pending CN110222041A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910433784.XA CN110222041A (en) 2019-05-23 2019-05-23 A kind of traffic data cleaning method restored based on tensor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910433784.XA CN110222041A (en) 2019-05-23 2019-05-23 A kind of traffic data cleaning method restored based on tensor

Publications (1)

Publication Number Publication Date
CN110222041A true CN110222041A (en) 2019-09-10

Family

ID=67818296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910433784.XA Pending CN110222041A (en) 2019-05-23 2019-05-23 A kind of traffic data cleaning method restored based on tensor

Country Status (1)

Country Link
CN (1) CN110222041A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111274525A (en) * 2020-01-19 2020-06-12 东南大学 Tensor data recovery method based on multi-linear augmented Lagrange multiplier method
CN111739551A (en) * 2020-06-24 2020-10-02 广东工业大学 Multichannel cardiopulmonary sound denoising system based on low-rank and sparse tensor decomposition
CN111768635A (en) * 2020-04-02 2020-10-13 东南大学 Coupling robustness tensor decomposition-based sporadic traffic congestion detection method
CN113792254A (en) * 2021-08-17 2021-12-14 大连理工大学 Multi-test fMRI data Tucker decomposition method introducing space sparsity constraint

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111274525A (en) * 2020-01-19 2020-06-12 东南大学 Tensor data recovery method based on multi-linear augmented Lagrange multiplier method
CN111768635A (en) * 2020-04-02 2020-10-13 东南大学 Coupling robustness tensor decomposition-based sporadic traffic congestion detection method
CN111739551A (en) * 2020-06-24 2020-10-02 广东工业大学 Multichannel cardiopulmonary sound denoising system based on low-rank and sparse tensor decomposition
CN113792254A (en) * 2021-08-17 2021-12-14 大连理工大学 Multi-test fMRI data Tucker decomposition method introducing space sparsity constraint
CN113792254B (en) * 2021-08-17 2024-05-28 大连理工大学 Multi-test fMRI data Tucker decomposition method introducing space sparse constraint

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190910)