CN110390816A - A state discrimination method based on multi-model fusion
- Publication number
- CN110390816A (application number CN201910650794.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24143—Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0125—Traffic data processing
- G08G1/0133—Traffic data processing for classifying traffic situation
Abstract
The present invention relates to a state discrimination method based on multi-model fusion. The method comprises: data preprocessing, in which the acquired traffic flow data are preprocessed; feature selection, in which irrelevant and redundant features are removed and a relevant feature subset is selected to reduce the data dimension; multi-feature clustering, in which the traffic flow data are partitioned by analyzing their multidimensional features; and real-time classification, in which the traffic flow data are classified to discriminate the real-time traffic state. The method can discriminate the real-time state of paths in the current network topology, providing a theoretical basis and a technical route for path weight determination and subsequent path planning applications. Compared with the traditional single-feature threshold discrimination method, accuracy and validity are improved; in addition, the feature selection method removes irrelevant features and improves discrimination precision.
Description
Technical field
The present invention relates to a method for discriminating traffic flow states, and more particularly to a state discrimination method based on multi-model fusion.
Background art
In recent years, the rapid development of the urban economy has pushed urban traffic demand ever closer to saturation. Traffic congestion has become the foremost problem in urban transportation: vehicles on urban roads queue or come to a standstill, which seriously affects people's willingness to travel and the efficiency of their trips. As an important component of intelligent transportation systems, vehicle route guidance provides travelers with services such as navigation, positioning and geographic information efficiently and in real time, guiding them from origin to destination. The chosen path planning strategy directly determines the quality of the driving route offered to the traveler. To meet dynamic traffic demand, the path planning technology used in a vehicle route guidance system must not only return accurate route search results but also update those results in real time as traffic information changes, so that the planned route does not become invalid. Shortest-path planning technology obtains the real-time running state of the road network from smart devices such as GPS receivers and sensors, analyzes the reachability between the origin node and the destination node, finds the reachable paths between them, applies an optimality rule such as minimum fuel consumption or congestion avoidance, selects among candidate schemes according to that rule, and presents the selected results to the user to choose from.
Vehicle route guidance uses an optimized path-finding algorithm to present travelers with the currently optimal plan according to their real-time needs. Because route guidance must provide the optimal route under the current running state of the road network, it places great demands on the timeliness of the path-finding algorithm. Traditional graph-theory-based path planning algorithms must traverse too many nodes and store too much intermediate data, which makes them difficult to apply to large, complex network topologies.
From the perspective of economic development, the congestion caused by oversaturated urban traffic has become a problem that cannot be ignored in the course of urbanization. Road network utilization is difficult to improve effectively, and the mechanism for allocating traffic resources is disordered. Discriminating the state of traffic flow data on the basis of the real-time road network structure can effectively improve the utilization of the current road network, reduce traffic accidents, make decision-making and management more scientific and intelligent, and improve the capacity for resource allocation.
Summary of the invention
The purpose of the present invention is to overcome the shortcomings of the prior art by providing a state discrimination method based on multi-model fusion that discriminates the real-time state of paths in the current network topology and provides a theoretical basis and a technical route for path weight determination and subsequent path planning applications.
The purpose of the present invention is achieved through the following technical solution: a state discrimination method based on multi-model fusion, the method comprising:
Data preprocessing: preprocessing the acquired traffic flow data;
Feature selection: removing irrelevant and redundant features and selecting a relevant feature subset to reduce the data dimension;
Multi-feature clustering: partitioning the traffic flow data by analyzing their multidimensional features;
Real-time classification: classifying the traffic flow data to discriminate the real-time traffic state.
The data preprocessing steps are as follows:
Determining whether the acquired traffic flow data contain abnormal data, and processing any abnormal data;
Normalizing the data.
Determining whether the acquired traffic flow data contain abnormal data, and processing any abnormal data, comprises the following specific steps:
Judging whether the fluctuation of the data values lies within a reasonable range;
If the data values exceed the reasonable range, the data contain an obvious error, and the erroneous data are processed;
If the data values fluctuate within the reasonable range, the data are normal.
Normalizing the data comprises the following specific steps:
Traversing the feature vectors of all traffic flow data to obtain the maximum value;
Traversing the feature vectors of all traffic flow data to obtain the minimum value;
Normalizing the feature vectors.
Feature selection, in which irrelevant and redundant features are removed and a relevant feature subset is selected to reduce the data dimension, comprises the following specific steps:
Calculating the correlation between each feature vector in the training set and the known classes;
Determining the weight of each feature according to its correlation;
Deleting features whose weight is below a threshold.
Multi-feature clustering, in which the traffic flow data are partitioned by analyzing their multidimensional features, comprises the following specific steps:
Step 1: initially set S = 1, and apply the K-Means clustering algorithm to the first m data items to compute k level-S centroids.
Step 2: repeat step 1 until m level-S centroids have been obtained.
Step 3: apply the K-Means clustering algorithm to the m level-S centroids to compute k level-(S+1) centroids.
Step 4: repeat step 3 until m level-(S+1) centroids have been obtained, then set S = S + 1.
Step 5: repeat the above steps, that is, whenever m level-S centroids have been obtained, cluster them with the K-Means algorithm to obtain k level-(S+1) centroids, until the k final centroids are obtained.
Real-time classification, in which the traffic flow data are classified to discriminate the real-time traffic state, comprises the following steps:
Step 1: randomly sample the sample set with replacement, drawing m random samples in total;
Step 2: from the feature set obtained by feature selection, randomly select n features and build a CART decision tree model;
Step 3: repeat steps 1 and 2 k times to generate k CART decision trees, each with its own independent decision criteria;
Step 4: input the traffic flow data into every tree for a decision, and finally determine the class to which the features belong.
The invention has the following advantages: the state discrimination method based on multi-model fusion can discriminate the real-time state of paths in the current network topology, providing a theoretical basis and a technical route for path weight determination and subsequent path planning applications. Compared with the traditional single-feature threshold discrimination method, accuracy and validity are improved; in addition, the feature selection method removes irrelevant features and improves discrimination precision.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a flow chart of erroneous-data determination;
Fig. 3 is a flow chart of the random forest algorithm;
Fig. 4 shows the feature weights of the traffic flow data in the embodiment;
Fig. 5 compares the accuracy of the different models in the embodiment.
Detailed description of the embodiments
The present invention is further described below with reference to the accompanying drawings, but the scope of protection of the present invention is not limited to the following.
As shown in Fig. 1, a state discrimination method based on multi-model fusion comprises the following:
Data preprocessing: preprocessing the acquired traffic flow data;
Further, raw data cannot be guaranteed to be completely correct after acquisition, transmission and storage, and inevitably contain many defects, such as inconsistent data types, missing data and redundant data. If raw data are used directly without processing, low-quality data flow into the algorithm model and severely damage its learning process. Conversely, appropriate preprocessing markedly improves the decision quality and reliability of the algorithm model.
Feature selection: removing irrelevant and redundant features and selecting a relevant feature subset to reduce the data dimension;
Further, as data scale and complexity grow exponentially, very large algorithmic structures are often needed to solve a problem, and algorithm complexity and response time rise sharply. In fact, however, most features (variables) contribute nothing to the solution and are redundant for the solving process. This cost is unacceptable, particularly for stream data, which arrive endlessly over time. Selecting effective data features (variables) is therefore vital when building a model.
Multi-feature clustering: partitioning the traffic flow data by analyzing their multidimensional features;
Further, different features describe different aspects of the problem being solved. On the surface they appear independent of one another, but in fact they are deeply connected. Multi-feature clustering analyzes the multidimensional features and uses the similarity principle to partition the instances into several sub-instances with significant differences. In short, cluster analysis places the objects to be classified in a multidimensional space and distinguishes them according to the differences between them: objects with the same attributes are assigned to the same class and objects with different attributes to different classes, achieving "high cohesion, low coupling" between classes, that is, very high similarity among objects in the same class and very large differences between objects in different classes.
Real-time classification: classifying the traffic flow data to discriminate the real-time traffic state.
Further, the particular nature of stream data places high demands on the real-time performance of the model algorithm. Establishing a real-time classifier for the continuously generated stream data is therefore very important. The real-time classifier must respond quickly to data flowing into the model and complete the classification of the stream data within a bounded time, so that excessive computational complexity does not cause large data queues and blocking.
The data preprocessing steps are as follows:
Determining whether the acquired traffic flow data contain abnormal data, and processing any abnormal data;
Normalizing the data.
As shown in Fig. 2, determining whether the acquired traffic flow data contain abnormal data, and processing any abnormal data, comprises the following specific steps:
Judging whether the fluctuation of the data values lies within a reasonable range;
Further, a data value that fluctuates between 50% and 150% is considered to fluctuate within the reasonable range.
If the data values exceed the reasonable range, the data contain an obvious error, and the erroneous data are processed;
If the data values fluctuate within the reasonable range, the data are normal.
Further, erroneous data mainly comprise two kinds: data errors and missing data. A data error means that a formatting error during acquisition or storage produced an unexpected result, such as a negative number in the traffic flow data. Missing data means that the equipment was interrupted during acquisition, leaving obvious omissions in some data, for example traffic density data that were not collected.
Further, errors or anomalies must be handled when obvious errors appear in the data. When only a few errors occur and they are negligible compared with the correct data, the erroneous data can be deleted directly: if the erroneous data amount to less than 5% of the correct data, they are deleted. If the erroneous data exceed 5% of the correct data, they must be corrected, and the present invention fills them with adjacent data or with the arithmetic mean over a period of time.
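The error-handling rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `clean_series`, the bounds `low` and `high`, and the choice of the nearest valid neighbors for filling are hypothetical; only the 5% threshold and the idea of filling with adjacent data or a mean come from the text.

```python
from statistics import mean

def clean_series(values, low, high, max_delete_ratio=0.05):
    """Handle erroneous readings in one traffic-flow field.

    A reading counts as erroneous when it is missing (None) or lies outside
    [low, high]. The bounds are hypothetical inputs supplied by the caller.
    """
    def is_bad(v):
        return v is None or not (low <= v <= high)

    bad = [i for i, v in enumerate(values) if is_bad(v)]
    good_count = len(values) - len(bad)
    if not bad:
        return list(values)
    # Few errors relative to the correct data: delete them outright.
    if good_count > 0 and len(bad) / good_count < max_delete_ratio:
        return [v for v in values if not is_bad(v)]
    # Otherwise fill each bad value with the mean of its nearest valid neighbors.
    cleaned = list(values)
    for i in bad:
        left = next((values[j] for j in range(i - 1, -1, -1)
                     if not is_bad(values[j])), None)
        right = next((values[j] for j in range(i + 1, len(values))
                      if not is_bad(values[j])), None)
        neighbors = [v for v in (left, right) if v is not None]
        if neighbors:  # left unchanged if no valid neighbor exists at all
            cleaned[i] = mean(neighbors)
    return cleaned
```

For example, `clean_series([0.2, 0.4, 1.7, 0.6], 0.0, 1.0)` replaces the out-of-range occupancy 1.7 with the mean of its two neighbors, because one error among three correct readings is well above the 5% deletion threshold.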
Normalizing the data comprises the following specific steps:
Traversing the feature vectors of all traffic flow data to obtain the maximum value Max;
Traversing the feature vectors of all traffic flow data to obtain the minimum value Min;
Normalizing the feature vectors.
Further, the normalization formula is as follows:
$x_{0-1} = \frac{x - Min}{Max - Min}$
where $x_{0-1}$ is the normalized feature vector, $x$ is the feature vector, Min is the minimum value of the feature vector and Max is the maximum value of the feature vector.
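The normalization step can be sketched as min-max scaling of one feature column, using the Min and Max values described above. The function name `min_max_normalize` and the handling of a constant column (mapped to 0.0) are assumptions made for this illustration.

```python
def min_max_normalize(column):
    """Scale one feature column to [0, 1] using its minimum and maximum."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

speeds = [10, 20, 30]
print(min_max_normalize(speeds))  # [0.0, 0.5, 1.0]
```

Each feature is normalized independently, so that fields with large raw values such as flow do not dominate fields such as occupancy in later distance calculations.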
Feature selection, in which irrelevant and redundant features are removed and a relevant feature subset is selected to reduce the data dimension, comprises the following specific steps:
Calculating the correlation between each feature vector in the training set and the known classes;
Determining the weight of each feature according to its correlation;
Deleting features whose weight is below a threshold.
Specifically, a sample S is randomly drawn from the training set T. The k nearest-neighbor samples $H_k$ of S are then found among the samples of the same class as S, and k nearest-neighbor samples $M_k$ are found in each set of samples whose class differs from that of S. The weight of each feature is updated according to the following formula:
$W(A) = W(A) - similarity_H(A) + difference_M(A)$
where $similarity_H(A)$ is essentially the sum of the distances on feature A from sample S to its same-class nearest neighbors $H_k$, and $difference_M(A)$ is the sum of the distances on feature A from sample S to its different-class nearest neighbors $M_k$. $M_j(C)$ denotes the j-th nearest sample in class C, and diff(A, S, R) denotes the difference between sample S and sample R on feature A.
According to the update formula, when the sum of the distances on a feature from S to its same-class nearest neighbors $H_k$ is smaller than the sum of the distances on that feature to its different-class nearest neighbors $M_k$, the weight of the feature increases, that is, the feature helps to separate same-class samples from different-class samples. Conversely, when the same-class sum is larger than the different-class sum, the weight decreases, that is, the feature hinders the separation. Because the choice of sample S is random, the procedure can be repeated n times and the average weight of each feature taken as its final weight. If the weight of a feature is greater than 0.5, the feature is highly correlated with the problem being solved; otherwise the correlation is low. In particular, if the weight of a feature is below the threshold, the feature has almost no relation to the problem and can be removed directly from the multidimensional feature vectors, thereby achieving feature selection.
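The weight update W(A) = W(A) - similarity_H(A) + difference_M(A) can be sketched as a Relief-style procedure. Several details below are assumptions made for illustration and are not taken from the text: diff(A, S, R) is taken as the absolute difference scaled by the feature's range, the per-class contributions are averaged with equal weight, and the names `relief_weights`, `n_iter` and `seed` are hypothetical.

```python
import random

def relief_weights(X, y, k=1, n_iter=30, seed=0):
    """Average Relief-style feature weights over n_iter random samples S."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    # Assumption: diff() is the absolute difference scaled by the feature range.
    spans = []
    for a in range(n_feat):
        col = [row[a] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, s, r):
        return abs(s[a] - r[a]) / spans[a]

    def dist(s, r):
        return sum(diff(a, s, r) for a in range(n_feat))

    weights = [0.0] * n_feat
    for _ in range(n_iter):
        i = rng.randrange(len(X))
        s, label = X[i], y[i]
        # k nearest same-class samples (H) and k nearest per other class (M).
        hits = sorted((X[j] for j in range(len(X)) if j != i and y[j] == label),
                      key=lambda r: dist(s, r))[:k]
        miss_groups = []
        for c in set(y) - {label}:
            members = [X[j] for j in range(len(X)) if y[j] == c]
            miss_groups.append(sorted(members, key=lambda r: dist(s, r))[:k])
        for a in range(n_feat):
            similarity = sum(diff(a, s, h) for h in hits) / max(len(hits), 1)
            difference = sum(
                sum(diff(a, s, m) for m in group) / len(group)
                for group in miss_groups
            ) / max(len(miss_groups), 1)
            # W(A) = W(A) - similarity_H(A) + difference_M(A), averaged over runs.
            weights[a] += (difference - similarity) / n_iter
    return weights
```

A feature whose values are close within a class and far apart between classes ends up with a large positive weight, while a feature that varies as much within a class as between classes drifts toward zero or below and becomes a candidate for removal.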
Multi-feature clustering, in which the traffic flow data are partitioned by analyzing their multidimensional features, comprises the following specific steps:
Step 1: initially set S = 1, and apply the K-Means clustering algorithm to the first m data items to compute k level-S centroids.
Step 2: repeat step 1 until m level-S centroids have been obtained.
Step 3: apply the K-Means clustering algorithm to the m level-S centroids to compute k level-(S+1) centroids.
Step 4: repeat step 3 until m level-(S+1) centroids have been obtained, then set S = S + 1.
Step 5: repeat the above steps, that is, whenever m level-S centroids have been obtained, cluster them with the K-Means algorithm to obtain k level-(S+1) centroids, until the k final centroids are obtained.
Further, the present invention performs multi-feature cluster analysis based on the STREAM algorithm. STREAM builds on the K-Means algorithm and introduces a sliding-window mechanism to solve the problems of clustering stream data. Its lower-level algorithm is K-Means, and a batch-processing mechanism is added in the upper layer to deal with the concept drift that occurs in stream data. Since the underlying framework of STREAM is still the K-Means clustering algorithm, a brief analysis of K-Means is given below.
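The batch-wise, level-by-level clustering described in steps 1 to 5 can be sketched as follows. This is a simplified illustration under stated assumptions rather than the STREAM algorithm as published: centroids are not weighted by the number of points they summarize, leftovers at the end of the stream are merged in a single final pass, and the names `kmeans` and `stream_cluster` are hypothetical.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-Means on tuples; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
    return centroids

def stream_cluster(stream, m, k):
    """Cluster a stream in chunks of m items, level by level, into k centroids."""
    if m <= k:
        raise ValueError("m must be larger than k")
    levels = [[]]  # levels[0] buffers raw points, levels[s] buffers level-s centroids
    for point in stream:
        levels[0].append(tuple(point))
        s = 0
        # Whenever a level holds m items, compress them into k centroids one level up.
        while len(levels[s]) >= m:
            if s + 1 == len(levels):
                levels.append([])
            levels[s + 1].extend(kmeans(levels[s], k))
            levels[s] = []
            s += 1
    # End of stream: merge whatever remains into the k final centroids.
    leftovers = [p for level in levels for p in level]
    return kmeans(leftovers, k) if len(leftovers) >= k else leftovers
```

Because each level stores at most m items, memory use stays bounded however long the stream runs, which is the practical reason for clustering centroids of centroids instead of keeping every raw record.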
The K-Means algorithm partitions data points into classes according to the similarity of their distribution in the multidimensional feature space. Specifically, k objects are drawn at random from the data set and taken as the initial centroids of k clusters; each remaining object is assigned to the nearest cluster according to its Euclidean distance to each cluster centroid; the centroid of each cluster is recomputed; and the process is iterated until the distortion function converges, yielding k fixed centroids. The algorithm proceeds as follows:
1. Draw k objects at random from the data set as the initial centroids $\mu_1, \mu_2, \ldots, \mu_k$ of the k clusters;
2. For each object, compute its Euclidean distance to each cluster center and reassign the object to the nearest one, according to
$C^{(i)} = \arg\min_j \| x^{(i)} - \mu_j \|^2$
where $C^{(i)}$ is the class of the i-th data object, $x^{(i)}$ is the i-th data object and $\mu_j$ is the j-th cluster center;
3. Update the centroids $\mu_1, \mu_2, \ldots, \mu_k$ of the k clusters according to
$\mu_j = \frac{\sum_i 1\{C^{(i)} = j\}\, x^{(i)}}{\sum_i 1\{C^{(i)} = j\}}$
4. Repeat steps 2 and 3 until the following distortion function converges, yielding k centroids that no longer change:
$J(c, \mu) = \sum_i \| x^{(i)} - \mu_{C^{(i)}} \|^2$
where $J(c, \mu)$ is the distortion function and $\mu_{C^{(i)}}$ is the center of the cluster to which the i-th object belongs after clustering is complete.
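Steps 1 to 4 can be sketched in plain Python as follows. The names `k_means` and `sq_dist`, the tolerance `tol` and the iteration cap `max_iter` are assumptions introduced for this illustration; the stopping test checks that the distortion function has stopped decreasing.

```python
import random

def sq_dist(x, mu):
    """Squared Euclidean distance between a point and a centroid."""
    return sum((a - b) ** 2 for a, b in zip(x, mu))

def k_means(data, k, max_iter=100, tol=1e-9, seed=0):
    rng = random.Random(seed)
    # Step 1: draw k objects at random as the initial centroids.
    centroids = [list(p) for p in rng.sample(data, k)]
    prev_j = float("inf")
    for _ in range(max_iter):
        # Step 2: assign every object to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(x, centroids[j])) for x in data]
        # Step 3: move each centroid to the mean of its members.
        for j in range(k):
            members = [x for x, c in zip(data, labels) if c == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        # Step 4: stop once the distortion J(c, mu) no longer decreases.
        distortion = sum(sq_dist(x, centroids[c]) for x, c in zip(data, labels))
        if prev_j - distortion <= tol:
            break
        prev_j = distortion
    return centroids, labels, distortion
```

Because each assignment step and each update step can only lower or keep the distortion, the loop terminates, although the result may be a local optimum that depends on the random initial centroids.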
As shown in Fig. 3, real-time classification is based on decision tree theory: a random forest model is built, the partition produced by the multi-feature clustering module is used as the training set, and the real-time traffic flow is classified to discriminate the real-time traffic state.
Real-time classification, in which the traffic flow data are classified to discriminate the real-time traffic state, comprises the following steps:
Step 1: randomly sample the sample set with replacement, drawing m random samples in total;
Step 2: from the feature set obtained by feature selection, randomly select n features and build a CART decision tree model;
Step 3: repeat steps 1 and 2 k times to generate k CART decision trees, each with its own independent decision criteria;
Step 4: input the traffic flow data into every tree for a decision, and finally determine the class to which the features belong.
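The four steps above can be sketched with a small pure-Python random forest. The following choices are assumptions made for illustration and are not stated in the text: splits are chosen by Gini impurity, trees are limited to a fixed depth, the final class is decided by majority vote, and all function names are hypothetical. Class labels are assumed to be strings or integers, not tuples.

```python
import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_cart(rows, labels, features, depth=0, max_depth=5):
    """Grow a CART classification tree on the given feature indices."""
    if len(set(labels)) == 1 or depth == max_depth:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left]) +
                     len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_cart([rows[i] for i in left], [labels[i] for i in left],
                       features, depth + 1, max_depth),
            build_cart([rows[i] for i in right], [labels[i] for i in right],
                       features, depth + 1, max_depth))

def predict_tree(tree, row):
    while isinstance(tree, tuple):  # internal node: (feature, threshold, left, right)
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def random_forest(rows, labels, k, m, n, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(k):                                       # step 3: repeat k times
        idx = [rng.randrange(len(rows)) for _ in range(m)]   # step 1: sample with replacement
        feats = rng.sample(range(len(rows[0])), n)           # step 2: pick n features
        forest.append(build_cart([rows[i] for i in idx],
                                 [labels[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    # Step 4: every tree decides; the majority class wins (an assumed voting rule).
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]
```

Training each tree on a different bootstrap sample and feature subset makes the trees disagree in different places, so that the vote is more stable than any single CART tree.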
For ease of discussion, the present invention takes expressway traffic flow data from Sichuan Province as an example, with simple processing applied to the raw data in advance. Table 1 gives 287 traffic flow records collected on a Sichuan Province expressway with a collection period of 5 min. In the table, Volume is the flow field, Speed is the speed field, Density is the traffic density field, Occupancy is the occupancy field and Queue is the queuing duration field.
Table 1: Sichuan Province expressway traffic flow data
Inspection shows that the traffic flow data in the table above are not entirely correct and still contain obvious errors. Normally the occupancy should lie between 0 and 1. When the road is completely clear, vehicles do not need to stop on it and the occupancy reaches its minimum of 0. When the road is severely congested, vehicles must wait on it and the occupancy reaches its peak of 1. However, several records in the table have an occupancy greater than 1 and are therefore erroneous. In addition, some records show a traffic density of 0 even though vehicles are present on the road, which is clearly unreasonable. The erroneous data therefore need to be corrected.
A traffic parameter data model is established on the basis of traffic flow theory, and the erroneous data are corrected using the following formula.
After the erroneous data have been corrected, the features need to be normalized. The processed data are shown in Table 2, a total of 274 correct records.
Table 2: Preprocessed traffic flow data
After normalization, the values of all traffic flow features lie in [0, 1], which eliminates the model error caused by differences in how the features are expressed.
After preprocessing, the traffic flow features should be analyzed: features useful for determining the traffic state are selected, and features that are unhelpful or redundant are removed, so as to improve the precision and reliability of the subsequent model. Before feature selection, experts in the traffic field were asked to judge manually the traffic state corresponding to a small part of the traffic flow data, and the present invention uses this part of the data as the reference for traffic feature selection.
According to Table 1, the features contained in the traffic flow data are traffic volume (Volume), travel speed (Speed), traffic density (Density), occupancy (Occupancy) and queue length (Queue).
As shown in Fig. 4, feature selection is carried out with the method proposed by the present invention. Because the algorithm selects the sample S at random while running, the resulting weights may vary somewhat, so the present invention averages over repeated experiments: 30 runs are carried out, the results of each run are collected, and the average of each weight is obtained. In the figure, Q denotes flow, V travel speed, P traffic density, O occupancy and L queuing duration.
The average weight of each feature is shown in Table 3.
Table 3: Average feature weights
In the table, Q denotes flow, V travel speed, P traffic density, O occupancy and L queuing duration.
According to the feature selection algorithm, the weights of flow, travel speed and occupancy are all greater than 10%, while the weights of traffic density and queuing duration are both less than 5%. This indicates that, for this expressway, the three traffic flow features flow, travel speed and occupancy can reflect the traffic state, whereas traffic density and queue length are only weakly correlated with it, so these two weakly correlated features should be removed.
According to traffic flow theory, the present invention defines four traffic state levels: free-flowing, slow, congested and severely congested.
The road traffic states output by the real-time classifier are shown in Table 4.
Table 4: Traffic state discrimination
The test set is split off at random at a ratio of 1:4, and a real-time classification model is built to discriminate the road traffic state. The present invention uses accuracy as the index for evaluating the algorithm.
Accuracy = (TP + TN) / (P + N)
where TP is the number of examples correctly classified as positive, that is, examples that are actually positive and are classified as positive by the classifier; TN is the number of examples correctly classified as negative, that is, examples that are actually negative and are classified as negative by the classifier; and P + N is the total number of samples.
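The accuracy index can be computed as in the sketch below for the two-class case. The function name `binary_accuracy` and the `positive` argument are assumptions made for this illustration; with the four traffic state levels, the same quantity reduces to the share of records whose predicted state equals the actual state.

```python
def binary_accuracy(actual, predicted, positive):
    """Accuracy = (TP + TN) / (P + N) for one class treated as positive."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return (tp + tn) / len(pairs)

actual = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 0, 1]
print(binary_accuracy(actual, predicted, positive=1))  # 0.8
```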
As shown in Fig. 5, real-time classification models are built using the multi-model fusion algorithm proposed by the present invention, a traditional clustering algorithm and a single-feature threshold discrimination algorithm respectively. Each model is used to discriminate the real-time traffic state of the Sichuan Province expressway, and the accuracy of each model is computed and used to evaluate it.
The experimental results show that the state discrimination algorithm proposed by the present invention reaches an accuracy of about 94%, improving accuracy and validity over the traditional single-feature threshold discrimination method; in addition, the feature selection method proposed by the present invention removes irrelevant features and improves discrimination precision.
The above is only a preferred embodiment of the present invention. It should be understood that the present invention is not limited to the form disclosed herein, which should not be regarded as excluding other embodiments; it can be used in various other combinations, modifications and environments, and can be modified within the scope contemplated herein through the above teachings or the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the present invention shall all fall within the scope of protection of the appended claims.
Claims (10)
1. A state discrimination method based on multi-model fusion, characterized in that the method comprises:
Data preprocessing: preprocessing the acquired traffic flow data;
Feature selection: removing irrelevant and redundant features and selecting a relevant feature subset to reduce the data dimension;
Multi-feature clustering: partitioning the traffic flow data by analyzing their multidimensional features;
Real-time classification: classifying the traffic flow data to discriminate the real-time traffic state.
2. The state discrimination method based on multi-model fusion according to claim 1, characterized in that the data preprocessing steps are as follows:
Determining whether the acquired traffic flow data contain abnormal data, and processing any abnormal data;
Normalizing the data.
3. The state discrimination method based on multi-model fusion according to claim 2, characterized in that judging whether
the acquired traffic flow data contain abnormal data and processing the abnormal data specifically comprises the following steps:
Judging whether the fluctuation range of each data field value is within a reasonable range;
If a data field value exceeds the reasonable range, the data contain an obvious error, and the erroneous data are processed;
If the data field value fluctuates within the reasonable range, the data are normal.
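The range check described in this claim can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patent's implementation: the field names, the bounds in `REASONABLE_RANGES`, and the choice to replace an out-of-range value with the last valid reading of the same field are all hypothetical, since the claim only states that erroneous data are processed.

```python
# Hypothetical reasonable ranges per field; the patent gives no concrete bounds.
REASONABLE_RANGES = {
    "speed_kmh": (0.0, 200.0),
    "flow_veh_per_5min": (0.0, 1000.0),
    "occupancy_pct": (0.0, 100.0),
}


def is_normal(record, ranges=REASONABLE_RANGES):
    """Return True if every field value lies inside its reasonable range."""
    return all(lo <= record[name] <= hi for name, (lo, hi) in ranges.items())


def clean(records, ranges=REASONABLE_RANGES):
    """Replace out-of-range values with the last valid value of the same field.

    Assumed handling strategy: if no valid value has been seen yet,
    the value is clamped to the nearest bound instead.
    """
    cleaned, last_valid = [], {}
    for record in records:
        fixed = dict(record)
        for name, (lo, hi) in ranges.items():
            value = record[name]
            if lo <= value <= hi:
                last_valid[name] = value
            else:
                fixed[name] = last_valid.get(name, min(max(value, lo), hi))
        cleaned.append(fixed)
    return cleaned
```

Other handling strategies (dropping the record, interpolating between neighbours) would fit the claim equally well; the sketch only shows where the range test sits in the flow.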
4. The state discrimination method based on multi-model fusion according to claim 2, characterized in that normalizing
the data specifically comprises the following steps:
Traversing the feature vectors of all traffic flow data to obtain the maximum value;
Traversing the feature vectors of all traffic flow data to obtain the minimum value;
Normalizing the feature vectors.
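The three steps above read like min-max normalization, x' = (x - min) / (max - min). The sketch below assumes that formula and assumes the maximum and minimum are taken per feature (per column); the claim does not state either detail, and the function name `min_max_normalize` is hypothetical.

```python
def min_max_normalize(vectors):
    """Scale each feature (column) of a list of feature vectors to [0, 1]."""
    if not vectors:
        return []
    columns = list(zip(*vectors))
    maxima = [max(col) for col in columns]   # first traversal: per-feature maximum
    minima = [min(col) for col in columns]   # second traversal: per-feature minimum
    normalized = []
    for vec in vectors:
        row = []
        for x, lo, hi in zip(vec, minima, maxima):
            # A constant feature has hi == lo; map it to 0.0 to avoid dividing by zero.
            row.append((x - lo) / (hi - lo) if hi > lo else 0.0)
        normalized.append(row)
    return normalized
```

For example, the vectors `[60, 100]`, `[120, 300]`, and `[90, 200]` become `[0.0, 0.0]`, `[1.0, 1.0]`, and `[0.5, 0.5]`.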
5. The state discrimination method based on multi-model fusion according to claim 1, characterized in that the feature
selection, in which a relevant feature subset is selected by removing irrelevant and redundant features to reduce the data dimensionality, specifically comprises the following steps:
Calculating the correlation between each feature vector in the training set and the known class;
Determining a weight for each feature according to its correlation;
Deleting the features whose weight is less than a threshold.
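A minimal sketch of this weight-and-threshold filter is shown below. The claim does not name the correlation measure, so the sketch assumes the absolute Pearson correlation between each feature column and a numerically coded class label as the feature weight; the function names `pearson` and `select_features` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(samples, labels, threshold):
    """Weight each feature by |correlation with the class| and drop low-weight ones.

    Returns the kept column indices, all weights, and the reduced samples.
    """
    columns = list(zip(*samples))
    weights = [abs(pearson(col, labels)) for col in columns]
    kept = [i for i, w in enumerate(weights) if w >= threshold]
    reduced = [[row[i] for i in kept] for row in samples]
    return kept, weights, reduced
```

A constant column carries no information about the class, receives weight 0.0, and is removed for any positive threshold.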
6. The state discrimination method based on multi-model fusion according to claim 1, characterized in that the multi-feature
clustering, in which the traffic flow data are partitioned through multi-dimensional feature analysis, specifically comprises the following steps:
Step 1: Initially let S = 1, and apply the K-Means clustering algorithm to the initial m data points to compute k level-S centroids.
Step 2: Repeat Step 1 until m level-S centroids are obtained.
Step 3: Apply the K-Means clustering algorithm to the m level-S centroids to compute k level-(S+1) centroids.
Step 4: Repeat Step 3 until m level-(S+1) centroids are obtained, then set S = S + 1.
Step 5: Repeat the above steps, that is, whenever m level-S centroids are obtained, cluster them with the K-Means algorithm
to obtain k level-(S+1) centroids, until the k final centroids are obtained.
10. The state discrimination method based on multi-model fusion according to claim 1, characterized in that the real-time
classification, in which the traffic flow data are classified to discriminate the real-time traffic state, comprises the following steps:
Step 1: Randomly sampling the sample set with replacement, selecting m random samples in total;
Step 2: Randomly selecting n features from the feature set obtained by feature selection, and building a CART decision
tree model;
Step 3: Repeating Step 1 and Step 2 k times to generate k CART decision trees, each decision tree having its own independent decision
criteria;
Step 4: Inputting the traffic flow data into each decision tree, and finally determining the class to which the features belong.
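The four steps above match the usual random forest construction. The sketch below is a compact standard-library Python illustration, not the patent's implementation: the Gini impurity split criterion, the depth limit, and combining the trees by majority vote in Step 4 are assumptions (the claim only says the class is finally determined), and all function names are hypothetical. Class labels must not be tuples, because tuples represent internal tree nodes here.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_cart(rows, labels, features, max_depth=5):
    """Grow a CART classification tree restricted to the given feature indices."""
    if max_depth == 0 or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]      # leaf: majority class
    best = None
    for f in features:
        # Candidate thresholds: every distinct value except the largest,
        # so both sides of the split are non-empty.
        for t in sorted({row[f] for row in rows})[:-1]:
            left = [i for i, row in enumerate(rows) if row[f] <= t]
            right = [i for i, row in enumerate(rows) if row[f] > t]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:                                     # no feature can split the rows
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_cart([rows[i] for i in left], [labels[i] for i in left],
                       features, max_depth - 1),
            build_cart([rows[i] for i in right], [labels[i] for i in right],
                       features, max_depth - 1))


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's class."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def train_forest(rows, labels, m, n, k, seed=0):
    """Steps 1 to 3: k trees, each on m bootstrap samples and n random features."""
    rng = random.Random(seed)
    forest = []
    for _ in range(k):
        idx = [rng.randrange(len(rows)) for _ in range(m)]   # sampling with replacement
        features = rng.sample(range(len(rows[0])), n)        # n features without repeats
        forest.append(build_cart([rows[i] for i in idx],
                                 [labels[i] for i in idx], features))
    return forest


def predict_forest(forest, row):
    """Step 4 (assumed): majority vote over the individual tree decisions."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]
```

A call such as `train_forest(rows, labels, m=len(rows), n=1, k=15)` followed by `predict_forest(forest, new_row)` mirrors the claim's parameters m, n, and k directly.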
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910650794.9A CN110390816A (en) | 2019-07-18 | 2019-07-18 | A kind of condition discrimination method based on multi-model fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110390816A (en) | 2019-10-29 |
Family
ID=68285143
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910650794.9A Pending CN110390816A (en) | 2019-07-18 | 2019-07-18 | A kind of condition discrimination method based on multi-model fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110390816A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102592453A (en) * | 2012-02-27 | 2012-07-18 | 东南大学 | Real-time traffic condition judging method based on time window |
CN102609612A (en) * | 2011-12-31 | 2012-07-25 | 电子科技大学 | Data fusion method for calibration of multi-parameter instruments |
CN108492557A (en) * | 2018-03-23 | 2018-09-04 | 四川高路交通信息工程有限公司 | Highway jam level judgment method based on multi-model fusion |
US20190069808A1 (en) * | 2016-05-10 | 2019-03-07 | David Andrew Clifton | Method of determining the frequency of a periodic physiological process of a subject, and a device and system for determining the frequency of a periodic physiological process of a subject |
Non-Patent Citations (4)
Title |
---|
冯勇: "基于云模型的城市快速路交通状态识别方法研究", 《中国优秀硕士学位论文全文数据库(电子期刊)》 * |
张钰等: "基于分类与回归算法(CART)的城市道路交通状态阈值划分研究", 《黑龙江交通科技》 * |
张静萱: "基于特征选择的城市快速路实时交通事故风险预测", 《中国优秀硕士学位论文全文数据库(电子期刊)》 * |
李晓璐: "基于多源信息处理技术的交通状态判别研究", 《中国优秀硕士学位论文全文数据库(电子期刊)》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111177346A (en) * | 2019-12-19 | 2020-05-19 | 爱驰汽车有限公司 | Man-machine interaction method and device, electronic equipment and storage medium |
CN111192456A (en) * | 2020-01-14 | 2020-05-22 | 泉州市益典信息科技有限公司 | Road traffic operation situation multi-time scale prediction method |
CN111230872A (en) * | 2020-01-31 | 2020-06-05 | 武汉大学 | Object delivery intention recognition system and method based on multiple sensors |
CN111230872B (en) * | 2020-01-31 | 2021-07-20 | 武汉大学 | Object delivery intention recognition system and method based on multiple sensors |
CN111599170A (en) * | 2020-04-13 | 2020-08-28 | 浙江工业大学 | Traffic running state classification method based on time sequence traffic network diagram |
CN111599170B (en) * | 2020-04-13 | 2021-12-17 | 浙江工业大学 | Traffic running state classification method based on time sequence traffic network diagram |
CN113029227A (en) * | 2021-02-02 | 2021-06-25 | 中船第九设计研究院工程有限公司 | State monitoring system for moving hydraulic trolley |
CN113971216A (en) * | 2021-10-22 | 2022-01-25 | 北京百度网讯科技有限公司 | Data processing method and device, electronic equipment and memory |
CN113971216B (en) * | 2021-10-22 | 2023-02-03 | 北京百度网讯科技有限公司 | Data processing method and device, electronic equipment and memory |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110390816A (en) | A kind of condition discrimination method based on multi-model fusion | |
CN109191896B (en) | Personalized parking space recommendation method and system | |
CN110516702A (en) | A kind of discreet paths planing method based on flow data | |
CN108492557A (en) | Highway jam level judgment method based on multi-model fusion | |
CN106250442A (en) | The feature selection approach of a kind of network security data and system | |
CN108764375B (en) | Highway goods stock transprovincially matching process and device | |
CN101751438A (en) | Theme webpage filter system for driving self-adaption semantics | |
CN108961758A (en) | A kind of crossing broadening lane detection method promoting decision tree based on gradient | |
CN114299742B (en) | Speed limit information dynamic identification and update recommendation method for expressway | |
Beshah et al. | Pattern recognition and knowledge discovery from road traffic accident data in ethiopia: Implications for improving road safety | |
Vollet et al. | Use of meta-analysis for the comparison and transfer of economic base multipliers | |
CN106911591A (en) | The sorting technique and system of network traffics | |
CN115309906B (en) | Intelligent data classification method based on knowledge graph technology | |
CN112560915A (en) | Urban expressway traffic state identification method based on machine learning | |
CN109866776A (en) | Driving preference discrimination method, equipment and medium suitable for three lanes complex environment | |
Özel | A genetic algorithm based optimal feature selection for web page classification | |
CN106126637A (en) | A kind of vehicles classification recognition methods and device | |
CN105930872A (en) | Bus driving state classification method based on class-similar binary tree support vector machine | |
CN115060278A (en) | Intelligent vehicle battery replacement navigation method and system based on multi-target genetic algorithm | |
He et al. | [Retracted] Visualization and Analysis of Mapping Knowledge Domain of Heterogeneous Traffic Flow | |
CN117272995B (en) | Repeated work order recommendation method and device | |
Tišljarić et al. | Fuzzy inference system for congestion index estimation based on speed probability distributions | |
CN109697575A (en) | Data processing method and system based on evaluation result | |
CN113379334B (en) | Road section bicycle riding quality identification method based on noisy track data | |
Himani et al. | A comparative study on machine learning based prediction of citations of articles |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191029 |