CN110390816A - A state discrimination method based on multi-model fusion - Google Patents

A state discrimination method based on multi-model fusion

Info

Publication number
CN110390816A
Authority
CN
China
Prior art keywords
data
feature
traffic flow
centroid
traffic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910650794.9A
Other languages
Chinese (zh)
Inventor
张凤荔
王瑞锦
翟嘉伊
刘崛雄
周世杰
张雪岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201910650794.9A
Publication of CN110390816A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G06F18/232: Non-hierarchical techniques
    • G06F18/2321: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213: Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133: Distances to prototypes
    • G06F18/24143: Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G: PHYSICS
    • G08: SIGNALLING
    • G08G: TRAFFIC CONTROL SYSTEMS
    • G08G1/00: Traffic control systems for road vehicles
    • G08G1/01: Detecting movement of traffic to be counted or controlled
    • G08G1/0104: Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125: Traffic data processing
    • G08G1/0133: Traffic data processing for classifying traffic situation

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Probability & Statistics with Applications (AREA)
  • Traffic Control Systems (AREA)

Abstract

The present invention relates to a state discrimination method based on multi-model fusion, comprising: data preprocessing, in which the acquired traffic flow data is preprocessed; feature selection, in which a relevant feature subset is selected by removing irrelevant and redundant features, reducing the data dimension; multi-feature clustering, in which the traffic flow data is partitioned by analyzing its multidimensional features; and real-time classification, in which the traffic flow data is classified to discriminate the real-time traffic state. The method can discriminate the real-time state of paths in the current network topology and provides a theoretical basis and technical route for determining path weights and for subsequent path planning applications. It improves accuracy and validity over the traditional single-feature threshold discrimination method, and the feature selection step removes irrelevant features, improving discrimination precision.

Description

A state discrimination method based on multi-model fusion
Technical field
The present invention relates to a traffic flow state discrimination method, and more particularly to a state discrimination method based on multi-model fusion.
Background art
In recent years, rapid urban economic development has pushed urban traffic demand toward saturation. Traffic congestion has become the foremost problem in urban transportation: urban roads are frequently in a queuing or even congested state, which seriously reduces people's willingness to travel and the efficiency of travel. As an important component of intelligent transportation systems, vehicle route guidance can provide travelers with services such as navigation, positioning, and geographic information efficiently and in real time, guiding them from origin to destination. The chosen path planning method directly determines the quality of the route offered to the traveler. Because traffic demand is dynamic, the path planning technology used in a vehicle route guidance system must not only return accurate path search results but also update them in real time as traffic information changes, so that the planned route does not become invalid. Optimal path planning technology obtains the real-time operating state of the road network from smart devices such as GPS receivers and sensors, analyzes the reachability between the origin node and the destination node, finds the reachable paths between them, applies optimization rules such as minimum fuel consumption or congestion avoidance to choose among candidate schemes, and presents the results to the user for selection.
Vehicle route guidance uses optimized path-finding to present the currently optimal plan to the traveler according to the user's real-time needs. Because route guidance must provide the optimal route under the current operating state of the road network, it places great demands on the timeliness of the path-finding algorithm. Traditional graph-theory-based path planning algorithms must traverse too many nodes and store too much intermediate data, which makes them difficult to apply to large, complex network topologies.
From the perspective of economic development, the traffic congestion caused by oversaturated urban traffic has become an unavoidable problem in urbanization. Road network utilization is hard to improve effectively, and the allocation of traffic resources is disordered. Judging the state of traffic flow data based on the real-time road network structure can effectively improve utilization of the current road network, reduce traffic accidents, make decision-making and management more scientific and intelligent, and improve resource allocation.
Summary of the invention
The purpose of the present invention is to overcome the shortcomings of the prior art by providing a state discrimination method based on multi-model fusion, which discriminates the real-time state of paths in the current network topology and provides a theoretical basis and technical route for determining path weights and for subsequent path planning applications.
The purpose of the present invention is achieved through the following technical solution: a state discrimination method based on multi-model fusion, the method comprising the following:
Data preprocessing: preprocess the acquired traffic flow data;
Feature selection: select a relevant feature subset by removing irrelevant and redundant features, thereby reducing the data dimension;
Multi-feature clustering: partition the traffic flow data by analyzing its multidimensional features;
Real-time classification: classify the traffic flow data to discriminate the real-time traffic state.
The data preprocessing comprises the following steps:
Judge whether the acquired traffic flow data contains abnormal data, and process the abnormal data;
Normalize the data.
Judging whether the acquired traffic flow data contains abnormal data, and processing the abnormal data, comprises the following specific steps:
Judge whether the fluctuation of the data field values is within a reasonable range;
If the data field values exceed the reasonable range, the data contains obvious errors, and the erroneous data is processed;
If the data field values fluctuate within the reasonable range, the data is normal.
Normalizing the data comprises the following specific steps:
Traverse the feature vectors of all traffic flow data to obtain the maximum value;
Traverse the feature vectors of all traffic flow data to obtain the minimum value;
Normalize the feature vectors.
The feature selection, which selects a relevant feature subset by removing irrelevant and redundant features to reduce the data dimension, comprises the following specific steps:
Compute the correlation between each feature vector and the known classes in the training set;
Determine a weight for each feature according to its correlation;
Delete the features whose weight is below the threshold.
The multi-feature clustering, which partitions the traffic flow data by analyzing its multidimensional features, comprises the following specific steps:
Step 1: Initially set S = 1, and apply the K-Means clustering algorithm to the first m data items to compute k level-S centroids.
Step 2: Repeat step 1 until m level-S centroids are obtained.
Step 3: Apply the K-Means clustering algorithm to the m level-S centroids to compute k level-(S+1) centroids.
Step 4: Repeat step 3 until m level-(S+1) centroids are obtained, then set S = S + 1.
Step 5: Repeat the above steps; that is, whenever m level-S centroids are obtained, cluster them with the K-Means algorithm to obtain k level-(S+1) centroids, until the k final centroids are obtained.
The real-time classification, which classifies the traffic flow data to discriminate the real-time traffic state, comprises the following steps:
Step 1: Randomly draw samples from the sample set with replacement, choosing m random samples in total;
Step 2: From the feature set obtained by feature selection, randomly select n features and build a CART decision tree model;
Step 3: Repeat steps 1 and 2 k times to generate k CART decision trees, each with its own independent decision criterion;
Step 4: Input the traffic flow data into every tree for decision, and finally determine the class to which the features belong.
The invention has the following advantages: the state discrimination method based on multi-model fusion can discriminate the real-time state of paths in the current network topology, providing a theoretical basis and technical route for determining path weights and for subsequent path planning applications. It improves accuracy and validity over the traditional single-feature threshold discrimination method, and its feature selection method removes irrelevant features, improving discrimination precision.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the present invention;
Fig. 2 is a flowchart of the erroneous data decision;
Fig. 3 is a flowchart of the random forest algorithm;
Fig. 4 is the feature weight chart of the traffic flow data in the embodiment;
Fig. 5 is the accuracy comparison chart of different models in the embodiment.
Detailed description of the embodiments
The present invention is further described below with reference to the accompanying drawings, but the protection scope of the present invention is not limited to the following.
As shown in Fig. 1, a state discrimination method based on multi-model fusion comprises the following:
Data preprocessing: preprocess the acquired traffic flow data;
Further, raw data cannot be guaranteed to be fully correct during acquisition, transmission, and storage. It inevitably contains many imperfections, such as inconsistent data types, missing data, and redundant data. If raw data is used directly without processing, low-quality data flows into the algorithm model and severely damages its learning process. Conversely, appropriate preprocessing significantly improves the decision quality and reliability of the algorithm model.
Feature selection: select a relevant feature subset by removing irrelevant and redundant features, thereby reducing the data dimension;
Further, as data scale and complexity grow exponentially, solving a problem often requires building a very large algorithm structure, and algorithm complexity and response time rise sharply. In fact, most features (variables) do not help solve the problem at all and are redundant for the solution process. This is unacceptable, especially for stream data, which arrives endlessly over time. Therefore, when building a model, selecting effective data features (variables) for the solution is vital.
Multi-feature clustering: partition the traffic flow data by analyzing its multidimensional features;
Further, different features describe different aspects of the problem being solved. They appear independent on the surface but are in fact deeply connected. Multi-feature clustering analyzes the multidimensional features and uses the similarity principle to divide the instances into multiple sub-instances with significant differences. In brief, cluster analysis places the objects to be classified in a multidimensional space and distinguishes them by the differences between them: objects with the same attributes are assigned to the same class and objects with different attributes to different classes, achieving "high cohesion, low coupling" between classes. That is, objects in the same class are highly similar, and objects in different classes differ greatly.
Real-time classification: classify the traffic flow data to discriminate the real-time traffic state.
Further, because of the particular nature of stream data, the model algorithm must meet strict real-time requirements, so it is important to build a real-time classifier for the continuously generated stream data. The real-time classifier must respond quickly to data flowing into the model and complete the classification of the stream data in bounded time, avoiding the large-scale data queuing and blocking caused by excessive computational complexity.
The data preprocessing comprises the following steps:
Judge whether the acquired traffic flow data contains abnormal data, and process the abnormal data;
Normalize the data.
As shown in Fig. 2, judging whether the acquired traffic flow data contains abnormal data, and processing the abnormal data, comprises the following specific steps:
Judge whether the fluctuation of the data field values is within a reasonable range;
Further, a fluctuation of the data field values between 50% and 150% indicates that the values fluctuate within the reasonable range.
If the data field values exceed the reasonable range, the data contains obvious errors, and the erroneous data is processed;
If the data field values fluctuate within the reasonable range, the data is normal.
Further, erroneous data mainly comprises two kinds: data errors and missing data. A data error means that a data formatting error during acquisition or storage produces an unexpected result, such as a negative number appearing in the traffic flow data. Missing data means that the device was interrupted during data acquisition, causing an obvious omission in some data, such as vehicle density data not being collected.
Further, error or exception handling is needed when obvious errors appear in the data. When only a few errors occur and they are negligible relative to the correct data, the erroneous data can be deleted directly: if the erroneous data amounts to less than 5% of the correct data, it is deleted directly; if it exceeds 5%, it must be corrected, and the present invention fills it in with adjacent data or with the arithmetic mean over a period of time.
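A minimal sketch of the error handling described above, assuming each field is checked against a hypothetical reference value (for example a historical mean; the text does not state what the 50% to 150% band is measured against). The names `is_valid`, `clean_series`, and `reference` are illustrative and do not come from the patent.

```python
from statistics import mean

def is_valid(value, reference, low=0.5, high=1.5):
    """Return True if value is present, non-negative, and within 50%-150% of reference.

    `reference` is a hypothetical baseline (an assumption, not from the source)."""
    if value is None or value < 0:
        return False
    return low * reference <= value <= high * reference

def clean_series(values, reference, max_drop_ratio=0.05):
    """Delete erroneous values if they are rare, otherwise fill them in."""
    flags = [is_valid(v, reference) for v in values]
    bad, good = flags.count(False), flags.count(True)
    if bad == 0:
        return list(values)
    # Fewer than 5% errors relative to correct data: delete them directly.
    if good and bad / good < max_drop_ratio:
        return [v for v, ok in zip(values, flags) if ok]
    # Otherwise replace each erroneous value with the mean of its valid neighbours.
    cleaned = []
    for i, (v, ok) in enumerate(zip(values, flags)):
        if ok:
            cleaned.append(v)
            continue
        neighbours = [values[j] for j in (i - 1, i + 1)
                      if 0 <= j < len(values) and flags[j]]
        # Fall back to the reference when no valid neighbour exists (assumption).
        cleaned.append(mean(neighbours) if neighbours else reference)
    return cleaned
```

For example, `clean_series([100, 110, -5, 90], 100)` replaces the negative reading with the mean of its two neighbours, while a single missing value among thirty valid ones is simply dropped.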
Normalizing the data comprises the following specific steps:
Traverse the feature vectors of all traffic flow data to obtain the maximum value Max;
Traverse the feature vectors of all traffic flow data to obtain the minimum value Min;
Normalize the feature vectors.
Further, the normalization formula is as follows:
$$x_{0\text{-}1} = \frac{x - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}}$$
where $x_{0\text{-}1}$ is the normalized feature vector, x is the feature vector, Min is the minimum value of the feature vector, and Max is the maximum value of the feature vector.
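The normalization step can be sketched as follows for a single feature column. The function name `min_max_normalize` and the handling of a constant column (mapped to 0.0) are assumptions made for illustration.

```python
def min_max_normalize(column):
    """Scale one feature column into [0, 1] using its own Min and Max."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to 0.0 (assumption).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

# Example: three speed readings
print(min_max_normalize([10, 20, 30]))  # [0.0, 0.5, 1.0]
```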
The feature selection, which selects a relevant feature subset by removing irrelevant and redundant features to reduce the data dimension, comprises the following specific steps:
Compute the correlation between each feature vector and the known classes in the training set;
Determine a weight for each feature according to its correlation;
Delete the features whose weight is below the threshold.
Specifically, a sample S is randomly selected from the training set T. The k nearest-neighbour samples $H_k$ of S are then found in the sample set of the same class as S, and k nearest-neighbour samples $M_k$ are found in each sample set of a class different from that of S. The weight of each feature is updated according to the following formula.
$$W(A) = W(A) - \mathrm{similarity}_H(A) + \mathrm{difference}_M(A)$$
where
$M_j(C)$ denotes the j-th nearest sample in class C, and diff(A, S, R) denotes the difference between sample S and sample R on feature A, calculated as follows:
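A plausible form of the two terms and of diff, assuming the update follows the standard ReliefF algorithm, which the symbols $H_j$, $M_j(C)$, and diff(A, S, R) match. The class prior P(C), the number of sampling rounds n, and the range normalization in diff are assumptions taken from ReliefF, not from the text here:

```latex
\begin{align}
\mathrm{similarity}_H(A) &= \sum_{j=1}^{k} \frac{\mathrm{diff}(A, S, H_j)}{n\,k} \\
\mathrm{difference}_M(A) &= \sum_{C \neq \mathrm{class}(S)} \frac{P(C)}{1 - P(\mathrm{class}(S))}
    \sum_{j=1}^{k} \frac{\mathrm{diff}(A, S, M_j(C))}{n\,k} \\
\mathrm{diff}(A, S, R) &= \frac{\lvert S[A] - R[A] \rvert}{\max(A) - \min(A)}
\end{align}
```

Under this reading, a feature gains weight when S lies far from its different-class neighbours and close to its same-class neighbours on that feature.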
In the formulas above, the second formula essentially computes the sum of the distances on a given feature from sample S to its same-class nearest samples $H_k$, and the third formula computes the sum of the distances on that feature from sample S to its different-class nearest samples $M_k$. From the update rule in the first formula, when the sum of distances on a feature from S to its same-class nearest samples $H_k$ is smaller than the sum of distances on that feature to its different-class nearest samples $M_k$, the weight of the feature is raised; that is, the feature contributes positively to separating same-class samples from different-class samples. Conversely, when the same-class sum is larger than the different-class sum, the weight is reduced; that is, the feature contributes negatively to that separation. Because the choice of sample S is random, the procedure is repeated n times and the average weight of each feature is taken as its final weight. If the weight of a feature is greater than 0.5, the feature is highly correlated with the problem being solved; otherwise, the correlation is low. In particular, if the weight of a feature is below the threshold, the feature has almost no relationship with the problem and can be removed directly from the multidimensional feature vector group, which achieves feature selection.
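The weighting procedure can be sketched as below, assuming a ReliefF-style update with class-prior scaling and features already normalized to [0, 1], so that diff reduces to an absolute difference. The function names, the L1 neighbour distance, and the default parameters are illustrative assumptions.

```python
import random

def relieff_weights(samples, labels, k=2, rounds=30, seed=0):
    """Estimate one weight per feature; features are assumed scaled to [0, 1]."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    prior = {c: labels.count(c) / len(labels) for c in classes}
    n_features = len(samples[0])
    weights = [0.0] * n_features

    def nearest(i, cls):
        # k nearest neighbours of sample i inside class cls (L1 distance).
        pool = [j for j, y in enumerate(labels) if y == cls and j != i]
        pool.sort(key=lambda j: sum(abs(a - b) for a, b in zip(samples[i], samples[j])))
        return pool[:k]

    for _ in range(rounds):
        i = rng.randrange(len(samples))
        s, cls = samples[i], labels[i]
        for c in classes:
            # Same class lowers the weight; other classes raise it, scaled by prior.
            sign = -1.0 if c == cls else prior[c] / (1.0 - prior[cls])
            for j in nearest(i, c):
                for a in range(n_features):
                    weights[a] += sign * abs(s[a] - samples[j][a]) / (rounds * k)
    return weights

def select_features(weights, threshold):
    """Keep the indices of features whose weight reaches the threshold."""
    return [a for a, w in enumerate(weights) if w >= threshold]
```

With two classes that differ only in the first feature, that feature receives a large positive weight while a noisy second feature stays near or below zero, so `select_features` keeps only index 0.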
The multi-feature clustering, which partitions the traffic flow data by analyzing its multidimensional features, comprises the following specific steps:
Step 1: Initially set S = 1, and apply the K-Means clustering algorithm to the first m data items to compute k level-S centroids.
Step 2: Repeat step 1 until m level-S centroids are obtained.
Step 3: Apply the K-Means clustering algorithm to the m level-S centroids to compute k level-(S+1) centroids.
Step 4: Repeat step 3 until m level-(S+1) centroids are obtained, then set S = S + 1.
Step 5: Repeat the above steps; that is, whenever m level-S centroids are obtained, cluster them with the K-Means algorithm to obtain k level-(S+1) centroids, until the k final centroids are obtained.
Further, the present invention performs multi-feature cluster analysis based on the STREAM algorithm. STREAM builds on the K-Means algorithm and introduces a sliding window mechanism to handle the problems of clustering stream data. Its underlying framework is still the K-Means clustering algorithm, which is briefly analyzed below.
On the basis of K-Means, the STREAM algorithm carries out the clustering of stream data features. The underlying algorithm of STREAM is K-Means, with a batch processing mechanism added on top to address the concept drift that occurs in stream data.
The K-Means algorithm divides data points into classes according to the similarity of their distribution in the multidimensional feature space. Specifically, k objects are drawn at random from the data set as the initial centroids of k clusters; each remaining object is assigned to the closest cluster according to its Euclidean distance to each cluster centroid; and the centroid of each cluster is then recomputed. This process is iterated until the distortion function converges and k fixed centroids are obtained. The algorithm flow is as follows:
1. Draw k objects at random from the data set as the initial centroids $\mu_1, \mu_2, \ldots, \mu_k$ of the k clusters;
2. For each object, compute its Euclidean distance to each cluster center and reassign the object to the cluster at minimum distance, according to the following criterion:
$$C^{(i)} = \arg\min_j \lVert x^{(i)} - \mu_j \rVert^2$$
where $C^{(i)}$ is the class to which the i-th data object belongs, $x^{(i)}$ is the i-th data object, and $\mu_j$ is the j-th cluster center.
3. Update the centroids $\mu_1, \mu_2, \ldots, \mu_k$ of the k clusters according to the following formula:
$$\mu_j = \frac{\sum_i 1\{C^{(i)} = j\}\, x^{(i)}}{\sum_i 1\{C^{(i)} = j\}}$$
4. Repeat steps 2 and 3 until the distortion function below converges, yielding k centroids that no longer change:
$$J(c, \mu) = \sum_i \lVert x^{(i)} - \mu_{C^{(i)}} \rVert^2$$
where J(c, μ) is the distortion function and $\mu_{C^{(i)}}$ is the center of the cluster to which $x^{(i)}$ belongs after clustering is complete.
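The K-Means flow above, together with the level-by-level reduction of centroids described for STREAM, can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the function names, the convergence tolerance, the requirement m > k, and the final pass that clusters whatever remains in the buffers are all choices made here.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, tol=1e-9, seed=0):
    """Plain K-Means; returns (centroids, assignments, distortion)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # step 1: random initial centroids
    prev = float("inf")
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest centroid.
        assign = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Step 3: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, c in zip(points, assign) if c == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
        # Step 4: stop once the distortion J no longer decreases.
        distortion = sum(sq_dist(p, centroids[c]) for p, c in zip(points, assign))
        if prev - distortion <= tol:
            break
        prev = distortion
    return centroids, assign, distortion

def stream_centroids(stream, k, m):
    """Reduce a stream in batches: every m items at level S become k level-(S+1) centroids."""
    if m <= k:
        raise ValueError("m must be larger than k")
    levels = [[]]        # levels[0] buffers raw data, levels[s] buffers level-s centroids
    for point in stream:
        levels[0].append(point)
        s = 0
        while len(levels[s]) >= m:
            centroids, _, _ = kmeans(levels[s], k)
            levels[s] = []
            if s + 1 == len(levels):
                levels.append([])
            levels[s + 1].extend(centroids)
            s += 1
    # Final pass over everything still buffered (assumption for a finite stream).
    leftovers = [p for level in levels for p in level]
    return leftovers if len(leftovers) <= k else kmeans(leftovers, k)[0]
```

Only m items per level are held in memory at any time, which is what makes the batch mechanism suitable for data that arrives continuously.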
As shown in Fig. 3, real-time classification is based on decision tree theory: a random forest model is built, the classes obtained by the multi-feature clustering module are used as the training set, and the real-time traffic flow is classified to discriminate the real-time traffic state.
The real-time classification, which classifies the traffic flow data to discriminate the real-time traffic state, comprises the following steps:
Step 1: Randomly draw samples from the sample set with replacement, choosing m random samples in total;
Step 2: From the feature set obtained by feature selection, randomly select n features and build a CART decision tree model;
Step 3: Repeat steps 1 and 2 k times to generate k CART decision trees, each with its own independent decision criterion;
Step 4: Input the traffic flow data into every tree for decision, and finally determine the class to which the features belong.
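The four steps above can be sketched in plain Python as a small random forest. The Gini split criterion, midpoint thresholds, depth limit, majority vote, and all function names are assumptions made for illustration; class labels are assumed not to be tuples, because tuples mark internal tree nodes here.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def build_cart(rows, labels, features, depth=0, max_depth=5):
    """Grow a CART classification tree over the given feature indices."""
    if len(set(labels)) == 1 or depth == max_depth:
        return Counter(labels).most_common(1)[0][0]          # leaf: majority class
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        for a, b in zip(values, values[1:]):
            t = (a + b) / 2                                  # midpoint threshold
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:                                         # no usable split
        return Counter(labels).most_common(1)[0][0]
    _, f, t = best
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_cart([r for r, _ in lo], [y for _, y in lo], features, depth + 1, max_depth),
            build_cart([r for r, _ in hi], [y for _, y in hi], features, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's class."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, k_trees=15, n_features=2, seed=0):
    """Steps 1-3: bootstrap samples plus a random feature subset for each tree."""
    rng = random.Random(seed)
    forest = []
    for _ in range(k_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]       # draws with replacement
        feats = rng.sample(range(len(rows[0])), n_features)  # random feature subset
        forest.append(build_cart([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Step 4: every tree votes; the majority class wins."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]
```

The same code handles the four traffic state levels used in the embodiment; a two-class toy set of normalized (flow, speed, occupancy) rows is enough to exercise it.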
For convenience of discussion, the present invention takes freeway traffic flow data from Sichuan Province as an example, with simple processing applied to the raw data in advance. Table 1 gives 287 traffic flow records collected on a Sichuan Province freeway, with a collection period of 5 min. In the table, Volume is the flow field, Speed is the speed field, Density is the vehicle density field, Occupancy is the occupancy field, and Queue is the queuing duration field.
Table 1: Freeway traffic flow data of Sichuan Province
Observation shows that the traffic flow data in the table above is not entirely correct and still contains obvious errors. Under normal circumstances, occupancy should lie between 0 and 1. When the road is completely clear, vehicles do not need to stop on the road and occupancy is at its minimum, 0. When the road is severely congested, vehicles must wait on the road and occupancy reaches its peak, 1. However, several records in the table have an occupancy greater than 1 and are therefore erroneous. In addition, in some records the vehicle density drops to 0 while vehicles are present on the road, which is clearly unreasonable. The erroneous data therefore needs to be corrected.
A traffic parameter data model is established based on traffic flow theory, and the erroneous data is corrected using the following formula.
After the erroneous data is corrected, the features are normalized. The processed data is shown in Table 2, with 274 correct records in total.
Table 2: Preprocessed traffic flow data
After normalization, the values of all traffic flow features are distributed in [0, 1], which eliminates the model error caused by features being expressed in different forms.
After preprocessing, the traffic flow features should be analyzed: features that help determine the traffic state are selected, and features that do not help or are redundant are discarded, to improve the precision and reliability of the subsequent model. Before feature selection, experts in the traffic field were asked to manually judge the traffic state corresponding to a small portion of the traffic flow data, and this labeled data is used as the reference for traffic feature selection.
According to Table 1, the features contained in the traffic flow data are mainly traffic volume (Volume), travel speed (Speed), vehicle density (Density), occupancy (Occupancy), and queue length (Queue).
As shown in Fig. 4, feature selection is carried out using the method proposed by the present invention. Because the algorithm selects the sample S at random while running, the resulting weights may vary somewhat, so the present invention averages over repeated experiments: 30 experiments are run in total, the result of each run is collected, and the average of each weight is obtained. In the figure, Q represents flow, V represents travel speed, P represents vehicle density, O represents occupancy, and L represents queuing duration.
The average weight of each feature is shown in Table 3.
Table 3: Average feature weights
In the table, Q represents flow, V represents travel speed, P represents vehicle density, O represents occupancy, and L represents queuing duration.
According to the feature selection algorithm, the weights of flow, travel speed, and occupancy are all greater than 10%, while the weights of vehicle density and queuing duration are both below 5%. This indicates that, for this freeway, the three traffic flow features flow, travel speed, and occupancy can map the traffic state, whereas vehicle density and queue length have low correlation with the traffic state and should be removed.
According to traffic flow theory, the present invention defines four traffic state levels: free-flowing, slow-moving, congested, and severely congested.
The road traffic states output by the real-time classifier are shown in Table 4.
Table 4: Traffic state discrimination
The data is randomly split at a ratio of 1:4 to form the test set, and a real-time classification model is built to discriminate the road traffic state. The present invention uses accuracy as the index for evaluating the quality of the algorithm.
$$\mathrm{Accuracy} = \frac{TP + TN}{P + N}$$
where TP is the number of instances correctly classified as positive, that is, instances that are actually positive and are classified as positive by the classifier; TN is the number of instances correctly classified as negative, that is, instances that are actually negative and are classified as negative by the classifier; and P + N is the total number of samples.
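As a small worked illustration of the accuracy index, the sketch below counts TP and TN for one class treated as positive. The function name `accuracy` and the toy labels are hypothetical.

```python
def accuracy(actual, predicted, positive):
    """Accuracy = (TP + TN) / (P + N) with `positive` as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    tn = sum(a != positive and p != positive for a, p in pairs)  # true negatives
    return (tp + tn) / len(pairs)

# Five samples: two true positives and one true negative give 3 / 5.
print(accuracy([1, 1, 0, 0, 1], [1, 0, 0, 1, 1], positive=1))  # 0.6
```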
As shown in Fig. 5, real-time classification models are built using the multi-model fusion algorithm proposed by the present invention, a traditional clustering algorithm, and a single-feature threshold discrimination algorithm. Each model is used to discriminate the real-time traffic state of the Sichuan Province freeway, and the accuracy of each model is computed and used to evaluate its quality.
The experimental results show that the accuracy of the proposed state discrimination algorithm reaches about 94%, improving accuracy and validity over the traditional single-feature threshold discrimination method. In addition, the proposed feature selection method removes irrelevant features and improves discrimination precision.
The above is only a preferred embodiment of the present invention. It should be understood that the present invention is not limited to the form disclosed herein, which should not be regarded as excluding other embodiments; it can be used in various other combinations, modifications, and environments, and can be modified within the scope contemplated herein through the above teachings or the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the present invention shall all fall within the protection scope of the appended claims.

Claims (10)

1. A condition discrimination method based on multi-model fusion, characterized in that the method comprises the following:
Data preprocessing: preprocessing the acquired traffic flow data;
Feature selection: selecting a relevant feature subset and reducing the data dimension by removing irrelevant and redundant features;
Multi-feature clustering: partitioning the traffic flow data by analyzing its multidimensional features;
Real-time classification: classifying the traffic flow data to discriminate the real-time traffic state.
2. The condition discrimination method based on multi-model fusion according to claim 1, characterized in that the data preprocessing comprises the following steps:
Determining whether the acquired traffic flow data contain abnormal data, and processing the abnormal data;
Normalizing the data.
3. The condition discrimination method based on multi-model fusion according to claim 2, characterized in that determining whether the acquired traffic flow data contain abnormal data and processing the abnormal data comprises the following specific steps:
Determining whether the fluctuation range of each data field value is within a reasonable range;
If the data field value is outside the reasonable range, the data contain an obvious error, and the erroneous data are processed;
If the data field value fluctuates within the reasonable range, the data are normal.
4. The condition discrimination method based on multi-model fusion according to claim 2, characterized in that normalizing the data comprises the following specific steps:
Traversing the feature vectors of all traffic flow data to obtain the maximum value;
Traversing the feature vectors of all traffic flow data to obtain the minimum value;
Normalizing the feature vectors.
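The normalization steps of claim 4 can be sketched as follows. The claim only states that a maximum and a minimum are found and the feature vectors are normalized; the per-feature min-max formula (x - min) / (max - min), the function name `min_max_normalize`, and the sample values are assumptions for illustration.

```python
def min_max_normalize(vectors):
    """Scale every feature (column) of the given vectors to the range [0, 1]."""
    if not vectors:
        return []
    n_features = len(vectors[0])
    # Traverse all feature vectors to obtain the per-feature maximum and minimum
    maxs = [max(v[j] for v in vectors) for j in range(n_features)]
    mins = [min(v[j] for v in vectors) for j in range(n_features)]
    normalized = []
    for v in vectors:
        row = []
        for j in range(n_features):
            span = maxs[j] - mins[j]
            # A constant feature carries no information; map it to 0.0
            row.append((v[j] - mins[j]) / span if span else 0.0)
        normalized.append(row)
    return normalized


# Hypothetical records: (average speed in km/h, flow in vehicles per hour)
records = [[60.0, 100.0], [90.0, 300.0], [120.0, 200.0]]
scaled = min_max_normalize(records)
# scaled == [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```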
5. The condition discrimination method based on multi-model fusion according to claim 1, characterized in that the feature selection, which selects a relevant feature subset and reduces the data dimension by removing irrelevant and redundant features, comprises the following specific steps:
Calculating the correlation between each feature vector in the training set and the known classes;
Determining the weight of each feature according to its correlation;
Deleting the features whose weight is less than a threshold.
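A minimal sketch of the three steps of claim 5 is given below. The claim does not name a correlation measure, so the absolute Pearson correlation between a feature column and numerically encoded class labels is used here purely as an assumption; the function names, the threshold value, and the sample data are likewise hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, threshold):
    """Weight each feature by |correlation with the class| and drop low weights."""
    n_features = len(rows[0])
    weights = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_features)]
    kept = [j for j, w in enumerate(weights) if w >= threshold]
    return kept, weights


# Hypothetical training set: feature 0 tracks the class, feature 1 is constant,
# feature 2 is unrelated to the class.
rows = [[10, 5, 1], [20, 5, 0], [30, 5, 1], [40, 5, 0]]
labels = [0, 0, 1, 1]
kept, weights = select_features(rows, labels, threshold=0.3)
# Only feature 0 survives; its weight is about 0.894
```

Any other relevance score could be substituted for `pearson` without changing the weighting and thresholding steps.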
6. The condition discrimination method based on multi-model fusion according to claim 1, characterized in that the multi-feature clustering, which partitions the traffic flow data by analyzing its multidimensional features, comprises the following specific steps:
Step 1: initially set S=1, and apply the K-Means clustering algorithm to the initial m data points to compute k level-S centroids.
7. Step 2: repeat Step 1 until m level-S centroids are obtained.
8. Step 3: apply the K-Means clustering algorithm to the m level-S centroids to compute k level-(S+1) centroids.
9. Step 4: repeat Step 3 until m level-(S+1) centroids are obtained, then set S=S+1.
Step 5: repeat the above steps, that is, whenever m level-S centroids have been obtained, cluster them with the K-Means algorithm to obtain k level-(S+1) centroids, until the k final centroids are obtained.
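The multi-level clustering steps above can be sketched in plain Python as follows. This is an illustrative reading of the claim, not the patent's implementation: it processes a finite data set level by level instead of a live stream, passes batches that are too small to cluster up unchanged, and uses a basic K-Means with random initialization. The names `kmeans` and `multilevel_kmeans` and the synthetic data are assumptions.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Basic K-Means on a list of equal-length tuples; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign the point to the nearest centroid (squared Euclidean distance)
            nearest = min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[nearest].append(p)
        # Recompute each centroid; an empty cluster keeps its previous centroid
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def multilevel_kmeans(points, m, k):
    """Reduce each batch of m items to k centroids, level by level."""
    if not 0 < k < m:
        raise ValueError("expected 0 < k < m")
    level = [tuple(p) for p in points]
    while len(level) > m:
        next_level = []
        for start in range(0, len(level), m):
            batch = level[start:start + m]
            # Assumption: a leftover batch with at most k items is passed up as is
            next_level.extend(kmeans(batch, k) if len(batch) > k else batch)
        level = next_level  # these are the level-(S+1) centroids
    return kmeans(level, k) if len(level) > k else level


# Synthetic two-dimensional data around three hypothetical traffic regimes
rng = random.Random(1)
data = [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
        for cx, cy in [(0, 0), (10, 10), (20, 20)] for _ in range(20)]
rng.shuffle(data)
final_centroids = multilevel_kmeans(data, m=12, k=3)
```

Because every level shrinks each full batch of m items to k < m centroids, the loop always terminates, and only m items need to be clustered at any one time.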
10. The condition discrimination method based on multi-model fusion according to claim 1, characterized in that the real-time classification, which classifies the traffic flow data to discriminate the real-time traffic state, comprises the following steps:
Step 1: randomly sample the sample set with replacement, selecting m random samples in total;
Step 2: from the feature set obtained by feature selection, randomly select n features and build a CART decision tree model;
Step 3: repeat Step 1 and Step 2 k times to generate k CART decision trees, each with its own independent decision rules;
Step 4: input the traffic flow data into each tree for a decision, and finally determine the class to which the features belong.
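The four steps of the real-time classification claim describe a random-forest style ensemble, which can be sketched as below. Several details are assumptions rather than statements of the patent: the trees split on Gini impurity with a fixed depth limit, the final class is chosen by majority vote, and the feature meanings, labels, and data values are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_cart(rows, labels, features, depth=0, max_depth=3):
    """Grow a small CART-style tree; leaves are labels, nodes are tuples."""
    if len(set(labels)) == 1 or depth == max_depth:
        return Counter(labels).most_common(1)[0][0]
    best = None  # (weighted impurity, feature index, threshold)
    for f in features:
        # Excluding the largest value keeps both sides of every split non-empty
        for t in sorted(set(r[f] for r in rows))[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # every candidate feature is constant in this node
        return Counter(labels).most_common(1)[0][0]
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_cart([rows[i] for i in li], [labels[i] for i in li],
                       features, depth + 1, max_depth),
            build_cart([rows[i] for i in ri], [labels[i] for i in ri],
                       features, depth + 1, max_depth))


def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def train_forest(rows, labels, m, n, k, seed=0):
    rng = random.Random(seed)
    all_features = list(range(len(rows[0])))
    forest = []
    for _ in range(k):  # Step 3: repeat Steps 1 and 2 k times
        idx = [rng.randrange(len(rows)) for _ in range(m)]  # Step 1: with replacement
        feats = rng.sample(all_features, n)                 # Step 2: n random features
        forest.append(build_cart([rows[i] for i in idx],
                                 [labels[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    # Step 4: every tree decides; majority voting is an assumption
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


# Hypothetical records: (speed in km/h, flow in vehicles per hour, occupancy)
rows = [(105.0, 900.0, 0.08), (98.0, 1100.0, 0.10), (110.0, 800.0, 0.06),
        (92.0, 1200.0, 0.12), (30.0, 1600.0, 0.65), (22.0, 1800.0, 0.75),
        (35.0, 1500.0, 0.60), (18.0, 1900.0, 0.80)]
labels = ["free"] * 4 + ["congested"] * 4
forest = train_forest(rows, labels, m=8, n=2, k=15)
state_fast = predict_forest(forest, (120.0, 700.0, 0.03))
state_slow = predict_forest(forest, (10.0, 2000.0, 0.90))
```

Drawing a fresh bootstrap sample and a fresh feature subset for every tree is what gives each tree its own decision rules, so individual mistakes tend to be outvoted.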
CN201910650794.9A 2019-07-18 2019-07-18 A kind of condition discrimination method based on multi-model fusion Pending CN110390816A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910650794.9A CN110390816A (en) 2019-07-18 2019-07-18 A kind of condition discrimination method based on multi-model fusion

Publications (1)

Publication Number Publication Date
CN110390816A true CN110390816A (en) 2019-10-29

Family

ID=68285143

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191029