CN105843919A - Moving object track clustering method based on multi-feature fusion and clustering ensemble - Google Patents

Moving object track clustering method based on multi-feature fusion and clustering ensemble

Info

Publication number
CN105843919A
Authority
CN
China
Prior art keywords
clustering
cluster
fusion
track
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610176417.2A
Other languages
Chinese (zh)
Inventor
杨云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan University YNU
Original Assignee
Yunnan University YNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan University YNU filed Critical Yunnan University YNU
Priority to CN201610176417.2A priority Critical patent/CN105843919A/en
Publication of CN105843919A publication Critical patent/CN105843919A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a moving object trajectory clustering method based on multi-feature fusion and clustering ensemble. The method first captures the feature information of the target moving object's trajectory comprehensively; it then applies the K-means clustering algorithm to four selected trajectory feature representations to generate multiple initial clustering results; it quantifies the quality of these initial clustering results and obtains three fused clustering results by weighted summation; finally, it combines the three fused clustering results into a final clustering ensemble result. The method captures the feature information of the target moving object comprehensively, restores as far as possible the association between the trajectory's dynamic characteristics and its time segments, and resists interference well. Weights are assigned to the initial clustering results according to different clustering quality criteria, the number of classes is identified automatically during fusion, and the intrinsic structure of the clusters is captured effectively.

Description

A moving object trajectory clustering method based on multi-feature fusion and clustering ensemble
Technical field
The invention belongs to the technical field of trajectory data clustering, and in particular relates to a moving object trajectory clustering method based on multi-feature fusion and clustering ensemble.
Background
Moving object trajectory clustering is a very active applied research direction in data mining. Its goal is to group trajectory data so that similar trajectories fall into the same class and dissimilar trajectories fall into different classes; a class here is also called a cluster. The resulting trajectory clusters can be used to recognize the behavior patterns of moving objects. For example, clustering the trajectories of shopping-mall customers makes it possible to analyze the mall's foot traffic, popular shops, and customer purchasing behavior patterns; clustering the trajectories of vehicles on main traffic arteries supports intelligent traffic control and planning; and clustering the trajectories of crowds in large public places can automatically screen for suspicious persons with unusual behavior patterns and support an intelligent early-warning mechanism. Because this field has high theoretical and practical value, researchers in China and abroad have proposed many clustering techniques for moving object trajectories, which essentially comprise two tasks: feature extraction and cluster analysis. First, features are extracted from the raw trajectory data; second, cluster analysis is performed on the resulting feature representation. Although researchers have achieved some results in this field, trajectory data has very complex characteristics, such as complex temporal correlation, high dimensionality, massive volume, and noise interference, so existing techniques remain immature.
Because trajectory data has complex dynamic characteristics, high dimensionality, and massive volume, traditional clustering algorithms cannot obtain satisfactory results.
Summary of the invention
The object of the invention is to provide a moving object trajectory clustering method based on multi-feature fusion and clustering ensemble, intended to solve the problem that traditional clustering algorithms cannot obtain satisfactory results because trajectory data has complex dynamic characteristics, high dimensionality, and massive volume.
The invention is realized as follows: a moving object trajectory clustering method based on multi-feature fusion and clustering ensemble, comprising the following steps:
First, four complementary feature extraction techniques, namely polynomial curve fitting, discrete Fourier transform, piecewise local statistics, and piecewise discrete wavelet transform, are used to capture the feature information of the target moving object's trajectory comprehensively and to restore, as far as possible, the association between the trajectory's dynamic characteristics and its time segments;
Then, with different initialization settings, the K-means clustering algorithm is applied to the four selected trajectory feature representations, producing multiple initial clustering results;
Next, three substantially different evaluation criteria are selected to quantify the quality of the initial clustering results, yielding three weight vectors, and three fused clustering results are then obtained by weighted summation;
Finally, the three fused clustering results are combined to generate the final clustering ensemble result.
Further, the moving object trajectory clustering method based on multi-feature fusion and clustering ensemble specifically comprises the following steps:
Step 1, multi-feature extraction: four complementary feature extraction techniques, namely polynomial curve fitting, discrete Fourier transform, piecewise local statistics, and piecewise discrete wavelet transform, capture the feature information of the target moving object's trajectory comprehensively and restore, as far as possible, the association between the trajectory's dynamic characteristics and its time segments;
Step 2, initial clustering analysis: with different initialization settings, the K-means clustering algorithm is applied to the four selected trajectory feature representations. Because the feature representations and the K-means initialization settings differ, multiple different clustering results are produced, and their quality varies;
Step 3, weighted clustering ensemble learning (weighted fusion function): in the initial clustering analysis, combining the different feature representations and initialization settings generates M clustering results for a target dataset of N trajectories. An indicator matrix $H_m \in \{0,1\}^{N \times K_m}$ represents an initial clustering result $P_m$ with $K_m$ classes; each row corresponds to one trajectory, and each column is a binary vector in which 1 means that the corresponding trajectory is assigned to that class and 0 means that it is not. From the indicator matrix $H_m$ a similarity matrix $S_m = H_m H_m^{T}$, $S_m \in \{0,1\}^{N \times N}$, is computed; it records, for clustering result $P_m$, whether any two trajectories are similar and grouped into the same class. The clustering result represented by the fused similarity matrix combines all M initial clustering results. Three substantially different evaluation criteria, $\pi = \{\mathrm{MH\Gamma}, \mathrm{DVI}, \mathrm{NMI}\}$, are selected so that they complement one another; the weight of each initial clustering result is computed under each of the three criteria, and the similarity matrices $S^{\mathrm{MH\Gamma}}, S^{\mathrm{DVI}}, S^{\mathrm{NMI}}$ of the three fused clustering results are then obtained by weighted summation;
Step 4, final optimization function: after the three fused similarity matrices are generated, they are merged to generate the final clustering ensemble result; the number of classes is identified automatically in the process, producing the optimal result.
Further, the polynomial curve fitting method is least-squares polynomial fitting, which finds the coefficients of the mathematical equation that best fits the trajectory data x(t), modeled by a parametric polynomial function:
$$x(t) = \alpha_P t^{P} + \alpha_{P-1} t^{P-1} + \cdots + \alpha_1 t + \alpha_0$$
where $\alpha_p$ ($p = 0, 1, \ldots, P$) is the $p$-th order polynomial coefficient. The coefficients are obtained by minimizing the least-squares error function over all points of the trajectory; the polynomial model is a multi-order equation in $\alpha_p$ ($p = 0, 1, \ldots, P$). In general, a fourth-order polynomial performs best, and higher-order polynomials do not noticeably improve performance. The optimized coefficients $\alpha_p$ ($p = 0, 1, \ldots, 4$) of the fourth-order polynomial therefore constitute the complete PCF feature representation of the trajectory data x(t).
Further, for trajectory data x(t), the discrete Fourier transform produces a series of Fourier coefficients:
$$a_k = \frac{1}{T} \sum_{t=1}^{T} x(t) \exp\left(\frac{-j 2\pi k t}{T}\right), \quad k = 0, 1, \ldots, T-1$$
where $\pi$ is the constant pi and $k$ is the order of the Fourier coefficient. The first 16 coefficients $a_k$ ($k = 0, 1, \ldots, 16$) are selected to constitute the DFT feature representation of the trajectory data x(t).
Further, piecewise local statistics divides the trajectory into n segments of equal length |W|; if the final piece is shorter than one full segment, it is merged into the preceding segment. For each segment, the mean $\mu_n$ and the variance $\sigma_n$ are computed as:
$$\mu_n = \frac{1}{|W|} \sum_{t=1+(n-1)|W|}^{n|W|} x(t), \qquad \sigma_n = \frac{1}{|W|} \sum_{t=1+(n-1)|W|}^{n|W|} \left[x(t) - \mu_n\right]^2$$
For a complete trajectory x(t), the means $\mu_n$ and variances $\sigma_n$ of all segments constitute a PLS feature representation.
Further, the piecewise discrete wavelet transform divides the trajectory into n segments of equal length |W|; if the final piece is shorter than one full segment, it is merged into the preceding segment. For each segment, the DWT coefficients of order J = 2 are computed as:
$$\{x(t)\}_{t=(n-1)|W|}^{n|W|} \Rightarrow \left\{\Psi_L^{J}, \{\Psi_H^{j}\}_{j=1}^{J}\right\}$$
where $\Psi_H^{j}$ denotes the high-frequency information at level j and $\Psi_L^{j}$ denotes the low-frequency information at level j. For a complete trajectory x(t), the high-frequency information $\Psi_H^{j}$ of all segments at levels 1 to 2 and the low-frequency information $\Psi_L^{2}$ at level 2 constitute a PDWT feature representation.
Further, the K-means clustering algorithm is described as follows:
Step 1, arbitrarily select k objects from the n data objects as the initial cluster centers;
Step 2, using the mean of each cluster as its center object, compute the distance from each object to each center, and reassign each object to the cluster of its nearest center;
Step 3, recompute the mean of each cluster;
Step 4, repeat steps 2 and 3 until no cluster changes.
Further, under the three substantially different evaluation criteria, the weight of the initial clustering result $P_m$ is:
$$w_m^{\pi} = \frac{\pi(P_m)}{\sum_{m=1}^{M} \pi(P_m)}$$
The similarity matrices $S^{\mathrm{MH\Gamma}}, S^{\mathrm{DVI}}, S^{\mathrm{NMI}}$ of the three fused clustering results are then obtained by weighted summation:
$$S^{\pi} = \sum_{m=1}^{M} w_m^{\pi} S_m$$
Further, the similarity matrix of the final clustering ensemble result is computed as:
$$S^{*} = \sum_{\pi} S^{\pi} / M$$
The matrix is converted into a dendrogram whose horizontal axis corresponds to the data points and whose vertical axis represents the similarity between clusters. In this dendrogram, the lifetime of a node (cluster) is defined as the similarity interval between its creation and its merging with another node. A long lifetime indicates that merging the existing cluster structure any further would be unreasonable; cutting the dendrogram within the largest such interval therefore yields the correct number of classes and, in turn, the final clustering ensemble result.
The moving object trajectory clustering method based on multi-feature fusion and clustering ensemble provided by the invention uses multiple complementary feature extraction techniques to capture the feature information of the target moving object's trajectory comprehensively, restores to a large degree the association between the trajectory's dynamic characteristics and its time segments, and has strong noise resistance. By introducing a new weighted clustering ensemble learning technique, it further improves the stability and accuracy of moving object trajectory cluster analysis. The user does not need to set the number of clusters; the method identifies the number of classes automatically during cluster analysis and effectively captures the intrinsic structure of the clusters. Compared with a single clustering algorithm, clustering ensemble learning has three advantages: (1) the ensemble result is more accurate; (2) ensemble learning can uncover cluster information that a single clustering algorithm cannot; (3) ensemble learning is more robust to complex conditions such as noise, outliers, and sampling variation. Ensemble performance is generally improved by raising the clustering performance of the member clusterers and by increasing their diversity. The invention introduces clustering ensemble techniques to improve trajectory clustering performance. The aim of clustering ensemble learning is to obtain a highly reliable cluster analysis system by integrating multiple complementary clusterers: it seeks to produce member clusterers that generalize well and differ substantially from one another, to exploit the respective strengths of each member, and to obtain an ensemble result that is better than that of any single member clusterer.
The feature extraction methods disclosed by the invention are strongly complementary; they capture the feature information of the target moving object's trajectory comprehensively, restore as far as possible the association between the trajectory's dynamic characteristics and its time segments, and resist interference well. Unlike existing multi-feature extraction methods, the disclosed method fuses the feature representations nonlinearly: an initial clustering analysis is performed on each feature representation separately, and clustering ensemble techniques then integrate the initial clustering results into one optimal ensemble result. Existing multi-feature fusion techniques instead combine the feature representations linearly into a single high-dimensional feature vector and cluster that vector. Because the dimensionality of the feature vector increases substantially, the computational cost, accuracy, and stability of the cluster analysis are all unsatisfactory.
The weighted clustering ensemble technique disclosed by the invention improves on the averaging fusion used in existing clustering ensemble techniques: it assigns each initial clustering result a weight according to different clustering evaluation criteria, where a higher weight indicates a higher-quality initial clustering result, and then integrates the initial clustering results more reasonably by weighted summation. In addition, in the final optimization step the technique identifies the number of classes automatically, so it adapts to different moving object trajectory data and produces cluster analysis results with higher stability and accuracy, a capability that existing clustering ensemble techniques lack.
Brief description of the drawings
Fig. 1 is a flowchart of the moving object trajectory clustering method based on multi-feature fusion and clustering ensemble provided by an embodiment of the invention.
Fig. 2 is a schematic diagram of the final clustering ensemble optimization based on dendrogram cutting provided by an embodiment of the invention.
Fig. 3 shows the shopping-mall customer trajectory dataset provided by an embodiment of the invention.
Fig. 4 shows the trajectory cluster analysis results based on multi-feature fusion and ensemble learning provided by an embodiment of the invention.
Fig. 5 is a schematic diagram of cluster analysis performance (mean classification accuracy and variance) under noise of varying intensity, provided by an embodiment of the invention.
Fig. 6 is a schematic diagram of cluster analysis performance (classification accuracy) under interference, provided by an embodiment of the invention.
Detailed description of the invention
To make the object, technical solution, and advantages of the invention clearer, the invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
The application principle of the invention is explained in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the moving object trajectory clustering method based on multi-feature fusion and clustering ensemble of the embodiment of the invention comprises the following steps:
S101: four complementary feature extraction techniques are used to capture the feature information of the target moving object's trajectory comprehensively and to restore, as far as possible, the association between the trajectory's dynamic characteristics and its time segments;
S102: with different initialization settings, the K-means clustering algorithm is applied to the four selected trajectory feature representations;
S103: three substantially different evaluation criteria are selected to quantify the quality of the initial clustering results, yielding three weight vectors, and three fused clustering results are then obtained by weighted summation;
S104: the three fused clustering results are combined to generate the final clustering ensemble result.
The moving object trajectory clustering method based on multi-feature fusion and ensemble learning proposed by the invention comprises multi-feature extraction, initial clustering analysis, a weighted fusion function, and a final optimization function:
(1) Multi-feature extraction
Trajectory data is high-dimensional, large in scale, and noisy. Performing cluster analysis directly in the raw information domain of the trajectory data is not only inefficient but also impairs the reliability and accuracy of the clustering result. An effective feature representation that preserves the key information of the trajectory data while reducing its dimensionality and removing noise is therefore very important for trajectory cluster analysis. Four complementary feature extraction techniques are used to capture the feature information of the target moving object's trajectory comprehensively and to restore, as far as possible, the association between the trajectory's dynamic characteristics and its time segments; they also have strong noise resistance. Each feature extraction method is as follows:
The purpose of polynomial curve fitting (PCF) is to find a mathematical formula that represents the data signal and reduces the degree to which the data is affected by noise. The most common fitting method is least-squares polynomial fitting, which finds the coefficients of the mathematical equation that best fits the trajectory data x(t); the data can be modeled by a parametric polynomial function:
$$x(t) = \alpha_P t^{P} + \alpha_{P-1} t^{P-1} + \cdots + \alpha_1 t + \alpha_0$$
where $\alpha_p$ ($p = 0, 1, \ldots, P$) is the $p$-th order polynomial coefficient. The coefficients are obtained by minimizing the least-squares error function over all points of the trajectory; the polynomial model is a multi-order equation in $\alpha_p$ ($p = 0, 1, \ldots, P$). In general, a fourth-order polynomial performs best, and higher-order polynomials do not noticeably improve performance. The optimized coefficients $\alpha_p$ ($p = 0, 1, \ldots, 4$) of the fourth-order polynomial therefore constitute the complete PCF feature representation of the trajectory data x(t).
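As a rough illustration of the PCF feature, the following sketch fits a fourth-order least-squares polynomial to one coordinate of a trajectory with NumPy. The function name `pcf_features` and the choice of t = 1, ..., T as the time axis are assumptions made for this example; the text does not prescribe an implementation.

```python
import numpy as np

def pcf_features(x, degree=4):
    """Least-squares polynomial coefficients of one trajectory coordinate x(t).

    Assumption: samples are taken at t = 1, ..., T.
    Returns the coefficients ordered [alpha_P, ..., alpha_1, alpha_0].
    """
    x = np.asarray(x, dtype=float)
    t = np.arange(1, len(x) + 1, dtype=float)
    # np.polyfit minimizes the squared error between the polynomial and x(t).
    return np.polyfit(t, x, degree)

# Usage: a noiseless quadratic is recovered with zero cubic and quartic terms.
t = np.arange(1, 13, dtype=float)
coeffs = pcf_features(2.0 * t**2 + 3.0)
```

For a two-dimensional trajectory, one option is to fit each coordinate separately and concatenate the two coefficient vectors; that choice is also an assumption of this sketch.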
The discrete Fourier transform (DFT) is a linear integral transform of the data that converts the representation of a trajectory from the raw information domain to the frequency domain; Fourier analysis is among the most widely used tools for transforming sequence data. Analyzing trajectory data in the frequency domain readily reveals important properties that are hard to observe in the raw information domain. For discrete samples, the discrete Fourier transform maps a discrete sequence in the raw information domain (the observations) to a discrete sequence in the frequency domain (the frequency coefficients). For trajectory data x(t), the discrete Fourier transform produces a series of Fourier coefficients:
$$a_k = \frac{1}{T} \sum_{t=1}^{T} x(t) \exp\left(\frac{-j 2\pi k t}{T}\right), \quad k = 0, 1, \ldots, T-1$$
To form a stable DFT feature representation in the presence of noise, only the first 16 coefficients (which correspond to low-frequency information) are selected to constitute the DFT feature representation of the trajectory data x(t).
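The DFT feature can be sketched directly from the formula for the coefficients a_k. The function name `dft_features` is hypothetical, and stacking the real and imaginary parts into one real-valued vector is an assumption made here so that the result can feed a real-valued clustering algorithm; the text does not say how the complex coefficients are encoded.

```python
import numpy as np

def dft_features(x, n_coeffs=16):
    """First n_coeffs Fourier coefficients a_k of a trajectory coordinate x(t).

    Implements a_k = (1/T) * sum_{t=1..T} x(t) * exp(-j*2*pi*k*t/T).
    Assumption: real and imaginary parts are concatenated into a real vector.
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    t = np.arange(1, T + 1)                    # t = 1, ..., T as in the formula
    k = np.arange(min(n_coeffs, T))[:, None]   # one row per retained coefficient
    a = (x * np.exp(-2j * np.pi * k * t / T)).sum(axis=1) / T
    return np.concatenate([a.real, a.imag])

# Usage: a 64-sample random signal gives a 32-dimensional real feature vector.
x = np.random.default_rng(0).normal(size=64)
features = dft_features(x)
```

Because the sum starts at t = 1, these coefficients equal `np.fft.fft(x)[k] / T` multiplied by the phase factor exp(-j*2*pi*k/T).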
Piecewise local statistics (PLS) is a feature representation method that processes the trajectory segment by segment. First, the trajectory is divided into n segments of equal length |W|; if the final piece is shorter than one full segment, it is merged into the preceding segment. For each segment, the mean $\mu_n$ and the variance $\sigma_n$ are computed as:
$$\mu_n = \frac{1}{|W|} \sum_{t=1+(n-1)|W|}^{n|W|} x(t), \qquad \sigma_n = \frac{1}{|W|} \sum_{t=1+(n-1)|W|}^{n|W|} \left[x(t) - \mu_n\right]^2$$
For a complete trajectory x(t), the means $\mu_n$ and variances $\sigma_n$ of all segments constitute a PLS feature representation.
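A minimal sketch of the PLS feature follows. The helper names `split_segments` and `pls_features` are hypothetical. One detail is an assumption: for the last segment, which may absorb a short remainder, the statistics are normalized by the segment's actual length instead of |W|.

```python
import numpy as np

def split_segments(x, w):
    """Split x into segments of length w; a short remainder joins the last one."""
    x = np.asarray(x, dtype=float)
    n = max(len(x) // w, 1)
    bounds = [i * w for i in range(n)] + [len(x)]
    return [x[bounds[i]:bounds[i + 1]] for i in range(n)]

def pls_features(x, w):
    """Per-segment means followed by per-segment variances."""
    segments = split_segments(x, w)
    means = np.array([s.mean() for s in segments])
    variances = np.array([((s - s.mean()) ** 2).mean() for s in segments])
    return np.concatenate([means, variances])

# Usage: 10 samples with |W| = 4 give two segments, [0..3] and [4..9].
features = pls_features(np.arange(10), w=4)
```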
The piecewise discrete wavelet transform (PDWT) is likewise a feature representation method that processes the trajectory segment by segment. The discrete wavelet transform is an effective multiscale analysis tool that exhibits good localization in both the raw information domain and the frequency domain. The wavelet transform uses wavelet functions to convert a data signal into a wavelet series, so the coefficients of the wavelet series can characterize the trajectory. As before, the trajectory is first divided into n segments of equal length |W|; if the final piece is shorter than one full segment, it is merged into the preceding segment. For each segment, the DWT coefficients of order J = 2 are computed as:
$$\{x(t)\}_{t=(n-1)|W|}^{n|W|} \Rightarrow \left\{\Psi_L^{J}, \{\Psi_H^{j}\}_{j=1}^{J}\right\}$$
where $\Psi_H^{j}$ denotes the high-frequency information at level j and $\Psi_L^{j}$ denotes the low-frequency information at level j. For a complete trajectory x(t), the high-frequency information $\Psi_H^{j}$ of all segments at levels 1 to 2 and the low-frequency information $\Psi_L^{2}$ at level 2 constitute a PDWT feature representation.
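The PDWT feature can be illustrated with a two-level decomposition per segment. The text does not name a wavelet family, so the Haar wavelet used below is an assumption, as are the function names `haar_step` and `pdwt_features` and the padding of odd-length inputs by repeating the last sample.

```python
import numpy as np

def haar_step(x):
    """One Haar analysis step: returns (low-frequency, high-frequency) halves."""
    if len(x) % 2:                      # assumption: pad odd lengths by repetition
        x = np.append(x, x[-1])
    pairs = x.reshape(-1, 2)
    low = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    high = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
    return low, high

def pdwt_features(x, w, levels=2):
    """Per segment: [low at level J, high at level J, ..., high at level 1]."""
    x = np.asarray(x, dtype=float)
    n = max(len(x) // w, 1)
    bounds = [i * w for i in range(n)] + [len(x)]   # remainder joins last segment
    features = []
    for i in range(n):
        low = x[bounds[i]:bounds[i + 1]]
        highs = []
        for _ in range(levels):
            low, high = haar_step(low)
            highs.append(high)
        features.append(np.concatenate([low] + highs[::-1]))
    return np.concatenate(features)

# Usage: a constant segment has no high-frequency content.
flat = pdwt_features(np.full(8, 3.0), w=8)
```

With an orthonormal wavelet such as Haar and even-length segments, the feature vector preserves the energy of the segment, which is a convenient sanity check.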
(2) Initial clustering analysis
In the initial clustering analysis, with different initialization settings, the K-means clustering algorithm is applied to the four selected trajectory feature representations. Because the feature representations and the K-means initialization settings differ, multiple different clustering results are produced, and their quality varies. The K-means clustering algorithm is described as follows:
1. Arbitrarily select k objects from the n data objects as the initial cluster centers;
2. Using the mean (center object) of each cluster, compute the distance from each object to each of these centers, and reassign each object to the cluster of its nearest center;
3. Recompute the mean (center object) of each cluster that has changed;
4. Repeat (2) and (3) until no cluster changes.
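The four steps above can be sketched as follows. The function name `kmeans`, the squared Euclidean distance, the `seed` parameter standing in for the "different initialization settings", and the `max_iter` safety cap are assumptions of this example.

```python
import numpy as np

def kmeans(X, k, seed=0, max_iter=100):
    """Plain K-means on the rows of X; returns (labels, centers)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct objects as the initial cluster centers.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Step 2: assign every object to its nearest center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        # Step 4: stop once no assignment changes.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: recompute the mean of every non-empty cluster.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels, centers

# Usage: two well-separated groups of feature vectors.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels, centers = kmeans(X, k=2, seed=0)
```

Running the function with several seeds and on each of the four feature representations is one way to produce the M initial clustering results used in the fusion stage.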
(3) Weighted fusion function
In the initial clustering analysis module, combining the different feature representations and initialization settings generates M clustering results for a target dataset of N trajectories. An indicator matrix $H_m \in \{0,1\}^{N \times K_m}$ represents an initial clustering result $P_m$ with $K_m$ classes; each row corresponds to one trajectory, and each column is a binary vector in which 1 means that the corresponding trajectory is assigned to that class and 0 means that it is not. From the indicator matrix $H_m$ a similarity matrix $S_m = H_m H_m^{T}$, $S_m \in \{0,1\}^{N \times N}$, can be computed. This similarity matrix records, for clustering result $P_m$, whether any two trajectories are similar and grouped into the same class. For example, if the element in row 2, column 6 of the similarity matrix is 1, trajectories 2 and 6 are grouped into the same class; otherwise they are dissimilar and are not grouped into the same class. Next, the similarity matrices of the M initial clustering results are averaged, which gives the similarity matrix of the fused clustering:
$$S = \sum_{m=1}^{M} S_m / M$$
The clustering result represented by this similarity matrix integrates all M initial clustering results, and its robustness and accuracy are higher than those of the individual initial clustering results.
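The indicator matrix, the per-partition similarity matrix, and the averaged fusion can be sketched as below. The function names are hypothetical, and computing the similarity matrix as the product of the indicator matrix with its transpose is an assumption that matches the description (an entry is 1 exactly when two trajectories share a class).

```python
import numpy as np

def indicator_matrix(labels):
    """N x K_m binary matrix: entry (i, c) is 1 if trajectory i is in class c."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    return (labels[:, None] == classes[None, :]).astype(int)

def similarity_matrix(labels):
    """N x N binary matrix: entry (i, j) is 1 if i and j share a class."""
    H = indicator_matrix(labels)
    return H @ H.T

def average_fusion(partitions):
    """Unweighted average of the similarity matrices of M partitions."""
    return sum(similarity_matrix(p) for p in partitions) / len(partitions)

# Usage: two partitions of three trajectories that disagree on trajectory 1.
S = average_fusion([[0, 0, 1], [0, 1, 1]])
```

In the fused matrix, values between 0 and 1 record how often two trajectories were grouped together across the initial clustering results.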
This averaging assumes that the M initial clustering results contribute equally to the final fused clustering result. In practice, however, the quality of the initial clustering results differs, and a high-quality initial clustering result should play a more important role in constructing the final fused clustering. For example, when a meeting decides on a proposal by vote, the opinion of a voter with experience and relevant expertise deserves more attention and should carry more weight in the final decision. In the invention, the quality of each initial clustering result is evaluated and quantified as a weight; a higher weight indicates a higher-quality initial clustering result. The standard for evaluating clustering quality is not unique, and many evaluation methods exist, so three substantially different evaluation criteria, $\pi = \{\mathrm{MH\Gamma}, \mathrm{DVI}, \mathrm{NMI}\}$, are selected so that they complement one another. Under these three criteria, the weight of the initial clustering result $P_m$ is:
$$w_m^{\pi} = \frac{\pi(P_m)}{\sum_{m=1}^{M} \pi(P_m)}$$
The similarity matrices $S^{\mathrm{MH\Gamma}}, S^{\mathrm{DVI}}, S^{\mathrm{NMI}}$ of the three fused clustering results are then obtained by weighted summation:
$$S^{\pi} = \sum_{m=1}^{M} w_m^{\pi} S_m$$
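The weighting and weighted summation can be sketched for a single evaluation criterion as follows. The function name `weighted_fusion` is hypothetical, and the quality scores are passed in as plain numbers because the text does not specify how each criterion is computed; the sketch assumes non-negative scores with a positive sum.

```python
import numpy as np

def weighted_fusion(similarities, scores):
    """Fuse M similarity matrices using quality scores from one criterion.

    similarities: sequence of M arrays of shape (N, N)
    scores: M non-negative quality scores pi(P_m) (assumed given)
    Returns (fused similarity matrix, normalized weights).
    """
    scores = np.asarray(scores, dtype=float)
    weights = scores / scores.sum()              # w_m = pi(P_m) / sum_m pi(P_m)
    stacked = np.asarray(similarities, dtype=float)
    fused = np.tensordot(weights, stacked, axes=1)   # sum_m w_m * S_m
    return fused, weights

# Usage: the first partition scores three times higher than the second.
S_pi, w = weighted_fusion([np.ones((2, 2)), np.eye(2)], scores=[3.0, 1.0])
```

Calling the function once per criterion, with the scores that criterion assigns to the M initial clustering results, would give the three fused similarity matrices.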
(4) Final optimization function
After the three fused similarity matrices are generated, they are merged to generate the final clustering ensemble result; the number of classes is identified automatically in the process, producing the optimal result. First, the similarity matrix of the final clustering ensemble result is computed:
$$S^{*} = \sum_{\pi} S^{\pi} / M$$
Then this matrix is converted into a dendrogram, as shown in Fig. 2, whose horizontal axis corresponds to the data points and whose vertical axis represents the similarity between clusters. In this dendrogram, the lifetime of a node (cluster) is defined as the similarity interval between its creation and its merging with another node. A long lifetime indicates that merging the existing cluster structure any further would be unreasonable; cutting the dendrogram within the largest such interval therefore yields the correct number of classes and, in turn, the final clustering ensemble result.
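The dendrogram cut can be illustrated with a small agglomerative procedure on the final similarity matrix. The text does not state how the dendrogram is built, so the average-linkage merging used here is an assumption, as is the function name `cut_by_lifetime`. The sketch records the similarity level of every merge and cuts inside the largest gap between consecutive levels.

```python
import numpy as np

def cut_by_lifetime(S):
    """Cluster labels obtained by cutting the dendrogram at the longest lifetime."""
    S = np.asarray(S, dtype=float)
    clusters = [[i] for i in range(len(S))]
    history = [[list(c) for c in clusters]]   # partition after each merge
    levels = []                               # similarity level of each merge
    while len(clusters) > 1:
        best, pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage (assumption): mean similarity between members.
                sim = S[np.ix_(clusters[a], clusters[b])].mean()
                if sim > best:
                    best, pair = sim, (a, b)
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        levels.append(best)
        history.append([list(c) for c in clusters])
    if len(levels) < 2:                       # fewer than three points: one cluster
        return np.zeros(len(S), dtype=int)
    lifetimes = -np.diff(levels)              # gap between consecutive merge levels
    cut = int(np.argmax(lifetimes)) + 1       # number of merges to keep
    labels = np.empty(len(S), dtype=int)
    for c, members in enumerate(history[cut]):
        labels[members] = c
    return labels

# Usage: two blocks with high within-block and low between-block similarity.
S = np.full((6, 6), 0.1)
S[:3, :3] = 0.9
S[3:, 3:] = 0.9
np.fill_diagonal(S, 1.0)
labels = cut_by_lifetime(S)
```

The number of classes is not passed in: it follows from where the largest gap falls, which mirrors the automatic identification of the class number described above. The pairwise search is quadratic per merge and is meant only for small examples.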
The effect of applying the present invention is explained in detail below by comparison.
(1) Multi-feature extraction and fusion
The feature extraction methods disclosed in the present invention are strongly complementary: together they capture the characteristic information of a moving object trajectory comprehensively, retain as far as possible the association between the dynamic characteristics of the trajectory and its time segments, and offer good resistance to interference. Unlike existing multi-feature extraction methods, the disclosed method fuses the feature representations non-linearly: an initial cluster analysis is performed separately on each feature representation, and the clustering ensemble technique then integrates the initial clustering results into one optimal clustering ensemble result. Existing multi-feature fusion techniques instead combine the feature representations linearly into a single high-dimensional feature vector and perform cluster analysis on that vector. Because the dimension of the feature vector increases substantially, the computational cost, accuracy, and stability of the cluster analysis are all unsatisfactory.
(2) Weighted clustering ensemble (weighted fusion function and final optimization function)
The weighted clustering ensemble technique disclosed in the present invention improves on the average-summation fusion used in existing clustering ensemble techniques. Weights are assigned to the initial clustering results according to different clustering evaluation criteria; the higher the weight, the higher the quality of the corresponding initial clustering result. The initial clustering results are then integrated more reasonably by weighted summation. In addition, in the final optimization step the technique identifies the number of classes automatically, so that it adapts to different moving object trajectory data and produces cluster analysis results with higher stability and accuracy, a capability that existing clustering ensemble techniques lack.
The moving object trajectory clustering method based on multi-feature fusion and ensemble learning disclosed in the present invention has been validated on a real data set of shopper movement trajectories in a shopping mall. As shown in Fig. 3, 222 movement trajectories were collected from the mall's surveillance video within a fixed time range and marked with black lines. As shown in Fig. 4, the disclosed clustering algorithm automatically and efficiently assigns the 222 trajectories to 15 classes; trajectories in the same class are highly similar and represent a corresponding behavior pattern. To verify the algorithm's resistance to interference, Gaussian noise N(0, σ) was artificially added to the shopper trajectory data set. As shown in Fig. 5, σ denotes the magnitude of the added Gaussian noise; the larger the value, the stronger the interference. Taking the clustering result obtained in the undisturbed environment as the benchmark, the clustering results obtained under interference with multi-feature fusion and with each individual feature representation were compared against it (by mean correct classification rate and variance). The results show that, at every interference strength, the algorithm attains the highest mean accuracy and the lowest variance among the compared methods, which demonstrates its accuracy and stability. Finally, to verify the algorithm's resistance under strongly disturbed conditions (added noise combined with deletion of part of the trajectory information), Gaussian noise N(0, σ) with σ = 0.1 was added to the original trajectory data set and part of each trajectory was truncated at random. As shown in Fig. 6, the horizontal axis represents the ratio of the amount of trajectory information lost to the amount of information in the complete trajectory, and the vertical axis represents the classification accuracy. The results show that an accuracy of more than 86% is still obtained under strong interference.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (9)

1. A moving object trajectory clustering method based on multi-feature fusion and clustering ensemble, characterized in that the method comprises the following steps:
first, using four complementary feature extraction techniques, namely polynomial curve fitting, discrete Fourier transform, piecewise local statistics, and piecewise discrete wavelet transform, to capture the characteristic information of a moving object trajectory comprehensively and to retain as far as possible the association between the dynamic characteristics of the trajectory and its time segments;
then, given different initialization settings, performing cluster analysis on the four selected trajectory feature representations with the K-means clustering algorithm;
next, selecting three evaluation methods that differ substantially from one another, quantifying the quality of the multiple initial clustering results to obtain three weight vectors, and obtaining three fused clustering results by weighted summation;
finally, combining the three fused clustering results to generate the final clustering ensemble result.
2. The moving object trajectory clustering method based on multi-feature fusion and clustering ensemble as claimed in claim 1, characterized in that the method specifically comprises the following steps:
Step 1, multi-feature extraction: four complementary feature extraction techniques, namely polynomial curve fitting, discrete Fourier transform, piecewise local statistics, and piecewise discrete wavelet transform, capture the characteristic information of the moving object trajectory comprehensively and retain as far as possible the association between the dynamic characteristics of the trajectory and its time segments;
Step 2, initial cluster analysis: given different initialization settings, the K-means clustering algorithm performs cluster analysis on the four selected trajectory feature representations; because the feature representations differ and the K-means initialization settings differ, multiple different clustering results are produced, and their quality varies;
Step 3, weighted fusion function: in the initial cluster analysis, different feature representations are combined with different initialization settings, so that M clustering results are generated for a target data set containing N trajectories; an indicator matrix $H_m$ is used to represent an initial clustering result $P_m$ with $K_m$ classes, in which each row corresponds to one trajectory and each column is a binary vector, the value 1 indicating that the corresponding trajectory is assigned to that class and the value 0 that it is not; from the indicator matrix $H_m$ a similarity matrix $S_m \in \{0,1\}^{N \times N}$ is calculated, which indicates, for the clustering result $P_m$, whether any two trajectories are similar and grouped into the same class; the clustering result represented by the similarity matrices is the combination of all M initial clustering results; three evaluation methods that differ substantially from one another, π = {MHΓ, DVI, NMI}, are selected so that they complement each other; the weight corresponding to each initial clustering result is calculated under each of the three evaluation methods, and the similarity matrices of the three fused clustering results, $S^{MH\Gamma}$, $S^{DVI}$ and $S^{NMI}$, are then obtained by weighted summation;
Step 4, final optimization function: after the three fused similarity matrices have been generated, they are merged to produce the final clustering ensemble result, and the number of classes is identified automatically in the process so that an optimal result is produced.
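A small sketch may help make the indicator and similarity matrices of step 3 concrete. It assumes the usual co-association construction, in which the similarity matrix is the product of the indicator matrix with its transpose, so that an entry is 1 exactly when two trajectories share a class; the extracted text does not preserve this formula, and the function names and the label vector below are hypothetical.

```python
def indicator_matrix(labels):
    """Binary N x K matrix: row i has a 1 in the column of its class."""
    classes = sorted(set(labels))
    return [[1 if lab == c else 0 for c in classes] for lab in labels]


def similarity_matrix(H):
    """Co-association matrix H * H^T (assumed construction)."""
    n = len(H)
    return [
        [sum(a * b for a, b in zip(H[i], H[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical clustering of five trajectories into three classes.
labels = [0, 0, 1, 2, 1]
H = indicator_matrix(labels)
S = similarity_matrix(H)
# S[0][1] == 1 (same class), S[0][2] == 0 (different classes)
```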
3. The moving object trajectory clustering method based on multi-feature fusion and ensemble learning as claimed in claim 2, characterized in that the polynomial curve fitting uses the least-squares polynomial method to find the coefficients of the mathematical equation that best fits the trajectory data x(t), modeled by a parametric polynomial function:
$$x(t) = \alpha_P t^P + \alpha_{P-1} t^{P-1} + \cdots + \alpha_1 t + \alpha_0$$
where $\alpha_p$ ($p = 0, 1, \ldots, P$) is the polynomial coefficient of order p; by minimizing the least-squares error function over all data points of the trajectory, the polynomial model is a multi-order equation in $\alpha_p$ ($p = 0, 1, \ldots, P$); a fourth-order polynomial is used, and all the optimized coefficients $\alpha_p$ ($p = 0, 1, \ldots, 4$) constitute a complete PCF feature representation of the trajectory data x(t).
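The least-squares fit can be sketched in pure Python by solving the normal equations directly. This is an illustration under stated assumptions rather than the patented implementation: the time axis is rescaled to [0, 1] for numerical stability, and the function name and the test signal are hypothetical.

```python
def polyfit_coeffs(x, degree=4):
    """Least-squares coefficients [a_0, ..., a_P] for samples x(t).

    Assumes len(x) > degree and a time axis rescaled to [0, 1].
    """
    T = len(x)
    ts = [t / (T - 1) for t in range(T)]
    m = degree + 1
    # Normal equations (V^T V) a = V^T x with V[t][p] = ts[t] ** p.
    A = [[sum(t ** (r + c) for t in ts) for c in range(m)] for r in range(m)]
    b = [sum((t ** r) * v for t, v in zip(ts, x)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(A[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (b[r] - s) / A[r][r]
    return coeffs


# Hypothetical trajectory coordinate that is exactly quadratic in time.
signal = [1 + 2 * (t / 19) - 3 * (t / 19) ** 2 for t in range(20)]
pcf = polyfit_coeffs(signal, degree=4)   # five coefficients a_0 .. a_4
```

For this quadratic test signal the fitted third- and fourth-order coefficients come out close to zero, and the five coefficients together form the feature vector.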
4. The moving object trajectory clustering method based on multi-feature fusion and ensemble learning as claimed in claim 2, characterized in that the discrete Fourier transform produces, for the trajectory data x(t), a series of Fourier coefficients:
$$a_k = \frac{1}{T} \sum_{t=1}^{T} x(t) \exp\left(\frac{-j 2 \pi k t}{T}\right), \quad k = 0, 1, \ldots, T-1$$
where π denotes the constant pi and k is the order of the discrete Fourier transform function; the leading 16 coefficients $a_k$ ($k = 0, 1, \ldots, 16$) are selected to constitute the DFT feature representation of the trajectory data x(t).
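The coefficient formula can be evaluated directly with the standard library, as in the following sketch. It keeps the first 16 coefficients starting at k = 0, which is one reading of the selection rule and should be treated as an assumption; the function name and the constant test signal are hypothetical.

```python
import cmath


def dft_features(x, n_coeffs=16):
    """First n_coeffs Fourier coefficients a_k of the samples x(1..T)."""
    T = len(x)
    feats = []
    for k in range(min(n_coeffs, T)):
        a_k = sum(
            x[t - 1] * cmath.exp(-2j * cmath.pi * k * t / T)
            for t in range(1, T + 1)
        ) / T
        feats.append(a_k)
    return feats


# Hypothetical constant signal: only a_0 is non-zero.
coeffs = dft_features([2.0] * 32)
```

The coefficients are complex; a real-valued feature vector could be built from their real and imaginary parts or from their magnitudes, a choice the text leaves open.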
5. The moving object trajectory clustering method based on multi-feature fusion and ensemble learning as claimed in claim 2, characterized in that the piecewise local statistics divide the trajectory into n segments, each of the same length |W|; if the final portion is shorter than one complete segment, it is merged into the preceding segment; for each segment, the mean $\mu_n$ and variance $\sigma_n$ are calculated as follows:
$$\mu_n = \frac{1}{|W|} \sum_{t=1+(n-1)|W|}^{n|W|} x(t), \qquad \sigma_n = \frac{1}{|W|} \sum_{t=1+(n-1)|W|}^{n|W|} \left[ x(t) - \mu_n \right]^2$$
for a complete trajectory x(t), the means $\mu_n$ and variances $\sigma_n$ of all segments constitute a PLS feature representation.
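The segmentation rule and the per-segment statistics can be sketched as follows. The function names are hypothetical, and for a merged final segment the sketch divides by the actual segment length rather than |W|, which is an assumption about how the merged case is handled.

```python
def split_segments(x, w):
    """Windows of length w; a shorter remainder joins the previous window."""
    segs = [list(x[i:i + w]) for i in range(0, len(x), w)]
    if len(segs) > 1 and len(segs[-1]) < w:
        segs[-2] = segs[-2] + segs[-1]
        segs.pop()
    return segs


def pls_features(x, w):
    """Concatenated (mean, variance) pairs of all segments."""
    feats = []
    for seg in split_segments(x, w):
        mu = sum(seg) / len(seg)
        var = sum((v - mu) ** 2 for v in seg) / len(seg)
        feats.extend([mu, var])
    return feats


# Seven samples with |W| = 2: the last lone sample joins the third window.
pls = pls_features([1, 1, 3, 3, 5, 5, 5], 2)   # [1.0, 0.0, 3.0, 0.0, 5.0, 0.0]
```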
6. The moving object trajectory clustering method based on multi-feature fusion and ensemble learning as claimed in claim 2, characterized in that the piecewise discrete wavelet transform divides the trajectory into n segments, each of the same length |W|; if the final portion is shorter than one complete segment, it is merged into the preceding segment; for each segment, the DWT coefficients of order J = 2 are calculated as follows:
$$\{x(t)\}_{t=(n-1)|W|}^{n|W|} \Rightarrow \left\{ \Psi_L^J, \{\Psi_H^j\}_{j=1}^{J} \right\}$$
where $\Psi_H^j$ denotes the high-frequency information at level j and $\Psi_L^j$ denotes the low-frequency information at level j; for a complete trajectory x(t), the high-frequency information $\Psi_H^j$ of all segments at levels 1 and 2, together with the low-frequency information $\Psi_L^2$ at level 2, constitutes a PDWT feature representation.
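A two-level decomposition per segment can be sketched with the Haar wavelet. The choice of Haar is an assumption, since the text does not name the wavelet family, and the sketch also drops a trailing odd sample at each level for simplicity. All function names are hypothetical.

```python
import math


def haar_step(x):
    """One Haar analysis step: (low-frequency, high-frequency) halves."""
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x) - 1, 2)]
    low = [(a + b) / math.sqrt(2) for a, b in pairs]
    high = [(a - b) / math.sqrt(2) for a, b in pairs]
    return low, high


def pdwt_features(x, w, levels=2):
    """Per segment: level-J low band, then high bands of levels 1..J."""
    segs = [list(x[i:i + w]) for i in range(0, len(x), w)]
    if len(segs) > 1 and len(segs[-1]) < w:
        segs[-2] = segs[-2] + segs[-1]   # merge the short remainder
        segs.pop()
    feats = []
    for seg in segs:
        low, highs = seg, []
        for _ in range(levels):
            low, high = haar_step(low)
            highs.append(high)
        feats.extend(low)
        for high in highs:
            feats.extend(high)
    return feats


# Two flat segments of length 4: all high-frequency terms vanish.
pdwt = pdwt_features([1, 1, 1, 1, 2, 2, 2, 2], 4)
```

For a flat segment the only non-zero value is the level-2 low-frequency coefficient, which equals twice the segment level under the orthonormal Haar scaling used here.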
7. The moving object trajectory clustering method based on multi-feature fusion and ensemble learning as claimed in claim 2, characterized in that the K-means clustering algorithm is described as follows:
Step 1, arbitrarily select k objects from the n data objects as the initial cluster centers;
Step 2, calculate the distance between each object and each cluster center, the center being the mean of the objects in the cluster, and reassign each object to the cluster at minimum distance;
Step 3, recalculate the mean of each cluster;
Step 4, repeat step 2 and step 3 until no cluster changes.
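The four steps above can be sketched in pure Python as follows. The seeded random initialization, the squared Euclidean distance, the iteration cap, and the handling of empty clusters (the old center is kept) are assumptions added for a runnable illustration; the function name and the sample points are hypothetical.

```python
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means: returns (labels, centers)."""
    rng = random.Random(seed)
    # Step 1: pick k objects as initial centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each object to the nearest center.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        # Step 4: stop when the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute the mean of each non-empty cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical feature vectors forming two well-separated groups.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
km_labels, km_centers = kmeans(pts, k=2, seed=0)
```

Running the function with different seeds or different feature representations is one way to obtain the multiple initial clustering results that the ensemble later fuses.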
8. The moving object trajectory clustering method based on multi-feature fusion and ensemble learning as claimed in claim 2, characterized in that, under each of the three substantially different evaluation methods, the weight corresponding to the initial clustering result $P_m$ is:
$$w_m^{\pi} = \frac{\pi(P_m)}{\sum_{m=1}^{M} \pi(P_m)}$$
and, by weighted summation, the similarity matrices of the three fused clustering results, $S^{MH\Gamma}$, $S^{DVI}$ and $S^{NMI}$, are obtained:
$$S^{\pi} = \sum_{m=1}^{M} w_m^{\pi} S_m$$
9. The moving object trajectory clustering method based on multi-feature fusion and ensemble learning as claimed in claim 2, characterized in that the similarity matrix of the final clustering ensemble result is calculated as:
$$S^{*} = \frac{1}{M} \sum_{\pi} S^{\pi}$$
the matrix is converted into a dendrogram, in which the horizontal axis corresponds to the data points and the vertical axis represents the similarity between clusters; in the dendrogram, the lifetime of a node is defined as the similarity interval between the point at which the cluster is created and the point at which it merges with other nodes; a long lifetime indicates that merging the existing cluster structure any further is unreasonable, so the dendrogram is cut within the range of the longest lifetime to obtain the correct number of classes, and from it the final clustering ensemble result.
CN201610176417.2A 2016-03-24 2016-03-24 Moving object track clustering method based on multi-feature fusion and clustering ensemble Pending CN105843919A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610176417.2A CN105843919A (en) 2016-03-24 2016-03-24 Moving object track clustering method based on multi-feature fusion and clustering ensemble

Publications (1)

Publication Number Publication Date
CN105843919A true CN105843919A (en) 2016-08-10

Family

ID=56583353

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610176417.2A Pending CN105843919A (en) 2016-03-24 2016-03-24 Moving object track clustering method based on multi-feature fusion and clustering ensemble

Country Status (1)

Country Link
CN (1) CN105843919A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103605990A (en) * 2013-10-23 2014-02-26 江苏大学 Integrated multi-classifier fusion classification method and integrated multi-classifier fusion classification system based on graph clustering label propagation
CN104182517A (en) * 2014-08-22 2014-12-03 北京羽乐创新科技有限公司 Data processing method and data processing device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YUN YANG ET AL.: "Temporal Data Clustering via Weighted Clustering Ensemble with Different Representations", 《IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING》 *

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784314A (en) * 2016-08-26 2018-03-09 北京协同创新智能电网技术有限公司 Normal the abnormal data division methods and system of a kind of multivariable warning system
CN106446081B (en) * 2016-09-09 2019-08-13 西安交通大学 The method for excavating time series data incidence relation based on variation consistency
CN106446081A (en) * 2016-09-09 2017-02-22 西安交通大学 Method for mining association relationship of time series data based on change consistency
CN107871111A (en) * 2016-09-28 2018-04-03 苏宁云商集团股份有限公司 A kind of behavior analysis method and system
CN107871111B (en) * 2016-09-28 2021-11-26 苏宁易购集团股份有限公司 Behavior analysis method and system
CN106951903A (en) * 2016-10-31 2017-07-14 浙江大学 A kind of method for visualizing of crowd's movement law
CN106951903B (en) * 2016-10-31 2019-12-17 浙江大学 method for visualizing crowd movement rules
CN108834072A (en) * 2017-05-03 2018-11-16 腾讯科技(深圳)有限公司 The acquisition methods and device of motion track
CN108599140A (en) * 2018-01-24 2018-09-28 合肥工业大学 Power load characteristic analysis method and device, storage medium
CN108599140B (en) * 2018-01-24 2021-01-29 合肥工业大学 Power load characteristic analysis method and device and storage medium
CN108921191A (en) * 2018-05-25 2018-11-30 北方工业大学 Multi-biological-feature fusion recognition method based on image quality evaluation
CN108921191B (en) * 2018-05-25 2021-10-26 北方工业大学 Multi-biological-feature fusion recognition method based on image quality evaluation
CN109241069A (en) * 2018-08-23 2019-01-18 中南大学 A kind of method and system that the road network based on track adaptive cluster quickly updates
CN111414437B (en) * 2019-01-08 2023-06-20 阿里巴巴集团控股有限公司 Method and device for generating line track
CN111414437A (en) * 2019-01-08 2020-07-14 阿里巴巴集团控股有限公司 Method and device for generating line track
CN110097121A (en) * 2019-04-30 2019-08-06 北京百度网讯科技有限公司 A kind of classification method of driving trace, device, electronic equipment and storage medium
CN110686679B (en) * 2019-10-29 2021-07-09 中国人民解放军军事科学院国防科技创新研究院 High-orbit optical satellite offshore target interruption track correlation method
CN112861565A (en) * 2019-11-12 2021-05-28 上海高德威智能交通系统有限公司 Method and device for determining track similarity, computer equipment and storage medium
CN110866559A (en) * 2019-11-14 2020-03-06 上海中信信息发展股份有限公司 Poultry behavior analysis method and device
CN111372186A (en) * 2019-12-17 2020-07-03 广东小天才科技有限公司 Position calculation method under non-uniform positioning scene and terminal equipment
CN113515982A (en) * 2020-05-22 2021-10-19 阿里巴巴集团控股有限公司 Track restoration method and equipment, equipment management method and management equipment
CN113515982B (en) * 2020-05-22 2022-06-14 阿里巴巴集团控股有限公司 Track restoration method and equipment, equipment management method and management equipment
CN111693059A (en) * 2020-05-28 2020-09-22 北京百度网讯科技有限公司 Navigation method, device and equipment for roundabout and storage medium
CN111693059B (en) * 2020-05-28 2022-10-11 阿波罗智联(北京)科技有限公司 Navigation method, device and equipment for roundabout and storage medium
CN111476616A (en) * 2020-06-24 2020-07-31 腾讯科技(深圳)有限公司 Trajectory determination method and apparatus, electronic device and computer storage medium
CN112116806A (en) * 2020-08-12 2020-12-22 深圳技术大学 Traffic flow characteristic extraction method and system
CN112418339B (en) * 2020-11-29 2022-11-29 中国科学院电子学研究所苏州研究院 Random forest based aerial moving object identification method
CN112418339A (en) * 2020-11-29 2021-02-26 中国科学院电子学研究所苏州研究院 Random forest based aerial moving object identification method
CN113043274A (en) * 2021-03-25 2021-06-29 中车青岛四方车辆研究所有限公司 Robot performance evaluation method and system
CN113535861A (en) * 2021-07-16 2021-10-22 子亥科技(成都)有限公司 Track prediction method for multi-scale feature fusion and adaptive clustering
CN113535861B (en) * 2021-07-16 2023-08-11 子亥科技(成都)有限公司 Track prediction method for multi-scale feature fusion and self-adaptive clustering

Similar Documents

Publication Publication Date Title
CN105843919A (en) Moving object track clustering method based on multi-feature fusion and clustering ensemble
Xie et al. A decomposition-ensemble approach for tourism forecasting
CN108304668B (en) Flood prediction method combining hydrologic process data and historical prior data
CN101866421B (en) Method for extracting characteristic of natural image based on dispersion-constrained non-negative sparse coding
CN102495919B (en) Extraction method for influence factors of carbon exchange of ecosystem and system
CN106650767B (en) Flood forecasting method based on cluster analysis and real-time correction
CN111540193A (en) Traffic data restoration method for generating countermeasure network based on graph convolution time sequence
CN109117992B (en) Ultra-short-term wind power prediction method based on WD-LA-WRF model
CN104751185B (en) SAR image change detection based on average drifting genetic cluster
CN109919364A (en) Multivariate Time Series prediction technique based on adaptive noise reduction and integrated LSTM
CN111785329A (en) Single-cell RNA sequencing clustering method based on confrontation automatic encoder
CN109767312A (en) A kind of training of credit evaluation model, appraisal procedure and device
CN109948726B (en) Power quality disturbance classification method based on deep forest
CN102487343A (en) Diagnosis and prediction method for hidden faults of satellite communication system
Rondonotti et al. SiZer for time series: a new approach to the analysis of trends
CN103984746B (en) Based on the SAR image recognition methodss that semisupervised classification and region distance are estimated
CN103366365A (en) SAR image varying detecting method based on artificial immunity multi-target clustering
Jörges et al. Spatial ocean wave height prediction with CNN mixed-data deep neural networks using random field simulated bathymetry
CN114371009A (en) High-speed train bearing fault diagnosis method based on improved random forest
CN106022652A (en) Processing method of forest carbon sink operating plan and processing device of forest carbon sink operating plan
CN117540303A (en) Landslide susceptibility assessment method and system based on cross semi-supervised machine learning algorithm
Nelson et al. Do roads cause deforestation? Using satellite images in econometric analysis of land use
Silva et al. Generation of monthly synthetic streamflow series based on the method of fragments
Su et al. Fault diagnosis of high-speed train bogie based on spectrogram and multi-channel voting
CN114626412A (en) Multi-class target identification method and system for unattended sensor system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20160810