CN105843919A - Moving object track clustering method based on multi-feature fusion and clustering ensemble - Google Patents
- Publication number
- CN105843919A (application CN201610176417.2A)
- Authority
- CN
- China
- Prior art keywords
- clustering
- cluster
- fusion
- track
- result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Fuzzy Systems (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a moving object track clustering method based on multi-feature fusion and clustering ensemble. The method first comprehensively captures the feature information of the track of a target moving object; it then applies the K-means clustering algorithm to four selected track feature representations to generate a plurality of initial clustering results; the quality of these initial clustering results is quantified, and three fused clustering results are obtained by weighted summation; finally, the three fused clustering results are integrated to generate the final ensemble clustering result. The method comprehensively captures the feature information of the target moving object, restores as far as possible the association between the dynamic characteristics of the track and its time segments, and offers good resistance to interference. Weights are assigned to the initial clustering results according to different clustering quality criteria, the number of classes is recognized automatically during fusion, and the intrinsic structure of the clusters is captured effectively.
Description
Technical field
The invention belongs to the technical field of track data clustering, and in particular relates to a moving object track clustering method based on multi-feature fusion and clustering ensemble.
Background art
Moving object track clustering is a very active applied research direction in data mining. Its purpose is to classify track data so that similar tracks are gathered into the same class and dissimilar tracks into different classes; a class here is also called a cluster. The clustering result can be used to recognize the behavior patterns of moving objects. For example, clustering the motion tracks of store customers makes it possible to analyze pedestrian flow in a shopping mall, popular shops, customer purchasing patterns and so on; clustering the tracks of vehicles on main traffic arteries supports intelligent traffic control and planning; and clustering the tracks of crowds in large public places makes it possible to automatically detect suspicious persons with unusual behavior patterns and to build intelligent early-warning mechanisms. Because this field has high theoretical and practical value, many researchers at home and abroad have proposed clustering techniques for moving object tracks. These techniques essentially comprise two tasks: feature extraction and cluster analysis. First, features are extracted from the raw track data; second, cluster analysis is performed on the resulting feature representation. Although some results have been achieved, track data has extremely complex characteristics, such as complex temporal correlation, high dimensionality, massive volume and noise interference, so the existing techniques are still immature.
Because track data has complex dynamic characteristics, high dimensionality and massive volume, traditional clustering algorithms cannot obtain satisfactory results.
Summary of the invention
The object of the invention is to provide a moving object track clustering method based on multi-feature fusion and clustering ensemble, intended to solve the problem that traditional clustering algorithms cannot obtain satisfactory results because track data has complex dynamic characteristics, high dimensionality and massive volume.
The invention is realized as follows. A moving object track clustering method based on multi-feature fusion and clustering ensemble comprises the following steps:
First, four complementary feature extraction techniques (polynomial curve fitting, discrete Fourier transform, piecewise local statistics and piecewise discrete wavelet transform) are used to comprehensively capture the feature information of the target moving object track and to restore as far as possible the association between the dynamic characteristics of the track and its time segments;
Then, under different initialization settings, the K-means clustering algorithm is applied to the four selected track feature representations to produce multiple initial clustering results;
Next, three markedly different evaluation methods are selected to quantify the quality of the initial clustering results and obtain three weight vectors, and three fused clustering results are then obtained by weighted summation;
Finally, the three fused clustering results are combined to generate the final clustering ensemble result.
Further, the moving object track clustering method based on multi-feature fusion and clustering ensemble specifically comprises the following steps:
Step 1, multi-feature extraction: four complementary feature extraction techniques, namely polynomial curve fitting, discrete Fourier transform, piecewise local statistics and piecewise discrete wavelet transform, comprehensively capture the feature information of the target moving object track and restore as far as possible the association between the dynamic characteristics of the track and its time segments;
Step 2, initial cluster analysis: under different initialization settings, the K-means clustering algorithm is applied to the four selected track feature representations. Because the feature representations and the K-means initialization settings differ, multiple different clustering results are produced, and their quality varies;
Step 3, weighted clustering ensemble learning (weighted fusion function): in the initial cluster analysis, combining different feature representations and initialization settings generates M clustering results for a target data set of N motion tracks. An N × K_m binary indicator matrix H_m represents an initial clustering result P_m with K_m classes: each row corresponds to a motion track and each column is a binary vector, in which the value 1 indicates that the corresponding track is assigned to that class and the value 0 that it is not. From the indicator matrix H_m a similarity matrix S_m ∈ {0,1}^(N×N) is computed, which indicates, for clustering result P_m, whether any two motion tracks are similar and gathered into the same class. The clustering represented by the fused similarity matrix combines all M initial clustering results. Three markedly different evaluation methods, π = {MHΓ, DVI, NMI}, are selected so that they complement one another; the weight of each initial clustering result is computed under each of the three methods, and the similarity matrices S^MHΓ, S^DVI and S^NMI of three fused clustering results are then obtained by weighted summation;
Step 4, final optimization function: after the three fused similarity matrices have been generated, they are merged to produce the final clustering ensemble result; the number of classes is identified automatically in the process, yielding the optimal result.
Further, the polynomial curve fitting uses the least-squares polynomial method, which finds the coefficients of the mathematical equation that best fits the track data x(t), modeled by a parametric polynomial function:
x(t) = α_P t^P + α_(P-1) t^(P-1) + … + α_1 t + α_0;
where α_p (p = 0, 1, …, P) is the polynomial coefficient of order p. The polynomial model is an equation in the coefficients α_p (p = 0, 1, …, P), which are determined by minimizing the least-squares error function over all track data points. In general, a fourth-order polynomial performs best, and higher-order polynomials do not noticeably improve performance. The optimized coefficients α_p (p = 0, 1, …, 4) of the fourth-order polynomial therefore constitute a complete PCF feature representation of the track data x(t).
Further, for the track data x(t), the discrete Fourier transform produces a series of Fourier coefficients:
[equation missing from the extracted text]
where π denotes the constant pi and k is the order of the discrete Fourier transform function. The first 16 leading coefficients α_k (k = 0, 1, …, 16) constitute the DFT feature representation of the track data x(t).
Further, the piecewise local statistics divide the motion track into n segments of equal length |W|; if the final segment is shorter than a complete segment, it is merged into the preceding segment. For each segment, the mean μ_n and variance σ_n are calculated by the following formula:
[equation missing from the extracted text]
For a complete motion track x(t), the means μ_n and variances σ_n of all segments constitute a PLS feature representation.
Further, the piecewise discrete wavelet transform divides the motion track into n segments of equal length |W|; if the final segment is shorter than a complete segment, it is merged into the preceding segment. For each segment, the DWT coefficients of order J = 2 are calculated as follows:
[equation missing from the extracted text]
The coefficients represent the high-frequency information and the low-frequency information at each level j. For a complete motion track x(t), the high-frequency information of all segments at levels 1 to 2, together with the low-frequency information at level 2, constitutes a PDWT feature representation.
Further, the K-means clustering algorithm is described as follows:
Step 1: arbitrarily select k objects from the n data objects as initial cluster centers;
Step 2: using the mean of each cluster (its center object), compute the distance between every object and each center, and reassign each object to the cluster whose center is nearest;
Step 3: recompute the mean of each cluster;
Step 4: repeat steps 2 and 3 until no cluster changes.
Further, under the three markedly different evaluation methods, the weight of initial clustering result P_m is:
[equation missing from the extracted text]
The similarity matrices S^MHΓ, S^DVI and S^NMI of the three fused clustering results are then obtained by weighted summation:
[equation missing from the extracted text]
Further, the similarity matrix of the final clustering ensemble result is calculated:
S* = ∑ S^π / M;
This matrix is converted into a dendrogram whose horizontal axis corresponds to the data points and whose vertical axis represents the similarity between clusters. In the dendrogram, the lifetime of a node is defined as the similarity interval between the point at which the node (cluster) is created and the point at which it merges with another node. A longer interval means that merging the existing cluster structure any further is unreasonable, so cutting the dendrogram within the largest interval yields the correct number of classes and hence the final clustering ensemble result.
The moving object track clustering method based on multi-feature fusion and clustering ensemble provided by the invention uses several complementary feature extraction techniques to comprehensively capture the feature information of the target moving object track, restores to a large degree the association between the dynamic characteristics of the track and its time segments, and has strong noise resistance. By introducing a new weighted clustering ensemble learning technique, it further improves the stability and accuracy of moving object track cluster analysis. The user does not need to set the number of classes: the number of classes is identified automatically during cluster analysis, and the intrinsic structure of the clusters is captured effectively. Compared with a single clustering algorithm, clustering ensemble learning has three advantages: (1) the ensemble result has higher accuracy; (2) ensemble learning can discover cluster information that a single clustering algorithm cannot; (3) ensemble learning is more resistant to interference in complex environments, such as noise, outliers and sampling variation. Ensemble performance is generally improved by raising the clustering performance of the member clusterers and by increasing their diversity. The invention introduces clustering ensemble technology to improve the performance of motion track clustering. The purpose of clustering ensemble learning is to obtain a highly reliable cluster analysis system by integrating multiple complementary clusterers: it aims to produce multiple member clusterers that generalize well and differ substantially from one another, to exploit the respective strengths of each member, and to obtain an ensemble result better than that of any single member clusterer.
The multi-feature extraction methods disclosed by the invention are strongly complementary; they comprehensively capture the feature information of the target moving object track, restore as far as possible the association between the dynamic characteristics of the track and its time segments, and resist interference well. Unlike existing multi-feature extraction methods, the invention fuses the multiple feature representations non-linearly: an initial cluster analysis is performed on each feature representation separately, and the clustering ensemble technique then integrates the initial results into one optimal clustering ensemble result. Existing multi-feature fusion techniques instead combine the feature representations linearly into a single high-dimensional feature vector and cluster that vector. Because the dimension of the feature vector increases substantially, the computational cost, accuracy and stability of the cluster analysis are all unsatisfactory.
The weighted clustering ensemble technique disclosed by the invention improves on the average-summation fusion used in existing clustering ensemble techniques. Each initial clustering result is assigned a weight according to different clustering evaluation criteria; the higher the weight, the higher the quality of the corresponding initial clustering result. The initial clustering results are then integrated more reasonably by weighted summation. In addition, in the final optimization step the weighted clustering ensemble technique identifies the number of classes automatically, so that it adapts to different moving object track data and produces cluster analysis results with higher stability and accuracy, which existing clustering ensemble techniques cannot do.
Brief description of the drawings
Fig. 1 is a flow chart of the moving object track clustering method based on multi-feature fusion and clustering ensemble provided by an embodiment of the invention.
Fig. 2 is a schematic diagram of the final optimization of the clustering ensemble based on dendrogram cutting provided by an embodiment of the invention.
Fig. 3 shows the store customer motion track data set provided by an embodiment of the invention.
Fig. 4 shows the motion track cluster analysis result based on multi-feature fusion and ensemble learning provided by an embodiment of the invention.
Fig. 5 is a schematic diagram of cluster analysis performance (average classification accuracy and variance) under noise interference of different strengths provided by an embodiment of the invention.
Fig. 6 is a schematic diagram of cluster analysis performance (classification accuracy) under interference provided by an embodiment of the invention.
Detailed description of the embodiments
To make the object, technical solution and advantages of the invention clearer, the invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
The application principle of the invention is explained in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the moving object track clustering method based on multi-feature fusion and clustering ensemble of the embodiment of the invention comprises the following steps:
S101: four complementary feature extraction techniques are used to comprehensively capture the feature information of the target moving object track and to restore as far as possible the association between the dynamic characteristics of the track and its time segments;
S102: under different initialization settings, the K-means clustering algorithm is applied to the four selected track feature representations;
S103: three markedly different evaluation methods are selected to quantify the quality of the initial clustering results and obtain three weight vectors, and three fused clustering results are then obtained by weighted summation;
S104: the three fused clustering results are combined to generate the final clustering ensemble result.
The moving object track clustering method based on multi-feature fusion and ensemble learning proposed by the invention comprises multi-feature extraction, initial cluster analysis, a weighted fusion function and a final optimization function:
(1) Multi-feature extraction
Because track data is high-dimensional, large in scale and noisy, cluster analysis in the raw information domain of the track data is not only inefficient but also harms the reliability and accuracy of the clustering result. Representing features effectively while preserving the key information of the track data, thereby reducing data dimensionality and removing noise, is therefore very important for track data cluster analysis. Four complementary feature extraction techniques are used to comprehensively capture the feature information of the target moving object track and to restore as far as possible the association between the dynamic characteristics of the track and its time segments, with strong noise resistance. The feature extraction methods are as follows:
The purpose of polynomial curve fitting (Polynomial Curve Fitting, PCF) is to find a mathematical formula that can represent the data signal while reducing the influence of noise on the data. The most commonly used fitting method is the least-squares polynomial method, which finds the coefficients of the mathematical equation that best fits the track data x(t); the data can be modeled by a parametric polynomial function:
x(t) = α_P t^P + α_(P-1) t^(P-1) + … + α_1 t + α_0;
where α_p (p = 0, 1, …, P) is the polynomial coefficient of order p. The polynomial model is an equation in the coefficients α_p (p = 0, 1, …, P), which are determined by minimizing the least-squares error function over all track data points. In general, a fourth-order polynomial performs best, and higher-order polynomials do not noticeably improve performance. The optimized coefficients α_p (p = 0, 1, …, 4) of the fourth-order polynomial therefore constitute a complete PCF feature representation of the track data x(t).
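The least-squares fit described above can be sketched in plain Python. This is a minimal illustration, not the patented implementation: the function name `fit_polynomial`, the rescaling of time to [0, 1] and the use of normal equations solved by Gaussian elimination are all assumptions made for the example.

```python
def fit_polynomial(x, order=4):
    """Least-squares polynomial coefficients [a_0, ..., a_order] for samples x(t)."""
    n = len(x)
    # Assumption: time is rescaled to [0, 1] to keep the system well conditioned.
    t = [i / (n - 1) for i in range(n)]
    size = order + 1
    # Normal equations (V^T V) a = V^T x, where V[i][p] = t_i ** p.
    ata = [[sum(ti ** (r + c) for ti in t) for c in range(size)] for r in range(size)]
    atb = [sum((ti ** r) * xi for ti, xi in zip(t, x)) for r in range(size)]
    # Gaussian elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        atb[col], atb[pivot] = atb[pivot], atb[col]
        for r in range(col + 1, size):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, size):
                ata[r][c] -= factor * ata[col][c]
            atb[r] -= factor * atb[col]
    # Back substitution.
    coeffs = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(ata[r][c] * coeffs[c] for c in range(r + 1, size))
        coeffs[r] = (atb[r] - tail) / ata[r][r]
    return coeffs
```

With `order=4` the five returned coefficients play the role of the PCF feature vector for one coordinate of a track; a two-dimensional track would presumably be fitted once per coordinate.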
The discrete Fourier transform (Discrete Fourier Transform, DFT) is a linear integral transform of the data that converts the representation of the motion track from the raw information domain to the frequency domain; Fourier analysis is the most widely used tool for transforming sequence data. Analyzing track data in the frequency domain readily reveals important properties that are hard to observe in the raw information domain. For discrete samples, the discrete Fourier transform maps the discrete sequence in the raw information domain (the observations) to a discrete sequence in the frequency domain (the frequency coefficients). For the track data x(t), the discrete Fourier transform produces a series of Fourier coefficients:
[equation missing from the extracted text]
To form a stable DFT feature representation in the presence of noise, only the first 16 leading coefficients (corresponding to low-frequency information) are chosen to constitute the DFT feature representation of the track data x(t).
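A direct evaluation of the transform illustrates how the leading coefficients could be collected. The sketch below is an assumption-laden example: the function name `dft_features`, the 1/T normalization and the choice to return raw complex coefficients are not taken from the patent text, whose formula is missing from the extracted text.

```python
import cmath


def dft_features(x, n_coeffs=16):
    """First n_coeffs DFT coefficients of x (assumed 1/T normalization)."""
    T = len(x)
    coeffs = []
    for k in range(min(n_coeffs, T)):
        # a_k = (1/T) * sum_t x(t) * exp(-j * 2*pi * k * t / T)
        a_k = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / T) for t in range(T)) / T
        coeffs.append(a_k)
    return coeffs
```

Keeping only the low-order coefficients acts as a low-pass filter, which matches the stated goal of a representation that stays stable under noise. A clustering step would still need real-valued features, for example the real and imaginary parts or the magnitudes; that choice is left open here.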
Piecewise local statistics (PLS) is a feature representation method that processes the motion track in segments. First, the motion track is divided into n segments of equal length |W|; if the final segment is shorter than a complete segment, it is merged into the preceding segment. For each segment, the mean μ_n and variance σ_n can be calculated by the following formula:
[equation missing from the extracted text]
For a complete motion track x(t), the means μ_n and variances σ_n of all segments constitute a PLS feature representation.
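The segmentation rule and the per-segment statistics can be sketched as follows. The helper names `split_segments` and `pls_features` are hypothetical, and the population variance (division by the segment length) is an assumption, because the original formula is missing from the extracted text.

```python
def split_segments(x, width):
    """Split x into segments of equal width; a short final remainder
    is merged into the preceding segment."""
    segments = [x[i:i + width] for i in range(0, len(x), width)]
    if len(segments) > 1 and len(segments[-1]) < width:
        remainder = segments.pop()
        segments[-1] = segments[-1] + remainder
    return segments


def pls_features(x, width):
    """Concatenate (mean, variance) of every segment of the list x."""
    features = []
    for seg in split_segments(x, width):
        mean = sum(seg) / len(seg)
        # Assumption: population variance, i.e. division by len(seg).
        var = sum((v - mean) ** 2 for v in seg) / len(seg)
        features.extend([mean, var])
    return features
```

For instance, a nine-sample track with `width=4` yields two segments, of lengths 4 and 5, and therefore four feature values.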
The piecewise discrete wavelet transform (PDWT) is also a feature representation method that processes the motion track in segments. The discrete wavelet transform is an effective multi-scale analysis tool that shows good localization in both the raw information domain and the frequency domain. The wavelet transform uses a wavelet function to convert the data signal into a wavelet series, so that the coefficients of the series describe the features of the motion track. First, the motion track is again divided into n segments of equal length |W|; if the final segment is shorter than a complete segment, it is merged into the preceding segment. For each segment, the DWT coefficients of order J = 2 are calculated as follows:
[equation missing from the extracted text]
The coefficients represent the high-frequency information and the low-frequency information at each level j. For a complete motion track x(t), the high-frequency information of all segments at levels 1 to 2, together with the low-frequency information at level 2, constitutes a PDWT feature representation.
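A two-level decomposition per segment can be sketched as follows. The patent text does not name the wavelet, so the Haar wavelet used here is an assumption, as are the function names `haar_step` and `pdwt_features` and the padding of odd-length signals by repeating the last sample.

```python
import math


def haar_step(signal):
    """One Haar DWT level: returns (approximation, detail) coefficient lists."""
    if len(signal) % 2:
        # Assumption: odd lengths are padded by repeating the last sample.
        signal = signal + signal[-1:]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def pdwt_features(x, width, levels=2):
    """Per segment: detail coefficients of levels 1..levels, then the
    approximation coefficients of the last level."""
    segments = [x[i:i + width] for i in range(0, len(x), width)]
    if len(segments) > 1 and len(segments[-1]) < width:
        # Merge a short final segment into the preceding one.
        segments[-2:] = [segments[-2] + segments[-1]]
    features = []
    for seg in segments:
        approx = list(seg)
        for _ in range(levels):
            approx, detail = haar_step(approx)
            features.extend(detail)   # high-frequency information at this level
        features.extend(approx)       # low-frequency information at the last level
    return features
```

A segment of eight samples produces 4 + 2 detail coefficients and 2 approximation coefficients, so the feature length equals the segment length when it is a multiple of four.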
(2) Initial cluster analysis
In the initial cluster analysis, under different initialization settings, the K-means clustering algorithm is applied to the four selected track feature representations. Because the feature representations and the K-means initialization settings differ, multiple different clustering results are produced, and their quality varies. The K-means clustering algorithm is described as follows:
1. Arbitrarily select k objects from the n data objects as initial cluster centers;
2. Using the mean (center object) of each cluster, compute the distance between every object and each center, and reassign each object to the cluster whose center is nearest;
3. Recompute the mean (center object) of each cluster that has changed;
4. Repeat steps 2 and 3 until no cluster changes.
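The four steps above can be sketched in plain Python. The function name `kmeans`, the `seed` parameter that stands in for a "different initialization setting", the squared Euclidean distance and the `max_iter` safeguard are assumptions of this example.

```python
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means on a list of equal-length numeric lists."""
    rng = random.Random(seed)
    # Step 1: arbitrarily pick k objects as initial cluster centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every object to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Step 4: stop once no assignment changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute the mean of each non-empty cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Running the function with several seeds on each of the four feature representations would give the pool of initial clustering results that the later fusion steps consume.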
(3) Weighted fusion function
In the initial cluster analysis module, combining different feature representations and initialization settings generates M clustering results for a target data set of N motion tracks. An N × K_m binary indicator matrix H_m represents an initial clustering result P_m with K_m classes: each row corresponds to a motion track and each column is a binary vector, in which the value 1 indicates that the corresponding track is assigned to that class and the value 0 that it is not. From the indicator matrix H_m a similarity matrix S_m ∈ {0,1}^(N×N) can be computed. This similarity matrix indicates, for clustering result P_m, whether any two motion tracks are similar and gathered into the same class. For example, if the element in row 2, column 6 of the similarity matrix is 1, the 2nd and 6th motion tracks are gathered into the same class; otherwise they are dissimilar and are not in the same class. Next, averaging the similarity matrices of the M initial clustering results gives the similarity matrix of the fused clustering:
S = (S_1 + S_2 + … + S_M) / M
The clustering represented by this similarity matrix combines all M initial clustering results, and its robustness and accuracy are higher than those of the individual initial clustering results.
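The similarity matrix and its unweighted average can be illustrated with label lists instead of explicit indicator matrices. The function names `similarity_matrix` and `average_fusion` are hypothetical; the entries are computed directly from the stated meaning (1 when two tracks share a class, 0 otherwise).

```python
def similarity_matrix(labels):
    """S[i][j] = 1 if tracks i and j share a cluster in this partition, else 0."""
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)] for i in range(n)]


def average_fusion(partitions):
    """Element-wise mean of the similarity matrices of M partitions."""
    mats = [similarity_matrix(p) for p in partitions]
    n, m = len(partitions[0]), len(partitions)
    return [[sum(s[i][j] for s in mats) / m for j in range(n)] for i in range(n)]
```

For two partitions `[0, 0, 1]` and `[0, 1, 1]` of three tracks, the fused entry for tracks 0 and 1 is 0.5, because only one of the two partitions puts them in the same class.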
This averaging assumes that the M initial clustering results contribute equally to the final fused clustering. In practice, however, the quality of the initial clustering results differs, and high-quality results should play a more important role in constructing the final fused clustering. For example, when a meeting decides on a proposal by vote, the opinion of a voter with experience and relevant professional knowledge deserves more attention and should carry more weight in the final decision. In the invention, the quality of each initial clustering result is evaluated and quantified as a weight; the higher the weight, the higher the quality of the corresponding initial clustering result. The standard for evaluating clustering quality is not unique, however, and many evaluation methods exist. Three markedly different evaluation methods, π = {MHΓ, DVI, NMI}, are therefore selected so that they complement one another. Under the three evaluation methods, the weight of initial clustering result P_m is:
[equation missing from the extracted text]
The similarity matrices S^MHΓ, S^DVI and S^NMI of the three fused clustering results can then be obtained by weighted summation:
[equation missing from the extracted text]
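Weighted summation under one evaluation criterion can be sketched as follows. Because the weight formula is missing from the extracted text, the normalization of the quality scores so that the weights sum to 1 is an assumption, and the function name `weighted_fusion` and the score values in the usage note are hypothetical; computing MHΓ, DVI or NMI themselves is outside this sketch.

```python
def weighted_fusion(similarity_matrices, scores):
    """Weighted sum of similarity matrices under one evaluation criterion.

    scores holds one non-negative quality value per initial clustering result.
    """
    total = sum(scores)
    # Assumption: weights are the scores normalized to sum to 1.
    weights = [s / total for s in scores]
    n = len(similarity_matrices[0])
    return [
        [sum(w * s[i][j] for w, s in zip(weights, similarity_matrices)) for j in range(n)]
        for i in range(n)
    ]
```

Calling the function three times, once with the scores from each criterion, would give three fused similarity matrices. With scores `[3, 1]`, the first partition contributes 75 % of every entry, so a pair grouped only by the first partition receives a similarity of 0.75.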
(4) Final optimization function
After the three fused similarity matrices have been generated, they are further merged to produce the final clustering ensemble result; the number of classes is identified automatically in the process, yielding the optimal result. First, the similarity matrix of the final clustering ensemble result is calculated:
S* = ∑ S^π / M;
Then this matrix is converted into a dendrogram, as shown in Fig. 2, whose horizontal axis corresponds to the data points and whose vertical axis represents the similarity between clusters. In the dendrogram, the lifetime of a node (cluster) is defined as the similarity interval between the point at which it is created and the point at which it merges with another node. A longer interval means that merging the existing cluster structure any further is unreasonable, so the dendrogram can be cut within the largest interval to obtain the correct number of classes and hence the final clustering ensemble result.
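The lifetime-based cut can be sketched with a small agglomerative procedure. The patent text does not state how the dendrogram is built, so average linkage is an assumption here, as are the function name `cut_by_longest_lifetime`, the use of the largest diagonal entry as the starting similarity level, and the exclusion of the trivial single-cluster solution.

```python
def cut_by_longest_lifetime(sim):
    """Agglomerative clustering on a similarity matrix; returns the partition
    that survives over the widest similarity interval."""
    n = len(sim)
    clusters = [[i] for i in range(n)]
    history = [[list(c) for c in clusters]]   # history[k]: partition after k merges
    merge_sims = []                           # similarity at which merge k happens
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Assumption: average linkage between clusters a and b.
                pairs = len(clusters[a]) * len(clusters[b])
                s = sum(sim[i][j] for i in clusters[a] for j in clusters[b]) / pairs
                if best is None or s > best[0]:
                    best = (s, a, b)
        s, a, b = best
        merge_sims.append(s)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        history.append([list(c) for c in clusters])
    # Lifetime of history[k]: from the level where it appears to the next merge.
    top = max(sim[i][i] for i in range(n))
    levels = [top] + merge_sims[:-1]
    lifetimes = [levels[k] - merge_sims[k] for k in range(len(merge_sims))]
    k_best = max(range(len(lifetimes)), key=lambda k: lifetimes[k])
    return history[k_best]
```

The number of classes is simply the length of the returned partition, so no class count has to be supplied by the user. The nested loops make this sketch cubic or worse in N, which is acceptable for illustration only.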
The application effect of the invention is explained in detail below by comparison.
(1) Multi-feature extraction and fusion technique
The multi-feature extraction methods disclosed by the invention are strongly complementary; they comprehensively capture the feature information of the target moving object track, restore as far as possible the association between the dynamic characteristics of the track and its time segments, and resist interference well. Unlike existing multi-feature extraction methods, the invention fuses the multiple feature representations non-linearly: an initial cluster analysis is performed on each feature representation separately, and the clustering ensemble technique then integrates the initial results into one optimal clustering ensemble result. Existing multi-feature fusion techniques instead combine the feature representations linearly into a single high-dimensional feature vector and cluster that vector. Because the dimension of the feature vector increases substantially, the computational cost, accuracy and stability of the cluster analysis are all unsatisfactory.
(2) Weighted clustering ensemble technique (weighted fusion function and final optimization function)
The weighted clustering ensemble technique disclosed by the invention improves on the average-summation fusion used in existing clustering ensemble techniques. Each initial clustering result is assigned a weight according to different clustering evaluation criteria; the higher the weight, the higher the quality of the corresponding initial clustering result. The initial clustering results are then integrated more reasonably by weighted summation. In addition, in the final optimization step the weighted clustering ensemble technique identifies the number of classes automatically, so that it adapts to different moving object track data and produces cluster analysis results with higher stability and accuracy, which existing clustering ensemble techniques cannot do.
The moving-object trajectory clustering method based on multi-feature fusion and ensemble learning
disclosed in the present invention was validated on a real data set of shopper trajectories in a
shopping mall. As shown in Fig. 3, 222 trajectories were collected from the mall's surveillance
video within a fixed time window and marked with black lines. As shown in Fig. 4, the disclosed
clustering algorithm automatically and efficiently assigned the 222 trajectories to 15 classes;
trajectories in the same class are highly similar and represent a corresponding behavior pattern.
To verify the algorithm's robustness to interference, Gaussian noise N(0, σ) was artificially added
to the shopper trajectory data set. In Fig. 5, σ denotes the magnitude of the added Gaussian noise;
the larger the value, the stronger the interference. Taking the clustering result obtained without
interference as the baseline, the clustering results obtained under interference with multi-feature
fusion and with each individual feature representation were compared against it (by computing the
mean correct classification rate and its variance). The results show that, at every interference
level, this algorithm achieves the highest mean accuracy and the lowest variance of the methods
compared, which demonstrates its accuracy and stability. Finally, to verify the algorithm under
strong interference (added noise plus deletion of part of the trajectory information), Gaussian
noise N(0, σ) with σ = 0.1 was added to the original trajectory data set and part of each
trajectory was truncated at random. In Fig. 6, the horizontal axis is the ratio of the trajectory
information lost to the information in the complete trajectory, and the vertical axis is the
classification accuracy. The results show that an accuracy above 86% is still obtained under
strong interference.
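The two perturbations used in the robustness tests can be sketched as follows. This is a minimal illustration, not the inventors' test harness: it assumes that σ is the standard deviation of the noise, that a trajectory is a one-dimensional list of samples, and that "truncating" means deleting one contiguous block at a random position. The function names `add_gaussian_noise` and `truncate_randomly` are hypothetical.

```python
import random

def add_gaussian_noise(track, sigma, rng):
    """Add zero-mean Gaussian noise to every sample (sigma taken as the standard deviation)."""
    return [x + rng.gauss(0.0, sigma) for x in track]

def truncate_randomly(track, loss_ratio, rng):
    """Delete one contiguous block covering `loss_ratio` of the samples (assumed scheme)."""
    n_lost = int(round(loss_ratio * len(track)))
    start = rng.randrange(0, len(track) - n_lost + 1)
    return track[:start] + track[start + n_lost:]

rng = random.Random(0)                      # fixed seed so the run is repeatable
track = [i / 99 for i in range(100)]        # toy trajectory with 100 samples
noisy = add_gaussian_noise(track, sigma=0.1, rng=rng)
damaged = truncate_randomly(noisy, loss_ratio=0.2, rng=rng)
```

Here `loss_ratio` plays the role of the horizontal axis of Fig. 6: the share of trajectory information removed before clustering.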
The foregoing is merely a preferred embodiment of the present invention and is not intended to
limit it. Any modification, equivalent substitution, or improvement made within the spirit and
principles of the present invention shall fall within its scope of protection.
Claims (9)
1. A moving-object trajectory clustering method based on multi-feature fusion and clustering
ensemble, characterized in that the method comprises the following steps:
first, using four complementary feature extraction techniques, namely polynomial curve fitting,
discrete Fourier transform, piecewise local statistics, and piecewise discrete wavelet transform,
to capture the characteristic information of the moving object's trajectory comprehensively and to
recover, to the greatest extent, the association between the trajectory's dynamic characteristics
and its time segments;
then, under different initialization settings, using the K-means clustering algorithm to cluster
each of the four selected trajectory feature representations;
next, selecting three substantially different evaluation methods to quantify the quality of the
multiple initial clustering results and obtain three weight vectors, and then obtaining three
fused clustering results by weighted summation;
finally, combining the three fused clustering results to generate the final clustering ensemble result.
2. The moving-object trajectory clustering method based on multi-feature fusion and clustering
ensemble according to claim 1, characterized in that the method specifically comprises the
following steps:
Step 1, multi-feature extraction: four complementary feature extraction techniques, namely
polynomial curve fitting, discrete Fourier transform, piecewise local statistics, and piecewise
discrete wavelet transform, capture the characteristic information of the moving object's
trajectory comprehensively and recover, to the greatest extent, the association between the
trajectory's dynamic characteristics and its time segments;
Step 2, initial clustering analysis: under different initialization settings, the K-means
clustering algorithm clusters each of the four selected trajectory feature representations;
because the feature representations differ and the K-means initialization settings differ,
multiple different clustering results are produced, and their quality varies;
Step 3, weighted fusion function: in the initial clustering analysis, combining the different
feature representations and initialization settings, M clustering results are generated for a
target data set of N trajectories; an indicator matrix $H_m$ represents an initial clustering
result $P_m$ with $K_m$ classes, where each row corresponds to one trajectory and each column is a
binary vector in which the value 1 indicates that the corresponding trajectory is assigned to that
class and the value 0 indicates that it is not; from the indicator matrix $H_m$ a similarity
matrix $S_m \in \{0,1\}^{N \times N}$ is calculated, which indicates whether any two trajectories
in the clustering result $P_m$ are similar and grouped into the same class; the clustering result
represented by the similarity matrix is the combination of all M initial clustering results; three
substantially different evaluation methods $\pi = \{MH\Gamma, DVI, NMI\}$ are selected so that
they complement one another; the weights of the initial clustering results are calculated
according to the three evaluation methods, and the similarity matrices of the three fused
clustering results, $S^{MH\Gamma}$, $S^{DVI}$, $S^{NMI}$, are then obtained by weighted summation;
Step 4, final optimization function: after the three fused similarity matrices are generated, the
three matrices are merged to generate the final clustering ensemble result, and the number of
clusters is identified automatically in the process, producing the optimal result.
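The weighted fusion of step 3 can be illustrated with a small sketch. It rests on two assumptions that the claim text does not spell out: the similarity matrix is taken to be the product $S_m = H_m H_m^{T}$, which for a hard partition is exactly the 0/1 "same class" matrix described, and the weights are normalised to sum to 1. The weight values below are hypothetical placeholders rather than outputs of the MHΓ, DVI, or NMI criteria, and all function names are illustrative.

```python
def indicator_matrix(labels):
    """Build the N x K_m binary indicator matrix H_m from one label vector."""
    classes = sorted(set(labels))
    return [[1 if lab == c else 0 for c in classes] for lab in labels]

def similarity_matrix(h):
    """Assumed form S_m = H_m H_m^T: entry (i, j) is 1 when tracks i and j share a class."""
    n = len(h)
    return [[sum(a * b for a, b in zip(h[i], h[j])) for j in range(n)]
            for i in range(n)]

def weighted_fusion(similarities, weights):
    """Weighted sum of the M similarity matrices, with weights normalised to sum to 1."""
    total = sum(weights)
    n = len(similarities[0])
    return [[sum(w * s[i][j] for w, s in zip(weights, similarities)) / total
             for j in range(n)] for i in range(n)]

partitions = [[0, 0, 1, 1], [0, 0, 0, 1]]   # two toy initial clusterings of 4 tracks
weights = [3.0, 1.0]                        # hypothetical quality scores, one per clustering
sims = [similarity_matrix(indicator_matrix(p)) for p in partitions]
fused = weighted_fusion(sims, weights)
```

Running the same fusion once per evaluation criterion, each with its own weight vector, would give the three fused matrices that step 4 merges.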
3. The moving-object trajectory clustering method based on multi-feature fusion and ensemble
learning according to claim 2, characterized in that the polynomial curve fitting method is the
least-squares polynomial method, which finds the coefficients of the mathematical equation that
best fits the trajectory data x(t), modelled by a parametric polynomial function:
$$x(t) = \alpha_P t^{P} + \alpha_{P-1} t^{P-1} + \cdots + \alpha_1 t + \alpha_0;$$
where $\alpha_p$ ($p = 0, 1, \dots, P$) is the coefficient of order p; the coefficients are
obtained by minimizing the least-squares error function over all trajectory data points, the
polynomial model being a multi-order equation in $\alpha_p$ ($p = 0, 1, \dots, P$); a fourth-order
polynomial is used, and all of its optimized coefficients $\alpha_p$ ($p = 0, 1, \dots, 4$)
constitute a complete PCF feature representation of the trajectory data x(t).
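A least-squares fit of this kind can be sketched in plain Python by solving the normal equations of the Vandermonde system. The sketch assumes a one-dimensional trajectory sampled at equal intervals and rescales time to [0, 1] to keep the system well conditioned; neither choice comes from the claim, and the names `solve_linear_system` and `pcf_features` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def pcf_features(track, order=4):
    """Least-squares coefficients [alpha_0, ..., alpha_P] of a degree-P polynomial."""
    n = len(track)
    ts = [i / (n - 1) for i in range(n)]             # assumed: time rescaled to [0, 1]
    # Normal equations (V^T V) alpha = V^T x for the Vandermonde matrix V.
    vtv = [[sum(t ** (p + q) for t in ts) for q in range(order + 1)]
           for p in range(order + 1)]
    vtx = [sum((t ** p) * x for t, x in zip(ts, track)) for p in range(order + 1)]
    return solve_linear_system(vtv, vtx)

ts = [i / 19 for i in range(20)]
track = [1.0 + 2.0 * t - 3.0 * t ** 2 + 0.5 * t ** 4 for t in ts]   # exact quartic
alphas = pcf_features(track)                                         # five PCF features
```

Because the toy trajectory is itself a quartic, the recovered coefficients should match the generating ones up to rounding error.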
4. The moving-object trajectory clustering method based on multi-feature fusion and ensemble
learning according to claim 2, characterized in that the discrete Fourier transform is applied to
the trajectory data x(t) to produce a series of Fourier coefficients:
where π denotes the constant pi and k is the order of the discrete Fourier transform function; the
leading 16 coefficients $\alpha_k$ ($k = 0, 1, \dots, 16$) are selected to constitute the DFT
feature representation of the trajectory data x(t).
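A DFT feature extractor along these lines might look like the sketch below. It assumes the common convention in which each coefficient is the sum of $x(t)\,e^{-2\pi i k t / T}$ over the T samples, divided by T; the normalisation actually used in the claim is not recoverable from the text, and `dft_features` is a hypothetical name. The coefficients are complex, so a real-valued feature vector would need their real and imaginary parts (or magnitudes) laid out separately.

```python
import cmath

def dft_features(track, n_coeffs=16):
    """First `n_coeffs` DFT coefficients of a track, normalised by its length (assumed)."""
    T = len(track)
    coeffs = []
    for k in range(min(n_coeffs, T)):
        a_k = sum(x * cmath.exp(-2j * cmath.pi * k * t / T)
                  for t, x in enumerate(track)) / T
        coeffs.append(a_k)
    return coeffs

track = [2.0] * 32                 # constant toy trajectory
features = dft_features(track)     # only the k = 0 term is non-zero for a constant signal
```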
5. The moving-object trajectory clustering method based on multi-feature fusion and ensemble
learning according to claim 2, characterized in that the piecewise local statistics divide the
trajectory into n segments, each of the same length |W|; if the final segment is shorter than a
full segment, it is merged into the preceding segment; for each segment, the mean $\mu_n$ and the
variance $\sigma_n$ are calculated by the following formulas:
For a complete trajectory x(t), the means $\mu_n$ and variances $\sigma_n$ of all segments
constitute a PLS feature representation.
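The segmentation rule and the per-segment statistics can be sketched as follows. The sketch assumes the population variance (division by the segment length); the claim's own formula is not reproduced in the text, so the exact definition of $\sigma_n$ may differ. The names `split_segments` and `pls_features` are hypothetical.

```python
from statistics import fmean, pvariance

def split_segments(track, width):
    """Cut a track into segments of length `width`; a short tail joins the last full segment."""
    n_full = len(track) // width
    segments = [track[i * width:(i + 1) * width] for i in range(n_full)]
    remainder = track[n_full * width:]
    if remainder and segments:
        segments[-1] = segments[-1] + remainder
    elif remainder:                      # track shorter than one full segment
        segments = [remainder]
    return segments

def pls_features(track, width):
    """Concatenate (mean, population variance) for every segment."""
    feats = []
    for seg in split_segments(track, width):
        feats.extend([fmean(seg), pvariance(seg)])
    return feats

features = pls_features([1, 2, 3, 4, 5, 6, 7], width=3)
# Segments are [1, 2, 3] and [4, 5, 6, 7]: the single leftover sample joins the second one.
```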
6. The moving-object trajectory clustering method based on multi-feature fusion and ensemble
learning according to claim 2, characterized in that the piecewise discrete wavelet transform
divides the trajectory into n segments, each of the same length |W|; if the final segment is
shorter than a full segment, it is merged into the preceding segment; for each segment, the DWT
coefficients of order J = 2 are calculated as follows:
where the two sets of coefficients represent, respectively, the high-frequency information and the
low-frequency information at order j; for a complete trajectory x(t), the high-frequency
information of all segments at orders 1 to 2 and the low-frequency information at order 2
constitute a PDWT feature representation.
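A two-level decomposition of one segment can be sketched with the Haar wavelet. The wavelet family is an assumption, since the claim does not name one, and the function names are hypothetical. The sketch returns the detail (high-frequency) coefficients of levels 1 and 2 followed by the level-2 approximation (low-frequency) coefficients, which is the layout the claim describes.

```python
import math

def haar_step(signal):
    """One Haar DWT level: returns (approximation, detail); an odd trailing sample is dropped."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def segment_dwt_features(segment, levels=2):
    """Detail coefficients of levels 1..J, then the level-J approximation coefficients."""
    feats, approx = [], list(segment)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        feats.extend(detail)
    feats.extend(approx)
    return feats

features = segment_dwt_features([4.0, 2.0, 5.0, 5.0])
```

Applying `segment_dwt_features` to every segment and concatenating the outputs would give one feature vector per trajectory.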
7. The moving-object trajectory clustering method based on multi-feature fusion and ensemble
learning according to claim 2, characterized in that the K-means clustering algorithm is as follows:
Step 1, arbitrarily select k objects from the n data objects as the initial cluster centers;
Step 2, compute the distance from each object to each cluster center, the center being the mean of
the objects in that cluster, and reassign each object to the cluster at minimum distance;
Step 3, recompute the mean of each cluster;
Step 4, repeat steps 2 and 3 until no cluster changes.
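The four steps above can be sketched as a short pure-Python routine. The squared Euclidean distance, the iteration cap `max_iter`, and the seeded random generator are assumptions added for the illustration; the claim only says "distance" and "arbitrarily select".

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]         # step 1: random initial centers
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                      for p in points]                          # step 2: nearest-center assignment
        if new_labels == labels:                                # step 4: stop when nothing changes
            break
        labels = new_labels
        for c in range(k):                                      # step 3: recompute cluster means
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                                         # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels, centers = kmeans(points, k=2)
```

Calling `kmeans` with different `seed` values on each feature representation is one way to obtain the varied initial clusterings that the ensemble needs.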
8. The moving-object trajectory clustering method based on multi-feature fusion and ensemble
learning according to claim 2, characterized in that, according to the three substantially
different evaluation methods, the weight corresponding to the initial clustering result $P_m$ is:
and the similarity matrices of the three fused clustering results, $S^{MH\Gamma}$, $S^{DVI}$,
$S^{NMI}$, are then obtained by weighted summation:
9. The moving-object trajectory clustering method based on multi-feature fusion and ensemble
learning according to claim 2, characterized in that the similarity matrix of the final
clustering ensemble result is calculated as:
$$S^{*} = \sum S^{\pi} / M;$$
the matrix is converted into a dendrogram, in which the horizontal axis corresponds to the data
points and the vertical axis represents the similarity between clusters; in this dendrogram, the
lifetime of a node is defined as the similarity interval between its creation and its merging
with another node; a longer lifetime indicates that further merging of the existing cluster
structure is unreasonable; the dendrogram is cut within the range of the longest lifetime to
obtain the correct number of clusters, and thus the final clustering ensemble result.
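The lifetime-based cut can be sketched with a small agglomerative routine that works directly on a similarity matrix. Average linkage is an assumption, since the claim does not say how the dendrogram is built, and `cut_by_longest_lifetime` is a hypothetical name. The routine records the similarity level of each merge, treats the gap between consecutive merge levels as the lifetime of the partition that exists between them, and returns the partition with the longest lifetime. It expects at least three data points.

```python
def average_similarity(s, a, b):
    """Mean pairwise similarity between the members of clusters a and b."""
    return sum(s[i][j] for i in a for j in b) / (len(a) * len(b))

def cut_by_longest_lifetime(s):
    """Average-linkage agglomeration on similarity matrix `s`, cut at the widest level gap."""
    clusters = [[i] for i in range(len(s))]
    history = [[c[:] for c in clusters]]     # partition after 0, 1, 2, ... merges
    levels = []                              # similarity at which each merge happened
    while len(clusters) > 1:
        pairs = [(a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        a, b = max(pairs, key=lambda p: average_similarity(s, clusters[p[0]], clusters[p[1]]))
        levels.append(average_similarity(s, clusters[a], clusters[b]))
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
        history.append([c[:] for c in clusters])
    # gaps[g] is the lifetime of the partition that exists after g + 1 merges.
    gaps = [levels[i - 1] - levels[i] for i in range(1, len(levels))]
    best = max(range(len(gaps)), key=lambda i: gaps[i]) + 1
    labels = [0] * len(s)
    for label, members in enumerate(history[best]):
        for i in members:
            labels[i] = label
    return labels

s_final = [[1.0, 0.9, 0.1, 0.1],
           [0.9, 1.0, 0.1, 0.1],
           [0.1, 0.1, 1.0, 0.9],
           [0.1, 0.1, 0.9, 1.0]]             # toy fused similarity matrix with two blocks
labels = cut_by_longest_lifetime(s_final)
```

In the toy matrix the two blocks merge internally at similarity 0.9 and with each other only at 0.1, so the two-cluster partition has the longest lifetime and no cluster count has to be supplied.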
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610176417.2A CN105843919A (en) | 2016-03-24 | 2016-03-24 | Moving object track clustering method based on multi-feature fusion and clustering ensemble |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105843919A (en) | 2016-08-10 |
Family
ID=56583353
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610176417.2A Pending CN105843919A (en) | 2016-03-24 | 2016-03-24 | Moving object track clustering method based on multi-feature fusion and clustering ensemble |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105843919A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103605990A (en) * | 2013-10-23 | 2014-02-26 | 江苏大学 | Integrated multi-classifier fusion classification method and integrated multi-classifier fusion classification system based on graph clustering label propagation |
CN104182517A (en) * | 2014-08-22 | 2014-12-03 | 北京羽乐创新科技有限公司 | Data processing method and data processing device |
Non-Patent Citations (1)
Title |
---|
Yun Yang et al.: "Temporal Data Clustering via Weighted Clustering Ensemble with Different Representations", IEEE Transactions on Knowledge and Data Engineering * |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107784314A (en) * | 2016-08-26 | 2018-03-09 | 北京协同创新智能电网技术有限公司 | Normal the abnormal data division methods and system of a kind of multivariable warning system |
CN106446081B (en) * | 2016-09-09 | 2019-08-13 | 西安交通大学 | The method for excavating time series data incidence relation based on variation consistency |
CN106446081A (en) * | 2016-09-09 | 2017-02-22 | 西安交通大学 | Method for mining association relationship of time series data based on change consistency |
CN107871111A (en) * | 2016-09-28 | 2018-04-03 | 苏宁云商集团股份有限公司 | A kind of behavior analysis method and system |
CN107871111B (en) * | 2016-09-28 | 2021-11-26 | 苏宁易购集团股份有限公司 | Behavior analysis method and system |
CN106951903A (en) * | 2016-10-31 | 2017-07-14 | 浙江大学 | A kind of method for visualizing of crowd's movement law |
CN106951903B (en) * | 2016-10-31 | 2019-12-17 | 浙江大学 | method for visualizing crowd movement rules |
CN108834072A (en) * | 2017-05-03 | 2018-11-16 | 腾讯科技(深圳)有限公司 | The acquisition methods and device of motion track |
CN108599140A (en) * | 2018-01-24 | 2018-09-28 | 合肥工业大学 | Power load characteristic analysis method and device, storage medium |
CN108599140B (en) * | 2018-01-24 | 2021-01-29 | 合肥工业大学 | Power load characteristic analysis method and device and storage medium |
CN108921191A (en) * | 2018-05-25 | 2018-11-30 | 北方工业大学 | Multi-biological-feature fusion recognition method based on image quality evaluation |
CN108921191B (en) * | 2018-05-25 | 2021-10-26 | 北方工业大学 | Multi-biological-feature fusion recognition method based on image quality evaluation |
CN109241069A (en) * | 2018-08-23 | 2019-01-18 | 中南大学 | A kind of method and system that the road network based on track adaptive cluster quickly updates |
CN111414437B (en) * | 2019-01-08 | 2023-06-20 | 阿里巴巴集团控股有限公司 | Method and device for generating line track |
CN111414437A (en) * | 2019-01-08 | 2020-07-14 | 阿里巴巴集团控股有限公司 | Method and device for generating line track |
CN110097121A (en) * | 2019-04-30 | 2019-08-06 | 北京百度网讯科技有限公司 | A kind of classification method of driving trace, device, electronic equipment and storage medium |
CN110686679B (en) * | 2019-10-29 | 2021-07-09 | 中国人民解放军军事科学院国防科技创新研究院 | High-orbit optical satellite offshore target interruption track correlation method |
CN112861565A (en) * | 2019-11-12 | 2021-05-28 | 上海高德威智能交通系统有限公司 | Method and device for determining track similarity, computer equipment and storage medium |
CN110866559A (en) * | 2019-11-14 | 2020-03-06 | 上海中信信息发展股份有限公司 | Poultry behavior analysis method and device |
CN111372186A (en) * | 2019-12-17 | 2020-07-03 | 广东小天才科技有限公司 | Position calculation method under non-uniform positioning scene and terminal equipment |
CN113515982A (en) * | 2020-05-22 | 2021-10-19 | 阿里巴巴集团控股有限公司 | Track restoration method and equipment, equipment management method and management equipment |
CN113515982B (en) * | 2020-05-22 | 2022-06-14 | 阿里巴巴集团控股有限公司 | Track restoration method and equipment, equipment management method and management equipment |
CN111693059A (en) * | 2020-05-28 | 2020-09-22 | 北京百度网讯科技有限公司 | Navigation method, device and equipment for roundabout and storage medium |
CN111693059B (en) * | 2020-05-28 | 2022-10-11 | 阿波罗智联(北京)科技有限公司 | Navigation method, device and equipment for roundabout and storage medium |
CN111476616A (en) * | 2020-06-24 | 2020-07-31 | 腾讯科技(深圳)有限公司 | Trajectory determination method and apparatus, electronic device and computer storage medium |
CN112116806A (en) * | 2020-08-12 | 2020-12-22 | 深圳技术大学 | Traffic flow characteristic extraction method and system |
CN112418339B (en) * | 2020-11-29 | 2022-11-29 | 中国科学院电子学研究所苏州研究院 | Random forest based aerial moving object identification method |
CN112418339A (en) * | 2020-11-29 | 2021-02-26 | 中国科学院电子学研究所苏州研究院 | Random forest based aerial moving object identification method |
CN113043274A (en) * | 2021-03-25 | 2021-06-29 | 中车青岛四方车辆研究所有限公司 | Robot performance evaluation method and system |
CN113535861A (en) * | 2021-07-16 | 2021-10-22 | 子亥科技(成都)有限公司 | Track prediction method for multi-scale feature fusion and adaptive clustering |
CN113535861B (en) * | 2021-07-16 | 2023-08-11 | 子亥科技(成都)有限公司 | Track prediction method for multi-scale feature fusion and self-adaptive clustering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20160810 |