Background Art
Spatio-temporal data capture the inherent characteristics of spatial entities and spatial phenomena in three respects: time, space, and attributes. Spatio-temporal big data exhibit complex multidimensional, semantic, and spatio-temporally dynamic associations. They contain three kinds of information (time, space, and thematic attributes) and are characterized by multiple sources, massive volume, and rapid updating. Spatio-temporal data have become a key resource for smart cities. Studying the formal representation of multidimensional association descriptions of spatio-temporal big data, the dynamic modelling of association relations, and multi-scale association analysis methods, and then mining and optimally allocating spatio-temporal big data information, is of great significance for providing decision support that optimizes the allocation of urban resources and reduces their excessive consumption and waste.
With the rapid development of mobile Internet, spatial positioning, location-based services, big data, and cloud computing technologies, intelligent transportation systems (ITS, Intelligent Transportation System) are becoming increasingly important in daily life and work. At present, various traffic data collection technologies acquire massive spatio-temporal data in real time. Predicting the locations of moving objects from these spatio-temporal big data makes it possible to provide intelligent decision-making and services for traffic planning, traffic monitoring, and scheduling, to offer users finer, more accurate, and more efficient services, and thereby to achieve the coordinated development of technology, society, and people.
A Markov model is a statistical analysis model in which each state is characterized by a probability distribution, and each state is generated from another state according to the corresponding probability distribution.
However, the value of spatio-temporal big data for intelligent transportation has not yet been fully exploited. The spatio-temporal data in intelligent transportation systems cannot be stored, retrieved, analysed, and mined efficiently, and trajectory prediction and assessment for user travel and traffic conditions are lacking. Intelligent transportation involves spatio-temporal big data. In its planning, construction, and supervision, the problem of how to predict the trajectories and positions of mobile users effectively, and thereby provide decision support and assistance for traffic route planning, traffic scheduling and management, urban planning, public service site selection, and security engineering, has never been solved well.
Movement pattern technology mainly includes trajectory data collection, denoising, feature parameter extraction, prediction model building, and prediction, recognition, and decision-making. The position prediction in this patent collects a user's position data through a mobile device. After data preprocessing and denoising, a joint density-based clustering model and algorithm extract the user's feature information to construct a sequence of points of interest, and the position of the mobile user is then predicted by constructing a mobility Markov model.
Summary of the Invention
To overcome the deficiencies of the prior art, the present invention provides a prediction method based on a mobility Markov model for spatio-temporal big data environments. It addresses the large volume of data that must be stored and processed in such environments and the unsatisfactory accuracy and precision of prediction, and it improves the precision and accuracy of predicting the positions of mobile users.
To achieve the above objective, the technical solution adopted by the present invention is as follows:
A prediction method based on a mobility Markov model for spatio-temporal big data, characterized by comprising the following steps:
Step 1: Denoise the collected historical position data. The collected historical position data are denoised to filter out interference data: dynamic trajectories are filtered out and static trajectories are retained;
Step 2: Cluster the denoised data. The static trajectories obtained after denoising in Step 1 are clustered with the joint-density clustering algorithm to obtain clusters;
Step 3: Establish points of interest for the clusters. For the clusters obtained in Step 2, the behavioural features of the mobile user are extracted, thereby establishing the user's points of interest;
Step 4: Denoise the points of interest. The radius, interval time, and density of each point of interest from Step 3 are computed, and the denoised trajectories contained in each point of interest are clustered again, which further filters out noise data in the points of interest and retains the real points of interest;
Step 5: Establish the mobility Markov model. State transition probabilities and a state transition probability matrix are established for the real points of interest obtained in Step 4;
Step 6: Predict the next position. After a mobile user's data are collected, they are processed through Steps 1 to 4 to extract the user's points of interest, and the user's next position is predicted with the mobility Markov model established in Step 5.
Further, the denoising method in Step 1 is as follows: static trajectories are retained first, a static trajectory being one whose speed satisfies speed < δ, where δ is a predefined constant; a trajectory whose speed satisfies speed > δ is a dynamic trajectory, and all dynamic trajectories are deleted;
Further, cluster merging is performed on the clusters in Step 2. The merging method is as follows: given cluster C_1 = {c_1, c_3, c_7, c_9} and cluster C_2 = {c_9, c_11, c_12}, which share the element c_9, the two clusters are merged into one cluster: C_1 ∪ C_2 = {c_1, c_3, c_7, c_9, c_11, c_12}.
The beneficial effects of the present invention are as follows:
The present invention applies position-data denoising, joint-density-based clustering, and cluster merging to extract the user's behavioural features and thereby construct location points, and it builds a mobility Markov model with memory on these location points to predict the position of a moving object. Steps 1, 2, 3, and 4 remove a large amount of noise data and trajectory data unrelated to the predicted position, so that position prediction relies only on points of interest. The invention can therefore reduce the amount of data that must be stored and processed to predict moving objects in a spatio-temporal big data environment, simplify the prediction system, and increase prediction speed. In addition, through the mobility Markov model with memory, the invention can improve the precision and accuracy of position prediction.
Detailed Description of the Embodiments
The present invention is explained in further detail below with reference to the accompanying drawings and specific embodiments, but the scope of protection of the invention is not limited thereto.
As shown in Fig. 1, a prediction method based on a mobility Markov model for spatio-temporal big data comprises the following steps:
Step 1: Data cleaning. The collected historical position data are denoised first: dynamic trajectories are filtered out and static trajectories are retained.
Because the speed of a moving object varies and the positioning equipment has limited precision, the collected trajectory data of a mobile user may not match the true situation. In addition, because of equipment stability problems, the collected data usually contain some noise. A mobile user's trajectory is a continuous-time signal, whereas a mobility Markov chain is a discrete random process, so the collected trajectory data must be discretized and the noise data must be filtered out. This is data cleaning.
Because the volume of spatio-temporal data is huge, the raw data must be preprocessed before further data processing. In a given Euclidean space, the trajectory sequence T = {t_1, t_2, ···, t_n} consists of discrete trajectory points arranged in time order, where t_i is the i-th trajectory point, t_i = (x_i, y_i, t_i), 1 ≤ i ≤ n.
The trajectory segment sequence TS = {ts_1, ts_2, ···, ts_k} consists of discrete trajectory segments arranged in time order; a trajectory segment is an ordered discrete line segment formed by consecutive trajectory points.
The discrete segment speed speed_j denotes the speed of the j-th trajectory segment, namely the average speed of the discrete segment formed by the n most recent discrete trajectory points. To filter out noise data, static trajectories are retained first: if the j-th segment satisfies speed_j < δ, the segment is a static trajectory, where δ is a predefined constant. Because a mobile user's trajectory changes dynamically, this constant is taken as a local mean so that it adapts to the changing data, where speed_j is the speed of the j-th of the n most recent trajectory segments.
At the same time all dynamic trajectories are filtered out, that is, trajectory data with speed_j > δ are filtered out, and ts_j is merged with the preceding segment ts_{j-1} into a single trajectory segment.
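The speed rule in Step 1 can be sketched as follows. This is only an illustration under stated assumptions: coordinates are planar, the speed is computed per pair of consecutive points rather than over the n most recent points, the threshold is passed in as a fixed `delta` instead of being derived from a local mean, and the merging of filtered segments into the preceding segment is not modelled. The names `segment_speeds` and `keep_static` are hypothetical.

```python
import math

def segment_speeds(points):
    """points: list of (x, y, t) tuples sorted by time t.
    Returns the speed of each segment between consecutive points."""
    speeds = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        dist = math.hypot(x1 - x0, y1 - y0)
        # A zero time gap is treated as infinitely fast, so it is never "static".
        speeds.append(dist / dt if dt > 0 else float("inf"))
    return speeds

def keep_static(points, delta):
    """Keep only the segments whose speed is below the threshold delta."""
    segments = list(zip(points, points[1:]))
    return [seg for seg, v in zip(segments, segment_speeds(points)) if v < delta]

# Two slow segments separated by one fast jump.
track = [(0, 0, 0), (0, 1, 10), (0, 50, 20), (0, 51, 30)]
static = keep_static(track, delta=1.0)
```

With these sample points the middle segment (49 units in 10 time units) is dropped and the two slow segments remain.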
Step 2: The static trajectories retained in Step 1 are clustered with the joint-density clustering algorithm to extract the behavioural features of the mobile user and thereby establish the user's points of interest. Points of interest are also merged so that they share the largest common point of interest.
1) Processing procedure of the joint-density-based clustering method
To classify the retained static trajectories, the concept of a neighbourhood is defined first: for a given trajectory point p, the trajectory points within the circle of radius r centred at p are called the r-neighbourhood of p;
Let T be the set of all trajectory points and q be any trajectory point. For a given trajectory point p, the density-based neighbourhood of p is: N(p) = {q ∈ T | dist(p, q) ≤ r}.
The number of trajectory points in the r-neighbourhood of p is size(N(p)).
Joint density clusters:Then
The main idea of density-based clustering is to reduce the amount of data to be processed: the position big data are partitioned into blocks and the noise data among the trajectory points of mobile users are filtered out. The core procedure is as follows. The trajectory points of each mobile user in the position big data are traversed, and clusters are produced by the joint-density-based clustering method. If the number of points in the neighbourhood of a trajectory point p satisfies size(N(p)) ≥ λ, where λ is a predefined constant that denotes the minimum number of trajectory points in a cluster and is chosen according to the specific problem, a new cluster C is created and p is the core object of that cluster. If size(N(p)) < λ, p is noise data and must be filtered out. Finally, clusters are merged according to the joint-density clustering method.
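The neighbourhood test described above can be sketched as follows. This is a minimal illustration, assuming two-dimensional points, Euclidean distance for dist(p, q), and that p counts as a member of its own neighbourhood; the names `neighbourhood` and `split_core_noise` are hypothetical, and the sketch stops before cluster creation and merging.

```python
import math

def neighbourhood(p, points, r):
    """N(p): all points q with dist(p, q) <= r (p itself is included)."""
    return [q for q in points if math.dist(p, q) <= r]

def split_core_noise(points, r, lam):
    """Classify each point as a core object (size(N(p)) >= lam) or as noise."""
    core, noise = [], []
    for p in points:
        if len(neighbourhood(p, points, r)) >= lam:
            core.append(p)
        else:
            noise.append(p)
    return core, noise

# Three nearby points and one isolated point.
pts = [(0, 0), (0, 1), (1, 0), (10, 10)]
core, noise = split_core_noise(pts, r=1.5, lam=3)
```

Here the three nearby points each have three neighbours within r = 1.5 and become core objects, while the isolated point is treated as noise. The brute-force scan is quadratic in the number of points; a spatial index would be the usual choice for large data, which the text does not specify.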
The detailed procedure is: initialize the cluster count n = 0; traverse each trajectory point p in the set T of trajectory points; if size(N(p)) < λ, the point is noise and is filtered out; if size(N(p)) ≥ λ, create a new cluster C_i; traverse the density-based neighbourhood of p in the new cluster C_i depth-first and perform joint-density cluster merging to obtain cluster C_n. The cluster merging method is: given cluster C_1 = {c_1, c_3, c_7, c_9} and cluster C_2 = {c_9, c_11, c_12}, the two clusters are merged into one cluster C_n = C_1 ∪ C_2 = {c_1, c_3, c_7, c_9, c_11, c_12}, which gives cluster C_n, where c_i denotes the i-th element in a cluster.
2) Constructing points of interest. The radius, interval time, and density of each resulting point of interest are computed, and each denoised trajectory point is clustered again, which further filters out noise data in the points of interest and retains the real points of interest.
Once the clusters are formed, the radius (radius), access time interval (interval), and density (density) of each cluster are determined, where: radius is the distance from the cluster centre to the farthest trajectory point; interval is the time between the earliest access time and the last access time; density is the number of the mobile user's trajectory points in the cluster.
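These three attributes can be computed directly from the points of a cluster, as in the sketch below. It assumes that the cluster centre is the arithmetic mean of the point coordinates, which the text does not state, and that each point is an (x, y, t) tuple; the function name `cluster_attributes` is hypothetical.

```python
import math

def cluster_attributes(points):
    """points: non-empty list of (x, y, t) tuples belonging to one cluster.
    Returns (radius, interval, density) as defined in the text."""
    # Assumption: the cluster centre is the centroid of the points.
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    radius = max(math.hypot(p[0] - cx, p[1] - cy) for p in points)
    times = [p[2] for p in points]
    interval = max(times) - min(times)   # earliest to last access time
    density = len(points)                # number of trajectory points
    return radius, interval, density

cluster = [(0, 0, 100), (2, 0, 160), (1, 0, 400)]
radius, interval, density = cluster_attributes(cluster)
```

For the sample cluster the centroid is (1, 0), so the radius is 1.0, the interval is 300 time units, and the density is 3.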
According to the behavioural characteristics of the place at the time the user visits it, each cluster is marked with corresponding semantic information to form a point of interest (POI, Point of Interest), i.e. a location. For example, a cluster can be given a place label of the kind shown in Fig. 2 (Home, Work, and so on). Each cluster also corresponds to one state in the mobility Markov model. The cluster radius and density are computed during labelling, and the points of interest C_i are clustered again on this basis. Static trajectories that do not belong to any cluster are marked unknown, and all static trajectories marked unknown are then removed. All consecutive static trajectories that share the same label are combined into a single event, and one event corresponds to one state. For example, six consecutive static trajectories that share one label are regarded as a single labelled static trajectory point. Finally, all static trajectories are labelled and the number of labels m is recorded; the state transition probability matrix then consists of the state transition probabilities between the m points of interest.
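The labelling step can be illustrated with a small sketch that drops the unknown entries and collapses each run of identical labels into one event. The label strings, the `unknown` marker, and the function name `to_events` are assumptions made for the example.

```python
from itertools import groupby

def to_events(labels, unknown="unknown"):
    """Remove trajectories marked unknown, then collapse each run of
    consecutive identical labels into a single event."""
    kept = [label for label in labels if label != unknown]
    return [label for label, _run in groupby(kept)]

# Six consecutive trajectories labelled H become one event.
labels = ["H"] * 6 + ["unknown", "W", "W", "unknown", "W", "S"]
events = to_events(labels)
m = len(set(events))  # number of distinct labels, i.e. points of interest
```

For the sample sequence the result is the event list H, W, S with m = 3. Because unknown entries are removed before runs are collapsed, two W runs separated only by an unknown trajectory become a single event; whether that matches the intended behaviour is not specified in the text.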
The constructed points of interest are shown in Fig. 2, where H denotes Home, W denotes Work, S denotes Sport, and L denotes Leisure. The arrowed lines between points of interest represent the transition probabilities between them, i.e. the state transition probabilities.
The points of interest (POIs) are arranged in descending order of density.
Step 3: For the real points of interest retained in Step 2, build the mobility Markov model on the basis of the real points of interest.
To predict a mobile user's point of interest, the probability that the user is at each point of interest must be determined. If there are m points of interest, the vector of points of interest can be written as d_m = {d_{m,1}, d_{m,2}, ···, d_{m,i}}. When the i-th state is μ, d_{m,i,μ} = 1; otherwise, d_{m,i,μ} = 0.
Fig. 3 illustrates the transitions between points of interest. There are five points of interest, D1 to D5, in Fig. 3: at the first time point the user is at D3, at the second time point the user moves to D1, at the third time point to D5, and at the fourth time point to D2.
Fig. 4 shows the state vectors formed from the point-of-interest transitions in Fig. 3. D1 to D5 correspond in order to the first to fifth elements of each vector. For example, the point of interest at the first time point is D3, so the third element of the first vector is 1; the point of interest at the second time point is D1, so the first element of the second vector is 1; the point of interest at the third time point is D5, so the fifth element of the third vector is 1; the point of interest at the fourth time point is D2, so the second element of the fourth vector is 1.
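The state vectors of this example can be generated with a short sketch. The visit order D3, D1, D5, D2 comes from the description of Fig. 3; the function name `one_hot_sequence` and the list representation are assumptions made for illustration.

```python
def one_hot_sequence(visits, pois):
    """Encode each visited point of interest as a 0/1 vector whose
    k-th element is 1 when the visit is to the k-th point of interest."""
    index = {name: k for k, name in enumerate(pois)}
    vectors = []
    for visit in visits:
        vec = [0] * len(pois)
        vec[index[visit]] = 1
        vectors.append(vec)
    return vectors

pois = ["D1", "D2", "D3", "D4", "D5"]
visits = ["D3", "D1", "D5", "D2"]   # time points 1 to 4
vectors = one_hot_sequence(visits, pois)
```

The first vector has its third element set, the second its first, the third its fifth, and the fourth its second, matching the description of Fig. 4.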
Step 4: From the mobility Markov model built in Step 3, establish the state transition probabilities, the state transition probability matrix, and the n-MMC (n-MMCs, n Mobility Markov Chains) state transition probability matrix.
From the training data, the mobility Markov model is built on the basis of the data cleaning, joint-density-based clustering, and point-of-interest construction described above. Each point of interest corresponds to an event, and each event corresponds to a state in the mobility Markov model. If the number of points of interest is m, the overall state transition probability is:
where,
the state transition probability matrix can be built accordingly:
As shown in Fig. 2, based on the constructed points of interest, the state transition probability matrix of the final mobility Markov model is as follows:
The standard mobility Markov chain (MMC, Mobility Markov Chain) is memoryless: the prediction of a future position depends only on the current position. This does not agree well with reality. People choose their future actions according to habit and memory, so the decision about the next action is made with reference to the historical trajectory. The memoryless property can therefore reduce the accuracy of future position prediction. To solve this problem, the concept of n-MMCs is introduced: in n-MMCs, a state takes into account not only the current point of interest but also the n-1 previously visited points of interest.
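The idea of an n-MMC can be sketched by treating the last n visited points of interest as one composite state and counting what follows each composite state. The estimate used here, the relative frequency of observed transitions, is an assumption made for illustration rather than a formula taken from this text; the function name `build_nmmc` and the sample event sequence are likewise hypothetical.

```python
from collections import Counter, defaultdict

def build_nmmc(events, n=2):
    """Estimate n-MMC transition probabilities from an event sequence.
    Rows are tuples of the n most recent points of interest; each row maps
    the next point of interest to its observed relative frequency."""
    counts = defaultdict(Counter)
    for i in range(len(events) - n):
        state = tuple(events[i:i + n])
        counts[state][events[i + n]] += 1
    matrix = {}
    for state, following in counts.items():
        total = sum(following.values())
        matrix[state] = {place: c / total for place, c in following.items()}
    return matrix

events = ["H", "W", "H", "W", "S", "H", "W", "H"]
matrix = build_nmmc(events, n=2)
```

In this sample the composite state (H, W) is followed by H twice and by S once, so its row assigns 2/3 to H and 1/3 to S. Setting n=1 reduces the sketch to the memoryless MMC described above.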
From the training data, the n-MMCs state transition probability matrix is then built. To illustrate the concept of prediction based on n-MMCs, Fig. 2 shows the points of interest built by the joint-density clustering algorithm: GPS data were collected from Xiaoqiang's phone, and Xiaoqiang's trajectory information was learned from them. In the 2-MMCs, four different states are considered, namely Home (H), Work (W), Leisure (L), and Sports (S), and the aim is to predict the position at the next moment from the 2 most recently visited positions. The rows of the state transition probability matrix therefore represent all possible state combinations of the n most recently visited points of interest, and the columns represent the next position in the n-MMCs. For example, if the previous position is H, the current position is W, and the predicted next position is H, then the state transition from HW to WH occurs and the state transition probability matrix is updated accordingly; the previous position is now W and the current position is H. The corresponding state transition probability is as follows:
where μ is the previous state, ν is the current state, and σ is the next state; d_{m,i-1,μ} indicates that, with m points of interest in total, the (i-1)-th state is μ; d_{m,i,ν} indicates that, with m points of interest in total, the i-th state is ν; d_{m,i+1,σ} indicates that, with m points of interest in total, the (i+1)-th state is σ; d_{n,j-1,μ} indicates that, with n points of interest in total, the (j-1)-th state is μ; d_{n,j,ν} indicates that, with n points of interest in total, the j-th state is ν; d_{n,j+1,σ} indicates that, with n points of interest in total, the (j+1)-th state is σ. Part of the state transition probability matrix of the final 2-MMC mobility Markov model is shown in Table 1:
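Table 1. 2-MMC mobility Markov transition probabilities (rows: previous and current position; columns: next position)

| Previous, current | W    | H    | S    | L    |
|-------------------|------|------|------|------|
| HW                | 0.1  | 0.8  | 0.1  | 0    |
| HS                | 0    | 0.8  | 0.13 | 0.07 |
| HL                | 0    | 0.9  | 0.04 | 0.06 |
| WH                | 0.71 | 0.24 | 0.03 | 0.02 |
| WS                | 0.26 | 0.59 | 0.11 | 0.04 |
| WL                | 0.32 | 0.68 | 0    | 0    |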
Step 5: Using the state transition probability matrix from Step 4, look up the n-MMC state transition probability matrix to predict and determine the next position of the mobile user.
With the mobility Markov model that has been built, the next position is predicted from the n-MMC state transition probability matrix as follows: the row corresponding to the previous state and the current state is looked up; within that row, the column with the highest probability is taken as the next position of the moving object; and the corresponding row and column of the n-MMC state transition probability matrix are updated.
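The lookup can be sketched with the values of Table 1. The dictionary layout and the function name `predict_next` are assumptions made for illustration, and the update of the matrix after each prediction is not modelled here.

```python
# Rows of Table 1: (previous, current) -> probability of each next position.
TABLE_1 = {
    ("H", "W"): {"W": 0.10, "H": 0.80, "S": 0.10, "L": 0.00},
    ("H", "S"): {"W": 0.00, "H": 0.80, "S": 0.13, "L": 0.07},
    ("H", "L"): {"W": 0.00, "H": 0.90, "S": 0.04, "L": 0.06},
    ("W", "H"): {"W": 0.71, "H": 0.24, "S": 0.03, "L": 0.02},
    ("W", "S"): {"W": 0.26, "H": 0.59, "S": 0.11, "L": 0.04},
    ("W", "L"): {"W": 0.32, "H": 0.68, "S": 0.00, "L": 0.00},
}

def predict_next(previous, current, matrix=TABLE_1):
    """Return the next position with the highest probability in the row
    selected by the previous and current positions."""
    row = matrix[(previous, current)]
    return max(row, key=row.get)

next_after_hw = predict_next("H", "W")
next_after_wh = predict_next("W", "H")
```

With these values, a user who was at Home and is now at Work is predicted to go Home next (0.8), and a user who was at Work and is now at Home is predicted to go to Work next (0.71). A row that is absent from the matrix raises a `KeyError` in this sketch; how unseen state combinations should be handled is not covered by the text.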
It should be noted that the above embodiments illustrate rather than limit the technical solution of the present invention. Equivalent substitutions made by a person of ordinary skill in the art, or other modifications made according to the prior art, shall fall within the scope of the claims of the present invention as long as they do not depart from the idea and scope of its technical solution.