CN107977734A

CN107977734A - A prediction method based on a mobility Markov model for spatio-temporal big data

Info

Publication number: CN107977734A
Application number: CN201711107770.6A
Authority: CN
Inventors: 郭力争; 闫涛; 王春丽; 李蓓; 柳运昌; 董国忠; 赵军民; 何宗耀
Original assignee: Henan University of Urban Construction
Current assignee: Shandong Shuoweisi Big Data Technology Co ltd
Priority date: 2017-11-10
Filing date: 2017-11-10
Publication date: 2018-05-01
Anticipated expiration: 2037-11-10
Also published as: CN107977734B

Abstract

The present invention relates to the technical fields of intelligent transportation and spatio-temporal big data, and in particular to a prediction method based on a mobility Markov model for spatio-temporal big data. The method is as follows: the collected historical position data is denoised and then clustered; clusters are obtained by a joint-density clustering algorithm; points of interest are established for the clusters; the points of interest are denoised so that only real points of interest are retained; and a mobility Markov model is built on the points of interest. After a mobile user's data is collected and processed as above, the user's points of interest are extracted and the user's next position is predicted according to the mobility Markov model. The invention addresses the problems that, in a spatio-temporal big data environment, the volume of data to be stored and processed is large and prediction accuracy and precision are unsatisfactory, and it improves the precision and accuracy of mobile user position prediction.

Description

A prediction method based on a mobility Markov model for spatio-temporal big data

Technical field

The present invention relates to the technical fields of intelligent transportation and spatio-temporal big data, and in particular to a prediction method based on a mobility Markov model for spatio-temporal big data.

Background art

Because the spatial entities and spatial phenomena that spatio-temporal data describes have inherent characteristics in three respects (time, space, and attribute), spatio-temporal big data exhibits multidimensional, semantic, and spatio-temporally dynamic associative complexity. Spatio-temporal big data carries three kinds of information (time, space, and thematic attributes) and is characterized by multiple sources, massive volume, and rapid updates. Spatio-temporal data has become a key element of smart-city resources. Studying the formal expression of multidimensional association descriptions, the dynamic modeling of association relationships, and multi-scale association analysis methods for spatio-temporal big data, and then mining and optimally allocating spatio-temporal big data information, is of great significance for providing decision support to optimize the allocation of urban resources and to reduce their excessive consumption and waste.

With the rapid development of mobile Internet technology, spatial positioning technology, location-based service technology, big data technology, and cloud computing, intelligent transportation systems (ITS, Intelligent Transportation System) are becoming increasingly important in daily life and work. At present, various traffic data collection technologies acquire massive amounts of spatio-temporal data in real time. Predicting the positions of moving objects from this spatio-temporal big data makes it possible to provide intelligent decisions and services for traffic planning, traffic monitoring, and scheduling, to offer users finer, more accurate, and more efficient services, and thereby to achieve the coordinated development of technology, society, and people.

A Markov model is a statistical analysis model in which each state is expressed through a probability density distribution, and each observed state is generated by an underlying state with a corresponding probability density distribution.

However, the value of intelligent transportation based on spatio-temporal big data has not yet been fully exploited. The spatio-temporal data in intelligent transportation systems cannot be stored, retrieved, analyzed, and mined efficiently, and trajectory prediction and assessment of user travel and traffic conditions are lacking. Intelligent transportation contains spatio-temporal big data, and in its planning, construction, and supervision, the question of how to effectively predict the trajectories and positions of mobile users, and thereby provide decision support and assistance for traffic route planning, traffic scheduling and management, urban planning, the selection of public services, and security engineering, has never been well resolved.

Movement pattern prediction technology mainly comprises movement trajectory data collection, denoising, feature parameter extraction, prediction model building, and prediction and recognition decision-making. In this patent, the user's position data is collected by a mobile device; after data preprocessing, denoising, and a density-based joint clustering model and algorithm, the user's characteristic information is extracted to build a sequence of points of interest, and the mobile user's position is then predicted by the constructed mobility Markov model.

Summary of the invention

To overcome the deficiencies of the prior art, the present invention provides a prediction method based on a mobility Markov model for spatio-temporal big data, in order to solve the problems that, in a spatio-temporal big data environment, the volume of data to be stored and processed is large and prediction accuracy and precision are unsatisfactory, and to improve the precision and accuracy of mobile user position prediction.

To achieve the above objective, the technical solution adopted by the present invention is as follows:

A prediction method based on a mobility Markov model for spatio-temporal big data, characterized by comprising the following steps:

Step 1: Denoise the collected historical position data. The collected historical position data is denoised: interference data is filtered out, dynamic movement trajectories are filtered out, and static movement trajectories are retained.

Step 2: Cluster the denoised data. The static movement trajectories obtained after denoising in Step 1 are clustered by a joint-density clustering algorithm to obtain clusters.

Step 3: Establish points of interest for the clusters. For the clusters obtained in Step 2, the behavioral characteristics of the mobile user are extracted, thereby establishing the user's points of interest.

Step 4: Denoise the points of interest. For the points of interest from Step 3, the radius, interval time, and density of each point of interest are calculated; at the same time, the denoised movement trajectories contained in each point of interest are clustered again, so that noise data in the points of interest is further filtered out and the real points of interest are retained.

Step 5: Build the mobility Markov model. For each real point of interest obtained in Step 4, the state transition probabilities and the state transition probability matrix are established.

Step 6: Predict the next position. After a mobile user's data is collected, it is processed through Steps 1 to 4 to extract the user's points of interest, and the user's next position is predicted according to the mobility Markov model established in Step 5.

Further, the denoising method in Step 1 is as follows: static movement trajectories are retained first, that is, trajectories whose speed satisfies speed < δ, where δ is a predefined constant; a trajectory whose speed satisfies speed > δ is a dynamic movement trajectory, and all dynamic movement trajectories are deleted.

Further, cluster merging is performed on the clusters in Step 2. The merging method is as follows: given a cluster C₁ = {c₁, c₃, c₇, c₉} and a cluster C₂ = {c₉, c₁₁, c₁₂}, the two clusters are merged into one cluster: C₁ ∪ C₂ = {c₁, c₃, c₇, c₉, c₁₁, c₁₂}.

The beneficial effects of the present invention are as follows:

By denoising the position data, clustering based on joint density, and merging clusters, the present invention extracts the user's behavioral characteristics and builds location points, and on these location points builds a mobility Markov model with a memory function, with which the position of a moving object is predicted. Steps 1, 2, 3, and 4 remove a large amount of noise data and trajectory data irrelevant to the predicted position, so that position prediction relies only on points of interest. The invention can therefore reduce the amount of data that must be stored and processed to predict moving objects in a spatio-temporal big data environment, simplify the prediction system, and increase prediction speed. In addition, through the mobility Markov model with a memory function, the invention can improve the precision and accuracy of position prediction.

Brief description of the drawings

Fig. 1 is a block diagram of moving-object trajectory prediction according to the present invention;

Fig. 2 is a schematic diagram of the points of interest of the present invention;

Fig. 3 is a point-of-interest transition diagram of the present invention;

Fig. 4 is a state vector diagram of the present invention.

Embodiment

The present invention is explained in further detail below with reference to the accompanying drawings and specific embodiments, but the scope of protection of the present invention is not limited thereto.

As shown in Fig. 1, a prediction method based on a mobility Markov model for spatio-temporal big data comprises the following steps:

Step 1: Data cleaning. The collected historical position data is first denoised: dynamic movement trajectories are filtered out and static movement trajectories are retained.

Because the speed of a moving object varies and the precision of positioning equipment is limited, the collected mobile user trajectory data may not match the true situation; in addition, because of equipment stability problems, the collected data usually contains some noise. Because a mobile user's trajectory is a continuous-time signal, whereas a mobility Markov chain is a discrete random process, the collected trajectory data must be discretized, and noise data must be filtered out; this is data cleaning.

Because the volume of spatio-temporal data is huge, the raw data must be preprocessed before further data processing. In a given Euclidean space, the trajectory sequence T = {t₁, t₂, ···, t_n} is the set of discrete trajectory points arranged in time order, where t_i is the i-th trajectory point, t_i = (x_i, y_i, t_i), 1 ≤ i ≤ n.

The trajectory segments TS = {ts₁, ts₂, ···, ts_k} form a discrete trajectory sequence arranged in time order; a trajectory segment is an ordered discrete line segment formed by consecutive trajectory points.

The discrete trajectory segment speed $speed_{ts_j}$ denotes the speed of the j-th trajectory segment, which is the average speed of the discrete trajectory segment formed by the n most recent discrete trajectory points. To filter out noise data, static movement trajectories are retained first: if the j-th segment satisfies $speed_{ts_j} < \delta$, the segment is a static movement trajectory, where δ is a predefined constant. Because a mobile user's trajectory changes dynamically, this constant is taken as a local mean so as to adapt to the changing data:

$$\delta = \frac{1}{n}\sum_{j=1}^{n} speed_{ts_j}$$

where $speed_{ts_j}$ is the speed of the j-th of the n most recent trajectory segments.

At the same time, all dynamic movement trajectories are filtered out, that is, the trajectory data with $speed_{ts_j} > \delta$ is filtered out, and ts_j is merged with the preceding trajectory segment ts_j-1 into one trajectory segment.
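The speed-based filter described above can be sketched in Python as follows. This is only an illustration under stated assumptions: coordinates are planar, δ is read as the mean speed of the `window` most recent segments (including the current one), and the names `segment_speeds`, `static_mask`, and `window` are hypothetical and do not come from the patent.

```python
import math

def segment_speeds(points):
    """Speed of each segment between consecutive (x, y, t) track points."""
    speeds = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        dist = math.hypot(x1 - x0, y1 - y0)
        speeds.append(dist / dt if dt > 0 else 0.0)
    return speeds

def static_mask(speeds, window=5):
    """True where a segment is static, i.e. its speed is below delta.

    Assumption: delta is the mean speed of the `window` most recent
    segments, current segment included (one reading of "local mean").
    """
    mask = []
    for j, s in enumerate(speeds):
        recent = speeds[max(0, j - window + 1): j + 1]
        delta = sum(recent) / len(recent)
        mask.append(s < delta)
    return mask

# Two fast segments followed by two slow ones.
track = [(0, 0, 0), (10, 0, 1), (20, 0, 2), (20.5, 0, 3), (21, 0, 4)]
speeds = segment_speeds(track)
mask = static_mask(speeds, window=4)
```

With this rolling definition the very first segment can never be static, because δ then equals its own speed; a practical system would probably seed δ from earlier data.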

Step 2: The static movement trajectories retained in Step 1 are clustered by the joint-density clustering algorithm, the behavioral characteristics of the mobile user are extracted to establish the user's points of interest, and points of interest are merged so that they share the largest common point of interest.

1) Processing procedure of the joint-density clustering method

To classify the retained static movement trajectories, the concept of a neighborhood is defined first: for a given trajectory point p, the trajectory points within the circle of radius r centered on p are called the r-neighborhood of p.

Let T be the set of all trajectory points and q any trajectory point. For a given trajectory point p, the density-based neighborhood of p is: N(p) = {q ∈ T | dist(p, q) ≤ r}.

The number of trajectory points in the r-neighborhood of p is size(N(p)).

Joint density clusters：Then

The main idea of density-based clustering is to reduce the amount of data to be processed: the position big data is partitioned into blocks, and the noise data among the mobile user's trajectory points is filtered out. The core procedure is as follows. The trajectory points of each mobile user in the position big data are traversed, and clusters are produced by the joint-density clustering method. Suppose the number of points in the neighborhood of a trajectory point p satisfies size(N(p)) ≥ λ, where λ is a predefined constant representing the minimum number of trajectory points in a cluster, determined according to the specific problem; then a new cluster C is created, and p is the core object of that cluster. If size(N(p)) < λ, then p is noise data and must be filtered out. Finally, clusters are merged according to the joint-density clustering method.

The detailed procedure is: initialize the number of clusters n = 0; traverse each trajectory point p in the set T of trajectory points; if size(N(p)) < λ, the point is noise and must be filtered out; if size(N(p)) ≥ λ, create a new cluster C_i; traverse the density-based neighborhood of p in the new cluster C_i depth-first, and merge clusters by joint density to obtain the cluster C_n. The cluster merging method is: given a cluster C₁ = {c₁, c₃, c₇, c₉} and a cluster C₂ = {c₉, c₁₁, c₁₂}, the two clusters are merged into one cluster C_n = C₁ ∪ C₂ = {c₁, c₃, c₇, c₉, c₁₁, c₁₂}, where c_i denotes the i-th element of a cluster.
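A minimal Python sketch of the clustering and merging procedure follows. It rests on assumptions that the text does not spell out: every neighborhood with at least λ points seeds a cluster, two clusters are merged whenever they share a point (as in the C₁, C₂ example), and a sparse point is kept if it falls inside some dense neighborhood and treated as noise otherwise. The function name `joint_density_clusters` and the parameter names `r` and `lam` are illustrative only.

```python
import math

def joint_density_clusters(points, r, lam):
    """Cluster (x, y) points; clusters that share a point are merged.

    Returns (clusters, noise): a list of sets of point indices, and the
    indices of points that ended up in no cluster.
    """
    clusters = []
    for p in points:
        # N(p): indices of all points within distance r of p.
        hood = {j for j, q in enumerate(points) if math.dist(p, q) <= r}
        if len(hood) < lam:
            continue  # too sparse to seed a cluster
        # Merge with every existing cluster that shares a point with N(p).
        overlapping = [c for c in clusters if c & hood]
        for c in overlapping:
            hood |= c
            clusters.remove(c)
        clusters.append(hood)
    clustered = set().union(*clusters) if clusters else set()
    noise = [i for i in range(len(points)) if i not in clustered]
    return clusters, noise

pts = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 10), (20, 20), (20, 21), (21, 20)]
clusters, noise = joint_density_clusters(pts, r=1.5, lam=3)
```

In this example the first four points chain into one cluster through shared neighbors, the last three form a second cluster, and the isolated point at index 4 is reported as noise.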

2) Building points of interest. The points of interest are evaluated: the radius, interval time, and density of each point of interest are calculated, and the denoised trajectory points are clustered again, so that noise data in the points of interest is further filtered out and the real points of interest are retained.

Once the clusters are formed, the radius (radius), access time interval (interval), and density (density) of each cluster are determined, where: the radius is the distance from the cluster center to the farthest trajectory point; the access time interval is the time between the earliest access and the last access; and the density is the number of the mobile user's trajectory points in the cluster.
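The three cluster attributes can be computed as in the following Python sketch. It assumes that the cluster center is the centroid (mean position) of the cluster's points, which the text does not state, and that each point is an (x, y, t) tuple; `cluster_stats` is a hypothetical helper name.

```python
import math

def cluster_stats(cluster):
    """Radius, access time interval and density of one cluster of (x, y, t) points."""
    # Assumption: the cluster centre is the centroid of its points.
    cx = sum(x for x, _, _ in cluster) / len(cluster)
    cy = sum(y for _, y, _ in cluster) / len(cluster)
    # Radius: distance from the centre to the farthest track point.
    radius = max(math.hypot(x - cx, y - cy) for x, y, _ in cluster)
    # Interval: time between the earliest and the last access.
    times = [t for _, _, t in cluster]
    interval = max(times) - min(times)
    # Density: number of track points in the cluster.
    density = len(cluster)
    return radius, interval, density

radius, interval, density = cluster_stats(
    [(0, 0, 100), (2, 0, 160), (0, 2, 400), (2, 2, 130)]
)
```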

According to the behavioral characteristics of the places the user visits, each cluster is labeled with corresponding semantic information, forming points of interest (POIs, Points of Interest), that is, locations. For example, a cluster can be labeled with the kind of place it represents. Each cluster also corresponds to a state in the mobility Markov model. The cluster radius and density are calculated during labeling, and the points of interest C_i are clustered again on that basis. If a static movement trajectory does not belong to any cluster, it is marked as unknown; all static movement trajectories marked as unknown are then removed. All consecutive static movement trajectories that share the same label are combined into a single event, and one event corresponds to one state. For example, six consecutive static movement trajectories that share one label are regarded as a single labeled static movement trajectory point. Finally, all static movement trajectories are labeled and the number of labels m is recorded; the state transition probability matrix is then made up of the state transition probabilities between the m points of interest.
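The labeling rule above (drop unknown trajectories, then collapse each run of identically labeled trajectories into one event) can be sketched in Python. The label strings and the helper name `to_events` are illustrative assumptions, not part of the patent.

```python
from itertools import groupby

def to_events(labels):
    """Remove 'unknown' labels, then merge each run of equal labels into one event."""
    known = [lab for lab in labels if lab != "unknown"]
    # groupby yields one group per run of consecutive equal labels.
    return [lab for lab, _ in groupby(known)]

events = to_events(["H", "H", "unknown", "H", "W", "W", "unknown", "L", "H"])
```

Because unknown entries are removed before the runs are merged, two runs with the same label that were separated only by unknown trajectories become a single event.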

The constructed points of interest are shown in Fig. 2. In Fig. 2, H denotes Home, W denotes Work, S denotes Sport, and L denotes Leisure; the arrowed lines between points of interest represent the transition probabilities between them, that is, the state transition probabilities.

The points of interest (POIs) are arranged in descending order of density.

Step 3: For the real points of interest retained in Step 2, a mobility Markov model is built on the basis of the real points of interest.

To predict a mobile user's point of interest, the probability that the user is at a given point of interest must be determined. If there are m points of interest, the vector of points of interest can be expressed as d_m = {d_m,1, d_m,2, ···, d_m,i}; when the i-th state is μ, d_m,i,μ = 1; otherwise, d_m,i,μ = 0.

Fig. 3 illustrates point-of-interest transitions. There are 5 points of interest, D1 to D5, in Fig. 3: at the first time point the point of interest is D3, at the second time point it moves to D1, at the third to D5, and at the fourth to D2.

Fig. 4 shows the state vectors formed from the point-of-interest state transitions of Fig. 3. D1 to D5 correspond in order to the first to fifth elements of each vector. For example, the point of interest at the first time point is D3, so the third element of the first vector is 1; the point of interest at the second time point is D1, so the first element of the second vector is 1; the point of interest at the third time point is D5, so the fifth element of the third vector is 1; and the point of interest at the fourth time point is D2, so the second element of the fourth vector is 1.
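The state vectors described for Fig. 4 amount to a one-hot encoding of the visited point of interest at each time point. A short Python sketch, with `one_hot_sequence` as a hypothetical helper name, reproduces the D3, D1, D5, D2 example:

```python
def one_hot_sequence(visits, pois):
    """One 0/1 vector per time point, with a 1 at the position of the visited POI."""
    return [[1 if poi == v else 0 for poi in pois] for v in visits]

pois = ["D1", "D2", "D3", "D4", "D5"]
vectors = one_hot_sequence(["D3", "D1", "D5", "D2"], pois)
```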

Step 4: According to the mobility Markov model built in Step 3, the state transition probabilities, the state transition probability matrix, and the n-MMC (n-MMCs, n Mobility Markov Chains) state transition probability matrix are established.

From the training data, on the basis of the data cleaning, the joint-density clustering method, and the construction of points of interest described above, the mobility Markov model is built. Each point of interest corresponds to an event, and each event corresponds to a state in the mobility Markov model. If the number of points of interest is m, the overall state transition probability is:

Wherein,

The state transition probability matrix can be built accordingly:

As shown in Fig. 2, from the constructed points of interest, the state transition probability matrix of the final mobility Markov model is as follows:

The standard mobility Markov chain (MMC, Mobility Markov Chain) is memoryless: its prediction of a future position depends only on the current position. This does not match reality well. People choose their future actions according to habit and memory, so when choosing the next action they decide on the basis of their historical trajectory. The memoryless property therefore has some negative effect on the accuracy of future position prediction. To solve this problem, the concept of n-MMCs is introduced: in n-MMCs, a state takes into account not only the current point of interest but also the n-1 previously visited points of interest.
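For the standard, memoryless MMC, a transition matrix is commonly estimated by counting transitions between consecutive events and normalizing each row. The Python sketch below uses that generic count-based estimate, which may differ in detail from the patent's own formula; the function name `transition_matrix` and the sample event sequence are illustrative assumptions.

```python
from collections import Counter, defaultdict

def transition_matrix(events, states):
    """Row-normalised counts: P[a][b] = count(a -> b) / count(a -> any state)."""
    counts = defaultdict(Counter)
    for a, b in zip(events, events[1:]):
        counts[a][b] += 1
    matrix = {}
    for a in states:
        total = sum(counts[a].values())
        # A state never seen as a source gets an all-zero row.
        matrix[a] = {b: (counts[a][b] / total if total else 0.0) for b in states}
    return matrix

P = transition_matrix(["H", "W", "H", "W", "L", "H"], ["H", "W", "S", "L"])
```

Here each row depends only on the current state, which is exactly the memoryless behavior that n-MMCs are meant to relax.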

From the training data, the n-MMCs state transition probability matrix is then built. To illustrate prediction based on n-MMCs, Fig. 2 shows the points of interest built by the joint-density clustering algorithm: the phone GPS data of a user, Xiao Qiang, was collected, and Xiao Qiang's trajectory information was learned from it. In the 2-MMCs, four different states are considered, namely Home (H), Work (W), Leisure (L), and Sports (S), and the aim is to predict the position at the next moment from the 2 most recently visited positions. The rows of the state transition probability matrix therefore represent all possible combinations of the n most recently visited points of interest, and the columns represent the next position in the n-MMCs. For example, if the previous position is H, the current position is W, and the predicted next position is H, then the state transition from HW to WH occurs and the state transition probability matrix is updated accordingly, with W now the previous position and H the current position. The corresponding state transition probability is as follows:

where μ is the previous state, ν is the current state, and σ is the next state; d_m,i-1,μ indicates that, with m points of interest in total, the (i-1)-th state is μ; d_m,i,ν that the i-th state is ν; d_m,i+1,σ that the (i+1)-th state is σ; d_n,j-1,μ indicates that, with n points of interest in total, the (j-1)-th state is μ; d_n,j,ν that the j-th state is ν; and d_n,j+1,σ that the (j+1)-th state is σ. Part of the state transition probability matrix of the final 2-MMC mobility Markov model is shown in Table 1:

Table 1. 2-MMC mobility Markov transition probabilities (rows: previous and current position; columns: next position)

State	W	H	S	L
HW	0.1	0.8	0.1	0
HS	0	0.8	0.13	0.07
HL	0	0.9	0.04	0.06
WH	0.71	0.24	0.03	0.02
WS	0.26	0.59	0.11	0.04
WL	0.32	0.68	0	0
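A table of this shape can be estimated from an event sequence by keying the counts on the n most recently visited points of interest. The Python sketch below again uses a generic count-based estimate rather than the patent's exact formula, and the name `n_mmc_matrix` and the sample sequence are hypothetical.

```python
from collections import Counter, defaultdict

def n_mmc_matrix(events, n=2):
    """Transition probabilities keyed by the tuple of the n most recent POIs."""
    counts = defaultdict(Counter)
    for i in range(len(events) - n):
        history = tuple(events[i:i + n])      # e.g. ("H", "W") for row HW
        counts[history][events[i + n]] += 1   # the POI visited next
    return {
        history: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for history, nexts in counts.items()
    }

M = n_mmc_matrix(["H", "W", "H", "W", "H", "W", "L", "H"], n=2)
```

Only histories that actually occur in the data get a row, which keeps the matrix small when most combinations of points of interest are never observed.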

Step 5: Using the state transition probability matrices from Step 4, the n-MMC state transition probability matrix is looked up to predict and determine the mobile user's next position.

According to the constructed mobility Markov model, to predict the next position, the row of the n-MMC state transition probability matrix corresponding to the previous state and the current state is looked up first; within that row, the column with the largest probability is then found, and the position it represents is taken as the next position of the moving object. The corresponding row and column of the n-MMC state transition probability matrix are updated at the same time.

It should be noted that the above embodiments illustrate rather than limit the technical solution of the present invention. Equivalent substitutions by a person of ordinary skill in the art, or other modifications made according to the prior art, shall fall within the scope of the claims of the present invention as long as they do not depart from the idea and scope of its technical solution.
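The lookup just described can be sketched in Python using the values of Table 1: find the row for the (previous, current) pair, then take the column with the largest probability. The dictionary layout and the name `predict_next` are illustrative assumptions, and the matrix update step is omitted.

```python
# Rows of Table 1, keyed by (previous, current); values give P(next).
TABLE_1 = {
    ("H", "W"): {"W": 0.10, "H": 0.80, "S": 0.10, "L": 0.00},
    ("H", "S"): {"W": 0.00, "H": 0.80, "S": 0.13, "L": 0.07},
    ("H", "L"): {"W": 0.00, "H": 0.90, "S": 0.04, "L": 0.06},
    ("W", "H"): {"W": 0.71, "H": 0.24, "S": 0.03, "L": 0.02},
    ("W", "S"): {"W": 0.26, "H": 0.59, "S": 0.11, "L": 0.04},
    ("W", "L"): {"W": 0.32, "H": 0.68, "S": 0.00, "L": 0.00},
}

def predict_next(matrix, previous, current):
    """Most probable next position for a (previous, current) pair, or None."""
    row = matrix.get((previous, current))
    if not row:
        return None  # this history was never observed
    return max(row, key=row.get)

next_after_hw = predict_next(TABLE_1, "H", "W")
```

For a user who was at Home and is now at Work, the HW row gives Home as the most probable next position, with probability 0.8.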

Claims

1. A prediction method based on a mobility Markov model for spatio-temporal big data, characterized by comprising the following steps:

Step 1: Denoise the collected historical position data. The collected historical position data is denoised: interference data is filtered out, dynamic movement trajectories are filtered out, and static movement trajectories are retained.

Step 2: Cluster the denoised data. The static movement trajectories obtained after denoising in Step 1 are clustered by a joint-density clustering algorithm to obtain clusters.

Step 3: Establish points of interest for the clusters. For the clusters obtained in Step 2, the behavioral characteristics of the mobile user are extracted, thereby establishing the user's points of interest.

Step 4: Denoise the points of interest. For the points of interest from Step 3, the radius, interval time, and density of each point of interest are calculated; at the same time, the denoised movement trajectories contained in each point of interest are clustered again, so that noise data in the points of interest is further filtered out and the real points of interest are retained.

Step 5: Build the mobility Markov model. For each real point of interest obtained in Step 4, the state transition probabilities and the state transition probability matrix are established.

Step 6: Predict the next position. After a mobile user's data is collected, it is processed through Steps 1 to 4 to extract the user's points of interest, and the user's next position is predicted according to the mobility Markov model established in Step 5.

2. The prediction method based on a mobility hidden Markov model for spatio-temporal big data according to claim 1, characterized in that the denoising method in Step 1 is: static movement trajectories are retained first, that is, trajectories whose speed satisfies speed < δ, where δ is a predefined constant; a trajectory whose speed satisfies speed > δ is a dynamic movement trajectory, and all dynamic movement trajectories are deleted.

3. The prediction method based on a mobility hidden Markov model for spatio-temporal big data according to claim 1, characterized in that cluster merging is performed on the clusters in Step 2, and the merging method is: given a cluster C₁ = {c₁, c₃, c₇, c₉} and a cluster C₂ = {c₉, c₁₁, c₁₂}, the two clusters are merged into one cluster: C₁ ∪ C₂ = {c₁, c₃, c₇, c₉, c₁₁, c₁₂}.