Group activity data collection method and system based on multi-source space-time trajectory data
Technical field
The present invention relates to the technical field of data processing, and more particularly to a group activity data collection method and system based on multi-source space-time trajectory data.
Background art
Traditional activity collection methods rely on activity diaries or activity surveys; their sample sizes are small, collection takes a long time, and the work is time-consuming and laborious. The explosion of space-time trajectory data provides a new tool for collecting the activities of large groups. Existing research on space-time data analysis focuses mainly on identifying individual activities in real space, especially travel activities, and lacks extraction of the essential attribute information of activities. A group activity extraction method that fuses multi-source space-time trajectory data therefore needs to be developed, to lay the data foundation for urban science research based on massive activity data. Although space-time trajectory data (such as mobile phone signaling data, vehicle GPS data, and social check-in data) contain rich temporal and location information, their semantic information is relatively scarce and their spatial and temporal resolutions differ, so they cannot directly provide group activity information.
Therefore, the existing technology needs to be improved and developed.
Summary of the invention
In view of the deficiencies of the prior art, the object of the present invention is to provide a group activity data collection method and system based on multi-source space-time trajectory data.
The technical solution of the present invention is as follows:
A group activity data collection method based on multi-source space-time trajectory data, wherein the method comprises:
A. the backend obtains raw mobile terminal signaling data and raw social software check-in data, preprocesses each of them, and generates corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format;
B. the backend extracts activity points from the to-be-processed signaling data according to preset temporal and spatial rules to obtain activity point trajectory data; constructs and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data;
C. the backend labels the activity point trajectory with semantic information using a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, and generates an activity space-time trajectory chain.
The group activity data collection method based on multi-source space-time trajectory data, wherein step A specifically comprises:
A1. the backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or space is outside a predefined range, and removing the data of users whose number of points is below or above certain thresholds, thereby generating preprocessed signaling data;
A2. the backend obtains the raw social software check-in data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or space is outside the research range, removing the data of users whose number of check-ins is below or above certain thresholds, and removing the data of users who checked in at only one place, thereby generating preprocessed check-in data;
A3. the spatial resolution of the preprocessed signaling data and of the preprocessed check-in data is converted to the resolution of a predetermined regular grid scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
The group activity data collection method based on multi-source space-time trajectory data, wherein, in step B, extracting activity points from the to-be-processed signaling data according to preset temporal and spatial rules to obtain activity point trajectory data specifically comprises:
B11. the backend obtains the to-be-processed signaling data and sorts it by person and by time according to a specific time rule, obtaining each person's time-ordered trajectory;
B12. based on the person's time-ordered trajectory, the times at which the person enters and leaves each specific position are calculated, each position the person successively enters is treated as an activity point, and the first position the person enters is set as the first activity point in the activity point trajectory;
B13. the spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding a candidate activity point trajectory;
B14. the candidate activity points in the candidate activity point trajectory are obtained; when the difference between the entry time and the leaving time of a candidate activity point is detected to be less than a second set threshold, that candidate activity point is removed from the candidate activity point trajectory, after which the activity point trajectory data is generated.
The group activity data collection method based on multi-source space-time trajectory data, wherein, in step B, constructing and learning prior information on group activity patterns from the check-in category information in the to-be-processed check-in data specifically comprises:
B21. based on the check-in categories of the social check-in platform and the total amount of users' check-in data in different time periods of a day, the intensity probability distribution of different group activities within a day is calculated;
B22. based on users' check-in data, the activity transition probability distribution of different group activities at different times is calculated;
B23. based on users' check-in data, the probability distribution of different group activities taking place in different areas is calculated.
The group activity data collection method based on multi-source space-time trajectory data, wherein, in step B, obtaining activity location data from the activity point trajectory data specifically comprises:
B31. time identification windows for a person's activity locations are preset, denoted the first activity window and the second activity window, respectively;
B32. the person's activity point trajectory data is obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity point is taken as a candidate activity location for the activity corresponding to that window;
B33. the candidate activity location with the longest matched time is taken as the user's activity location data.
The group activity data collection method based on multi-source space-time trajectory data, wherein step C specifically comprises:
C1. according to the Bayesian model, and given the position, the time, and the activity type at the previous moment, a probability formula for carrying out a certain type of activity at the next moment is generated;
C2. for each activity point in the activity point trajectory data, the probability of engaging in each activity is calculated, and the activity with the maximum probability is marked as the maximum-probability activity type of that activity point;
C3. after all activity points in the activity point trajectory data have been labeled, the activity space-time trajectory chain is output.
A group activity data collection system based on multi-source space-time trajectory data, wherein the system comprises:
a preprocessing module, used by the backend to obtain raw mobile terminal signaling data and raw social software check-in data, preprocess each of them, and generate corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format;
an activity location data acquisition module, used by the backend to extract activity points from the to-be-processed signaling data according to preset temporal and spatial rules to obtain activity point trajectory data; to construct and learn prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity point trajectory data;
a semantic labeling module, used by the backend to label the activity point trajectory with semantic information using a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, and to generate an activity space-time trajectory chain.
The group activity data collection system based on multi-source space-time trajectory data, wherein the preprocessing module specifically comprises:
a signaling data processing unit, used by the backend to obtain the raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or space is outside a predefined range, and removing the data of users whose number of points is below or above certain thresholds, thereby generating preprocessed signaling data;
a check-in data processing unit, used by the backend to obtain the raw social software check-in data and perform quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or space is outside the research range, removing the data of users whose number of check-ins is below or above certain thresholds, and removing the data of users who checked in at only one place, thereby generating preprocessed check-in data;
a resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and of the preprocessed check-in data to the resolution of a predetermined regular grid scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
The group activity data collection system based on multi-source space-time trajectory data, wherein the activity location data acquisition module specifically comprises:
a sorting unit, used by the backend to obtain the to-be-processed signaling data and sort it by person and by time according to a specific time rule, obtaining each person's time-ordered trajectory;
an activity point marking unit, used to calculate, from the person's time-ordered trajectory, the times at which the person enters and leaves each specific position, to treat each position the person successively enters as an activity point, and to set the first position the person enters as the first activity point in the activity point trajectory;
a candidate activity point trajectory generation unit, used to calculate the spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding a candidate activity point trajectory;
an activity point trajectory data processing unit, used to obtain the candidate activity points in the candidate activity point trajectory and, when the difference between the entry time and the leaving time of a candidate activity point is detected to be less than a second set threshold, to remove that candidate activity point from the candidate activity point trajectory, after which the activity point trajectory data is generated;
a first probability calculation unit, used to calculate the intensity probability distribution of different group activities within a day, based on the check-in categories of the social check-in platform and the total amount of users' check-in data in different time periods of a day;
a second probability calculation unit, used to calculate, from users' check-in data, the activity transition probability distribution of different group activities at different times;
a third probability calculation unit, used to calculate, from users' check-in data, the probability distribution of different group activities taking place in different areas;
a presetting unit, used to preset time identification windows for a person's activity locations, denoted the first activity window and the second activity window, respectively;
a candidate activity location determination unit, used to obtain the person's activity point trajectory data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity point is taken as a candidate activity location for the activity corresponding to that window;
an activity location data acquisition unit, used to take the candidate activity location with the longest matched time as the user's activity location data.
The group activity data collection system based on multi-source space-time trajectory data, wherein the semantic labeling module specifically comprises:
a fourth probability calculation unit, used to generate, according to the Bayesian model and given the position, the time, and the activity type at the previous moment, a probability formula for carrying out a certain type of activity at the next moment;
a maximum-probability activity type marking unit, used to calculate, for each activity point in the activity point trajectory data, the probability of engaging in each activity, and to mark the activity with the maximum probability as the maximum-probability activity type of that activity point;
an activity space-time trajectory chain generation unit, used to output the activity space-time trajectory chain after all activity points in the activity point trajectory data have been labeled.
The present invention provides a group activity data collection method and system based on multi-source space-time trajectory data. The present invention uses a Bayesian model to infer individual activities and takes into account the influence of the activity type at the previous moment on the activity type at the next moment in the space-time activity trajectory, thereby achieving accurate, fast, and efficient extraction and collection of large-scale, massive group activities.
Brief description of the drawings
Fig. 1 is a flowchart of a preferred embodiment of the group activity data collection method based on multi-source space-time trajectory data of the present invention.
Fig. 2 is a functional block diagram of a preferred embodiment of the group activity data collection system based on multi-source space-time trajectory data of the present invention.
Detailed description of the embodiments
To make the purpose, technical solution, and effects of the present invention clearer and more definite, the present invention is described in further detail below. It should be understood that the specific embodiments described here are merely intended to explain the present invention and are not intended to limit it.
The present invention provides a flowchart of a preferred embodiment of a group activity data collection method based on multi-source space-time trajectory data. As shown in Fig. 1, the method comprises:
Step S100: the backend obtains raw mobile terminal signaling data and raw social software check-in data, preprocesses each of them, and generates corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format. The mobile terminal is preferably a mobile phone.
In a further embodiment, step S100 specifically comprises:
Step S101: the backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or space is outside a predefined range, and removing the data of users whose number of points is below or above certain thresholds, thereby generating preprocessed signaling data;
Step S102: the backend obtains the raw social software check-in data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or space is outside the research range, removing the data of users whose number of check-ins is below or above certain thresholds, and removing the data of users who checked in at only one place, thereby generating preprocessed check-in data;
Step S103: the spatial resolution of the preprocessed signaling data and of the preprocessed check-in data is converted to the resolution of a predetermined regular grid scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
In a specific implementation, the mobile phone signaling data and the social check-in data are preprocessed so that they meet the requirements of subsequent processing. The specific content includes:
Quality cleaning of the mobile phone signaling data, including removing duplicate data, removing data with missing attributes, removing data whose time or space is outside the research range, and removing the data of users whose number of points is below or above certain thresholds. The choice of thresholds depends on the specific data type, data format, and data quality. Preferably, users with fewer than 3 points per day or more than 100 points per day are removed.
Quality cleaning of the social check-in data, including removing duplicate data, removing data with missing attributes, and removing data whose time or space is outside the research range; removing the data of users with fewer than 2 or more than 100 check-ins; and removing the data of users who checked in at only one place.
For multi-source space-time trajectory data, the influence of spatial resolution is taken into account. The spatial resolutions of the mobile phone signaling data and the social check-in data are uniformly converted to the scale of a regular grid. The grid scale generally depends on the spatial resolutions of the two types of data themselves. The preferred scale is 500 m × 500 m.
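The cleaning and grid conversion described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Record` type, the function names, the assumption that coordinates are already projected to metres, and the assumption that the input covers a single day (so the per-day point thresholds apply directly) are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

GRID_SIZE_M = 500.0  # preferred grid scale named in the text


@dataclass(frozen=True)
class Record:
    """One signaling or check-in point (hypothetical layout)."""
    user: str
    t: float              # timestamp in seconds (assumed unit)
    x: Optional[float]    # projected easting in metres (assumed)
    y: Optional[float]    # projected northing in metres (assumed)


def clean(records: Iterable[Record],
          t_range: Tuple[float, float],
          bbox: Tuple[float, float, float, float],
          min_points: int = 3,
          max_points: int = 100) -> List[Record]:
    """Drop duplicates, rows with missing attributes, rows outside the
    time/space range, and users with too few or too many points."""
    t0, t1 = t_range
    xmin, ymin, xmax, ymax = bbox
    seen = set()
    kept = []
    for r in records:
        if r.x is None or r.y is None:   # missing attribute
            continue
        if r in seen:                    # exact duplicate
            continue
        if not (t0 <= r.t <= t1 and xmin <= r.x <= xmax and ymin <= r.y <= ymax):
            continue                     # outside the research range
        seen.add(r)
        kept.append(r)
    counts = Counter(r.user for r in kept)
    return [r for r in kept if min_points <= counts[r.user] <= max_points]


def to_grid(r: Record, size: float = GRID_SIZE_M) -> Tuple[int, int]:
    """Map a point to the (column, row) index of its regular grid cell."""
    return (int(r.x // size), int(r.y // size))
```

Applying `to_grid` to both data sources puts them on the same spatial scale, so that later steps can compare positions by grid cell rather than by raw coordinates.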
Step S200: the backend extracts activity points from the to-be-processed signaling data according to preset temporal and spatial rules to obtain activity point trajectory data; constructs and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data.
Further, in step S200, extracting activity points from the to-be-processed signaling data according to preset temporal and spatial rules to obtain activity point trajectory data specifically comprises:
Step S211: the backend obtains the to-be-processed signaling data and sorts it by person and by time according to a specific time rule, obtaining each person's time-ordered trajectory;
Step S212: based on the person's time-ordered trajectory, the times at which the person enters and leaves each specific position are calculated, each position the person successively enters is treated as an activity point, and the first position the person enters is set as the first activity point in the activity point trajectory;
Step S213: the spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding a candidate activity point trajectory;
Step S214: the candidate activity points in the candidate activity point trajectory are obtained; when the difference between the entry time and the leaving time of a candidate activity point is detected to be less than a second set threshold, that candidate activity point is removed from the candidate activity point trajectory, after which the activity point trajectory data is generated.
In a specific implementation, the activity points of a person are extracted from the processed mobile phone signaling data to obtain the person's activity point trajectory. Activity points are extracted mainly by setting temporal and spatial rules. The specific method is as follows:
The resulting mobile phone signaling data are sorted by person and by time to obtain each person's time-ordered trajectory;
Using the person's time-ordered trajectory, the times at which the person enters and leaves each position (grid cell) are calculated, and the first position is set as the first activity point in the activity point trajectory;
Moving forward in time, the spatial distance and time difference between each point in the time-ordered trajectory and the activity point in the existing activity point trajectory are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory. The preferred range of the spatial distance threshold is 500 m to 1000 m.
For each candidate activity point in the candidate activity point trajectory, if the difference between its entry time and its leaving time is less than a certain threshold, the point is considered not to be an activity point and is removed from the candidate activity point trajectory, finally yielding the activity point trajectory. Preferably, this threshold ranges from 1 hour to 3 hours.
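The extraction rule above can be sketched as a single pass over one person's time-ordered points. This is a sketch under stated assumptions rather than the definitive procedure: the names `ActivityPoint` and `extract_activity_points` are hypothetical, each new point is compared only with the most recent activity point, the time difference is measured from that activity point's last observed time, and the one-hour `max_gap` default is an assumed value (the text gives a preferred range only for the distance and minimum-stay thresholds).

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]  # (t in seconds, x in metres, y in metres)


@dataclass
class ActivityPoint:
    x: float       # position of the first point that opened this activity point
    y: float
    enter: float   # entry time
    leave: float   # leaving time (last point merged so far)

    @property
    def duration(self) -> float:
        return self.leave - self.enter


def extract_activity_points(track: Iterable[Point],
                            max_dist: float = 500.0,    # within the 500 m to 1000 m range
                            max_gap: float = 3600.0,    # assumed value
                            min_stay: float = 3600.0    # within the 1 h to 3 h range
                            ) -> List[ActivityPoint]:
    candidates: List[ActivityPoint] = []
    for t, x, y in sorted(track):  # time-ordered trajectory
        last = candidates[-1] if candidates else None
        if (last is not None
                and math.hypot(x - last.x, y - last.y) < max_dist
                and t - last.leave < max_gap):
            last.leave = t  # merge the point into the existing activity point
        else:
            candidates.append(ActivityPoint(x, y, t, t))  # new activity point
    # Drop candidates whose stay is shorter than the second threshold.
    return [c for c in candidates if c.duration >= min_stay]
```

Short stops, such as a few minutes spent at a transfer point, fail the `min_stay` filter and therefore do not appear in the resulting activity point trajectory.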
Further, in step S200, constructing and learning prior information on group activity patterns from the check-in category information in the to-be-processed check-in data specifically comprises:
Step S221: based on the check-in categories of the social check-in platform and the total amount of users' check-in data in different time periods of a day, the intensity probability distribution of different group activities within a day is calculated;
Step S222: based on users' check-in data, the activity transition probability distribution of different group activities at different times is calculated;
Step S223: based on users' check-in data, the probability distribution of different group activities taking place in different areas is calculated.
In a specific implementation, the processed social check-in data, which are rich in check-in category information, are used to construct and learn prior information on group activity patterns. The specific method is as follows:
Based on the check-in categories provided by the social check-in platform and the total amount of users' check-in data in different time periods of a day, the intensity probability distribution of different group activities within a day, $\Pr(AT_i \mid t)$, is calculated, expressed as:
$$\Pr(AT_i \mid t) = \frac{\mathrm{checkins}(AT_i, t)}{\sum_t \mathrm{checkins}(AT_i, t)}$$
where $\mathrm{checkins}(AT_i, t)$ is the number of check-ins of activity type $i$ at moment $t$, $\sum_t \mathrm{checkins}(AT_i, t)$ is the number of check-ins of activity type $i$ over all moments of the day, and $AT_i$ denotes check-ins whose activity category is $i$.
Based on users' check-in trajectories, the activity transition probability distribution of different group activities at different times is calculated, expressed as $\Pr(AT_i, t \mid AT_j, t-1)$, where $i$ and $j$ denote activity categories and $t$ denotes time. $(AT_i, t)$ denotes check-ins of activity type $i$ at moment $t$, and $(AT_j, t-1)$ denotes check-ins of activity type $j$ at moment $t-1$; $\Pr(AT_i, t \mid AT_j, t-1)$ is the probability of engaging in activity $i$ at moment $t$ given that activity $j$ was engaged in at the previous moment $t-1$. $\Pr(X)$ denotes the probability of event $X$.
Based on users' check-in trajectories, the probability distribution of different group activities taking place in different grid areas is calculated, expressed as $\Pr(Grid_m \mid AT_i, t)$, where $m$ is the grid index, $Grid_m$ denotes the $m$-th grid cell, $i$ is the activity category, and $t$ is the time.
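The three prior distributions can be estimated by simple counting over check-in trajectories, as in the sketch below. The input layout (a dictionary of per-user lists of `(hour, category, grid)` tuples sorted by time), the function name `learn_priors`, the use of hourly time slots, and the choice to estimate transitions from consecutive check-ins of the same user are all assumptions made for illustration; the text does not fix these details.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple

Visit = Tuple[int, str, Hashable]  # (hour slot t, activity category i, grid cell m)


def learn_priors(trajectories: Dict[str, List[Visit]]):
    """Estimate the three priors from per-user, time-sorted check-ins."""
    n_it = Counter()     # checkins(AT_i, t)
    n_i = Counter()      # sum over t of checkins(AT_i, t)
    n_git = Counter()    # check-ins of category i at hour t in grid m
    n_trans = Counter()  # transitions j -> i arriving at hour t
    n_prev = Counter()   # all transitions out of j arriving at hour t
    for visits in trajectories.values():
        for hour, cat, grid in visits:
            n_it[cat, hour] += 1
            n_i[cat] += 1
            n_git[grid, cat, hour] += 1
        # Consecutive check-ins of one user approximate (AT_j, t-1) -> (AT_i, t).
        for (_, prev_cat, _), (hour, cat, _) in zip(visits, visits[1:]):
            n_trans[prev_cat, cat, hour] += 1
            n_prev[prev_cat, hour] += 1
    # Intensity, normalised over the day as described in the text.
    intensity = {(i, t): c / n_i[i] for (i, t), c in n_it.items()}
    # Pr(AT_i, t | AT_j, t-1), keyed by (j, i, t).
    transition = {(j, i, t): c / n_prev[j, t] for (j, i, t), c in n_trans.items()}
    # Pr(Grid_m | AT_i, t), keyed by (m, i, t).
    location = {(g, i, t): c / n_it[i, t] for (g, i, t), c in n_git.items()}
    return intensity, transition, location
```

With real data the counts are sparse, so some form of smoothing for unseen combinations would likely be needed before the tables are used for inference.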
In a further embodiment, obtaining activity location data from the activity point trajectory data in step S200 specifically comprises:
Step S231: time identification windows for a person's activity locations are preset, denoted the first activity window and the second activity window, respectively;
Step S232: the person's activity point trajectory data is obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity point is taken as a candidate activity location for the activity corresponding to that window;
Step S233: the candidate activity location with the longest matched time is taken as the user's activity location data.
In a specific implementation, the home and work activities of a person are detected from the obtained activity point trajectory data. The specific method is as follows:
Based on common sense, the identification windows for home activity and work activity are set to 0:00 to 7:00 and 9:00 to 17:00, respectively;
For a person's activity point trajectory data, the duration of each activity point is matched against the two identification windows; if the duration of the activity point falls within an identification window and accounts for 50% or more of the total length of that window, the match is considered successful and the activity point is taken as a candidate home or work activity location;
The home or work activity location with the longest matched time is taken as the user's home or work activity location; if there is no successful match, the user is considered to have no identified home or work activity location.
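The window matching above can be sketched as an interval-overlap test. The sketch assumes that each stay is given as `(location, enter_hour, leave_hour)` within a single day and that "falls within the window" is measured as the overlap between the stay and the window; the names `WINDOWS`, `overlap`, and `detect_home_work` are hypothetical.

```python
from typing import Dict, Hashable, List, Optional, Tuple

# Identification windows from the text, in hours of the day.
WINDOWS = {"home": (0.0, 7.0), "work": (9.0, 17.0)}

Stay = Tuple[Hashable, float, float]  # (location, enter hour, leave hour)


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def detect_home_work(stays: List[Stay],
                     min_share: float = 0.5) -> Dict[str, Optional[Hashable]]:
    best: Dict[str, Tuple[Hashable, float]] = {}
    for loc, enter, leave in stays:
        for name, (w0, w1) in WINDOWS.items():
            matched = overlap(enter, leave, w0, w1)
            # Candidate only if the stay covers at least half of the window.
            if matched >= min_share * (w1 - w0):
                # Keep the candidate with the longest matched time.
                if matched > best.get(name, (None, 0.0))[1]:
                    best[name] = (loc, matched)
    return {name: (best[name][0] if name in best else None) for name in WINDOWS}
```

For example, `detect_home_work([("g1", 0.0, 6.5), ("g3", 13.0, 18.0)])` yields `g1` as the home location and `g3` as the work location, while a user with no sufficiently long stay in either window gets `None` for both. Over a multi-day observation period the matched time per location would presumably be accumulated across days before the longest one is chosen.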
Step S300: the backend labels the activity point trajectory with semantic information using a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, and generates an activity space-time trajectory chain.
Using the obtained activity point trajectory, the obtained prior information on the space-time patterns of group activities, and the obtained home and work activity information of each person, the activity point trajectory is labeled with semantic information based on the Bayesian model. The labeled activity information mainly includes home, work, and other (for example, entertainment, shopping, study, leisure, and travel), yielding the activity space-time trajectory chain.
The resulting space-time activity trajectory chain is of great significance for research on urban planning and on the dynamic change of urban functional zones. According to changes in space-time activities, timely adjustments and predictions can be made for the dynamic change of planned urban functional zones.
In a further embodiment, step S300 specifically comprises:
Step S301: according to the Bayesian model, and given the position, the time, and the activity type at the previous moment, a probability formula for carrying out a certain type of activity at the next moment is generated;
Step S302: for each activity point in the activity point trajectory data, the probability of engaging in each activity is calculated, and the activity with the maximum probability is marked as the maximum-probability activity type of that activity point;
Step S303: after all activity points in the activity point trajectory data have been labeled, the activity space-time trajectory chain is output.
In a specific implementation, according to the Bayesian model, given the specific location, the time, and the activity type at the previous moment, the probability that a certain type of activity will be carried out at the next moment is:
$$\Pr(AT_i \mid Grid_m, t, AT_j) \propto \Pr(Grid_m \mid AT_i, t, AT_j)\,\Pr(AT_i \mid t, AT_j)\,\Pr(AT_j \mid t) \quad (1)$$
where $m$ is the grid index, $j$ is the activity type at the previous moment, $t$ is the current moment, and $i$ is the activity type at the current moment.
For $\Pr(Grid_m \mid AT_i, t, AT_j)$, $AT_j$ and $Grid_m$ are considered conditionally independent, so the formula can be simplified to:
$$\Pr(Grid_m \mid AT_i, t, AT_j) = \Pr(Grid_m \mid AT_i, t) \quad (2)$$
For $\Pr(AT_i \mid t, AT_j)$, the formula can be rewritten as:
$$\Pr(AT_i \mid t, AT_j) = \Pr(AT_i, t \mid AT_j, t-1) \quad (3)$$
Combining formulas (2) and (3), formula (1) is converted to:
$$\Pr(AT_i \mid Grid_m, t, AT_j) \propto \Pr(Grid_m \mid AT_i, t)\,\Pr(AT_i, t \mid AT_j, t-1)\,\Pr(AT_j \mid t) \quad (5)$$
The activity points of the activity point trajectory are input into formula (5) in sequence, the probability of engaging in each activity is calculated, and the activity with the maximum probability is marked as the maximum-probability activity type of that activity point.
In particular, for a grid position already marked as a home or work activity type, $\Pr(Grid_m \mid AT_i, t)$ is set to 1 and $(AT_j, t-1)$ is set to $AT_{home}$ or $AT_{working}$, and processing continues with the labeling of the next activity point. When all activity points in the activity point trajectory have been labeled, the activity space-time chain is output. $AT_{home}$ indicates that the activity type is home, and $AT_{working}$ indicates that the activity type is work.
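The sequential labeling with formula (5) can be sketched as follows. This is an illustrative reading, not the authoritative procedure: the function name `label_trajectory`, the dictionary layouts of the three prior tables, the `eps` floor for unseen combinations, and the `first_prev` default used for the first activity point are all assumptions. The special case for home and work cells is interpreted here as assigning that label directly and carrying it forward as the previous activity.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def label_trajectory(points: Sequence[Tuple[Hashable, int]],
                     intensity: Dict[Tuple[str, int], float],
                     transition: Dict[Tuple[str, str, int], float],
                     location: Dict[Tuple[Hashable, str, int], float],
                     known: Dict[Hashable, str],
                     types: Sequence[str],
                     first_prev: str = "home",   # assumed starting activity
                     eps: float = 1e-9           # assumed floor for unseen cases
                     ) -> List[Tuple[Hashable, int, str]]:
    """points: time-ordered (grid, hour) pairs; known: {grid: 'home' or 'work'}.

    Table keys (assumed): intensity[(i, t)], transition[(j, i, t)],
    location[(m, i, t)].
    """
    chain = []
    prev = first_prev
    for grid, hour in points:
        if grid in known:
            # Cell already identified as a home or work location.
            label = known[grid]
        else:
            def score(i: str) -> float:
                # Right-hand side of formula (5); the last factor does not
                # depend on i, so it does not change the arg max.
                return (location.get((grid, i, hour), eps)
                        * transition.get((prev, i, hour), eps)
                        * intensity.get((prev, hour), eps))
            label = max(types, key=score)
        chain.append((grid, hour, label))
        prev = label  # becomes AT_j for the next activity point
    return chain
```

Because each label depends only on the previous label, this is a greedy pass over the trajectory; it follows the sequential procedure described in the text rather than searching for the jointly most probable label sequence.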
The activity point trajectory extraction method depends on the specific data type and on the spatial and temporal resolution of the data, and is not limited to the method described in the present invention;
The home and work activity detection method is constrained by the observation duration of the space-time data, and the choice of thresholds is not limited to that described in the present invention;
The construction and learning of prior information on group activity patterns is not limited to social media check-in data; resident survey data, GPS trajectory data, volunteer data, and the like may also be used.
The present invention proposes a new group activity collection method based on multi-source space-time trajectory data. It uses a Bayesian model to infer individual activities, addresses the problems of existing methods (time-consuming and laborious, costly, small sample sizes), and achieves accurate, fast, and efficient extraction and collection of large-scale, massive group activities. The group activity inference of the present invention considers not only the constraints that factors such as time and location in urban space place on human activity, but also the influence of the activity type at the previous moment on the activity type at the next moment in the space-time activity trajectory; that is, activities are inferred within the human space-time activity chain.
The present invention also provides a kind of preferable realities of group activity data gathering system based on multi-source space-time trajectory data
The functional schematic block diagram of example is applied, as shown in Fig. 2, system includes:
Preprocessing module 100, used by the backend to obtain original mobile terminal signaling data and original social software check-in data, and to preprocess the two data sets separately, generating the corresponding signaling data to be processed and check-in data to be processed, both of which conform to a specific format; for details, see the method embodiment.
Activity location data acquisition module 200, used by the backend to extract activity points from the signaling data to be processed according to preset temporal and spatial rules, obtaining activity-point trajectory data; to construct and learn the prior information on group activity patterns according to the check-in category information in the check-in data to be processed; and to obtain activity location data from the activity-point trajectory data; for details, see the method embodiment.
Semantic labeling module 300, used by the backend to label the activity-point trajectory with semantic information using a Bayesian model, according to the activity-point trajectory data, the prior information on group activity patterns, and the activity location data, generating an activity spatio-temporal trajectory chain; for details, see the method embodiment.
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the preprocessing module specifically includes:
Signaling data processing unit, used by the backend to obtain the original mobile terminal signaling data and to perform quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or location lies outside the predefined range, and removing the data of users whose number of points is below or above certain thresholds, generating preprocessed signaling data; for details, see the method embodiment.
Check-in data processing unit, used by the backend to obtain the original social software check-in data and to perform quality cleaning on it: removing duplicate records, removing records with missing attributes, removing records whose time or location lies outside the study range, removing the data of users whose number of check-ins falls within a certain range, and removing the data of users who checked in at only one place, generating preprocessed check-in data; for details, see the method embodiment.
Resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a regular grid of predetermined scale, generating the corresponding signaling data to be processed and check-in data to be processed; for details, see the method embodiment.
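As an illustration only, the quality cleaning and grid conversion described for the preprocessing units could be sketched as follows. The record schema (keys `user`, `t`, `x`, `y`), the use of projected coordinates in meters, and the function names `clean_records` and `to_grid_cell` are assumptions introduced for this sketch and do not come from the method embodiment. The sketch covers the signaling variant; check-in data would additionally need the single-location filter.

```python
import math
from collections import Counter


def clean_records(records, t_range, bbox, min_points, max_points):
    """Quality-clean records given as dicts with keys user, t, x, y (assumed schema)."""
    t_min, t_max = t_range
    x_min, y_min, x_max, y_max = bbox
    seen, kept = set(), []
    for r in records:
        key = (r.get("user"), r.get("t"), r.get("x"), r.get("y"))
        if None in key:                 # record with a missing attribute
            continue
        if key in seen:                 # exact duplicate
            continue
        seen.add(key)
        if not (t_min <= r["t"] <= t_max):
            continue                    # outside the predefined time range
        if not (x_min <= r["x"] <= x_max and y_min <= r["y"] <= y_max):
            continue                    # outside the predefined spatial range
        kept.append(r)
    # Drop users with too few or too many remaining points.
    counts = Counter(r["user"] for r in kept)
    return [r for r in kept if min_points <= counts[r["user"]] <= max_points]


def to_grid_cell(x, y, cell_size, origin=(0.0, 0.0)):
    """Map a projected coordinate to the (row, col) index of a regular grid."""
    col = math.floor((x - origin[0]) / cell_size)
    row = math.floor((y - origin[1]) / cell_size)
    return row, col
```

Mapping both data sources to the same grid cells is what allows signaling points and check-ins to be compared location by location despite their different native resolutions.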
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the activity location data acquisition module specifically includes:
Sorting unit, used by the backend to obtain the signaling data to be processed and to sort it by person and by time according to a specific time rule, obtaining each person's time-ordered trajectory; for details, see the method embodiment.
Activity point marking unit, used to calculate, from each person's time-ordered trajectory, the times at which the person enters and leaves each position; each position the person enters is set, in order, as an activity point, and the first position the person enters is set as the first activity point in the activity-point trajectory; for details, see the method embodiment.
Candidate activity-point trajectory generation unit, used to calculate the spatial distance and the time difference between each point in the time-ordered trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise the point is set as a new activity point; once all points in the time-ordered trajectory have been processed, the candidate activity-point trajectory is obtained; for details, see the method embodiment.
Activity-point trajectory data processing unit, used to obtain the candidate activity points in the candidate activity-point trajectory; when the difference between the entry time and the departure time of a candidate activity point is detected to be less than a second set threshold, that candidate activity point is removed from the candidate activity-point trajectory, after which the activity-point trajectory data is generated; for details, see the method embodiment.
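The following is a minimal sketch of the candidate generation and short-stay filtering steps, assuming a trajectory of `(t, x, y)` tuples already sorted by time, with `t` in seconds and coordinates in meters. Comparing each point only with the most recent activity point, anchoring an activity point at its first observation, and the names `ActivityPoint`, `candidate_activity_points`, and `drop_short_stays` are choices made for this sketch rather than details taken from the method embodiment.

```python
import math
from dataclasses import dataclass


@dataclass
class ActivityPoint:
    x: float          # location of the first observation at this activity point
    y: float
    t_enter: float    # time the person entered
    t_leave: float    # time of the last observation merged into this point


def candidate_activity_points(track, max_dist, max_gap):
    """Merge consecutive points that are close in both space and time."""
    points = []
    for t, x, y in track:
        if points:
            cur = points[-1]
            near = math.hypot(x - cur.x, y - cur.y) < max_dist
            soon = (t - cur.t_leave) < max_gap
            if near and soon:
                cur.t_leave = t   # extend the existing activity point
                continue
        points.append(ActivityPoint(x, y, t, t))  # otherwise start a new one
    return points


def drop_short_stays(points, min_duration):
    """Remove candidates whose stay is shorter than the second threshold."""
    return [p for p in points if (p.t_leave - p.t_enter) >= min_duration]
```

For example, with `max_dist=100` and `max_gap=1800`, three observations within a few meters of each other at ten-minute intervals collapse into a single activity point whose duration spans the first to the last observation.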
First probability calculation unit, used to calculate the intensity probability distribution of different group activities over the course of a day, according to the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of the day; for details, see the method embodiment.
Second probability calculation unit, used to calculate, from users' check-in data, the activity transition probability distribution of different group activities at different times; for details, see the method embodiment.
Third probability calculation unit, used to calculate, from users' check-in data, the probability distribution of different group activities taking place in different regions; for details, see the method embodiment.
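One possible reading of the three probability calculation units is sketched below as simple relative-frequency estimates over check-in records. The record layout `(user, hour, cell, activity)`, the restriction to a single day so that `hour` orders each user's check-ins, and the function names are assumptions for illustration; the exact formulas are those of the method embodiment.

```python
from collections import Counter, defaultdict


def _normalise(counts):
    """Turn nested counts {key: Counter} into nested relative frequencies."""
    return {
        key: {a: n / sum(c.values()) for a, n in c.items()}
        for key, c in counts.items()
    }


def learn_priors(checkins):
    """Estimate three distributions from (user, hour, cell, activity) records.

    Returns (p_time, p_region, p_trans) where
      p_time[hour][a]            ~ share of activity a among check-ins at that hour
      p_region[cell][a]          ~ share of activity a among check-ins in that cell
      p_trans[(hour, prev)][a]   ~ share of a following activity prev at that hour
    """
    by_time = defaultdict(Counter)
    by_region = defaultdict(Counter)
    by_trans = defaultdict(Counter)
    last = {}  # most recent activity seen for each user
    for user, hour, cell, activity in sorted(checkins, key=lambda c: (c[0], c[1])):
        by_time[hour][activity] += 1
        by_region[cell][activity] += 1
        if user in last:
            by_trans[(hour, last[user])][activity] += 1
        last[user] = activity
    return _normalise(by_time), _normalise(by_region), _normalise(by_trans)
```

Relative frequencies are the simplest estimator; with sparse check-in data, some form of smoothing would likely be needed so that unseen combinations do not receive zero probability.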
Presetting unit, used to preset the time identification windows for a person's activity locations, denoted the first activity window and the second activity window respectively; for details, see the method embodiment.
Candidate activity location determination unit, used to obtain a person's activity-point trajectory data and to match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity location corresponding to that window is assigned to the activity point as a candidate activity location; for details, see the method embodiment.
Activity location data acquisition unit, used to take the candidate activity location with the longest matching time as the user's activity location data; for details, see the method embodiment.
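The window matching can be sketched as an interval-overlap test. The sketch assumes that stays are given as `(cell, t_enter, t_leave)` with times in hours of a single day, that a window does not cross midnight, and that the example windows (00:00 to 06:00 and 09:00 to 17:00) stand in for the first and second activity windows; none of these specifics come from the method embodiment.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def pick_activity_location(stays, window, min_share=0.5):
    """Return the cell whose stay overlaps the window longest.

    A stay only qualifies as a candidate if its overlap covers at least
    min_share (50% by default) of the window length. Returns None if no
    stay qualifies.
    """
    w_start, w_end = window
    w_len = w_end - w_start
    best_cell, best_overlap = None, 0.0
    for cell, t_in, t_out in stays:
        ov = overlap(t_in, t_out, w_start, w_end)
        if ov >= min_share * w_len and ov > best_overlap:
            best_cell, best_overlap = cell, ov
    return best_cell


# Illustrative use with assumed windows:
stays = [("A", 0.0, 7.5), ("B", 8.5, 17.5), ("C", 18.0, 19.0)]
first_location = pick_activity_location(stays, (0.0, 6.0))    # cell "A"
second_location = pick_activity_location(stays, (9.0, 17.0))  # cell "B"
```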
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the semantic labeling module specifically includes:
Fourth probability calculation unit, used to generate, according to the Bayesian model, the formula for the probability of carrying out a given type of activity at the next moment, given the location, the time, and the activity type at the previous moment; for details, see the method embodiment.
Maximum-probability activity type labeling unit, used to calculate, for each activity point in the activity-point trajectory data, the probability of engaging in each of the different activities, and to label the activity point with the activity of highest probability as its maximum-probability activity type; for details, see the method embodiment.
Activity spatio-temporal trajectory chain generation unit, used to output the activity spatio-temporal trajectory chain once all activity points in the activity-point trajectory data have been labeled; for details, see the method embodiment.
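To make the labeling step concrete, the sketch below scores each candidate activity at an activity point and keeps the highest-scoring one, carrying it forward as the previous activity for the next point. The score used here, a product of a time term, a region term, and a transition term, is a simplifying assumption of this sketch and is not the probability formula of the method embodiment; the dictionary layouts and the name `label_chain` are likewise hypothetical.

```python
def label_chain(stays, p_time, p_region, p_trans, activities, start_activity):
    """Label each (hour, cell) activity point with its most probable activity.

    Assumed score for activity a at a point:
        p_time[hour][a] * p_region[cell][a] * p_trans[(hour, prev)][a]
    Missing entries count as probability 0. A point with no supporting
    evidence is labeled None and does not update the previous activity.
    """
    chain, prev = [], start_activity
    for hour, cell in stays:
        scores = {
            a: p_time.get(hour, {}).get(a, 0.0)
            * p_region.get(cell, {}).get(a, 0.0)
            * p_trans.get((hour, prev), {}).get(a, 0.0)
            for a in activities
        }
        best = max(scores, key=scores.get)
        if scores[best] > 0.0:
            prev = best          # feeds the transition term of the next point
        else:
            best = None          # no evidence for any activity
        chain.append((hour, cell, best))
    return chain
```

Because each label depends on the label chosen for the preceding point, the points must be processed in time order; this is what makes the output a chain rather than a set of independent labels.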
In conclusion the present invention provides a kind of group activity method of data capture based on multi-source space-time trajectory data and
System, method include: that backstage obtains originating mobile terminal signaling data and original social software and registers and data and pre-processed,
Generate the signaling data to be processed for meeting specific format and data to be processed of registering;The work that backstage is obtained from signaling data to be processed
Moving point trace data;Construct and learn the prior information of group activity rule;Moving point track data is obtained, activity venue is obtained
Data;Backstage is according to moving point track data, the prior information of group activity rule, activity venue data, using based on pattra leaves
This model carry out activity locus of points semantic information label, generation activity space-time trajectory chain.The present invention is carried out using Bayesian model
The deduction of individual activity, and previous moment Activity Type is considered in spatio-temporal activity track to the shadow of later moment in time Activity Type
It rings, realizes a wide range of, the accurate, quick of magnanimity group activity, high efficiency extraction and collection.
It should be understood that the application of the present invention is not limited to the examples above. Those of ordinary skill in the art may make improvements or modifications based on the above description, and all such improvements and modifications shall fall within the scope of protection of the appended claims of the present invention.