CN106211071B - Group activity data collection method and system based on multi-source spatio-temporal trajectory data - Google Patents
- Publication number: CN106211071B (CN201610517438.6A)
- Authority: CN (China)
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/029—Location-based management or tracking services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W64/00—Locating users or terminals or network equipment for network management purposes, e.g. mobility management
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data. The method includes: the backend obtains raw mobile terminal signaling data and raw social software check-in data and preprocesses them, generating to-be-processed signaling data and to-be-processed check-in data that conform to a specific format; the backend extracts activity point trajectory data from the to-be-processed signaling data; prior information on group activity patterns is constructed and learned; activity location data are obtained from the activity point trajectory data; and the backend, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, labels the activity point trajectory with semantic information using a Bayesian model and generates an activity spatio-temporal trajectory chain. The invention infers individual activities with a Bayesian model and accounts for the influence of the activity type at the previous moment on the activity type at the next moment in the spatio-temporal activity trajectory, enabling large-scale, accurate, fast, and efficient extraction and collection of massive group activity data.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data.
Background
Traditional activity collection methods rely on activity diaries or activity surveys; sample sizes are small, collection takes a long time, and the process is labor-intensive. The explosion of spatio-temporal trajectory data provides a new tool for collecting large-scale group activities. Research on spatio-temporal data analysis focuses mainly on identifying individual activities in real space, especially travel activities, and lacks extraction of the essential attribute information of activities. A group activity extraction method that fuses multi-source spatio-temporal trajectory data is needed to lay a data foundation for urban science research based on massive activity data. Although spatio-temporal trajectory data (such as mobile phone signaling data, vehicle GPS data, and social check-in data) contain rich temporal and location information, they are comparatively poor in semantic information and differ in spatial and temporal resolution, so they cannot directly provide group activity information.
Therefore, the existing technology needs to be improved and developed.
Summary of the invention
In view of the deficiencies of the prior art, the object of the present invention is to provide a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data.
The technical solution of the invention is as follows:
A group activity data collection method based on multi-source spatio-temporal trajectory data, wherein the method includes:
A. The backend obtains raw mobile terminal signaling data and raw social software check-in data, preprocesses each of them, and generates corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format;
B. The backend extracts activity points from the to-be-processed signaling data according to preset temporal and spatial rules to obtain activity point trajectory data; constructs and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data;
C. The backend, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, labels the activity point trajectory with semantic information using a Bayesian model and generates an activity spatio-temporal trajectory chain.
The group activity data collection method based on multi-source spatio-temporal trajectory data, wherein step A specifically includes:
A1. The backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location lies outside the predefined range, and removing the data of users whose number of points is below or above certain thresholds, thereby generating preprocessed signaling data;
A2. The backend obtains the raw social software check-in data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location lies outside the study range, removing the data of users whose check-in count falls within a certain range, and removing the data of users who checked in at only one place, thereby generating preprocessed check-in data;
A3. The spatial resolution of the preprocessed signaling data and of the preprocessed check-in data is converted to the resolution of a predetermined regular grid scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
The group activity data collection method based on multi-source spatio-temporal trajectory data, wherein extracting activity points from the to-be-processed signaling data according to preset temporal and spatial rules in step B to obtain the activity point trajectory data specifically includes:
B11. The backend obtains the to-be-processed signaling data and sorts it by person and time according to a specific time rule, obtaining each person's time-ordered trajectory;
B12. Based on the person's time-ordered trajectory, the times at which the person enters and leaves each specific position are calculated; each position the person enters in turn is set as an activity point, and the first position the person enters is set as the first activity point in the activity point trajectory;
B13. The spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory;
B14. The candidate activity points in the candidate activity point trajectory are obtained; when the difference between the entry time and the departure time of a candidate activity point is less than a second set threshold, that candidate activity point is removed from the candidate activity point trajectory, after which the activity point trajectory data are generated.
The group activity data collection method based on multi-source spatio-temporal trajectory data, wherein constructing and learning prior information on group activity patterns from the check-in category information in the to-be-processed check-in data in step B specifically includes:
B21. From the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods within a day, the intensity probability distribution of different group activities over the day is calculated;
B22. From users' check-in data, the activity transition probability distribution of different group activities at different times is calculated;
B23. From users' check-in data, the probability distribution of different group activities being carried out in different areas is calculated.
The group activity data collection method based on multi-source spatio-temporal trajectory data, wherein obtaining activity location data from the activity point trajectory data in step B specifically includes:
B31. Time identification windows for people's activity locations are preset, denoted the first activity window and the second activity window;
B32. The person's activity point trajectory data are obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity point is taken as a candidate activity location for the activity location corresponding to that window;
B33. The candidate activity location with the longest matched time is taken as the user's activity location data.
The group activity data collection method based on multi-source spatio-temporal trajectory data, wherein step C specifically includes:
C1. According to the Bayesian model, given the position, the time, and the activity type at the previous moment, a probability formula for carrying out a given type of activity at the next moment is generated;
C2. For each activity point in the activity point trajectory data, the probability of engaging in each activity is calculated, and the activity with the maximum probability is assigned as the maximum-probability activity type of that activity point;
C3. After all activity points in the activity point trajectory data have been labeled, the activity spatio-temporal trajectory chain is output.
A group activity data collection system based on multi-source spatio-temporal trajectory data, wherein the system includes:
A preprocessing module, used by the backend to obtain raw mobile terminal signaling data and raw social software check-in data, preprocess each of them, and generate corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format;
An activity location data acquisition module, used by the backend to extract activity points from the to-be-processed signaling data according to preset temporal and spatial rules to obtain activity point trajectory data; to construct and learn prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity point trajectory data;
A semantic labeling module, used by the backend to label the activity point trajectory with semantic information using a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, and to generate an activity spatio-temporal trajectory chain.
The group activity data collection system based on multi-source spatio-temporal trajectory data, wherein the preprocessing module specifically includes:
A signaling data processing unit, used by the backend to obtain the raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location lies outside the predefined range, and removing the data of users whose number of points is below or above certain thresholds, thereby generating preprocessed signaling data;
A check-in data processing unit, used by the backend to obtain the raw social software check-in data and perform quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location lies outside the study range, removing the data of users whose check-in count falls within a certain range, and removing the data of users who checked in at only one place, thereby generating preprocessed check-in data;
A resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and of the preprocessed check-in data to the resolution of a predetermined regular grid scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
The group activity data collection system based on multi-source spatio-temporal trajectory data, wherein the activity location data acquisition module specifically includes:
A sorting unit, used by the backend to obtain the to-be-processed signaling data and sort it by person and time according to a specific time rule, obtaining each person's time-ordered trajectory;
An activity point marking unit, used to calculate, from the person's time-ordered trajectory, the times at which the person enters and leaves each specific position, to set each position the person enters in turn as an activity point, and to set the first position the person enters as the first activity point in the activity point trajectory;
A candidate activity point trajectory generation unit, used to calculate the spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory;
An activity point trajectory data processing unit, used to obtain the candidate activity points in the candidate activity point trajectory and, when the difference between the entry time and the departure time of a candidate activity point is less than a second set threshold, to remove that candidate activity point from the candidate activity point trajectory and then generate the activity point trajectory data;
A first probability calculation unit, used to calculate the intensity probability distribution of different group activities over the day from the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods within a day;
A second probability calculation unit, used to calculate, from users' check-in data, the activity transition probability distribution of different group activities at different times;
A third probability calculation unit, used to calculate, from users' check-in data, the probability distribution of different group activities being carried out in different areas;
A presetting unit, used to preset the time identification windows for people's activity locations, denoted the first activity window and the second activity window;
A candidate activity location determination unit, used to obtain the person's activity point trajectory data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity point is taken as a candidate activity location for the activity location corresponding to that window;
An activity location data acquisition unit, used to take the candidate activity location with the longest matched time as the user's activity location data.
The group activity data collection system based on multi-source spatio-temporal trajectory data, wherein the semantic labeling module specifically includes:
A fourth probability calculation unit, used to generate, according to the Bayesian model and given the position, the time, and the activity type at the previous moment, a probability formula for carrying out a given type of activity at the next moment;
A maximum-probability activity type marking unit, used to calculate, for each activity point in the activity point trajectory data, the probability of engaging in each activity, and to assign the activity with the maximum probability as the maximum-probability activity type of that activity point;
An activity spatio-temporal trajectory chain generation unit, used to output the activity spatio-temporal trajectory chain after all activity points in the activity point trajectory data have been labeled.
The invention provides a method and system for collecting group activity data based on multi-source spatio-temporal trajectory data. The invention infers individual activities with a Bayesian model and accounts for the influence of the activity type at the previous moment on the activity type at the next moment in the spatio-temporal activity trajectory, enabling large-scale, accurate, fast, and efficient extraction and collection of massive group activity data.
Description of the drawings
Fig. 1 is a flow chart of a preferred embodiment of the group activity data collection method based on multi-source spatio-temporal trajectory data of the invention.
Fig. 2 is a functional block diagram of a preferred embodiment of the group activity data collection system based on multi-source spatio-temporal trajectory data of the invention.
Specific embodiment
To make the purpose, technical solution, and effects of the invention clearer and more definite, the invention is described in further detail below. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
The invention provides a flow chart of a preferred embodiment of the group activity data collection method based on multi-source spatio-temporal trajectory data, as shown in Fig. 1, wherein the method includes:
Step S100. The backend obtains raw mobile terminal signaling data and raw social software check-in data, preprocesses each of them, and generates corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format. The mobile terminal is preferably a mobile phone.
In a further embodiment, step S100 specifically includes:
Step S101. The backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location lies outside the predefined range, and removing the data of users whose number of points is below or above certain thresholds, thereby generating preprocessed signaling data;
Step S102. The backend obtains the raw social software check-in data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location lies outside the study range, removing the data of users whose check-in count falls within a certain range, and removing the data of users who checked in at only one place, thereby generating preprocessed check-in data;
Step S103. The spatial resolution of the preprocessed signaling data and of the preprocessed check-in data is converted to the resolution of a predetermined regular grid scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
In specific implementation, the mobile phone signaling data and the social check-in data are preprocessed so that they meet the requirements of subsequent processing. The details are:
Quality cleaning is performed on the mobile phone signaling data, including removing duplicate data, removing data with missing attributes, removing data whose time or location lies outside the study range, and removing the data of users whose number of points is below or above certain thresholds. The choice of thresholds depends on the specific data type, data format, and data quality. Preferably, users with fewer than 3 points per day or more than 100 points per day are removed.
Quality cleaning is performed on the social check-in data, including removing duplicate data, removing data with missing attributes, and removing data whose time or location lies outside the study range; removing the data of users with fewer than 2 or more than 100 check-ins; and removing the data of users who checked in at only one place.
For multi-source spatio-temporal trajectory data, the influence of spatial resolution is taken into account. The spatial resolutions of the mobile phone signaling data and the social check-in data are uniformly converted to the scale of a regular grid. The grid scale generally depends on the spatial resolution of these two types of data themselves. The preferred scale is 500 m × 500 m.
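The cleaning and grid-conversion steps above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the record layout `(user_id, timestamp, x, y)`, the use of projected metric coordinates, the function names `clean_records` and `to_grid`, and the assumption that the input covers a single day are all hypothetical; only the 3 and 100 point limits and the 500 m grid size come from the text.

```python
from collections import Counter

GRID_SIZE_M = 500.0  # preferred grid scale named in the text


def clean_records(records, t_min, t_max, bbox, min_points=3, max_points=100):
    """Quality-clean one day of (user_id, timestamp, x, y) records.

    Assumes x and y are metres in a projected coordinate system and that
    bbox is (x_min, y_min, x_max, y_max); both are illustrative choices.
    """
    x_min, y_min, x_max, y_max = bbox
    seen = set()
    kept = []
    for rec in records:
        if rec in seen:                       # remove duplicate data
            continue
        seen.add(rec)
        if any(v is None for v in rec):       # remove data with missing attributes
            continue
        _, t, x, y = rec
        if not (t_min <= t <= t_max):         # time outside the study range
            continue
        if not (x_min <= x <= x_max and y_min <= y <= y_max):  # space outside range
            continue
        kept.append(rec)
    # remove users whose number of points is below or above the thresholds
    counts = Counter(r[0] for r in kept)
    return [r for r in kept if min_points <= counts[r[0]] <= max_points]


def to_grid(x, y, size=GRID_SIZE_M):
    """Map a metric coordinate to the (column, row) index of its grid cell."""
    return (int(x // size), int(y // size))
```

Applying `to_grid` to both the signaling records and the check-in records puts the two sources on the same spatial units before any further processing.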
Step S200. The backend extracts activity points from the to-be-processed signaling data according to preset temporal and spatial rules to obtain activity point trajectory data; constructs and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data.
Further, extracting activity points from the to-be-processed signaling data according to preset temporal and spatial rules in step S200 to obtain the activity point trajectory data specifically includes:
Step S211. The backend obtains the to-be-processed signaling data and sorts it by person and time according to a specific time rule, obtaining each person's time-ordered trajectory;
Step S212. Based on the person's time-ordered trajectory, the times at which the person enters and leaves each specific position are calculated; each position the person enters in turn is set as an activity point, and the first position the person enters is set as the first activity point in the activity point trajectory;
Step S213. The spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory;
Step S214. The candidate activity points in the candidate activity point trajectory are obtained; when the difference between the entry time and the departure time of a candidate activity point is less than a second set threshold, that candidate activity point is removed from the candidate activity point trajectory, after which the activity point trajectory data are generated.
In specific implementation, the activity points of each person are extracted from the preprocessed mobile phone signaling data to obtain the person's activity point trajectory. Activity points are extracted mainly by setting temporal and spatial rules. The specific method is as follows:
The generated mobile phone signaling data are sorted by person and time to obtain each person's time-ordered trajectory;
Using the person's time-ordered trajectory, the times at which the person enters and leaves each position (grid cell) are calculated, and the first position is set as the first activity point in the activity point trajectory;
Moving forward in time, the spatial distance and time difference between each point in the time-ordered trajectory and the activity point in the existing activity point trajectory are calculated. If the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise the point is set as a new activity point. This continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory. The preferred range of the distance threshold is 500 m to 1000 m.
For each candidate activity point in the candidate activity point trajectory, if the difference between its entry time and departure time is less than a certain threshold, the point is not considered an activity point and is removed from the candidate activity point trajectory, which gives the final activity point trajectory. Preferably, this threshold ranges from 1 hour to 3 hours.
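The extraction rule above can be sketched for a single person as follows. This is a hedged illustration under stated assumptions: each new point is compared with the most recent activity point, that activity point is anchored at the first position that opened it, and the time-gap threshold `gap_thresh_s` is a hypothetical value because the text gives no preferred figure for it. The function name and the `(timestamp_s, x_m, y_m)` record layout are likewise illustrative.

```python
from math import hypot


def extract_activity_points(track, dist_thresh_m=500.0,
                            gap_thresh_s=3600.0, min_stay_s=3600.0):
    """Extract activity points from one person's (timestamp_s, x_m, y_m) track.

    dist_thresh_m follows the 500 m to 1000 m range in the text and
    min_stay_s the 1 h to 3 h range; gap_thresh_s is an assumed value.
    """
    candidates = []
    for t, x, y in sorted(track):             # time-ordered trajectory
        if candidates:
            cur = candidates[-1]              # the existing activity point
            near = hypot(x - cur["x"], y - cur["y"]) < dist_thresh_m
            recent = (t - cur["leave"]) < gap_thresh_s
            if near and recent:
                cur["leave"] = t              # merge the point into the activity point
                continue
        # the first position, or a point failing either test, opens a new activity point
        candidates.append({"x": x, "y": y, "enter": t, "leave": t})
    # drop candidates whose stay (leave minus enter) is shorter than the second threshold
    return [c for c in candidates if c["leave"] - c["enter"] >= min_stay_s]
```

Anchoring each activity point at its first position keeps the sketch short; a centroid of the merged points would be an equally plausible choice that the text does not rule out.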
Further, constructing and learning prior information on group activity patterns from the check-in category information in the to-be-processed check-in data in step S200 specifically includes:
Step S221. From the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods within a day, the intensity probability distribution of different group activities over the day is calculated;
Step S222. From users' check-in data, the activity transition probability distribution of different group activities at different times is calculated;
Step S223. From users' check-in data, the probability distribution of different group activities being carried out in different areas is calculated.
In specific implementation, the rich check-in category information in the preprocessed social check-in data is used to construct and learn prior information on group activity patterns. The specific method is as follows:
From the check-in categories provided by the social check-in platform and the total volume of users' check-in data in different time periods within a day, the intensity probability distribution of different group activities over the day, Pr(AT_i | t), is calculated as:
Pr(AT_i | t) = checkins(AT_i, t) / Σ_t checkins(AT_i, t)
where AT_i denotes activity type i, checkins(AT_i, t) is the number of check-ins of activity type i at time t, and Σ_t checkins(AT_i, t) is the number of check-ins of activity type i over all times in the day.
From users' check-in trajectories, the activity transition probability distribution of different group activities at different times is calculated, expressed as Pr(AT_i, t | AT_j, t-1), where i and j denote activity types and t denotes time. (AT_i, t) denotes engaging in activity type i at time t, and (AT_j, t-1) denotes engaging in activity type j at time t-1. Pr(AT_i, t | AT_j, t-1) is the probability of engaging in activity i at time t given that activity j was engaged in at the previous time t-1. Pr(X) denotes the probability of event X.
From users' check-in trajectories, the probability distribution of different group activities being carried out in different grid areas is calculated, expressed as Pr(Grid_m | AT_i, t), where m is the grid index, Grid_m denotes the m-th grid cell, i is the activity type, and t is the time.
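A minimal sketch of learning the three prior distributions from check-in records follows. It is illustrative only: the record layout `(user_id, hour, grid_id, activity)`, the function name `learn_priors`, the choice of hourly time bins, and the treatment of each user's consecutive check-ins as the "previous" and "current" moments are assumptions. The intensity is assumed to be the share of an activity's check-ins falling at a given time, as the definitions of checkins(AT_i, t) and its sum over t suggest; the normalisation used for the transition probabilities is also an assumption, since the text does not spell it out.

```python
from collections import Counter, defaultdict


def learn_priors(checkins):
    """Learn intensity, transition and spatial priors from check-ins.

    checkins: list of (user_id, hour, grid_id, activity) tuples.
    Returns three dicts keyed by (activity, hour), (activity, hour,
    previous_activity) and (grid_id, activity, hour) respectively.
    """
    by_act_hour = Counter((a, h) for _, h, _, a in checkins)
    by_act = Counter(a for _, _, _, a in checkins)
    # intensity: check-ins of activity a at hour h over all check-ins of a in the day
    intensity = {(a, h): n / by_act[a] for (a, h), n in by_act_hour.items()}

    # spatial: share of the (activity, hour) check-ins that fall in each grid cell
    by_grid = Counter((g, a, h) for _, h, g, a in checkins)
    spatial = {(g, a, h): n / by_act_hour[(a, h)]
               for (g, a, h), n in by_grid.items()}

    # transition: consecutive check-ins of the same user form (previous, current) pairs
    per_user = defaultdict(list)
    for u, h, _, a in checkins:
        per_user[u].append((h, a))
    pair_counts, origin_counts = Counter(), Counter()
    for seq in per_user.values():
        seq.sort()
        for (_, prev_a), (h, cur_a) in zip(seq, seq[1:]):
            pair_counts[(cur_a, h, prev_a)] += 1
            origin_counts[(h, prev_a)] += 1
    transition = {k: n / origin_counts[(k[1], k[2])]
                  for k, n in pair_counts.items()}
    return intensity, transition, spatial
```

The returned dictionaries are sparse, so any combination never observed in the check-in data is simply absent and needs a default value when it is looked up later.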
In a further embodiment, obtaining activity location data from the activity point trajectory data in step S200 specifically includes:
Step S231. Time identification windows for people's activity locations are preset, denoted the first activity window and the second activity window;
Step S232. The person's activity point trajectory data are obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity point is taken as a candidate activity location for the activity location corresponding to that window;
Step S233. The candidate activity location with the longest matched time is taken as the user's activity location data.
In specific implementation, the obtained activity point trajectory data are used to detect each person's home and work activities. The specific method is as follows:
Based on common sense, the identification windows for home activity and work activity are set to 00:00 to 07:00 and 09:00 to 17:00 respectively;
For a person's activity point trajectory data, the duration of each activity point is matched against the two identification windows above. If the duration of the activity point falls within an identification window and accounts for 50% or more of the total length of that window, the match is considered successful and the point becomes a candidate home or work activity location;
The home or work activity location with the longest matched time is taken as the user's home or work activity location. If no match succeeds, no home or work activity location is considered to have been found for that user.
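The window-matching rule above can be sketched as follows. This is an illustrative sketch, assuming each activity point is given as `(grid_id, enter_hour, leave_hour)` within a single day and that "matched time" means the overlap between the stay and the window; the names `WINDOWS`, `overlap_hours`, and `detect_home_work` are hypothetical.

```python
# Identification windows from the text, as (start_hour, end_hour)
WINDOWS = {"home": (0.0, 7.0), "work": (9.0, 17.0)}


def overlap_hours(enter_h, leave_h, window):
    """Hours of the stay [enter_h, leave_h] that fall inside the window."""
    start, end = window
    return max(0.0, min(leave_h, end) - max(enter_h, start))


def detect_home_work(activity_points, windows=WINDOWS, min_share=0.5):
    """Return {'home': grid_id or None, 'work': grid_id or None}.

    activity_points: list of (grid_id, enter_hour, leave_hour) for one day.
    """
    result = {}
    for label, window in windows.items():
        length = window[1] - window[0]
        best = None  # (grid_id, matched_hours) of the longest successful match
        for grid, enter_h, leave_h in activity_points:
            matched = overlap_hours(enter_h, leave_h, window)
            # a match needs 50% or more of the window length
            if matched / length >= min_share and (best is None or matched > best[1]):
                best = (grid, matched)
        result[label] = best[0] if best else None
    return result
```

With multi-day observations, the matched hours per grid cell would presumably be accumulated across days before picking the longest, which this single-day sketch leaves out.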
Step S300. The backend, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, labels the activity point trajectory with semantic information using a Bayesian model and generates an activity spatio-temporal trajectory chain.
Using the obtained activity point trajectory, the obtained spatio-temporal prior information on group activities, and the obtained home and work activity information of each person, the activity point trajectory is labeled with semantic information based on the Bayesian model. The labeled activity information mainly includes home, work, and other (for example, entertainment, shopping, study, leisure, or travel), and the result is the activity spatio-temporal trajectory chain.
The resulting spatio-temporal activity trajectory chain is of great significance for research on urban planning and on the dynamic change of urban functional zones. Based on changes in spatio-temporal activity, adjustments and predictions for the dynamic change of planned urban functional zones can be made quickly and in time.
In a further embodiment, step S300 specifically includes:
Step S301. According to the Bayesian model, given the position, the time, and the activity type at the previous moment, a probability formula for carrying out a given type of activity at the next moment is generated;
Step S302. For each activity point in the activity point trajectory data, the probability of engaging in each activity is calculated, and the activity with the maximum probability is assigned as the maximum-probability activity type of that activity point;
Step S303. After all activity points in the activity point trajectory data have been labeled, the activity spatio-temporal trajectory chain is output.
In a specific implementation, according to the Bayesian model, given a specific position, the time, and the activity type at the previous moment, the probability of carrying out a certain type of activity at the next moment is:
$$\Pr(AT_i \mid \mathrm{Grid}_m, t, AT_j) \propto \Pr(\mathrm{Grid}_m \mid AT_i, t, AT_j)\,\Pr(AT_i \mid t, AT_j)\,\Pr(AT_j \mid t) \tag{1}$$
where $m$ is the grid index, $j$ is the activity type at the previous moment, $t$ is the current time, and $i$ is the activity type at the current time.
For $\Pr(\mathrm{Grid}_m \mid AT_i, t, AT_j)$, $AT_j$ and $\mathrm{Grid}_m$ are taken to be conditionally independent, so the term simplifies to:
$$\Pr(\mathrm{Grid}_m \mid AT_i, t, AT_j) = \Pr(\mathrm{Grid}_m \mid AT_i, t) \tag{2}$$
The term $\Pr(AT_i \mid t, AT_j)$ can be rewritten as:
$$\Pr(AT_i \mid t, AT_j) = \Pr(AT_{i,t} \mid AT_{j,t-1}) \tag{3}$$
Combining formulas (2) and (3), formula (1) becomes:
$$\Pr(AT_i \mid \mathrm{Grid}_m, t, AT_j) \propto \Pr(\mathrm{Grid}_m \mid AT_i, t)\,\Pr(AT_{i,t} \mid AT_{j,t-1})\,\Pr(AT_j \mid t) \tag{5}$$
The activity points of the trajectory are fed into formula (5) one after another, the probability of each candidate activity is calculated, and the activity with the maximum probability is assigned to the activity point as its most probable activity type.
In particular, for grid positions that have already been labeled with the at-home or working activity type, $\Pr(\mathrm{Grid}_m \mid AT_i, t)$ is set to 1 and $AT_{j,t-1}$ is set to $AT_{home}$ or $AT_{working}$, and processing continues with the labeling of the next activity point. This repeats until every activity point in the activity-point trajectory has been labeled, and the activity space-time chain is output. $AT_{home}$ denotes the at-home activity type and $AT_{working}$ denotes the working activity type.
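The labeling pass described by formula (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the function name `label_trajectory`, the dictionary layouts, the default starting activity, and the toy probabilities are all hypothetical, and the special case is read as "a grid already known to be home or work is labeled directly and becomes the previous activity for the next point".

```python
ACTIVITIES = ("home", "working", "other")


def label_trajectory(points, p_grid, p_trans, p_prev, known_grids, first_prev="home"):
    """Label each activity point with its most probable activity type.

    points      : list of (grid_id, t) in time order
    p_grid      : {(grid_id, activity, t): Pr(Grid_m | AT_i, t)}
    p_trans     : {(prev_activity, activity, t): Pr(AT_{i,t} | AT_{j,t-1})}
    p_prev      : {(prev_activity, t): Pr(AT_j | t)}
    known_grids : {grid_id: "home" or "working"} from home/work detection
    """
    labels = []
    prev = first_prev  # assumed starting activity; the text does not specify one
    for grid, t in points:
        if grid in known_grids:
            # Special case: the grid is already labeled home or working.
            best = known_grids[grid]
        else:
            # Right-hand side of formula (5), evaluated for every activity type.
            # A missing Pr(AT_j | t) defaults to 1.0; it is constant across
            # candidate activities, so it does not change the arg max.
            scores = {
                act: p_grid.get((grid, act, t), 0.0)
                * p_trans.get((prev, act, t), 0.0)
                * p_prev.get((prev, t), 1.0)
                for act in ACTIVITIES
            }
            best = max(ACTIVITIES, key=scores.get)
        labels.append(best)
        prev = best  # the label becomes AT_{j,t-1} for the next activity point
    return labels


# Toy inputs (made up for illustration only).
points = [(1, 8), (2, 9), (3, 12)]
known_grids = {1: "home", 2: "working"}
p_grid = {(3, "home", 12): 0.1, (3, "working", 12): 0.3, (3, "other", 12): 0.6}
p_trans = {
    ("working", "home", 12): 0.1,
    ("working", "working", 12): 0.4,
    ("working", "other", 12): 0.5,
}
p_prev = {("working", 12): 0.4}
chain = label_trajectory(points, p_grid, p_trans, p_prev, known_grids)
```

Because the last factor of formula (5) does not depend on the candidate activity, dropping it would give the same labels; it is kept here only to mirror the formula.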
The activity-point trajectory extraction method depends on the specific data type and on the spatial and temporal resolution of the data, and is not limited to the method described in the present invention;
The home and work activity detection method is constrained by the observation duration of the space-time data, and the choice of thresholds is not limited to the method described in the present invention;
Constructing and learning the prior information of group activity rules is not limited to social media check-in data; resident survey data, GPS trajectory data, volunteer data, and similar sources may also be used.
The present invention proposes a new group activity collection method based on multi-source space-time trajectory data. It infers individual activities with a Bayesian model, addresses the problems of existing methods (time-consuming and labor-intensive collection, high cost, small sample size), and achieves large-scale, accurate, fast, and efficient extraction and collection of massive group activity data. The group activity inference of the invention considers not only the constraints that factors such as time and position in urban space place on human activity, but also the influence of the activity type at the previous moment on the activity type at the next moment within the space-time activity trajectory, so that activities are inferred in the context of the human space-time activity chain.
The present invention also provides a functional block diagram of a preferred embodiment of a group activity data collection system based on multi-source space-time trajectory data. As shown in Fig. 2, the system includes:
A preprocessing module 100, used for the back end to obtain raw mobile terminal signaling data and raw social software check-in data, and to preprocess each of them to generate the corresponding to-be-processed signaling data and to-be-processed check-in data in a specific format; details are as described in the method embodiment.
An activity location data acquisition module 200, used for the back end to extract activity points from the to-be-processed signaling data according to preset time and space rules, obtaining activity-point trajectory data; to construct and learn the prior information of group activity rules from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity-point trajectory data; details are as described in the method embodiment.
A semantic labeling module 300, used for the back end to label the semantic information of the activity-point trajectory with a Bayesian model, based on the activity-point trajectory data, the prior information of group activity rules, and the activity location data, generating the activity space-time trajectory chain; details are as described in the method embodiment.
In the group activity data collection system based on multi-source space-time trajectory data, the preprocessing module specifically includes:
A signaling data processing unit, used for the back end to obtain the raw mobile terminal signaling data and clean it for quality: removing duplicate records, records with missing attributes, records whose time or location falls outside the predefined range, and the data of users whose number of points is below or above given thresholds, thereby generating preprocessed signaling data; details are as described in the method embodiment.
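The cleaning rules listed for the signaling data can be illustrated with a short Python sketch. The record layout (dictionaries with `user`, `time`, `x`, `y`), the function name `clean_signaling`, and every threshold below are assumptions made for the example; the patent does not fix any of them.

```python
from collections import Counter


def clean_signaling(records, t_range, x_range, y_range, min_points, max_points):
    """Apply the four cleaning rules to a list of signaling records."""
    seen = set()
    kept = []
    for r in records:
        key = (r.get("user"), r.get("time"), r.get("x"), r.get("y"))
        if None in key:
            continue  # rule: drop records with missing attributes
        if key in seen:
            continue  # rule: drop duplicate records
        seen.add(key)
        _, t, x, y = key
        in_time = t_range[0] <= t <= t_range[1]
        in_space = x_range[0] <= x <= x_range[1] and y_range[0] <= y <= y_range[1]
        if not (in_time and in_space):
            continue  # rule: drop records outside the predefined time/space range
        kept.append(r)
    # rule: drop users whose number of remaining points is too small or too large
    per_user = Counter(r["user"] for r in kept)
    return [r for r in kept if min_points <= per_user[r["user"]] <= max_points]


# Toy records (made up for illustration only).
raw = [
    {"user": "u1", "time": 1, "x": 0.5, "y": 0.5},
    {"user": "u1", "time": 1, "x": 0.5, "y": 0.5},    # duplicate
    {"user": "u1", "time": 2, "x": 0.6, "y": 0.5},
    {"user": "u1", "time": None, "x": 0.6, "y": 0.5},  # missing attribute
    {"user": "u1", "time": 99, "x": 0.6, "y": 0.5},    # outside time range
    {"user": "u2", "time": 3, "x": 0.1, "y": 0.1},    # user with too few points
]
cleaned = clean_signaling(raw, (0, 10), (0.0, 1.0), (0.0, 1.0), 2, 100)
```

The per-user count is taken after the record-level filters, which is one possible ordering; counting on the raw data instead would be an equally defensible reading.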
A check-in data processing unit, used for the back end to obtain the raw social software check-in data and clean it for quality: removing duplicate records, records with missing attributes, records whose time or location falls outside the study range, the data of users whose number of check-ins falls within a certain range, and the data of users who checked in at only one place, thereby generating preprocessed check-in data; details are as described in the method embodiment.
A resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a regular grid of predetermined scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data; details are as described in the method embodiment.
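One simple way to bring both data sources to a common regular grid is to map each coordinate to the index of the cell that contains it. The sketch below assumes projected coordinates in metres, square cells, and row-major cell numbering; the function name `to_grid` and all parameter values are hypothetical.

```python
import math


def to_grid(x, y, x_min, y_min, cell_size, n_cols):
    """Return the row-major index of the square grid cell containing (x, y)."""
    col = math.floor((x - x_min) / cell_size)
    row = math.floor((y - y_min) / cell_size)
    return row * n_cols + col


# A 10-column grid of 500 m cells with its origin at (0, 0).
cell_a = to_grid(1250.0, 300.0, 0.0, 0.0, 500.0, 10)
cell_b = to_grid(4999.0, 999.0, 0.0, 0.0, 500.0, 10)
```

Signaling records and check-in records that fall in the same cell then share one grid index, which is what the later probability tables are keyed on.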
In the group activity data collection system based on multi-source space-time trajectory data, the activity location data acquisition module specifically includes:
A sorting unit, used for the back end to obtain the to-be-processed signaling data and sort it by person and by time according to a specific time rule, obtaining each person's sequential trajectory; details are as described in the method embodiment.
An activity point marking unit, used to calculate, from a person's sequential trajectory, the times at which the person enters and leaves specific positions, to set each position the person enters as an activity point in turn, and to set the first position the person enters as the first activity point in the activity-point trajectory; details are as described in the method embodiment.
A candidate activity-point trajectory generation unit, used to calculate the spatial distance and the time difference between each point in the sequential trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise the point is set as a new activity point; this continues until all points in the sequential trajectory have been processed, yielding the candidate activity-point trajectory; details are as described in the method embodiment.
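The merge-or-start-new rule of this unit can be sketched as a single pass over a time-sorted trajectory. The sketch makes several assumptions that the text leaves open: each point is compared with the most recent activity point only, the activity point keeps the coordinates of its first observation, distance is Euclidean, and the names `build_candidate_points`, `max_dist`, and `max_gap` are hypothetical.

```python
import math


def build_candidate_points(track, max_dist, max_gap):
    """Group a time-sorted list of (t, x, y) observations into candidate activity points."""
    candidates = []
    for t, x, y in track:
        if candidates:
            cur = candidates[-1]
            dist = math.hypot(x - cur["x"], y - cur["y"])
            gap = t - cur["leave"]
            if dist < max_dist and gap < max_gap:
                cur["leave"] = t  # close in space and time: extend the activity point
                continue
        # first observation, or too far away in space or time: start a new activity point
        candidates.append({"x": x, "y": y, "enter": t, "leave": t})
    return candidates


# Toy trajectory: three observations near the origin, then two about 1 km away.
track = [(0, 0, 0), (10, 5, 0), (20, 3, 4), (30, 1000, 0), (40, 1002, 0)]
candidates = build_candidate_points(track, max_dist=100, max_gap=60)
```

Each resulting dictionary carries an entry time and a departure time, which is the information the subsequent duration filter needs.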
An activity-point trajectory data processing unit, used to obtain the candidate activity points in the candidate activity-point trajectory; when the difference between a candidate activity point's entry time and departure time is detected to be less than a second set threshold, that candidate activity point is removed from the candidate activity-point trajectory, after which the activity-point trajectory data is generated; details are as described in the method embodiment.
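The duration filter amounts to dropping every candidate whose stay is shorter than the second threshold. A minimal sketch, assuming candidates are dictionaries with `enter` and `leave` times in the same unit as the hypothetical `min_duration` parameter:

```python
def drop_short_stays(candidates, min_duration):
    """Keep only candidate activity points whose stay lasts at least min_duration."""
    return [c for c in candidates if c["leave"] - c["enter"] >= min_duration]


# Toy candidates (made up for illustration only).
stays = [
    {"grid": 1, "enter": 0, "leave": 20},
    {"grid": 2, "enter": 30, "leave": 40},
    {"grid": 3, "enter": 50, "leave": 50},
]
activity_points = drop_short_stays(stays, min_duration=15)
```

Points that are passed through briefly, such as cells crossed while travelling, are removed this way, leaving only places where the person actually stayed.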
A first probability calculation unit, used to calculate the intensity probability distribution of different group activities over one day, based on the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of the day; details are as described in the method embodiment.
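One plausible reading of this unit is an empirical estimate of $\Pr(AT \mid t)$: for each time period, the share of check-ins that belong to each activity category. The sketch below assumes hourly periods and check-ins already mapped to activity types; `activity_intensity` is a hypothetical name.

```python
from collections import Counter, defaultdict


def activity_intensity(checkins):
    """Estimate Pr(activity | hour) from an iterable of (hour, activity) check-ins."""
    per_hour = defaultdict(Counter)
    for hour, activity in checkins:
        per_hour[hour][activity] += 1
    return {
        hour: {act: n / sum(counts.values()) for act, n in counts.items()}
        for hour, counts in per_hour.items()
    }


# Toy check-ins (made up for illustration only).
intensity = activity_intensity(
    [(9, "working"), (9, "working"), (9, "other"), (20, "home")]
)
```

Hours with no check-ins are simply absent from the result, so a caller would need a fallback such as a uniform distribution for those periods.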
A second probability calculation unit, used to calculate, from users' check-in data, the activity transition probability distribution of different group activities at different times; details are as described in the method embodiment.
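The transition term $\Pr(AT_{i,t} \mid AT_{j,t-1})$ can be estimated by counting consecutive check-ins of the same user. This sketch assumes each user's check-ins are given as (hour, activity) pairs within one day and keys the table on the previous activity and the hour of the later check-in; the name `transition_probabilities` and the data layout are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(user_checkins):
    """Estimate Pr(activity at t | previous activity) from per-user check-in sequences."""
    counts = defaultdict(Counter)
    for sequence in user_checkins.values():
        ordered = sorted(sequence)  # order each user's check-ins by time
        for (_, prev_act), (hour, act) in zip(ordered, ordered[1:]):
            counts[(prev_act, hour)][act] += 1
    return {
        key: {act: n / sum(c.values()) for act, n in c.items()}
        for key, c in counts.items()
    }


# Toy check-ins (made up for illustration only).
transitions = transition_probabilities(
    {
        "u1": [(8, "home"), (9, "working"), (18, "other")],
        "u2": [(8, "home"), (9, "other")],
    }
)
```

With sparse check-in data many (previous activity, hour) pairs will be unobserved, so some smoothing or coarser time bins would likely be needed in practice.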
A third probability calculation unit, used to calculate, from users' check-in data, the probability distribution of different group activities being carried out in different regions; details are as described in the method embodiment.
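The spatial term $\Pr(\mathrm{Grid}_m \mid AT_i, t)$ can be estimated as the share of check-ins of a given activity and time period that fall in each grid cell. The sketch assumes check-ins already carry a grid index and an hourly time bin; `grid_likelihood` is a hypothetical name.

```python
from collections import Counter, defaultdict


def grid_likelihood(checkins):
    """Estimate Pr(grid | activity, hour) from (grid_id, hour, activity) check-ins."""
    counts = defaultdict(Counter)
    for grid, hour, activity in checkins:
        counts[(activity, hour)][grid] += 1
    return {
        key: {grid: n / sum(c.values()) for grid, n in c.items()}
        for key, c in counts.items()
    }


# Toy check-ins (made up for illustration only).
likelihood = grid_likelihood(
    [(1, 12, "other"), (1, 12, "other"), (2, 12, "other"), (2, 12, "working")]
)
```

For each (activity, hour) key the values sum to 1 over grid cells, which matches the role of this term as a likelihood of the location given the activity.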
A presetting unit, used to preset the time identification windows of a person's activity locations, denoted the first activity window and the second activity window; details are as described in the method embodiment.
A candidate activity location determination unit, used to obtain a person's activity-point trajectory data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of that window's total length, the activity point is taken as a candidate activity location for the activity location corresponding to that window; details are as described in the method embodiment.
An activity location data acquisition unit, used to take the candidate activity location with the longest matched time as the user's activity location data; details are as described in the method embodiment.
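The window-matching rule of the last three units can be sketched as an interval-overlap test followed by an arg max. The window boundaries used here (00:00 to 06:00 for home, 09:00 to 17:00 for work) are purely illustrative assumptions, as are the names `overlap` and `detect_location`; times are hours within a single day, so windows that wrap past midnight are not handled.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals (0 if they do not meet)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def detect_location(points, window, min_share=0.5):
    """Return the grid of the activity point that best matches the time window.

    A point is a candidate only if its stay covers at least min_share of the
    window; among the candidates, the one with the longest matched time wins.
    """
    w_start, w_end = window
    window_length = w_end - w_start
    best_grid, best_time = None, 0.0
    for p in points:
        matched = overlap(p["enter"], p["leave"], w_start, w_end)
        if matched >= min_share * window_length and matched > best_time:
            best_grid, best_time = p["grid"], matched
    return best_grid


# Toy activity points for one person (made up for illustration only).
day = [
    {"grid": 7, "enter": 0.0, "leave": 7.0},
    {"grid": 12, "enter": 9.0, "leave": 16.0},
    {"grid": 30, "enter": 16.5, "leave": 17.5},
]
home_grid = detect_location(day, window=(0.0, 6.0))
work_grid = detect_location(day, window=(9.0, 17.0))
```

The function returns `None` when no activity point covers enough of the window, which corresponds to a person for whom no home or work location can be identified from the observed data.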
In the group activity data collection system based on multi-source space-time trajectory data, the semantic labeling module specifically includes:
A fourth probability calculation unit, used to generate, according to the Bayesian model and given the position, the time, and the activity type at the previous moment, the probability formula for carrying out a certain type of activity at the next moment; details are as described in the method embodiment.
A maximum-probability activity type labeling unit, used to calculate, for each activity point in the activity-point trajectory data, the probability of each candidate activity, and to label the activity point with the activity of maximum probability as its most probable activity type; details are as described in the method embodiment.
An activity space-time trajectory chain generation unit, used to output the activity space-time trajectory chain after all activity points in the activity-point trajectory data have been labeled; details are as described in the method embodiment.
In conclusion the present invention provides a kind of group activity method of data capture based on multi-source space-time trajectory data and
System, method include: that backstage obtains originating mobile terminal signaling data and original social software and registers and data and pre-processed,
Generate the signaling data to be processed for meeting specific format and data to be processed of registering;The work that backstage is obtained from signaling data to be processed
Moving point trace data;Construct and learn the prior information of group activity rule;Moving point track data is obtained, activity venue is obtained
Data;Backstage is according to moving point track data, the prior information of group activity rule, activity venue data, using based on pattra leaves
This model carry out activity locus of points semantic information label, generation activity space-time trajectory chain.The present invention is carried out using Bayesian model
The deduction of individual activity, and previous moment Activity Type is considered in spatio-temporal activity track to the shadow of later moment in time Activity Type
It rings, realizes a wide range of, the accurate, quick of magnanimity group activity, high efficiency extraction and collection.
It should be understood that the application of the present invention is not limited to the examples above. Those of ordinary skill in the art can make improvements or modifications based on the above description, and all such improvements and modifications shall fall within the scope of protection of the appended claims of the present invention.
Claims (6)
1. A group activity data collection method based on multi-source space-time trajectory data, characterized in that the method comprises:
A. the back end obtains raw mobile terminal signaling data and raw social software check-in data, and preprocesses each of them to generate the corresponding to-be-processed signaling data and to-be-processed check-in data in a specific format;
B. the back end extracts activity points from the to-be-processed signaling data according to preset time and space rules, obtaining activity-point trajectory data; constructs and learns the prior information of group activity rules from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity-point trajectory data;
C. the back end labels the semantic information of the activity-point trajectory with a Bayesian model, based on the activity-point trajectory data, the prior information of group activity rules, and the activity location data, generating an activity space-time trajectory chain;
in step B, extracting activity points from the to-be-processed signaling data according to preset time and space rules to obtain the activity-point trajectory data specifically comprises:
B11. the back end obtains the to-be-processed signaling data and sorts it by person and by time according to a specific time rule, obtaining each person's sequential trajectory;
B12. from a person's sequential trajectory, the times at which the person enters and leaves specific positions are calculated, each position the person enters is set as an activity point in turn, and the first position the person enters is set as the first activity point in the activity-point trajectory;
B13. the spatial distance and the time difference between each point in the sequential trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise the point is set as a new activity point; this continues until all points in the sequential trajectory have been processed, yielding the candidate activity-point trajectory;
B14. the candidate activity points in the candidate activity-point trajectory are obtained; when the difference between a candidate activity point's entry time and departure time is detected to be less than a second set threshold, that candidate activity point is removed from the candidate activity-point trajectory, after which the activity-point trajectory data is generated;
in step B, constructing and learning the prior information of group activity rules from the check-in category information in the to-be-processed check-in data specifically comprises:
B21. based on the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of one day, the intensity probability distribution of different group activities over one day is calculated;
B22. from users' check-in data, the activity transition probability distribution of different group activities at different times is calculated;
B23. from users' check-in data, the probability distribution of different group activities being carried out in different regions is calculated;
in step B, obtaining the activity location data from the activity-point trajectory data specifically comprises:
B31. the time identification windows of a person's activity locations are preset, denoted the first activity window and the second activity window;
B32. a person's activity-point trajectory data is obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of that window's total length, the activity point is taken as a candidate activity location for the activity location corresponding to that window;
B33. the candidate activity location with the longest matched time is taken as the user's activity location data.
2. The group activity data collection method based on multi-source space-time trajectory data according to claim 1, characterized in that step A specifically comprises:
A1. the back end obtains the raw mobile terminal signaling data and cleans it for quality: removing duplicate records, records with missing attributes, records whose time or location falls outside the predefined range, and the data of users whose number of points is below or above given thresholds, thereby generating preprocessed signaling data;
A2. the back end obtains the raw social software check-in data and cleans it for quality: removing duplicate records, records with missing attributes, records whose time or location falls outside the study range, the data of users whose number of check-ins falls within a certain range, and the data of users who checked in at only one place, thereby generating preprocessed check-in data;
A3. the spatial resolution of the preprocessed signaling data and the preprocessed check-in data is converted to the resolution of a regular grid of predetermined scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
3. The group activity data collection method based on multi-source space-time trajectory data according to claim 1, characterized in that step C specifically comprises:
C1. according to the Bayesian model, given the position, the time, and the activity type at the previous moment, the probability formula for carrying out a certain type of activity at the next moment is generated;
C2. for each activity point in the activity-point trajectory data, the probability of each candidate activity is calculated, and the activity point is labeled with the activity of maximum probability as its most probable activity type;
C3. after all activity points in the activity-point trajectory data have been labeled, the activity space-time trajectory chain is output.
4. A group activity data collection system based on multi-source space-time trajectory data, characterized in that the system comprises:
a preprocessing module, used for the back end to obtain raw mobile terminal signaling data and raw social software check-in data, and to preprocess each of them to generate the corresponding to-be-processed signaling data and to-be-processed check-in data in a specific format;
an activity location data acquisition module, used for the back end to extract activity points from the to-be-processed signaling data according to preset time and space rules, obtaining activity-point trajectory data; to construct and learn the prior information of group activity rules from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity-point trajectory data;
a semantic labeling module, used for the back end to label the semantic information of the activity-point trajectory with a Bayesian model, based on the activity-point trajectory data, the prior information of group activity rules, and the activity location data, generating an activity space-time trajectory chain;
the activity location data acquisition module specifically comprises:
a sorting unit, used for the back end to obtain the to-be-processed signaling data and sort it by person and by time according to a specific time rule, obtaining each person's sequential trajectory;
an activity point marking unit, used to calculate, from a person's sequential trajectory, the times at which the person enters and leaves specific positions, to set each position the person enters as an activity point in turn, and to set the first position the person enters as the first activity point in the activity-point trajectory;
a candidate activity-point trajectory generation unit, used to calculate the spatial distance and the time difference between each point in the sequential trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise the point is set as a new activity point; this continues until all points in the sequential trajectory have been processed, yielding the candidate activity-point trajectory;
an activity-point trajectory data processing unit, used to obtain the candidate activity points in the candidate activity-point trajectory; when the difference between a candidate activity point's entry time and departure time is detected to be less than a second set threshold, that candidate activity point is removed from the candidate activity-point trajectory, after which the activity-point trajectory data is generated;
a first probability calculation unit, used to calculate the intensity probability distribution of different group activities over one day, based on the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of the day;
a second probability calculation unit, used to calculate, from users' check-in data, the activity transition probability distribution of different group activities at different times;
a third probability calculation unit, used to calculate, from users' check-in data, the probability distribution of different group activities being carried out in different regions;
a presetting unit, used to preset the time identification windows of a person's activity locations, denoted the first activity window and the second activity window;
a candidate activity location determination unit, used to obtain a person's activity-point trajectory data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of that window's total length, the activity point is taken as a candidate activity location for the activity location corresponding to that window;
an activity location data acquisition unit, used to take the candidate activity location with the longest matched time as the user's activity location data.
5. The group activity data collection system based on multi-source space-time trajectory data according to claim 4, characterized in that the preprocessing module specifically comprises:
a signaling data processing unit, used for the back end to obtain the raw mobile terminal signaling data and clean it for quality: removing duplicate records, records with missing attributes, records whose time or location falls outside the predefined range, and the data of users whose number of points is below or above given thresholds, thereby generating preprocessed signaling data;
a check-in data processing unit, used for the back end to obtain the raw social software check-in data and clean it for quality: removing duplicate records, records with missing attributes, records whose time or location falls outside the study range, the data of users whose number of check-ins falls within a certain range, and the data of users who checked in at only one place, thereby generating preprocessed check-in data;
a resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a regular grid of predetermined scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
6. The group activity data collection system based on multi-source space-time trajectory data according to claim 4, characterized in that the semantic labeling module specifically comprises:
a fourth probability calculation unit, used to generate, according to the Bayesian model and given the position, the time, and the activity type at the previous moment, the probability formula for carrying out a certain type of activity at the next moment;
a maximum-probability activity type labeling unit, used to calculate, for each activity point in the activity-point trajectory data, the probability of each candidate activity, and to label the activity point with the activity of maximum probability as its most probable activity type;
an activity space-time trajectory chain generation unit, used to output the activity space-time trajectory chain after all activity points in the activity-point trajectory data have been labeled.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610517438.6A CN106211071B (en) | 2016-07-04 | 2016-07-04 | Group activity method of data capture and system based on multi-source space-time trajectory data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610517438.6A CN106211071B (en) | 2016-07-04 | 2016-07-04 | Group activity method of data capture and system based on multi-source space-time trajectory data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106211071A (en) | 2016-12-07 |
CN106211071B (en) | 2019-05-21 |
Family
ID=57464652
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610517438.6A Active CN106211071B (en) | 2016-07-04 | 2016-07-04 | Group activity method of data capture and system based on multi-source space-time trajectory data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106211071B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169260B (en) * | 2017-03-23 | 2021-05-11 | 四川省公安厅 | Heterogeneous multi-source data resonance system and method based on space-time trajectory |
CN107274058A (en) * | 2017-05-10 | 2017-10-20 | 福建海峡中创网络信息技术股份有限公司 | A kind of determination methods of mechanics |
CN108629000A (en) * | 2018-05-02 | 2018-10-09 | 深圳市数字城市工程研究中心 | A kind of the group behavior feature extracting method and system of mobile phone track data cluster |
CN108597224B (en) * | 2018-05-02 | 2020-05-19 | 深圳市数字城市工程研究中心 | Method and system for identifying to-be-improved traffic facilities based on space-time trajectory data |
CN109918395A (en) * | 2019-02-19 | 2019-06-21 | 北京明略软件系统有限公司 | One kind of groups method for digging and device |
CN110543457A (en) * | 2019-09-11 | 2019-12-06 | 北京明略软件系统有限公司 | Track type document processing method and device, storage medium and electronic device |
CN111275969B (en) * | 2020-02-15 | 2022-02-25 | 湖南大学 | Vehicle track filling method based on intelligent identification of road environment |
CN112069573B (en) * | 2020-08-24 | 2021-04-13 | 深圳大学 | City group space simulation method, system and equipment based on cellular automaton |
CN112070304B (en) * | 2020-09-09 | 2021-05-18 | 深圳大学 | City group element interaction measuring method, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7373524B2 (en) * | 2004-02-24 | 2008-05-13 | Covelight Systems, Inc. | Methods, systems and computer program products for monitoring user behavior for a server application |
CN102880719A (en) * | 2012-10-16 | 2013-01-16 | 四川大学 | User trajectory similarity mining method for location-based social network |
CN104750751A (en) * | 2013-12-31 | 2015-07-01 | 华为技术有限公司 | Method and device for annotating trace data |
CN104750829A (en) * | 2015-04-01 | 2015-07-01 | 华中科技大学 | User position classifying method and system based on signing in features |
CN105243148A (en) * | 2015-10-25 | 2016-01-13 | 西华大学 | Checkin data based spatial-temporal trajectory similarity measurement method and system |
Non-Patent Citations (1)
Title |
---|
Exploring the distribution and dynamics of functional regions using mobile phone data and social media data; Jinzhou CAO; CUPUM; 2015-12-31; main text, sections 2.1-2.4.2, Fig. 2.5 |
Also Published As
Publication number | Publication date |
---|---|
CN106211071A (en) | 2016-12-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||