CN106211071B - Group activity data collection method and system based on multi-source spatio-temporal trajectory data - Google Patents

Group activity data collection method and system based on multi-source spatio-temporal trajectory data

Info

Publication number
CN106211071B
Authority
CN
China
Prior art keywords
data
activity
time
moving point
registering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610517438.6A
Other languages
Chinese (zh)
Other versions
CN106211071A (en
Inventor
涂伟
曹劲舟
李清泉
乐阳
曹瑞
王振声
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University
Priority to CN201610517438.6A
Publication of CN106211071A
Application granted
Publication of CN106211071B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W64/00 Locating users or terminals or network equipment for network management purposes, e.g. mobility management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a group activity data collection method and system based on multi-source spatio-temporal trajectory data. The method includes: a backend obtains raw mobile terminal signaling data and raw social software check-in data and preprocesses them, generating to-be-processed signaling data and to-be-processed check-in data that conform to a specific format; the backend obtains activity point trajectory data from the to-be-processed signaling data; prior information on group activity patterns is constructed and learned; activity location data is obtained from the activity point trajectory data; and, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, the backend labels the activity point trajectory with semantic information using a Bayesian model and generates an activity spatio-temporal trajectory chain. The invention infers individual activities with a Bayesian model and takes into account the influence of the activity type at the previous moment on the activity type at the next moment in a spatio-temporal activity trajectory, enabling large-scale, accurate, fast, and efficient extraction and collection of massive group activity data.

Description

Group activity data collection method and system based on multi-source spatio-temporal trajectory data
Technical field
The present invention relates to the technical field of data processing, and more particularly to a group activity data collection method and system based on multi-source spatio-temporal trajectory data.
Background
Traditional activity collection methods rely on activity diaries or activity surveys; sample sizes are small, collection takes a long time, and the work is laborious. The explosion of spatio-temporal trajectory data provides a new tool for acquiring large-scale group activities. Research on spatio-temporal data analysis focuses mainly on identifying individual activities in real space, especially travel activities, and lacks extraction of the essential attribute information of activities. A group activity extraction method that fuses multi-source spatio-temporal trajectory data needs to be developed to lay the data foundation for urban science research based on massive activity data. Although spatio-temporal trajectory data (such as mobile phone signaling data, vehicle GPS data, and social check-in data) contain rich temporal and location information, they are relatively lacking in semantic information and differ in spatial and temporal resolution, so they cannot directly provide group activity information.
Therefore, the prior art needs to be improved and developed.
Summary of the invention
In view of the deficiencies of the prior art, the object of the present invention is to provide a group activity data collection method and system based on multi-source spatio-temporal trajectory data.
The technical solution is as follows:
A group activity data collection method based on multi-source spatio-temporal trajectory data, wherein the method includes:
A. The backend obtains raw mobile terminal signaling data and raw social software check-in data, preprocesses the raw mobile terminal signaling data and the raw social software check-in data separately, and generates corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format;
B. The backend extracts activity points from the to-be-processed signaling data using preset temporal and spatial rules to obtain activity point trajectory data; constructs and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data;
C. Based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, the backend labels the activity point trajectory with semantic information using a Bayesian model and generates an activity spatio-temporal trajectory chain.
In the group activity data collection method based on multi-source spatio-temporal trajectory data, step A specifically includes:
A1. The backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the predefined range, and removing the data of users whose number of points is below or above certain thresholds, to generate preprocessed signaling data;
A2. The backend obtains the raw social software check-in data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the study range, removing the data of users whose number of check-ins is below or above certain thresholds, and removing the data of users who checked in at only one place, to generate preprocessed check-in data;
A3. The spatial resolution of the preprocessed signaling data and the preprocessed check-in data is converted to the resolution of a predetermined regular grid, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
In the group activity data collection method based on multi-source spatio-temporal trajectory data, extracting activity points from the to-be-processed signaling data using preset temporal and spatial rules in step B to obtain activity point trajectory data specifically includes:
B11. The backend obtains the to-be-processed signaling data and sorts it by person and by time according to a specific time rule to obtain each person's time-ordered trajectory;
B12. From the person's time-ordered trajectory, the times at which the person enters and leaves each position are calculated; each position the person enters in turn is set as an activity point, and the first position the person enters is set as the first activity point in the activity point trajectory;
B13. The spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding a candidate activity point trajectory;
B14. The candidate activity points in the candidate activity point trajectory are obtained; when the difference between the entry time and the departure time of a candidate activity point is less than a second set threshold, that candidate activity point is removed from the candidate activity point trajectory, and the activity point trajectory data is generated.
In the group activity data collection method based on multi-source spatio-temporal trajectory data, constructing and learning prior information on group activity patterns from the check-in category information in the to-be-processed check-in data in step B specifically includes:
B21. From the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of a day, the intensity probability distribution of different group activities over a day is calculated;
B22. From the users' check-in data, the activity transition probability distribution of different group activities at different times is calculated;
B23. From the users' check-in data, the probability distribution of different group activities taking place in different areas is calculated.
In the group activity data collection method based on multi-source spatio-temporal trajectory data, obtaining the activity point trajectory data and deriving activity location data from it in step B specifically includes:
B31. Time identification windows for a person's activity locations are preset and denoted the first activity window and the second activity window;
B32. The person's activity point trajectory data is obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity point is taken as a candidate activity position for the activity location corresponding to that window;
B33. The candidate activity position with the longest matching time is taken as the user's activity location data.
In the group activity data collection method based on multi-source spatio-temporal trajectory data, step C specifically includes:
C1. Based on the Bayesian model, a probability formula is generated for a given type of activity being carried out at the next moment, given the position, the time, and the activity type at the previous moment;
C2. For each activity point in the activity point trajectory data, the probability of engaging in each of the different activities is calculated, and the activity with the highest probability is marked as the maximum-probability activity type of that activity point;
C3. After all activity points in the activity point trajectory data have been labeled, the activity spatio-temporal trajectory chain is output.
A group activity data collection system based on multi-source spatio-temporal trajectory data, wherein the system includes:
A preprocessing module, used for the backend to obtain raw mobile terminal signaling data and raw social software check-in data, preprocess the raw mobile terminal signaling data and the raw social software check-in data separately, and generate corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format;
An activity location data acquisition module, used for the backend to extract activity points from the to-be-processed signaling data using preset temporal and spatial rules to obtain activity point trajectory data; to construct and learn prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity point trajectory data;
A semantic labeling module, used for the backend to label the activity point trajectory with semantic information using a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, and to generate an activity spatio-temporal trajectory chain.
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the preprocessing module specifically includes:
A signaling data processing unit, used for the backend to obtain the raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the predefined range, and removing the data of users whose number of points is below or above certain thresholds, to generate preprocessed signaling data;
A check-in data processing unit, used for the backend to obtain the raw social software check-in data and perform quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the study range, removing the data of users whose number of check-ins is below or above certain thresholds, and removing the data of users who checked in at only one place, to generate preprocessed check-in data;
A resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a predetermined regular grid, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the activity location data acquisition module specifically includes:
A sorting unit, used for the backend to obtain the to-be-processed signaling data and sort it by person and by time according to a specific time rule to obtain each person's time-ordered trajectory;
An activity point marking unit, used to calculate, from the person's time-ordered trajectory, the times at which the person enters and leaves each position, to set each position the person enters in turn as an activity point, and to set the first position the person enters as the first activity point in the activity point trajectory;
A candidate activity point trajectory generation unit, used to calculate the spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding a candidate activity point trajectory;
An activity point trajectory data processing unit, used to obtain the candidate activity points in the candidate activity point trajectory and, when the difference between the entry time and the departure time of a candidate activity point is less than a second set threshold, to remove that candidate activity point from the candidate activity point trajectory and generate the activity point trajectory data;
A first probability calculation unit, used to calculate the intensity probability distribution of different group activities over a day from the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of a day;
A second probability calculation unit, used to calculate, from the users' check-in data, the activity transition probability distribution of different group activities at different times;
A third probability calculation unit, used to calculate, from the users' check-in data, the probability distribution of different group activities taking place in different areas;
A presetting unit, used to preset the time identification windows for a person's activity locations, denoted the first activity window and the second activity window;
A candidate activity position determination unit, used to obtain the person's activity point trajectory data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity point is taken as a candidate activity position for the activity location corresponding to that window;
An activity location data acquisition unit, used to take the candidate activity position with the longest matching time as the user's activity location data.
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the semantic labeling module specifically includes:
A fourth probability calculation unit, used to generate, based on the Bayesian model, a probability formula for a given type of activity being carried out at the next moment, given the position, the time, and the activity type at the previous moment;
A maximum-probability activity type marking unit, used to calculate, for each activity point in the activity point trajectory data, the probability of engaging in each of the different activities, and to mark the activity with the highest probability as the maximum-probability activity type of that activity point;
An activity spatio-temporal trajectory chain generation unit, used to output the activity spatio-temporal trajectory chain after all activity points in the activity point trajectory data have been labeled.
The present invention provides a group activity data collection method and system based on multi-source spatio-temporal trajectory data. The invention infers individual activities with a Bayesian model and takes into account the influence of the activity type at the previous moment on the activity type at the next moment in a spatio-temporal activity trajectory, enabling large-scale, accurate, fast, and efficient extraction and collection of massive group activity data.
Brief description of the drawings
Fig. 1 is a flowchart of a preferred embodiment of the group activity data collection method based on multi-source spatio-temporal trajectory data of the present invention.
Fig. 2 is a functional block diagram of a preferred embodiment of the group activity data collection system based on multi-source spatio-temporal trajectory data of the present invention.
Specific embodiments
To make the purpose, technical solution, and effects of the present invention clearer and more definite, the present invention is described in further detail below. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
The present invention provides a preferred embodiment of a group activity data collection method based on multi-source spatio-temporal trajectory data, whose flowchart is shown in Fig. 1, wherein the method includes:
Step S100: The backend obtains raw mobile terminal signaling data and raw social software check-in data, preprocesses the raw mobile terminal signaling data and the raw social software check-in data separately, and generates corresponding to-be-processed signaling data and to-be-processed check-in data that conform to a specific format. The mobile terminal is preferably a mobile phone.
In a further embodiment, step S100 specifically includes:
Step S101: The backend obtains the raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the predefined range, and removing the data of users whose number of points is below or above certain thresholds, to generate preprocessed signaling data;
Step S102: The backend obtains the raw social software check-in data and performs quality cleaning on it: removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the study range, removing the data of users whose number of check-ins is below or above certain thresholds, and removing the data of users who checked in at only one place, to generate preprocessed check-in data;
Step S103: The spatial resolution of the preprocessed signaling data and the preprocessed check-in data is converted to the resolution of a predetermined regular grid, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
In a specific implementation, the mobile phone signaling data and the social check-in data are preprocessed so that they meet the requirements of subsequent processing. The specifics include:
Quality cleaning is performed on the mobile phone signaling data, including removing duplicate data, removing data with missing attributes, removing data whose time or location falls outside the study range, and removing the data of users whose number of points is below or above certain thresholds. The choice of thresholds depends on the specific data type, data format, and data quality. Preferably, users with fewer than 3 points per day or more than 100 points per day are removed.
Quality cleaning is performed on the social check-in data, including removing duplicate data, removing data with missing attributes, and removing data whose time or location falls outside the study range; removing the data of users with fewer than 2 or more than 100 check-ins; and removing the data of users who checked in at only one place;
For multi-source spatio-temporal trajectory data, the effect of spatial resolution is considered. The spatial resolutions of the mobile phone signaling data and the social check-in data are uniformly converted to the scale of a regular grid. The grid size generally depends on the spatial resolution of the two kinds of data themselves. The preferred scale is 500 m × 500 m.
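The cleaning and grid-conversion steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Record` layout, the planar metre coordinates, the function names `clean` and `to_grid`, and the use of a per-user total count (rather than a per-day count) are all assumptions made for the example.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One signaling or check-in record (hypothetical layout)."""
    user: str
    t: float  # time, in any consistent unit (assumed)
    x: float  # projected easting in metres (assumed planar coordinates)
    y: float  # projected northing in metres


def clean(records, t_range, bbox, min_pts, max_pts):
    """Drop incomplete, out-of-range, and duplicate rows, then drop users
    whose remaining record count lies outside [min_pts, max_pts]."""
    t0, t1 = t_range
    xmin, ymin, xmax, ymax = bbox
    seen, kept = set(), []
    for r in records:
        if None in (r.user, r.t, r.x, r.y):  # missing attribute
            continue
        if not (t0 <= r.t <= t1 and xmin <= r.x <= xmax and ymin <= r.y <= ymax):
            continue  # outside the study period or study area
        if r in seen:  # exact duplicate
            continue
        seen.add(r)
        kept.append(r)
    counts = Counter(r.user for r in kept)
    return [r for r in kept if min_pts <= counts[r.user] <= max_pts]


def to_grid(record, cell=500.0):
    """Map a record to the index of its regular grid cell (500 m by default)."""
    return (math.floor(record.x / cell), math.floor(record.y / cell))
```

Snapping both data sources to the same grid index is what lets signaling points and check-ins be compared cell by cell in the later steps.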
Step S200: The backend extracts activity points from the to-be-processed signaling data using preset temporal and spatial rules to obtain activity point trajectory data; constructs and learns prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data.
Further, extracting activity points from the to-be-processed signaling data using preset temporal and spatial rules in step S200 to obtain activity point trajectory data specifically includes:
Step S211: The backend obtains the to-be-processed signaling data and sorts it by person and by time according to a specific time rule to obtain each person's time-ordered trajectory;
Step S212: From the person's time-ordered trajectory, the times at which the person enters and leaves each position are calculated; each position the person enters in turn is set as an activity point, and the first position the person enters is set as the first activity point in the activity point trajectory;
Step S213: The spatial distance and time difference between each point in the time-ordered trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding a candidate activity point trajectory;
Step S214: The candidate activity points in the candidate activity point trajectory are obtained; when the difference between the entry time and the departure time of a candidate activity point is less than a second set threshold, that candidate activity point is removed from the candidate activity point trajectory, and the activity point trajectory data is generated.
In a specific implementation, the person's activity points are extracted from the processed mobile phone signaling data to obtain the person's activity point trajectory. Activity points are extracted mainly by setting temporal and spatial rules. The specific method is as follows:
The processed mobile phone signaling data is sorted by person and by time to obtain each person's time-ordered trajectory;
Using the person's time-ordered trajectory, the times at which the person enters and leaves each position (grid cell) are calculated, and the first position is set as the first activity point in the activity point trajectory;
Moving forward in time, the spatial distance and time difference between each point in the time-ordered trajectory and the activity point in the existing activity point trajectory are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is added to that activity point; otherwise, the point is set as a new activity point. This continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory. The preferred range of the spatial distance threshold is 500 m to 1000 m.
For each candidate activity point in the candidate activity point trajectory, if the difference between its entry time and departure time is less than a certain threshold, the point is considered not to be an activity point and is removed from the candidate activity point trajectory, finally yielding the activity point trajectory. Preferably, this threshold ranges from 1 hour to 3 hours.
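The extraction rule above can be sketched as follows, under stated assumptions: each new point is compared with the most recent activity point only, distance is measured to that activity point's first location, and the time-gap threshold `gap_max` is a hypothetical value, since the text gives preferred ranges only for the distance threshold and the minimum stay. The function and field names are illustrative.

```python
import math


def extract_activity_points(track, d_max=500.0, gap_max=3600.0, min_stay=3600.0):
    """Group a time-ordered track into activity points.

    track: list of (t_seconds, x_metres, y_metres) for one person, sorted by time.
    Returns a list of dicts with keys x, y, enter, leave.
    """
    candidates = []
    for t, x, y in track:
        if candidates:
            cur = candidates[-1]
            near = math.hypot(x - cur["x"], y - cur["y"]) < d_max
            soon = (t - cur["leave"]) < gap_max
            if near and soon:
                cur["leave"] = t  # merge the point into the current activity point
                continue
        # the first point, or a point that is too far away or too late
        candidates.append({"x": x, "y": y, "enter": t, "leave": t})
    # discard candidates whose stay is shorter than the minimum duration
    return [c for c in candidates if c["leave"] - c["enter"] >= min_stay]
```

With a one-hour minimum stay, brief pass-through locations are discarded and only sustained stays remain in the trajectory.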
Further, constructing and learning prior information on group activity patterns from the check-in category information in the to-be-processed check-in data in step S200 specifically includes:
Step S221: From the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of a day, the intensity probability distribution of different group activities over a day is calculated;
Step S222: From the users' check-in data, the activity transition probability distribution of different group activities at different times is calculated;
Step S223: From the users' check-in data, the probability distribution of different group activities taking place in different areas is calculated.
In a specific implementation, the processed social check-in data, which is rich in check-in category information, is used to construct and learn prior information on group activity patterns. The specific method is as follows:
From the check-in categories provided by the social check-in platform and the total volume of users' check-in data in different time periods of a day, the intensity probability distribution of different group activities over a day, Pr(AT_i | t), is calculated, expressed as:
Pr(AT_i | t) = checkins(AT_i, t) / Σ_t checkins(AT_i, t)
where checkins(AT_i, t) is the number of check-ins of activity type i at time t, Σ_t checkins(AT_i, t) is the number of check-ins of activity type i summed over all times of the day, and AT_i refers to check-ins whose activity category is i. From the users' check-in trajectories, the activity transition probability distribution of different group activities at different times is calculated, expressed as Pr(AT_i, t | AT_j, t-1), where i and j denote activity categories and t denotes time. (AT_i, t) denotes check-ins of activity type i at time t, and (AT_j, t-1) denotes check-ins of activity type j at time t-1; Pr(AT_i, t | AT_j, t-1) is the probability of engaging in activity i at time t given that activity j was engaged in at the previous time t-1. Pr(X) denotes the probability of event X;
From the users' check-in trajectories, the probability distribution of different group activities taking place in different grid areas is calculated, expressed as Pr(Grid_m | AT_i, t), where m is the grid index, Grid_m denotes the m-th grid cell, i is the activity category, and t is the time.
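One way to estimate the three priors from check-in records is sketched below. The tuple layout, the hourly time bins, the dictionary key orders, and the normalization chosen for the transition probability (over all transitions that leave activity j and arrive in hour t) are assumptions made for illustration; the text does not fix them.

```python
from collections import Counter, defaultdict


def learn_priors(checkins):
    """checkins: list of (user, hour, category, grid) tuples (hypothetical layout).

    Returns three dicts:
      intensity[(cat, hour)]            ~ Pr(AT_i | t)
      transition[(cat, hour, prev_cat)] ~ Pr(AT_i, t | AT_j, t-1)
      grid_prob[(grid, cat, hour)]      ~ Pr(Grid_m | AT_i, t)
    """
    cat_hour = Counter((c, h) for _, h, c, _ in checkins)
    cat_total = Counter(c for _, _, c, _ in checkins)
    # check-ins of a category in an hour, over that category's daily total
    intensity = {(c, h): n / cat_total[c] for (c, h), n in cat_hour.items()}

    grid_counts = Counter((g, c, h) for _, h, c, g in checkins)
    grid_prob = {(g, c, h): n / cat_hour[(c, h)]
                 for (g, c, h), n in grid_counts.items()}

    # consecutive check-ins of the same user form the transitions
    by_user = defaultdict(list)
    for u, h, c, _ in checkins:
        by_user[u].append((h, c))
    pair_counts, prev_counts = Counter(), Counter()
    for seq in by_user.values():
        seq.sort()
        for (_, prev_c), (h, c) in zip(seq, seq[1:]):
            pair_counts[(c, h, prev_c)] += 1
            prev_counts[(h, prev_c)] += 1
    transition = {k: n / prev_counts[(k[1], k[2])]
                  for k, n in pair_counts.items()}
    return intensity, transition, grid_prob
```

Combinations that never occur in the check-in data are simply absent from the dictionaries, so any downstream consumer has to decide how to treat unseen keys.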
In a further embodiment, obtaining the activity point trajectory data and deriving activity location data from it in step S200 specifically includes:
Step S231: Time identification windows for a person's activity locations are preset and denoted the first activity window and the second activity window;
Step S232: The person's activity point trajectory data is obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity point is taken as a candidate activity position for the activity location corresponding to that window;
Step S233: The candidate activity position with the longest matching time is taken as the user's activity location data.
In a specific implementation, the person's home and work activities are detected from the obtained activity point trajectory data. The specific method is as follows:
Based on common sense, the identification windows for home activity and work activity are set to 0:00-7:00 and 9:00-17:00, respectively;
For the person's activity point trajectory data, the duration of each activity point is matched against the two identification windows above; if the duration of the activity point falls within an identification window and accounts for 50% or more of the total length of that window, the match is considered successful, and the point is taken as a candidate home or work activity position;
The home or work activity position with the longest matching time is taken as the user's home or work activity position; if no match succeeds, the user is considered to have no identified home or work activity position.
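The window-matching rule can be sketched as follows. The two windows and the 50% share come from the text; the single-day `stays` layout, the function names, and the simplification that each stay lies within one calendar day are assumptions made for the example.

```python
WINDOWS = {"home": (0.0, 7.0), "work": (9.0, 17.0)}  # hours of day, from the text


def overlap(a0, a1, b0, b1):
    """Length of the intersection of intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def detect_anchor_locations(stays, min_share=0.5):
    """stays: list of (grid, enter_hour, leave_hour) within one day (assumed).

    Returns {"home": grid, "work": grid}, omitting a label when no stay covers
    at least min_share of the corresponding window.
    """
    best = {}
    for label, (w0, w1) in WINDOWS.items():
        for grid, enter, leave in stays:
            matched = overlap(enter, leave, w0, w1)
            longest_so_far = best.get(label, (None, 0.0))[1]
            if matched >= min_share * (w1 - w0) and matched > longest_so_far:
                best[label] = (grid, matched)
    return {label: grid for label, (grid, _) in best.items()}
```

For example, a stay from 0:00 to 6:30 covers 6.5 of the 7 hours in the home window, so its grid cell is taken as the home position.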
Step S300: Based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, the backend labels the activity point trajectory with semantic information using a Bayesian model and generates an activity spatio-temporal trajectory chain.
Using the activity point trajectory obtained above, the prior information on group activity timing, and the person's home and work activity information, the activity point trajectory is labeled with semantic information using the Bayesian model. The labeled activities mainly include at home, working, and other (such as entertainment, shopping, study, leisure, and travel), yielding the activity spatio-temporal trajectory chain.
The resulting spatio-temporal activity trajectory chain is important for studying urban planning and the dynamic change of urban functional zones. Based on changes in spatio-temporal activity, timely adjustments and predictions can be made for the dynamic change of planned urban functional zones.
In a further embodiment, step S300 specifically includes:
Step S301: Based on the Bayesian model, a probability formula is generated for a given type of activity being carried out at the next moment, given the position, the time, and the activity type at the previous moment;
Step S302: For each activity point in the activity point trajectory data, the probability of engaging in each of the different activities is calculated, and the activity with the highest probability is marked as the maximum-probability activity type of that activity point;
Step S303: After all activity points in the activity point trajectory data have been labeled, the activity spatio-temporal trajectory chain is output.
In a specific implementation, according to the Bayesian model, given a specific location, the time, and the activity type at the previous moment, the probability that a given type of activity is carried out at the next moment is:
where m is the grid index, j is the activity type at the previous moment, t is the current time, and i is the activity type at the current time.
For Pr(Grid_m | AT_i, t, AT_j), AT_j is assumed to be conditionally independent of Grid_m, so the expression simplifies to:
Pr(Grid_m | AT_i, t, AT_j) = Pr(Grid_m | AT_i, t)    (2)
Pr(AT_i | t, AT_j) can be rewritten as:
Pr(AT_i | t, AT_j) = Pr(AT_{i,t} | AT_{j,t-1})    (3)
Combining formulas (2) and (3), formula (1) becomes:
Pr(AT_i | Grid_m, t, AT_j) ∝ Pr(Grid_m | AT_i, t) · Pr(AT_{i,t} | AT_{j,t-1}) · Pr(AT_j | t)    (5)
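Formula (1) itself is not reproduced in this text. As a hedged reconstruction only: if formula (1) is the Bayes expansion into which formulas (2) and (3) are substituted, a form consistent with formula (5) would be the following, where the normalizing term Pr(Grid_m, AT_j | t) is dropped because it does not depend on i:

```latex
\Pr(AT_i \mid Grid_m, t, AT_j) \;\propto\;
  \Pr(Grid_m \mid AT_i, t, AT_j)\,
  \Pr(AT_i \mid t, AT_j)\,
  \Pr(AT_j \mid t)
```

Replacing the first factor using formula (2) and the second using formula (3) yields formula (5) term by term.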
The activity points of the trajectory are input into formula (5) in sequence, the probability of each activity type is calculated, and each activity point is labeled with the activity type that has the highest probability;
In particular, for grid locations already labeled with the home or work activity type, Pr(Grid_m | AT_i, t) is set to 1 and AT_{j,t-1} is set to AT_home or AT_working before processing continues with the next activity point. Once all activity points in the trajectory have been labeled, the spatio-temporal activity chain is output. AT_home denotes the home activity type, and AT_working denotes the work activity type.
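The sequential labeling procedure can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the dictionary layouts, the function name, and the sample probabilities are hypothetical, and the handling of known home or work cells (assign that label directly and carry it forward as the previous activity) is one reading of the special case described above. The first point of a trajectory has no previous activity, so the sketch falls back to the spatial term alone, which is also an assumption.

```python
def label_trajectory(points, p_grid, p_trans, p_prior, known_places, activities):
    """Label each point with its most probable activity type (formula (5)).

    points       : list of (grid_id, hour) in time order
    p_grid       : {(grid_id, activity, hour): Pr(Grid_m | AT_i, t)}
    p_trans      : {(activity, prev_activity, hour): Pr(AT_{i,t} | AT_{j,t-1})}
    p_prior      : {(prev_activity, hour): Pr(AT_j | t)}
    known_places : {grid_id: "home" or "work"} from home/work detection
    """
    chain = []
    prev = None
    for grid_id, hour in points:
        if grid_id in known_places:
            # Cell already identified as home or work: keep that label.
            label = known_places[grid_id]
        else:
            scores = {}
            for act in activities:
                score = p_grid.get((grid_id, act, hour), 0.0)
                if prev is not None:
                    score *= p_trans.get((act, prev, hour), 0.0)
                    score *= p_prior.get((prev, hour), 0.0)
                scores[act] = score
            label = max(scores, key=scores.get)
        chain.append((grid_id, hour, label))
        prev = label  # becomes AT_{j,t-1} for the next point
    return chain

# Illustrative probabilities only.
activities = ["home", "work", "other"]
known_places = {"g12": "home", "g40": "work"}
points = [("g12", 6), ("g40", 10), ("g07", 18)]
p_grid = {("g07", "other", 18): 0.6, ("g07", "home", 18): 0.1,
          ("g07", "work", 18): 0.3}
p_trans = {("other", "work", 18): 0.5, ("home", "work", 18): 0.4,
           ("work", "work", 18): 0.1}
p_prior = {("work", 18): 0.2}
chain = label_trajectory(points, p_grid, p_trans, p_prior,
                         known_places, activities)
```

Note that Pr(AT_j | t) is the same for every candidate activity at a given point, so it does not change which label wins; it is kept here only to mirror formula (5).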
Here, the activity point trajectory extraction method depends on the specific data type and on the spatial and temporal resolution of the data, and is not limited to the method described in the present invention;
The home and work activity detection method is constrained by the observation duration of the spatio-temporal data, and the choice of thresholds is not limited to the method described in the present invention;
The prior information on group activity patterns need not be constructed and learned from social media check-in data; resident survey data, GPS trajectory data, volunteer data, and similar sources can also be used.
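As an illustration of how such prior information might be estimated from check-in records, the following Python sketch counts and normalizes three distributions: activity intensity by hour, activity transitions by hour, and the spatial distribution of each activity. The record layout, function names, and sample data are hypothetical, and for brevity the spatial distribution here ignores the time of day.

```python
from collections import Counter, defaultdict

def normalize(counter):
    """Turn raw counts into a probability distribution."""
    total = sum(counter.values())
    return {k: v / total for k, v in counter.items()}

def learn_priors(checkins):
    """checkins: list of (user_id, hour, grid_id, activity), in any order."""
    by_hour = defaultdict(Counter)      # hour -> activity counts
    by_activity = defaultdict(Counter)  # activity -> grid cell counts
    by_prev = defaultdict(Counter)      # (prev_activity, hour) -> activity counts
    last_seen = {}
    # Process each user's check-ins in time order to count transitions.
    for user, hour, grid, act in sorted(checkins, key=lambda c: (c[0], c[1])):
        by_hour[hour][act] += 1
        by_activity[act][grid] += 1
        if user in last_seen:
            by_prev[(last_seen[user], hour)][act] += 1
        last_seen[user] = act
    intensity = {h: normalize(c) for h, c in by_hour.items()}
    transition = {k: normalize(c) for k, c in by_prev.items()}
    spatial = {a: normalize(c) for a, c in by_activity.items()}
    return intensity, transition, spatial

# Illustrative records only.
checkins = [
    ("u1", 8, "g1", "dining"), ("u1", 12, "g2", "shopping"),
    ("u2", 8, "g1", "dining"), ("u2", 12, "g1", "dining"),
]
intensity, transition, spatial = learn_priors(checkins)
```

The same counting scheme would apply if the activity labels came from a resident survey or volunteer diaries instead of check-in categories.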
The present invention proposes a new group activity collection method based on multi-source spatio-temporal trajectory data. It uses a Bayesian model to infer individual activities, addresses the shortcomings of existing methods (time-consuming, labor-intensive, costly, and limited to small samples), and achieves accurate, fast, and efficient extraction and collection of group activity at large scale and high volume. The group activity inference of the invention considers not only the constraints that factors such as time and location in urban space place on human activity, but also the influence of the activity type at the previous moment on the activity type at the next moment within a spatio-temporal activity trajectory; that is, activities are inferred in the context of the human spatio-temporal activity chain.
The present invention also provides a group activity data collection system based on multi-source spatio-temporal trajectory data. Fig. 2 is a functional block diagram of a preferred embodiment of the system, which includes:
Preprocessing module 100, used by the backend to obtain raw mobile terminal signaling data and raw social software check-in data, preprocess each of them, and generate the corresponding to-be-processed signaling data and to-be-processed check-in data in a specific format; details are as described in the method embodiment.
Activity location data acquisition module 200, used by the backend to extract activity points from the to-be-processed signaling data using preset temporal and spatial rules, yielding activity point trajectory data; to construct and learn the prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity point trajectory data; details are as described in the method embodiment.
Semantic labeling module 300, used by the backend to perform semantic labeling of the activity point trajectory with a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, and to generate the spatio-temporal activity trajectory chain; details are as described in the method embodiment.
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the preprocessing module specifically includes:
Signaling data processing unit, used by the backend to obtain raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate records, records with missing attributes, records whose time or location lies outside the predefined range, and the data of users whose number of points is below or above certain thresholds, to generate preprocessed signaling data; details are as described in the method embodiment.
Check-in data processing unit, used by the backend to obtain raw social software check-in data and perform quality cleaning on it: removing duplicate records, records with missing attributes, records whose time or location lies outside the study area, the data of users whose number of check-ins falls within a certain range, and the data of users who checked in at only one location, to generate preprocessed check-in data; details are as described in the method embodiment.
Resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a regular grid of predetermined scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data; details are as described in the method embodiment.
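The cleaning and grid-conversion steps handled by these units can be sketched as follows. This is a minimal sketch under stated assumptions: records are dictionaries with hypothetical keys `user`, `t`, `x`, and `y`, coordinates are already projected to meters, and all thresholds and the 500 m cell size are illustrative values that the source does not specify.

```python
from collections import Counter

def clean_records(records, bbox, t_range, min_points, max_points):
    """Drop duplicates, incomplete records, out-of-range records, and
    users with too few or too many points (hypothetical schema)."""
    x_min, y_min, x_max, y_max = bbox
    seen = set()
    kept = []
    for r in records:
        key = (r.get("user"), r.get("t"), r.get("x"), r.get("y"))
        if None in key or key in seen:
            continue  # missing attribute or duplicate record
        seen.add(key)
        if not (x_min <= r["x"] <= x_max and y_min <= r["y"] <= y_max):
            continue  # outside the spatial range
        if not (t_range[0] <= r["t"] <= t_range[1]):
            continue  # outside the time range
        kept.append(r)
    counts = Counter(r["user"] for r in kept)
    return [r for r in kept if min_points <= counts[r["user"]] <= max_points]

def to_grid(x, y, origin, cell_size):
    """Map a coordinate to the (column, row) of a regular square grid."""
    return (int((x - origin[0]) // cell_size),
            int((y - origin[1]) // cell_size))

# Illustrative records only.
records = [
    {"user": "u1", "t": 10, "x": 1250.0, "y": 730.0},
    {"user": "u1", "t": 10, "x": 1250.0, "y": 730.0},   # duplicate
    {"user": "u1", "t": 20, "x": 1300.0, "y": None},    # missing attribute
    {"user": "u1", "t": 30, "x": 900.0, "y": 100.0},
    {"user": "u2", "t": 15, "x": 99999.0, "y": 0.0},    # out of range
    {"user": "u3", "t": 12, "x": 10.0, "y": 10.0},      # too few points
]
clean = clean_records(records, bbox=(0, 0, 5000, 5000), t_range=(0, 100),
                      min_points=2, max_points=1000)
cells = [to_grid(r["x"], r["y"], origin=(0, 0), cell_size=500) for r in clean]
```

Snapping both data sources to the same grid is what lets signaling stays and check-in statistics be joined by cell later on.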
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the activity location data acquisition module specifically includes:
Sorting unit, used by the backend to obtain the to-be-processed signaling data and sort it by person and by time according to a specific time rule, yielding each person's time-ordered trajectory; details are as described in the method embodiment.
Activity point marking unit, used to calculate, from a person's time-ordered trajectory, the times at which the person enters and leaves each specific location, to set each location the person enters in turn as an activity point, and to set the first location the person enters as the first activity point in the activity point trajectory; details are as described in the method embodiment.
Candidate activity point trajectory generation unit, used to calculate the spatial distance and the time difference between each point in the time-ordered trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is merged into that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory; details are as described in the method embodiment.
Activity point trajectory data processing unit, used to obtain the candidate activity points in the candidate activity point trajectory and, when the difference between a candidate activity point's entry time and departure time is detected to be less than a second set threshold, to remove that candidate activity point from the candidate activity point trajectory, after which the activity point trajectory data is generated; details are as described in the method embodiment.
First probability calculation unit, used to calculate the intensity probability distribution of different group activities over a day from the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of the day; details are as described in the method embodiment.
Second probability calculation unit, used to calculate, from users' check-in data, the activity transition probability distribution of different group activities at different times; details are as described in the method embodiment.
Third probability calculation unit, used to calculate, from users' check-in data, the probability distribution of different group activities being carried out in different areas; details are as described in the method embodiment.
Presetting unit, used to preset the time identification windows for a person's activity locations, denoted the first activity window and the second activity window; details are as described in the method embodiment.
Candidate activity location determination unit, used to obtain a person's activity point trajectory data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity location corresponding to that window at the activity point is taken as a candidate activity location; details are as described in the method embodiment.
Activity location data acquisition unit, used to take the candidate activity location with the longest matched time as the user's activity location data; details are as described in the method embodiment.
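The extraction steps performed by the first four units above (ordering a person's points by time, merging nearby points into one stay, and dropping short stays) can be sketched in Python as follows. The tuple layout, the function name, and all threshold values are assumptions for illustration; the source leaves the thresholds unspecified. The input is assumed to be already sorted by time for a single person.

```python
import math

def extract_stay_points(track, max_dist, max_gap, min_duration):
    """track: time-ordered list of (t, x, y) for one person, with t in
    seconds and x, y in meters (hypothetical units).

    Returns a list of dicts with keys x, y, enter, leave.
    """
    candidates = []
    for t, x, y in track:
        cur = candidates[-1] if candidates else None
        if (cur is not None
                and math.hypot(x - cur["x"], y - cur["y"]) < max_dist
                and t - cur["leave"] < max_gap):
            cur["leave"] = t  # close in space and time: extend the stay
        else:
            # First point, or too far away or too late: start a new stay.
            candidates.append({"x": x, "y": y, "enter": t, "leave": t})
    # Drop candidates whose dwell time is below the second threshold.
    return [c for c in candidates if c["leave"] - c["enter"] >= min_duration]

# Illustrative track only.
track = [(0, 0, 0), (600, 10, 0), (1200, 5, 5), (1500, 2000, 0),
         (1800, 4000, 0), (2400, 4010, 5), (4200, 4005, 0)]
stays = extract_stay_points(track, max_dist=200, max_gap=3600,
                            min_duration=900)
```

In this sketch each stay keeps the coordinates of its first point; using the centroid of the merged points would be an equally plausible choice.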
In the group activity data collection system based on multi-source spatio-temporal trajectory data, the semantic labeling module specifically includes:
Fourth probability calculation unit, used to generate, based on the Bayesian model, the probability formula for a given type of activity being carried out at the next moment, given the location, the time, and the activity type at the previous moment; details are as described in the method embodiment.
Maximum-probability activity type labeling unit, used to calculate, for each activity point in the activity point trajectory data, the probability of each activity type, and to label the activity point with the activity type that has the highest probability; details are as described in the method embodiment.
Spatio-temporal activity trajectory chain generation unit, used to output the spatio-temporal activity trajectory chain after all activity points in the activity point trajectory data have been labeled; details are as described in the method embodiment.
In conclusion the present invention provides a kind of group activity method of data capture based on multi-source space-time trajectory data and System, method include: that backstage obtains originating mobile terminal signaling data and original social software and registers and data and pre-processed, Generate the signaling data to be processed for meeting specific format and data to be processed of registering;The work that backstage is obtained from signaling data to be processed Moving point trace data;Construct and learn the prior information of group activity rule;Moving point track data is obtained, activity venue is obtained Data;Backstage is according to moving point track data, the prior information of group activity rule, activity venue data, using based on pattra leaves This model carry out activity locus of points semantic information label, generation activity space-time trajectory chain.The present invention is carried out using Bayesian model The deduction of individual activity, and previous moment Activity Type is considered in spatio-temporal activity track to the shadow of later moment in time Activity Type It rings, realizes a wide range of, the accurate, quick of magnanimity group activity, high efficiency extraction and collection.
It should be understood that the application of the present invention is not limited to the examples above. Those of ordinary skill in the art can make improvements or modifications based on the above description, and all such improvements and modifications shall fall within the scope of protection of the appended claims of the present invention.

Claims (6)

1. A group activity data collection method based on multi-source spatio-temporal trajectory data, characterized in that the method includes:
A. the backend obtains raw mobile terminal signaling data and raw social software check-in data, preprocesses each of them, and generates the corresponding to-be-processed signaling data and to-be-processed check-in data in a specific format;
B. the backend extracts activity points from the to-be-processed signaling data using preset temporal and spatial rules to obtain activity point trajectory data; constructs and learns the prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and obtains activity location data from the activity point trajectory data;
C. the backend performs semantic labeling of the activity point trajectory with a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, and generates a spatio-temporal activity trajectory chain;
In step B, extracting activity points from the to-be-processed signaling data using preset temporal and spatial rules to obtain activity point trajectory data specifically includes:
B11. the backend obtains the to-be-processed signaling data and sorts it by person and by time according to a specific time rule, yielding each person's time-ordered trajectory;
B12. from a person's time-ordered trajectory, the times at which the person enters and leaves each specific location are calculated, each location the person enters in turn is set as an activity point, and the first location the person enters is set as the first activity point in the activity point trajectory;
B13. the spatial distance and the time difference between each point in the time-ordered trajectory and the existing activity point are calculated; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is merged into that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory;
B14. the candidate activity points in the candidate activity point trajectory are obtained; when the difference between a candidate activity point's entry time and departure time is detected to be less than a second set threshold, that candidate activity point is removed from the candidate activity point trajectory, after which the activity point trajectory data is generated;
In step B, constructing and learning the prior information on group activity patterns from the check-in category information in the to-be-processed check-in data specifically includes:
B21. the intensity probability distribution of different group activities over a day is calculated from the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of the day;
B22. the activity transition probability distribution of different group activities at different times is calculated from users' check-in data;
B23. the probability distribution of different group activities being carried out in different areas is calculated from users' check-in data;
In step B, obtaining activity location data from the activity point trajectory data specifically includes:
B31. the time identification windows for a person's activity locations are preset and denoted the first activity window and the second activity window;
B32. a person's activity point trajectory data is obtained, and the duration of each activity point is matched against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity location corresponding to that window at the activity point is taken as a candidate activity location;
B33. the candidate activity location with the longest matched time is taken as the user's activity location data.
2. The group activity data collection method based on multi-source spatio-temporal trajectory data according to claim 1, characterized in that step A specifically includes:
A1. the backend obtains raw mobile terminal signaling data and performs quality cleaning on it: removing duplicate records, records with missing attributes, records whose time or location lies outside the predefined range, and the data of users whose number of points is below or above certain thresholds, to generate preprocessed signaling data;
A2. the backend obtains raw social software check-in data and performs quality cleaning on it: removing duplicate records, records with missing attributes, records whose time or location lies outside the study area, the data of users whose number of check-ins falls within a certain range, and the data of users who checked in at only one location, to generate preprocessed check-in data;
A3. the spatial resolution of the preprocessed signaling data and the preprocessed check-in data is converted to the resolution of a regular grid of predetermined scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
3. The group activity data collection method based on multi-source spatio-temporal trajectory data according to claim 1, characterized in that step C specifically includes:
C1. based on the Bayesian model, the probability formula for a given type of activity being carried out at the next moment is generated, given the location, the time, and the activity type at the previous moment;
C2. for each activity point in the activity point trajectory data, the probability of each activity type is calculated, and the activity point is labeled with the activity type that has the highest probability;
C3. after all activity points in the activity point trajectory data have been labeled, the spatio-temporal activity trajectory chain is output.
4. A group activity data collection system based on multi-source spatio-temporal trajectory data, characterized in that the system includes:
a preprocessing module, used by the backend to obtain raw mobile terminal signaling data and raw social software check-in data, preprocess each of them, and generate the corresponding to-be-processed signaling data and to-be-processed check-in data in a specific format;
an activity location data acquisition module, used by the backend to extract activity points from the to-be-processed signaling data using preset temporal and spatial rules, yielding activity point trajectory data; to construct and learn the prior information on group activity patterns from the check-in category information in the to-be-processed check-in data; and to obtain activity location data from the activity point trajectory data;
a semantic labeling module, used by the backend to perform semantic labeling of the activity point trajectory with a Bayesian model, based on the activity point trajectory data, the prior information on group activity patterns, and the activity location data, and to generate a spatio-temporal activity trajectory chain;
The activity location data acquisition module specifically includes:
a sorting unit, used by the backend to obtain the to-be-processed signaling data and sort it by person and by time according to a specific time rule, yielding each person's time-ordered trajectory;
an activity point marking unit, used to calculate, from a person's time-ordered trajectory, the times at which the person enters and leaves each specific location, to set each location the person enters in turn as an activity point, and to set the first location the person enters as the first activity point in the activity point trajectory;
a candidate activity point trajectory generation unit, used to calculate the spatial distance and the time difference between each point in the time-ordered trajectory and the existing activity point; if the spatial distance is less than a set threshold and the time difference is less than a set threshold, the point is merged into that activity point; otherwise, the point is set as a new activity point; this continues until all points in the time-ordered trajectory have been processed, yielding the candidate activity point trajectory;
an activity point trajectory data processing unit, used to obtain the candidate activity points in the candidate activity point trajectory and, when the difference between a candidate activity point's entry time and departure time is detected to be less than a second set threshold, to remove that candidate activity point from the candidate activity point trajectory, after which the activity point trajectory data is generated;
a first probability calculation unit, used to calculate the intensity probability distribution of different group activities over a day from the check-in categories of the social check-in platform and the total volume of users' check-in data in different time periods of the day;
a second probability calculation unit, used to calculate, from users' check-in data, the activity transition probability distribution of different group activities at different times;
a third probability calculation unit, used to calculate, from users' check-in data, the probability distribution of different group activities being carried out in different areas;
a presetting unit, used to preset the time identification windows for a person's activity locations, denoted the first activity window and the second activity window;
a candidate activity location determination unit, used to obtain a person's activity point trajectory data and match the duration of each activity point against the first activity window and the second activity window; if the duration of an activity point falls within an activity window and accounts for 50% or more of the total length of that window, the activity location corresponding to that window at the activity point is taken as a candidate activity location;
an activity location data acquisition unit, used to take the candidate activity location with the longest matched time as the user's activity location data.
5. The group activity data collection system based on multi-source spatio-temporal trajectory data according to claim 4, characterized in that the preprocessing module specifically includes:
a signaling data processing unit, used by the backend to obtain raw mobile terminal signaling data and perform quality cleaning on it: removing duplicate records, records with missing attributes, records whose time or location lies outside the predefined range, and the data of users whose number of points is below or above certain thresholds, to generate preprocessed signaling data;
a check-in data processing unit, used by the backend to obtain raw social software check-in data and perform quality cleaning on it: removing duplicate records, records with missing attributes, records whose time or location lies outside the study area, the data of users whose number of check-ins falls within a certain range, and the data of users who checked in at only one location, to generate preprocessed check-in data;
a resolution conversion unit, used to convert the spatial resolution of the preprocessed signaling data and the preprocessed check-in data to the resolution of a regular grid of predetermined scale, generating the corresponding to-be-processed signaling data and to-be-processed check-in data.
6. The group activity data collection system based on multi-source spatio-temporal trajectory data according to claim 4, characterized in that the semantic labeling module specifically includes:
a fourth probability calculation unit, used to generate, based on the Bayesian model, the probability formula for a given type of activity being carried out at the next moment, given the location, the time, and the activity type at the previous moment;
a maximum-probability activity type labeling unit, used to calculate, for each activity point in the activity point trajectory data, the probability of each activity type, and to label the activity point with the activity type that has the highest probability;
a spatio-temporal activity trajectory chain generation unit, used to output the spatio-temporal activity trajectory chain after all activity points in the activity point trajectory data have been labeled.
CN201610517438.6A 2016-07-04 2016-07-04 Group activity method of data capture and system based on multi-source space-time trajectory data Active CN106211071B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610517438.6A CN106211071B (en) 2016-07-04 2016-07-04 Group activity method of data capture and system based on multi-source space-time trajectory data

Publications (2)

Publication Number Publication Date
CN106211071A CN106211071A (en) 2016-12-07
CN106211071B true CN106211071B (en) 2019-05-21

Family

ID=57464652

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610517438.6A Active CN106211071B (en) 2016-07-04 2016-07-04 Group activity method of data capture and system based on multi-source space-time trajectory data

Country Status (1)

Country Link
CN (1) CN106211071B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107169260B (en) * 2017-03-23 2021-05-11 四川省公安厅 Heterogeneous multi-source data resonance system and method based on space-time trajectory
CN107274058A (en) * 2017-05-10 2017-10-20 福建海峡中创网络信息技术股份有限公司 A kind of determination methods of mechanics
CN108629000A (en) * 2018-05-02 2018-10-09 深圳市数字城市工程研究中心 A kind of the group behavior feature extracting method and system of mobile phone track data cluster
CN108597224B (en) * 2018-05-02 2020-05-19 深圳市数字城市工程研究中心 Method and system for identifying to-be-improved traffic facilities based on space-time trajectory data
CN109918395A (en) * 2019-02-19 2019-06-21 北京明略软件系统有限公司 One kind of groups method for digging and device
CN110543457A (en) * 2019-09-11 2019-12-06 北京明略软件系统有限公司 Track type document processing method and device, storage medium and electronic device
CN111275969B (en) * 2020-02-15 2022-02-25 湖南大学 Vehicle track filling method based on intelligent identification of road environment
CN112069573B (en) * 2020-08-24 2021-04-13 深圳大学 City group space simulation method, system and equipment based on cellular automaton
CN112070304B (en) * 2020-09-09 2021-05-18 深圳大学 City group element interaction measuring method, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7373524B2 (en) * 2004-02-24 2008-05-13 Covelight Systems, Inc. Methods, systems and computer program products for monitoring user behavior for a server application
CN102880719A (en) * 2012-10-16 2013-01-16 四川大学 User trajectory similarity mining method for location-based social network
CN104750751A (en) * 2013-12-31 2015-07-01 华为技术有限公司 Method and device for annotating trace data
CN104750829A (en) * 2015-04-01 2015-07-01 华中科技大学 User position classifying method and system based on signing in features
CN105243148A (en) * 2015-10-25 2016-01-13 西华大学 Checkin data based spatial-temporal trajectory similarity measurement method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Exploring the distribution and dynamics of functional regions using mobile phone data and social media data;Jinzhou CAO;《CUPUM》;20151231;正文第2.1-2.4.2节,图2.5

Also Published As

Publication number Publication date
CN106211071A (en) 2016-12-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant