CN103995859A - Geographical-tag-oriented hot spot area event detection system applied to LBSN - Google Patents

Publication number
CN103995859A
Authority
CN
China
Prior art keywords
regx
region
geographical labels
poi
hot spot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410206191.7A
Other languages
Chinese (zh)
Other versions
CN103995859B (en)
Inventor
李巍
李国君
李云春
蒋江涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING ZHONGSHI INFORMATION TECHNOLOGY Co.,Ltd.
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201410206191.7A
Publication of CN103995859A
Application granted
Publication of CN103995859B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information

Abstract

The invention discloses a geographical-tag-oriented hot spot area event detection system applied to an LBSN, and belongs to the technical field of network data processing. The detection system runs within the LBSN and consists of a check-in clustering module, a tag-clustering-based area calculation module and a hot spot area event calculation module. The check-in clustering module clusters check-in information to obtain the geographical areas corresponding to that information. The tag-clustering-based area calculation module applies a geographical tag clustering algorithm to those geographical areas to obtain a set of in-cluster areas. The hot spot area event calculation module uses the check-in frequency within a time window to obtain hot spot area events from the in-cluster area set, and provides those events to the user. Because the system first forms clusters and then clusters the points inside each cluster over a smaller range, it greatly reduces the volume of data to be processed in the LBSN and improves calculation efficiency.

Description

A hot spot region event detection system based on geographical tags, applied to an LBSN
Technical field
The present invention relates to the technical field of geographical-tag check-ins and, more particularly, to a hot spot region event detection system based on geographical tags that is applied to an LBSN, in which hot spot regions are delineated by clustering on both tags and geographic positions.
Background
A geographical tag (geo tag) is data that describes the geographic position of a point of interest; it includes the point of interest's address and its latitude and longitude. Geographical tags digitize the geographic position of a point of interest, which facilitates global data positioning and the tracing of geographic position information. A geographical tag is also called a geographical indication. A check-in record is the data produced when a social subject (a user) checks in at a point of interest.
At present, location-based social networks (LBSN, Location-based Social Networking) are becoming increasingly popular. With the rapid development of fourth-generation mobile communication networks, and with smartphones offering strong interface support for map services and embedded GPS modules, mobile users can easily identify their positions and share them through an LBSN database. In an LBSN database, users can find and create points of interest (POI), check in at their current position, post comments and suggestions, add friends, and so on. LBSNs such as Foursquare, Facebook Places and Sina Weibo have therefore adopted different mechanisms to attract users and to encourage them to share their check-in information. Some studies have begun to use this user-generated, geographically tagged check-in information, because it lets researchers analyze questions of societal interest in a data-driven way, discover users' movement patterns from check-in information, predict friend relationships, and better understand different aspects of a city. Check-in information can also be used to discover hot spot regions.
At present, hot spot region events based on geographical tags are mainly discovered in one way: a geographic grid is first divided manually, the check-ins inside each grid cell are then counted, and a cell whose check-in total reaches a certain threshold is marked as a hot spot region. This method has three problems. (1) The predefined grid may split one actual region across different cells, so it cannot reflect the actual hot spot region. (2) The criterion for a hot spot region is only whether the total number of check-ins in the predefined cell has reached a threshold; the influence of time is not considered. (3) The grid cells are large, so it is difficult to locate a region more precisely.
Summary of the invention
In view of the characteristics of LBSN check-in data and the shortcomings of existing methods for discovering hot spot region events, the present invention proposes a hot spot region event detection system based on geographical tags. The system takes users' recent check-in history into account and combines it with geospatial information to build a coarse-grained region division; it then applies a geographical tag clustering algorithm to compute fine-grained region ranges; finally, within those regions, it computes the hot spot regions under a given time window. The system designed in the present invention is embedded alongside the LBSN database and runs together with the LBSN.
In the hot spot region event detection system based on geographical tags applied to an LBSN according to the present invention, the system (3) is arranged between the LBSN database (2) of the LBSN and the user (1);
The hot spot region event detection system (3) based on geographical tags comprises a check-in clustering module (31), a tag-clustering-based region computing module (32) and a hot spot region event computing module (33); the hot spot region event computing module (33) is the connection interface between the LBSN database (2) and the user (1);
In a first aspect, the check-in clustering module (31) sends to the LBSN database (2) a check-in request $Q_{31-2}$ containing geographical tags, where $Q_{31-2} = R\_POI_p(x, y), POI$;
$R\_POI_p(x, y)$ denotes the geographic position of a check-in point, x denotes longitude and y denotes latitude;
POI denotes the geographical tags; any one geographical tag in POI is written a and another is written b, with $a, b \in POI$;
In a second aspect, the check-in clustering module (31) uses $Q_{31-2} = R\_POI_p(x, y), POI$ to retrieve from the LBSN database (2) the check-in records that match the geographical tags POI, written as the returned check-in information $Q_{2-31}$;
In a third aspect, the check-in clustering module (31) applies the k-means clustering method to the received $Q_{2-31}$ at the clustering time interval kcluster-span, obtains the region block information $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$, and outputs $Q_{31-32}$ to the tag-clustering-based region computing module (32);
$regX_1$ denotes the first region block in any geographic region R;
$regX_2$ denotes the second region block in any geographic region R;
$regX_y$ denotes the last region block in any geographic region R;
y denotes the number of region blocks;
In a first aspect, the tag-clustering-based region computing module (32) receives the region block information $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$;
In a second aspect, the tag-clustering-based region computing module (32) processes $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$ according to the geographical tag clustering strategy POI-CP, obtains the converged geographical-tag region blocks $Q_{32-2}$, and writes them to the LBSN database (2);
In a first aspect, the hot spot region event computing module (33) accepts the hot spot region query request Request from the user (1), where $Request = \{Geo(x, y), dist, Hot\}$, and forwards the Request to the LBSN database (2);
In a second aspect, the hot spot region event computing module (33) uses $Request = \{Geo(x, y), dist, Hot\}$ to retrieve from the LBSN database (2) the hot spot regions that match $Geo(x, y)$, written as the returned query information $Q_{2-33}$;
In a third aspect, the hot spot region event computing module (33) applies the check-in frequency strategy under a time window, POI-TP, to $Q_{2-33}$, obtains the region hot spot event ranking ChecFreq, and feeds ChecFreq back to the user (1).
In the present invention, the geographical tag clustering strategy POI-CP comprises the following steps:
a step of extracting the geographical tags POI that belong to the same region block $regX_y$;
a step of counting the positions of the geographical tags POI that belong to the same region block $regX_y$;
a step of computing the position center point of each geographical tag and the maximum linear distance between that center point and the tag's positions, and then comparing this maximum linear distance with the region radius threshold $r_{threshold}$: if the maximum linear distance exceeds $r_{threshold}$, $r_{threshold}$ is assigned to the tag's region block distance correlation radius in its region; otherwise the maximum linear distance is taken as that radius. The sum of the distance correlation radius $rD_a^{regX_y}$ of tag a and the distance correlation radius $rD_b^{regX_y}$ of tag b is then divided by the center point distance $CLD_{a-b}^{regX_y}$ between any two geographical tags a, b in POI, which gives the distance correlation $H\_rel_{a-b}^{regX_y} = \frac{rD_a^{regX_y} + rD_b^{regX_y}}{CLD_{a-b}^{regX_y}}$;
a step of computing the semantic correlation between any two geographical tags a, b in POI: $S\_rel_{a-b}^{Q_{31-32}} = 1 - \frac{E_{a-b}^{Q_{31-32}}}{\max(L_{a-b}^{Q_{31-32}})}$;
a step of comparing $H\_rel_{a-b}^{regX_y}$ with the distance correlation threshold $rel_{distance}$ and $S\_rel_{a-b}^{Q_{31-32}}$ with the semantic correlation threshold $rel_{semantic}$, and merging region blocks $regX_y$ according to the comparison result:
if the distance correlation satisfies its threshold and the semantic correlation satisfies its threshold, the check-in positions of tag b are merged into the check-in positions of tag a;
if either the distance correlation or the semantic correlation fails its threshold, the check-in positions of tag b are not merged with the check-in positions of tag a.
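The comparison and merge decision in POI-CP can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: $E_{a-b}$ is assumed to be the Levenshtein edit distance between the two tag strings, $\max(L_{a-b})$ is assumed to be the longer of the two string lengths, positions are treated as planar (x, y) coordinates, and "satisfies its threshold" is read as greater than or equal to. All function names are hypothetical.

```python
import math


def region_radius(d_max, r_threshold):
    """Cap a tag's maximum center-to-position distance at r_threshold."""
    return min(d_max, r_threshold)


def distance_correlation(rd_a, rd_b, center_a, center_b):
    """H_rel = (rD_a + rD_b) / CLD, where CLD is the distance between centers."""
    cld = math.dist(center_a, center_b)
    if cld == 0:
        return math.inf  # assumption: coincident centers count as fully correlated
    return (rd_a + rd_b) / cld


def edit_distance(s, t):
    """Levenshtein distance (assumed meaning of E in the S_rel formula)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]


def semantic_correlation(tag_a, tag_b):
    """S_rel = 1 - E / max(L), with L taken as the tag string lengths."""
    longest = max(len(tag_a), len(tag_b))
    if longest == 0:
        return 1.0
    return 1 - edit_distance(tag_a, tag_b) / longest


def should_merge(h_rel, s_rel, rel_distance, rel_semantic):
    """Merge tag b into tag a only when both correlations reach their thresholds."""
    return h_rel >= rel_distance and s_rel >= rel_semantic
```

With radii of 1.0 each and centers 3.0 apart, `distance_correlation` gives 2/3; two tags such as "canteen A" and "canteen B" differ by one edit, so they pass a semantic threshold of 0.5, whereas "canteen A" and "hospital" do not.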
In the present invention, the check-in frequency strategy under a time window, POI-TP, works as follows: the hot spot region event computing module (33) computes in real time the hot spot events of any region $regX_y$ requested by the user (1). For each requested region, it queries the hot spot region clustering database (2) for the historical check-in records $Q_{2-33}$ containing geographical tags up to the current moment t. An event is described by $Trend = \frac{ChecFreq_t - ChecFreq_{t-1}}{\Delta T}$, where $\Delta T$ denotes the time window, $\Delta T = |t - (t-1)|$, t denotes the current moment, t-1 denotes the previous moment, $ChecFreq_t$ denotes the check-in frequency at the current moment t, and $ChecFreq_{t-1}$ denotes the check-in frequency at the previous moment t-1;
The activity level Rank of an event is proportional to the check-in frequency within $\Delta T$ and to the duration of the check-ins, that is:
$Rank = \sum_{i=1}^{regX_y} \frac{ChecFreq_i^t}{\Omega_i} \times \frac{1 + \max_{j \in regX_j}(CU_j)}{regX_y}$;
where $ChecFreq_i^t$ denotes the check-in frequency within the time window $\Delta T$; $regX_y$ denotes any region, i.e. the elements summed over; i denotes the summation index; $\Omega_i$ denotes the total number of check-ins in the regions within the range requested by the user (1) during $\Delta T$; $\max_{j \in regX_j}(CU_j)$ denotes the maximum number of time windows over all current hot spot regions; and j denotes the identifier of the region with the maximum number of time windows.
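The Trend expression of the POI-TP strategy can be illustrated with a short Python sketch. It assumes, beyond what the text states, that check-ins are available as numeric timestamps, that "moment t" refers to the window of length ΔT ending at t, and that "moment t-1" is the window immediately before it. The function names are hypothetical.

```python
def check_freq(timestamps, end, delta_t):
    """Count check-ins in the half-open window (end - delta_t, end]."""
    return sum(1 for ts in timestamps if end - delta_t < ts <= end)


def trend(timestamps, t, delta_t):
    """Trend = (ChecFreq_t - ChecFreq_{t-1}) / delta_t for consecutive windows."""
    current = check_freq(timestamps, t, delta_t)
    previous = check_freq(timestamps, t - delta_t, delta_t)
    return (current - previous) / delta_t
```

A positive value means check-ins in the region are accelerating from one window to the next; a negative value means the region is cooling down.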
The advantages of the hot spot region event detection system based on geographical tags designed in the present invention are:
1. The system overcomes the low hot spot region precision that results from manually dividing regions in an LBSN database. It first clusters hot spot regions in geographic space to obtain a rough hot spot region division, which reduces the data volume in the clustering process.
2. The system performs fine-grained mining inside the coarse-grained clusters and obtains hot spot regions that match reality, so that the discovered hot spot region events better meet practical needs.
3. The system divides hot spot regions by time window and finds hot spot regions by searching against a threshold on the rate of change of check-ins, which shortens LBSN query time and improves response speed.
4. The system has a modular design, and the LBSN uses the hot spot region event computing module as its connection interface for interacting with users, which improves the efficiency with which users query hot spot region events.
Brief description of the drawings
Fig. 1 is a structural block diagram of the hot spot region event detection system based on geographical tags according to the present invention.
Fig. 2 is a sequence diagram of the hot spot region event detection system based on geographical tags according to the present invention.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings.
Fig. 1 shows the structure of the check-in-based hot spot region event detection system according to the present invention. The hot spot region event detection system 3 based on geographical tags is arranged between the existing LBSN database 2 and the user 1, and comprises a check-in clustering module 31, a tag-clustering-based region computing module 32 and a hot spot region event computing module 33. The hot spot region event computing module 33 is the connection interface between the LBSN database 2 and the user 1.
In the present invention, hot spot region event detection is performed on the historical check-in information provided by the LBSN database 2 of a location-based social network (LBSN, Location-based Social Networking); this database is the data source for finding information related to hot spot regions.
In the present invention, the check-in information for a point of interest POI in any geographic region is written $R\_POI_p(x, y), POI$. R denotes the geographic region; POI denotes a character string in R, namely the geographical tag of the point of interest in that region, which is also the source string required for semantic analysis; the character length of POI is written $L_{POI}$; p denotes the number of check-ins by social subjects at POI; $POI_p(x, y)$ denotes the geographic position of the p-th check-in point; x denotes longitude and y denotes latitude.
Generally, for the purposes of describing this patent application, the geographical tags POI may be set to dining room A, hospital B, library C, teaching building D and so on in any region R. In set form, $POI = \{A, B, C, D\}$. To express the geographical tags POI in general terms, any one geographical tag in POI is written a and another is written b, with $a, b \in POI$. The geographical tags are described as follows:
The check-in information for dining room A in any geographic region R is written $R\_A_\alpha(x, y), A$; R denotes the geographic region; A denotes a character string in R, whose character length is written $L_A$ (i.e. "geographic region" plus "dining room", $L_A = 12$, at 2 bytes per Chinese character); α denotes the number of check-ins by social subjects at A; $A_\alpha(x, y)$ denotes the geographic position of a check-in point; x denotes longitude and y denotes latitude.
The check-in information for hospital B in any geographic region R is written $R\_B_\beta(x, y), B$; R denotes the geographic region; B denotes a character string in R, whose character length is written $L_B$ (i.e. "geographic region" plus "hospital", $L_B = 12$); β denotes the number of check-ins by social subjects at B; $B_\beta(x, y)$ denotes the geographic position of a check-in point; x denotes longitude and y denotes latitude.
The check-in information for library C in any geographic region R is written $R\_C_\gamma(x, y), C$; R denotes the geographic region; C denotes a character string in R, whose character length is written $L_C$ (i.e. "geographic region" plus "library", $L_C = 14$); γ denotes the number of check-ins by social subjects at C; $C_\gamma(x, y)$ denotes the geographic position of a check-in point; x denotes longitude and y denotes latitude.
The check-in information for teaching building D in any geographic region R is written $R\_D_\theta(x, y), D$; R denotes the geographic region; D denotes a character string in R, whose character length is written $L_D$ (i.e. "geographic region" plus "teaching building", $L_D = 14$); θ denotes the number of check-ins by social subjects at D; $D_\theta(x, y)$ denotes the geographic position of a check-in point; x denotes longitude and y denotes latitude.
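The character lengths 12 and 14 quoted above follow from counting 2 bytes per Chinese character. The Python sketch below reproduces that arithmetic. The Chinese strings are back-translations chosen for illustration (they are assumptions, not quoted from the original filing), and GBK is assumed as a 2-bytes-per-character encoding; `tag_length_bytes` is a hypothetical helper.

```python
def tag_length_bytes(region_prefix, name, encoding="gbk"):
    """Byte length of the tag string formed by the region prefix plus the POI name."""
    return len((region_prefix + name).encode(encoding))


# Assumed back-translations: "geographic region" and the four example POI names.
REGION = "地理区域"
EXAMPLES = {"A": "餐厅", "B": "医院", "C": "图书馆", "D": "教学楼"}

lengths = {tag: tag_length_bytes(REGION, name) for tag, name in EXAMPLES.items()}
```

Under these assumptions the four-character prefix contributes 8 bytes, a two-character name adds 4 bytes (total 12) and a three-character name adds 6 bytes (total 14).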
User 1
When using the hot spot region event detection system 3 based on geographical tags designed in the present invention, the user 1, in a first aspect, sends the hot spot region query request $Request = \{Geo(x, y), dist, Hot\}$ to the hot spot region event computing module 33; in a second aspect, the user 1 receives the real-time clustered hot spot region information ChecFreq returned by the hot spot region event computing module 33.
In the query request $Request = \{Geo(x, y), dist, Hot\}$, $Geo(x, y)$ denotes the geographic position of the requesting user, where x is longitude and y is latitude; dist denotes the interest distance radius set by the user; Hot denotes the hot spot region the user is interested in.
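As an illustration of how such a request could be represented and used to select candidate regions, here is a minimal Python sketch. The field types, the unit of `dist` (metres), the haversine distance, and the names `Request`, `haversine_m` and `regions_in_range` are all assumptions made for this example; the text does not specify them.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, used by the haversine formula


@dataclass(frozen=True)
class Request:
    x: float     # longitude of the requesting user, in degrees
    y: float     # latitude of the requesting user, in degrees
    dist: float  # interest distance radius (assumed to be in metres)
    hot: str     # hot spot region of interest (representation assumed)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def regions_in_range(request, region_centers):
    """Names of regions whose center lies within request.dist of Geo(x, y)."""
    return [name for name, (lon, lat) in region_centers.items()
            if haversine_m(request.x, request.y, lon, lat) <= request.dist]
```

A request with `dist=1000` placed a few hundred metres from one region center and several kilometres from another would return only the nearer region.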
As shown in Fig. 1 and Fig. 2, the hot spot region event detection system 3 based on geographical tags comprises a check-in clustering module 31, a tag-clustering-based region computing module 32 and a hot spot region event computing module 33. Each module is described in detail below.
Check-in clustering module 31
In a first aspect, the check-in clustering module 31 sends to the LBSN database 2 the user check-in request containing geographical tags, $Q_{31-2} = R\_POI_p(x, y), POI$;
In a second aspect, the check-in clustering module 31 uses $Q_{31-2} = R\_POI_p(x, y), POI$ to retrieve from the LBSN database 2 the multiple check-in records $Q_{2-31}$ that match the geographical tags POI;
In a third aspect, the check-in clustering module 31 applies the k-means clustering method to the received check-in information $Q_{2-31}$ at the clustering time interval kcluster-span, obtains the region block information $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$, and outputs $Q_{31-32}$ to the tag-clustering-based region computing module 32.
In the present invention, $R\_POI_p(x, y)$ in $Q_{31-2} = R\_POI_p(x, y), POI$ denotes the geographic position of a check-in point, where x denotes longitude and y denotes latitude. POI denotes the geographical tag, i.e. the geographic name at $R\_POI_p(x, y)$, which is also the content recorded in the source string. Here $POI = \{A, B, C, D\}$, where A is the dining room tag, B is the hospital tag, C is the library tag and D is the teaching building tag.
In the present invention, in the region block information $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$, $regX_1$ denotes the first region block in any geographic region R, $regX_2$ denotes the second region block in any geographic region R, $regX_y$ denotes the last region block in any geographic region R, and y denotes the number of region blocks. $regX_y$ is also used to refer to any one region block marked off in any geographic region R.
For example, the check-in records that match dining room A are $R\_A_1(x, y), A$; $R\_A_2(x, y), A$; $R\_A_3(x, y), A$; $R\_A_4(x, y), A$; $R\_A_5(x, y), A$; ...; $R\_A_\alpha(x, y), A$;
$R\_A_1(x, y)$ denotes the first check-in point position of dining room A;
$R\_A_2(x, y)$ denotes the second check-in point position of dining room A;
$R\_A_3(x, y)$ denotes the third check-in point position of dining room A;
$R\_A_4(x, y)$ denotes the fourth check-in point position of dining room A;
$R\_A_5(x, y)$ denotes the fifth check-in point position of dining room A;
$R\_A_\alpha(x, y)$ denotes the last check-in point position of dining room A; for convenience of description, $R\_A_\alpha(x, y)$ is also used to refer to any one check-in point position of dining room A;
For example, the check-in records that match hospital B are $R\_B_1(x, y), B$; $R\_B_2(x, y), B$; $R\_B_3(x, y), B$; $R\_B_4(x, y), B$; $R\_B_5(x, y), B$; ...; $R\_B_\beta(x, y), B$;
$R\_B_1(x, y)$ denotes the first check-in point position of hospital B;
$R\_B_2(x, y)$ denotes the second check-in point position of hospital B;
$R\_B_3(x, y)$ denotes the third check-in point position of hospital B;
$R\_B_4(x, y)$ denotes the fourth check-in point position of hospital B;
$R\_B_5(x, y)$ denotes the fifth check-in point position of hospital B;
$R\_B_\beta(x, y)$ denotes the last check-in point position of hospital B; for convenience of description, $R\_B_\beta(x, y)$ is also used to refer to any one check-in point position of hospital B;
For example, the check-in records that match library C are $R\_C_1(x, y), C$; $R\_C_2(x, y), C$; $R\_C_3(x, y), C$; $R\_C_4(x, y), C$; $R\_C_5(x, y), C$; ...; $R\_C_\gamma(x, y), C$;
$R\_C_1(x, y)$ denotes the first check-in point position of library C;
$R\_C_2(x, y)$ denotes the second check-in point position of library C;
$R\_C_3(x, y)$ denotes the third check-in point position of library C;
$R\_C_4(x, y)$ denotes the fourth check-in point position of library C;
$R\_C_5(x, y)$ denotes the fifth check-in point position of library C;
$R\_C_\gamma(x, y)$ denotes the last check-in point position of library C; for convenience of description, $R\_C_\gamma(x, y)$ is also used to refer to any one check-in point position of library C;
For example, the check-in records that match teaching building D are $R\_D_1(x, y), D$; $R\_D_2(x, y), D$; $R\_D_3(x, y), D$; $R\_D_4(x, y), D$; $R\_D_5(x, y), D$; ...; $R\_D_\theta(x, y), D$;
$R\_D_1(x, y)$ denotes the first check-in point position of teaching building D;
$R\_D_2(x, y)$ denotes the second check-in point position of teaching building D;
$R\_D_3(x, y)$ denotes the third check-in point position of teaching building D;
$R\_D_4(x, y)$ denotes the fourth check-in point position of teaching building D;
$R\_D_5(x, y)$ denotes the fifth check-in point position of teaching building D;
$R\_D_\theta(x, y)$ denotes the last check-in point position of teaching building D; for convenience of description, $R\_D_\theta(x, y)$ is also used to refer to any one check-in point position of teaching building D.
For the dining room A, hospital B, library C and teaching building D given as examples, the check-in records $Q_{2-31}$ that match the geographical tags POI are:
$$
Q_{2-31} = \left\{ \begin{array}{ll}
R\_A: & A_1(x,y),\ A_2(x,y),\ A_3(x,y),\ A_4(x,y),\ A_5(x,y),\ \cdots,\ A_\alpha(x,y) \\
R\_B: & B_1(x,y),\ B_2(x,y),\ B_3(x,y),\ B_4(x,y),\ B_5(x,y),\ \cdots,\ B_\beta(x,y) \\
R\_C: & C_1(x,y),\ C_2(x,y),\ C_3(x,y),\ C_4(x,y),\ C_5(x,y),\ \cdots,\ C_\gamma(x,y) \\
R\_D: & D_1(x,y),\ D_2(x,y),\ D_3(x,y),\ D_4(x,y),\ D_5(x,y),\ \cdots,\ D_\theta(x,y)
\end{array} \right.
$$
For example, the first region block $regX_1$ contains the check-in records $R\_A_2(x, y), A$; $R\_A_3(x, y), A$; $R\_A_4(x, y), A$; $R\_A_5(x, y), A$; $R\_A_\alpha(x, y), A$; $R\_B_1(x, y), B$; $R\_B_2(x, y), B$; $R\_B_3(x, y), B$; $R\_B_4(x, y), B$; $R\_C_1(x, y), C$; $R\_C_2(x, y), C$; $R\_D_1(x, y), D$ and $R\_D_\theta(x, y), D$. In set form, the first region block $regX_1$ is:
$$
regX_1 = \left\{ \begin{array}{l}
[R\_A_2(x,y), A],\ [R\_A_3(x,y), A],\ [R\_A_4(x,y), A],\ [R\_A_5(x,y), A],\ [R\_A_\alpha(x,y), A], \\
[R\_B_1(x,y), B],\ [R\_B_2(x,y), B],\ [R\_B_3(x,y), B],\ [R\_B_4(x,y), B], \\
[R\_C_1(x,y), C],\ [R\_C_2(x,y), C], \\
[R\_D_1(x,y), D],\ [R\_D_\theta(x,y), D]
\end{array} \right\}.
$$
For example, the second region block $regX_2$ contains the check-in records $R\_A_1(x, y), A$; $R\_B_5(x, y), B$; $R\_B_\beta(x, y), B$; $R\_C_3(x, y), C$; $R\_C_4(x, y), C$; $R\_C_5(x, y), C$ and $R\_C_\gamma(x, y), C$. In set form, the second region block $regX_2$ is:
$$
regX_2 = \left\{ \begin{array}{l}
[R\_A_1(x,y), A], \\
[R\_B_5(x,y), B],\ [R\_B_\beta(x,y), B], \\
[R\_C_3(x,y), C],\ [R\_C_4(x,y), C],\ [R\_C_5(x,y), C],\ [R\_C_\gamma(x,y), C]
\end{array} \right\}.
$$
For example, the last region block $regX_y$ contains the check-in records $R\_D_2(x, y), D$; $R\_D_3(x, y), D$; $R\_D_4(x, y), D$ and $R\_D_5(x, y), D$. In set form, the last region block $regX_y$ is:
$$
regX_y = \left\{ [R\_D_2(x,y), D],\ [R\_D_3(x,y), D],\ [R\_D_4(x,y), D],\ [R\_D_5(x,y), D] \right\}.
$$
Collecting all region blocks, the region block information is expressed in set form as $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$.
In the present invention, for the k-means clustering method, refer to "Big Data: Large-Scale Internet Data Mining and Distributed Processing", translated by Wang Bin, 1st edition, September 2012.
In the present invention, the check-in clustering module 31 is deliberately given a simple task: it obtains rough check-in records from the huge LBSN database 2 and applies the k-means clustering method to them, so that the check-in records are aggregated into their respective regions.
The k-means clustering method is applied to the $R\_POI_p(x, y)$ positions in the database, and this processing does not provide an online real-time service to the outside. The purpose of the check-in clustering module 31 is to roughly delimit the extent of the regions CR (the coarse-grained region division). In addition, because the extent of a region CR changes little in check-in space, the division into regions CR can be processed offline at the clustering time interval kcluster-span.
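The coarse-grained division performed by module 31 can be illustrated with a plain-Python version of Lloyd's k-means iteration. This is a sketch under assumptions: check-in positions are treated as planar (x, y) points, the initial centers are picked at evenly spaced indices (a simplification that keeps the example deterministic), and the function name `kmeans_region_blocks` is hypothetical. In an offline setting, such a job would be rerun once per kcluster-span interval.

```python
import math


def kmeans_region_blocks(points, k, iterations=20):
    """Group (x, y) check-in positions into k region blocks with Lloyd's algorithm."""
    # Simplified deterministic initialisation: evenly spaced input points.
    centers = [points[i * len(points) // k] for i in range(k)]
    blocks = []
    for _ in range(iterations):
        # Assignment step: each point joins the block of its nearest center.
        blocks = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            blocks[nearest].append(p)
        # Update step: each center moves to the mean of its block (kept if empty).
        centers = [
            (sum(x for x, _ in b) / len(b), sum(y for _, y in b) / len(b)) if b else centers[i]
            for i, b in enumerate(blocks)
        ]
    return blocks
```

Each returned list plays the role of one region block $regX_i$; in a full pipeline the tag of every check-in would be carried along with its position so that module 32 can refine the block.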
Tag-clustering-based region computing module 32
In a first aspect, the tag-clustering-based region computing module 32 receives the region block information $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$;
In a second aspect, the tag-clustering-based region computing module 32 processes $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$ according to the geographical tag clustering strategy POI-CP, obtains the converged geographical-tag region blocks $Q_{32-2}$, and writes the converged geographical-tag region blocks $Q_{32-2}$ to the LBSN database 2.
In the present invention, the concrete processing of geographical tags, i.e. the implementation steps of the geographical tag clustering strategy POI-CP, is illustrated below by example:
(1) First region block
In the present invention, the geographical tags in the first region block $regX_1$ are processed according to the geographical tag clustering strategy POI-CP in the following steps:
Step 101: extract the geographical tags belonging to the same region block
From $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$, extract the geographical tags that belong to the first region block $regX_1$. If the geographical tags in the first region block $regX_1$ include A, B and C, the geographical tags belonging to $regX_1$ are described in set form as:
$$
regX_1 = \left\{ \begin{array}{l}
[R\_A_1(x,y), A],\ [R\_A_2(x,y), A],\ [R\_A_3(x,y), A],\ [R\_A_4(x,y), A],\ [R\_A_5(x,y), A],\ [R\_A_\alpha(x,y), A], \\
[R\_B_1(x,y), B],\ [R\_B_2(x,y), B],\ [R\_B_3(x,y), B],\ [R\_B_4(x,y), B], \\
[R\_C_1(x,y), C],\ [R\_C_2(x,y), C]
\end{array} \right\}.
$$
Step 102: obtain the number of positions of each geographical tag
Classify the positions of geographical tag A in the first region block $regX_1$ to obtain the positions of tag A in $regX_1$: $Add_A^{X_1} = \{R\_A_1(x,y), R\_A_2(x,y), R\_A_3(x,y), R\_A_4(x,y), R\_A_5(x,y), R\_A_\alpha(x,y)\}$. The number of times A appears in $regX_1$ is written $\alpha_A^{regX_1}$, and $\alpha_A^{regX_1} = 6$.
Classify the positions of geographical tag B in the first region block $regX_1$ to obtain the positions of tag B in $regX_1$: $Add_B^{X_1} = \{R\_B_1(x,y), R\_B_2(x,y), R\_B_3(x,y), R\_B_4(x,y)\}$. The number of times B appears in $regX_1$ is written $\alpha_B^{regX_1}$, and $\alpha_B^{regX_1} = 4$.
Classify the positions of geographical tag C in the first region block $regX_1$ to obtain the positions of tag C in $regX_1$: $Add_C^{X_1} = \{R\_C_1(x,y), R\_C_2(x,y)\}$. The number of times C appears in $regX_1$ is written $\alpha_C^{regX_1}$, and $\alpha_C^{regX_1} = 2$.
Classify the positions of geographical tag D in the first region block $regX_1$ to obtain the positions of tag D in $regX_1$: $Add_D^{X_1} = \{R\_D_1(x,y), R\_D_\theta(x,y)\}$. The number of times D appears in $regX_1$ is written $\alpha_D^{regX_1}$, and $\alpha_D^{regX_1} = 2$.
In the present invention, the total number of check-ins for all geographical tags in the first region block $regX_1$ is written $\alpha_{POI}^{regX_1}$, and $\alpha_{POI}^{regX_1} = \alpha_A^{regX_1} + \alpha_B^{regX_1} + \alpha_C^{regX_1} + \alpha_D^{regX_1}$.
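Steps 101 and 102 amount to grouping the records of one region block by tag and counting them. A minimal Python sketch follows, assuming each check-in record is stored as a `((x, y), tag)` pair; that representation and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def positions_by_tag(region_block):
    """Group check-in positions by tag, mirroring the Add sets of Step 102."""
    groups = defaultdict(list)
    for position, tag in region_block:
        groups[tag].append(position)
    return dict(groups)


def tag_counts(region_block):
    """Per-tag check-in counts for one region block."""
    return Counter(tag for _, tag in region_block)


def total_checkins(region_block):
    """Total count over all tags, i.e. the sum of the per-tag counts."""
    return sum(tag_counts(region_block).values())
```

For a block holding two records tagged C and two tagged D, `tag_counts` returns 2 for each tag and `total_checkins` returns 4, which is the sum relation used for the block total above.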
Step 103: distance correlation
Step 103-1: for $Add_A^{X_1} = \{R\_A_1(x,y), R\_A_2(x,y), R\_A_3(x,y), R\_A_4(x,y), R\_A_5(x,y), R\_A_\alpha(x,y)\}$, compute the longitude mean $\bar{x}\_A = \frac{\sum_{i=1}^{\alpha^{X_1}} R\_A_i(x)}{\alpha^{X_1}}$ and the latitude mean $\bar{y}\_A = \frac{\sum_{i=1}^{\alpha^{X_1}} R\_A_i(y)}{\alpha^{X_1}}$, where $i$ is the summation index. This gives the position center point $(\bar{x}\_A, \bar{y}\_A)$ of the A label positions in $regX_1$. Then compute the distance from each label position in $Add_A^{X_1}$ to this center point and select the maximum linear distance, denoted $LD_{A-max}^{X_1}$.
For $Add_B^{X_1} = \{R\_B_1(x,y), R\_B_2(x,y), R\_B_3(x,y), R\_B_4(x,y)\}$, compute the longitude mean $\bar{x}\_B = \frac{\sum_{j=1}^{\beta^{X_1}} R\_B_j(x)}{\beta^{X_1}}$ and the latitude mean $\bar{y}\_B = \frac{\sum_{j=1}^{\beta^{X_1}} R\_B_j(y)}{\beta^{X_1}}$, where $j$ is the summation index. This gives the position center point $(\bar{x}\_B, \bar{y}\_B)$ of the B label positions in $regX_1$. Then compute the distance from each label position in $Add_B^{X_1}$ to this center point and select the maximum linear distance, denoted $LD_{B-max}^{X_1}$.
For $Add_C^{X_1} = \{R\_C_1(x,y), R\_C_2(x,y)\}$, compute the longitude mean $\bar{x}\_C = \frac{\sum_{m=1}^{\gamma^{X_1}} R\_C_m(x)}{\gamma^{X_1}}$ and the latitude mean $\bar{y}\_C = \frac{\sum_{m=1}^{\gamma^{X_1}} R\_C_m(y)}{\gamma^{X_1}}$, where $m$ is the summation index. This gives the position center point $(\bar{x}\_C, \bar{y}\_C)$ of the C label positions. Then compute the distance from each label position in $Add_C^{X_1}$ to this center point and select the maximum linear distance, denoted $LD_{C-max}^{X_1}$.
For $Add_D^{X_1} = \{R\_D_1(x,y), R\_D_\theta(x,y)\}$, compute the longitude mean $\bar{x}\_D = \frac{\sum_{n=1}^{\theta^{X_1}} R\_D_n(x)}{\theta^{X_1}}$ and the latitude mean $\bar{y}\_D = \frac{\sum_{n=1}^{\theta^{X_1}} R\_D_n(y)}{\theta^{X_1}}$, where $n$ is the summation index. This gives the position center point $(\bar{x}\_D, \bar{y}\_D)$ of the D label positions. Then compute the distance from each label position in $Add_D^{X_1}$ to this center point and select the maximum linear distance, denoted $LD_{D-max}^{X_1}$.
In the present invention, the maximum linear distances between the geographical labels in the first region unit $regX_1$ and the position center points of their label positions are denoted $LD_{POI-max}^{regX_1}$, and $LD_{POI-max}^{regX_1} = \{LD_{A-max}^{X_1}, LD_{B-max}^{X_1}, LD_{C-max}^{X_1}, LD_{D-max}^{X_1}\}$.
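Step 103-1 can be sketched as follows. This is a minimal illustration only: the function name `center_and_max_distance` and the sample coordinates are hypothetical, and positions are treated as plane coordinates, in line with the plane distance formula the description uses.

```python
import math


def center_and_max_distance(points):
    """Return the mean (x, y) of the check-in positions of one label and the
    largest straight-line distance from any position to that mean."""
    if not points:
        raise ValueError("at least one check-in position is required")
    n = len(points)
    cx = sum(x for x, _ in points) / n  # longitude mean
    cy = sum(y for _, y in points) / n  # latitude mean
    ld_max = max(math.hypot(x - cx, y - cy) for x, y in points)
    return (cx, cy), ld_max


# Hypothetical positions for a label with two check-ins.
add_c = [(0.0, 0.0), (2.0, 0.0)]
center, ld_max = center_and_max_distance(add_c)
```

For real longitude and latitude values a geodesic distance may be more appropriate; the text does not specify one, so the sketch stays with the plane formula.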
Step 103-2: set the region unit radius threshold, denoted $r_{threshold}$;
If the maximum linear distance $LD_{POI-max}^{regX_1}$ is less than the region radius threshold $r_{threshold}$, assign $r_{threshold}$ to the region unit distance correlation radius $rD_{POI}$ of the region concerned;
If the maximum linear distance $LD_{POI-max}^{regX_1}$ is greater than or equal to the region radius threshold $r_{threshold}$, take the maximum linear distance as the region unit distance correlation radius $rD_{POI}$ of the region concerned;
Similarly, the distance correlation radius of geographical label a is denoted $rD_a^{regX_1}$, and the distance correlation radius of geographical label b is denoted $rD_b^{regX_1}$;
For example, if $LD_{A-max}^{X_1} < r_{threshold}$, assign $r_{threshold}$ to the region unit distance correlation radius $rD_A^{X_1}$ in $regX_1$; if $LD_{A-max}^{X_1} \geq r_{threshold}$, assign $LD_{A-max}^{X_1}$ to $rD_A^{X_1}$;
For example, if $LD_{B-max}^{X_1} < r_{threshold}$, assign $r_{threshold}$ to the region unit distance correlation radius $rD_B^{X_1}$ in $regX_1$; if $LD_{B-max}^{X_1} \geq r_{threshold}$, assign $LD_{B-max}^{X_1}$ to $rD_B^{X_1}$;
For example, if $LD_{C-max}^{X_1} < r_{threshold}$, assign $r_{threshold}$ to the region unit distance correlation radius $rD_C^{X_1}$ in $regX_1$; if $LD_{C-max}^{X_1} \geq r_{threshold}$, assign $LD_{C-max}^{X_1}$ to $rD_C^{X_1}$;
For example, if $LD_{D-max}^{X_1} < r_{threshold}$, assign $r_{threshold}$ to the region unit distance correlation radius $rD_D^{X_1}$ in $regX_1$; if $LD_{D-max}^{X_1} \geq r_{threshold}$, assign $LD_{D-max}^{X_1}$ to $rD_D^{X_1}$.
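The rule in step 103-2 amounts to taking the larger of the maximum linear distance and the radius threshold. A minimal sketch, in which the function name `correlation_radius` is hypothetical:

```python
def correlation_radius(ld_max, r_threshold):
    """Distance correlation radius of one label: the radius threshold when the
    maximum linear distance is below it, otherwise the maximum linear distance."""
    return r_threshold if ld_max < r_threshold else ld_max


# Hypothetical values: a tight cluster is widened to the threshold,
# a spread-out cluster keeps its own maximum distance.
tight = correlation_radius(0.5, 1.0)
spread = correlation_radius(2.5, 1.0)
```

The threshold acts as a floor, so a label with very few or very close check-ins still gets a usable radius.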
Step 103-3: compute the center point distance between any two geographical label positions in $regX_1$;
The center point distance between the positions of A and B is denoted $CLD_{A-B}^{regX_1}$;
The center point distance between the positions of A and C is denoted $CLD_{A-C}^{regX_1}$;
The center point distance between the positions of A and D is denoted $CLD_{A-D}^{regX_1}$;
The center point distance between the positions of B and C is denoted $CLD_{B-C}^{regX_1}$;
The center point distance between the positions of B and D is denoted $CLD_{B-D}^{regX_1}$;
In the present invention, the center point distance between any two geographical labels a, b of the geographical labels POI in the first region unit $regX_1$ is denoted $CLD_{a-b}^{regX_1}$.
Step 103-4: define the distance correlation in $regX_1$;
For example, the distance correlation between A and B is denoted $H\_rel_{A-B}^{X_1} = \frac{rD_A^{X_1} + rD_B^{X_1}}{CLD_{A-B}^{regX_1}}$;
For example, the distance correlation between A and C is denoted $H\_rel_{A-C}^{X_1} = \frac{rD_A^{X_1} + rD_C^{X_1}}{CLD_{A-C}^{regX_1}}$;
For example, the distance correlation between A and D is denoted $H\_rel_{A-D}^{X_1} = \frac{rD_A^{X_1} + rD_D^{X_1}}{CLD_{A-D}^{regX_1}}$;
For example, the distance correlation between B and C is denoted $H\_rel_{B-C}^{X_1} = \frac{rD_B^{X_1} + rD_C^{X_1}}{CLD_{B-C}^{regX_1}}$;
For example, the distance correlation between B and D is denoted $H\_rel_{B-D}^{X_1} = \frac{rD_B^{X_1} + rD_D^{X_1}}{CLD_{B-D}^{regX_1}}$.
In the present invention, the plane distance is computed with the well-known distance formula, e.g. $|AB| = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
In the present invention, the distance correlation in the first region unit $regX_1$ is denoted $H\_rel_{a-b}^{regX_1} = \frac{rD_a^{regX_1} + rD_b^{regX_1}}{CLD_{a-b}^{regX_1}}$.
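The distance correlation is the sum of two labels' correlation radii divided by the distance between their center points. The sketch below illustrates this under stated assumptions: the function name and sample values are hypothetical, and returning infinity for coincident centers is an assumption, since the text does not say how a zero center distance is handled.

```python
import math


def distance_correlation(center_a, radius_a, center_b, radius_b):
    """(rD_a + rD_b) / CLD_a-b, with the center point distance taken from
    the plane distance formula."""
    cld = math.hypot(center_a[0] - center_b[0], center_a[1] - center_b[1])
    if cld == 0:
        # Assumption: identical centers are treated as maximally correlated.
        return math.inf
    return (radius_a + radius_b) / cld


# Hypothetical centers 5 units apart with radii 1.0 and 1.5.
h_rel = distance_correlation((0.0, 0.0), 1.0, (3.0, 4.0), 1.5)
```

A value of 1 or more means the two radius circles touch or overlap; smaller values mean the label clusters are further apart relative to their size.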
Step 104: semantic correlation
In the present invention, the semantic distance is defined as the edit distance between POIs, denoted $E_{POI}$. The edit distance $E_{POI}$ is the cost of the cheapest sequence of operations that converts the source string POI = {A, B, C, D} into the target string TPOI = {A, B, C, D}. For the calculation of the edit distance, see pages 218-219 of "Introduction to Algorithms", 1st edition, 14th printing, December 2009, by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein (USA), translated by Pan Jingui, Gu Tiecheng, Li Chengfa and Ye Maoyi.
For example, in the first region unit $regX_1$ the string length of canteen A is denoted $L_A^{X_1}$, the string length of hospital B is denoted $L_B^{X_1}$, the string length of library C is denoted $L_C^{X_1}$, and the string length of teaching building D is denoted $L_D^{X_1}$.
For example, in the first region unit $regX_1$ the edit distance between canteen A and hospital B is denoted $E_{A-B}^{X_1}$, the edit distance between canteen A and library C is denoted $E_{A-C}^{X_1}$, the edit distance between canteen A and teaching building D is denoted $E_{A-D}^{X_1}$, the edit distance between hospital B and library C is denoted $E_{B-C}^{X_1}$, and the edit distance between hospital B and teaching building D is denoted $E_{B-D}^{X_1}$.
For example, in the first region unit $regX_1$ the semantic correlation between canteen A and hospital B is denoted $S\_rel_{A-B}^{X_1} = 1 - \frac{E_{A-B}^{X_1}}{\max(L_A^{X_1}, L_B^{X_1})}$;
For example, in the first region unit $regX_1$ the semantic correlation between canteen A and library C is denoted $S\_rel_{A-C}^{X_1} = 1 - \frac{E_{A-C}^{X_1}}{\max(L_A^{X_1}, L_C^{X_1})}$;
For example, in the first region unit $regX_1$ the semantic correlation between canteen A and teaching building D is denoted $S\_rel_{A-D}^{X_1} = 1 - \frac{E_{A-D}^{X_1}}{\max(L_A^{X_1}, L_D^{X_1})}$;
For example, in the first region unit $regX_1$ the semantic correlation between hospital B and library C is denoted $S\_rel_{B-C}^{X_1} = 1 - \frac{E_{B-C}^{X_1}}{\max(L_B^{X_1}, L_C^{X_1})}$.
For example, in the first region unit $regX_1$ the semantic correlation between hospital B and teaching building D is denoted $S\_rel_{B-D}^{X_1} = 1 - \frac{E_{B-D}^{X_1}}{\max(L_B^{X_1}, L_D^{X_1})}$.
In the present invention, the semantic correlation of any two geographical labels in the region information $Q_{31-32}$ is denoted $S\_rel_{a-b}^{Q_{31-32}} = 1 - \frac{E_{a-b}^{Q_{31-32}}}{\max(L_{a-b}^{Q_{31-32}})}$.
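The semantic correlation can be illustrated with a short sketch. It assumes unit-cost insert, delete and substitute operations (the classic Levenshtein distance); the cited textbook formulation allows other operation costs, so this is one possible instance rather than the exact procedure of the text. The function names are hypothetical.

```python
def edit_distance(source, target):
    """Unit-cost Levenshtein distance, computed row by row."""
    prev = list(range(len(target) + 1))
    for i, s in enumerate(source, 1):
        cur = [i]
        for j, t in enumerate(target, 1):
            cost = 0 if s == t else 1
            cur.append(min(prev[j] + 1,          # delete from source
                           cur[j - 1] + 1,       # insert into source
                           prev[j - 1] + cost))  # substitute or keep
        prev = cur
    return prev[-1]


def semantic_correlation(label_a, label_b):
    """1 - E / max(L_a, L_b), where L is the string length of a label."""
    longest = max(len(label_a), len(label_b))
    if longest == 0:
        return 1.0  # assumption: two empty labels count as identical
    return 1 - edit_distance(label_a, label_b) / longest


same = semantic_correlation("library", "library")
different = semantic_correlation("abc", "xyz")
```

Identical label strings give a correlation of 1, and strings that share no aligned characters give 0.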
Step 105: decide whether to merge region units
Set the distance correlation threshold, denoted $rel_{distance}$, and the semantic correlation threshold, denoted $rel_{semantic}$, and decide according to $rel_{distance}$ and $rel_{semantic}$ whether to merge region units;
Step 105-1: if $H\_rel_{A-B}^{X_1}$ satisfies $rel_{distance}$ and $S\_rel_{A-B}^{X_1}$ satisfies $rel_{semantic}$, merge $Add_B^{X_1}$ into $Add_A^{X_1}$, updating it to $Add_{A\_new}^{X_1} = \{R\_A_1(x,y), R\_A_2(x,y), R\_A_3(x,y), R\_A_4(x,y), R\_A_5(x,y), R\_A_\alpha(x,y), R\_B_1(x,y), R\_B_2(x,y), R\_B_3(x,y), R\_B_4(x,y)\}$; output $Add_{A\_new}^{X_1}$ to the LBSN database, and execute step 105-2;
Step 105-2: if $H\_rel_{A-B}^{X_1}$ does not satisfy $rel_{distance}$ or $S\_rel_{A-B}^{X_1}$ does not satisfy $rel_{semantic}$, $Add_A^{X_1}$ and $Add_B^{X_1}$ are not merged; output $Add_A^{X_1}$ and $Add_B^{X_1}$ to the LBSN database, and execute step 105-3;
Step 105-3: if $H\_rel_{A-C}^{X_1}$ satisfies $rel_{distance}$ and $S\_rel_{A-C}^{X_1}$ satisfies $rel_{semantic}$, merge $Add_C^{X_1}$ into $Add_A^{X_1}$, updating it to $Add_{A\_new}^{X_1} = \{R\_A_1(x,y), R\_A_2(x,y), R\_A_3(x,y), R\_A_4(x,y), R\_A_5(x,y), R\_A_\alpha(x,y), R\_C_1(x,y), R\_C_2(x,y)\}$; output $Add_{A\_new}^{X_1}$ to the LBSN database, and execute step 105-4;
Step 105-4: if $H\_rel_{A-C}^{X_1}$ does not satisfy $rel_{distance}$ or $S\_rel_{A-C}^{X_1}$ does not satisfy $rel_{semantic}$, $Add_A^{X_1}$ and $Add_C^{X_1}$ are not merged; output $Add_A^{X_1}$ and $Add_C^{X_1}$ to the LBSN database, and execute step 105-5;
Step 105-5: if $H\_rel_{A-D}^{X_1}$ satisfies $rel_{distance}$ and $S\_rel_{A-D}^{X_1}$ satisfies $rel_{semantic}$, merge $Add_D^{X_1}$ into $Add_A^{X_1}$, updating it to $Add_{A\_new}^{X_1} = \{R\_A_1(x,y), R\_A_2(x,y), R\_A_3(x,y), R\_A_4(x,y), R\_A_5(x,y), R\_A_\alpha(x,y), R\_D_1(x,y), R\_D_\theta(x,y)\}$; output $Add_{A\_new}^{X_1}$ to the LBSN database, and execute step 105-6;
Step 105-6: if $H\_rel_{A-D}^{X_1}$ does not satisfy $rel_{distance}$ or $S\_rel_{A-D}^{X_1}$ does not satisfy $rel_{semantic}$, $Add_A^{X_1}$ and $Add_D^{X_1}$ are not merged; output $Add_A^{X_1}$ and $Add_D^{X_1}$ to the LBSN database, and execute step 105-7;
Step 105-7: if $H\_rel_{B-C}^{X_1}$ satisfies $rel_{distance}$ and $S\_rel_{B-C}^{X_1}$ satisfies $rel_{semantic}$, merge $Add_C^{X_1}$ into $Add_B^{X_1}$, updating it to $Add_{B\_new}^{X_1} = \{R\_B_1(x,y), R\_B_2(x,y), R\_B_3(x,y), R\_B_4(x,y), R\_C_1(x,y), R\_C_2(x,y)\}$; output $Add_{B\_new}^{X_1}$ to the LBSN database, and execute step 105-8;
Step 105-8: if $H\_rel_{B-C}^{X_1}$ does not satisfy $rel_{distance}$ or $S\_rel_{B-C}^{X_1}$ does not satisfy $rel_{semantic}$, $Add_B^{X_1}$ and $Add_C^{X_1}$ are not merged; output $Add_B^{X_1}$ and $Add_C^{X_1}$ to the LBSN database, and execute step 105-9;
Step 105-9: if $H\_rel_{B-D}^{X_1}$ satisfies $rel_{distance}$ and $S\_rel_{B-D}^{X_1}$ satisfies $rel_{semantic}$, merge $Add_D^{X_1}$ into $Add_B^{X_1}$, updating it to $Add_{B\_new}^{X_1} = \{R\_B_1(x,y), R\_B_2(x,y), R\_B_3(x,y), R\_B_4(x,y), R\_D_1(x,y), R\_D_\theta(x,y)\}$; output $Add_{B\_new}^{X_1}$ to the LBSN database, and execute step 105-10;
Step 105-10: if $H\_rel_{B-D}^{X_1}$ does not satisfy $rel_{distance}$ or $S\_rel_{B-D}^{X_1}$ does not satisfy $rel_{semantic}$, $Add_B^{X_1}$ and $Add_D^{X_1}$ are not merged; output $Add_B^{X_1}$ and $Add_D^{X_1}$ to the LBSN database.
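The pairwise decisions of step 105 can be sketched as below. The direction of the comparisons is not legible in this text, so the sketch assumes that a pair is merged when both correlations are greater than or equal to their thresholds; that reading, the function names and the sample values are all assumptions.

```python
def should_merge(h_rel, s_rel, rel_distance, rel_semantic):
    """Assumed rule: merge only when both the distance correlation and the
    semantic correlation reach their thresholds."""
    return h_rel >= rel_distance and s_rel >= rel_semantic


def merge_positions(add_a, add_b, h_rel, s_rel, rel_distance, rel_semantic):
    """Return the position sets to write to the database: one merged list
    when the pair qualifies, otherwise the two original lists."""
    if should_merge(h_rel, s_rel, rel_distance, rel_semantic):
        return [add_a + add_b]
    return [add_a, add_b]


# Hypothetical positions and correlation values.
merged = merge_positions([(0, 0)], [(1, 1)], 1.2, 0.8, 1.0, 0.5)
kept_apart = merge_positions([(0, 0)], [(1, 1)], 1.2, 0.3, 1.0, 0.5)
```

If the intended comparison turns out to be strict or reversed, only `should_merge` needs to change.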
In the present invention, the check-in position information of the first region unit $regX_1$ that is written to the LBSN database 2 is expressed in set form as $Q_{2-33} = \{checkin_{regX_1}\}$.
(2) Second region unit
In the present invention, the steps for applying the geographical label clustering strategy POI-CP to the geographical labels in the second region unit $regX_2$ are:
Step 201: extract the geographical labels belonging to the same region unit
From $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$, extract the geographical labels that belong to the second region unit $regX_2$. If the geographical labels in the second region unit $regX_2$ include B and C, the geographical labels belonging to $regX_2$ are described by the set:
$regX_2 = \{[R\_B_5(x,y), B], [R\_B_\beta(x,y), B], [R\_C_3(x,y), C], [R\_C_4(x,y), C], [R\_C_5(x,y), C], [R\_C_\gamma(x,y), C]\}$.
Step 202: obtain the number of positions of each geographical label;
Classify the positions of geographical label B in the second region unit $regX_2$ to obtain the positions of label B in $regX_2$, $Add_B^{X_2} = \{R\_B_5(x,y), R\_B_\beta(x,y)\}$; the number of times B appears in $regX_2$ is denoted $\alpha_B^{regX_2}$, and $\alpha_B^{regX_2} = 2$.
Classify the positions of geographical label C in the second region unit $regX_2$ to obtain the positions of label C in $regX_2$, $Add_C^{X_2} = \{R\_C_3(x,y), R\_C_4(x,y), R\_C_5(x,y), R\_C_\gamma(x,y)\}$; the number of times C appears in $regX_2$ is denoted $\alpha_C^{regX_2}$, and $\alpha_C^{regX_2} = 4$.
In the present invention, the total number of check-ins of all geographical labels in the second region unit $regX_2$ is denoted $\alpha_{POI}^{regX_2}$, and $\alpha_{POI}^{regX_2} = \alpha_B^{regX_2} + \alpha_C^{regX_2}$.
Step 203: distance correlation
Step 203-1: for $Add_B^{X_2} = \{R\_B_5(x,y), R\_B_\beta(x,y)\}$, compute the longitude mean $\bar{x}\_B = \frac{\sum_{j=1}^{\beta^{X_2}} R\_B_j(x)}{\beta^{X_2}}$ and the latitude mean $\bar{y}\_B = \frac{\sum_{j=1}^{\beta^{X_2}} R\_B_j(y)}{\beta^{X_2}}$, where $j$ is the summation index. This gives the position center point $(\bar{x}\_B, \bar{y}\_B)$ of the B label positions in $regX_2$. Then compute the distance from each label position in $Add_B^{X_2}$ to this center point and select the maximum linear distance, denoted $LD_{B-max}^{X_2}$.
For $Add_C^{X_2} = \{R\_C_3(x,y), R\_C_4(x,y), R\_C_5(x,y), R\_C_\gamma(x,y)\}$, compute the longitude mean $\bar{x}\_C = \frac{\sum_{m=1}^{\gamma^{X_2}} R\_C_m(x)}{\gamma^{X_2}}$ and the latitude mean $\bar{y}\_C = \frac{\sum_{m=1}^{\gamma^{X_2}} R\_C_m(y)}{\gamma^{X_2}}$, where $m$ is the summation index and $m \in \gamma'$. This gives the position center point $(\bar{x}\_C, \bar{y}\_C)$ of the C label positions in $regX_2$. Then compute the distance from each label position in $Add_C^{X_2}$ to this center point and select the maximum linear distance, denoted $LD_{C-max}^{X_2}$.
In the present invention, the maximum linear distances between the geographical labels in the second region unit $regX_2$ and the position center points of their label positions are denoted $LD_{POI-max}^{regX_2}$, and $LD_{POI-max}^{regX_2} = \{LD_{B-max}^{X_2}, LD_{C-max}^{X_2}\}$.
Step 203-2: set the region unit radius threshold, denoted $r_{threshold}$;
If the maximum linear distance $LD_{POI-max}^{regX_2}$ is less than the region radius threshold $r_{threshold}$, assign $r_{threshold}$ to the region unit distance correlation radius $rD_{POI}$ of the region concerned;
If the maximum linear distance $LD_{POI-max}^{regX_2}$ is greater than or equal to the region radius threshold $r_{threshold}$, take the maximum linear distance as the region unit distance correlation radius $rD_{POI}$ of the region concerned;
Similarly, the distance correlation radius of geographical label a is denoted $rD_a^{regX_2}$, and the distance correlation radius of geographical label b is denoted $rD_b^{regX_2}$;
For example, if $LD_{B-max}^{X_2} < r_{threshold}$, assign $r_{threshold}$ to the region unit distance correlation radius $rD_B^{X_2}$ in $regX_2$; if $LD_{B-max}^{X_2} \geq r_{threshold}$, assign $LD_{B-max}^{X_2}$ to $rD_B^{X_2}$;
For example, if $LD_{C-max}^{X_2} < r_{threshold}$, assign $r_{threshold}$ to the region unit distance correlation radius $rD_C^{X_2}$ in $regX_2$; if $LD_{C-max}^{X_2} \geq r_{threshold}$, assign $LD_{C-max}^{X_2}$ to $rD_C^{X_2}$.
Step 203-3: compute the center point distance between any two geographical label positions in $regX_2$;
The center point distance between the positions of B and C is denoted $CLD_{B-C}^{regX_2}$.
Step 203-4: define the distance correlation in $regX_2$;
For example, the distance correlation between B and C is denoted $H\_rel_{B-C}^{X_2} = \frac{rD_B^{X_2} + rD_C^{X_2}}{CLD_{B-C}^{regX_2}}$.
In the present invention, the distance correlation in the second region unit $regX_2$ is denoted $H\_rel_{a-b}^{regX_2} = \frac{rD_a^{regX_2} + rD_b^{regX_2}}{CLD_{a-b}^{regX_2}}$.
Step 204: semantic correlation
For example, in the second region unit $regX_2$ the string length of hospital B is denoted $L_B^{X_2}$ and the string length of library C is denoted $L_C^{X_2}$.
For example, in the second region unit $regX_2$ the edit distance between hospital B and library C is denoted $E_{B-C}^{X_2}$.
For example, in the second region unit $regX_2$ the semantic correlation between hospital B and library C is denoted $S\_rel_{B-C}^{X_2} = 1 - \frac{E_{B-C}^{X_2}}{\max(L_B^{X_2}, L_C^{X_2})}$.
In the present invention, the semantic correlation of the geographical labels in the region information $Q_{31-32}$ is denoted $S\_rel_{POI}^{Q_{31-32}} = 1 - \frac{E_{POI}^{Q_{31-32}}}{\max(L_{POI}^{Q_{31-32}})}$.
Step 205: decide whether to merge region units
Set the distance correlation threshold, denoted $rel_{distance}$, and the semantic correlation threshold, denoted $rel_{semantic}$, and decide according to $rel_{distance}$ and $rel_{semantic}$ whether to merge region units;
Step 205-1: if $H\_rel_{B-C}^{X_2}$ satisfies $rel_{distance}$ and $S\_rel_{B-C}^{X_2}$ satisfies $rel_{semantic}$, merge $Add_C^{X_2}$ into $Add_B^{X_2}$, updating it to $Add_{B\_new}^{X_2} = \{R\_B_5(x,y), R\_B_\beta(x,y), R\_C_3(x,y), R\_C_4(x,y), R\_C_5(x,y), R\_C_\gamma(x,y)\}$; output $Add_{B\_new}^{X_2}$ to the LBSN database, and execute step 205-2;
Step 205-2: if $H\_rel_{B-C}^{X_2}$ does not satisfy $rel_{distance}$ or $S\_rel_{B-C}^{X_2}$ does not satisfy $rel_{semantic}$, $Add_B^{X_2}$ and $Add_C^{X_2}$ are not merged; output $Add_B^{X_2}$ and $Add_C^{X_2}$ to the LBSN database.
In the present invention, the check-in position information of the second region unit $regX_2$ that is written to the LBSN database 2 is expressed in set form as $Q_{2-33} = \{checkin_{regX_2}\}$.
(3) Third region unit
In the present invention, the steps for applying the geographical label clustering strategy POI-CP to the geographical labels in the last region unit $regX_y$ are:
Step 301: extract the geographical labels belonging to the same region unit
From $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$, extract the geographical labels that belong to the last region unit $regX_y$. If the only geographical label in the last region unit $regX_y$ is D, the geographical labels belonging to $regX_y$ are:
$regX_y = \{[R\_D_2(x,y), D], [R\_D_3(x,y), D], [R\_D_4(x,y), D], [R\_D_5(x,y), D]\}$.
Step 302: obtain the number of positions of each geographical label;
Classify the positions of geographical label D in the last region unit $regX_y$ to obtain the positions of label D in $regX_y$, $Add_D^{X_y} = \{R\_D_2(x,y), R\_D_3(x,y), R\_D_4(x,y), R\_D_5(x,y)\}$; the number of times D appears in $regX_y$ is denoted $\alpha_D^{regX_y}$, and $\alpha_D^{regX_y} = 4$.
Step 303: distance correlation
Step 303-1: for $Add_D^{X_y} = \{R\_D_2(x,y), R\_D_3(x,y), R\_D_4(x,y), R\_D_5(x,y)\}$, compute the longitude mean $\bar{x}\_D = \frac{\sum_{n=1}^{\theta^{X_y}} R\_D_n(x)}{\theta^{X_y}}$ and the latitude mean $\bar{y}\_D = \frac{\sum_{n=1}^{\theta^{X_y}} R\_D_n(y)}{\theta^{X_y}}$, where $n$ is the summation index. This gives the position center point $(\bar{x}\_D, \bar{y}\_D)$ of the D label positions in $regX_y$. Then compute the distance from each label position in $Add_D^{X_y}$ to this center point and select the maximum linear distance, denoted $LD_{D-max}^{X_y}$.
In the present invention, the maximum linear distances between the geographical labels in the last region unit $regX_y$ and the position center points of their label positions are denoted $LD_{POI-max}^{regX_y}$, and $LD_{POI-max}^{regX_y} = \{LD_{D-max}^{X_y}\}$.
Step 303-2: set the region unit radius threshold, denoted $r_{threshold}$;
If the maximum linear distance $LD_{POI-max}^{regX_y}$ is less than the region radius threshold $r_{threshold}$, assign $r_{threshold}$ to the region unit distance correlation radius $rD_{POI}$ of the region concerned;
If the maximum linear distance $LD_{POI-max}^{regX_y}$ is greater than or equal to the region radius threshold $r_{threshold}$, take the maximum linear distance as the region unit distance correlation radius $rD_{POI}$ of the region concerned;
For example, if $LD_{D-max}^{X_y} < r_{threshold}$, assign $r_{threshold}$ to the region unit distance correlation radius $rD_D^{X_y}$ in $regX_y$; if $LD_{D-max}^{X_y} \geq r_{threshold}$, assign $LD_{D-max}^{X_y}$ to $rD_D^{X_y}$.
Step 303-3: compute the center point distance between any two geographical label positions in $regX_y$;
Since D is the only geographical label in $regX_y$, $r_{threshold}$ is taken as the center point distance, denoted $CLD_{D-0}^{regX_y}$.
Step 303-4: define the distance correlation in $regX_y$;
For example, the distance correlation of D is denoted $H\_rel_{D-0}^{X_y}$.
Step 304: semantic correlation
For example, in the last region unit $regX_y$ the string length of teaching building D is denoted $L_D^{X_y}$.
For example, since the last region unit $regX_y$ contains only teaching building D, the edit distance of D is denoted $E_{D-0}^{X_y}$, and $E_{D-0}^{X_y} = 0$.
For example, in the last region unit $regX_y$ the semantic correlation of teaching building D is denoted $S\_rel_{D-0}^{X_y} = 1 - \frac{E_{D-0}^{X_y}}{\max(L_D^{X_y}, 0)}$, and $S\_rel_{D-0}^{X_y} = 1$.
Step 305: decide whether to merge region units
Set the distance correlation threshold, denoted $rel_{distance}$, and the semantic correlation threshold, denoted $rel_{semantic}$, and decide according to $rel_{distance}$ and $rel_{semantic}$ whether to merge region units;
In the present invention, because $E_{D-0}^{X_y} = 0$ and $S\_rel_{D-0}^{X_y} = 1$, the last region unit $regX_y$ does not need region merging.
In the present invention, the check-in position information of the last region unit $regX_y$ that is written to the LBSN database 2 is expressed in set form as $Q_{2-33} = \{checkin_{regX_y}\}$.
Hot spot region event computing module 33
In a first aspect, the hot spot region event computing module 33 accepts the hot spot region query request Request from user 1, where Request = {Geo(x, y), dist, Hot}, and forwards Request = {Geo(x, y), dist, Hot} to the LBSN database 2;
In a second aspect, according to Request = {Geo(x, y), dist, Hot}, the hot spot region event computing module 33 searches the LBSN database 2 for the hot spot regions that match Geo(x, y), denoted as the query return information $Q_{2-33}$;
In a third aspect, the hot spot region event computing module 33 processes $Q_{2-33}$ according to the check-in frequency strategy POI-TP under time windows, obtains the regional hotspot event ranking ChecFreq, and feeds ChecFreq back to user 1.
In the present invention, the hot spot region event computing module 33 can compute in real time the hotspot events of the regions within the range requested by user 1. Specifically, it requests from the hot spot region clustering database 2 the historical check-in records containing geographical labels for each region up to the request time (i.e. the current time t), $Q_{2-33} = \{checkin_{regX_1}, checkin_{regX_2}, \ldots, checkin_{regX_y}\}$.
In the present invention, take as an example the detection of hotspot events in any region $regX_y$ from its historical geographical label check-ins. The time window is defined as $\Delta T$, and the check-in frequency occurring within $\Delta T$ is ChecFreq. The check-in frequency ChecFreq is the number of check-ins in the same hot spot region within $\Delta T$. Here $\Delta T = |t - (t-1)|$, where the current time is $t$ and the previous time is $t-1$.
In the present invention, an event is defined as the change in the check-in frequency ChecFreq in any region $regX_y$, and is expressed as Trend:
$Trend = \frac{ChecFreq_t - ChecFreq_{t-1}}{\Delta T}$
When Trend exceeds an event detection threshold $Trend_{threshold}$, the hot spot region event computing module 33 marks the region $regX_y$ as a hotspot event, and labels the event as surge type or plunge type according to the sign of Trend.
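The event test can be sketched as follows. The sketch assumes that "exceeds the threshold" refers to the magnitude of Trend, since a plunge-type event has a negative Trend; that reading and the function names are assumptions rather than something the text states outright.

```python
def trend(freq_now, freq_prev, delta_t=1.0):
    """Change in check-in frequency per unit of time window."""
    return (freq_now - freq_prev) / delta_t


def classify_event(freq_now, freq_prev, trend_threshold, delta_t=1.0):
    """Return "surge", "plunge", or None when the change stays within
    the event detection threshold."""
    value = trend(freq_now, freq_prev, delta_t)
    if abs(value) <= trend_threshold:
        return None
    return "surge" if value > 0 else "plunge"


# Frequencies of two consecutive one-hour windows, threshold 50 per hour.
drop = classify_event(101, 167, 50)
quiet = classify_event(99, 50, 50)
```

With a one-hour window, a fall from 167 to 101 check-ins gives a Trend of -66, which is classified as a plunge under the magnitude reading.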
In the present invention, hotspot events are ranked, and an event that persists over more time windows ranks higher. In any $regX_y$, over consecutive time windows $\Delta T$, if the check-in frequency ChecFreq of a hotspot event exceeds the check-in frequency threshold $ChecFreq_{threshold}$, the number of consecutive time windows over which the event stays above the threshold is taken and denoted $CU_{regX_y}$. Likewise, over all hot spot regions, the numbers of consecutive time windows over which hotspot events persist are denoted $RCU = \{CU_{regX_1}, CU_{regX_2}, \ldots, CU_{regX_y}\}$.
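Counting the consecutive time windows above the check-in frequency threshold can be sketched as below. The function name is hypothetical, and "exceeds" is read as a strict comparison; the sample frequencies are the ones listed for $regX_1$ and $regX_4$ in the embodiment.

```python
def longest_active_run(freqs, freq_threshold):
    """Length of the longest run of consecutive time windows whose
    check-in frequency is strictly above the threshold."""
    best = run = 0
    for freq in freqs:
        run = run + 1 if freq > freq_threshold else 0
        best = max(best, run)
    return best


cu_reg_x1 = longest_active_run([167, 101, 150, 50], 100)
cu_reg_x4 = longest_active_run([112, 22, 23, 12], 100)
```

With a threshold of 100 these give 3 and 1, matching the consecutive active window counts stated for those two regions.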
The activity degree Rank of an event is proportional to the check-in frequency within $\Delta T$ and to the duration of the check-in event, that is:
$Rank = \sum_{i=1}^{regX_y} \frac{ChecFreq_i^t}{\Omega_i} \times \left(1 + \frac{\max_{j \in regX_y}(CU_j)}{regX_y}\right)$
Here $ChecFreq_i^t$ is the number of check-ins within the time window $\Delta T$; $regX_y$ denotes any region, whose elements are summed over; $i$ is the summation index; and $\Omega_i$ is the total number of check-ins in the regions within the range requested by user 1 during the time window $\Delta T$. $\max_{j \in regX_y}(CU_j)$ is the maximum of the time window counts over all current hot spot regions, and $j$ is the identifier of the region with the largest time window count.
The present invention proposes a hot spot region event detection system based on geographical label check-ins, which belongs to the technical field of event detection in location-based social networks. First, the system keeps an LBSN crawling module running without interruption, which writes check-in records containing geographical labels into the LBSN database. Next, a check-in clustering algorithm is applied to obtain coarse region clusters. Then the region clustering module of the system applies the region clustering algorithm containing geographical labels to compute precise regions. The hot spot region computing module responds to user requests and is the only service interface that the system exposes externally: according to the query parameters submitted by the user, this module submits the corresponding query to the database, runs the regional hotspot event detection algorithm on the returned data, computes the hotspot events and their ranking, and returns them to the requesting user.
Embodiment
As shown in Fig. 1 and Fig. 2, suppose the user check-in information that matches the geographical labels in the LBSN database is $Q_{31-2} = R\_POI_p(x, y), POI$, and that there are multiple search results. k-means clustering is applied at the clustering interval kcluster-span to obtain the region unit information $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$. $Q_{31-2} = R\_POI_p(x, y), POI$ is the raw record, which has not yet been clustered.
Suppose that after clustering $Q_{31-32} = \{regX_1, regX_2, \ldots, regX_y\}$ contains 6 coarse-grained geographic regions, so the number of region units is y = 6, i.e. $Q_{31-32} = \{regX_1, regX_2, regX_3, regX_4, regX_5, regX_6\}$.
In $Q_{31-32}$, the distances between the regions are given in the following table:
Suppose the requesting user sends a local hot spot region query request; the system then operates as follows:
Step 1: the requesting user's request information is Request = {Geo(x, y), dist, Hot} = {$inR_3$, 2000, Hot}, where $inR_3$ is the requested location point; its distances to the other regions are:
It can be seen that three regions within the requested range satisfy the condition, namely $regX_1$, $regX_3$ and $regX_4$. The user's request is first received by the hot spot region event computing module 33 of the hot spot region event detection system.
Step 2: the hot spot region event computing module 33 builds a suitable query statement from the parameters passed in by the user and sends it to the database. The database returns the three regions that meet the requirement, $regX_1$, $regX_3$ and $regX_4$, together with their historical check-in records with geographical labels, $Q_{2-33} = \{checkin_{regX_1}, checkin_{regX_2}, \ldots, checkin_{regX_y}\}$, to the hot spot region event computing module 33.
Step 3: after obtaining the check-in records, the hot spot region event computing module 33 starts the computation. The value of $\Delta T$ is 1 hour, the check-in frequency threshold $ChecFreq_{threshold}$ is 100, and the event detection threshold $Trend_{threshold}$ is 50/h. Looking back over 4 time windows, the first time window is $T_1 = t - 3\Delta T$, the second is $T_2 = t - 2\Delta T$, the third is $T_3 = t - \Delta T$, and the fourth is $T_4 = t$.
In region $regX_1$, the check-in frequency ChecFreq of a certain event is:
Period     T_1   T_2   T_3   T_4
ChecFreq   167   101   150   50
Its maximum number of consecutive active time windows is 3 (i.e. the number of consecutive time windows over which the hotspot event persists is $CU_{regX_1} = 3$).
In region $regX_3$, the check-in frequency ChecFreq of a certain event is:
Period     T_1   T_2   T_3   T_4
ChecFreq   24    30    50    99
Its maximum number of consecutive active time windows is 3.
In region $regX_4$, the check-in frequency ChecFreq of the current period is:
Period     T_1   T_2   T_3   T_4
ChecFreq   112   22    23    12
Its maximum number of consecutive active time windows is 1.
According to $Trend_{regX_1} = \frac{ChecFreq_t - ChecFreq_{t-1}}{\Delta T} = 66$, $regX_1$ exceeds the event detection threshold; $Trend_{regX_3}$ does not reach the event detection threshold; $Trend_{regX_4}$ exceeds the event detection threshold. There are therefore two hot spot regions, $regX_1$ and $regX_4$.
For $regX_1$, the number of consecutive active time windows is 3, and the activity degree rank of its event is:
$Rank_{regX_1} = \sum_{i=1}^{4} \frac{ChecFreq_i}{\Omega_i} \times \left(1 + \frac{3}{4}\right) = 0.89 \times 1.48 = 1.31$.
For $regX_4$, the number of consecutive active time windows is 1, and the activity degree rank of its event is:
$Rank_{regX_4} = \sum_{i=1}^{4} \frac{ChecFreq_i}{\Omega_i} \times \left(1 + \frac{1}{4}\right) = 0.66 \times 1.11 = 0.73$.
The hot spot region event computing module 33 returns the result to the user; that is, the returned ranking is ChecFreq = {$regX_1$: 1, $regX_4$: 2}.

Claims (3)

1. the hot spot region incident detection system based on geographical labels that is applied to LBSN network, is characterized in that: the described hot spot region incident detection system (3) based on geographical labels is set between the LBSN database (2) in described LBSN network and user (1);
The described hot spot region incident detection system (3) based on geographical labels includes the cluster module (31) of registering, region computing module (32) and hot spot region event computing module (33) based on label clustering; Described hot spot region event computing module (33) is the interface that is connected between LBSN database (2) and user (1);
The cluster of registering module (31) first aspect is for sending to LBSN database (2) the solicited message Q that registers that contains geographical labels 31-2, described Q 31-2=R_POI p(x, y), POI;
R_POI p(x, y) represents sign-in desk geographic position, and x represents longitude, and y represents latitude;
POI represents geographical labels; Any one geographical labels in described POI is designated as a, and another geographical labels is designated as b, a, b ∈ POI;
The cluster of registering module (31) second aspect is according to Q 31-2=R_POI p(x, y), POI can search out the record of registering mating with geographical labels POI in LBSN database (2), is designated as the return message Q that registers 2-31;
The cluster of registering module (31) third aspect is to the return message Q that registers receiving 2-31carry out the processing of k-means clustering method according to cluster kcluster-span interval time, obtain region unit information Q 31-32, described Q 31-32={ regX 1, regX 2..., regX y, then by Q 31-32export to the region computing module (32) based on label clustering;
RegX 1represent first region unit in any one geographic area R;
RegX 2represent second region unit in any one geographic area R;
RegX yrepresent last region unit in any one geographic area R;
Y represents region unit number;
In its first aspect, the region computing module (32) based on label clustering receives the region block information Q_{31-32} = {regX_1, regX_2, ..., regX_y};
In its second aspect, the region computing module (32) based on label clustering processes Q_{31-32} = {regX_1, regX_2, ..., regX_y} according to the geographical label clustering strategy POI-CP to obtain the converged geographical label region blocks Q_{32-2}, and writes the converged geographical label region blocks to the LBSN database (2);
In its first aspect, the hot spot region event computing module (33) accepts the hot spot region query request Request = {Geo(x, y), dist, Hot} of the user (1) and forwards Request = {Geo(x, y), dist, Hot} to the LBSN database (2);
In its second aspect, the hot spot region event computing module (33) uses Request = {Geo(x, y), dist, Hot} to search the LBSN database (2) for the hot spot regions that match Geo(x, y); the result is denoted the query return message Q_{2-33};
In its third aspect, the hot spot region event computing module (33) processes Q_{2-33} according to the check-in frequency strategy POI-TP under a time window to obtain the regional hot spot event ranking ChecFreq, and feeds ChecFreq back to the user (1).
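The matching of a query request against stored regions can be sketched as a radius filter. This is only an illustration under stated assumptions: dist is taken to be a radius in kilometres, each stored region is represented by a single centre point, the Hot field is carried but not interpreted, and the great-circle distance is computed with the haversine formula on a sphere of radius 6371 km. The names `Request`, `haversine_km` and `match_regions` are hypothetical and do not come from the claims.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

@dataclass
class Request:
    x: float     # longitude of Geo(x, y)
    y: float     # latitude of Geo(x, y)
    dist: float  # search radius; kilometres is an assumption
    hot: int     # the Hot field; assumed to be a count, not interpreted here

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))

def match_regions(request, regions):
    """Return the names of regions whose centre lies within request.dist."""
    return [name for name, (lon, lat) in regions.items()
            if haversine_km(request.x, request.y, lon, lat) <= request.dist]

# Hypothetical region centres (longitude, latitude).
regions = {"block_a": (116.305, 39.985), "block_b": (116.455, 39.905)}
req = Request(x=116.30, y=39.98, dist=5.0, hot=10)
nearby = match_regions(req, regions)
```

With these sample values only `block_a` lies within 5 km of the query point, so it is the only region returned.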
2. The hot spot region event detection system based on geographical labels applied to an LBSN network according to claim 1, characterized in that the geographical label clustering strategy POI-CP comprises the following steps:
a step of extracting the geographical labels POI that belong to the same region block regX_y;
a step of calculating the number of positions of the geographical labels POI that belong to the same region block regX_y;
a step of calculating the maximum linear distance between the position center point of a geographical label POI and the positions of that geographical label, and comparing this maximum linear distance with the region radius threshold r_threshold; depending on the result, either r_threshold or the maximum linear distance is taken as the distance correlation radius of the region block in its region; the sum of the distance correlation radius rD_a^{regX_y} of geographical label a and the distance correlation radius rD_b^{regX_y} of geographical label b is then divided by the center point distance CLD_{a-b}^{regX_y} between any two geographical labels a, b in POI to obtain the distance correlation
$$H\_rel_{a-b}^{regX_y} = \frac{rD_a^{regX_y} + rD_b^{regX_y}}{CLD_{a-b}^{regX_y}};$$
a step of calculating the semantic correlation between any two geographical labels a, b in POI
$$S\_rel_{a-b}^{Q_{31-32}} = 1 - \frac{E_{a-b}^{Q_{31-32}}}{\max\left(L_{a-b}^{Q_{31-32}}\right)};$$
a step of comparing the distance correlation with the distance correlation threshold rel_distance and the semantic correlation with the semantic correlation threshold rel_semantic, and merging the region blocks regX_y according to the comparison result:
if both threshold conditions are met, the check-in positions of geographical label b are merged into the check-in positions of geographical label a;
if either threshold condition is not met, the check-in positions of geographical label b are not merged with the check-in positions of geographical label a.
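The merge decision of POI-CP can be sketched as follows. Several points are assumptions rather than statements of the claim: E is read as the edit (Levenshtein) distance between the two label names and max(L) as the longer name length, a threshold condition is taken to be met when the correlation is greater than or equal to its threshold, and coordinates are treated as planar. All function names and sample values are hypothetical.

```python
from math import dist

def edit_distance(s, t):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        cur = [i]
        for j, ct in enumerate(t, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]

def distance_correlation(r_a, r_b, center_a, center_b):
    """H_rel = (rD_a + rD_b) / CLD_{a-b}."""
    cld = dist(center_a, center_b)
    # Assumption: coincident centres count as maximally correlated.
    return float("inf") if cld == 0 else (r_a + r_b) / cld

def semantic_correlation(name_a, name_b):
    """S_rel = 1 - E / max(L), with E assumed to be the edit distance."""
    return 1 - edit_distance(name_a, name_b) / max(len(name_a), len(name_b))

def should_merge(h_rel, s_rel, rel_distance, rel_semantic):
    # Assumption: a threshold is "met" when the value is >= the threshold.
    return h_rel >= rel_distance and s_rel >= rel_semantic

# Hypothetical labels a and b: radii 3.0 each, centres 5.0 apart.
h = distance_correlation(3.0, 3.0, (0.0, 0.0), (3.0, 4.0))
s = semantic_correlation("Beihang Gym", "Beihang Gymnasium")
merge = should_merge(h, s, rel_distance=1.0, rel_semantic=0.6)
```

In this sample the two radii together exceed the centre distance and the names differ only by a suffix, so both conditions hold and the check-in positions of b would be merged into those of a.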
3. The hot spot region event detection system based on geographical labels applied to an LBSN network according to claim 1, characterized in that the check-in frequency strategy POI-TP under a time window operates as follows: the hot spot region event computing module (33) calculates in real time the hot spot events of any region regX_y requested by the user (1); it requests from the hot spot region clustering database (2) the historical check-in records Q_{2-33}, containing geographical labels, of each requested region up to the current time t; the event trend is described as
$$Trend = \frac{ChecFreq_t - ChecFreq_{t-1}}{\Delta T},$$
where ΔT denotes the time window, ΔT = |t - (t-1)|, t denotes the current time, t-1 denotes the previous time, ChecFreq_t denotes the check-in frequency at the current time t, and ChecFreq_{t-1} denotes the check-in frequency at the previous time t-1;
The activity level Rank of an event is proportional to the check-in frequency within ΔT and to the duration of the check-ins, that is:
$$Rank = \sum_{i=1}^{regX_y} \frac{ChecFreq_i^t}{\Omega_i} \times \frac{1 + \max_{j \in regX_y}(CU_j)}{regX_y};$$
where ChecFreq_i^t denotes the check-in frequency within the time window ΔT, regX_y denotes any region, whose elements are summed over, i denotes the summation index, Ω_i denotes the total number of check-ins in the regions within the range requested by the user (1) in the time window ΔT, \max_{j \in regX_y}(CU_j) denotes the maximum number of time windows among all current hot spot regions, and j denotes the identifier of the region with the largest number of time windows.
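The two quantities of POI-TP can be worked through numerically. The sketch below is illustrative only. For Rank it assumes one particular reading of the formula: the per-region ratios ChecFreq_i / Ω_i are summed, multiplied by (1 + max CU_j), and divided by the number of regions, with regX_y in the denominator interpreted as that count. The function names and sample numbers are hypothetical.

```python
def trend(freq_t, freq_prev, delta_t):
    """Trend = (ChecFreq_t - ChecFreq_{t-1}) / delta_T."""
    return (freq_t - freq_prev) / delta_t

def rank(freqs, totals, window_counts):
    """Assumed reading: sum_i(ChecFreq_i / Omega_i) * (1 + max_j CU_j) / n."""
    n = len(freqs)  # number of regions in the sum (assumed meaning of regX_y)
    share = sum(f / omega for f, omega in zip(freqs, totals))
    return share * (1 + max(window_counts)) / n

# Check-in frequency rose from 18 to 30 over a window of length 2.
t = trend(freq_t=30, freq_prev=18, delta_t=2)  # 6.0

# Two regions with 30 and 10 check-ins out of 100 total each,
# active for 3 and 1 time windows respectively.
r = rank(freqs=[30, 10], totals=[100, 100], window_counts=[3, 1])
```

A positive Trend indicates rising check-in activity in the window, and under the assumed reading a region set that stays active over more time windows receives a higher Rank for the same check-in share.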
CN201410206191.7A 2014-05-15 2014-05-15 A kind of hot spot region incident detection system based on geographical labels applied to LBSN networks Active CN103995859B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410206191.7A CN103995859B (en) 2014-05-15 2014-05-15 A kind of hot spot region incident detection system based on geographical labels applied to LBSN networks

Publications (2)

Publication Number Publication Date
CN103995859A (en) 2014-08-20
CN103995859B (en) 2017-07-21

Family

ID=51310024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410206191.7A Active CN103995859B (en) 2014-05-15 2014-05-15 A kind of hot spot region incident detection system based on geographical labels applied to LBSN networks

Country Status (1)

Country Link
CN (1) CN103995859B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102880719A (en) * 2012-10-16 2013-01-16 四川大学 User trajectory similarity mining method for location-based social network
CN103020130A (en) * 2012-11-20 2013-04-03 北京航空航天大学 k nearest neighbor query method oriented to support area in LBS (Location-based Service) of urban road network
CN103488678A (en) * 2013-08-05 2014-01-01 北京航空航天大学 Friend recommendation system based on user sign-in similarity

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104331483B (en) * 2014-11-05 2017-12-01 北京航空航天大学 Zone issue detection method and equipment based on short text data
CN104331483A (en) * 2014-11-05 2015-02-04 北京航空航天大学 Method and equipment for detecting area events based on short text data
US11113354B2 (en) 2015-01-07 2021-09-07 Alibaba Group Holding Limited Method and apparatus for managing region tag
CN105824840A (en) * 2015-01-07 2016-08-03 阿里巴巴集团控股有限公司 Method and apparatus for region tag management
US11755675B2 (en) 2015-01-07 2023-09-12 Alibaba Group Holding Limited Method and apparatus for managing region tag
CN105847310A (en) * 2015-01-13 2016-08-10 中国移动通信集团江苏有限公司 Position determination method and apparatus
CN105389332A (en) * 2015-10-13 2016-03-09 广西师范学院 Geographical social network based user similarity computation method
CN105389332B (en) * 2015-10-13 2018-09-11 广西师范学院 It is a kind of geography social networks under user's similarity calculation method
CN109257703A (en) * 2018-10-09 2019-01-22 江苏满运软件科技有限公司 Methods of exhibiting, device, electronic equipment, the storage medium of driver's accumulation point
CN111368170A (en) * 2020-02-11 2020-07-03 口碑(上海)信息技术有限公司 Method, device and equipment for polling page data
CN111368170B (en) * 2020-02-11 2023-03-31 口碑(上海)信息技术有限公司 Method, device and equipment for polling page data
CN111339446B (en) * 2020-02-18 2023-04-18 腾讯科技(深圳)有限公司 Interest point mining method and device, electronic equipment and storage medium
CN111339446A (en) * 2020-02-18 2020-06-26 腾讯科技(深圳)有限公司 Interest point mining method and device, electronic equipment and storage medium
CN111523036A (en) * 2020-04-24 2020-08-11 北京百度网讯科技有限公司 Search behavior mining method and device and electronic equipment
CN111523036B (en) * 2020-04-24 2023-12-19 北京百度网讯科技有限公司 Search behavior mining method and device and electronic equipment
CN112148947A (en) * 2020-09-28 2020-12-29 微梦创科网络科技(中国)有限公司 Method and system for mining and reviewing users in batches
CN112148947B (en) * 2020-09-28 2024-03-22 微梦创科网络科技(中国)有限公司 Method and system for excavating and brushing users in batches
CN113392652A (en) * 2021-03-30 2021-09-14 中国人民解放军战略支援部队信息工程大学 Sign-in hotspot functional feature identification method based on semantic clustering

Also Published As

Publication number Publication date
CN103995859B (en) 2017-07-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210510

Address after: 100193 room 402, floor 4, block B, building 12, east yard, No. 10, northwest Wangdong Road, Haidian District, Beijing

Patentee after: BEIJING ZHONGSHI INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 100191 No. 37, Haidian District, Beijing, Xueyuan Road

Patentee before: BEIHANG University