CN106649846A - Geographic space interest point retrieval method based on diversity - Google Patents

Publication number
CN106649846A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611254804.XA
Other languages
Chinese (zh)
Other versions
CN106649846B (en)
Inventor
才智
李彤
兰许
曹阳
丁治明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201611254804.XA
Publication of CN106649846A
Application granted
Publication of CN106649846B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases

Abstract

The invention discloses a diversity-based geospatial point-of-interest retrieval method for obtaining the top-k spatial positions. The method includes the following steps: 1. an initial ranking is computed for a given location point, or for a given combination of a location point and keywords; 2. the other nodes are weakened in geographic space according to the geographic position of the selected highest-scoring node; 3. while the termination condition is not met, a new node is selected. That is, the new scores of the remaining nodes in R after the weakening of text and space are calculated, and the highest-scoring node is selected from them. Finally, for the location point or the combination of location point and keywords input by the user, the algorithm obtains the top-k spatial positions, and the k most comprehensive pieces of information are returned to the user according to the weights of text and spatial position.

Description

Geographic space interest point retrieval method based on diversity
Technical field
The invention belongs to the field of data mining and relates to a diversity-based geospatial point-of-interest retrieval method.
Background
In recent years, with the spread of the Global Positioning System (GPS) on mobile devices such as smartphones, location-based services (LBS) have attracted wide attention from academia and industry. Many location-based services have been widely adopted and provide retrieval experiences related to the user's location.
Existing LBS systems use keyword retrieval to help users find location-related results in a spatial database. Specifically, assume the spatial database contains a set of points of interest (POIs), each of which includes location information and some text information. Given the user's location and a set of query keywords, the LBS system returns the POIs that are relevant to the query in both space and text. However, most current LBS systems simply extract the top-k records by score from the database. To make up for this lack of comprehensive consideration of spatial position, the present invention proposes an algorithm that weakens both text and space, so that the final results cover every direction as far as possible.
The technique introduces the tuple set (Object Summary, abbreviated OS), a set of information tuples, based on spatial location and text, generated from a spatial database containing location information and some text information. An OS can be a tree structure whose root is the data tuple containing the given text information and spatial location, and whose descendant nodes are the nodes adjacent to it in spatial location and text information. Generating an OS requires a relation that holds information about the queried data subject (Data Subject, abbreviated DS); this relation is abbreviated $R_{DS}$ and is the root of the tree structure. It also requires the relations linked to $R_{DS}$, which generate the descendants of $R_{DS}$. For each $R_{DS}$ a DS schema graph, $G_{DS}$, can be formed. The technique repeatedly prunes and optimizes the generated OS to arrive at the important information.
A complete OS may contain thousands of tuples. Including all of them not only consumes more time but also makes it extremely difficult for users to pick out the information useful to them, so the k most useful tuples are selected. For an input natural number k, the algorithm (see Step 3.3) obtains k relatively comprehensive pieces of information from the whole OS. To avoid repeating multiple similar pieces of information and to present the user with information that is as diversified as possible, so that the user can understand the information more fully, the present invention introduces spatial diversity and balances the importance of information through the weights of text and space. The method not only greatly reduces time consumption and improves the efficiency of returning information, but also satisfies the user's demand for diversified search results, so that the spatial positions obtained are not biased toward a single direction.
Summary of the invention
The object of the invention is to provide a diversity-based geospatial point-of-interest retrieval method that, for a location point or a combination of a location point and keywords input by the user, obtains the top-k spatial positions with an algorithm and then returns the k most comprehensive pieces of information to the user according to the weights of text and spatial position.
To achieve the above object, the technical solution adopted by the invention is a diversity-based geospatial point-of-interest retrieval method for obtaining the top-k spatial positions. The method is implemented in the following steps:
Step one: Compute an initial ranking for a given location point or a given combination of a location point and keywords;
Step 1.1: Collect and organize the data set and build the data relations. A directed graph G(V, E) is defined, where $V = \{v_1, ..., v_n\}$ is the set of nodes (vertices), each node representing a piece of information, and E is the set of edges (arcs), $E = \{\langle v_i, v_j \rangle \mid v_i, v_j \in V\}$. $\langle v_i, v_j \rangle$ denotes an edge (arc) from $v_i$ to $v_j$; $v_1, ..., v_n$ denote arbitrary nodes in the directed graph, and n is a natural number;
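The directed graph of Step 1.1 can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `DiGraph`, its methods and the node labels are hypothetical and do not come from the patent, which does not prescribe a storage structure.

```python
from collections import defaultdict


class DiGraph:
    """Directed graph G(V, E): a node set V and ordered edges <vi, vj>."""

    def __init__(self):
        self.nodes = set()
        self.successors = defaultdict(set)  # vi -> set of vj with an edge <vi, vj>

    def add_edge(self, vi, vj):
        # The edge points from vi to vj; the reverse edge is not implied.
        self.nodes.update((vi, vj))
        self.successors[vi].add(vj)

    def has_edge(self, vi, vj):
        return vj in self.successors.get(vi, ())


# Hypothetical node labels, used only to exercise the structure.
g = DiGraph()
g.add_edge("v1", "v2")
g.add_edge("v2", "v3")
```

An adjacency-set layout is chosen here because edge direction matters in the definition of E, and looking up the successors of a node is a constant-time operation on average.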
Step 1.2: Calculate the score of each node $v_i$ in R by the following formula:
$$DF(v_i) = [fs(v_i) \cdot ds(v_i)]^{a_s} \cdot [ft(v_i) \cdot dt(v_i)]^{a_t} \cdot [fg(v_i) \cdot dg(v_i)]^{a_g} \tag{1}$$
where fs(·), ft(·) and fg(·) are the scores of the social, textual and geographical parameters respectively, ds(·), dt(·) and dg(·) are the corresponding diversity scores, and $a_s$, $a_t$ and $a_g$ sum to 1 and control the influence of each parameter.
The diversity score is calculated by the following formula:
$$ds(v_i) = \frac{\sum_{v_j \in R,\, v_i \neq v_j} ss(v_i, v_j)}{k - 1} \tag{2}$$
where $ss(v_i, v_j)$ is the difference between the social parameters of $v_i$ and $v_j$, calculated using the Jaccard distance. The values of dt(·) and dg(·) are calculated in the same way.
In summary, the score of each node in the data set is calculated iteratively, and the node with the highest score, $v_0$, is selected.
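The scoring of Step 1.2 can be sketched as follows. The sketch reads as, at and ag in formula (1) as exponents, which is an interpretation of the flattened formula, and it represents each node's social parameter as a set of tags so that the Jaccard distance applies directly. The function names, the tag sets and the sample values are hypothetical.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |a & b| / |a | b|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets count as identical
    return 1.0 - len(a & b) / len(union)


def diversity(i, features, k):
    """Formula (2): sum of distances from node i to every other node, over (k - 1)."""
    total = sum(jaccard_distance(features[i], features[j])
                for j in features if j != i)
    return total / (k - 1)


def df_score(fs, ds, ft, dt, fg, dg, a_s, a_t, a_g):
    """Formula (1), with a_s, a_t, a_g read as exponents that sum to 1."""
    if abs(a_s + a_t + a_g - 1.0) > 1e-9:
        raise ValueError("a_s, a_t and a_g must sum to 1")
    return (fs * ds) ** a_s * (ft * dt) ** a_t * (fg * dg) ** a_g


# Hypothetical social tag sets for three nodes.
features = {"A": {"music", "art"}, "B": {"art", "drama"}, "C": {"law"}}
ds_a = diversity("A", features, k=3)
```

With exponents that sum to 1, the score is a weighted geometric mean of the three products, so a node that scores near zero on any one parameter is pushed down regardless of the other two.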
Step 2: Weaken the other nodes in geographic space according to the geographic position of the selected highest-scoring node;
Step 2.1: The highest-scoring node selected in Step one is used to weaken the association relations of the other nodes and, at the same time, to weaken them in geographic space. Let the distance from the initial position p to the location of the highest-scoring node $v_0$ be $d(p, v_0)$, the distance from the initial position to another node be $d(p, v_i)$, and the distance from $v_0$ to that node be $d(v_0, v_i)$. The geographic space value is then calculated by the following formula:
$$d_i = \frac{d(v_0, v_i)}{d(p, v_0) + d(p, v_i)} \tag{3}$$
Formula (3) shows that the larger the distance $d(v_0, v_i)$ from $v_0$ to another node, the larger the resulting geographic space value, which indicates that node $v_i$ is farther from the selected node and that the two nodes lie in different spatial directions.
In summary, the geographic space value $d_i$ of each remaining node relative to the selected node is calculated in turn.
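A minimal sketch of the geographic space value of formula (3) follows. The function name `geo_value` and the handling of a zero denominator are assumptions; the patent does not say how that degenerate case is treated.

```python
def geo_value(d_v0_vi, d_p_v0, d_p_vi):
    """Formula (3): d_i = d(v0, vi) / (d(p, v0) + d(p, vi))."""
    denom = d_p_v0 + d_p_vi
    if denom == 0:
        # Assumption: p, v0 and vi coincide, so there is no spatial spread.
        return 0.0
    return d_v0_vi / denom


same_side = geo_value(2.0, 1.0, 3.0)   # vi lies beyond v0, in the same direction from p
opposite = geo_value(4.0, 2.0, 2.0)    # vi and v0 lie on opposite sides of p
```

For straight-line distances the triangle inequality gives d(v0, vi) ≤ d(p, v0) + d(p, vi), so the value stays between 0 and 1 and reaches 1 only when p lies on the segment between the two nodes, that is, when they sit in opposite directions from the query position.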
Step 3: While the termination condition is not met, select a new node;
Step 3.1: Let the result of weakening the association relation be a and the weight of text be α; the text value of a remaining node after weakening is then a × α;
Step 3.2: Let the weight of space be β, where α + β = 1; the spatial value of a remaining node after weakening is then d × β;
Step 3.3: The score of a remaining node after the weakening of text and space is calculated by the following formula:
$$DF'(v_i) = DF(v_i) \times (a \times \alpha + d \times \beta) \tag{4}$$
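Formula (4) translates directly into code. The sketch below is illustrative: the function name is hypothetical, and the default weights of 0.5 are taken from the worked example in the embodiment rather than fixed by the method. The sample values (8.7, 0.538, 0.331) are those given in the embodiment for "Beijing commerce Professional School".

```python
def weakened_score(df, a, d, alpha=0.5, beta=0.5):
    """Formula (4): DF'(vi) = DF(vi) * (a * alpha + d * beta)."""
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha and beta must sum to 1")
    return df * (a * alpha + d * beta)


# Values from the embodiment: initial score 8.7, association weakening 0.538,
# spatial value 0.331, alpha = beta = 0.5.
new_score = weakened_score(8.7, 0.538, 0.331)
```

Because a and d both multiply the previous score, a node that is both strongly associated with the selected node (small a) and close to it in direction (small d) loses most of its score, which is what pushes the next selection toward a different area.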
In summary, the new scores of the remaining nodes in R after the weakening of text and space are calculated, and the highest-scoring node is then selected from them. The process of selecting k results is therefore:
1.) Initialize the queue $H_k$ as empty; input a location point or a combination of a location point and keywords;
2.) Build the data relations according to the input information;
3.) Calculate the score of each node;
4.) Take the highest-scoring node and add it to $H_k$; set l = 1;
5.) If l < k, go to 6.); otherwise go to 9.);
6.) Weaken the association relations according to the selected node and calculate the $d_i$ values;
7.) Calculate the new scores according to the weakening of text and space and their weights;
8.) Take the highest-scoring node and add it to $H_k$; l++; go to 5.);
9.) Return the queue $H_k$.
The returned $H_k$ contains the k pieces of information to be retrieved.
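The nine steps above can be sketched as a greedy loop. Several points are assumptions rather than statements of the patent: the association weakening a and the spatial value d are supplied as callables (`assoc_weakening`, `geo_value`), since the text does not define how a is computed; each round multiplies the node's current score, which is one reading of formula (4); and all names and sample numbers are hypothetical.

```python
def select_top_k(scores, k, assoc_weakening, geo_value, alpha=0.5, beta=0.5):
    """Greedy selection of k diverse nodes.

    scores: dict mapping node -> initial score DF (formula 1)
    assoc_weakening(selected, node) -> a   (hypothetical callable)
    geo_value(selected, node) -> d         (formula 3, hypothetical callable)
    """
    remaining = dict(scores)  # work on a copy so the input is left untouched
    result = []               # plays the role of the queue H_k
    while remaining and len(result) < k:
        best = max(remaining, key=remaining.get)  # highest current score
        result.append(best)
        del remaining[best]
        if len(result) == k:
            break
        # Weaken every remaining node relative to the node just selected.
        for node in remaining:
            a = assoc_weakening(best, node)
            d = geo_value(best, node)
            remaining[node] *= a * alpha + d * beta  # formula (4)
    return result


# Toy data: B starts higher than C but is similar and close to whatever is picked.
toy_scores = {"A": 9.0, "B": 8.0, "C": 5.0}
toy_assoc = lambda selected, node: 0.1 if node == "B" else 0.9
toy_geo = lambda selected, node: 0.1 if node == "B" else 0.9
picked = select_top_k(toy_scores, 2, toy_assoc, toy_geo)
```

In the toy run, B has the second-highest initial score but is heavily weakened after A is selected, so C is returned instead. This is the behaviour the method aims for: a plain top-k by initial score would have returned A and B.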
Experimental results show that the method achieves a significant effect.
Description of the drawings
Fig. 1 is a flow chart of the method of the invention.
Fig. 2 is a schematic diagram of the spatial positions of the retrieval results.
Detailed description of the embodiments
The method of the invention is explained below with reference to Figs. 1 and 2:
Step one: Compute an initial ranking for a given location point or a given combination of a location point and keywords;
The initial value of each node in the data set is calculated according to formula (1).
Assume the given location point is "Tian'anmen Square", the keyword is "university", and k = 5. The initial scores are calculated according to the formula, with the results shown in Table 1:
Table 1. Initial scores of the 13 nodes
Node Score
Central Drama Institute 9.5
Central Conservatory of Music 9
Beijing commerce Professional School 8.7
Beijing Normal University north school district 8.1
The Chinese College of Buddhism 7.5
China Concord Medical Science University's nursing college 7.3
China Islamism Scripture Institute 6
Xuan Wu branch of Beijing Institute of Education 5.8
Beijing Jiaotong University 5.3
Beijing University of Technology 5
The Central University Of Finance and Economics 4.6
Chinese department of traditional Chinese medicine institute 3
China University of Political Science & Law 2
Step 2: Weaken the other nodes in geographic space according to the geographic position of the selected highest-scoring node;
Step 2.1: Weaken the association relations of the other nodes according to the highest-scoring node selected in Step one;
The highest-scoring node, "Central Drama Institute", is chosen, and the other nodes are weakened according to their association relations with "Central Drama Institute". The results are shown in Table 2.
Step 2.2: Calculate the spatial value of each node;
The spatial value of each node can be calculated from the distance from "Tian'anmen Square" to each node (Table 3) and the distance from "Central Drama Institute" to the remaining nodes (Table 4).
Table 2. Weakening results based on the association relations between "Central Drama Institute" and the other nodes
Node Association relation weakening
Central Conservatory of Music 0.255
Beijing commerce Professional School 0.538
Beijing Normal University north school district 0.435
The Chinese College of Buddhism 0.856
China Concord Medical Science University's nursing college 0.801
China Islamism Scripture Institute 0.756
Xuan Wu branch of Beijing Institute of Education 0.522
Beijing Jiaotong University 0.373
Beijing University of Technology 0.689
The Central University Of Finance and Economics 0.617
Chinese department of traditional Chinese medicine institute 0.493
China University of Political Science & Law 0.345
Table 3. Distance from "Tian'anmen Square" to each node
Node Distance (km)
Central Drama Institute 3.69
Central Conservatory of Music 3.27
Beijing commerce Professional School 3.08
Beijing Normal University north school district 3.78
The Chinese College of Buddhism 3.22
China Concord Medical Science University's nursing college 2.08
China Islamism Scripture Institute 3.30
Xuan Wu branch of Beijing Institute of Education 3.23
Beijing Jiaotong University 7.05
Beijing University of Technology 7.87
The Central University Of Finance and Economics 7.84
Chinese department of traditional Chinese medicine institute 4.65
China University of Political Science & Law 7.78
Table 4. Distance from "Central Drama Institute" to the remaining nodes
Node Distance (km)
Central Conservatory of Music 5.40
Beijing commerce Professional School 2.24
Beijing Normal University north school district 1.18
The Chinese College of Buddhism 5.72
China Concord Medical Science University's nursing college 3.09
China Islamism Scripture Institute 6.58
Xuan Wu branch of Beijing Institute of Education 6.90
Beijing Jiaotong University 5.53
Beijing University of Technology 9.66
The Central University Of Finance and Economics 1.97
Chinese department of traditional Chinese medicine institute 5.80
China University of Political Science & Law 5.39
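As a check of formula (3) against Tables 3 and 4, the spatial value of "Beijing commerce Professional School" can be worked out by hand, taking p as "Tian'anmen Square" and $v_0$ as "Central Drama Institute". The form of formula (3) used here is the one stated in claim 1; the result agrees with the value 0.331 that appears in the Step 3 calculation.

```latex
d_i = \frac{d(v_0, v_i)}{d(p, v_0) + d(p, v_i)}
    = \frac{2.24}{3.69 + 3.08}
    = \frac{2.24}{6.77}
    \approx 0.331
```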
Step 3: While the termination condition is not met, select a new node
Assume the weights of text and space are α = β = 0.5. The new scores are then obtained from formula (4), using the values from formulas (1), (2) and (3); for example, DF'(Central Conservatory of Music) = 9 × (0.5 × 0.255 + 0.5 × 0.729) = 4.428 and DF'(Beijing commerce Professional School) = 8.7 × (0.5 × 0.538 + 0.5 × 0.331) = 3.780. The results are shown in Table 5:
Table 5. New scores after selecting the "Central Drama Institute" node
Node Score
Central Conservatory of Music 4.428
Beijing commerce Professional School 3.780
Beijing Normal University north school district 2.402
The Chinese College of Buddhism 6.315
China Concord Medical Science University's nursing college 5.034
China Islamism Scripture Institute 5.091
Xuan Wu branch of Beijing Institute of Education 4.405
Beijing Jiaotong University 2.353
Beijing University of Technology 3.813
The Central University Of Finance and Economics 1.812
Chinese department of traditional Chinese medicine institute 1.782
China University of Political Science & Law 0.185
According to Table 5, the highest-scoring node is "The Chinese College of Buddhism". Two nodes have now been obtained, "Central Drama Institute" and "The Chinese College of Buddhism". Because 2 < k = 5, the algorithm continues to select the remaining 3 nodes.
The new scores of the remaining nodes after selecting "The Chinese College of Buddhism" are shown in Table 6:
Table 6. New scores after selecting the "The Chinese College of Buddhism" node
Node Score
Central Conservatory of Music 1.242
Beijing commerce Professional School 2.767
Beijing Normal University north school district 1.546
China Concord Medical Science University's nursing college 4.367
China Islamism Scripture Institute 1.392
Xuan Wu branch of Beijing Institute of Education 1.821
Beijing Jiaotong University 1.320
Beijing University of Technology 2.926
The Central University Of Finance and Economics 1.242
Chinese department of traditional Chinese medicine institute 1.295
China University of Political Science & Law 0.477
According to Table 6, the highest-scoring node is "China Concord Medical Science University's nursing college". The new scores of the remaining nodes are shown in Table 7:
Table 7. New scores after selecting the "China Concord Medical Science University's nursing college" node
Node Score
Central Conservatory of Music 0.738
Beijing commerce Professional School 0.876
Beijing Normal University north school district 0.843
China Islamism Scripture Institute 1.027
Xuan Wu branch of Beijing Institute of Education 1.216
Beijing Jiaotong University 0.725
Beijing University of Technology 1.719
The Central University Of Finance and Economics 0.806
Chinese department of traditional Chinese medicine institute 0.520
China University of Political Science & Law 0.256
According to Table 7, the highest-scoring node is "Beijing University of Technology". The new scores of the remaining nodes are shown in Table 8:
Table 8. New scores after selecting the "Beijing University of Technology" node
Node Score
Central Conservatory of Music 0.435
Beijing commerce Professional School 0.493
Beijing Normal University north school district 0.523
China Islamism Scripture Institute 0.613
Xuan Wu branch of Beijing Institute of Education 0.580
Beijing Jiaotong University 0.394
The Central University Of Finance and Economics 0.645
Chinese department of traditional Chinese medicine institute 0.261
China University of Political Science & Law 0.136
According to Table 8, the highest-scoring node is "The Central University Of Finance and Economics". Now l = 5 = k, and the 5 pieces of information obtained are "Central Drama Institute", "The Chinese College of Buddhism", "China Concord Medical Science University's nursing college", "Beijing University of Technology" and "The Central University Of Finance and Economics". Their spatial positions are shown in Fig. 2, the schematic diagram of the spatial positions of the retrieval results. Fig. 2 shows that the 5 results retrieved for "Tian'anmen Square" cover all directions around it and are not confined to a single direction.

Claims (2)

1. A diversity-based geospatial point-of-interest retrieval method, characterized in that:
to obtain the top-k spatial positions, the method is implemented in the following steps:
Step one: Compute an initial ranking for a given location point or a given combination of a location point and keywords;
Step 1.1: Collect and organize the data set and build the data relations; a directed graph G(V, E) is defined, where $V = \{v_1, ..., v_n\}$ is the set of nodes, each node representing a piece of information, and E is the set of edges, $E = \{\langle v_i, v_j \rangle \mid v_i, v_j \in V\}$; $\langle v_i, v_j \rangle$ denotes an edge from $v_i$ to $v_j$; $v_1, ..., v_n$ denote arbitrary nodes in the directed graph, and n is a natural number;
Step 1.2: Calculate the score of each node $v_i$ in R by the following formula:
$$DF(v_i) = [fs(v_i) \cdot ds(v_i)]^{a_s} \cdot [ft(v_i) \cdot dt(v_i)]^{a_t} \cdot [fg(v_i) \cdot dg(v_i)]^{a_g} \tag{1}$$
where fs(·), ft(·) and fg(·) are the scores of the social, textual and geographical parameters respectively, ds(·), dt(·) and dg(·) are the corresponding diversity scores, and $a_s$, $a_t$ and $a_g$ sum to 1 and control the influence of each parameter;
the diversity score is calculated by the following formula:
$$ds(v_i) = \frac{\sum_{v_j \in R,\, v_i \neq v_j} ss(v_i, v_j)}{k - 1} \tag{2}$$
where $ss(v_i, v_j)$ is the difference between the social parameters of $v_i$ and $v_j$, calculated using the Jaccard distance; the values of dt(·) and dg(·) are calculated in the same way;
in summary, the score of each node in the data set is calculated iteratively, and the node with the highest score, $v_0$, is selected;
Step 2: Weaken the other nodes in geographic space according to the geographic position of the selected highest-scoring node;
Step 2.1: The highest-scoring node selected in Step one is used to weaken the association relations of the other nodes and, at the same time, to weaken them in geographic space; let the distance from the initial position p to the location of the highest-scoring node $v_0$ be $d(p, v_0)$, the distance from the initial position to another node be $d(p, v_i)$, and the distance from $v_0$ to that node be $d(v_0, v_i)$; the geographic space value is then calculated by the following formula:
$$d_i = \frac{d(v_0, v_i)}{d(p, v_0) + d(p, v_i)} \tag{3}$$
formula (3) shows that the larger the distance $d(v_0, v_i)$ from $v_0$ to another node, the larger the resulting geographic space value, which indicates that node $v_i$ is farther from the selected node and that the two nodes lie in different spatial directions;
in summary, the geographic space value $d_i$ of each remaining node relative to the selected node is calculated in turn;
Step 3: While the termination condition is not met, select a new node;
Step 3.1: Let the result of weakening the association relation be a and the weight of text be α; the text value of a remaining node after weakening is then a × α;
Step 3.2: Let the weight of space be β, where α + β = 1; the spatial value of a remaining node after weakening is then d × β;
Step 3.3: The score of a remaining node after the weakening of text and space is calculated by the following formula:
$$DF'(v_i) = DF(v_i) \times (a \times \alpha + d \times \beta) \tag{4}$$
in summary, the new scores of the remaining nodes in R after the weakening of text and space are calculated, and the highest-scoring node is then selected from them.
2. The diversity-based geospatial point-of-interest retrieval method according to claim 1, characterized in that the process of selecting k results is:
1.) Initialize the queue $H_k$ as empty; input a location point or a combination of a location point and keywords;
2.) Build the data relations according to the input information;
3.) Calculate the score of each node;
4.) Take the highest-scoring node and add it to $H_k$; set l = 1;
5.) If l < k, go to 6.); otherwise go to 9.);
6.) Weaken the association relations according to the selected node and calculate the $d_i$ values;
7.) Calculate the new scores according to the weakening of text and space and their weights;
8.) Take the highest-scoring node and add it to $H_k$; l++; go to 5.);
9.) Return the queue $H_k$;
the returned $H_k$ contains the k pieces of information to be retrieved.
CN201611254804.XA 2016-12-30 2016-12-30 Geographic space interest point retrieval method based on diversity Active CN106649846B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611254804.XA CN106649846B (en) 2016-12-30 2016-12-30 Geographic space interest point retrieval method based on diversity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611254804.XA CN106649846B (en) 2016-12-30 2016-12-30 Geographic space interest point retrieval method based on diversity

Publications (2)

Publication Number Publication Date
CN106649846A true CN106649846A (en) 2017-05-10
CN106649846B CN106649846B (en) 2019-12-20

Family

ID=58837252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611254804.XA Active CN106649846B (en) 2016-12-30 2016-12-30 Geographic space interest point retrieval method based on diversity

Country Status (1)

Country Link
CN (1) CN106649846B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101784005A (en) * 2009-12-17 2010-07-21 华为终端有限公司 Method for retrieving point of interest and terminal thereof
CN102594905A (en) * 2012-03-07 2012-07-18 南京邮电大学 Method for recommending social network position interest points based on scene
US20130166196A1 (en) * 2011-12-21 2013-06-27 Telenav, Inc. Navigation system with point of interest validation mechanism and method of operation thereof
CN103984683A (en) * 2013-02-07 2014-08-13 百度在线网络技术(北京)有限公司 LBS (location based service)-based retrieval method and equipment
CN105912646A (en) * 2016-04-09 2016-08-31 北京工业大学 Keyword retrieval method based on diversity and proportion characteristics

Also Published As

Publication number Publication date
CN106649846B (en) 2019-12-20

Similar Documents

Publication Publication Date Title
Jiang Ranking spaces for predicting human movement in an urban environment
CN103268348B (en) A kind of user's query intention recognition methods
CN101458708B (en) Searching result clustering method and device
CN103678412B (en) A kind of method and device of file retrieval
CN102419778A (en) Information searching method for discovering and clustering sub-topics of query statement
CN102411621A (en) Chinese inquiry oriented multi-document automatic abstraction method based on cloud mode
EP2557511B1 (en) Information processing device, information processing method, information processing programme, and recording medium
CN105447080B (en) A kind of inquiry complementing method in community's question and answer search
WO2006017081A3 (en) Method and system for collecting and posting local advertising to a site accessible via a computer network
CN103294778A (en) Method and system for pushing messages
CN106202294A (en) The related news computational methods merged based on key word and topic model and device
CN102682046A (en) Member searching and analyzing method in social network and searching system
Caramazza et al. X-ray flares in Orion low-mass stars
CN101923556B (en) Method and device for searching webpages according to sentence serial numbers
CN101950291A (en) Search engine method for database
CN103186509A (en) Wildcard character class template generalization method and device and general template generalization method and system
CN107908627A (en) A kind of multilingual map POI search systems
CN102567392A (en) Control method for interest subject excavation based on time window
CN104536957B (en) Agricultural land circulation information retrieval method and system
CN108090220A (en) Point of interest search sort method and system
CN105205099A (en) Agricultural product price analysis method
CN102122296B (en) Search result clustering method and device
CN101237465A (en) A webpage context extraction method based on quick Fourier conversion
CN106649846A (en) Geographic space interest point retrieval method based on diversity
Mohamad et al. A bibliometric analysis on scientific production of Geographical Information System (GIS) in Web of Science

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant