CN103077201B - A method for estimating unknown locations based on active iterative Internet probing - Google Patents
- Publication number
- CN103077201B CN201210579579.2A
- Authority
- CN
- China
- Prior art keywords
- location
- search
- internet
- location expression
- expression
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a method for estimating unknown locations based on active iterative Internet probing. The method comprises the following steps: 1) check the location entered by the user and, if the database query fails, use a web search engine to obtain a set of web pages related to the location; 2) extract the location descriptions from the web pages and classify them; 3) calculate the credibility rate $C_s$ of the search results and, if $C_s$ meets the threshold $C_{\min}$, skip to step 5; 4) repeat steps 1 to 3 for the fuzzy locations in the search results until the credibility rate meets the threshold or the search-count limit is reached; 5) calculate the geographic range of each location description and merge the ranges to obtain the approximate geographic range of the target location. The invention makes full use of the abundant, dynamically changing geographic knowledge resources on the Internet to estimate the approximate range of an unknown location. To handle the many forms that textual location descriptions take on the Internet, it adopts a semantics-based multi-scale location extraction method and uses the Point-Radius algorithm to estimate the approximate geographic range of the location.
Description
Technical field
The present invention relates to a method for estimating unknown locations, and in particular to a method for estimating unknown locations based on active iterative Internet probing.
Background technology
With the development and refinement of positioning technologies such as GPS, the applications of location-based services (LBS) keep expanding, for example electronic map service platforms (Baidu Maps, Google Maps, Bing Maps, etc.), tourism information systems, point-of-interest query systems for daily life, traffic query systems, and social networks. These platforms and systems offer two main ways to query location information. The first obtains fairly precise coordinates through GPS positioning, map operations, and the like. The second uses natural-language location descriptions; such qualitative or semi-quantitative descriptions carry several kinds of uncertainty, but they better match human communication habits and cognition. To support natural-language location queries, a location database must store the mapping between location names and geographic ranges. Because existing location databases are costly and slow to build, limited in size, and hard to update, they cannot store every location name and instead concentrate on collecting important locations such as major place names, addresses, and salient POIs. As a result, the many everyday locations that are less salient and relatively less important cannot be queried, which conflicts with the demand for comprehensive, multi-level, multi-granularity location services. (References: Gu Jing, Research and development of location-based information service application systems [D]. Xian Electronics Science and Technology University, 2004; Xia Baoguo, Design and implementation of a GIS-based tourism information application system for Wuhan [D]. Central China University of Science and Technology, 2006; Gao Weisi, Design of location-based services and an urban traffic navigation system [D]. Yunnan University, 2011; Yang Yuyao et al., A mobile Internet social model based on geographic location information [J]. Journal of Computer Research and Development, 2011.)
The Internet, as a large-scale knowledge base, provides abundant geographic knowledge and can serve as an extended data source for location services. Obtaining reference location information through web search requires natural language understanding to extract location descriptions from large amounts of text. Natural language understanding covers the theories and methods that enable effective communication between people and computers in natural language; for location descriptions it mainly concerns recognizing location names and location relations. For location names, existing research focuses on extracting geographic named entities or place names, using two main approaches. Rule-based methods build a corpus and formation rules for geographic named entities or place names and recognize them by rule matching; their strict requirements on formation rules improve precision but reduce recall considerably, and they struggle with fuzzy locations and newly coined locations. Statistical methods ignore syntactic and semantic information, so they inevitably introduce noise and have problems with some low-frequency terms and adjacent high-frequency words. For location relations, existing research focuses on extracting basic spatial relations (topological, metric, directional, etc.), again using two main approaches. Methods based on sentence analysis require a thorough understanding of syntactic structure and sentence semantics and suffer from brittleness and ambiguity. Pattern-based methods avoid exhaustive sentence analysis, but because natural language is rich and the same information can be expressed in many ways, the number of patterns grows sharply. (References: Yue Xiaoqiu (乐小虬) et al., Natural-language spatial concept extraction based on spatial semantic roles [J]. Journal of Wuhan University (Information Science Edition), 2005; Jiang Lin et al., Acquisition and verification of geographic entity concepts and their location relations [J]. Computer Science, 2007; Li Lishuan et al., Place name recognition in Chinese text based on support vector machines [J]. Journal of Dalian University of Technology, 2007; Li Hanjing, Research on spatial concept modeling based on natural language processing [D]. Harbin Institute of Technology, 2007; Li Yusen, Research on GIS-oriented geographic named entity recognition [J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2008; Ma Long, Research on Chinese place name recognition based on the conditional random field model [D]. Dalian University of Technology, 2009; Tang Xu et al., Research on discourse-based Chinese place name recognition [J]. Journal of Chinese Information Processing, 2010; Jiang Wenming, Research on methods for extracting directional spatial relations from Chinese text [D]. Nanjing Normal University, 2010; Shen Qijun, Research on spatial relation annotation methods for Chinese text [D]. Nanjing Normal University, 2010; Zhang Xueying et al., A rule-based method for parsing Chinese address elements [J]. Journal of Geo-Information Science, 2010; Li Haiguang, Research on Chinese named entity relation extraction based on position and semantic features [D]. Hefei University of Technology, 2011; Du Ping et al., Chinese place name recognition and disambiguation: a case study of China's administrative division names at and above county level [J]. Remote Sensing Technology and Application, 2011.)
Location databases are limited in size and hard to update, so geographic queries against them (especially fuzzy location queries) can fail because a location name is not recognized or its coverage is missing, which does not meet user needs. The Internet contains abundant geographic knowledge and can supply many descriptions of a location of interest for estimating the coverage of an "unknown" location. How to search the Internet for location-related information and derive from it the approximate geographic range of an "unknown" location is the main work of the present invention.
Summary of the invention
The present invention mainly solves the technical problems of the prior art by providing a method that makes full use of the abundant, dynamically changing geographic knowledge resources on the Internet to estimate the approximate range of a target location.
The above technical problem is mainly solved by the following technical solution:
A method for estimating unknown locations based on active iterative Internet probing, characterized by comprising the following steps:
Step 1: check the location query term entered by the user; if no geographic coverage for the location can be obtained from the spatial database, actively start iterative Internet probing, that is, use a web search engine to crawl information about the target location from the Internet with the target location as the topic;
Step 2: carry out the initial probe with the location query term as the topic, using the web search engine to obtain from the Internet a set of web pages containing descriptions of the target location;
Step 3: perform geographic location parsing on the web documents obtained in step 2, that is, extract natural-language location descriptions from them; each natural-language location description comprises a reference location and a spatial relation;
Step 4: classify the natural-language location descriptions obtained in step 3; if the reference location of a description has geographic coverage in the location database, store the description in the precise description set P, otherwise store it in the fuzzy description set A;
Step 5: evaluate the credibility rate $C_s$ of the current search; if $C_s$ is less than the search credibility threshold $C_{\min}$, carry out a new round of Internet text search with the reference locations in the fuzzy description set A as topics; if $C_s$ is greater than or equal to $C_{\min}$, skip to step 7;
Step 6: repeat steps 1 to 5 until the credibility rate of each round of search results meets the threshold or the search-count limit is reached;
Step 7: calculate the approximate geographic range of every location description and its confidence;
Step 8: integrate and refine the geographic coverage of the multiple location descriptions to obtain the geographic range of the target location.
In step 3 of the above method, recognizing natural-language location descriptions mainly involves recognizing location names and spatial relations. A semantics-based multi-scale extraction method is used, comprising the following sub-steps:
Step 3.1: build a corpus of location descriptions that stores the feature vocabulary expressing location names and spatial relations together with the syntactic patterns of location descriptions; the corpus is built by a combination of manual induction and machine learning.
Step 3.2: with the support of the corpus, apply pattern matching to the web text to obtain location descriptions;
Step 3.3: eliminate place-name ambiguity based on geographic and non-geographic semantics.
In step 4 of the above method, estimating the target location from a reference location and a spatial relation presupposes that a precise geographic range for the reference location can be obtained from the location database. A single location description is expressed by formula one, where RO is the name of the reference location, SR is the spatial relation, T is the time at which the location description occurred, C is the confidence of the location description, and S is the search reference of the reference object RO. The first K location descriptions $Loc_i$ in the results are extracted and classified according to this precondition: when $Loc_i.RO$ meets the precondition, $Loc_i$ is stored in the precise description set P; otherwise it is stored in the fuzzy description set A;
$Loc = \{RO, SR, T, C, S\}$ (formula one)
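The record of formula one and the classification into P and A can be illustrated with a minimal Python sketch. The field names follow formula one; the `classify` helper, the `gazetteer` dictionary standing in for the location database, and the sample entries are illustrative assumptions rather than part of the patent.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Loc:
    """One location description, mirroring Loc = {RO, SR, T, C, S}."""
    RO: str                    # reference location name
    SR: str                    # spatial relation to the reference location
    T: Optional[str] = None    # time at which the description occurred
    C: float = 1.0             # confidence of the description
    S: Any = None              # search reference for RO (set by later searches)


def classify(descriptions, gazetteer, k):
    """Split the first k descriptions into precise set P and fuzzy set A."""
    P, A = [], []
    for loc in descriptions[:k]:
        # Precondition: the reference location has a known geographic range.
        (P if loc.RO in gazetteer else A).append(loc)
    return P, A


# Hypothetical location database: name -> (lon, lat, extent radius in metres).
gazetteer = {"Wuhan University": (114.36, 30.54, 1500.0)}
found = [
    Loc(RO="Wuhan University", SR="500 m east of"),
    Loc(RO="Old Bookshop", SR="next to"),
]
P, A = classify(found, gazetteer, k=10)
```

Here the description anchored on a known reference location lands in P, while the one whose reference location is itself unknown lands in A and becomes a candidate for a further search round.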
In step 5 of the above method, the credibility rate $C_s$ of the current search is evaluated as follows. The search credibility rate $C_s$ is defined as the evaluation index: it is the ratio of the sum of the confidences of all location descriptions in P to the total number of location descriptions, as in formula two, where m is the number of location descriptions in P, K is the total number of location descriptions, and $Loc_i.C$ is the confidence of the i-th location description.
$C_s = \frac{1}{K}\sum_{i=1}^{m} Loc_i.C$ (formula two)
The confidence of a location description is calculated by formula three, where ε is the attenuation parameter and n is the number of searches; the confidence is set to 1 for the first search (n = 0) and decays as the number of searches increases;
$Loc_i.C = 1 \cdot \varepsilon^{n}$ (formula three)
When $C_s$ meets the minimum credibility threshold $C_{\min}$, the precise description set P is output directly for target location estimation. When it does not, iterated Internet search is used to secure the credibility rate: a new round of Internet search is carried out for the fuzzy reference locations in A, so that the geographic range of each reference location is first estimated from Internet resources and the reference location is then used to estimate the target location.
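As a worked illustration of the credibility check, the following Python sketch computes the confidence decay and the ratio described in the text. The values of `EPSILON` and `C_MIN` are assumed for the example only; the patent does not give concrete values.

```python
def description_confidence(epsilon: float, n: int) -> float:
    """Confidence 1 at the first search (n = 0), decaying as epsilon**n."""
    return 1.0 * epsilon ** n


def search_credibility(p_confidences, k_total: int) -> float:
    """Sum of the confidences in P divided by the total description count K."""
    if k_total == 0:
        return 0.0  # assumption: an empty result is treated as not credible
    return sum(p_confidences) / k_total


EPSILON = 0.8   # assumed attenuation parameter
C_MIN = 0.6     # assumed minimum credibility threshold

# First search (n = 0): 2 of 5 descriptions have a known reference location.
conf = description_confidence(EPSILON, 0)
c_s = search_credibility([conf] * 2, 5)
needs_iteration = c_s < C_MIN
```

With these assumed numbers the rate is 2/5, which falls below the threshold, so a new search round on the fuzzy reference locations would be triggered.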
Step 6 of the above method is the iterative search of fuzzy reference locations. Following the processing of steps 4 and 5, each search result is expressed by formula four, where n is the number of the search, m is the number of the location within the current search, P is the precise description set, A is the fuzzy description set, and $C_s$ is the search credibility rate.
$WS[n][m] = \{P, A, C_s\}$ (formula four)
The iterative search comprises the following sub-steps:
Step 6.1: store the fuzzy location descriptions WS[0][0].A of the target-location search result in the search set Q, and set n=0, m=0;
Step 6.2: take a fuzzy description set WS[n][m].A from Q and check whether n+1 reaches the search-count limit; if so, exit the search;
Step 6.3: take each location description $Loc_i$ in WS[n][m].A in turn and carry out the (n+1)-th search to obtain the search result WS[n+1][i], and attach the result to the location description as the search reference of its reference object RO, i.e. $Loc_i.S$ = WS[n+1][i];
Step 6.4: remove the searched fuzzy description set WS[n][m].A from Q and check whether WS[n+1][i].$C_s$ meets the threshold $C_{\min}$; if it does not, put WS[n+1][i].A into the search set Q;
Step 6.5: check whether Q still contains a fuzzy description set; if so, repeat steps 6.2 to 6.4 to continue the iterative search.
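The sub-steps above amount to a queue-driven expansion of fuzzy reference locations, which might be sketched in Python as follows. The `search` callable, the dictionary layout of a search result, and `fake_search` are hypothetical stand-ins for the web search and extraction stages. The sketch also numbers the results of a round consecutively, which is an assumption made so that results from different fuzzy sets in the same round do not share an index.

```python
from collections import deque


def iterative_search(target, search, c_min, max_rounds):
    """Expand fuzzy reference locations round by round (steps 6.1 to 6.5).

    `search(name, n)` is a hypothetical callable that returns a dict
    {"P": [...], "A": [...], "Cs": float}; entries of P and A are dicts
    with at least an "RO" key.
    """
    ws = {0: [search(target, 0)]}      # ws[n][m] plays the role of WS[n][m]
    queue = deque([(0, 0)])            # step 6.1: start from WS[0][0].A
    while queue:                       # step 6.5: continue while Q is non-empty
        n, m = queue.popleft()         # steps 6.2 and 6.4: take and remove a set
        if n + 1 >= max_rounds:        # step 6.2: search-count limit reached
            break
        for loc in ws[n][m]["A"]:      # step 6.3: search each fuzzy reference
            result = search(loc["RO"], n + 1)
            round_results = ws.setdefault(n + 1, [])
            round_results.append(result)
            i = len(round_results) - 1
            loc["S"] = (n + 1, i)      # link RO to its own search result
            if result["Cs"] < c_min:   # step 6.4: still not credible enough
                queue.append((n + 1, i))
    return ws


def fake_search(name, n):
    # Toy stand-in for a web search followed by description extraction.
    table = {
        "Old Bookshop": {"P": [], "A": [{"RO": "Clock Tower"}], "Cs": 0.0},
        "Clock Tower": {"P": [{"RO": "City Hall"}], "A": [], "Cs": 0.8},
    }
    entry = table[name]
    return {"P": list(entry["P"]),
            "A": [dict(a) for a in entry["A"]],
            "Cs": entry["Cs"]}


ws = iterative_search("Old Bookshop", fake_search, c_min=0.6, max_rounds=3)
```

In this toy run the first search yields only a fuzzy reference ("Clock Tower"); the second search resolves it against a known location with a rate above the threshold, so the queue empties and the iteration stops after two rounds.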
In step 7 of the above method, because the fuzzy location descriptions in the k-th search results depend on the (k+1)-th search results, the calculation runs in reverse order, that is, the geographic ranges are calculated starting from the last search. It comprises the following sub-steps:
Step 7.1: let n be the number of searches in the search results WS and m the number of locations searched in the n-th search, m = WS[n-1].size; define a geographic range set FC to store the geographic range of each search result;
Step 7.2: take the search result WS[n-1][m-1] of the m-th location of the n-th search;
Step 7.3: take each location $Loc_y$ in WS[n-1][m-1].P in turn, query the coordinates of its reference location in the location database, and use the Point-Radius algorithm to calculate its geographic coverage FP(y) and confidence CP(y);
Step 7.4: take each location $Loc_x$ in WS[n-1][m-1].A in turn and use $Loc_x.S$ to look up the coordinates of its reference location in the geographic range set FC; if the coordinates are obtained, use the Point-Radius algorithm to calculate its geographic coverage FA(x) and confidence CA(x);
Step 7.5: merge the geographic ranges of all locations in P and A to obtain the geographic range FC(WS[n-1][m-1]) of the current search result;
Step 7.6: check whether m-1 is greater than 0; if so, move to the next search result by setting m = m-1 and return to step 7.2; otherwise proceed to the next step;
Step 7.7: check whether n-1 is greater than 0; if so, move to the previous search by setting n = n-1 and m = WS[n-1].size, and return to step 7.2; otherwise proceed to the next step;
Step 7.8: output FC(WS[0][0]).
Therefore, the present invention has the following advantages: it makes full use of the abundant, dynamically changing geographic knowledge resources on the Internet to estimate the approximate range of a target location. Because location information on the Internet is intricately associated with non-location information and takes many forms of expression, the invention targets natural-language text on the Internet, extracts location descriptions from web page text with a semantics-based multi-scale extraction method, and uses the Point-Radius algorithm to calculate the approximate geographic range of the target location.
Brief description of the drawings
Fig. 1 is a flow chart of the active Internet search method.
Fig. 2 is a flow chart of the location calculation based on Internet search results.
Embodiment
The technical solution of the present invention is described in further detail below through an embodiment and with reference to the accompanying drawings.
Embodiment:
1. Theoretical basis.
1.1 Geographic information retrieval (Geographic Information Retrieval, GIR).
Geographic information retrieval returns the documents relevant to a geographic query according to the constraint of a geographic query range. The basic idea is to use a web crawler to collect a set of web pages from the Internet, identify the place names in them through named entity recognition, classification, and syntactic analysis, and thereby determine the geographic range of the query term and of each document; finally, the relevance between document and query (both textual and spatial) is calculated, and the retrieval results are ranked and returned. Most current geographic information retrieval relies on keyword matching, which requires the place names in both the query and the web documents to have a definite coverage so that they can be matched. This approach cannot cope with fuzzy place names (for example, the middle and lower reaches of the Yangtze River) and therefore cannot be used directly for web-search-based estimation of unknown locations. Drawing on the idea of geographic information retrieval, the present invention proposes a multi-scale iterative search algorithm (see Fig. 1) that obtains web documents related to the unknown location from the Internet, extracts the location descriptions that contain the unknown location, and then uses the reference locations and spatial relations in those descriptions to calculate its approximate geographic range. In the main flow, a meta search engine obtains a set of web pages from the Internet, and the location descriptions that contain the query term are extracted from the pages on a semantic basis. If the location descriptions do not reach the credibility rate required to estimate the location of the query term, a new round of Internet retrieval is carried out for the fuzzy locations that were identified. This process is iterative: as long as the credibility condition is not met and the search limit has not been reached, web searches continue in order to obtain reference information from which the geographic range of the fuzzy locations can be estimated.
1.2 Georeferencing of location descriptions (Georeferencing Locality Descriptions, GLD).
Georeferencing a location description converts a textual description of a location into a numerical description in some coordinate system. Ideally, the process turns the text into a numerical description that can be mapped, and expresses both the spatial extent of the location and the uncertainty of its distribution. The currently popular algorithms are the Point-Radius algorithm and the Probability algorithm. The Point-Radius method describes a location and its uncertainty with a point and a maximum error. The uncertainty sources it mainly considers are the reference location (spatial extent of the reference location, geodetic datum, coordinate precision, map scale) and the spatial relation (uncertainty of the distance relation and of the direction relation). All uncertainty measures are projected onto one dimension as the maximum error of the target location, which is expressed as the circular region centred on the point with the maximum error as radius. The Probability method expresses the target location and its uncertainty with a probability density surface; the sources it mainly considers are the spatial distribution of the target object, the imprecision and vagueness of the spatial relation, the incompleteness of the reference object, and the uncertainty of the location description itself. The Point-Radius method is a quantitative calculation that yields a geographic coverage containing every point where the target may lie, and suits semi-quantitative textual descriptions; the Probability method cannot quantitatively calculate the geographic coverage of the target but provides its probability distribution, and suits qualitative textual descriptions.
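The point-plus-maximum-error idea can be illustrated with a simplified Python sketch. It works in local planar coordinates, models only three of the listed uncertainty sources (reference extent, distance error, direction error), and simply adds them; these simplifications, the `Circle` type, and the `point_radius` function are assumptions of this sketch and not the full error model of the Point-Radius algorithm.

```python
import math
from dataclasses import dataclass


@dataclass
class Circle:
    x: float       # easting of the centre, metres (local planar coordinates)
    y: float       # northing of the centre, metres
    radius: float  # maximum error, metres


def point_radius(ref: Circle, distance: float, bearing_deg: float,
                 distance_error: float, bearing_error_deg: float) -> Circle:
    """Estimate a target "distance at bearing from ref" as a point plus radius."""
    theta = math.radians(bearing_deg)   # bearing measured clockwise from north
    cx = ref.x + distance * math.sin(theta)
    cy = ref.y + distance * math.cos(theta)
    # Chord length swept by the bearing error at the stated distance.
    direction_error = 2.0 * distance * math.sin(
        math.radians(bearing_error_deg) / 2.0)
    # Conservative assumption: the error sources simply add up.
    radius = ref.radius + distance_error + direction_error
    return Circle(cx, cy, radius)


# "About 500 m east of" a reference place whose own extent is 100 m.
target = point_radius(Circle(0.0, 0.0, 100.0), 500.0, 90.0, 50.0, 22.5)
```

The resulting circle is centred 500 m east of the reference point, and its radius grows with every uncertainty source, which is why descriptions anchored on small, well-located references give tighter estimates.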
2. Implementation process.
(1) Check the target location query term entered by the user. Look the query term up in the location database; if the location does not exist or its geographic coverage is missing, actively switch to the web-search-based query mode, that is, use a web search engine to crawl information about the target location from the Internet with the target location as the topic;
(2) Identify and extract the natural-language location descriptions (each comprising a reference location and a spatial relation) in the web documents. Recognizing natural-language location descriptions mainly involves recognizing location names and spatial relations, and the invention uses a semantics-based multi-scale extraction method to extract them. First, a corpus of location descriptions is built by a combination of manual induction and machine learning; it stores the feature vocabulary expressing location names and spatial relations together with the syntactic patterns of location descriptions. Then, with the support of the corpus, pattern matching is applied to the web text to obtain location descriptions. Finally, place-name ambiguity is eliminated based on geographic and non-geographic semantics;
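To make the pattern-matching stage of item (2) concrete, here is a minimal Python sketch that uses a single hand-written syntactic pattern over English text. The direction list, the regular expression, and the sample sentence are illustrative assumptions; the corpus described in the patent targets Chinese text and is learned and curated rather than hard-coded.

```python
import re

# Hypothetical mini "corpus": feature words for direction relations and one
# syntactic pattern of the form "<target> is [located] [<n> m|km] <dir> of <ref>".
DIRECTIONS = ["north", "south", "east", "west"]
PATTERN = re.compile(
    r"(?P<target>[A-Z][\w ]*?) is (?:located )?"
    r"(?:(?P<distance>\d+(?:\.\d+)?) ?(?P<unit>km|m) )?"
    r"(?P<direction>" + "|".join(DIRECTIONS) + r") of "
    r"(?P<reference>[A-Z]\w*(?: [A-Z]\w*)*)"
)


def extract_descriptions(text, target):
    """Return (reference, relation) pairs for sentences about the target."""
    found = []
    for m in PATTERN.finditer(text):
        if m.group("target").strip() != target:
            continue
        relation = m.group("direction")
        if m.group("distance"):
            relation = f'{m.group("distance")} {m.group("unit")} {relation}'
        found.append((m.group("reference"), relation))
    return found


page = ("Old Bookshop is located 500 m east of Wuhan University. "
        "Old Bookshop is north of Clock Tower.")
results = extract_descriptions(page, "Old Bookshop")
```

Each extracted pair supplies the reference location and spatial relation of one location description; disambiguating the place names would be a separate stage that this sketch does not attempt.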
(3) Classify the location descriptions. Estimating the target location from a reference location and a spatial relation presupposes that a precise geographic range for the reference location can be obtained from the location database. A single location description is expressed by formula (1), where RO is the name of the reference location, SR is the spatial relation, T is the time at which the location description occurred, C is the confidence of the location description, and S is the search reference of the reference object RO. The first K location descriptions $Loc_i$ in the results are extracted and classified according to this precondition: when $Loc_i.RO$ meets the precondition, $Loc_i$ is stored in the precise description set P; otherwise it is stored in the fuzzy description set A;
$Loc = \{RO, SR, T, C, S\}$ (1)
(4) Calculate the search credibility rate $C_s$. The confidence of the location descriptions in the search results must reach a certain level before they can be used to estimate the target location. The invention proposes the search credibility rate $C_s$ as the evaluation index: it is the ratio of the sum of the confidences of all location descriptions in P to the total number of location descriptions, as in formula (2), where m is the number of location descriptions in P, K is the total number of location descriptions, and $Loc_i.C$ is the confidence of the i-th location description.
$C_s = \frac{1}{K}\sum_{i=1}^{m} Loc_i.C$ (2)
The confidence of a location description is calculated by formula (3), where ε is the attenuation parameter and n is the number of searches; the confidence is set to 1 for the first search (n = 0) and decays as the number of searches increases.
$Loc_i.C = 1 \cdot \varepsilon^{n}$ (3)
When $C_s$ meets the minimum credibility threshold $C_{\min}$, the precise description set P is output directly for target location estimation. When it does not, the invention uses iterated Internet search to secure the credibility rate: a new round of Internet search is carried out for the fuzzy reference locations in A, so that the geographic range of each reference location is first estimated from Internet resources and the reference location is then used to estimate the target location;
(5) Iteratively search the fuzzy reference locations. Following the processing of items (3) and (4), each search result is expressed by formula (4), where n is the number of the search, m is the number of the location within the current search, P is the precise description set, A is the fuzzy description set, and $C_s$ is the search credibility rate.
$WS[n][m] = \{P, A, C_s\}$ (4)
The iterative search proceeds as follows:
a) Store the fuzzy location descriptions WS[0][0].A of the target-location search result in the search set Q, and set n=0, m=0;
b) Take a fuzzy description set WS[n][m].A from Q and check whether n+1 reaches the search-count limit; if so, exit the search;
c) Take each location description $Loc_i$ in WS[n][m].A in turn and carry out the (n+1)-th search to obtain the search result WS[n+1][i], and attach the result to the location description as the search reference of its reference object RO, i.e. $Loc_i.S$ = WS[n+1][i];
d) Remove the searched fuzzy description set WS[n][m].A from Q and check whether WS[n+1][i].$C_s$ meets the threshold $C_{\min}$; if it does not, put WS[n+1][i].A into the search set Q;
e) Check whether Q still contains a fuzzy description set; if so, repeat steps b) to d) to continue the iterative search;
(6) Calculate the approximate geographic range of every location description and its confidence. Because the fuzzy location descriptions in the k-th search results depend on the (k+1)-th search results, the invention calculates in reverse order, that is, the geographic ranges are calculated starting from the last search. As shown in Fig. 2, the calculation proceeds as follows:
a) Let n be the number of searches in the search results WS and m the number of locations searched in the n-th search, m = WS[n-1].size; define a geographic range set FC to store the geographic range of each search result;
b) Take the search result WS[n-1][m-1] of the m-th location of the n-th search;
c) Take each location $Loc_y$ in WS[n-1][m-1].P in turn, query the coordinates of its reference location in the location database, and use the Point-Radius algorithm to calculate its geographic coverage FP(y) and confidence CP(y);
d) Take each location $Loc_x$ in WS[n-1][m-1].A in turn and use $Loc_x.S$ to look up the coordinates of its reference location in the geographic range set FC; if the coordinates are obtained, use the Point-Radius algorithm to calculate its geographic coverage FA(x) and confidence CA(x);
e) Merge the geographic ranges of all locations in P and A to obtain the geographic range FC(WS[n-1][m-1]) of the current search result;
f) Check whether m-1 is greater than 0; if so, move to the next search result by setting m = m-1 and return to step b); otherwise proceed to the next step;
g) Check whether n-1 is greater than 0; if so, move to the previous search by setting n = n-1 and m = WS[n-1].size, and return to step b); otherwise proceed to the next step;
h) Output FC(WS[0][0]);
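The reverse-order calculation of item (6) can be sketched in Python as below. The `estimate` and `merge` callables, the tuple representation of a range, and the sample data are hypothetical placeholders: the patent names the Point-Radius algorithm for the per-description estimate but does not spell out the merging rule, so the merge used here (keep the tightest circle) is purely an assumption.

```python
def backward_ranges(ws, gazetteer, estimate, merge):
    """Compute FC for every search result, last round first (steps a to h).

    ws[n][m] is a dict {"P": [...], "A": [...]}; each description is a dict
    with "RO", and fuzzy ones carry "S" = (round, index) of their own search.
    """
    fc = {}
    for n in sorted(ws, reverse=True):                # last search first
        for m in range(len(ws[n]) - 1, -1, -1):
            result, ranges = ws[n][m], []
            for loc in result["P"]:                   # step c: database lookup
                ranges.append(estimate(gazetteer[loc["RO"]], loc))
            for loc in result["A"]:                   # step d: lookup in FC
                ref = fc.get(loc.get("S"))
                if ref is not None:                   # skip unresolved references
                    ranges.append(estimate(ref, loc))
            fc[(n, m)] = merge(ranges) if ranges else None   # step e
    return fc.get((0, 0))                             # step h: FC(WS[0][0])


def estimate(ref, loc):
    # Placeholder for the Point-Radius step: widen the reference circle by 100 m.
    return (ref[0], ref[1], ref[2] + 100.0)


def merge(ranges):
    # Placeholder merge: keep the circle with the smallest radius.
    return min(ranges, key=lambda r: r[2])


gazetteer = {"City Hall": (0.0, 0.0, 200.0)}          # (x, y, radius in metres)
ws = {
    0: [{"P": [], "A": [{"RO": "Clock Tower", "S": (1, 0)}]}],
    1: [{"P": [{"RO": "City Hall"}], "A": []}],
}
target_range = backward_ranges(ws, gazetteer, estimate, merge)
```

Processing round 1 before round 0 ensures that the range of "Clock Tower" is already in FC when the target's fuzzy description needs it, which is the reason for the reverse order.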
The specific embodiment described herein merely illustrates the spirit of the present invention. Those skilled in the art may make various modifications or additions to the described embodiment, or substitute similar approaches, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.
Claims (3)
1., based on a unknown position evaluation method for internet active iteration detection, it is characterized in that, comprise the following steps:
Step 1, checks user's input position query word; If position cannot obtain geographical covering from spatial database, then initiatively start internet iteration detection, be namely the theme with target location and utilize network search engines to crawl target location relevant information from internet;
Step 2, is the theme with position enquiring word and carries out initial probe, utilizes network engine to obtain from internet to comprise the collections of web pages that target location describes;
Step 3, the network documentation that the target location obtained for step 2 describes carries out geographic position parsing, and namely from network documentation, extract natural language location expression, described natural language location expression comprises reference position and spatial relationship;
Step 4, the natural language location expression adopting step 3 to obtain carries out location expression classification; If the reference position of location expression can obtain geographical covering from location database, location expression stored in accurate description collections P, otherwise stored in vague description set A;
Step 5, evaluate the credibility rate C_s of the current search; if C_s is less than the search credibility threshold C_min, start a new round of Internet text search with the reference locations in the fuzzy description set A as topics; if C_s is greater than or equal to the search credibility threshold C_min, skip to step 7. The specific method for evaluating the credibility rate C_s of the current search is as follows: the search credibility rate C_s is defined as the evaluation index; it is the ratio of the sum of the confidences of all location descriptions in P to the total number of location descriptions, as shown in formula two, where m is the number of location descriptions in P, K is the total number of location descriptions, and Loc_i.C is the confidence of the i-th location description:
$$C_s = \frac{\sum_{i=1}^{m} Loc_i.C}{K} \qquad \text{(formula two)}$$
The confidence of a location description is calculated according to formula three, where ε is the attenuation parameter and n is the number of searches; the confidence of a location description is set to 1 for the first search and decays as the number of searches increases;
$$Loc_i.C = 1 \cdot \varepsilon^{n} \qquad \text{(formula three)}$$
When C_s meets the minimum credibility threshold C_min, the precise description set P is output directly for target location estimation; when C_s does not meet the condition, multiple rounds of iterative Internet search are used to ensure the search credibility rate, that is, a new round of Internet search is carried out for the fuzzy reference locations in A: the geographic range of each reference location is first estimated from Internet resources, and the target location is then estimated from the reference locations.
Step 6, repeat step 1 to step 5 until the credibility rate of each round of search results meets the threshold or the search-count limit is reached;
Step 7, calculate the approximate geographic range of every location description and its confidence;
Step 8, integrate and refine the geographic coverage of the multiple location descriptions to obtain the geographic range of the target location;
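As an illustration of the credibility check in step 5, the following minimal Python sketch computes the confidence of a location description and the search credibility rate. It reads formula two as the sum of the confidences in P divided by the total number of descriptions K, which is how the claim words it, and formula three as ε raised to the search count n, with n = 0 for the first search so that the initial confidence is 1. The function names and the sample values of ε and C_min are assumptions made for this example only and do not come from the patent.

```python
from typing import Sequence


def description_confidence(n: int, epsilon: float) -> float:
    """Formula three: confidence 1 * epsilon**n (n = 0 for the first search)."""
    if n < 0:
        raise ValueError("search count n must be non-negative")
    if not 0.0 < epsilon <= 1.0:
        raise ValueError("attenuation parameter epsilon is assumed to lie in (0, 1]")
    return 1.0 * epsilon ** n


def search_credibility(confidences_in_p: Sequence[float], total_k: int) -> float:
    """Formula two: sum of confidences of descriptions in P divided by K."""
    if total_k == 0:
        return 0.0  # no descriptions extracted: treat the search as not credible
    return sum(confidences_in_p) / total_k


# Hypothetical values: 3 precise descriptions out of 5, first search (n = 0).
EPSILON = 0.8   # assumed attenuation parameter
C_MIN = 0.5     # assumed credibility threshold
confidences = [description_confidence(0, EPSILON) for _ in range(3)]
c_s = search_credibility(confidences, total_k=5)
needs_new_round = c_s < C_MIN  # True would trigger a new search on the set A
```

With these readings, a round in which most descriptions fall into A yields a low C_s and therefore triggers another search round on the fuzzy reference locations.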
Said step 6 is the iterative search of fuzzy reference locations; following the process of step 4 and step 5, a search result is expressed by formula four, where n is the number of the search, m is the number of the location within the current search, P is the precise description set, A is the fuzzy description set, and C_s is the search credibility rate:
$$WS[n][m] = \{P, A, C_s\} \qquad \text{(formula four)}$$
Said iterative search process comprises the following sub-steps:
Step 4.1, store the fuzzy location descriptions WS[0][0].A of the target-location search result in the search set Q, and set n=0, m=0;
Step 4.2, take a fuzzy description set WS[n][m].A from Q and determine whether n+1 reaches the search-count limit; if so, exit the search;
Step 4.3, take each location description Loc_i in WS[n][m].A in turn and carry out the (n+1)-th search to obtain the search result WS[n+1][i], and associate this search with the search reference of the location description's reference object RO, i.e. Loc_i.S = WS[n+1][i];
Step 4.4, remove the searched fuzzy description set WS[n][m].A from Q, and check whether WS[n+1][i].C_s meets the threshold C_min; if not, put WS[n+1][i].A into the search set Q;
Step 4.5, check whether any fuzzy description set remains in Q; if so, repeat step 4.2 to step 4.4 to continue the iterative search;
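The sub-steps 4.1 to 4.5 can be sketched as a breadth-first loop over a queue, as below. This is only one possible reading: the class names, the `search` callback and the `max_round` parameter are hypothetical, the results of each round are numbered in the order they are produced, and "reaches the search-count limit" is interpreted as the next round number exceeding `max_round`.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Loc:
    RO: str                              # name of the reference location
    S: Optional[Tuple[int, int]] = None  # (round, index) of its follow-up search


@dataclass
class SearchResult:
    P: List[Loc]   # precise description set
    A: List[Loc]   # fuzzy description set
    Cs: float      # search credibility rate


def iterative_search(initial: SearchResult,
                     search: Callable[[str, int], SearchResult],
                     c_min: float,
                     max_round: int) -> List[List[SearchResult]]:
    """Return ws, where ws[n][m] plays the role of WS[n][m]."""
    ws = [[initial]]
    if initial.Cs >= c_min:
        return ws                        # credible enough, no iteration needed
    queue = deque([(0, 0)])              # step 4.1: Q holds results whose A is pending
    while queue:                         # step 4.5: continue while Q is not empty
        n, m = queue.popleft()           # step 4.2 (and the removal in step 4.4)
        if n + 1 > max_round:
            break                        # search-count limit reached
        for loc in ws[n][m].A:           # step 4.3: search each fuzzy reference
            if len(ws) == n + 1:
                ws.append([])
            result = search(loc.RO, n + 1)
            ws[n + 1].append(result)
            loc.S = (n + 1, len(ws[n + 1]) - 1)
            if result.Cs < c_min and result.A:   # step 4.4
                queue.append(loc.S)
    return ws
```

Because the queue is processed first-in, first-out, all results of round n are expanded before any result of round n+1, which keeps the round index of `ws` consistent with the search count n used in the confidence decay.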
Said step 7: because the fuzzy location descriptions of the k-th search result need to refer to the (k+1)-th search result, the calculation proceeds in reverse order, that is, the geographic range calculation starts from the last search, and specifically comprises the following sub-steps:
Step 5.1, in the search results WS, define the number of searches as n and the number of locations in the n-th search as m, m = WS[n-1].size; define a geographic range set FC to store the geographic range of each search result;
Step 5.2, take the search result WS[n-1][m-1] of the m-th location of the n-th search;
Step 5.3, take each location Loc_y in WS[n-1][m-1].P in turn, query the reference location coordinates from the location database, and use the Point-Radius algorithm to calculate its geographic coverage FP(y) and confidence CP(y);
Step 5.4, take each location Loc_x in WS[n-1][m-1].A in turn and use Loc_x.S to query the reference location coordinates in the geographic range set FC; if the coordinates are obtained successfully, use the Point-Radius algorithm to calculate its geographic coverage FA(x) and confidence CA(x);
Step 5.5, merge the geographic ranges of all locations in P and A to obtain the geographic range FC(WS[n-1][m-1]) of the current search result;
Step 5.6, determine whether m-1 is greater than 0; if so, proceed to the location calculation of the next search result: set m = m-1 and skip to step 5.2; if it is less than or equal to 0, go to the next step;
Step 5.7, determine whether n-1 is greater than 0; if so, proceed to the location calculation of the previous search: set n = n-1 and m = WS[n-1].size, and skip to step 5.2; if it is less than or equal to 0, go to the next step;
Step 5.8, output FC(WS[0][0]).
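The control flow of sub-steps 5.1 to 5.8 amounts to visiting the search results from the last round back to the first, so that every fuzzy description finds the range of its follow-up search already stored in FC. The sketch below shows only this traversal. The Point-Radius computation and the merging of ranges are not reproduced; they are passed in as a hypothetical `estimate` callback, and all names are assumptions for illustration.

```python
from typing import Any, Callable, Dict, Iterator, List, Sequence, Tuple

Key = Tuple[int, int]  # zero-based (round, position), i.e. WS[n-1][m-1]


def backward_order(sizes: Sequence[int]) -> Iterator[Key]:
    """Yield result indices in the order of steps 5.2, 5.6 and 5.7.

    sizes[r] is the number of search results in round r (WS[r].size).
    """
    n = len(sizes)                 # step 5.1: n searches in total
    while n > 0:
        m = sizes[n - 1]           # m = WS[n-1].size
        while m > 0:
            yield (n - 1, m - 1)   # step 5.2: take WS[n-1][m-1]
            m -= 1                 # step 5.6
        n -= 1                     # step 5.7


def compute_ranges(ws: List[List[Any]],
                   estimate: Callable[[Any, Dict[Key, Any]], Any]) -> Any:
    """Fill FC backwards and return FC(WS[0][0]) (step 5.8).

    `estimate(result, fc)` stands in for steps 5.3 to 5.5; it may look up
    the ranges of later rounds in `fc`. `ws` is assumed to be non-empty.
    """
    fc: Dict[Key, Any] = {}
    for n, m in backward_order([len(round_) for round_ in ws]):
        fc[(n, m)] = estimate(ws[n][m], fc)
    return fc[(0, 0)]
```

For two rounds with one and two results respectively, the traversal visits the two results of the second round first and the initial result last, which is why the final value stored is the range of the target location itself.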
2. The method for estimating an unknown location based on active iterative Internet probing according to claim 1, characterized in that, in said step 3, the recognition of natural-language location descriptions mainly comprises place-name recognition and spatial-relation recognition, and a semantics-based multi-scale extraction method is used to extract the natural-language location descriptions, specifically comprising the following sub-steps:
Step 3.1, build a corpus of location descriptions, which stores the feature vocabulary expressing place names and spatial relations as well as the syntactic patterns of location descriptions;
Step 3.2, with the support of the corpus, perform pattern matching on the web text to obtain location descriptions;
Step 3.3, eliminate place-name ambiguity based on geographic and non-geographic semantics.
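A rough idea of sub-steps 3.1 and 3.2 can be given with a regular-expression matcher built from a tiny vocabulary. The sketch below is an assumption-laden toy: the English word lists, the single syntactic pattern "relation + place name" and the function names are invented for illustration, whereas the patent's corpus and patterns target Chinese web text and are not reproduced here. The disambiguation of step 3.3 is not covered.

```python
import re
from typing import List, Pattern, Sequence, Tuple

# Hypothetical feature vocabulary standing in for the corpus of step 3.1.
PLACE_SUFFIXES = ["Road", "Street", "Bridge", "Park", "Station"]
SPATIAL_RELATIONS = ["north of", "south of", "east of", "west of", "near", "opposite"]


def build_pattern(relations: Sequence[str], suffixes: Sequence[str]) -> Pattern[str]:
    """Compile one syntactic pattern: <spatial relation> [the] <Place Name + suffix>."""
    rel = "|".join(re.escape(r) for r in sorted(relations, key=len, reverse=True))
    suf = "|".join(re.escape(s) for s in suffixes)
    return re.compile(
        rf"\b(?P<relation>{rel})\s+(?:the\s+)?"
        rf"(?P<place>(?:[A-Z][\w'-]*\s+)*?(?:{suf}))\b"
    )


def extract_descriptions(text: str, pattern: Pattern[str]) -> List[Tuple[str, str]]:
    """Step 3.2: return (spatial relation, reference place) pairs found in the text."""
    return [(m.group("relation"), m.group("place")) for m in pattern.finditer(text)]


pattern = build_pattern(SPATIAL_RELATIONS, PLACE_SUFFIXES)
sample = "The cafe is north of Zhongshan Park, near the East Lake Bridge."
pairs = extract_descriptions(sample, pattern)
# pairs == [("north of", "Zhongshan Park"), ("near", "East Lake Bridge")]
```

Each extracted pair corresponds to the reference location and spatial relation of one location description; a production extractor would need a far richer vocabulary and several syntactic patterns.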
3. The method for estimating an unknown location based on active iterative Internet probing according to claim 1, characterized in that, in said step 3, the precondition for estimating the target location from a reference location and a spatial relation is that a precise geographic range of the reference location can be obtained from the location database; a single location description is expressed according to formula one, where RO is the name of the reference location, SR is the spatial relation of the location, T is the time at which the location description occurred, C is the confidence of the location description, and S is the search reference of the reference object RO; the first K location descriptions Loc_i in the results are extracted and classified according to the precondition: when Loc_i.RO meets the precondition, Loc_i is stored in the precise description set P; otherwise it is stored in the fuzzy description set A;
$$Loc = \{RO, SR, T, C, S\} \qquad \text{(formula one)}$$
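The record of formula one and the classification into P and A might be represented as follows. This is a minimal sketch: the field types, the use of a set of known place names as a stand-in for the location database, and the function name `classify` are assumptions, not details given in the patent.

```python
from dataclasses import dataclass
from typing import Any, Collection, List, Optional, Sequence, Tuple


@dataclass
class Loc:
    RO: str                   # name of the reference location
    SR: str                   # spatial relation to the reference location
    T: Optional[str] = None   # time at which the description occurred
    C: float = 1.0            # confidence of the description
    S: Any = None             # search reference of the reference object RO


def classify(descriptions: Sequence[Loc],
             known_places: Collection[str],
             k: int) -> Tuple[List[Loc], List[Loc]]:
    """Split the first k descriptions into the precise set P and the fuzzy set A.

    A description is precise when its reference location is found in
    `known_places`, which stands in for the location database lookup.
    """
    precise: List[Loc] = []
    fuzzy: List[Loc] = []
    for loc in descriptions[:k]:
        (precise if loc.RO in known_places else fuzzy).append(loc)
    return precise, fuzzy
```

Descriptions that land in the fuzzy set keep their S field empty until a follow-up search on their reference location has been carried out.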
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210579579.2A CN103077201B (en) | 2012-12-27 | 2012-12-27 | A kind of unknown position evaluation method based on the detection of internet active iteration |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103077201A CN103077201A (en) | 2013-05-01 |
CN103077201B true CN103077201B (en) | 2016-03-30 |
Family
ID=48153731
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210579579.2A (Expired - Fee Related) CN103077201B (en) | 2012-12-27 | 2012-12-27 | A kind of unknown position evaluation method based on the detection of internet active iteration |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103077201B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103309992B (en) * | 2013-06-20 | 2016-03-16 | 武汉大学 | A kind of positional information extracting method towards natural language |
CN103347064A (en) * | 2013-06-25 | 2013-10-09 | 百度在线网络技术(北京)有限公司 | Method and system for displaying user location |
US10802485B2 (en) * | 2017-10-09 | 2020-10-13 | Here Global B.V. | Apparatus, method and computer program product for facilitating navigation of a vehicle based upon a quality index of the map data |
CN109858508A (en) * | 2018-10-23 | 2019-06-07 | 重庆邮电大学 | IP localization method based on Bayes and deep neural network |
CN109582792A (en) * | 2018-11-16 | 2019-04-05 | 北京奇虎科技有限公司 | A kind of method and device of text classification |
CN113254627B (en) * | 2021-04-16 | 2023-07-25 | 国网河北省电力有限公司经济技术研究院 | Data reading method, device and terminal |
CN113297456B (en) * | 2021-05-20 | 2023-04-07 | 北京三快在线科技有限公司 | Searching method, searching device, electronic equipment and storage medium |
CN116562234B (en) * | 2023-03-30 | 2024-08-09 | 深圳市规划和自然资源数据管理中心 | Multi-source data fusion voice indoor positioning method and related equipment |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101147079A (en) * | 2005-03-24 | 2008-03-19 | SiRF技术公司 | System and method for providing location based services over a network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8285696B2 (en) * | 2006-06-09 | 2012-10-09 | Routecentric, Inc. | Apparatus and methods for providing route-based advertising and vendor-reported business information over a network |
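Non-Patent Citations (2)
Title |
---|
A mobile Internet social model based on geographic location information; 杨煜尧 et al.; 《计算机研究与发展》; 20111231; full text * |
A rule-based method for parsing Chinese address elements; 李海光 et al.; 《地理信息科学学报》; 20101231; full text * |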
Also Published As
Publication number | Publication date |
---|---|
CN103077201A (en) | 2013-05-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103077201B (en) | A kind of unknown position evaluation method based on the detection of internet active iteration | |
CN108388559B (en) | Named entity identification method and system under geographic space application and computer program | |
Rae et al. | Mining the web for points of interest | |
JP5390840B2 (en) | Information analyzer | |
US8682646B2 (en) | Semantic relationship-based location description parsing | |
US20150356088A1 (en) | Tile-based geocoder | |
EP2209073A1 (en) | Location based system utilizing geographical information from documents in natural language | |
CN110472066A (en) | A kind of construction method of urban geography semantic knowledge map | |
Chen et al. | Georeferencing places from collective human descriptions using place graphs | |
CN115129719B (en) | Qualitative position space range construction method based on knowledge graph | |
Drymonas et al. | Geospatial route extraction from texts | |
Musleh et al. | Let's speak trajectories | |
Shi et al. | Extraction of geospatial information on the Web for GIS applications | |
Cheng et al. | Quickly locating POIs in large datasets from descriptions based on improved address matching and compact qualitative representations | |
Shi et al. | Thematic data extraction from Web for GIS and applications | |
Dong et al. | Learning the spatial co-occurrence for browsing interests extraction of domain users on public map service platforms | |
CN116431839B (en) | Regional network generation method, system, computer equipment and storage medium | |
CN117473025A (en) | Address matching method, device, equipment and storage medium based on deep learning | |
Fränti et al. | Location-based search engine for multimedia phones | |
Bui | Automatic construction of POI address lists at city streets from geo-tagged photos and web data: a case study of San Jose City | |
CN111680122B (en) | Space data active recommendation method and device, storage medium and computer equipment | |
CN116431625A (en) | Positioning analysis method and device for geographic entity and computer equipment | |
Formica et al. | Constraint relaxation of the polygon-polyline topological relation for geographic pictorial query languages | |
Martins | Geographically aware web text mining | |
Venkateswaran et al. | Exploring and visualizing differences in geographic and linguistic web coverage |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160330; Termination date: 20211227 |