CN105022747B - Address query string parsing method and device
- Publication number: CN105022747B
- Application number: CN201410174465.9A
- Authority: CN (China)
- Prior art keywords
- address
- level
- substring
- ingredient
- string
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The present invention relates to the technical field of geographic location search, and in particular to an address query string parsing method and device, intended to solve the problem of inaccurate address search results in the prior art. The method comprises: obtaining the address query string input by a user and parsing it with a geographic information system (GEO) to obtain an address string; correcting the address components and non-address components in the address string through pattern recognition to obtain a corrected address query string; and constructing a first-level substring and a second-level substring from the corrected address query string as the parsing result.
Description
Technical field
The present invention relates to the technical field of geographic location search, and in particular to an address query string parsing method and device.
Background art
Address search has become an important service in map products. Through address search, people can locate specific destinations, find everyday-life services, and plan routes.
Address search, however, has its own particularities. In a map, data such as POIs (points of interest) is partitioned by space when the inverted index is built, so finding the correct spatial range is essential to retrieving what the user needs.
Typically, users do not specify the spatial range themselves in a map search; it generally has to be obtained by having a computer analyze the user's query string. Once the spatial range has been parsed, the POI (or address) that the user actually wants must also be extracted from the query string, so that the user's real search demand can be retrieved within the correct spatial range. Put simply, the computer has to work out "where" the user wants to search for "what", which is also called the "where what" parsing problem.
For example, for the query "Beijing City Chaoyang District Fu Tong East Street Fang Heng International Center" (written largest unit first, as in the original Chinese), the user expects a search for "Fang Heng International Center" within the range of "Beijing City Chaoyang District Fu Tong East Street". Existing address search cannot parse this intent accurately and usually searches for "Chaoyang District Fu Tong East Street Fang Heng International Center" within the range of "Beijing", so accurate results are not obtained and the user experience is poor. A method is urgently needed that can accurately analyze the where information and the what information in a query string and look up the correct content string within the correct spatial range, thereby improving search quality.
Summary of the invention
Embodiments of the present invention provide an address query string parsing method and device, to solve the problem of inaccurate address search results in the prior art.
An embodiment of the present invention provides an address query string parsing method, comprising:
obtaining the address query string input by a user and parsing it with a geographic information system (GEO) to obtain an address string;
correcting the address components and non-address components in the address string through pattern recognition to obtain a corrected address query string;
constructing a first-level substring and a second-level substring from the corrected address query string as the parsing result.
The method further comprises:
filtering the non-address components in the first-level substring and the second-level substring according to a preset filtering vocabulary, and taking the filtered first-level substring and second-level substring as the parsing result.
Filtering the non-address components in the first-level substring and the second-level substring comprises:
determining whether the non-address component hits the filtering vocabulary; if so, filtering it out directly; otherwise, continuing with the following step;
determining whether the non-address component is a category word; if so, filtering it out directly; otherwise, continuing with the following step;
determining whether the non-address component is a door-address word; if so, filtering it out directly; otherwise, not filtering it.
Constructing the first-level substring and the second-level substring comprises:
obtaining, from the corrected address query string, the spatial information at district level and the spatial information of the lowest-level address fragment;
searching, within the district-level spatial range, for the query string fragment below the district level, as the first-level substring;
searching, within the spatial range of the lowest-level address fragment, for the query string fragment after that fragment, as the second-level substring.
Correcting the address components and non-address components in the address string through pattern recognition comprises:
obtaining the non-address component before the address components, mstart, and the non-address component after the address components, mend;
judging through pattern recognition whether mstart and mend are important;
if mstart is important and mend is unimportant, appending mend after the street fragment of the address components, writing the content of mstart into mend, and then clearing mstart;
if mstart and mend are both important, clearing mstart;
if mstart is unimportant and mend is important, clearing mstart.
The method further comprises:
if mstart is empty, performing no pattern recognition.
Judging through pattern recognition whether mstart and mend are important comprises:
determining whether the non-address component is a chain-store name or a brand word; if so, judging it important; otherwise, continuing with the following step;
determining whether the non-address component is a category word; if so, judging it unimportant; otherwise, continuing with the following step;
segmenting the non-address component into words and determining whether it has the form of an area address word followed by a category word; if so, judging it important; otherwise, judging it unimportant.
An address query string parsing device, comprising:
a GEO parsing module, configured to obtain the address query string input by a user and parse it with a geographic information system (GEO) to obtain an address string;
a pattern recognition module, configured to correct the address components and non-address components in the address string through pattern recognition to obtain a corrected address query string;
a substring construction module, configured to construct a first-level substring and a second-level substring from the corrected address query string as the parsing result.
The device further comprises a filtering vocabulary module, configured to filter the non-address components in the first-level substring and the second-level substring according to a preset filtering vocabulary, and to take the filtered first-level substring and second-level substring as the parsing result.
The filtering vocabulary module comprises:
a filtering vocabulary judging submodule, configured to determine whether the non-address component hits the filtering vocabulary;
a category word judging submodule, configured to determine whether the non-address component is a category word;
a door-address word judging submodule, configured to determine whether the non-address component is a door-address word.
The substring construction module comprises:
a spatial information acquisition submodule, configured to obtain, from the corrected address query string, the spatial information at district level and the spatial information of the lowest-level address fragment;
a first-level substring construction submodule, configured to search, within the district-level spatial range, for the query string fragment below the district level, as the first-level substring;
a second-level substring construction submodule, configured to search, within the spatial range of the lowest-level address fragment, for the query string fragment after that fragment, as the second-level substring.
The pattern recognition module comprises:
an address division submodule, configured to obtain the non-address component before the address components, mstart, and the non-address component after the address components, mend;
a pattern recognition submodule, configured to judge through pattern recognition whether mstart and mend are important; if mstart is important and mend is unimportant, to append mend after the street fragment of the address components, write the content of mstart into mend, and clear mstart; if mstart and mend are both important, to clear mstart; and if mstart is unimportant and mend is important, to clear mstart.
The pattern recognition submodule is further configured to determine whether the non-address component is a chain-store name or a brand word and, if so, judge it important; to determine whether the non-address component is a category word and, if so, judge it unimportant; and to segment the non-address component into words and determine whether it has the form of an area address word followed by a category word, judging it important if so and unimportant otherwise.
In the embodiments of the present invention, the address query string input by a user is obtained and parsed with a geographic information system (GEO) to obtain an address string; the address components and non-address components in the address string are corrected through pattern recognition to obtain a corrected address query string; and a first-level substring and a second-level substring are constructed from the corrected address query string as the parsing result. With this parsing result, where-what parsing is performed over two levels of spatial range, which helps raise the recall of the address search engine and returns results closer to what the user wants. On this basis, the embodiments also filter the what at both levels, which reduces the recall of irrelevant results while preserving the gain in recall, and so safeguards the accuracy of what is recalled.
Brief description of the drawings
Fig. 1 is a flowchart of the address query string parsing method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the address query string parsing method provided by Embodiment 2 of the present invention;
Fig. 3 is a structural diagram of the address query string parsing device provided by Embodiment 5 of the present invention;
Fig. 4 is a structural diagram of the filtering vocabulary module 34 provided by Embodiment 6 of the present invention;
Fig. 5 is a structural diagram of the substring construction module 33 provided by Embodiment 7 of the present invention;
Fig. 6 is a structural diagram of the pattern recognition module 32 provided by Embodiment 8 of the present invention.
Detailed description of the embodiments
In the embodiments of the present invention, a geocoding system (GEO) first makes a preliminary parse that separates the address string from the non-address strings in the query string. Based on the GEO parsing result, a first-level non-address string and a second-level non-address string are then constructed, and unsuitable substrings are filtered out using pattern recognition and offline mining. By analyzing the user's query string in this way, the spatial information and the search content information are extracted from it, so that the user's demand is searched within the correct address space and search quality improves.
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiment 1
As shown in Fig. 1, the address query string parsing method provided by Embodiment 1 of the present invention comprises the following steps:
S101: obtain the address query string input by a user and parse it with a geographic information system (GEO) to obtain an address string;
S102: correct the address components and non-address components in the address string through pattern recognition to obtain a corrected address query string;
S103: construct a first-level substring and a second-level substring from the corrected address query string as the parsing result.
In this embodiment, based on the GEO parsing result, where-what parsing is performed over two levels of spatial range, which helps raise the recall of the address search engine and returns results closer to what the user wants. On this basis, the what is filtered at both levels, which reduces the recall of irrelevant results while preserving the gain in recall, and so safeguards the accuracy of what is recalled.
Embodiment 2
As shown in Fig. 2, the address query string parsing method provided by Embodiment 2 of the present invention comprises the following steps:
S201: obtain the address query string input by a user and parse it with a geographic information system (GEO) to obtain an address string;
S202: correct the address components and non-address components in the address string through pattern recognition to obtain a corrected address query string;
S203: construct a first-level substring and a second-level substring from the corrected address query string;
S204: filter the non-address components in the first-level substring and the second-level substring according to a preset filtering vocabulary, and take the filtered first-level substring and second-level substring as the parsing result.
Here, filtering the non-address components in the first-level substring and the second-level substring optionally comprises:
determining whether the non-address component hits the filtering vocabulary; if so, filtering it out directly; otherwise, continuing with the following step;
determining whether the non-address component is a category word; if so, filtering it out directly; otherwise, continuing with the following step;
determining whether the non-address component is a door-address word; if so, filtering it out directly; otherwise, not filtering it.
Filtering the non-address components in this way avoids recalling a large number of irrelevant door addresses that the user does not need.
Constructing the first-level substring and the second-level substring comprises:
obtaining, from the corrected address query string, the spatial information at district level and the spatial information of the lowest-level address fragment;
searching, within the district-level spatial range, for the query string fragment below the district level, as the first-level substring;
searching, within the spatial range of the lowest-level address fragment, for the query string fragment after that fragment, as the second-level substring.
In this embodiment of the present invention, the second-level substring is obtained by parsing further on top of the already constructed first-level substring, so that the parsed address query string is clearer and the query results the user needs are easier to obtain.
Correcting the address components and non-address components in the address string through pattern recognition comprises:
obtaining the non-address component before the address components, mstart, and the non-address component after the address components, mend;
judging through pattern recognition whether mstart and mend are important;
if mstart is important and mend is unimportant, appending mend after the street fragment of the address components, writing the content of mstart into mend, and then clearing mstart;
if mstart and mend are both important, clearing mstart;
if mstart is unimportant and mend is important, clearing mstart.
If mstart is empty, no pattern recognition is performed.
Judging whether mstart and mend are important comprises:
determining whether the non-address component is a chain-store name or a brand word; if so, judging it important; otherwise, continuing with the following step;
determining whether the non-address component is a category word; if so, judging it unimportant; otherwise, continuing with the following step;
segmenting the non-address component into words and determining whether it has the form of an area address word followed by a category word; if so, judging it important; otherwise, judging it unimportant.
In this embodiment of the present invention, the spatial information and the search content information are extracted from the user's query string by analyzing it, so that the user's demand is searched within the correct address space and search quality improves.
Embodiment 3
In this embodiment of the present invention, the GEO system receives the user's query string, scans it from left to right, identifies the address components and non-address components in it from the largest administrative unit to the smallest, and gives the longitude/latitude range for each address level. Taking the user's address query string "Beijing City Chaoyang District Fu Tong East Street KFC" (written largest unit first, as in the original Chinese) as an example, the result after GEO parsing is:
Country: China
Province: Beijing
District: Chaoyang District
Street: Fu Tong East Street
City code (adcode): 110105
Non-address string before the address string: empty
Non-address string after the address string: KFC
Longitude: 116.475339
Latitude: 39.986453
Range: 1000 m
Geocoding confidence: 0.705882
Lowest-level address: Fu Tong East Street
Level of the lowest-level address fragment: road
Area code of the lowest-level address fragment: 110105
Longitude of the lowest-level address fragment: 116.475339
Latitude of the lowest-level address fragment: 39.986453
Range of the lowest-level address fragment: 1000 m
If the original query string were instead "KFC Beijing City Chaoyang District Fu Tong East Street", the GEO parsing result would contain:
Non-address string before the address string: KFC
Non-address string after the address string: empty.
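The fields of a GEO result can be pictured as a plain record. The following Python sketch is only an illustration: the class name `GeoResult` and every field name are assumptions made here for readability, not identifiers defined by the patent. The values are those of the example above, and the second record assumes that only the two non-address fields differ, since those are the only differences the text lists.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GeoResult:
    """Hypothetical record for one GEO parse; all field names are illustrative."""
    country: str
    province: str
    district: str
    street: str
    adcode: str        # district-level administrative code
    mstart: str        # non-address string before the address string ("" = empty)
    mend: str          # non-address string after the address string ("" = empty)
    lng: float
    lat: float
    radius_m: int
    confidence: float
    min_address: str   # lowest-level address fragment
    min_level: str     # level of that fragment, e.g. "road"


# Query with "KFC" at the end of the address string.
kfc_last = GeoResult(
    country="China", province="Beijing", district="Chaoyang District",
    street="Fu Tong East Street", adcode="110105", mstart="", mend="KFC",
    lng=116.475339, lat=39.986453, radius_m=1000, confidence=0.705882,
    min_address="Fu Tong East Street", min_level="road",
)

# Same query with "KFC" at the front (assumption: other fields unchanged).
kfc_first = replace(kfc_last, mstart="KFC", mend="")
```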
Next, this GEO parsing result is corrected through pattern recognition.
Because users' search habits differ, the query strings they enter often do not follow the strict pattern of an address string ordered from large to small followed by a non-address string. Take "KFC (Fu Tong East Street Fang Heng International Building branch)": here the non-address component "KFC" comes before the address components, so GEO parses it as "the non-address component before the address components", denoted mstart. Likewise, the non-address component after the address components is denoted mend.
Statistics show that a user's non-address component very rarely appears in the middle of the address components, as in "Chaoyang District KFC Fu Tong East Street", so the pattern recognition in the embodiments of the present invention mainly handles the form in which the non-address string precedes the address string.
If mstart is empty after GEO parsing, no pattern recognition is needed.
If mstart is not empty, pattern recognition must decide whether mstart is "important". If mstart is judged "important" and mend is judged "unimportant", the following three actions are taken:
the "street" field in the GEO result (denoted address) becomes address + mend;
the mend field in the GEO result becomes mstart;
the mstart field in the GEO result becomes empty.
For example, in the address string "KFC Fu Tong East Street south gate", address = Fu Tong East Street, mstart = KFC, and mend = south gate. Pattern judgment finds "KFC" to be "important" and "south gate" to be "unimportant", so the three parameters are corrected to: address = Fu Tong East Street south gate, mstart = empty, mend = KFC. The where-what parse is then: within the spatial range of "Fu Tong East Street south gate", search for "KFC".
If mstart is not empty and mstart and mend are both judged "important" (or both "unimportant"), mstart is simply set to empty; if mstart is judged "unimportant" and mend is judged "important", mstart is likewise set to empty.
For example, in "KFC Fu Tong East Street McDonald's", "KFC" and "McDonald's" are both "important", so the parse is only: within the spatial range of "Fu Tong East Street", search for "McDonald's".
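The correction rules above reduce to a small decision function. The sketch below is a minimal illustration under stated assumptions: the function name `correct` is hypothetical, the importance flags are passed in rather than computed, and fragments are joined with a space because the examples are rendered in English (the original Chinese strings would be concatenated directly).

```python
def correct(address: str, mstart: str, mend: str,
            mstart_important: bool, mend_important: bool):
    """Return the corrected (address, mstart, mend) triple."""
    if not mstart:
        # Empty mstart: no pattern recognition, nothing to correct.
        return address, mstart, mend
    if mstart_important and not mend_important:
        # address <- address + mend; mend <- mstart; mstart <- empty.
        return (address + " " + mend).strip(), "", mstart
    # Both important, both unimportant, or only mend important:
    # mstart is simply cleared.
    return address, "", mend


swapped = correct("Fu Tong East Street", "KFC", "south gate", True, False)
dropped = correct("Fu Tong East Street", "KFC", "McDonald's", True, True)
```

With the two examples from the text, `swapped` holds the address "Fu Tong East Street south gate" with "KFC" as the what, while `dropped` keeps "Fu Tong East Street" and searches for "McDonald's".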
Whether a non-address component is important can optionally be judged as follows:
If the non-address component contains a chain-store name or brand name, such as "Adidas" or "Yoshinoya", it is directly considered "important"; otherwise, judgment continues.
If the non-address component is a category word such as "hotel" or "supermarket", it is directly considered "unimportant", because this is a generic demand from which the place the user actually wants to go cannot be inferred; otherwise, judgment continues.
The non-address component is then segmented into words and judged by the parts of speech of the segments: if it follows the pattern of area address + category word, for example "Wangjing hospital" or "Huilongguan primary school", it is considered important.
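The three-step judgment can be sketched as a cascade of lookups. Everything named here is an assumption for illustration: the function `is_important`, the three tiny word sets, and above all the prefix match in step 3, which merely stands in for the word segmentation and part-of-speech analysis that the text describes but does not specify.

```python
# Assumed sample lexicons; a real system would load much larger ones.
BRAND_WORDS = {"kfc", "adidas", "yoshinoya"}
CATEGORY_WORDS = {"hotel", "supermarket", "hospital", "primary school"}
AREA_WORDS = {"wangjing"}


def is_important(component: str) -> bool:
    """Judge a non-address component; empty input counts as unimportant."""
    text = component.strip().lower()
    if not text:
        return False
    # Step 1: contains a chain-store or brand name -> important.
    if any(brand in text for brand in BRAND_WORDS):
        return True
    # Step 2: bare category word -> unimportant (generic demand).
    if text in CATEGORY_WORDS:
        return False
    # Step 3 (stand-in for segmentation): area word + category word -> important.
    for area in AREA_WORDS:
        if text.startswith(area + " ") and text[len(area):].strip() in CATEGORY_WORDS:
            return True
    return False
```

Ordering the checks from most to least specific means a brand name wins even when the component also contains a category word.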
Next, the first-level substring and the second-level substring are constructed.
After the GEO result has been corrected, two levels of where-what parsing are performed.
From the spatial ranges in the GEO result, taking the example above, two levels of information are available:
1. The spatial information at district level, i.e. "110105". This range refers to the administrative division code (adcode) at district/county level or above. An adcode generally has 6 digits: at province level the last 4 digits are 0, at city level the last 2 digits are 0, and at district level all 6 digits are significant.
2. The spatial information of the lowest-level address fragment, i.e. its longitude, latitude and radius. This range generally varies from tens of kilometers down to tens of meters, and is a smaller, more specific spatial range than the district level represented by the adcode. It is usually determined by roads, business districts and well-known POIs.
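The adcode rule in item 1 can be checked mechanically. The helper below is a hypothetical sketch (the name `adcode_level` and the returned labels are assumptions) that applies only the trailing-zero rule stated above.

```python
def adcode_level(adcode: str) -> str:
    """Classify a 6-digit adcode by its trailing zeros."""
    if len(adcode) != 6 or not adcode.isdigit():
        raise ValueError("adcode must be a 6-digit string")
    if adcode.endswith("0000"):
        return "province"   # last 4 digits are 0
    if adcode.endswith("00"):
        return "city"       # last 2 digits are 0
    return "district"       # all 6 digits are significant
```

For the example above, `adcode_level("110105")` yields `"district"`.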
Based on these two spatial ranges, the embodiments of the present invention construct two levels of where-what parsing:
Within the spatial range at district/county level or above (where), search for the query string fragment below the district level (what).
Within the spatial range of the lowest-level address fragment (where), search for the query string fragment after the lowest-level address fragment (what).
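The two-level construction amounts to cutting the corrected query string at two points. The sketch below is illustrative only: `WhereWhat` and `build_levels` are hypothetical names, and it assumes the query is written largest unit first (as in the original Chinese) and that the district and lowest-level fragment returned by GEO occur literally in the string.

```python
from typing import NamedTuple, Optional, Tuple


class WhereWhat(NamedTuple):
    where_text: str                               # address prefix defining the range
    adcode: Optional[str]                         # set for the first level
    circle: Optional[Tuple[float, float, int]]    # (lng, lat, radius_m), second level
    what: str                                     # fragment to search for


def build_levels(query: str, district: str, min_address: str,
                 adcode: str, circle: Tuple[float, float, int]):
    """Split a corrected query into first-level and second-level where/what."""
    d_end = query.index(district) + len(district)
    m_end = query.index(min_address) + len(min_address)
    level1 = WhereWhat(query[:d_end].strip(), adcode, None, query[d_end:].strip())
    level2 = WhereWhat(query[:m_end].strip(), None, circle, query[m_end:].strip())
    return level1, level2


level1, level2 = build_levels(
    "Beijing City Chaoyang District Fu Tong East Street KFC",
    district="Chaoyang District", min_address="Fu Tong East Street",
    adcode="110105", circle=(116.475339, 39.986453, 1000),
)
```

Keeping the two levels as separate records lets a search engine try the narrow range and the wide range independently.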
Again taking "Beijing City Chaoyang District Fu Tong East Street KFC" as an example:
The first level parses to: within the spatial range of "Beijing City Chaoyang District", search for "Fu Tong East Street KFC"; at search time, the results are spatially filtered by adcode.
The second level parses to: within the spatial range of "Fu Tong East Street", search for "KFC"; at search time, the results are spatially filtered by <116.475339, 39.986453, 1000>.
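The two spatial filters can be sketched as follows. The text does not say how distance is measured, so the haversine great-circle formula used here is an assumption, as are the function names and the candidate list; the two candidates are made-up placeholders, with the second one placed at the Suzhou Street coordinates that appear in Embodiment 4.

```python
import math


def haversine_m(lng1, lat1, lng2, lat2):
    """Great-circle distance in meters (mean Earth radius of 6,371 km)."""
    r = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def filter_by_adcode(pois, adcode):
    """First level: keep candidates inside the district."""
    return [p for p in pois if p["adcode"] == adcode]


def filter_by_circle(pois, lng, lat, radius_m):
    """Second level: keep candidates within radius_m of the center."""
    return [p for p in pois if haversine_m(lng, lat, p["lng"], p["lat"]) <= radius_m]


# Hypothetical candidates, for illustration only.
pois = [
    {"name": "candidate A", "adcode": "110105", "lng": 116.476, "lat": 39.987},
    {"name": "candidate B", "adcode": "110108", "lng": 116.304658, "lat": 39.972414},
]
level1_hits = filter_by_adcode(pois, "110105")
level2_hits = filter_by_circle(pois, 116.475339, 39.986453, 1000)
```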
Next, unsuitable non-address components are filtered out.
First, determine whether the what hits the filtering vocabulary. The filtering vocabulary consists of words obtained through offline mining and summarization; when used as the what, these words cause the search engine to recall a large number of irrelevant results. Semantically, most of them do not denote a specific address or POI but are auxiliary words, such as "east gate", "east entrance" and "West Street".
If the filtering vocabulary is hit, determine whether the lowest-level address fragment of the where part for this what is at district/county level or above. If it is, the what is not filtered out yet and goes on to the next judgment; otherwise it is filtered out. The reason is that such words are mostly auxiliary when the address level is low, but may be important POIs when the where is at district/county level or above. In "Fu Tong East Street east gate", for instance, "east gate" is merely auxiliary and unsuitable as a what; in "Shenzhen east gate", however, "east gate" is an important POI in Shenzhen, so the what should be retained and passed to the next judgment.
If the what is a category word such as "hotel" or "supermarket", it is filtered out.
If the what is a door-address word such as "No. 2 Building A", it is filtered out; the main reason is that such words recall a large number of irrelevant door addresses that the user does not need.
At this point, the two-level parsing result is obtained.
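The filtering cascade can be sketched as a single predicate. The name `keep_what`, the word sets, the level labels and the regular expression are all assumptions for illustration; in particular the regex is a crude stand-in that recognizes only door addresses shaped like "No. 2 Building A".

```python
import re

FILTER_WORDS = {"east gate", "east entrance", "west street"}   # examples from the text
CATEGORY_WORDS = {"hotel", "supermarket"}
HIGH_LEVELS = {"province", "city", "district"}                 # district/county or above
DOOR_ADDRESS = re.compile(r"no\.?\s*\d+(\s+building\s+\w+)?")  # crude stand-in


def keep_what(what: str, where_level: str) -> bool:
    """Return True if the what survives filtering for a where of the given level."""
    w = what.strip().lower()
    # Filter-vocabulary hit: drop unless the where is district level or above.
    if w in FILTER_WORDS and where_level not in HIGH_LEVELS:
        return False
    # Category words are too generic to be a useful what.
    if w in CATEGORY_WORDS:
        return False
    # Door-address words recall many irrelevant door addresses.
    if DOOR_ADDRESS.fullmatch(w):
        return False
    return True
```

Under these assumptions, "east gate" is dropped when the where is a road but kept when the where is a city, which mirrors the "Fu Tong East Street east gate" versus "Shenzhen east gate" contrast above.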
Embodiment 4
Take the user's input address string "Starbucks Beijing City Haidian District Suzhou Street south gate" (written in the order of the original Chinese) as an example.
The GEO parsing result is:
Country: China
Province: Beijing
District: Haidian District
Street: Suzhou Street
City code (adcode): 110108
Non-address string before the address string: Starbucks
Non-address string after the address string: south gate
Longitude: 116.304658
Latitude: 39.972414
Range: 1000 m
Geocoding confidence: 0.684211
Lowest-level address: Suzhou Street
Level of the lowest-level address fragment: road
Area code of the lowest-level address fragment: 110108
Longitude of the lowest-level address fragment: 116.304658
Latitude of the lowest-level address fragment: 39.972414
Range of the lowest-level address fragment: 1000 m
Here mstart = Starbucks, mend = south gate, and address = Suzhou Street. Because mstart is a chain brand, it is judged "important"; mend is judged "unimportant". Therefore address = Suzhou Street south gate, mstart = empty, mend = Starbucks.
First level: where = Beijing City Haidian District (adcode = 110108), what = Suzhou Street south gate Starbucks;
Second level: where = Beijing City Haidian District Suzhou Street south gate (116.304658, 39.972414, 1000), what = Starbucks;
The first-level what, "Suzhou Street south gate Starbucks", is not a category word, is not a door-address word, and does not hit the filtering vocabulary, so it is retained;
The second-level what, "Starbucks", is not a category word, is not a door-address word, and does not hit the filtering vocabulary, so it is retained.
Based on the same inventive concept, an embodiment of the present invention also provides an address query string parsing device corresponding to the address query string parsing method. Because the device solves the problem on a principle similar to that of the method, its implementation may refer to the implementation of the method, and the overlapping parts are not repeated.
Embodiment 5
As shown in Fig. 3, the address query string parsing device provided by Embodiment 5 of the present invention comprises:
a GEO parsing module 31, configured to obtain the address query string input by a user and parse it with a geographic information system (GEO) to obtain an address string;
a pattern recognition module 32, configured to correct the address components and non-address components in the address string through pattern recognition to obtain a corrected address query string;
a substring construction module 33, configured to construct a first-level substring and a second-level substring from the corrected address query string as the parsing result.
Optionally, the device further comprises a filtering vocabulary module 34, configured to filter the non-address components in the first-level substring and the second-level substring according to a preset filtering vocabulary, and to take the filtered first-level substring and second-level substring as the parsing result.
Embodiment 6
As shown in Fig. 4, the filtering vocabulary module 34 in the above address query string parsing device comprises:
a filtering vocabulary judging submodule 341, configured to determine whether the non-address component hits the filtering vocabulary;
a category word judging submodule 342, configured to determine whether the non-address component is a category word;
a door-address word judging submodule 343, configured to determine whether the non-address component is a door-address word.
Embodiment 7
As shown in Fig. 5, the substring construction module 33 in the above address query string parsing device comprises:
a spatial information acquisition submodule 331, configured to obtain, from the corrected address query string, the spatial information at district level and the spatial information of the lowest-level address fragment;
a first-level substring construction submodule 332, configured to search, within the district-level spatial range, for the query string fragment below the district level, as the first-level substring;
a second-level substring construction submodule 333, configured to search, within the spatial range of the lowest-level address fragment, for the query string fragment after that fragment, as the second-level substring.
Embodiment 8
As shown in Fig. 6, the pattern recognition module 32 in the above address query string parsing device comprises:
an address division submodule 321, configured to obtain the non-address component before the address components, mstart, and the non-address component after the address components, mend;
a pattern recognition submodule 322, configured to judge through pattern recognition whether mstart and mend are important; if mstart is important and mend is unimportant, to append mend after the street fragment of the address components, write the content of mstart into mend, and clear mstart; if mstart and mend are both important, to clear mstart; and if mstart is unimportant and mend is important, to clear mstart.
Optionally, the pattern recognition submodule 322 is further configured to determine whether the non-address component is a chain-store name or a brand word and, if so, judge it important; to determine whether the non-address component is a category word and, if so, judge it unimportant; and to segment the non-address component into words and determine whether it has the form of an area address word followed by a category word, judging it important if so and unimportant otherwise.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art may make additional changes and modifications to these embodiments once they grasp the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.
Claims (11)
1. An address query string parsing method, characterized in that the method comprises:
obtaining an address query string input by a user and parsing it by means of a geographic information system (GEO) to obtain an address string;
modifying the address component and the non-address components in the address string by pattern recognition to obtain a modified address query string;
constructing, according to the modified address query string, a level-one substring within the district/county-level spatial-information range and a second-level substring within the spatial range of the lowest address segment, respectively, as the parsing result;
wherein modifying the address component and the non-address components in the address string by pattern recognition comprises:
obtaining the non-address component mstart preceding the address component and the non-address component mend following the address component;
judging, by pattern recognition, whether the mstart and the mend are important;
if the mstart is important and the mend is unimportant, appending the mend after the street segment in the address component, writing the content of the mstart into the mend, and clearing the content of the mstart;
if both the mstart and the mend are important, clearing the content of the mstart;
if the mstart is unimportant and the mend is important, clearing the content of the mstart.
2. The method according to claim 1, characterized in that the method further comprises:
filtering out non-address components in the level-one substring and the second-level substring according to a preset filter vocabulary, and taking the filtered level-one substring and second-level substring as the parsing result.
3. The method according to claim 2, characterized in that filtering out non-address components in the level-one substring and the second-level substring comprises:
determining whether the non-address component hits the filter vocabulary and, if so, filtering it out directly; otherwise, continuing with the following step;
determining whether the non-address component is a category word and, if so, filtering it out directly; otherwise, continuing with the following step;
determining whether the non-address component is a door-address class word and, if so, filtering it out directly; otherwise, not filtering it.
4. The method according to claim 1, characterized in that constructing the level-one substring and the second-level substring respectively comprises:
obtaining, from the modified address query string, the spatial information of the district/county level and the spatial information of the lowest address segment;
searching, within the spatial-information range of the district/county level, for the query string segment below the district/county level, as the level-one substring;
searching, within the spatial-information range of the lowest address segment, for the query string segment following the lowest address segment, as the second-level substring.
5. The method according to claim 1, characterized in that the method further comprises:
if the content of the mstart is empty, performing no pattern recognition.
6. The method according to claim 5, characterized in that judging, by pattern recognition, whether the mstart and the mend are important comprises:
determining whether the non-address component is a chain-store name word or a brand word and, if so, determining it to be important; otherwise, continuing with the following step;
determining whether the non-address component is a category word and, if so, determining it to be unimportant; otherwise, continuing with the following step;
segmenting the non-address component into words and determining whether the non-address component is a region address word plus a category word and, if so, determining it to be important; otherwise, determining it to be unimportant.
7. An address query string parsing device, characterized in that the device comprises:
a GEO parsing module, configured to obtain an address query string input by a user and parse it by means of a geographic information system (GEO) to obtain an address string;
a pattern recognition module, configured to modify the address component and the non-address components in the address string by pattern recognition to obtain a modified address query string;
a substring construction module, configured to construct, according to the modified address query string, a level-one substring within the district/county-level spatial-information range and a second-level substring within the spatial range of the lowest address segment, respectively, as the parsing result;
wherein the pattern recognition module comprises:
an address division submodule, configured to obtain the non-address component mstart preceding the address component and the non-address component mend following the address component;
a pattern recognition submodule, configured to judge, by pattern recognition, whether the mstart and the mend are important; if the mstart is important and the mend is unimportant, to append the mend after the street segment in the address component, write the content of the mstart into the mend, and clear the content of the mstart; if both the mstart and the mend are important, to clear the content of the mstart; and if the mstart is unimportant and the mend is important, to clear the content of the mstart.
8. The device according to claim 7, characterized in that the device further comprises a filter vocabulary module, configured to filter out non-address components in the level-one substring and the second-level substring according to a preset filter vocabulary, and to take the filtered level-one substring and second-level substring as the parsing result.
9. The device according to claim 8, characterized in that the filter vocabulary module comprises:
a filter vocabulary judgment submodule, configured to determine whether the non-address component hits the filter vocabulary;
a category word judgment submodule, configured to determine whether the non-address component is a category word;
a door-address class word judgment submodule, configured to determine whether the non-address component is a door-address class word.
10. The device according to claim 7 or 8, characterized in that the substring construction module comprises:
a spatial information acquisition submodule, configured to obtain, from the modified address query string, the spatial information of the district/county level and the spatial information of the lowest address segment;
a level-one substring construction submodule, configured to search, within the spatial-information range of the district/county level, for the query string segment below the district/county level, as the level-one substring;
a second-level substring construction submodule, configured to search, within the spatial-information range of the lowest address segment, for the query string segment following the lowest address segment, as the second-level substring.
11. The device according to claim 10, characterized in that the pattern recognition submodule is further configured to: determine whether the non-address component is a chain-store name word or a brand word and, if so, determine it to be important; determine whether the non-address component is a category word and, if so, determine it to be unimportant; and segment the non-address component into words and determine whether the non-address component is a region address word plus a category word and, if so, determine it to be important; otherwise, determine it to be unimportant.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410174465.9A CN105022747B (en) | 2014-04-28 | 2014-04-28 | A kind of address lookup string analysis method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105022747A (en) | 2015-11-04 |
CN105022747B (en) | 2019-12-03 |
Family
ID=54412729
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410174465.9A Expired - Fee Related CN105022747B (en) | 2014-04-28 | 2014-04-28 | A kind of address lookup string analysis method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105022747B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105426351B (en) * | 2015-11-11 | 2019-01-25 | 中国建设银行股份有限公司 | A kind of participle processing method and system of customer address information |
CN111177589A (en) * | 2019-12-31 | 2020-05-19 | 税友软件集团股份有限公司 | Address information query method, device, equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101980208A (en) * | 2010-11-10 | 2011-02-23 | 百度在线网络技术(北京)有限公司 | Address query method and system |
US8073789B2 (en) * | 2005-03-10 | 2011-12-06 | Microsoft Corporation | Method and system for web resource location classification and detection |
CN102289467A (en) * | 2011-07-22 | 2011-12-21 | 浙江百世技术有限公司 | Method and device for determining target site |
Also Published As
Publication number | Publication date |
---|---|
CN105022747A (en) | 2015-11-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | C06 | Publication | |
 | PB01 | Publication | |
 | C10 | Entry into substantive examination | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | TR01 | Transfer of patent right | Effective date of registration: 20200429. Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province. Patentee after: Alibaba (China) Co.,Ltd. Address before: 102200, No. 8, No., Changsheng Road, Changping District science and Technology Park, Beijing, China. 1-5. Patentee before: AUTONAVI SOFTWARE Co.,Ltd. |
 | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20191203. Termination date: 20200428 |