CN105022747B - A kind of address lookup string analysis method and device - Google Patents

A kind of address lookup string analysis method and device

Info

Publication number
CN105022747B
Authority
CN
China
Prior art keywords
address
level
substring
ingredient
string
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410174465.9A
Other languages
Chinese (zh)
Other versions
CN105022747A (en)
Inventor
郭涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Autonavi Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Autonavi Software Co Ltd filed Critical Autonavi Software Co Ltd
Priority to CN201410174465.9A priority Critical patent/CN105022747B/en
Publication of CN105022747A publication Critical patent/CN105022747A/en
Application granted granted Critical
Publication of CN105022747B publication Critical patent/CN105022747B/en

Abstract

The present invention relates to the field of geographic location search, and in particular to an address query string parsing method and device, intended to solve the problem of inaccurate address search results in the prior art. The method includes: obtaining an address query string entered by a user and parsing it with a geographic information system (GEO) to obtain an address string; correcting the address components and non-address components of the address string by pattern recognition to obtain a corrected address query string; and constructing a first-level substring and a second-level substring from the corrected address query string as the parsing result.

Description

A kind of address lookup string analysis method and device
Technical field
The present invention relates to the field of geographic location search, and in particular to an address query string parsing method and device.
Background
Address search has become an important service for users of map products. Through address search, people can meet application needs such as reaching a specific destination, finding daily-life services, and route planning.
However, address search has its own particularities. In Google Maps, when an inverted index is built over data such as POIs (Points of Interest), the data is partitioned by space. Finding the correct spatial range is therefore essential to retrieving what the user needs.
In general, users do not specify the spatial range themselves in a map search. The spatial range usually has to be obtained by having a computer analyze the user's query string. Once the spatial range has been parsed, the POI (or address) that the user actually wants to search for must also be extracted from the query string, so that the user's real search need can be retrieved within the correct spatial range. Put simply, the computer has to work out "where" the user wants to search for "what"; this is also called the "where what" parsing problem.
For example, for the user query "Beijing Chaoyang District Futong East Street Fangheng International Center", the user expects to search for "Fangheng International Center" within the range of "Beijing Chaoyang District Futong East Street". Existing address search cannot accurately parse this intent: it usually searches for "Chaoyang District Futong East Street Fangheng International Center" within the range of "Beijing", so accurate results cannot be obtained and the user experience is poor. A method is urgently needed that can accurately analyze the where information and the what information in a query string and look up the correct content string within the correct spatial range, thereby improving search quality.
Summary of the invention
Embodiments of the present invention provide an address query string parsing method and device, to solve the problem of inaccurate address search results in the prior art.
An embodiment of the present invention provides an address query string parsing method, the method comprising:
obtaining an address query string entered by a user and parsing it with a geographic information system (GEO) to obtain an address string;
correcting the address components and non-address components in the address string by pattern recognition to obtain a corrected address query string;
constructing a first-level substring and a second-level substring from the corrected address query string as the parsing result.
The method further comprises:
filtering the non-address components in the first-level substring and the second-level substring according to a preset filter word list, and taking the filtered first-level substring and second-level substring as the parsing result.
Filtering the non-address components in the first-level substring and the second-level substring comprises:
determining whether the non-address component hits the filter word list; if so, filtering it out directly; otherwise, continuing with the following steps;
determining whether the non-address component is a category word; if so, filtering it out directly; otherwise, continuing with the following step;
determining whether the non-address component is a door-address word; if so, filtering it out directly; otherwise, not filtering it.
Constructing the first-level substring and the second-level substring comprises:
obtaining the spatial information at district/county level and the spatial information of the smallest address segment in the corrected address query string;
searching, within the spatial range given by the district/county-level spatial information, for the query string segment below district/county level, as the first-level substring;
searching, within the spatial range given by the spatial information of the smallest address segment, for the query string segment after the smallest address segment, as the second-level substring.
Correcting the address components and non-address components in the address string by pattern recognition comprises:
obtaining the non-address component before the address components, mstart, and the non-address component after the address components, mend;
judging, by pattern recognition, whether mstart and mend are important;
if mstart is important and mend is unimportant, appending mend after the street segment of the address components, writing the content of mstart into mend, and setting the content of mstart to empty;
if mstart and mend are both important, setting the content of mstart to empty;
if mstart is unimportant and mend is important, setting the content of mstart to empty.
The method further comprises:
if the content of mstart is empty, performing no pattern recognition.
Judging, by pattern recognition, whether mstart and mend are important comprises:
determining whether the non-address component is a chain store name or a brand word; if so, judging it important; otherwise, continuing with the following steps;
determining whether the non-address component is a category word; if so, judging it unimportant; otherwise, continuing with the following step;
segmenting the non-address component into words and determining whether it has the form of a regional address word plus a category word; if so, judging it important; otherwise, judging it unimportant.
An address query string parsing device, the device comprising:
a GEO parsing module, configured to obtain an address query string entered by a user and parse it with a geographic information system (GEO) to obtain an address string;
a pattern recognition module, configured to correct the address components and non-address components in the address string by pattern recognition to obtain a corrected address query string;
a substring construction module, configured to construct a first-level substring and a second-level substring from the corrected address query string as the parsing result.
The device further comprises a filter word list module, configured to filter the non-address components in the first-level substring and the second-level substring according to a preset filter word list, and to take the filtered first-level substring and second-level substring as the parsing result.
The filter word list module comprises:
a filter word list judgment submodule, configured to determine whether the non-address component hits the filter word list;
a category word judgment submodule, configured to determine whether the non-address component is a category word;
a door-address word judgment submodule, configured to determine whether the non-address component is a door-address word.
The substring construction module comprises:
a spatial information acquisition submodule, configured to obtain the spatial information at district/county level and the spatial information of the smallest address segment in the corrected address query string;
a first-level substring construction submodule, configured to search, within the spatial range given by the district/county-level spatial information, for the query string segment below district/county level, as the first-level substring;
a second-level substring construction submodule, configured to search, within the spatial range given by the spatial information of the smallest address segment, for the query string segment after the smallest address segment, as the second-level substring.
The pattern recognition module comprises:
an address division submodule, configured to obtain the non-address component before the address components, mstart, and the non-address component after the address components, mend;
a pattern recognition submodule, configured to judge, by pattern recognition, whether mstart and mend are important; if mstart is important and mend is unimportant, to append mend after the street segment of the address components, write the content of mstart into mend, and set the content of mstart to empty; if mstart and mend are both important, to set the content of mstart to empty; and if mstart is unimportant and mend is important, to set the content of mstart to empty.
The pattern recognition submodule is further configured to determine whether the non-address component is a chain store name or a brand word and, if so, judge it important; to determine whether the non-address component is a category word and, if so, judge it unimportant; and to segment the non-address component into words and determine whether it has the form of a regional address word plus a category word, judging it important if so and unimportant otherwise.
Embodiments of the present invention obtain an address query string entered by a user and parse it with a geographic information system (GEO) to obtain an address string; correct the address components and non-address components in the address string by pattern recognition to obtain a corrected address query string; and construct a first-level substring and a second-level substring from the corrected address query string as the parsing result. Because the parsing result carries a where what parse at two spatial layers, it helps raise the recall of the address search engine and return results closer to what the user expects. On this basis, the embodiments also filter the what at both layers, which reduces the recall of irrelevant results and keeps the recalled results accurate while recall is being expanded.
Brief description of the drawings
Fig. 1 is a flowchart of the address query string parsing method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the address query string parsing method provided by Embodiment 2 of the present invention;
Fig. 3 is a structural schematic diagram of the address query string parsing device provided by Embodiment 5 of the present invention;
Fig. 4 is a structural schematic diagram of the filter word list module 34 provided by Embodiment 6 of the present invention;
Fig. 5 is a structural schematic diagram of the substring construction module 33 provided by Embodiment 7 of the present invention;
Fig. 6 is a structural schematic diagram of the pattern recognition module 32 provided by Embodiment 8 of the present invention.
Detailed description of the embodiments
Embodiments of the present invention obtain an address query string entered by a user and parse it with a geographic information system (GEO) to obtain an address string; correct the address components and non-address components in the address string by pattern recognition to obtain a corrected address query string; and construct a first-level substring and a second-level substring from the corrected address query string as the parsing result. Because the parsing result carries a where what parse at two spatial layers, it helps raise the recall of the address search engine and return results closer to what the user expects. On this basis, the embodiments also filter the what at both layers, which reduces the recall of irrelevant results and keeps the recalled results accurate while recall is being expanded. The embodiments use a geocoding system (GEO) to make a preliminary split of the query string into an address string and non-address strings; based on the GEO parsing result, they further construct a first-level non-address string and a second-level non-address string; and they filter out inappropriate substrings by means of pattern recognition and offline mining. By analyzing the user's query string, the spatial information and the search content information in it are extracted, so that the user's need is searched within the correct address space and search quality is improved.
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiment 1
As shown in Fig. 1, the address query string parsing method provided by Embodiment 1 of the present invention comprises the following steps:
S101: obtain the address query string entered by the user and parse it with the geographic information system (GEO) to obtain an address string;
S102: correct the address components and non-address components in the address string by pattern recognition to obtain a corrected address query string;
S103: construct a first-level substring and a second-level substring from the corrected address query string as the parsing result.
Based on the GEO parsing result, this embodiment performs where what parsing at two spatial layers, which helps raise the recall of the address search engine and return results closer to what the user expects. On this basis, the embodiments of the present invention also filter the what at both layers, which reduces the recall of irrelevant results and keeps the recalled results accurate while recall is being expanded.
Embodiment 2
As shown in Fig. 2, the address query string parsing method provided by Embodiment 2 of the present invention comprises the following steps:
S201: obtain the address query string entered by the user and parse it with the geographic information system (GEO) to obtain an address string;
S202: correct the address components and non-address components in the address string by pattern recognition to obtain a corrected address query string;
S203: construct a first-level substring and a second-level substring from the corrected address query string;
S204: filter the non-address components in the first-level substring and the second-level substring according to a preset filter word list, and take the filtered first-level substring and second-level substring as the parsing result.
Here, filtering the non-address components in the first-level substring and the second-level substring may optionally comprise:
determining whether the non-address component hits the filter word list; if so, filtering it out directly; otherwise, continuing with the following steps;
determining whether the non-address component is a category word; if so, filtering it out directly; otherwise, continuing with the following step;
determining whether the non-address component is a door-address word; if so, filtering it out directly; otherwise, not filtering it.
The purpose of filtering the non-address components in this way is to avoid recalling a large number of irrelevant door addresses that the user does not need.
Constructing the first-level substring and the second-level substring comprises:
obtaining the spatial information at district/county level and the spatial information of the smallest address segment in the corrected address query string;
searching, within the spatial range given by the district/county-level spatial information, for the query string segment below district/county level, as the first-level substring;
searching, within the spatial range given by the spatial information of the smallest address segment, for the query string segment after the smallest address segment, as the second-level substring.
In this embodiment, a second-level substring is constructed during parsing: on top of the existing first-level substring construction, one further level of substring is constructed, so that the parsed address query string is clearer and the query results the user needs are easier to obtain.
Correcting the address components and non-address components in the address string by pattern recognition comprises:
obtaining the non-address component before the address components, mstart, and the non-address component after the address components, mend;
judging, by pattern recognition, whether mstart and mend are important;
if mstart is important and mend is unimportant, appending mend after the street segment of the address components, writing the content of mstart into mend, and setting the content of mstart to empty;
if mstart and mend are both important, setting the content of mstart to empty;
if mstart is unimportant and mend is important, setting the content of mstart to empty.
If the content of mstart is empty, no pattern recognition is performed.
Judging whether mstart and mend are important comprises:
determining whether the non-address component is a chain store name or a brand word; if so, judging it important; otherwise, continuing with the following steps;
determining whether the non-address component is a category word; if so, judging it unimportant; otherwise, continuing with the following step;
segmenting the non-address component into words and determining whether it has the form of a regional address word plus a category word; if so, judging it important; otherwise, judging it unimportant.
In this embodiment, by analyzing the user's query string, the spatial information and the search content information in it are extracted, so that the user's need is searched within the correct address space and search quality is improved.
Embodiment 3
In this embodiment, the GEO system receives the user's query string, scans it from left to right, identifies the address components and non-address components in it in order from large to small, and gives the latitude/longitude range for the address level. Taking the user's address query string "Beijing Chaoyang District Futong East Street KFC" as an example, the result after GEO parsing is:
Country: China
Province: Beijing
District: Chaoyang District
Street: Futong East Street
City code (adcode): 110105
Non-address string before the address string: empty
Non-address string after the address string: KFC
Longitude: 116.475339
Latitude: 39.986453
Range: 1000 m
Geocoding confidence: 0.705882
Smallest address segment: Futong East Street
Level of the smallest address segment: road
Area code of the smallest address segment: 110105
Longitude of the smallest address segment: 116.475339
Latitude of the smallest address segment: 39.986453
Range of the smallest address segment: 1000 m
If the original query string is "KFC Beijing Chaoyang District Futong East Street", then in the GEO parsing result:
Non-address string before the address string: KFC
Non-address string after the address string: empty.
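To make the shape of this parsing result concrete, the fields listed above can be held in a small record. The following is only an illustrative sketch: the class name `GeoResult` and all field names are hypothetical and do not come from the patent, and an empty string is assumed to stand for "empty".

```python
from dataclasses import dataclass, replace


@dataclass
class GeoResult:
    """Hypothetical container for the fields of a GEO parsing result."""
    country: str
    province: str
    district: str
    street: str
    adcode: str        # the "city code" in the listing above
    mstart: str        # non-address string before the address string ("" = empty)
    mend: str          # non-address string after the address string ("" = empty)
    longitude: float
    latitude: float
    range_m: int
    confidence: float
    min_address: str   # smallest address segment
    min_level: str     # level of the smallest address segment


# "Beijing Chaoyang District Futong East Street KFC"
kfc_after = GeoResult(
    country="China", province="Beijing", district="Chaoyang District",
    street="Futong East Street", adcode="110105", mstart="", mend="KFC",
    longitude=116.475339, latitude=39.986453, range_m=1000,
    confidence=0.705882, min_address="Futong East Street", min_level="road",
)

# "KFC Beijing Chaoyang District Futong East Street": only mstart and mend differ.
kfc_before = replace(kfc_after, mstart="KFC", mend="")
```

Keeping mstart and mend as separate fields is what lets the later correction step move text between them without touching the address fields.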
Next, the GEO parsing result is corrected by pattern recognition.
Because users' search habits differ, the query strings they enter often do not follow the strict pattern of "address string from large to small + non-address string". For example, in "KFC (Futong East Street Fangheng International Building store)", the non-address component "KFC" comes before the address components; GEO parses it as the "non-address component before the address components", denoted mstart. Similarly, the non-address component after the address components is denoted mend.
Statistics show that the probability of a user's non-address component appearing in the middle of the address components, as in "Chaoyang District KFC Futong East Street", is extremely low. The pattern recognition in this embodiment therefore mainly handles the form in which a non-address string precedes the address string.
After GEO parsing, if mstart is empty, no pattern recognition is needed.
If mstart is not empty, pattern recognition is used to decide whether mstart is "important". If mstart is judged "important" and mend is judged "unimportant", the following three actions are taken:
the "street" field in the GEO result (denoted address) becomes address + mend;
the mend field in the GEO result becomes mstart;
the mstart field in the GEO result becomes empty.
For example, in the address string "KFC Futong East Street south gate", address = Futong East Street, mstart = KFC and mend = south gate. After pattern judgment, "KFC" is found to be "important" and "south gate" "unimportant", so the three parameters are modified to: address = Futong East Street south gate, mstart = empty, mend = KFC. The where what parse is then: search for "KFC" within the spatial range of "Futong East Street south gate".
If mstart is not empty and mstart and mend are both judged "important" (or both "unimportant"), mstart is simply set to empty; if mstart is judged "unimportant" and mend is judged "important", mstart is likewise set to empty.
For example, in "KFC Futong East Street McDonald's", "KFC" and "McDonald's" are both "important", so the query is parsed only as searching for "McDonald's" within the spatial range of "Futong East Street".
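The correction rules above can be summarized in a few lines. This is a minimal sketch rather than the patented implementation: the function name `correct` is hypothetical, the importance test is passed in as a callable, an empty mend is assumed to count as unimportant, and address and mend are joined with a space only because the examples here are rendered in English.

```python
from typing import Callable, Tuple


def correct(address: str, mstart: str, mend: str,
            is_important: Callable[[str], bool]) -> Tuple[str, str, str]:
    """Return the corrected (address, mstart, mend) triple."""
    if not mstart:
        # Nothing precedes the address string: no pattern recognition needed.
        return address, mstart, mend
    if is_important(mstart) and not is_important(mend):
        # mend joins the street field, mstart moves into mend, mstart is cleared.
        return f"{address} {mend}".strip(), "", mstart
    # Both important, both unimportant, or only mend important: drop mstart.
    return address, "", mend


# Toy importance test for the two examples in the text (an assumption).
brands = {"KFC", "McDonald's"}
is_brand = lambda s: s in brands

print(correct("Futong East Street", "KFC", "south gate", is_brand))
print(correct("Futong East Street", "KFC", "McDonald's", is_brand))
```

In the first call the street field absorbs "south gate" and "KFC" becomes the what; in the second call "KFC" is discarded and "McDonald's" remains the what.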
Whether a non-address component is important may optionally be judged as follows:
If the non-address component contains a chain store name or a brand name, such as "Adidas" or "Yoshinoya", it is directly considered "important"; otherwise, judgment continues.
If the non-address component is a word such as "hotel" or "supermarket", it is directly considered "unimportant", because this is a generic need from which the place the user actually wants to go cannot be inferred; otherwise, judgment continues.
The non-address component is then segmented into words and judged by the parts of speech of the segments: if it follows the pattern of regional address + category word, it is considered important. Examples are "Wangjing Hospital" and "Huilongguan Primary School".
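The three-step judgment above is a simple cascade. The sketch below is an assumption-laden illustration: the word sets are toy stand-ins for the lexicons a real system would mine, the name `is_important` is hypothetical, and a prefix match against known region words replaces the word segmentation and part-of-speech tagging that the text describes.

```python
# Toy lexicons (assumptions for illustration, not data from the patent).
BRANDS = {"kfc", "mcdonald's", "starbucks", "adidas", "yoshinoya"}
CATEGORY_WORDS = {"hotel", "supermarket", "hospital", "primary school"}
REGION_WORDS = {"wangjing", "huilongguan"}


def is_important(text: str) -> bool:
    """Cascade: brand -> bare category word -> region + category word."""
    t = text.strip().lower()
    if not t:
        return False
    # Step 1: contains a chain store or brand name -> important.
    if any(brand in t for brand in BRANDS):
        return True
    # Step 2: a bare category word is a generic need -> unimportant.
    if t in CATEGORY_WORDS:
        return False
    # Step 3: stand-in for segmentation: region word followed by a category word.
    for region in REGION_WORDS:
        if t.startswith(region) and t[len(region):].strip() in CATEGORY_WORDS:
            return True
    return False


print(is_important("Adidas"), is_important("hotel"), is_important("Wangjing Hospital"))
```

The order of the checks matters: a brand name wins even if a category word is also present, and a bare category word is rejected before the region pattern is tried.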
Next, the first-level substring and the second-level substring are constructed.
After the GEO result has been corrected, a two-layer where what parse is performed.
From the GEO result, taking the example above, two layers of spatial-range information are available:
1. The spatial information at district/county level, i.e. "110105". This range is given by the adcode, the city code for district/county level and above. In general, an adcode has 6 digits: at province level the last 4 digits are 0, at city level the last 2 digits are 0, and at district/county level all 6 digits carry values.
2. The spatial information of the smallest address segment, i.e. its longitude, latitude and radius. This range typically varies from tens of kilometers down to tens of meters, and is a smaller, more specific spatial range than the "district/county level" represented by the adcode. It is usually determined by roads, business districts and well-known POIs.
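The adcode convention described in item 1 can be checked mechanically. A small sketch follows; the function name `adcode_level` and the returned labels are hypothetical.

```python
def adcode_level(adcode: str) -> str:
    """Classify a 6-digit adcode by its trailing zeros, as described in the text."""
    if len(adcode) != 6 or not adcode.isdigit():
        raise ValueError(f"expected a 6-digit code, got {adcode!r}")
    if adcode.endswith("0000"):
        return "province"   # last 4 digits are 0
    if adcode.endswith("00"):
        return "city"       # last 2 digits are 0
    return "district"       # all 6 digits carry values


print(adcode_level("110105"))  # the code from the example above
```

The province check has to come first, since a code ending in four zeros also ends in two.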
Based on these two spatial ranges, the embodiment constructs a two-level where what parse:
Within the "district/county level and above" spatial range (where), search for the "query string segment below district/county level" (what).
Within the "spatial range of the smallest address segment" (where), search for the "query string segment after the smallest address segment" (what).
Again taking "Beijing Chaoyang District Futong East Street KFC" as an example:
The first layer is parsed as: within the spatial range of "Beijing Chaoyang District", search for "Futong East Street KFC"; during the search, the results are spatially filtered by adcode.
The second layer is parsed as: within the spatial range of "Futong East Street", search for "KFC"; during the search, the results are spatially filtered by <116.475339, 39.986453, 1000>.
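The two layers can be written down as (where, what, spatial filter) records. The sketch below is illustrative only: `Layer`, `build_layers` and `within_radius` are hypothetical names, and the haversine great-circle distance is an assumed way to apply the <longitude, latitude, radius> filter, since the text does not say how that filter is computed.

```python
import math
from dataclasses import dataclass


@dataclass
class Layer:
    where: str
    what: str
    spatial_filter: tuple


def build_layers(upper: str, lower_address: str, mend: str,
                 adcode: str, lon: float, lat: float, radius_m: float):
    """Build the first-level and second-level where/what pairs."""
    first = Layer(upper, f"{lower_address} {mend}".strip(), ("adcode", adcode))
    second = Layer(lower_address, mend, ("circle", lon, lat, radius_m))
    return first, second


def within_radius(lon: float, lat: float, center_lon: float,
                  center_lat: float, radius_m: float) -> bool:
    """Haversine distance test (an assumption) for the second-layer filter."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat), math.radians(center_lat)
    dphi = phi2 - phi1
    dlmb = math.radians(center_lon - lon)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a)) <= radius_m


first, second = build_layers("Beijing Chaoyang District", "Futong East Street",
                             "KFC", "110105", 116.475339, 39.986453, 1000)
print(first)
print(second)
```

A candidate result would be kept at the first layer if its adcode matches, and at the second layer if `within_radius` holds for its coordinates.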
Next, inappropriate non-address components are filtered out.
First, determine whether the what hits the filter word list. The filter word list consists of words obtained through offline mining and summarization; when used as the what, these words cause the search engine to recall a large number of irrelevant results. Semantically, most of them do not denote a specific address or POI but are auxiliary words, such as "east gate", "east entrance" and "West Street".
If the filter word list is hit, it is necessary to judge whether the smallest address segment of the where part belonging to this what is at district/county level or above. If it is, the what is not filtered out yet and judgment continues. The reason is that such words are mostly auxiliary when the address level is low, but may be important POIs when the where is at district/county level or above. For example, in "Futong East Street east gate", "east gate" is an auxiliary word and is not suitable as the what; but in "Shenzhen East Gate" (Dongmen), the what "East Gate" is an important POI in Shenzhen, so the what tends to be kept and passed on to the next judgment.
If the what is a category word such as "hotel" or "supermarket", it is filtered out.
If the what is a door-address word such as "No. 2 Building A", it is filtered out. The main reason for filtering out such words is that they recall a large number of irrelevant door addresses that the user does not need.
At this point, the two-layer parsing result is obtained.
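The filtering steps above, including the exception for a where at district/county level or above, can be sketched as follows. Everything here is an assumption made for illustration: the word sets, the door-address regular expression, the level labels and the function name `keep_what` are hypothetical, and exact matching against the filter word list stands in for whatever matching a real system uses.

```python
import re

# Toy lexicons and pattern (assumptions, not data from the patent).
FILTER_WORDS = {"east gate", "east entrance", "west street", "south gate"}
CATEGORY_WORDS = {"hotel", "supermarket"}
DOOR_PART = r"(?:No\.\s*\d+|Building\s+\w+)"
DOOR_PATTERN = re.compile(rf"^{DOOR_PART}(?:\s+{DOOR_PART})*$", re.IGNORECASE)
HIGH_LEVELS = {"province", "city", "district"}


def keep_what(what: str, where_level: str) -> bool:
    """Return True if the what survives filtering for a where of the given level."""
    w = what.strip()
    # Filter-list hit: drop it unless the where is district/county level or above.
    if w.lower() in FILTER_WORDS and where_level not in HIGH_LEVELS:
        return False
    # Category words such as "hotel" are always dropped.
    if w.lower() in CATEGORY_WORDS:
        return False
    # Door-address words such as "No. 2 Building A" are always dropped.
    if DOOR_PATTERN.match(w):
        return False
    return True


print(keep_what("east gate", "road"), keep_what("east gate", "city"))
```

With a road-level where, "east gate" is discarded, whereas with a city-level where it falls through to the remaining checks and is kept.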
Embodiment 4
Take the user's input address string "Starbucks Beijing Haidian District Suzhou Street south gate" as an example.
The GEO encoding result is:
Country: China
Province: Beijing
District: Haidian District
Street: Suzhou Street
City code (adcode): 110108
Non-address string before the address string: Starbucks
Non-address string after the address string: south gate
Longitude: 116.304658
Latitude: 39.972414
Range: 1000 m
Geocoding confidence: 0.684211
Smallest address segment: Suzhou Street
Level of the smallest address segment: road
Area code of the smallest address segment: 110108
Longitude of the smallest address segment: 116.304658
Latitude of the smallest address segment: 39.972414
Range of the smallest address segment: 1000 m
Here mstart = Starbucks, mend = south gate and address = Suzhou Street. Because mstart is a chain brand, it is judged "important", and mend is judged unimportant; therefore address = Suzhou Street south gate, mstart = empty, mend = Starbucks.
First layer: where = Beijing Haidian District (adcode = 110108), what = Suzhou Street south gate Starbucks.
Second layer: where = Beijing Haidian District Suzhou Street south gate (116.304658, 39.972414, 1000), what = Starbucks.
The first-layer what, "Suzhou Street south gate Starbucks", is not a category word, is not a door address and does not hit the filter word list, so it is kept.
The second-layer what, "Starbucks", is not a category word, is not a door address and does not hit the filter word list, so it is kept.
Based on the same inventive concept, the embodiments of the present invention also provide an address query string parsing device corresponding to the address query string parsing method. Because the principle by which the device solves the problem is similar to that of the method of the embodiments, the implementation of the device may refer to the implementation of the method, and repeated description is omitted.
Embodiment 5
As shown in figure 3, the inquiry address string resolver structural schematic diagram provided for the embodiment of the present invention 5, comprising:
GEO parsing module 31, for obtaining the address lookup string of user's input and being parsed by GIS-Geographic Information System GEO, Obtain address string;
Pattern recognition module 32 is repaired for address element in going here and there address and non-address ingredient by pattern-recognition Just, modified address query string is obtained;
Substring constructing module 33 constructs level-one substring and second level substring according to modified address query string, as parsing respectively As a result.
Optionally, above-mentioned apparatus further includes filtering vocabulary module 34, for according to preset filtering vocabulary, filtering one Non-address ingredient in grade substring and second level substring, using filtered level-one substring and second level substring as parsing result.
Embodiment 6
As shown in figure 4, the filtering vocabulary module 34 in address above mentioned query string resolver, comprising:
Filter word list deciding submodule 341, for determining whether non-address ingredient hits filtering vocabulary;
Classifier decision sub-module 342 determines whether non-address ingredient is classifier;
Door location class word judgment submodule 343, determines whether non-address ingredient is a location class word.
Embodiment 7
As shown in figure 5, the substring constructing module 33 in address above mentioned query string resolver, comprising:
Spatial information acquisition submodule 331, for obtaining in modified address query string the spatial information of district rank and most The spatial information of small address fragment;
Level-one substring constructs submodule 332, below for the spatial information range searching district rank in district rank Query string segment, as level-one substring;
Second level substring constructs submodule 333, for the spatial information range searching lowest address piece in lowest address segment Query string segment after section, as second level substring.
Embodiment 8
As shown in fig. 6, the pattern recognition module 32 in address above mentioned query string resolver, comprising:
Address divides submodule 321, after obtaining non-address ingredient mstart and address element before address element Non-address ingredient mend;
Pattern-recognition submodule 322, for judging whether mstart and mend is important by pattern-recognition;If mstart Be for important and mend it is inessential, then will mend be added address element in street segment after;Mstart content is written mend;Mstart content is emptied;If mstart and mend be it is important, mstart content is emptied;If mstart is not Important and mend be it is important, then mstart content is emptied.
Optionally, the above pattern recognition submodule 322 is further configured to: determine whether the non-address component is a chain-store name word or a brand word and, if so, determine it to be important; determine whether the non-address component is a category word and, if so, determine it to be unimportant; and segment the non-address component into words and determine whether it consists of an area address word followed by a category word and, if so, determine it to be important; otherwise, determine it to be unimportant.
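The importance determination can be sketched as a three-step decision. The word lists and the whitespace tokenizer below are hypothetical stand-ins; the patent does not specify the dictionaries or the word segmentation method, and a real system for Chinese queries would need a proper word segmenter.

```python
from typing import List, Set

# Hypothetical dictionaries, for illustration only.
BRAND_WORDS: Set[str] = {"Starbucks", "KFC"}        # chain-store names / brand words
CATEGORY_WORDS: Set[str] = {"hotel", "restaurant"}  # category words
AREA_WORDS: Set[str] = {"Wangjing", "Zhongguancun"} # area address words


def tokenize(text: str) -> List[str]:
    """Stand-in word segmentation: split on whitespace."""
    return text.split()


def is_important(component: str) -> bool:
    """Decide whether a non-address component is important."""
    # Step 1: chain-store name or brand word -> important.
    if component in BRAND_WORDS:
        return True
    # Step 2: bare category word -> unimportant.
    if component in CATEGORY_WORDS:
        return False
    # Step 3: area address word followed by a category word -> important;
    # anything else -> unimportant.
    tokens = tokenize(component)
    return (
        len(tokens) == 2
        and tokens[0] in AREA_WORDS
        and tokens[1] in CATEGORY_WORDS
    )
```

With these assumed lists, "KFC" and "Wangjing hotel" are judged important, while "hotel" alone and an unknown word such as "nearby" are judged unimportant.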
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art may make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (11)

1. An address query string parsing method, characterized in that the method comprises:
obtaining an address query string input by a user and parsing it by means of a geographic information system GEO to obtain an address string;
correcting the address component and the non-address components in the address string by pattern recognition to obtain a corrected address query string;
constructing, according to the corrected address query string, a first-level substring within the spatial-information range of the district/county level and a second-level substring within the spatial range of the smallest address segment, respectively, as the parsing result;
wherein correcting the address component and the non-address components in the address string by pattern recognition comprises:
obtaining the non-address component mstart before the address component and the non-address component mend after the address component;
determining, by pattern recognition, whether the mstart and the mend are important;
if the mstart is important and the mend is unimportant, appending the mend after the street segment in the address component, writing the content of the mstart into the mend, and emptying the content of the mstart;
if both the mstart and the mend are important, emptying the content of the mstart;
if the mstart is unimportant and the mend is important, emptying the content of the mstart.
2. The method according to claim 1, characterized in that the method further comprises:
filtering out, according to a preset filter vocabulary, the non-address components in the first-level substring and the second-level substring, and using the filtered first-level substring and second-level substring as the parsing result.
3. The method according to claim 2, characterized in that filtering out the non-address components in the first-level substring and the second-level substring comprises:
determining whether the non-address component hits the filter vocabulary; if so, filtering it out directly; otherwise, continuing with the following step;
determining whether the non-address component is a category word; if so, filtering it out directly; otherwise, continuing with the following step;
determining whether the non-address component is a door-address word; if so, filtering it out directly; otherwise, not filtering it.
4. The method according to claim 1, characterized in that constructing the first-level substring and the second-level substring respectively comprises:
obtaining, from the corrected address query string, the spatial information of the district/county level and the spatial information of the smallest address segment;
searching, within the spatial range of the district/county level, for the query string segment below the district/county level, as the first-level substring;
searching, within the spatial range of the smallest address segment, for the query string segment after the smallest address segment, as the second-level substring.
5. The method according to claim 1, characterized in that the method further comprises:
if the content of the mstart is empty, performing no pattern recognition.
6. The method according to claim 5, characterized in that determining, by pattern recognition, whether the mstart and the mend are important comprises:
determining whether the non-address component is a chain-store name word or a brand word; if so, determining it to be important; otherwise, continuing with the following step;
determining whether the non-address component is a category word; if so, determining it to be unimportant; otherwise, continuing with the following step;
segmenting the non-address component into words and determining whether the non-address component consists of an area address word followed by a category word; if so, determining it to be important; otherwise, determining it to be unimportant.
7. An address query string parsing device, characterized in that the device comprises:
a GEO parsing module, configured to obtain an address query string input by a user and parse it by means of a geographic information system GEO to obtain an address string;
a pattern recognition module, configured to correct the address component and the non-address components in the address string by pattern recognition to obtain a corrected address query string;
a substring construction module, configured to construct, according to the corrected address query string, a first-level substring within the spatial-information range of the district/county level and a second-level substring within the spatial range of the smallest address segment, respectively, as the parsing result;
wherein the pattern recognition module comprises:
an address division submodule, configured to obtain the non-address component mstart before the address component and the non-address component mend after the address component;
a pattern recognition submodule, configured to determine, by pattern recognition, whether the mstart and the mend are important; if the mstart is important and the mend is unimportant, to append the mend after the street segment in the address component, write the content of the mstart into the mend, and empty the content of the mstart; if both the mstart and the mend are important, to empty the content of the mstart; and if the mstart is unimportant and the mend is important, to empty the content of the mstart.
8. The device according to claim 7, characterized in that the device further comprises a filter-vocabulary module, configured to filter out, according to a preset filter vocabulary, the non-address components in the first-level substring and the second-level substring, and to use the filtered first-level substring and second-level substring as the parsing result.
9. The device according to claim 8, characterized in that the filter-vocabulary module comprises:
a filter-vocabulary determination submodule, configured to determine whether the non-address component hits the filter vocabulary;
a category-word determination submodule, configured to determine whether the non-address component is a category word;
a door-address word determination submodule, configured to determine whether the non-address component is a door-address word.
10. The device according to claim 7 or 8, characterized in that the substring construction module comprises:
a spatial information acquisition submodule, configured to obtain, from the corrected address query string, the spatial information of the district/county level and the spatial information of the smallest address segment;
a first-level substring construction submodule, configured to search, within the spatial range of the district/county level, for the query string segment below the district/county level, as the first-level substring;
a second-level substring construction submodule, configured to search, within the spatial range of the smallest address segment, for the query string segment after the smallest address segment, as the second-level substring.
11. The device according to claim 10, characterized in that the pattern recognition submodule is further configured to: determine whether the non-address component is a chain-store name word or a brand word and, if so, determine it to be important; determine whether the non-address component is a category word and, if so, determine it to be unimportant; and segment the non-address component into words and determine whether the non-address component consists of an area address word followed by a category word and, if so, determine it to be important; otherwise, determine it to be unimportant.
CN201410174465.9A 2014-04-28 2014-04-28 A kind of address lookup string analysis method and device Expired - Fee Related CN105022747B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410174465.9A CN105022747B (en) 2014-04-28 2014-04-28 A kind of address lookup string analysis method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410174465.9A CN105022747B (en) 2014-04-28 2014-04-28 A kind of address lookup string analysis method and device

Publications (2)

Publication Number Publication Date
CN105022747A CN105022747A (en) 2015-11-04
CN105022747B true CN105022747B (en) 2019-12-03

Family

ID=54412729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410174465.9A Expired - Fee Related CN105022747B (en) 2014-04-28 2014-04-28 A kind of address lookup string analysis method and device

Country Status (1)

Country Link
CN (1) CN105022747B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426351B (en) * 2015-11-11 2019-01-25 中国建设银行股份有限公司 A kind of participle processing method and system of customer address information
CN111177589A (en) * 2019-12-31 2020-05-19 税友软件集团股份有限公司 Address information query method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101980208A (en) * 2010-11-10 2011-02-23 百度在线网络技术(北京)有限公司 Address query method and system
US8073789B2 (en) * 2005-03-10 2011-12-06 Microsoft Corporation Method and system for web resource location classification and detection
CN102289467A (en) * 2011-07-22 2011-12-21 浙江百世技术有限公司 Method and device for determining target site

Also Published As

Publication number Publication date
CN105022747A (en) 2015-11-04

Similar Documents

Publication Publication Date Title
US10520326B2 (en) Driving route matching method and apparatus, and storage medium
JP6375293B2 (en) Method and apparatus for recommending candidate terms based on geographic location
CN104679801B (en) A kind of interest point search method and device
WO2016107417A1 (en) Method and device for exploiting travel route on basis of tourist destination area
US8688366B2 (en) Method of operating a navigation system to provide geographic location information
WO2016107371A1 (en) Method and equipment for searching for tourist destination attractions
US9677904B2 (en) Generating travel time data
Wang et al. Quality analysis of open street map data
CN104808932B (en) Route information acquisition method and terminal
CN104572955A (en) System and method for determining POI name based on clustering
CN104699835A (en) Method and device used for determining webpages including POI (point of interest) data
KR101344913B1 (en) System and method for providing automatically completed query by regional groups
CN107203526A (en) A kind of query string semantic requirement analysis method and device
CN110990520B (en) Address coding method and device, electronic equipment and storage medium
CN103049481B (en) A kind of searching method and search equipment
CN105022747B (en) A kind of address lookup string analysis method and device
CN110060472B (en) Road traffic event positioning method, system, readable storage medium and device
CN111931077A (en) Data processing method and device, electronic equipment and storage medium
CN105574019B (en) Query parameter processing method and device
Schockaert et al. Mining topological relations from the web
TW202146850A (en) Processing apparatus and method for determining road names
CN110645997B (en) Method and device for digging new roads based on track route
CN105740246B (en) Set keyword query method based on diagram data
CN113190640B (en) Method and device for processing point of interest data
CN116976308A (en) Address processing method, address processing device, electronic equipment and computer program product

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200429

Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Patentee after: Alibaba (China) Co.,Ltd.

Address before: 102200, No. 8, No., Changsheng Road, Changping District science and Technology Park, Beijing, China. 1-5

Patentee before: AUTONAVI SOFTWARE Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20191203

Termination date: 20200428