CN101719128A - Fuzzy matching-based Chinese geo-code determination method - Google Patents

Publication number
CN101719128A
Authority
CN
China
Prior art keywords
address
matching
rule
chinese
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200910156650A
Other languages
Chinese (zh)
Other versions
CN101719128B (en)
Inventor
张贵军
吴海涛
洪榛
俞立
郭海峰
何尚秋
陈宁宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN2009101566504A
Publication of CN101719128A
Application granted
Publication of CN101719128B
Legal status: Expired - Fee Related

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a fuzzy-matching-based method for determining Chinese geocodes, comprising the following steps: A1, read in descriptive Chinese address information and, taking administrative-region levels as breakpoints, segment the original address with a forward maximum search method to obtain an original address element array; A2, standardize the original address elements using an address dictionary; A3, read a standard address tree and match the original address element array with a branch-and-bound algorithm while using fuzzy rules to control the matching operation; once the keywords of the segmented original address have been obtained, the matching result with the highest evaluation score is taken as the closest match, yielding a more accurate matched address. The method offers a reasonable address model, a relatively high matching rate, and high speed.

Description

A fuzzy-matching-based method for determining Chinese geocodes
Technical field
The present invention relates to geographic information data processing and computer applications, and in particular to a geocoding method based on fuzzy matching.
Background
Geocoding is the process of establishing a correspondence between address descriptions and coordinates; in other words, it is a tool for converting between the description of a place and its spatial position. For lack of effective spatial analysis technology, the analysis and processing of spatial data has been unable to meet the needs of scientific decision-making and management, so the value of spatial data in decision-making has long gone unrealized. Address matching makes it possible to integrate geographic information systems with spatial information and to advance the informatization of urban space, so that spatial analysis and decision-making applications can be carried out more effectively and more conveniently.
In recent years, as geographic information technology has developed and matured, geocoding technology has also kept improving. Research abroad in this area is relatively mature. Davis proposed a theory of multi-mode cross-positioning, but it applies only to regions that already have a geocoding standard, and its use of multiple spatial information databases introduces redundancy and lowers matching efficiency. Duncan proposed a unified coding scheme based on equal-area cells, but geocoding standards differ across Chinese cities and regions, and once such a complex coding standard is in place, any change triggers large-scale revisions at too high a cost. Bakshi et al. proposed a geocoding technique based on splitting text into tokens; it works well for English addresses, but because the way Chinese addresses are entered differs considerably from English, its effect on Chinese address matching is limited. Within China, address matching technology is still in its infancy, and most work has been on the application side, for example the "Addressing God" system of Beijing Longways Computer Company and Map Searcher from Founder Digit. When applied to cities and towns, however, such systems suffer from a single address model and an insufficient matching rate.
Therefore, the prior art is deficient in Chinese address coding for cities and towns and needs improvement.
Summary of the invention
To overcome the shortcomings of existing Chinese geographic location coding methods (a single address model, an insufficient matching rate, and low speed), the invention provides a fuzzy-matching-based method for determining Chinese geocodes that has a reasonable address model, a higher matching rate, and good speed.
The technical solution adopted by the invention to solve this technical problem is as follows.
A fuzzy-matching-based method for determining Chinese geocodes comprises the following steps:
A1: read in descriptive Chinese address information and, taking administrative-region levels as breakpoints, segment the original address with a forward maximum search method to obtain the original address element array;
A2: standardize the original address elements using an address dictionary;
A3: read the standard address tree and match the original address element array using a branch-and-bound algorithm. An address database is built in address-tree storage format: following the hierarchical division of China's administrative regions, a tree-shaped address store is established in which the highest-level administrative unit is the root node of the address tree and its subordinate administrative regions are stored as child nodes. Matching is based on the address elements and house number obtained by segmenting the descriptive Chinese address information. During matching, the standard address tree R is read first; the keyword of the highest administrative level among the segmented candidate address elements is matched against the address nodes of the corresponding administrative level in R; after a successful match the unrelated branch trees are discarded, and the related branch tree is kept for matching at the next administrative level;
At the same time, fuzzy rules are used to control the matching operation. After the keywords of the segmented original address have been obtained, the method further comprises:
Optimizing the matching operation with fuzzy matching rules, defined as follows. Let the field to be matched be the string address, of length h, and let the standard field be the string std_address, of length H. The set of std_address values satisfying address ∩ std_address ≠ Φ is defined as the set that satisfies the matching condition, where address ∩ std_address ≠ Φ means that the strings address and std_address have a non-empty intersection (they share at least one character); in the end, the set elements with high membership are retained. The matching rules are:
1. If the standard string std_address has i characters in common with the string address being matched, the membership is i/H;
2. If the standard string std_address contains the string address being matched, the membership is 1;
After the membership is obtained, let μ be the matching membership. It is converted to a quantized score by the mapping f: μ → sc, with mapping function f(μ) = 10 × μ, and sc is taken as the evaluation score of the candidate record;
The result with the highest evaluation score is taken as the closest match, which gives a more accurate matched address.
As a preferred scheme, the method for determining Chinese geocodes further comprises:
A4: if the matched address includes a house number, perform spatial positioning. Urban house numbers are assumed to follow this distribution rule: odd and even numbers are distributed on opposite sides of the road, so that either the odd numbers are on the left and the even numbers on the right, or the odd numbers are on the right and the even numbers on the left. The house numbers at the road's inflection points are recorded together with their geographic coordinates. After the house number in the original address is obtained, the two inflection points it lies between are determined. Supposing the matched house number lies between inflection points A and B, linear interpolation by the least squares method is carried out with A and B as reference points to obtain the specific geographic coordinates of this house number on the road, which is finally located on the map.
Further, in step A3, the candidate address array obtained by standardizing the original address is defined as address[i], 0<i<N. The matching score between a standard address node and the candidate element of the corresponding level is denoted sc_i, where i is the level to which the node belongs and N is the depth of the initial address tree. The matching judgment rules are as follows:
Rule 1: does the address-tree node exactly match the candidate element? Y → exact match; N → fuzzy match;
Rule 2: is a feasible solution found after the exact match? Y → the matching algorithm moves down one level; N → return to the upper-level node and search for an approximate solution;
Rule 3: is there a missing item? Y → keep the upper-level branch tree; N → keep the current-level branch tree;
Rule 4: if there is a missing item, sc_i = 0, where i is the level of the missing item;
Rule 5: the final score of a candidate record is the sum of its node matching scores over all levels:
sc = Σ_i sc_i
Further still, in step A3, an auxiliary place-name database is provided: important, frequently used locations that have a second characteristic identity are stored in a separate database.
In step A1, the original address is processed starting from its first character. The address database is searched for a matching standard address name; if one exists, the address information is read and kept, and the matched characters are cut from the original address string. Otherwise the next character is read and appended to the previous characters to form a string, and the search for a matching standard address name in the address database continues. Reading proceeds in this way until the address elements of all administrative levels have been determined.
In step A2, if the segmented candidate address array has a missing item, its parent address is obtained from the address database according to the address element of the next lower level and written into the candidate address element array.
In step A2, an address abbreviation and alias database is designed: a dedicated database that stores all current standard address information together with its aliases and abbreviations.
In step A2, wrongly written characters in the segmented address elements are corrected. If the entered address information contains a wrong character, so that a segmented address element has no exactly corresponding standard address name in the address dictionary, the standard address name closest to the entered address information is returned and replaces the entered information.
The technical concept of the invention is as follows. First, the originally entered address information is obtained; a word segmentation algorithm then splits the entered original address into keywords that describe the corresponding spatial position. The city's standard address data are stored as a K-ary tree, where K is determined by the actual number of administrative units at each level. The keywords are matched against the standard address tree; a branch-and-bound algorithm optimizes the matching process, while fuzzy rules control the matching operation precisely and score and filter the matching results, yielding at least one address record that agrees with the original address exactly or approximately. Applying branch-and-bound matching on top of the tree-shaped address storage model shrinks the portion of the address tree that must be searched, reduces the algorithmic complexity of address matching, and improves its efficiency and accuracy.
The beneficial effects of the invention are mainly that it reduces the algorithmic complexity of the geocoding process and improves the efficiency and accuracy of geocoding.
Description of drawings
Fig. 1 is a flowchart of the fuzzy-matching-based method for determining Chinese geocodes.
Fig. 2 is a schematic diagram of the standard address tree.
Fig. 3 is a schematic diagram of the matching rules.
Fig. 4 is a schematic diagram of the odd/even distribution rule of house numbers along a road.
Fig. 5 is a schematic diagram of loading the initial address tree, extracting the branch tree rooted at "Zhejiang" after a successful exact match, and deleting the invalid branch trees.
Fig. 6 is a schematic diagram of evaluating address[2] = "Hangzhou" and, after a successful exact match, extracting the branch tree rooted at "Hangzhou"; then evaluating address[3] = "East Lake" and, after a successful exact match, extracting the branch tree rooted at "East Lake".
Fig. 7 is a schematic diagram of evaluating address[4] = "Liuxia": the current branch tree has no feasible solution, so the algorithm returns to the parent of the current root node "East Lake", enables fuzzy matching mode, obtains the branch trees that satisfy the partial matching condition, and matches the keyword "Liuxia" again.
Fig. 8 is a schematic diagram of evaluating address[5] = "Liuhe" (entered with a wrong character): the child nodes of the current branch tree's root cannot be matched exactly, so fuzzy matching mode is started and the partially matching branch trees are obtained; address[6] = "288" is then evaluated and matches in all partially matching branch trees.
Embodiment
The invention is further described below with reference to the accompanying drawings.
Referring to Figs. 1 to 8, a fuzzy-matching-based Chinese geocoding method, shown in Fig. 1, comprises the following steps:
A1: read in descriptive Chinese address information and, taking administrative-region levels as breakpoints, segment the original address with a forward maximum search method to obtain the original address element array.
A2: standardize the original address elements using the address dictionary, obtaining an address element array that has gone through standardization operations such as abbreviation or alias correction, spelling correction, and filling in of missing items.
A3: read the standard address tree and match the original address element array with a branch-and-bound algorithm, while using fuzzy rules to control the matching operation, to obtain a more accurate matched address.
A4: for a house number contained in the matched address, perform spatial positioning with an interpolation algorithm that uses inflection points as references.
In step A1, a standard entry pattern for Chinese address information is set with reference to the standard division of China's administrative regions:
Administrative address pattern: province (municipality directly under the central government) → city → district (county, county-level city). Local address pattern: street (town) → village (road) name → house number. An example of a standard address: Zhejiang Province, Hangzhou City, Xihu District, Liuxia Town, Liuhe Road, No. 288.
In step A1, the original address is processed starting from its first character. The address database is searched for a matching standard address name; if one exists, the address information is read and kept, and the matched characters are cut from the original address string. Otherwise the next character is read and appended to the previous characters to form a string, and the search for a matching standard address name in the address database continues. Reading proceeds in this way until the address elements of all administrative levels have been determined.
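The prefix-growing search of step A1 can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `split_address`, the suffix list `LEVEL_SUFFIXES`, and the rule that an unmatched candidate is cut at the next administrative-level suffix character are hypothetical choices for how "administrative levels as breakpoints" might be realized, not details given in the text.

```python
# Hypothetical breakpoint characters: province, city, district, county,
# town, village, road, street, house-number suffixes.
LEVEL_SUFFIXES = "省市区县镇村路街号"

def split_address(raw, dictionary):
    """Split a raw address into elements by growing a candidate one
    character at a time until it is a known standard name or ends in an
    administrative-level suffix (assumed breakpoint rule)."""
    elements = []
    start = 0
    while start < len(raw):
        end = start + 1
        # Keep appending the next character while the candidate is neither
        # in the dictionary nor terminated by a level suffix.
        while (end < len(raw)
               and raw[start:end] not in dictionary
               and raw[end - 1] not in LEVEL_SUFFIXES):
            end += 1
        elements.append(raw[start:end])  # cut the element from the string
        start = end
    return elements
```

Because the suffix rule fires even for names missing from the dictionary, a misspelled element is still cut out as its own element and can be repaired in step A2. A name that itself contains a suffix character would be cut too early, which a fuller implementation would need to handle.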
In step A2, if the segmented candidate address array has a missing item, its parent address is obtained from the address database according to the address element of the next lower level and written into the candidate address element array.
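A minimal sketch of this fill-in step, assuming the address database can be queried as a child-to-parent mapping and that every name has a single parent (both are assumptions; `fill_missing` and `parent_of` are hypothetical names):

```python
def fill_missing(elements, parent_of):
    """Fill None entries in a top-down list of address elements by looking
    up the parent of the next lower-level element."""
    filled = list(elements)
    # Walk upward from the lowest level so that a recovered parent can in
    # turn be used to recover its own parent.
    for i in range(len(filled) - 2, -1, -1):
        if filled[i] is None and filled[i + 1] is not None:
            filled[i] = parent_of.get(filled[i + 1])
    return filled
```

For example, with `parent_of = {"Hangzhou": "Zhejiang", "Xihu": "Hangzhou"}`, the array `[None, None, "Xihu"]` would be completed from the district upward. Where place names are not unique across parents, a real lookup would need more context than this sketch uses.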
In step A2, an address abbreviation and alias database is designed: a dedicated database that stores all current standard address information together with its aliases and abbreviations. If a segmented candidate address element is an alias or an abbreviation, it is identified and standardized to the standard name; for example, the abbreviation of Shandong is standardized to "Shandong" and the abbreviation of Shanghai to "Shanghai".
In step A2, wrongly written characters in the segmented address elements are corrected. If the entered address information contains a wrong character, so that a segmented address element has no exactly corresponding standard address name in the address dictionary, the standard address name closest to the entered address information is returned and replaces the entered information. For example, if "Liuhe Road" is entered with a wrong character and that spelling does not exist in the address dictionary, where only the correctly written "Liuhe Road" exists, the correct "Liuhe Road" replaces the entered one.
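The alias lookup and wrong-character correction might be sketched as below. The text does not say how "closest" is measured; this sketch assumes the similarity ratio of Python's standard `difflib` module as a stand-in, and the function name `standardize`, the `cutoff` value, and the sample strings are hypothetical.

```python
import difflib

def standardize(element, aliases, standard_names, cutoff=0.5):
    """Map one segmented element to a standard name.

    Order of checks: already standard, known alias or abbreviation, then
    the closest standard name (assumed metric: difflib similarity ratio).
    """
    if element in standard_names:
        return element
    if element in aliases:
        return aliases[element]
    close = difflib.get_close_matches(
        element, sorted(standard_names), n=1, cutoff=cutoff)
    # Leave the element unchanged when nothing is similar enough.
    return close[0] if close else element
```

With a standard name "留和路" in the dictionary, an entry such as "留合路" (one character wrong, an illustrative typo) shares two of its three characters with it and would be replaced by the standard spelling.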
Step A3 comprises the following: the address database is read and stored in the form of an address tree, in which the highest-level administrative unit is the root node and its subordinate administrative regions are stored as child nodes, as shown in Fig. 2.
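One possible in-memory shape for such an address tree is sketched below. The class name `AddressNode`, the `add_path` helper, and the use of a dictionary for child lookup are illustrative assumptions rather than the storage format of the invention.

```python
from dataclasses import dataclass, field

@dataclass
class AddressNode:
    """A node of the address tree: one administrative unit and its
    subordinate units keyed by name."""
    name: str
    children: dict = field(default_factory=dict)

    def add_path(self, names):
        """Insert a top-down chain of unit names below this node,
        reusing nodes that already exist, and return the last node."""
        node = self
        for name in names:
            node = node.children.setdefault(name, AddressNode(name))
        return node
```

Loading the database then amounts to calling `add_path` once per standard address, for example `root.add_path(["Zhejiang", "Hangzhou", "Xihu"])` on a synthetic root node.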
Step A3 further comprises the following: with the address information stored as a tree, a branch-and-bound algorithm is used to optimize the matching process. The keyword of the highest administrative level among the candidate address elements is first matched against the address information of the corresponding level in the address tree; if the match succeeds, the matched node and its branch tree are kept, and the other, unrelated nodes at the same level and their branch trees are discarded. The candidate address array obtained by standardizing the original address is defined as address[i], 0<i<N. The matching score between a standard address node and the candidate element of the corresponding level is denoted sc_i, where i is the level to which the node belongs and N is the depth of the initial address tree. The matching judgment rules are as follows:
Rule 1: does the address-tree node exactly match the candidate element? Y → exact match; N → fuzzy match;
Rule 2: is a feasible solution found after the exact match? Y → the matching algorithm moves down one level; N → return to the upper-level node and search for an approximate solution;
Rule 3: is there a missing item? Y → keep the upper-level branch tree; N → keep the current-level branch tree;
Rule 4: if there is a missing item, sc_i = 0, where i is the level of the missing item;
Rule 5: the final score of a candidate record is the sum of its node matching scores over all levels:
sc = Σ_i sc_i
Step A3 further comprises using fuzzy rules to control the matching operation: if no node at the same level of the address tree matches exactly, the fuzzy rules are enabled to obtain approximate matches. For example, if the entered county-level keyword is "East Lake" and the only county-level node in the address tree is "West Lake", the node "West Lake" and its branch tree are kept as the matching result, and the other nodes at that level and their branch trees are discarded.
Step A3 further comprises scoring the matching results quantitatively. Exact and approximate matches are given different scores; the highest-scoring result is returned as the closest match, and lower-scoring results are returned as reasonably close matches. The quantization rules are as follows.
Let the field to be matched be the string address, of length h, and let the standard field be the string std_address, of length H. The set of std_address values satisfying address ∩ std_address ≠ Φ is defined as the set that satisfies the matching condition, where address ∩ std_address ≠ Φ means that the strings address and std_address have a non-empty intersection (they share at least one character); in the end, the set elements with high membership are retained. The matching rules are defined as follows (Fig. 3):
1. If the standard string std_address has i characters in common with the string address being matched, the membership is i/H;
2. If the standard string std_address contains the string address being matched, the membership is 1.
After the membership is obtained, let μ be the matching membership. It is converted to a quantized score by the mapping f: μ → sc, with mapping function f(μ) = 10 × μ, and sc is taken as the evaluation score of the candidate record.
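The membership rules and the score mapping can be illustrated with the short sketch below. It reads "i characters in common" as the number of characters of std_address that also occur in address, which is one possible interpretation; the function names `membership` and `rank_candidates` are hypothetical, and std_address is assumed to be non-empty.

```python
def membership(address, std_address):
    """Fuzzy membership of a standard name for an entered element."""
    if address in std_address:
        return 1.0  # rule 2: the standard string contains the element
    # Rule 1: i / H, with i counted as the characters of std_address that
    # also appear in address (assumed reading of "identical characters").
    shared = sum(1 for ch in std_address if ch in address)
    return shared / len(std_address)

def rank_candidates(address, std_names):
    """Score every standard name with f(mu) = 10 * mu, drop names with an
    empty intersection, and return (score, name) pairs, best first."""
    scored = [(10 * membership(address, name), name) for name in std_names]
    return sorted((s for s in scored if s[0] > 0), reverse=True)
```

Under this reading, the entered keyword "东湖" (East Lake) scored against "西湖" (West Lake) shares one of two characters, giving a membership of 0.5 and a score of 5, while an identical name scores 10.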
Step A3 further comprises providing an auxiliary place-name database: important, frequently used locations that have a second characteristic identity are stored in a separate database. For example, the second characteristic identity of "Zhejiang Province, Hangzhou City, Xihu District, Liuxia Town, Liuhe Road, No. 288" is "Zhejiang University of Technology, Pingfeng Campus"; if the entered original address information is "Zhejiang University of Technology, Pingfeng Campus", it is located directly at the geographic position of "Zhejiang Province, Hangzhou City, Xihu District, Liuxia Town, Liuhe Road, No. 288".
Step A4 comprises the following: after the final matching result is obtained, spatial interpolation positioning is carried out according to the house number information. If there is no house number, the address is located at the geometric center of the smallest administrative unit in the original address information; for example, if the original address information is accurate to the street, the position is set to the geometric center of that street. If there is a house number, urban house numbers are assumed to follow this distribution rule: odd and even numbers are distributed on opposite sides of the road, so that either the odd numbers are on the left and the even numbers on the right, or the odd numbers are on the right and the even numbers on the left (Fig. 4). The house numbers at the road's inflection points are recorded together with their geographic coordinates. After the house number in the original address is obtained, the two inflection points it lies between are determined. Supposing the matched house number lies between inflection points A and B, linear interpolation by the least squares method is carried out with A and B as reference points to obtain the specific geographic coordinates of this house number on the road, and the resulting geographic coordinates are finally located on the map.
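The positioning step can be sketched as follows. With only two reference points, a least-squares straight-line fit passes exactly through both, so the sketch uses plain linear interpolation between the bracketing inflection points. The function name `locate`, the tuple layout, and the planar coordinates are assumptions; the odd/even side of the road is not modeled.

```python
from bisect import bisect_right

def locate(number, inflections):
    """Interpolate the position of a house number along a road.

    inflections: list of (house_number, x, y) tuples for the road's
    inflection points, sorted by house_number, with at least two entries
    and distinct house numbers (assumed data layout).
    """
    numbers = [n for n, _, _ in inflections]
    if not numbers[0] <= number <= numbers[-1]:
        raise ValueError("house number outside the recorded range")
    # Index of the inflection point B at or just beyond the number.
    hi = min(bisect_right(numbers, number), len(numbers) - 1)
    lo = hi - 1
    (na, xa, ya), (nb, xb, yb) = inflections[lo], inflections[hi]
    t = (number - na) / (nb - na)  # fraction of the way from A to B
    return (xa + t * (xb - xa), ya + t * (yb - ya))
```

A caller would pass the inflection points of the matched road and the parsed house number, then project the returned point to the appropriate side of the road if the parity rule is needed.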
The average time complexity of the branch-and-bound matching algorithm based on the tree-shaped address storage model is O(log_K N), where N is the number of leaf nodes of the K-ary address tree.
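One way to see where the logarithm comes from, under the idealizing assumptions (not stated in the text) that the address tree is a full K-ary tree of depth d and that matching keeps exactly one branch per level:

```latex
K^{d} = N \;\Longrightarrow\; d = \log_K N ,
\qquad \text{e.g. } K = 10,\; N = 10^{5} \;\Rightarrow\; d = 5 .
```

The matcher then descends through d levels rather than scanning all N leaf addresses; fuzzy fallbacks that keep several branches alive add to this cost.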
In this embodiment, the originally entered address is "Zhejiang Province, Hangzhou City, East Lake (Donghu) District, Liuxia Town, Liuhe Road (entered with a wrong character), No. 288". Segmenting this original address yields the candidate address array address[] (Table 1).
Table 1: Candidate address array
Level    Province    City        District     Town      Road                            House number
Value    Zhejiang    Hangzhou    East Lake    Liuxia    Liuhe (with a wrong character)  288
To better illustrate the idea of the algorithm, some interfering data are added to the address tree to be matched. With the branch-and-bound algorithm introduced, the matching process is as follows:
Step 1: load the initial address tree and evaluate address[1] = "Zhejiang". After a successful exact match, extract the branch tree rooted at "Zhejiang" and delete the invalid branch trees, where sc denotes the total score after each node has been matched against the candidate address segments, as shown in Fig. 5.
Step 2: evaluate address[2] = "Hangzhou"; after a successful exact match, extract the branch tree rooted at "Hangzhou". Evaluate address[3] = "East Lake"; after a successful exact match, extract the branch tree rooted at "East Lake", as shown in Fig. 6.
Step 3: evaluate address[4] = "Liuxia". The current branch tree has no feasible solution, so return to the parent of the current root node "East Lake", enable fuzzy matching mode, obtain the branch trees that satisfy the partial matching condition, and match the keyword "Liuxia" again, as shown in Fig. 7.
Step 4: evaluate address[5] = "Liuhe" (entered with a wrong character). The child nodes of the current branch tree's root cannot be matched exactly, so start fuzzy matching mode and obtain the partially matching branch trees. Evaluate address[6] = "288", which matches in all partially matching branch trees, as shown in Fig. 8.
After all segments in the candidate address array have been matched, the final evaluation scores of the address records are sorted, and the address record with the highest score is returned as the final matching result, as shown by the solid-line part of Fig. 9.
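Steps 1 to 4 can be approximated by a pruned depth-first search over a nested-dictionary tree. The sketch below is written in the spirit of the branch-and-bound matching and rules 1 to 5, not as the patented procedure itself: the names `membership`, `match`, and `TREE`, the reading of the membership rule, and the sample data (including the interfering "东湖" branch and the mistyped "留合") are all illustrative assumptions, and the house number is left to the positioning step.

```python
def membership(address, std_address):
    """Assumed reading of the fuzzy rules: 1 if contained, else i / H."""
    if address in std_address:
        return 1.0
    return sum(1 for ch in std_address if ch in address) / len(std_address)

def match(children, elements):
    """Return (score, path) candidates for elements under a subtree."""
    if not elements:
        return [(0.0, [])]
    head, rest = elements[0], elements[1:]
    if head is None:
        # Rules 3-4: a missing item keeps every branch and scores 0.
        return [(s, [name] + p)
                for name, sub in children.items()
                for s, p in match(sub, rest)]
    results = []
    if head in children:
        # Rule 1: exact match scores 10 and prunes all sibling branches.
        results = [(10.0 + s, [head] + p)
                   for s, p in match(children[head], rest)]
    if not results:
        # Rule 2: no feasible solution below, so fall back to fuzzy
        # matching among the nodes of this level.
        for name, sub in children.items():
            mu = membership(head, name)
            if name != head and mu > 0:
                results += [(10.0 * mu + s, [name] + p)
                            for s, p in match(sub, rest)]
    return results  # Rule 5: each score is the sum over levels

# Illustrative tree; the "东湖" (East Lake) branch is interfering data.
TREE = {"浙江": {"杭州": {
    "西湖": {"留下": {"留和": {}}},
    "东湖": {"临平": {"迎宾": {}}},
}}}

best = max(match(TREE, ["浙江", "杭州", "东湖", "留下", "留合"]))
```

On this sample, the search first follows the exact "东湖" branch, finds nothing feasible for "留下" there, backs up, and accepts "西湖" as a fuzzy match, so `best` ends on the path through "西湖", "留下", and "留和".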
Step 5: obtain the house number information and read the geographic information of the street in the final matched address, including the house number data of its inflection points, as shown in Fig. 9. The initial house number "No. 288" is determined to lie between inflection point A "No. 268" and inflection point B "No. 296". Least-squares interpolation is carried out with inflection points A and B as reference points to obtain the spatial position of the original house number on the street; see the "*" position in Fig. 10.
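As a worked instance of this interpolation, assuming house numbers are spaced evenly between A and B (the coordinates of A and B are not given in the text, so they stay symbolic):

```latex
t = \frac{288 - 268}{296 - 268} = \frac{20}{28} = \frac{5}{7} \approx 0.714,
\qquad
P = A + t\,(B - A)
```

Under that assumption, No. 288 lies about 71% of the way from inflection point A to inflection point B.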
The above describes one embodiment of the invention and the good optimization effect it shows. The invention is clearly not limited to this embodiment: many variations may be made and implemented without departing from the essential spirit of the invention or going beyond its substantive content.

Claims (8)

1. A fuzzy-matching-based method for determining Chinese geocodes, characterized in that the method comprises the following steps:
A1: read in descriptive Chinese address information and, taking administrative-region levels as breakpoints, segment the original address with a forward maximum search method to obtain the original address element array;
A2: standardize the original address elements using an address dictionary;
A3: read the standard address tree and match the original address element array using a branch-and-bound algorithm: an address database is built in address-tree storage format; following the hierarchical division of China's administrative regions, a tree-shaped address store is established in which the highest-level administrative unit is the root node of the address tree and its subordinate administrative regions are stored as child nodes; matching is based on the address elements and house number obtained by segmenting the descriptive Chinese address information; during matching, the standard address tree R is read first, the keyword of the highest administrative level among the segmented candidate address elements is matched against the address nodes of the corresponding administrative level in R, the unrelated branch trees are discarded after a successful match, and the related branch tree is kept for matching at the next administrative level;
At the same time, fuzzy rules are used to control the matching operation; after the keywords of the segmented original address have been obtained, the method further comprises:
Optimizing the matching operation with fuzzy matching rules, defined as follows: let the field to be matched be the string address, of length h, and let the standard field be the string std_address, of length H; the set of std_address values satisfying address ∩ std_address ≠ Φ is defined as the set that satisfies the matching condition, where address ∩ std_address ≠ Φ means that the strings address and std_address have a non-empty intersection; in the end, the set elements with high membership are retained; the matching rules are:
1. If the standard string std_address has i characters in common with the string address being matched, the membership is i/H;
2. If the standard string std_address contains the string address being matched, the membership is 1;
After the membership is obtained, let μ be the matching membership; it is converted to a quantized score by the mapping f: μ → sc, with mapping function f(μ) = 10 × μ, and sc is taken as the evaluation score of the candidate record;
The result with the highest evaluation score is taken as the closest match, which gives a more accurate matched address.
2. The fuzzy-matching-based method for determining Chinese geocodes as claimed in claim 1, characterized in that the method further comprises:
A4: if the matched address includes a house number, perform spatial positioning: urban house numbers are assumed to follow the distribution rule that odd and even numbers are distributed on opposite sides of the road, so that either the odd numbers are on the left and the even numbers on the right, or the odd numbers are on the right and the even numbers on the left; the house numbers at the road's inflection points are recorded together with their geographic coordinates; after the house number in the original address is obtained, the two inflection points it lies between are determined; supposing the matched house number lies between inflection points A and B, linear interpolation by the least squares method is carried out with A and B as reference points to obtain the specific geographic coordinates of this house number on the road, which is finally located on the map.
3. The fuzzy matching-based Chinese geo-code determination method as claimed in claim 1 or 2, characterized in that: in step A3, the candidate address array obtained by standardizing the original address is defined as address[i], 0<i<N; the matching score between a standard address node and the candidate element of the corresponding level is denoted sc_i, where i is the level to which the node belongs and N is the depth of the initial address tree; the matching judgment rules are as follows:
Rule 1: does the address tree node match the candidate element exactly? Y → exact matching; N → fuzzy matching;
Rule 2: is a feasible solution found after exact matching? Y → the matching algorithm moves down one level; N → return to the upper-level node and search for an approximate solution;
Rule 3: does a default (missing) item exist? Y → save the upper-level branch tree; N → save the current-level branch tree;
Rule 4: if a default item exists, sc_i = 0, where i is the level at which the default item is located;
Rule 5: the final score of a candidate record is the sum of the matching scores of its nodes at each level:
sc = ∑_i sc_i
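Rules 4 and 5 can be illustrated with a short sketch. It assumes that the per-level scores have already been produced by the exact or fuzzy matching of rules 1 to 3, and it uses `None` as a hypothetical marker for a default (missing) level; the name `record_score` is likewise illustrative.

```python
from typing import Optional, Sequence


def record_score(layer_scores: Sequence[Optional[float]]) -> float:
    """Sum the per-level matching scores of one candidate record.

    A level marked None is a default (missing) item and contributes 0
    (rule 4); the final score is the sum over all levels (rule 5).
    """
    return sum(0.0 if s is None else s for s in layer_scores)
```

A record with one missing level therefore scores lower than an otherwise identical record that matches at every level, which is what lets the highest total pick out the most complete match.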
4. The fuzzy matching-based Chinese geo-code determination method as claimed in claim 1 or 2, characterized in that: in step A3, an auxiliary place-name database is set up, in which a separate database is built for geographic locations that carry a second feature identifier, are important, and are frequently used.
5. The fuzzy matching-based Chinese geo-code determination method as claimed in claim 1 or 2, characterized in that: in step A1, for the original address obtained, the first character of the original address is taken as the starting point and the address database is searched for a corresponding standard address name; if one exists, the address information is read and retained, and the matched characters are removed from the original address string; otherwise the next character is read and combined with the preceding characters into a string, and the search for a corresponding standard address name in the address database continues; characters are read in turn in this way until the address elements of all administrative levels are determined.
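The character-by-character splitting described in this claim can be sketched as follows. The sketch accepts the first dictionary hit while growing the string, exactly as the claim words it; a strict forward maximum matching variant would instead try the longest candidate first. The function name `split_address`, the in-memory set standing in for the address database, and the sample address are all illustrative assumptions.

```python
from typing import List, Set


def split_address(raw: str, standard_names: Set[str]) -> List[str]:
    """Split `raw` into address elements by growing a string one character
    at a time until it equals a standard address name."""
    elements: List[str] = []
    buffer = ""
    for ch in raw:
        buffer += ch
        if buffer in standard_names:
            elements.append(buffer)  # keep the matched element
            buffer = ""              # and cut it from the remaining string
    if buffer:
        # Unmatched remainder, e.g. a house number, is kept as a final element.
        elements.append(buffer)
    return elements


names = {"浙江省", "杭州市", "西湖区", "留和路"}
parts = split_address("浙江省杭州市西湖区留和路288号", names)
```

Here `parts` holds the four administrative and road elements followed by the unmatched house-number remainder.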
6. The fuzzy matching-based Chinese geo-code determination method as claimed in claim 1 or 2, characterized in that: in step A2, if a default (missing) item exists in the candidate address array after splitting, its higher-level address is obtained from the address database according to the address element of the next lower level and written into the candidate address element array.
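Filling in a missing higher-level element from its lower-level neighbour might look like the sketch below. The level names in `LEVELS`, the `parent_of` lookup table standing in for the address database, and the function name `fill_defaults` are assumptions made for illustration only.

```python
from typing import Dict

# Hypothetical administrative levels, ordered from highest to lowest.
LEVELS = ("province", "city", "district", "road")


def fill_defaults(elements: Dict[str, str], parent_of: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `elements` with missing levels filled from below.

    `parent_of` maps a standard address name to the name of its parent.
    Levels are visited from lowest to highest so that a parent filled in
    one step can in turn supply its own parent.
    """
    filled = dict(elements)
    pairs = list(zip(LEVELS, LEVELS[1:]))  # (upper level, lower level)
    for upper, lower in reversed(pairs):
        if upper not in filled and lower in filled:
            parent = parent_of.get(filled[lower])
            if parent is not None:
                filled[upper] = parent
    return filled
```

If the lookup table has no entry for a lower-level name, the missing level simply stays missing, and the gap is left to the scoring rules to penalise.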
7. The fuzzy matching-based Chinese geo-code determination method as claimed in claim 6, characterized in that: in step A2, an address abbreviation and alias information database is designed, namely a dedicated information database that stores all current standard address information together with its aliases and abbreviations.
8. The fuzzy matching-based Chinese geo-code determination method as claimed in claim 7, characterized in that: in step A2, wrongly written characters in the address elements after splitting are corrected: supposing the entered address information contains a wrongly written character, i.e. an address element after splitting has no exactly corresponding standard address name in the address dictionary, the standard address name closest to the entered address information is retrieved and returned, and it replaces the entered address information.
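The typo-correction step can be approximated with Python's standard library. Note the substitution: the sketch uses the similarity ratio of `difflib` as the closeness measure, not the membership function of the patent, and the `cutoff` value, the function name `correct_element` and the sample names are assumptions.

```python
from difflib import get_close_matches
from typing import List


def correct_element(element: str, standard_names: List[str], cutoff: float = 0.6) -> str:
    """Replace `element` by the closest standard address name.

    An exact dictionary hit is returned unchanged; if no name reaches the
    similarity cutoff, the entered element is returned as-is.
    """
    if element in standard_names:
        return element
    matches = get_close_matches(element, standard_names, n=1, cutoff=cutoff)
    return matches[0] if matches else element


standard = ["西湖区", "上城区", "拱墅区"]
fixed = correct_element("西糊区", standard)  # one wrongly written character
```

With a three-character name, a single wrong character still leaves two of three characters in common, which clears the default cutoff; shorter names or several errors may need a lower cutoff or a different measure.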
CN2009101566504A 2009-12-31 2009-12-31 Fuzzy matching-based Chinese geo-code determination method Expired - Fee Related CN101719128B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009101566504A CN101719128B (en) 2009-12-31 2009-12-31 Fuzzy matching-based Chinese geo-code determination method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009101566504A CN101719128B (en) 2009-12-31 2009-12-31 Fuzzy matching-based Chinese geo-code determination method

Publications (2)

Publication Number Publication Date
CN101719128A true CN101719128A (en) 2010-06-02
CN101719128B CN101719128B (en) 2012-05-23

Family

ID=42433702

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009101566504A Expired - Fee Related CN101719128B (en) 2009-12-31 2009-12-31 Fuzzy matching-based Chinese geo-code determination method

Country Status (1)

Country Link
CN (1) CN101719128B (en)

Cited By (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101980208A (en) * 2010-11-10 2011-02-23 百度在线网络技术(北京)有限公司 Address query method and system
CN101996247A (en) * 2010-11-10 2011-03-30 百度在线网络技术(北京)有限公司 Method and device for constructing address database
CN102024024A (en) * 2010-11-10 2011-04-20 百度在线网络技术(北京)有限公司 Method and device for constructing address database
CN102169498A (en) * 2011-04-14 2011-08-31 中国测绘科学研究院 Address model constructing method and address matching method and system
CN102289467A (en) * 2011-07-22 2011-12-21 浙江百世技术有限公司 Method and device for determining target site
CN102298585A (en) * 2010-06-24 2011-12-28 高德软件有限公司 Address splitting and level marking method and device
CN102393937A (en) * 2011-10-12 2012-03-28 深圳市络道科技有限公司 Address matching method and system of address tree based on backward production
CN102402533A (en) * 2010-09-13 2012-04-04 方正国际软件有限公司 Address matching method and system
CN102446186A (en) * 2010-10-13 2012-05-09 上海众恒信息产业股份有限公司 Chinese geographic coding and decoding method and device adopting same
CN102880650A (en) * 2012-08-27 2013-01-16 中国工商银行股份有限公司 Data matching method and device
CN102955832A (en) * 2011-08-31 2013-03-06 深圳市华傲数据技术有限公司 Correspondence address identifying and standardizing system
CN103383682A (en) * 2012-05-01 2013-11-06 刘龙 Geographic coding method, and position inquiring system and method
CN103413215A (en) * 2013-07-12 2013-11-27 广州银联网络支付有限公司 Electronic bank code matching method based on matrix similarity algorithm
CN103440311A (en) * 2013-08-27 2013-12-11 深圳市华傲数据技术有限公司 Method and system for identifying geographical name entities
CN103558926A (en) * 2013-11-12 2014-02-05 金蝶软件(中国)有限公司 Geographical name entry method and geographical name entry device
CN103593468A (en) * 2013-11-27 2014-02-19 北京金和软件股份有限公司 Audio content pushing method
CN104021184A (en) * 2014-06-10 2014-09-03 广州品唯软件有限公司 Positioning method and system
CN104092613A (en) * 2014-07-15 2014-10-08 山东超越数控电子有限公司 Rapid table lookup method based on fuzzy matching
CN104182509A (en) * 2014-08-20 2014-12-03 国家电网公司 Object-oriented address modeling method
CN104182510A (en) * 2014-08-20 2014-12-03 国家电网公司 Object-oriented address modeling method
WO2016050088A1 (en) * 2014-09-30 2016-04-07 华为技术有限公司 Address search method and device
CN105659637A (en) * 2013-09-30 2016-06-08 三星电子株式会社 Caching of locations on a device
CN105760360A (en) * 2014-12-16 2016-07-13 高德软件有限公司 Address correction method and device
WO2016165538A1 (en) * 2015-04-13 2016-10-20 阿里巴巴集团控股有限公司 Address data management method and device
CN106055635A (en) * 2016-05-30 2016-10-26 深圳市华傲数据技术有限公司 Address information searching method and address information searching device
CN106296209A (en) * 2015-06-05 2017-01-04 阿里巴巴集团控股有限公司 Address input control method and device
CN106502978A (en) * 2016-09-19 2017-03-15 浪潮软件股份有限公司 A kind of Chinese address segmenting method and device
CN106528605A (en) * 2016-09-27 2017-03-22 武汉工程大学 A rule-based Chinese address resolution method
CN106649464A (en) * 2016-09-26 2017-05-10 深圳市数字城市工程研究中心 Method of building Chinese address tree and device
CN106709065A (en) * 2017-01-19 2017-05-24 国家电网公司 Standardization processing method and standardized processing device for address information
CN106874384A (en) * 2017-01-10 2017-06-20 广东精规划信息科技股份有限公司 A kind of isomery address standard handovers and matching process
CN106875264A (en) * 2017-03-31 2017-06-20 北京京东尚科信息技术有限公司 Sequence information management method, device and order sorting system
CN107748778A (en) * 2017-10-20 2018-03-02 浪潮软件股份有限公司 A kind of method and device for extracting address
CN108369582A (en) * 2018-03-02 2018-08-03 福建联迪商用设备有限公司 A kind of address error correction method and terminal
CN108959244A (en) * 2018-06-07 2018-12-07 北京京东尚科信息技术有限公司 The method and apparatus of address participle
CN109254964A (en) * 2018-08-20 2019-01-22 中国平安人寿保险股份有限公司 Address Standardization method, apparatus, computer equipment and storage medium
CN109255564A (en) * 2017-07-13 2019-01-22 菜鸟智能物流控股有限公司 Pick-up point address recommendation method and device
CN109344213A (en) * 2018-08-28 2019-02-15 浙江工业大学 A kind of Chinese Geocoding based on dictionary tree
CN109784308A (en) * 2019-02-01 2019-05-21 腾讯科技(深圳)有限公司 A kind of address error correction method, device and storage medium
CN109933797A (en) * 2019-03-21 2019-06-25 东南大学 Geocoding and system based on Jieba participle and address dictionary
CN110099246A (en) * 2019-02-18 2019-08-06 深度好奇(北京)科技有限公司 Monitoring and scheduling method, apparatus, computer equipment and storage medium
CN110674367A (en) * 2019-09-09 2020-01-10 广州易起行信息技术有限公司 Single Chinese character retrieval method and device based on travel industry products
CN110704564A (en) * 2019-09-27 2020-01-17 北京沃东天骏信息技术有限公司 Address error correction method and device
CN110895651A (en) * 2018-08-23 2020-03-20 北京京东金融科技控股有限公司 Address standardization processing method, device, equipment and computer readable storage medium
CN111144117A (en) * 2019-12-26 2020-05-12 同济大学 Knowledge graph Chinese address disambiguation method
CN111291277A (en) * 2020-01-14 2020-06-16 浙江邦盛科技有限公司 Address standardization method based on semantic recognition and high-level language search
CN111414357A (en) * 2019-01-07 2020-07-14 阿里巴巴集团控股有限公司 Address data processing method, device, system and storage medium
CN111753515A (en) * 2020-06-24 2020-10-09 广东科杰通信息科技有限公司 Address information extraction and matching method for realizing entity positioning
CN111859849A (en) * 2020-07-01 2020-10-30 邦道科技有限公司 Power utilization address management method and device
CN112052413A (en) * 2020-08-28 2020-12-08 上海谋乐网络科技有限公司 URL fuzzy matching method, device and system
CN112364113A (en) * 2020-11-13 2021-02-12 北京明略软件系统有限公司 Address error correction method and system
CN112417179A (en) * 2020-11-23 2021-02-26 杭州橙鹰数据技术有限公司 Address processing method and device
CN112925922A (en) * 2019-12-06 2021-06-08 农业农村部信息中心 Method, device, electronic equipment and medium for obtaining address
CN113204606A (en) * 2021-04-30 2021-08-03 武汉大学 Address position presumption method based on semantic position network
CN113656450A (en) * 2021-07-12 2021-11-16 大箴(杭州)科技有限公司 Address processing method and device, electronic equipment and storage medium
CN114091454A (en) * 2021-11-29 2022-02-25 重庆市地理信息和遥感应用中心 Method for extracting place name information and positioning space in internet text
CN116910386A (en) * 2023-09-14 2023-10-20 深圳市智慧城市科技发展集团有限公司 Address completion method, terminal device and computer-readable storage medium
CN117874309A (en) * 2024-03-12 2024-04-12 北京全路通信信号研究设计院集团有限公司 Train control data processing method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101350013A (en) * 2007-07-18 2009-01-21 北京灵图软件技术有限公司 Method and system for searching geographical information
CN101350012B (en) * 2007-07-18 2013-01-16 北京灵图软件技术有限公司 Method and system for matching address
CN100535907C (en) * 2007-08-21 2009-09-02 北京大学 Method for extracting entity address message in text context
CN101393544A (en) * 2008-10-07 2009-03-25 南京师范大学 Chinese address semantic parsing method facing address encode

Cited By (91)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102298585A (en) * 2010-06-24 2011-12-28 高德软件有限公司 Address splitting and level marking method and device
CN102298585B (en) * 2010-06-24 2016-01-13 高德软件有限公司 A kind of address cutting and rank mask method and address cutting and rank annotation equipment
CN102402533A (en) * 2010-09-13 2012-04-04 方正国际软件有限公司 Address matching method and system
CN102446186B (en) * 2010-10-13 2016-03-30 上海众恒信息产业股份有限公司 Chinese geocoding and coding/decoding method and device
CN102446186A (en) * 2010-10-13 2012-05-09 上海众恒信息产业股份有限公司 Chinese geographic coding and decoding method and device adopting same
CN101980208A (en) * 2010-11-10 2011-02-23 百度在线网络技术(北京)有限公司 Address query method and system
CN102024024A (en) * 2010-11-10 2011-04-20 百度在线网络技术(北京)有限公司 Method and device for constructing address database
CN101996247A (en) * 2010-11-10 2011-03-30 百度在线网络技术(北京)有限公司 Method and device for constructing address database
CN101996247B (en) * 2010-11-10 2013-02-20 百度在线网络技术(北京)有限公司 Method and device for constructing address database
CN102024024B (en) * 2010-11-10 2013-07-10 百度在线网络技术(北京)有限公司 Method and device for constructing address database
CN102169498A (en) * 2011-04-14 2011-08-31 中国测绘科学研究院 Address model constructing method and address matching method and system
CN102289467A (en) * 2011-07-22 2011-12-21 浙江百世技术有限公司 Method and device for determining target site
CN102955832A (en) * 2011-08-31 2013-03-06 深圳市华傲数据技术有限公司 Correspondence address identifying and standardizing system
CN102955832B (en) * 2011-08-31 2015-11-25 深圳市华傲数据技术有限公司 A kind of address identification, standardized system
CN102393937A (en) * 2011-10-12 2012-03-28 深圳市络道科技有限公司 Address matching method and system of address tree based on backward production
CN103383682B (en) * 2012-05-01 2017-12-26 刘龙 A kind of Geocoding, position enquiring system and method
CN103383682A (en) * 2012-05-01 2013-11-06 刘龙 Geographic coding method, and position inquiring system and method
CN102880650B (en) * 2012-08-27 2015-11-18 中国工商银行股份有限公司 A kind of data matching method and device
CN102880650A (en) * 2012-08-27 2013-01-16 中国工商银行股份有限公司 Data matching method and device
CN103413215A (en) * 2013-07-12 2013-11-27 广州银联网络支付有限公司 Electronic bank code matching method based on matrix similarity algorithm
CN103413215B (en) * 2013-07-12 2017-02-08 广州银联网络支付有限公司 Electronic bank code matching method based on matrix similarity algorithm
CN103440311A (en) * 2013-08-27 2013-12-11 深圳市华傲数据技术有限公司 Method and system for identifying geographical name entities
WO2015027836A1 (en) * 2013-08-27 2015-03-05 深圳市华傲数据技术有限公司 Method and system for place name entity recognition
CN105659637A (en) * 2013-09-30 2016-06-08 三星电子株式会社 Caching of locations on a device
CN103558926A (en) * 2013-11-12 2014-02-05 金蝶软件(中国)有限公司 Geographical name entry method and geographical name entry device
CN103593468A (en) * 2013-11-27 2014-02-19 北京金和软件股份有限公司 Audio content pushing method
CN103593468B (en) * 2013-11-27 2016-11-16 北京金和软件股份有限公司 A kind of audio content method for pushing
CN104021184B (en) * 2014-06-10 2017-07-11 广州品唯软件有限公司 A kind of localization method and system
CN104021184A (en) * 2014-06-10 2014-09-03 广州品唯软件有限公司 Positioning method and system
CN104092613A (en) * 2014-07-15 2014-10-08 山东超越数控电子有限公司 Rapid table lookup method based on fuzzy matching
CN104182510A (en) * 2014-08-20 2014-12-03 国家电网公司 Object-oriented address modeling method
CN104182509A (en) * 2014-08-20 2014-12-03 国家电网公司 Object-oriented address modeling method
US10783171B2 (en) 2014-09-30 2020-09-22 Huawei Technologies Co., Ltd. Address search method and device
CN105528372A (en) * 2014-09-30 2016-04-27 华为技术有限公司 An address search method and apparatus
WO2016050088A1 (en) * 2014-09-30 2016-04-07 华为技术有限公司 Address search method and device
CN105760360A (en) * 2014-12-16 2016-07-13 高德软件有限公司 Address correction method and device
CN105760360B (en) * 2014-12-16 2018-09-11 高德软件有限公司 A kind of address correcting method and device
CN106156145A (en) * 2015-04-13 2016-11-23 阿里巴巴集团控股有限公司 The management method of a kind of address date and device
WO2016165538A1 (en) * 2015-04-13 2016-10-20 阿里巴巴集团控股有限公司 Address data management method and device
CN106296209A (en) * 2015-06-05 2017-01-04 阿里巴巴集团控股有限公司 Address input control method and device
CN106296209B (en) * 2015-06-05 2021-02-02 菜鸟智能物流控股有限公司 Address input control method and device
CN106055635B (en) * 2016-05-30 2019-11-19 深圳市华傲数据技术有限公司 Address information lookup method and device
CN106055635A (en) * 2016-05-30 2016-10-26 深圳市华傲数据技术有限公司 Address information searching method and address information searching device
CN106502978A (en) * 2016-09-19 2017-03-15 浪潮软件股份有限公司 A kind of Chinese address segmenting method and device
CN106649464A (en) * 2016-09-26 2017-05-10 深圳市数字城市工程研究中心 Method of building Chinese address tree and device
CN106649464B (en) * 2016-09-26 2019-08-30 深圳市数字城市工程研究中心 A kind of construction method and device of Chinese address tree
CN106528605A (en) * 2016-09-27 2017-03-22 武汉工程大学 A rule-based Chinese address resolution method
CN106874384B (en) * 2017-01-10 2020-12-04 航天精一(广东)信息科技有限公司 Heterogeneous address standard conversion and matching method
CN106874384A (en) * 2017-01-10 2017-06-20 广东精规划信息科技股份有限公司 A kind of isomery address standard handovers and matching process
CN106709065A (en) * 2017-01-19 2017-05-24 国家电网公司 Standardization processing method and standardized processing device for address information
CN106709065B (en) * 2017-01-19 2020-08-04 国家电网公司 Address information standardization processing method and device
CN106875264A (en) * 2017-03-31 2017-06-20 北京京东尚科信息技术有限公司 Sequence information management method, device and order sorting system
CN109255564A (en) * 2017-07-13 2019-01-22 菜鸟智能物流控股有限公司 Pick-up point address recommendation method and device
CN107748778A (en) * 2017-10-20 2018-03-02 浪潮软件股份有限公司 A kind of method and device for extracting address
CN107748778B (en) * 2017-10-20 2021-03-23 浪潮软件股份有限公司 Method and device for extracting address
CN108369582B (en) * 2018-03-02 2021-06-25 福建联迪商用设备有限公司 Address error correction method and terminal
WO2019165644A1 (en) * 2018-03-02 2019-09-06 福建联迪商用设备有限公司 Address error correction method and terminal
CN108369582A (en) * 2018-03-02 2018-08-03 福建联迪商用设备有限公司 A kind of address error correction method and terminal
CN108959244B (en) * 2018-06-07 2022-08-09 北京京东尚科信息技术有限公司 Address word segmentation method and device
CN108959244A (en) * 2018-06-07 2018-12-07 北京京东尚科信息技术有限公司 The method and apparatus of address participle
CN109254964A (en) * 2018-08-20 2019-01-22 中国平安人寿保险股份有限公司 Address Standardization method, apparatus, computer equipment and storage medium
CN110895651B (en) * 2018-08-23 2024-02-02 京东科技控股股份有限公司 Address standardization processing method, device, equipment and computer readable storage medium
CN110895651A (en) * 2018-08-23 2020-03-20 北京京东金融科技控股有限公司 Address standardization processing method, device, equipment and computer readable storage medium
CN109344213A (en) * 2018-08-28 2019-02-15 浙江工业大学 A kind of Chinese Geocoding based on dictionary tree
CN109344213B (en) * 2018-08-28 2021-06-18 浙江工业大学 Chinese geocoding method based on dictionary tree
CN111414357A (en) * 2019-01-07 2020-07-14 阿里巴巴集团控股有限公司 Address data processing method, device, system and storage medium
CN109784308A (en) * 2019-02-01 2019-05-21 腾讯科技(深圳)有限公司 A kind of address error correction method, device and storage medium
CN109784308B (en) * 2019-02-01 2020-09-29 腾讯科技(深圳)有限公司 Address error correction method, device and storage medium
CN110099246A (en) * 2019-02-18 2019-08-06 深度好奇(北京)科技有限公司 Monitoring and scheduling method, apparatus, computer equipment and storage medium
CN109933797A (en) * 2019-03-21 2019-06-25 东南大学 Geocoding and system based on Jieba participle and address dictionary
CN110674367A (en) * 2019-09-09 2020-01-10 广州易起行信息技术有限公司 Single Chinese character retrieval method and device based on travel industry products
CN110674367B (en) * 2019-09-09 2022-02-01 广州易起行信息技术有限公司 Single Chinese character retrieval method and device based on travel industry products
CN110704564A (en) * 2019-09-27 2020-01-17 北京沃东天骏信息技术有限公司 Address error correction method and device
CN112925922A (en) * 2019-12-06 2021-06-08 农业农村部信息中心 Method, device, electronic equipment and medium for obtaining address
CN111144117B (en) * 2019-12-26 2023-08-29 同济大学 Method for disambiguating Chinese address of knowledge graph
CN111144117A (en) * 2019-12-26 2020-05-12 同济大学 Knowledge graph Chinese address disambiguation method
CN111291277A (en) * 2020-01-14 2020-06-16 浙江邦盛科技有限公司 Address standardization method based on semantic recognition and high-level language search
CN111753515A (en) * 2020-06-24 2020-10-09 广东科杰通信息科技有限公司 Address information extraction and matching method for realizing entity positioning
CN111859849A (en) * 2020-07-01 2020-10-30 邦道科技有限公司 Power utilization address management method and device
CN111859849B (en) * 2020-07-01 2023-11-24 邦道科技有限公司 Management method and device for electricity utilization address
CN112052413A (en) * 2020-08-28 2020-12-08 上海谋乐网络科技有限公司 URL fuzzy matching method, device and system
CN112052413B (en) * 2020-08-28 2024-02-13 上海谋乐网络科技有限公司 URL fuzzy matching method, device and system
CN112364113A (en) * 2020-11-13 2021-02-12 北京明略软件系统有限公司 Address error correction method and system
CN112417179A (en) * 2020-11-23 2021-02-26 杭州橙鹰数据技术有限公司 Address processing method and device
CN113204606A (en) * 2021-04-30 2021-08-03 武汉大学 Address position presumption method based on semantic position network
CN113656450A (en) * 2021-07-12 2021-11-16 大箴(杭州)科技有限公司 Address processing method and device, electronic equipment and storage medium
CN114091454A (en) * 2021-11-29 2022-02-25 重庆市地理信息和遥感应用中心 Method for extracting place name information and positioning space in internet text
CN116910386B (en) * 2023-09-14 2024-02-02 深圳市智慧城市科技发展集团有限公司 Address completion method, terminal device and computer-readable storage medium
CN116910386A (en) * 2023-09-14 2023-10-20 深圳市智慧城市科技发展集团有限公司 Address completion method, terminal device and computer-readable storage medium
CN117874309A (en) * 2024-03-12 2024-04-12 北京全路通信信号研究设计院集团有限公司 Train control data processing method and device, electronic equipment and storage medium
CN117874309B (en) * 2024-03-12 2024-05-24 北京全路通信信号研究设计院集团有限公司 Train control data processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN101719128B (en) 2012-05-23

Similar Documents

Publication Publication Date Title
CN101719128B (en) Fuzzy matching-based Chinese geo-code determination method
CN103914544A (en) Method for quickly matching Chinese addresses in multi-level manner on basis of address feature words
Mark Geographic information science: Defining the field
CN108369582B (en) Address error correction method and terminal
CN106156082B (en) A kind of ontology alignment schemes and device
CN112612863B (en) Address matching method and system based on Chinese word segmentation device
CN112528174B (en) Address trimming and complementing method based on knowledge graph and multiple matching and application
CN104679801B (en) A kind of interest point search method and device
CN1590964A (en) Iterative logical renewal of navigable map database
CN109933797A (en) Geocoding and system based on Jieba participle and address dictionary
CN102375807A (en) Method and device for proofing characters
CN103345496A (en) Multimedia information searching method and system
CN105209858A (en) Non-deterministic disambiguation and matching of business locale data
CN108009265B (en) Spatial data indexing method in cloud computing environment
CN111291099B (en) Address fuzzy matching method and system and computer equipment
CN101493340B (en) Method for quickly searching interested point information in navigation system for vehicles
CN103970842A (en) Water conservancy big data access system and method for field of flood control and disaster reduction
CN111311173A (en) National county level unit economic arrangement and spatialization method
CN104391908A (en) Locality sensitive hashing based indexing method for multiple keywords on graphs
CN114780680A (en) Retrieval and completion method and system based on place name and address database
CN104598887B (en) Recognition methods for non-canonical format handwritten Chinese address
CN102999548B (en) Geographical name data extended method and device in electronic chart
CN113505190B (en) Address information correction method, device, computer equipment and storage medium
Machanavajjhala et al. Collective extraction from heterogeneous web lists
CN114201480A (en) Multi-source POI fusion method and device based on NLP technology and readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120523