CN106959961A - Address recognition method and device - Google Patents

Info

Publication number
CN106959961A
Authority
CN
China
Prior art keywords
address
original
upper level
name
administrative
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610016741.8A
Other languages
Chinese (zh)
Inventor
田国超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cainiao Smart Logistics Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201610016741.8A
Publication of CN106959961A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Character Discrimination (AREA)

Abstract

This application provides an address recognition method and device. The address name with the lowest administrative level is extracted from an original address, and a target address identical to that lowest-level address name is looked up in a preset address database. It is then judged whether the original upper-level address set of the original address and the target upper-level address set of the target address satisfy a matching condition. If so, the address chain formed by the target address and the addresses in the target upper-level address set is used as the recognition result of the original address. The matching condition includes: the original upper-level address set is a subset of the target upper-level address set, and the relative order of the address names in the original upper-level address set is the same as the relative order of the same address names in the target upper-level address set. In this way, a non-standard address can be recognized, according to the preset address database, as an address that conforms to the address sorting rules, which lays the foundation for subsequent automatic sorting.

Description

Address recognition method and device
Technical field
This application relates to the computer field, and in particular to an address recognition method and device.
Background
Sorting is the process of progressively dividing mail or express parcels into the appropriate bins or stacks according to the address written on them and a preset sorting route (i.e. the routing direction). To improve sorting efficiency, automatic sorting has gradually replaced manual sorting.
The delivery address supplied to an automated sorting system usually has the following data structure: {"province": "Zhejiang Province", "city": "Hangzhou", "area": "Yuhang District", "town": "Cangqian Street", "address_detail": "test address"}, where "province", "city", "area" and "town" are administrative divisions. "province" denotes the first-level administrative division: a province, a municipality directly under the central government, an autonomous region or a special administrative region. "city" denotes the second-level administrative division: a prefecture-level city, an autonomous prefecture, or a district or county directly under a municipality, etc. "area" denotes the third-level administrative division: a county, a county-level city, a municipal district, etc. "town" denotes the fourth-level administrative division: a township, a town, a subdistrict (street), etc. "address_detail" denotes the detailed address.
Conventional automatic sorting uses the last-level address as the sorting unit. The last-level address is the lowest non-empty administrative division among country, province, city, district (county), street (township) and so on. Automatic sorting therefore depends heavily on the delivery address: if the delivery address is not filled in according to the automatic sorting rules, automatic sorting fails. How to improve the success rate of automatic sorting has thus become a problem that urgently needs to be solved.
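The record layout and the notion of a last-level address described above can be sketched as follows. This is a minimal illustration only: the field names come from the example record, while `LEVELS` and `lowest_filled_field` are hypothetical names introduced here, not part of any sorting system.

```python
# Field names follow the example record in the text; LEVELS and
# lowest_filled_field are hypothetical names used only for illustration.
LEVELS = ["province", "city", "area", "town"]  # highest to lowest level

address = {
    "province": "Zhejiang Province",
    "city": "Hangzhou",
    "area": "Yuhang District",
    "town": "Cangqian Street",
    "address_detail": "test address",
}

def lowest_filled_field(record):
    """Return the lowest-level administrative field that is non-empty, or None."""
    for field in reversed(LEVELS):
        if record.get(field):
            return field
    return None

print(lowest_filled_field(address))  # town
```

A record whose lower fields are empty falls back to the next higher non-empty field, which is the division a conventional sorter would use as its sorting unit.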
Summary of the invention
This application provides an address recognition method and device, with the aim of solving the problem of how to improve the success rate of automatic sorting.
To achieve this aim, this application provides the following technical solutions:
An address recognition method, including:
extracting the address name with the lowest administrative level from an original address, where the original address includes that address name and an original upper-level address set, the original upper-level address set being the set of the other address names in the original address apart from that address name;
searching a preset address database for a target address identical to the extracted lowest-level address name;
judging whether the original upper-level address set of the original address and the target upper-level address set of the target address satisfy a matching condition, and if so, using the address chain formed by the target address and the addresses in the target upper-level address set as the recognition result of the original address;
where the target upper-level address set is the set formed by ordering the upper-level addresses of the target address in the address database by administrative level, and the matching condition includes: the original upper-level address set is a subset of the target upper-level address set, and the relative order of the address names in the original upper-level address set is the same as the relative order of the same address names in the target upper-level address set.
Optionally, before judging whether the original upper-level address set of the original address and the target upper-level address set of the target address satisfy the matching condition, the method further includes:
determining that the administrative level of the target address meets the requirement of the sorting rule.
Optionally, extracting the address name with the lowest administrative level from the original address includes:
extracting an address name from each non-empty administrative division in the original address;
selecting the address name with the lowest administrative level from the extracted address names.
Optionally, the preset database includes address data structures, each of which includes: a number, an upper-level address number and an address name;
searching the preset address database for a target address identical to the extracted lowest-level address name includes:
searching the preset address database for an address name identical to the extracted lowest-level address name, and using it as the target address;
determining the target upper-level address set of the target address includes:
finding the upper-level address of the target address in the preset address database according to the upper-level address number corresponding to the target address, and continuing in the same way until all upper-level addresses of the target address have been found, thereby forming the target upper-level address set.
Optionally, the method further includes:
if the original upper-level address set and the target upper-level address set do not satisfy the matching condition, concatenating the address names in each administrative division of the original address and the detailed address into a character string;
performing word segmentation on the character string to obtain tokens;
looking up, in a preset address dictionary, the administrative division matched by each token, and using the administrative divisions matched by the tokens as the recognition result of the original address.
Optionally, after looking up in the preset address dictionary the administrative division matched by each token, the method further includes:
performing address sorting using the recognition result of the original address.
An address recognition device, including:
an extraction module, configured to extract the address name with the lowest administrative level from an original address, where the original address includes that address name and an original upper-level address set, the original upper-level address set being the set of the other address names in the original address apart from that address name;
a search module, configured to search a preset address database for a target address identical to the extracted lowest-level address name;
a first identification module, configured to judge whether the original upper-level address set of the original address and the target upper-level address set of the target address satisfy a matching condition, and if so, to use the address chain formed by the target address and the addresses in the target upper-level address set as the recognition result of the original address;
where the target upper-level address set is the set formed by ordering the upper-level addresses of the target address in the address database by administrative level, and the matching condition includes: the original upper-level address set is a subset of the target upper-level address set, and the relative order of the address names in the original upper-level address set is the same as the relative order of the same address names in the target upper-level address set.
Optionally, the first identification module is further configured to:
determine, before judging whether the original upper-level address set of the original address and the target upper-level address set of the target address satisfy the matching condition, that the administrative level of the target address meets the requirement of the sorting rule.
Optionally, the extraction module being configured to extract the address name with the lowest administrative level from the original address includes:
the extraction module being specifically configured to extract an address name from each non-empty administrative division in the original address, and to select the address name with the lowest administrative level from the extracted address names.
Optionally, the preset database includes address data structures, each of which includes: a number, an upper-level address number and an address name;
the extraction module is specifically configured to search the preset address database for an address name identical to the extracted lowest-level address name, and to use it as the target address;
the extraction module is further configured to: find the upper-level address of the target address in the preset address database according to the upper-level address number corresponding to the target address, and continue in the same way until all upper-level addresses of the target address have been found, thereby forming the target upper-level address set.
Optionally, the device further includes:
a second identification module, configured to: if the original upper-level address set and the target upper-level address set do not satisfy the matching condition, concatenate the address names in each administrative division of the original address and the detailed address into a character string; perform word segmentation on the character string to obtain tokens; and look up, in a preset address dictionary, the administrative division matched by each token, using the administrative divisions matched by the tokens as the recognition result of the original address.
Optionally, the device further includes:
an address sorting module, configured to perform address sorting using the recognition result of the original address after the second identification module has looked up, in the preset address dictionary, the administrative division matched by each token.
In the address recognition method and device described in this application, the address name with the lowest administrative level is extracted from an original address, and a target address identical to that lowest-level address name is looked up in a preset address database. It is then judged whether the original upper-level address set of the original address and the target upper-level address set of the target address satisfy a matching condition; if so, the address chain formed by the target address and the addresses in the target upper-level address set is used as the recognition result of the original address. The target upper-level address set is the set formed by ordering the upper-level addresses of the target address in the address database by administrative level, and the original upper-level address set is the set of the other address names in the original address apart from the extracted address name. The matching condition includes: the original upper-level address set is a subset of the target upper-level address set, and the relative order of the address names in the original upper-level address set is the same as the relative order of the same address names in the target upper-level address set. A non-standard address can therefore be recognized, according to the preset address database, as an address that conforms to the address sorting rules, which lays the foundation for subsequent automatic sorting.
Brief description of the drawings
To describe the technical solutions of the embodiments of this application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of an address recognition method provided by an embodiment of this application;
Fig. 2 is a flowchart of an address sorting method disclosed in an embodiment of this application;
Fig. 3 is a schematic structural diagram of an address recognition device disclosed in an embodiment of this application.
Detailed description
An embodiment of this application provides an address recognition method that can be applied before automatic sorting. Its purpose is to recognize original addresses that do not conform to the automatic sorting rules, obtain addresses that do conform to them, and then perform automatic sorting on the recognized addresses, so as to improve the success rate of automatic sorting.
Further, the address recognition method provided by the embodiment is particularly suited to scenarios in which the administrative divisions are misplaced or partially missing, or in which the administrative-division fields are wrong but the detailed address is correct.
Administrative division misplacement means that, in the address data structure, the names of the administrative addresses at each level are correct but are misaligned with the administrative-division fields, for example province: Hangzhou, city: Yuhang District, area: Cangqian Street. Here each name is correct, but the name of the second-level administrative division, "Hangzhou", has been entered in the position reserved for the first-level administrative division; the "city" and "area" fields are affected in the same way.
Partially missing administrative divisions means that the address names of the administrative levels are incomplete, for example the "city" field of the address is empty.
An administrative-division field error means that, in the address data structure, the administrative divisions are entirely or partly missing, for example province, city and area are empty (not filled in), or that an administrative-division name carries a non-standard suffix, for example city: Yuhang City ("District" mistakenly entered as "City").
The detailed address being correct means that the detailed address contains the address names of the administrative divisions at each level.
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. The described embodiments are obviously only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the scope of protection of this application.
Fig. 1 shows an address recognition method provided by an embodiment of this application, which includes the following steps:
S101: Obtain the original address.
The original address is an address that has not been normalized. Because of data-entry mistakes or faulty adaptation between different systems, it may contain a range of errors, such as misplaced or missing administrative divisions, or administrative-division names that appear only inside the detailed address.
In this embodiment, the following original address data is taken as an example: {province: Hangzhou, city: Yuhang District, area: Cangqian Street, "address_detail": "test address"}.
S102: Extract the address name with the lowest administrative level from the original address.
The original address includes the address name with the lowest administrative level and an original upper-level address set, where the original upper-level address set is the set of the other address names in the original address apart from the lowest-level address name.
Specifically, the lowest-level address name is extracted as follows. Address names are extracted from the administrative divisions of the original address, preferably from each non-empty administrative division. For example, in the original address data {province: Hangzhou, city: Yuhang District, area: Cangqian Street}, the non-empty administrative divisions are "province", "city" and "area", from which the address names "Hangzhou", "Yuhang District" and "Cangqian Street" are extracted. After the address names of the administrative divisions have been extracted, the one with the lowest administrative level is selected from them. In practice, the administrative level of each address name can be determined from the administrative addresses at each level recorded in a preset address database, such as a China address database. Among "Hangzhou", "Yuhang District" and "Cangqian Street", the lowest-level address name is "Cangqian Street".
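Step S102 can be sketched as below. This is an illustrative sketch under stated assumptions, not the patented implementation: `NAME_LEVEL` is a hypothetical stand-in for the level information held in the preset address database, and `split_original` is a name introduced here. Because the fields of the original address may be misplaced, the level is taken from the database entry for each name rather than from the field the name was entered in.

```python
# Hypothetical level table standing in for the preset address database:
# a larger number means a lower administrative level.
NAME_LEVEL = {
    "Zhejiang Province": 1,
    "Hangzhou": 2,
    "Yuhang District": 3,
    "Cangqian Street": 4,
}
FIELDS = ["province", "city", "area", "town"]

def split_original(record):
    """Return (lowest-level name, remaining names) from the non-empty fields."""
    names = [record[f] for f in FIELDS if record.get(f)]
    known = [n for n in names if n in NAME_LEVEL]
    if not known:
        return None, names
    lowest = max(known, key=NAME_LEVEL.get)
    uppers = [n for n in names if n != lowest]
    return lowest, uppers

original = {
    "province": "Hangzhou",
    "city": "Yuhang District",
    "area": "Cangqian Street",
    "address_detail": "test address",
}
print(split_original(original))
# ('Cangqian Street', ['Hangzhou', 'Yuhang District'])
```

The second element of the result is the raw material for the original upper-level address set; the sketch leaves it in field order and does not handle names that exist at more than one level.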
The lowest-level address name is extracted because the lowest-level address lies at the end of the address chain and locates the specific address most precisely.
S103: Search the preset address database for target addresses identical to the extracted lowest-level address name.
Continuing the example, two target addresses are found in the address database, both named "Cangqian Street".
S104: Judge whether the administrative level of the target address meets the requirement of the sorting rule. If so, perform S105; if not, perform S109.
The requirement of the sorting rule is the minimum administrative level needed for automatic sorting (or, in some cases, manual sorting) to be possible. For example, automatic sorting may distinguish routes at the "area" level, i.e. mail for different districts is assigned to different bins. In that case, if the administrative level of the target address is "city", sorting cannot succeed even if the correct address is recognized. The purpose of S104 is therefore to interrupt the address recognition process when the final automatic sorting could not succeed anyway, thereby saving resources.
Here, take as an example a sorting rule that requires the minimum administrative level to be "area" or below. Continuing the example, "Cangqian Street" is below the "area" level, so it meets the sorting rule. Those skilled in the art will appreciate that in other implementations automatic sorting may require only that the minimum administrative level be "city" or "province".
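The level check in S104 amounts to a comparison of ranks. A minimal sketch, assuming the four field names used in this document and a hypothetical `LEVEL_RANK` table and function name:

```python
# Hypothetical ranking of the four administrative fields; a larger number
# means a lower (finer-grained) administrative level.
LEVEL_RANK = {"province": 1, "city": 2, "area": 3, "town": 4}

def meets_sorting_rule(target_level, required_level="area"):
    """True if target_level is at or below the level the sorting rule requires."""
    return LEVEL_RANK[target_level] >= LEVEL_RANK[required_level]

print(meets_sorting_rule("town"))  # True
print(meets_sorting_rule("city"))  # False
```

Passing `required_level="city"` or `"province"` models the looser sorting rules mentioned above.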
S105: Obtain the set of upper-level addresses of the target address from the preset address database.
In this embodiment, the set of upper-level addresses of the target address is the set formed by ordering the target address's upper-level addresses in the address database by administrative level; it is called the target upper-level address set.
Continuing the example, with ordering from low to high administrative level, the upper-level address sets of the two "Cangqian Street" entries found in the address database are, respectively: Yuhang District → Hangzhou → Zhejiang Province, and Cangshan District → Fuzhou City → Fujian Province.
S106: Obtain from the original address the set of upper-level addresses of the extracted address name.
In this embodiment, that set consists of the address names in the non-empty administrative divisions of the original address, other than the extracted address name, ordered by administrative level; it is called the original upper-level address set.
Continuing the example, the original upper-level address set is Yuhang District → Hangzhou.
S107: Judge whether the original upper-level address set is a subset of the target upper-level address set and whether the relative order of the address names in the original upper-level address set is the same as the relative order of the same address names in the target upper-level address set. If so, perform S108; if not, perform S109.
An identical address name in the original and target upper-level address sets means that the original upper-level address set contains an address name A and the target upper-level address set also contains address name A.
Relative order means the order between each pair of address names in a set. For example, if the original upper-level address set is {b, c, e}, the relative order of its address names is: b before c, c before e, and b before e. If the target upper-level address set is {a, b, c, d, e}, the relative order of its address names is: a before b, a before c, a before d, a before e, b before c, b before d, b before e, and so on. The relative order of the address names b, c and e in the original set is thus the same as their relative order in the target set.
It should be noted that, to increase the probability of a successful match and thus further raise the address recognition rate, the address names in the original upper-level address set can be sorted before the judgment using the same administrative-level ordering as the target upper-level address set. That is, both sets are sorted from high to low administrative level, or both from low to high.
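The matching condition in S107 can be read as "the original set is an order-preserving subsequence of the target set". The following sketch assumes both sets are given as lists already ordered in the same direction and that names are unique within each list; the function name is hypothetical.

```python
def satisfies_matching_condition(original_uppers, target_uppers):
    """Subset test plus identical relative order of the shared names."""
    if not set(original_uppers) <= set(target_uppers):
        return False
    # Position of each original name inside the target list.
    positions = [target_uppers.index(name) for name in original_uppers]
    # Same relative order means the positions are strictly increasing.
    return all(a < b for a, b in zip(positions, positions[1:]))

target = ["a", "b", "c", "d", "e"]
print(satisfies_matching_condition(["b", "c", "e"], target))  # True
print(satisfies_matching_condition(["c", "b"], target))       # False
```

The first call reproduces the {b, c, e} versus {a, b, c, d, e} example; the second fails because c and b appear in the opposite order in the target list.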
S108: Use the address chain formed by the target address and the addresses in the target upper-level address set as the recognition result of the original address.
Continuing the example, the original upper-level address set Yuhang District → Hangzhou is a subset of the target upper-level address set Yuhang District → Hangzhou → Zhejiang Province, and the order of the two addresses in the original set is the same as their relative order in the target set. The recognition result for the misplaced original address province: Hangzhou, city: Yuhang District, area: Cangqian Street is therefore province: Zhejiang, city: Hangzhou, area: Yuhang District, town: Cangqian Street.
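The worked example can be traced end to end with a short sketch. It assumes the candidate chains have already been retrieved from the address database and are ordered from low to high level; `CANDIDATES`, `is_ordered_subset` and `recognize` are hypothetical names, and the text does not say how ties between several matching candidates would be broken.

```python
FIELDS_LOW_TO_HIGH = ["town", "area", "city", "province"]

# Candidate chains for the extracted name, each ordered from low to high level.
# The values mirror the example in the text; the list itself is a stand-in for
# what a lookup in the address database would return.
CANDIDATES = [
    ["Cangqian Street", "Yuhang District", "Hangzhou", "Zhejiang Province"],
    ["Cangqian Street", "Cangshan District", "Fuzhou City", "Fujian Province"],
]

def is_ordered_subset(original_uppers, target_uppers):
    """True if original_uppers appears in target_uppers in the same order."""
    remaining = iter(target_uppers)
    return all(name in remaining for name in original_uppers)

def recognize(lowest, original_uppers, candidates):
    """Return the first matching chain as a field -> name record, else None."""
    for chain in candidates:
        target, target_uppers = chain[0], chain[1:]
        if target == lowest and is_ordered_subset(original_uppers, target_uppers):
            return dict(zip(FIELDS_LOW_TO_HIGH, chain))
    return None

result = recognize("Cangqian Street", ["Yuhang District", "Hangzhou"], CANDIDATES)
print(result["province"], result["city"], result["area"], result["town"])
# Zhejiang Province Hangzhou Yuhang District Cangqian Street
```

The Fujian chain is rejected because neither "Yuhang District" nor "Hangzhou" occurs in it, which is how the upper-level names disambiguate two places that share the same lowest-level name.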
The above is address recognition for misplaced administrative divisions. If that process cannot recognize the correct address, the following steps are performed to try to recognize the address from the detailed address:
S109: Obtain the detailed address from the original address.
For example, the original address data is {province: empty, city: empty, area: empty, "address_detail": Zhejiang Province Hangzhou Yuhang District Cangqian Street}. The above process cannot recognize this address. The detailed address in the original address is "address_detail": Zhejiang Province Hangzhou Yuhang District Cangqian Street.
S110: Concatenate the address names in each administrative division of the original address and the detailed address into a character string.
For example, concatenating the address names in each administrative division and the detailed address of the address data {province: empty, city: empty, area: empty, "address_detail": Zhejiang Province Hangzhou Yuhang District Cangqian Street} yields the character string "Zhejiang Province Hangzhou Yuhang District Cangqian Street".
S111: Perform word segmentation on the character string to obtain tokens.
For example, word segmentation of "Zhejiang Province Hangzhou Yuhang District Cangqian Street" yields the tokens "Zhejiang Province", "Hangzhou", "Yuhang District" and "Cangqian Street".
For specific word segmentation algorithms, reference may be made to the prior art; they are not described here.
S112: Look up, in a preset address dictionary, the administrative division matched by each token.
Continuing the example, the administrative divisions found for the tokens are: province: Zhejiang, city: Hangzhou, area: Yuhang District, town: Cangqian Street.
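Steps S111 and S112 can be sketched as follows. The text does not name a segmentation algorithm, so forward maximum matching is used here purely as an assumed stand-in; `ADDRESS_DICT`, `segment` and `recognize_by_segmentation` are hypothetical names, and the dictionary holds only the four names from the example.

```python
# Hypothetical address dictionary: address name -> administrative field.
ADDRESS_DICT = {
    "Zhejiang Province": "province",
    "Hangzhou": "city",
    "Yuhang District": "area",
    "Cangqian Street": "town",
}

def segment(text, vocabulary):
    """Forward maximum matching: at each position take the longest known word."""
    max_len = max(map(len, vocabulary))
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocabulary:
                tokens.append(piece)
                i += size
                break
        else:
            i += 1  # no known word starts here (e.g. a space), so skip one character
    return tokens

def recognize_by_segmentation(text):
    """Map each token found in the text to its administrative field."""
    return {ADDRESS_DICT[token]: token for token in segment(text, ADDRESS_DICT)}

text = "Zhejiang Province Hangzhou Yuhang District Cangqian Street"
print(segment(text, ADDRESS_DICT))
# ['Zhejiang Province', 'Hangzhou', 'Yuhang District', 'Cangqian Street']
```

Characters that start no dictionary entry are skipped, so the same routine works whether or not the concatenated string contains separators.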
As the steps shown in Fig. 1 make clear, the address recognition method of this embodiment can convert an address that does not conform to the automatic sorting rules into one that conforms to the address sorting rules, thereby laying the foundation for subsequent automatic sorting.
In the above process, S102 and S106 may be combined and performed in step S102, and S103 and S105 may be combined and performed in step S103.
It should be noted that, for ease of operation and identification, each address in the preset address database is usually represented by a unique number. For example, the addresses in the address database are represented with a data structure containing the fields areaID, name, abbName and parentID.
Here areaID is the unique code of the administrative division in the address database, with the value 123456; name is the address name, with the value Hangzhou; abbName is the abbreviated address name, with the value Hangzhou; and parentID is the unique code of the upper-level administrative division that governs the current administrative division, with the value 234567. In the address database this upper-level division is called the parent object of the current address object.
In this data structure, areaID, name, parentID and abbName correspond to one another.
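A record with these four fields might be modelled as below. This is a reconstruction from the field descriptions only: the class name is hypothetical, and storing the codes as strings is an assumption made so that a value such as "000086" keeps its leading zeros.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AddressRecord:
    areaID: str    # unique code of this administrative division
    name: str      # address name
    abbName: str   # abbreviated address name
    parentID: str  # areaID of the upper-level (parent) administrative division

# Values taken from the description of the example record.
hangzhou = AddressRecord(areaID="123456", name="Hangzhou",
                         abbName="Hangzhou", parentID="234567")
print(hangzhou.name, hangzhou.parentID)  # Hangzhou 234567
```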
Therefore, S103 in Fig. 1 (searching the preset address database for a target address identical to the extracted address name) can be implemented as follows: search the preset address database for an address name identical to the extracted address name, and use it as the target address.
S104: Whether the target address is of "area" or lower administrative level can be judged from the code corresponding to the target address.
The code corresponding to the target address is the value of areaID in the address data structure that contains the target address. In the example above, the code corresponding to the target address name "Hangzhou" is "123456".
S105: The set of upper-level addresses of the target address can be obtained from the preset address database by finding the upper-level address of the target address according to the parentID value corresponding to the target address.
Specifically, an address data structure normally contains only one address, and that address has only one upper-level address. The address data structure whose areaID equals the parentID value in the target address's data structure can therefore be found, and the name value in that data structure is the upper-level address of the target address. Continuing in the same way, all upper-level addresses are found in turn by following the "parentID" of each address data structure level by level, and the address chain is finally formed from the "parentID" values of the address data structures found.
In the example above, the parentID value corresponding to the target address name "Hangzhou" is "234567", so the address data structure whose areaID is "234567" is looked up in the address database.
In the address data structure whose areaID is 234567, the value of name is "Zhejiang Province". It follows that "Zhejiang Province" is the upper-level address of "Hangzhou", so the upper-level address corresponding to the target address name "Hangzhou" has been found.
Further, the search continues for the upper-level address data structure of "Zhejiang Province" according to the parentID "000086" in the address data structure whose areaID is 234567; the details are not repeated here. The process continues in this way until all upper-level addresses of the target address have been found.
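The parentID walk described above can be sketched as a simple loop. The two records mirror the values quoted in the text; the dictionary layout and the function name are hypothetical. The record for "000086" is not given in the text, so it is deliberately left out and the walk stops there.

```python
# areaID -> record, using only the values quoted in the text.
ADDRESS_DB = {
    "123456": {"name": "Hangzhou", "parentID": "234567"},
    "234567": {"name": "Zhejiang Province", "parentID": "000086"},
}

def upper_level_chain(area_id, db):
    """Follow parentID links upward; return the names found, low to high."""
    chain, seen = [], {area_id}
    parent_id = db[area_id]["parentID"]
    # Stop when the parent is not in the database or a cycle is detected.
    while parent_id in db and parent_id not in seen:
        seen.add(parent_id)
        chain.append(db[parent_id]["name"])
        parent_id = db[parent_id]["parentID"]
    return chain

print(upper_level_chain("123456", ADDRESS_DB))  # ['Zhejiang Province']
```

The `seen` set is a defensive addition, not something the text requires; it only prevents an endless loop if the data contained a circular parent reference.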
Based on the steps shown in Fig. 1, Fig. 2 shows the flow of an address sorting method, which includes the following specific steps:
S201: Judge whether the original address can be sorted successfully; if so, sort it; if not, perform S203.
In this embodiment, being successfully sortable means not only that the address reaches the lowest administrative level required for automatic sorting (or, in some cases, manual sorting) as described in the previous embodiment, but also that the original address meets the other requirements of sorting, for example that no administrative division is missing and no administrative division is misplaced.
S202: Perform address chain recognition on the original address.
Specifically, this is implemented by performing S102 to S108.
S203: Judge whether the address chain of the original address has been recognized; if so, perform S205; if not, perform S204.
S204: Perform word segmentation recognition on the original address.
Specifically, this is implemented by performing S109 to S112.
S205: Sort the address using the recognition result.
If neither address chain recognition nor word segmentation recognition yields an address that meets the sorting rules, the mail must be sorted in another way, such as manual sorting.
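The control flow of S201 to S205 can be sketched as a small dispatcher. This is an illustrative sketch under assumptions: the function name `handle_address`, the route labels, and the three callables passed in are hypothetical, and the sketch assumes that an address failing the S201 check goes through address chain recognition (S202) before the S203 judgment.

```python
from typing import Callable, List, Optional, Tuple

# A recognizer returns the recognized address names, or None on failure.
Recognizer = Callable[[dict], Optional[List[str]]]


def handle_address(
    original: dict,
    can_sort_directly: Callable[[dict], bool],
    recognize_chain: Recognizer,
    recognize_by_segmentation: Recognizer,
) -> Tuple[str, Optional[List[str]]]:
    """Return (route, recognition_result) for one original address."""
    # S201: an address that already satisfies the sorting rules is sorted as-is.
    if can_sort_directly(original):
        return "sorted_directly", None
    # S202: address chain recognition (S102 to S108 in the text).
    result = recognize_chain(original)
    # S203/S204: fall back to word segmentation recognition (S109 to S112).
    if result is None:
        result = recognize_by_segmentation(original)
    # Neither method produced a usable address: hand over to manual sorting.
    if result is None:
        return "manual", None
    # S205: sort using the recognition result.
    return "sorted_by_recognition", result
```

Passing the two recognizers as parameters keeps the flow independent of how chain recognition and word segmentation recognition are actually implemented.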
In summary, the address chain recognition method does not require the address supplied by the user to be complete: it may contain information such as province and town, province and district, city and district, or district and town. Nor does it require each name to be in the correct position; as long as the names are correctly ordered by administrative level and the last-level address is a sorting unit, the correct administrative-district address chain can be recognized. The word segmentation recognition method can recognize the administrative divisions of an original address whose administrative-district address is wrong but whose detailed address is correct. Practice shows that after the method shown in Fig. 3 was adopted, the sorting matching rate rose from the original 91.8% to 97.5%.
Fig. 3 shows an address recognition device disclosed in an embodiment of the present application, including an extraction module 301, a lookup module 302, and a first recognition module 303.
The extraction module 301 is configured to extract the address name with the lowest administrative level from an original address. The original address includes this address name and an original upper-level address set, the original upper-level address set being the set of the other address names in the original address apart from this address name.
Further, the extraction module may extract the address name with the lowest administrative level from the original address as follows: extract an address name from each non-empty administrative division in the original address, and select the address name with the lowest administrative level from the extracted address names.
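The extraction step can be sketched as below. This is a minimal sketch under assumptions: the field names in `LEVELS`, the representation of an original address as a dictionary, and the function name `extract_lowest` are hypothetical, and the sketch assumes that the fields are ordered from highest to lowest administrative level so that the last non-empty field holds the lowest-level name.

```python
from typing import Dict, List, Optional, Tuple

# Assumed administrative-division fields, ordered from highest to lowest level.
LEVELS = ["province", "city", "district", "town"]


def extract_lowest(original: Dict[str, str]) -> Optional[Tuple[str, List[str]]]:
    """Return (lowest-level name, original upper-level address set), or None."""
    # Keep only the non-empty administrative divisions, in level order.
    names = [(original.get(level) or "").strip() for level in LEVELS]
    non_empty = [name for name in names if name]
    if not non_empty:
        return None
    # The last non-empty field is the lowest level; the rest form the upper set.
    return non_empty[-1], non_empty[:-1]
```

For an address whose fields are filled as province "Zhejiang Province" and district "Hangzhou" (a misplaced but correctly ordered entry), the sketch yields "Hangzhou" as the lowest-level name and ["Zhejiang Province"] as the original upper-level address set.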
The lookup module 302 is configured to look up, in a preset address database, a target address identical to the extracted address name with the lowest administrative level.
The first recognition module 303 is configured to judge whether the original upper-level address set in the original address and the target upper-level address set of the target address meet a matching condition, and if so, to take the address chain composed of the target address and the addresses in the target upper-level address set as the recognition result of the original address.
The target upper-level address set is the set formed by arranging the upper-level addresses of the target address in the address database in sequence by administrative level. The matching condition includes: the original upper-level address set is a subset of the target upper-level address set, and the relative order of the address names in the original upper-level address set is the same as the relative order of the identical address names in the target upper-level address set.
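The matching condition (subset plus identical relative order) can be checked as in the following sketch. The function name `meets_matching_condition` and the use of ordered Python lists to represent the two sets are assumptions for illustration; the sketch also assumes that names within the target upper-level address set are unique.

```python
from typing import List


def meets_matching_condition(original_upper: List[str],
                             target_upper: List[str]) -> bool:
    """Check the subset condition and the relative-order condition."""
    # Condition 1: every original upper-level name appears in the target set.
    if not set(original_upper) <= set(target_upper):
        return False
    # Condition 2: the shared names appear in the same relative order, i.e.
    # their positions in the target set are strictly increasing.
    positions = [target_upper.index(name) for name in original_upper]
    return all(a < b for a, b in zip(positions, positions[1:]))
```

With a target upper-level address set of ["China", "Zhejiang Province", "Hangzhou"] (hypothetical data), the original set ["Zhejiang Province", "Hangzhou"] matches, while ["Hangzhou", "Zhejiang Province"] fails the order condition and ["Jiangsu Province"] fails the subset condition. An empty original upper-level set matches trivially.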
Optionally, the first recognition module 303 is further configured to: before judging whether the original upper-level address set in the original address and the target upper-level address set meet the matching condition, determine that the administrative level of the target address meets the requirements of the sorting rules.
Optionally, the device of this embodiment may further include:
A second recognition module 304, configured to: if the original upper-level address set and the target upper-level address set do not meet the matching condition, splice the address names in each administrative division and in the detailed address of the original address into a character string; perform word segmentation on the character string to obtain the segmented words; and look up, in a preset address dictionary, the administrative division matched by each segmented word, the administrative divisions matched by the segmented words being taken as the recognition result of the original address.
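The splice, segment, and look-up sequence performed by the second recognition module can be sketched as follows. The text does not name a segmentation algorithm, so this sketch substitutes forward maximum matching against the address dictionary as an assumption; the dictionary contents, the field names in `FIELDS`, and the function names are likewise hypothetical.

```python
from typing import Collection, Dict, List

# Hypothetical address dictionary: address name -> administrative level.
ADDRESS_DICT: Dict[str, str] = {
    "浙江省": "province",
    "杭州市": "city",
    "西湖区": "district",
}

# Assumed field order of an original address record.
FIELDS = ["province", "city", "district", "town", "detail"]


def segment(text: str, vocabulary: Collection[str]) -> List[str]:
    """Forward maximum matching: take the longest known word at each position."""
    max_len = max((len(word) for word in vocabulary), default=1)
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan advances.
            if size == 1 or piece in vocabulary:
                words.append(piece)
                i += size
                break
    return words


def recognize_by_segmentation(original: Dict[str, str]) -> Dict[str, str]:
    """Splice all fields, segment, and map matched words to their levels."""
    spliced = "".join(original.get(field) or "" for field in FIELDS)
    recognized: Dict[str, str] = {}
    for word in segment(spliced, ADDRESS_DICT):
        level = ADDRESS_DICT.get(word)
        if level is not None:
            recognized.setdefault(level, word)  # keep the first match per level
    return recognized
```

Because all fields are spliced before segmentation, an administrative name typed into the wrong field or into the detailed address can still be matched against the dictionary.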
Optionally, the device of this embodiment may further include an address sorting module (not shown in Fig. 3), configured to sort the address using the recognition result of the original address after the second recognition module has looked up, in the preset address dictionary, the administrative division matched by each segmented word.
Further, as in the foregoing method embodiment, the preset database includes address data structures, each of which includes a number, an upper-level address number, and an address name; together these constitute a complete address data structure. In this case, the extraction module looks up, in the preset address database, the target address identical to the extracted address name with the lowest administrative level as follows: it looks up, in the preset address database, the address name identical to the extracted address name with the lowest administrative level and takes it as the target address.
Moreover, based on the above address data structure, the extraction module is further configured to: find the upper-level address of the target address in the preset address database according to the upper-level address number corresponding to the target address, and so on until all upper-level addresses of the target address are found, thereby forming the target upper-level address set.
The address recognition device of this embodiment can recognize non-standard addresses, which helps improve the sorting success rate. For the specific flow by which the device recognizes addresses and how it combines with the sorting process, refer to the method embodiments above; details are not repeated here.
If the functions of the method of the embodiments of the present application are implemented in the form of software functional units and sold or used as independent products, they may be stored in a storage medium readable by a computing device. Based on this understanding, the part of the embodiments that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a computing device (which may be a personal computer, a server, a mobile computing device, a network device, or the like) to perform all or some of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the others; for the parts that are the same or similar, the embodiments may be referred to one another.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

1. An address recognition method, characterized by comprising:
extracting the address name with the lowest administrative level from an original address, wherein the original address includes the address name and an original upper-level address set, the original upper-level address set being the set of the other address names in the original address apart from the address name;
looking up, in a preset address database, a target address identical to the extracted address name with the lowest administrative level;
judging whether the original upper-level address set in the original address and the target upper-level address set of the target address meet a matching condition, and if so, taking the address chain composed of the target address and the addresses in the target upper-level address set as the recognition result of the original address;
wherein the target upper-level address set is the set formed by arranging the upper-level addresses of the target address in the address database in sequence by administrative level, and the matching condition includes: the original upper-level address set is a subset of the target upper-level address set, and the relative order of the address names in the original upper-level address set is the same as the relative order of the identical address names in the target upper-level address set.
2. The method according to claim 1, characterized in that, before judging whether the original upper-level address set in the original address and the target upper-level address set of the target address meet the matching condition, the method further comprises:
determining that the administrative level of the target address meets the requirements of the sorting rules.
3. The method according to claim 2, characterized in that extracting the address name with the lowest administrative level from the original address comprises:
extracting an address name from each non-empty administrative division in the original address;
selecting the address name with the lowest administrative level from the extracted address names.
4. The method according to claim 3, characterized in that the preset database includes address data structures, each address data structure including a number, an upper-level address number, and an address name; and looking up, in the preset address database, the target address identical to the extracted address name with the lowest administrative level comprises:
looking up, in the preset address database, the address name identical to the extracted address name with the lowest administrative level, and taking it as the target address;
determining the target upper-level address set of the target address comprises: finding the upper-level address of the target address in the preset address database according to the upper-level address number corresponding to the target address, and so on until all upper-level addresses of the target address are found, thereby forming the target upper-level address set.
5. The method according to any one of claims 1 to 4, characterized by further comprising:
if the original upper-level address set and the target upper-level address set do not meet the matching condition, splicing the address names in each administrative division and in the detailed address of the original address into a character string;
performing word segmentation on the character string to obtain the segmented words;
looking up, in a preset address dictionary, the administrative division matched by each segmented word, and taking the administrative divisions matched by the segmented words as the recognition result of the original address.
6. The method according to claim 5, characterized in that, after looking up, in the preset address dictionary, the administrative division matched by each segmented word, the method further comprises:
sorting the address using the recognition result of the original address.
7. An address recognition device, characterized by comprising:
an extraction module, configured to extract the address name with the lowest administrative level from an original address, wherein the original address includes the address name and an original upper-level address set, the original upper-level address set being the set of the other address names in the original address apart from the address name;
a lookup module, configured to look up, in a preset address database, a target address identical to the extracted address name with the lowest administrative level;
a first recognition module, configured to judge whether the original upper-level address set in the original address and the target upper-level address set of the target address meet a matching condition, and if so, to take the address chain composed of the target address and the addresses in the target upper-level address set as the recognition result of the original address;
wherein the target upper-level address set is the set formed by arranging the upper-level addresses of the target address in the address database in sequence by administrative level, and the matching condition includes: the original upper-level address set is a subset of the target upper-level address set, and the relative order of the address names in the original upper-level address set is the same as the relative order of the identical address names in the target upper-level address set.
8. The device according to claim 7, characterized in that the first recognition module is further configured to:
before judging whether the original upper-level address set in the original address and the target upper-level address set of the target address meet the matching condition, determine that the administrative level of the target address meets the requirements of the sorting rules.
9. The device according to claim 8, characterized in that the extraction module being configured to extract the address name with the lowest administrative level from the original address comprises:
the extraction module being specifically configured to extract an address name from each non-empty administrative division in the original address, and to select the address name with the lowest administrative level from the extracted address names.
10. The device according to claim 9, characterized in that the preset database includes address data structures, each address data structure including a number, an upper-level address number, and an address name;
the extraction module is specifically configured to look up, in the preset address database, the address name identical to the extracted address name with the lowest administrative level, and to take it as the target address;
the extraction module is further configured to: find the upper-level address of the target address in the preset address database according to the upper-level address number corresponding to the target address, and so on until all upper-level addresses of the target address are found, thereby forming the target upper-level address set.
11. The device according to any one of claims 7 to 10, characterized by further comprising:
a second recognition module, configured to: if the original upper-level address set and the target upper-level address set do not meet the matching condition, splice the address names in each administrative division and in the detailed address of the original address into a character string; perform word segmentation on the character string to obtain the segmented words; and look up, in a preset address dictionary, the administrative division matched by each segmented word, the administrative divisions matched by the segmented words being taken as the recognition result of the original address.
12. The device according to claim 11, characterized by further comprising:
an address sorting module, configured to sort the address using the recognition result of the original address after the second recognition module has looked up, in the preset address dictionary, the administrative division matched by each segmented word.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610016741.8A CN106959961A (en) 2016-01-11 2016-01-11 A kind of Address Recognition method and device

Publications (1)

Publication Number Publication Date
CN106959961A true CN106959961A (en) 2017-07-18

Family

ID=59480760

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610016741.8A Pending CN106959961A (en) 2016-01-11 2016-01-11 A kind of Address Recognition method and device

Country Status (1)

Country Link
CN (1) CN106959961A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20180403

Address after: P.O. Box 847, fourth floor, Capital Building, Grand Cayman, Cayman Islands

Applicant after: CAINIAO SMART LOGISTICS HOLDING Ltd.

Address before: P.O. Box 847, fourth floor, Capital Building, Grand Cayman, Cayman Islands

Applicant before: ALIBABA GROUP HOLDING Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20170718