CN110442603A - Address matching method, apparatus, computer equipment and storage medium
- Publication number
- CN110442603A (application number CN201910601364.8A)
- Authority
- CN
- China
- Prior art keywords
- address
- matching
- participle
- segmentation
- segmentations
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24564—Applying rules; Deductive queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2468—Fuzzy queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
Abstract
This application discloses an address matching method, apparatus, computer device and storage medium. In the address matching method, the first address is the address to be retrieved that the user inputs, and the second address is stored in an index server. The method includes: calling a preset matching algorithm and tokenizing the first address and the second address according to a first preset rule, to obtain a first token group corresponding to the first address and a second token group corresponding to the second address, where the preset matching algorithm includes a word-segmentation calculation and a matching calculation; dividing the first address into multiple first segments according to the first token group, and dividing the second address into multiple second segments according to the second token group; obtaining the matching result of the first segments and the second segments according to a second preset rule, and judging whether the first address and the second address are the same. The first four administrative levels of a segmented address are matched exactly against the national province / city / district, county / town, township address database (tree-structured), and missing levels are effectively completed.
Description
Technical field
This application relates to the computer field, and in particular to an address matching method, apparatus, computer device and storage medium.
Background
Traditional fuzzy address matching is usually NLP-based and treats the address as one complete unit. This approach has the following defects: 1) An address is a tree of address names, and the closer a match lies to the bottom of the tree, the closer the two addresses are; yet the address names are compared as a whole, as a flat structure, which does not reflect how address names are actually distributed. 2) Short addresses compare poorly, even though most short addresses are of relatively high value. 3) The names that make up one address do not carry equal value as individual words. For example, in Shenzhen City / Nanshan District / Tencent Building, the name "Tencent Building" is clearly the more valuable part as an effective address.
Summary of the invention
The main purpose of this application is to provide an address matching method that solves the technical problem that existing address matching is defective.
This application proposes an address matching method, in which the first address is the address to be retrieved that the user inputs and the second address is stored in an index server. The method includes:
calling a preset matching algorithm and tokenizing the first address and the second address according to a first preset rule, to obtain a first token group corresponding to the first address and a second token group corresponding to the second address, where the preset matching algorithm includes a word-segmentation calculation and a matching calculation;
dividing the first address into multiple first segments according to the first token group, and dividing the second address into multiple second segments according to the second token group;
obtaining the matching result of all the first segments and all the second segments according to a second preset rule;
judging, according to the matching result, whether the first address and the second address are the same.
This application also provides an address matching apparatus, in which the first address is the address to be retrieved that the user inputs and the second address is stored in an index server. The apparatus includes:
a word-segmentation module, for calling a preset matching algorithm and tokenizing the first address and the second address according to a first preset rule, to obtain a first token group corresponding to the first address and a second token group corresponding to the second address, where the preset matching algorithm includes a word-segmentation calculation and a matching calculation;
a division module, for dividing the first address into multiple first segments according to the first token group, and dividing the second address into multiple second segments according to the second token group;
a second obtaining module, for obtaining the matching result of all the first segments and all the second segments according to a second preset rule;
a judgment module, for judging, according to the matching result, whether the first address and the second address are the same.
This application also provides a computer device, including a memory and a processor. The memory stores a computer program, and the processor implements the steps of the above method when executing the computer program.
This application also provides a computer-readable storage medium on which a computer program is stored. The steps of the above method are implemented when the computer program is executed by a processor.
For the first four administrative levels of a segmented address, this application matches exactly against the national province / city / district, county / town, township address database (tree-structured) and, in addition, effectively completes missing levels. The data pre-stored in the index server of this application are unstructured data, stored as key-value pairs in columnar form. Unstructured data here means text, images, voice and the like, held in column stores built on NoSQL storage technology. The data volume is very large, so storage and computation require NoSQL technology with a distributed architecture. The index server combines the distributed storage architecture of NoSQL with an index structure to achieve real-time, fast query and computation over massive data. This application proposes a configurable-weight address matching model based on dividing addresses by level. A natural language processing model first tokenizes the address name into a token group; the tokens are divided into segments by administrative level, and each segment is mapped to a node in a tree structure. This takes the tree structure of addresses fully into account: the address is divided into segments by administrative level, each level is assigned its own weight, and the weights can be fine-tuned for the actual business scenario. By building an index structure over the massive data pre-stored in the index server, and combining it with the computing architecture and the powerful distributed computing capability of the Elasticsearch component, this application queries the first address against the preset index structure quickly and in real time. The default weights of this application are obtained by training a model. The training parameters, which include each weight value, are adjusted continually during training until the similarity output by the model agrees with the pre-labeled similarity value or falls within a preset deviation range. Each weight value is determined in this way, which makes the weight settings more reliable.
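The weight training described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the similarity is a weighted sum of per-level matching values and that the weights are fitted by plain stochastic gradient descent on the squared error; the patent text does not specify the training procedure, and the function name, learning rate, epoch count and tolerance are all hypothetical.

```python
def fit_weights(samples, n_levels, lr=0.05, epochs=2000, tolerance=0.01):
    """Fit one weight per level from (match_values, labelled_similarity) pairs.

    Assumption: similarity = sum(weight * match_value). Training stops once
    every sample's error is within `tolerance` (the preset deviation range).
    """
    weights = [1.0 / n_levels] * n_levels  # start from equal weights
    for _ in range(epochs):
        worst = 0.0
        for values, target in samples:
            error = sum(w * v for w, v in zip(weights, values)) - target
            worst = max(worst, abs(error))
            # Gradient step on the squared error of this sample.
            weights = [w - lr * error * v for w, v in zip(weights, values)]
        if worst <= tolerance:
            break
    return weights


# Hypothetical labelled samples for a two-level toy model.
samples = [([1, 0], 0.2), ([0, 1], 0.8), ([1, 1], 1.0)]
weights = fit_weights(samples, n_levels=2)
```

With these toy samples the fitted weights approach 0.2 and 0.8, the only pair consistent with all three labels.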
Brief description of the drawings
Fig. 1 is a schematic flow diagram of the address matching method of an embodiment of this application;
Fig. 2 is a schematic structural diagram of the address matching apparatus of an embodiment of this application;
Fig. 3 is a schematic diagram of the internal structure of the computer device of an embodiment of this application.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of this application clearer, the application is further described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the application and do not limit it.
Referring to Fig. 1, in the address matching method of an embodiment of this application, the first address is the address to be retrieved that the user inputs, the second address is stored in an index server, and the method includes:
S1: Call a preset matching algorithm and tokenize the first address and the second address according to a first preset rule, to obtain a first token group corresponding to the first address and a second token group corresponding to the second address, where the preset matching algorithm includes a word-segmentation calculation and a matching calculation.
This embodiment compares the similarity of the first address and the second address. Both addresses are written by administrative level from high to low, from the broad region down to the specific location. The first preset rule of this embodiment applies different tokenization rules depending on the administrative level within the address. For example, the tokens for the four nationally common administrative levels (province / city / district, county / town, township) are usually obtained with the national general address database. For example, "Guangdong Province Foshan City Nanhai District Guicheng Town" is tokenized as: Guangdong Province / Foshan City / Nanhai District / Guicheng Town. Address information outside these four administrative levels is tokenized by semantic word segmentation.
S2: Divide the first address into multiple first segments according to the first token group, and divide the second address into multiple second segments according to the second token group.
This embodiment divides an address into segments and/or administrative levels according to its token group; each segment or administrative level corresponds to one or more tokens. "First", "second" and the like are used in this embodiment only to distinguish the first segments of the first address from the second segments of the second address, not to limit them; similar terms elsewhere serve the same purpose and are not explained again. A token group is the arrangement of the tokens of the actual address, in the writing order of the original address. For example, the long name "X City Development Zone" yields two tokens, "X City / Development Zone". Segmentation, however, is carried out on top of tokenization according to administrative level, so "X City Development Zone" forms a single segment.
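The difference between tokens and segments can be sketched as follows. This is a minimal illustration, assuming each token has already been labelled with its administrative level; the labels and the function name are hypothetical and not taken from the patent.

```python
from itertools import groupby


def group_into_segments(tagged_tokens):
    """Merge consecutive tokens that share an administrative level.

    tagged_tokens: list of (token, level) pairs in the original writing order.
    Returns a list of (level, [tokens]) segments.
    """
    return [
        (level, [token for token, _ in run])
        for level, run in groupby(tagged_tokens, key=lambda pair: pair[1])
    ]


# "X City Development Zone" is two tokens but one segment (assumed level labels).
tagged = [
    ("Guangdong Province", "province"),
    ("X City", "city"),
    ("Development Zone", "city"),
]
segments = group_into_segments(tagged)
```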
S3: Obtain the matching result of all the first segments and all the second segments according to a second preset rule.
This embodiment matches the first segments and the second segments one by one according to the correspondence of administrative levels, and so obtains their matching result. For example, the first segment at the province level of the first address is compared with the second segment at the province level of the second address, which improves the symmetry and reliability of the comparison.
S4: Judge, according to the matching result, whether the first address and the second address are the same.
This embodiment compares the first address and the second address level by level according to the correspondence of administrative levels. When the matching rate of the first address and the second address reaches a preset range, the two addresses are judged to be the same; otherwise they are judged to be different. In other embodiments, this application requires not only that the matching rate reach the preset range but also that the matching degree at a specified administrative level reach 100% before the first address and the second address are judged to be the same, which improves matching accuracy.
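The decision in S4 can be sketched as a threshold check with an optional per-level requirement. This is only an illustration: the threshold of 0.9, the level names and the function name are assumptions, since the text only speaks of a "preset range" and a "specified administrative level".

```python
def is_same_address(match_rate, level_scores, threshold=0.9, required_levels=()):
    """Judge two addresses identical from the overall and per-level results.

    match_rate: overall matching rate of the two addresses.
    level_scores: dict mapping a level name to its matching value in [0, 1].
    required_levels: levels that must match 100% (the stricter embodiment).
    """
    if match_rate < threshold:
        return False
    return all(level_scores.get(level, 0.0) == 1.0 for level in required_levels)


scores = {"tag": 1.0, "detail": 0.5}  # hypothetical per-level matching values
same_basic = is_same_address(0.95, scores)
same_strict = is_same_address(0.95, scores, required_levels=("detail",))
```

Here the basic embodiment accepts the pair, while the stricter one rejects it because the detail level only matches 50%.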
The first address of this embodiment is the address to be queried that the user inputs. Its data composition is not restricted: matching can be computed for any address to be queried, which gives the user more flexibility and freedom. For example, the first address may consist of data arranged in order of the six administrative levels province / city / district, county / town, township / road, residential community, building / block and number, or of data lacking one or several of those levels. The preset matching condition of this embodiment includes the matching rate reaching a preset threshold, or the marker data in the first address reaching a 100% match, and so on. Marker data is the data in the first address that describes the geographic location in detail, such as the name of a residential community or of a building. For example, "Jiangnan Mingju Community Rongyuan" contained in the first address is marker data. In another embodiment of this application, the marker data of the first address is the data after the "town, township" level and before the "block and number".
Further, the first address and the second address each include a range address and a tag address. Step S1, calling the preset matching algorithm and tokenizing the first address and the second address according to the first preset rule to obtain the first token group corresponding to the first address and the second token group corresponding to the second address, includes:
S11: Tokenize the range addresses of the first address and the second address according to the address dictionary pre-associated with the natural language processing model, to obtain the first token part of the first address and the first token part of the second address.
The range address of this embodiment includes at least one of the four administrative levels province / city / district, county / town, township. The range address is tokenized with a pre-associated address dictionary, namely the dictionary of the national address database, which is associated with the natural language processing model in advance so that address names can be tokenized. The preset matching algorithm of this embodiment includes a word-segmentation calculation and a matching calculation. To improve matching precision, when the open-source word-segmentation package jieba performs the word-segmentation calculation, a crawled address database is added and used together with the national address database to correct the address to be tokenized, which is then tokenized by administrative level; this improves tokenization accuracy. The method judges whether the administrative levels contained in the current address are levels covered by the address dictionary and, if so, calls the address dictionary for the word-segmentation calculation. For example, the address "Guangdong Province Foshan City Nanhai District Guicheng Town Jiangnan Mingju Community Rongyuan Block 1 No. 306" includes the four administrative levels covered by the address dictionary, so those four levels are tokenized according to the dictionary, with the following result: Guangdong Province / Foshan City / Nanhai District / Guicheng Town / Jiangnan Mingju Community Rongyuan Block 1 No. 306. The first token part is then Guangdong Province / Foshan City / Nanhai District / Guicheng Town.
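Dictionary-driven tokenization of the range address can be sketched with plain prefix matching. This is a simplified stand-in and not the jieba-based procedure of the embodiment: the dictionary excerpt, the level names and the function name are hypothetical, and English renderings of the place names are used for readability.

```python
# Hypothetical excerpt of an address dictionary, ordered from highest level down.
ADMIN_DICTIONARY = [
    ("province", {"Guangdong Province"}),
    ("city", {"Foshan City", "Shenzhen City"}),
    ("district", {"Nanhai District", "Nanshan District"}),
    ("town", {"Guicheng Town"}),
]


def split_range_address(address):
    """Peel administrative names off the front of `address`, level by level.

    Returns (tokens, remainder). A level that is absent is skipped, so short
    addresses that lack some levels are still handled.
    """
    tokens, rest = [], address.strip()
    for level, names in ADMIN_DICTIONARY:
        # Prefer the longest dictionary entry that prefixes the remaining text.
        for name in sorted(names, key=len, reverse=True):
            if rest.startswith(name):
                tokens.append((level, name))
                rest = rest[len(name):].strip()
                break
    return tokens, rest


tokens, rest = split_range_address(
    "Guangdong Province Foshan City Nanhai District Guicheng Town "
    "Jiangnan Mingju Community Rongyuan Block 1 No. 306"
)
```

The remainder returned here is the part that the tag-address and details-address steps would go on to tokenize.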
S12: Tokenize the tag addresses of the first address and the second address according to the first syntactic model in the natural language processing model, to obtain the second token part of the first address and the second token part of the second address.
The tag address of this embodiment contains the information that describes the geographic location in detail, such as the name of a residential community or of a building, for example "Jiangnan Mingju Community Rongyuan" in the address above. This embodiment tokenizes the tag address according to the first syntactic model in the natural language processing model. The first syntactic model includes, but is not limited to, patterns such as "XX Community" and "XX Building". For example, for "Guicheng Town Jiangnan Mingju Community Rongyuan Block 1 No. 306", the corresponding second token part is "Guicheng / Jiangnan Mingju Community / Rongyuan". In another embodiment of this application, the first syntactic model extracts the characters after "town, township" and before "block and number" as the tag address.
S13: Combine the first token part and the second token part of the first address into the first token group corresponding to the first address, and combine the first token part and the second token part of the second address into the second token group corresponding to the second address.
The first address or the second address of this embodiment includes a range address and a tag address, arranged from left to right to form the address. For example, the first address is "Guangdong Province Foshan City Nanhai District Guicheng Town Jiangnan Mingju Community Rongyuan" and the second address is "Guangdong Province Foshan City Nanhai District Guicheng Town Jiangnan Mingju Rongyuan". The first token group corresponding to the first address is "Guangdong Province / Foshan City / Nanhai District / Guicheng Town / Jiangnan Mingju Community / Rongyuan", and the second token group corresponding to the second address is "Guangdong Province / Foshan City / Nanhai District / Guicheng Town / Jiangnan Mingju / Rongyuan".
Further, the first address and the second address each also include a details address. After step S12, in which the tag addresses of the first address and the second address are tokenized according to the first syntactic model in the natural language processing model to obtain the second token part of the first address and the second token part of the second address, the method includes:
S14: Tokenize the details addresses of the first address and the second address according to the second syntactic model in the natural language processing model, to obtain the third token part of the first address and the third token part of the second address.
The details address of this embodiment is the specific "block and number". It has only a small effect on matching the similarity of two addresses, and in other embodiments this part can even be ignored. Certain application scenarios, however, need precision down to the details address to meet business requirements. The second syntactic model of this embodiment includes, but is not limited to, patterns such as "No. X", "Block X Floor X" and "Block X Floor X Room X".
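Tokenizing a details address can be sketched with a regular expression. This is a minimal illustration that assumes the details address reduces to its numeric fields (block, floor, room number); the function name is hypothetical, and a real second syntactic model would also need to handle non-numeric unit labels.

```python
import re


def tokenize_detail(detail):
    """Return the numeric fields of a details address in writing order."""
    return re.findall(r"\d+", detail)


first_detail = tokenize_detail("Block 1 No. 306")
second_detail = tokenize_detail("Block 1 No. 502")
```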
S15: Combine the first token part, the second token part and the third token part of the first address into the first token group corresponding to the first address, and combine the first token part, the second token part and the third token part of the second address into the second token group corresponding to the second address.
The first address or the second address of this embodiment includes a range address, a tag address and a details address, arranged from left to right to form the address. For example, the first address is "Guangdong Province Foshan City Nanhai District Guicheng Town Jiangnan Mingju Community Rongyuan Block 1 No. 306" and the second address is "Guangdong Province Foshan City Nanhai District Guicheng Town Jiangnan Mingju Rongyuan Block 1 No. 502". The first token group corresponding to the first address is "Guangdong Province / Foshan City / Nanhai District / Guicheng Town / Jiangnan Mingju Community / Rongyuan / 1 / 306", and the second token group corresponding to the second address is "Guangdong Province / Foshan City / Nanhai District / Guicheng Town / Jiangnan Mingju / Rongyuan / 1 / 502". The first address or the second address is then divided into segments or administrative levels according to these token groups.
Further, the range address includes the four administrative levels province / city / district, county / town, township, and the tag address includes a community name or a building name. Step S3, obtaining the matching result of all the first segments and all the second segments according to the second preset rule, includes:
S31: Map all the first segments and all the second segments, in order of administrative level from high to low, to two structure trees of identical structure, where each structure tree includes multiple nodes and each node corresponds one-to-one to a first segment or a second segment.
This embodiment maps all the first segments of the first address, or all the second segments of the second address, in order of administrative level from high to low, to two structure trees of identical structure. A node corresponds to at least one segment, or to multiple tokens of the same administrative level. For example, the token "Guangdong Province" of the highest administrative level contained in the first address, "province", serves as the root node; the token "Foshan City" of the next-level child node, "city", is connected next; and so on down to the end node, such as "Block 1 No. 502". Depending on the specific address information, the root node and the end node correspond to different administrative levels: the address can be a full address covering all administrative levels, or a short address covering only some of them.
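The mapping in S31 can be sketched as a chain of nodes, one per administrative level. This is an illustration under stated assumptions: the level names in `LEVELS` are hypothetical, and padding a missing level with an empty node is one possible way to give both trees the same structure, not something the patent text prescribes.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed level order, from the root (highest level) down to the end node.
LEVELS = ["province", "city", "district", "town", "tag", "detail"]


@dataclass
class Node:
    level: str
    tokens: List[str] = field(default_factory=list)
    child: Optional["Node"] = None


def build_tree(segments):
    """Build a root-to-leaf chain from a dict mapping level name to tokens.

    Levels missing from `segments` become empty nodes, so any two trees
    built this way can be compared node by node.
    """
    root = None
    for level in reversed(LEVELS):
        root = Node(level, list(segments.get(level, [])), root)
    return root


def levels_of(node):
    """List the level names along the chain, starting at `node`."""
    names = []
    while node is not None:
        names.append(node.level)
        node = node.child
    return names


tree = build_tree({"province": ["Guangdong Province"], "detail": ["1", "306"]})
```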
S32: Obtain the matching value of each node of the two structure trees.
The matching calculation of this embodiment follows the correspondence of administrative levels: it maps the nodes of the two structure trees to one another and computes the matching value of each node from that correspondence. The matching value is the number of matched segments divided by the number of all segments at the node. For example, if the "province" node of the first address is assigned "Guangdong" and the "province" node of the second address is also assigned "Guangdong", the nodes match; otherwise they do not.
S33: Obtain the first weight corresponding to the range address, the second weight corresponding to the tag address, and the third weight corresponding to the details address.
Because the segments of different administrative levels affect the address differently, this embodiment sets different weights for them, which makes it more flexible in meeting business requirements. For example, the second weight corresponding to the tag address is higher than the first weight corresponding to the range address.
S34: Calculate the matching rates by multiplying each matching value by its weight, to obtain the first matching rate corresponding to the range address, the second matching rate corresponding to the tag address, and the third matching rate corresponding to the details address.
The matching rate in this embodiment is calculated as follows: the matching result of each level multiplied by the weight configured for that segment gives the matching rate of the segment, and the matching rates of all segments are summed to give the matching result of the first address and the second address.
S35: Take the sum of the first matching rate, the second matching rate and the third matching rate as the matching result of all the first segments and all the second segments.
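Steps S34 and S35 can be written compactly as a formula. The symbols are notation introduced here for illustration and do not appear in the patent: $n$ is the number of nodes, $w_i$ the weight of node $i$, $m_i$ its matching value, and $R_1$, $R_2$, $R_3$ the partial sums over the range-address, tag-address and details-address nodes.

```latex
R = \sum_{i=1}^{n} w_i \, m_i = R_1 + R_2 + R_3,
\qquad
m_i = \frac{\text{number of matched segments at node } i}{\text{number of segments at node } i} \in [0, 1]
```

When the weights sum to 1, as the default weights in this embodiment's worked example do, $R$ also lies in $[0, 1]$ and can be compared directly with a preset threshold.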
Further, step S32, obtaining the matching value of each node of the two structure trees, includes:
S321: Match each first segment of the range address in the first address against each second segment of the range address in the second address, one to one according to the node correspondence, by exact full matching, to obtain each first matching value.
Nodes of different administrative levels are matched differently in this embodiment. The four administrative levels province / city / district, county / town, township are matched exactly: two nodes match only if their characters are 100% identical, and otherwise do not match. For example, if the "province" node of the first address is assigned "Guangdong" and the "province" node of the second address is also assigned "Guangdong", the nodes match.
S322: Match each first segment of the tag address in the first address against each second segment of the tag address in the second address, one to one according to the node correspondence, by model keyword matching, to obtain each second matching value.
This embodiment matches the segments of the tag address by NLP (Natural Language Processing) model matching, in which a containing or contained relationship counts as a match. For example, "Jiangnan Mingju Community / Rongyuan" and "Jiangnan Mingju / Rongyuan" are not character-for-character equal, but "Jiangnan Mingju Community" contains the characters "Jiangnan Mingju", so the segments still match one to one.
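The containment rule for tag addresses can be sketched with plain substring tests. This swaps in simple string containment for the NLP model matching named in the text, so it only illustrates the "containing or contained" criterion; both function names are hypothetical.

```python
def tag_tokens_match(a, b):
    """True if one non-empty tag token contains the other."""
    a, b = a.strip(), b.strip()
    return bool(a) and bool(b) and (a in b or b in a)


def tag_match_value(first, second):
    """Share of the first address's tag tokens matched at the same position."""
    if not first:
        return 0.0
    matched = sum(1 for a, b in zip(first, second) if tag_tokens_match(a, b))
    return matched / len(first)


value = tag_match_value(
    ["Jiangnan Mingju Community", "Rongyuan"],
    ["Jiangnan Mingju", "Rongyuan"],
)
```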
S323: Match each first segment of the details address in the first address against each second segment of the details address in the second address, one to one according to the node correspondence, by digit matching, to obtain each third matching value.
If the details address of this embodiment includes a first specified number of segments, of which a second specified number satisfy the matching relationship, then the matching value of the details address is the second specified number divided by the first specified number.
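The ratio described for the details address can be sketched directly. This assumes the fields are compared position by position, which the text implies but does not state outright; the function name is hypothetical.

```python
def detail_match_value(first, second):
    """Matched detail fields divided by the number of detail fields.

    first, second: numeric fields of the two details addresses, in order.
    """
    if not first:
        return 0.0
    matched = sum(1 for a, b in zip(first, second) if a == b)
    return matched / len(first)


ratio = detail_match_value(["1", "306"], ["1", "502"])
```

For "1 / 306" against "1 / 502", one of the two fields matches, so the matching value is 0.5.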
S324: Aggregate the first matching values, the second matching values, and the third matching values to obtain the matching value of each node of the two structure trees.
For example, the word-segmentation group of the first address is: Guangdong/Foshan/Nanhai/Guicheng/Jiangnan Mingju Community/Rongyuan/1/306; the word-segmentation group of the second address is: Guangdong/Foshan/Nanhai/Guicheng/Jiangnan Mingju/Rongyuan/1/502. After segmentation, the first address and the second address are each divided into six administrative levels (province; city; district or county; town or township; road, residential community, or building; house number), which correspond to six nodes whose default weights are "0.1/0.1/0.1/0.1/0.5/0.1". The first four administrative levels use 100% character matching: for Guangdong/Foshan/Nanhai/Guicheng the matching results are 0.1*1, 0.1*1, 0.1*1, and 0.1*1. The fifth administrative level uses model matching based on character containment: the matching result of "Jiangnan Mingju Community/Rongyuan" and "Jiangnan Mingju/Rongyuan" is 0.5*1. The sixth administrative level uses fuzzy matching: between 1/306 and 1/502, only one of the two corresponding fields matches, since 306 and 502 do not match, so the matching value is 0.5 and the matching result is 0.5*0.1, i.e., 0.05. The matching rate of the first address and the second address is therefore 0.1+0.1+0.1+0.1+0.5+0.05=0.95.
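The arithmetic of this example can be reproduced with a short sketch. The weights and matching values are taken from the example; the function and variable names are assumptions.

```python
# Default node weights and per-node matching values from the worked example.
WEIGHTS = [0.1, 0.1, 0.1, 0.1, 0.5, 0.1]
MATCH_VALUES = [1, 1, 1, 1, 1, 0.5]


def matching_rate(match_values, weights):
    """Sum of (matching value * weight) over corresponding nodes."""
    if len(match_values) != len(weights):
        raise ValueError("each node needs exactly one matching value and one weight")
    return sum(v * w for v, w in zip(match_values, weights))


rate = round(matching_rate(MATCH_VALUES, WEIGHTS), 2)  # rounded to hide float noise
```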
Further, before step S33 of respectively obtaining the first weight corresponding to the range address, the second weight corresponding to the tag address, and the third weight corresponding to the detail address, the method comprises:
S331: Input a specified quantity of training samples with pre-labeled similarity values into the natural language processing model for training.
S332: Adjust the training parameters to first parameters, so that the similarity value output by the natural language processing model is consistent with the pre-labeled similarity value.
S333: Map the weight values in the first parameters to the first weight, the second weight, and the third weight, respectively, according to the node correspondence.
The default weights of this embodiment are obtained by training the model. During training, the training parameters, which include the weight values, are adjusted continually until the similarity output by the model is consistent with the pre-labeled similarity value or falls within a preset deviation range, which determines each weight value. Other embodiments of the present application may also adjust one or more of the default weights according to the specific application scenario, so that the matching model better fits that scenario.
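One way such weights could be fitted is sketched below. The text does not name a training algorithm, so this sketch makes two assumptions: the model output is the weighted sum of node matching values, and the weights are adjusted by simple stochastic gradient descent on the squared error. The function name, learning rate, and sample data are all hypothetical.

```python
def fit_weights(samples, n_nodes, learning_rate=0.05, epochs=2000):
    """Fit node weights so that sum(w_i * m_i) approaches the labeled similarity.

    samples: list of (matching_values, labeled_similarity) pairs.
    """
    weights = [1.0 / n_nodes] * n_nodes  # start from equal weights
    for _ in range(epochs):
        for values, target in samples:
            predicted = sum(w * v for w, v in zip(weights, values))
            error = predicted - target
            # gradient step on 0.5 * error**2 with respect to each weight
            weights = [w - learning_rate * error * v for w, v in zip(weights, values)]
    return weights


# Hypothetical labeled samples for a two-node model.
samples = [([1.0, 0.0], 0.2), ([0.0, 1.0], 0.8), ([1.0, 1.0], 1.0)]
fitted = fit_weights(samples, n_nodes=2)
```

Training stops here after a fixed number of epochs; stopping once the error falls within a preset deviation range, as the text describes, would be an equally valid criterion.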
Further, before step S11 of segmenting the range addresses of the first address and the second address according to the address dictionary pre-associated in the natural language processing model, to obtain the first word-segmentation part of the first address and the first word-segmentation part of the second address, the method comprises:
S10: Call the address database and perform address correction on the first address and the second address respectively, according to third preset rules.
The first address or the second address of this embodiment may be address data that does not conform to the national address database; it can be corrected by calling the address database, including address completion, removal of qualifiers, and the like. In this embodiment, address completion may fill in the root node from a child node (for example, Nanhai District is completed upward with Foshan City), or fill in an intermediate node from the nodes before and after it (for example, given Foshan City and Guicheng Town, Nanhai District is filled in between them).
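Both kinds of completion can be sketched with a child-to-parent lookup. The `PARENT` table below is a tiny hypothetical stand-in for the national address database, and the function names and romanized place names are assumptions.

```python
# Hypothetical child -> parent table standing in for the address database.
PARENT = {
    "Foshan City": "Guangdong Province",
    "Nanhai District": "Foshan City",
    "Guicheng Town": "Nanhai District",
}


def ancestors(name):
    """Return the ancestors of a place name, nearest parent first."""
    chain = []
    while name in PARENT:
        name = PARENT[name]
        chain.append(name)
    return chain


def complete_address(levels):
    """Fill in missing upper levels and missing intermediate levels."""
    if not levels:
        return []
    # upward completion: prepend every ancestor of the first level
    completed = list(reversed(ancestors(levels[0]))) + [levels[0]]
    for name in levels[1:]:
        missing = []
        for ancestor in ancestors(name):
            if ancestor == completed[-1]:
                break  # reached the level we already have
            missing.append(ancestor)
        else:
            missing = []  # previous level is not an ancestor; add nothing
        completed.extend(reversed(missing))
        completed.append(name)
    return completed


upward = complete_address(["Nanhai District"])
between = complete_address(["Foshan City", "Guicheng Town"])
```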
Further, before step S1 of calling the preset matching algorithm and segmenting the first address and the second address respectively according to the first preset rules, to obtain the first word-segmentation group of the first address and the second word-segmentation group of the second address, the method comprises:
S1a: Index the specified quantity of unstructured address data pre-stored in the index server, to obtain the preset index structure.
The data pre-stored in the index server of this embodiment is unstructured data, stored in key-value column storage form. Unstructured data here refers to text, images, voice, and the like held in column storage based on NoSQL storage technology. Because the data volume is very large, it must be stored and computed with distributed NoSQL technology; the index server combines the distributed NoSQL storage architecture with an index structure to achieve fast, real-time querying and computation over massive data. NoSQL, i.e., non-relational database technology, is an open-source technology. Elasticsearch stores data as key-value pairs with an inverted index and performs its computation mainly in memory, which enables fast real-time computation.
S1b: Receive the interface plug-in uploaded to a specified directory of the index server, wherein the interface plug-in is formed by packaging and encapsulating the preset matching algorithm.
The index server of this embodiment is an open-source component that supports a plug-in mode. The interface plug-in can inherit the index server's org.elasticsearch.plugins.Plugin class to develop a custom extension component containing the address matching algorithm, which is loaded and ready for use after the index server is restarted.
S1c: Obtain the configuration parameters of the interface plug-in.
S1d: Run the configuration parameters to establish a computational association between the preset index structure and the interface plug-in.
In this embodiment, once the preset matching algorithm has been developed, it is packaged, encapsulated, and uploaded to the specified directory of the index server, and the relevant configuration parameters are configured. Loading and running the configuration parameters establishes the computational association between the preset index structure and the interface plug-in, so that the address matching algorithm in the plug-in can be called to complete the matching computation for the first address within the preset index structure, thereby implementing address data queries.
The index server of this embodiment is the open-source Elasticsearch component (Elasticsearch is used for distributed full-text search), a full-text search engine that provides distributed computing capability through a RESTful web interface and can query massive data quickly and in real time. The query steps include:
(1) Import the addresses of the massive address base into the underlying storage of Elasticsearch in the form of key-value pairs through the Elasticsearch data import interface, and build an index on the keys.
(2) Adapt the address matching model to Elasticsearch's custom extension search model, add it to the extension module of the Elasticsearch master node, and restart Elasticsearch, so that the address matching model can use the distributed storage and high-concurrency computation of Elasticsearch.
(3) Using the custom model, develop a one-to-many massive address matching interface on Elasticsearch.
(4) Develop an upper-layer interface on Elasticsearch through which a new address can be input and the massive address base and custom model to be matched against can be selected. The new address is then matched quickly and in real time against the addresses in the massive address base on Elasticsearch, and the TOP N most similar addresses are returned, where N can be set by the program and passed as a parameter.
By building an index structure over the massive data pre-stored in the index server, and by drawing on the computing architecture and strong distributed computing capability of the Elasticsearch component itself, this embodiment queries the first address in the preset index structure quickly and in real time.
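The one-to-many matching with a TOP N result in step (4) can be illustrated in plain Python, independent of Elasticsearch. The scoring function, the function names, and the sample address base below are hypothetical; in the embodiment the scoring would be done by the custom matching model inside the index server.

```python
import heapq


def top_n_addresses(query, address_base, score, n=3):
    """Return the n candidates with the highest score against the query."""
    return heapq.nlargest(n, address_base, key=lambda candidate: score(query, candidate))


def shared_levels(a, b):
    """Toy score: number of administrative levels that are identical."""
    return sum(1 for x, y in zip(a, b) if x == y)


# Made-up sample data, for illustration only.
query = ("Guangdong", "Foshan", "Nanhai", "Guicheng")
address_base = [
    ("Guangdong", "Foshan", "Nanhai", "Guicheng"),
    ("Guangdong", "Foshan", "Chancheng", "Zumiao"),
    ("Guangdong", "Shenzhen", "Nanshan", "Yuehai"),
    ("Hunan", "Changsha", "Yuelu", "Juzizhou"),
]
best = top_n_addresses(query, address_base, shared_levels, n=2)
```

`heapq.nlargest` keeps only n candidates in memory at a time, which suits a large address base better than sorting every score.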
In this embodiment, the segments corresponding to the different administrative levels of the first address are matched by different methods and different matching models, and the matching weight of each segment also differs. The first address of this embodiment is divided into six segments, which correspond to six administrative levels and to six nodes in the tree structure. The first four of the six administrative levels share the same matching model, one-to-one character matching; the fifth administrative level uses a fuzzy matching model based on containment; and the sixth administrative level uses a numeric matching model. This embodiment applies a filter mechanism during the matching computation: the target segments corresponding to the four administrative levels "province/city/district, county/town, township" are first matched exactly, character by character. When the matching result for these target segments is below a preset threshold, it is determined that the preset index structure contains no address data satisfying the preset matching condition with respect to the first address, and the matching result is output directly, which reduces the amount of matching computation and improves response speed. With the filter mechanism, at least 90% of the addresses can be filtered out, so that an address ultimately needs full matching against only the remaining 10% or so of the addresses, which greatly saves computing resources.
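The filter mechanism can be sketched as a cheap pre-check on the first four levels that runs before any full matching. The function name, the threshold value, and the romanized sample data are assumptions for illustration.

```python
def prefilter(query_levels, candidates, threshold=1.0):
    """Keep only candidates whose first four levels match the query exactly enough."""
    kept = []
    for candidate in candidates:
        pairs = zip(query_levels[:4], candidate[:4])
        exact_hits = sum(1 for q, c in pairs if q == c)
        if exact_hits / 4 >= threshold:
            kept.append(candidate)  # only these go on to full matching
    return kept


query = ["Guangdong", "Foshan", "Nanhai", "Guicheng", "Jiangnan Mingju Community", "1/306"]
candidates = [
    ["Guangdong", "Foshan", "Nanhai", "Guicheng", "Jiangnan Mingju", "1/502"],
    ["Guangdong", "Foshan", "Chancheng", "Zumiao", "Some Community", "2/101"],
    ["Hunan", "Changsha", "Yuelu", "Juzizhou", "Other Community", "3/202"],
]
survivors = prefilter(query, candidates)
```

Only the surviving candidates need the more expensive containment and numeric matching on the fifth and sixth levels.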
Referring to Fig. 2, in the address matching device of one embodiment of the present application, the first address is the address to be retrieved that is input by the user, the second address is stored in the index server, and the device comprises:
Word segmentation module 1, configured to call the preset matching algorithm and segment the first address and the second address respectively according to first preset rules, to obtain the first word-segmentation group of the first address and the second word-segmentation group of the second address, wherein the preset matching algorithm includes word-segmentation computation and matching computation.
This embodiment compares the similarity of the first address and the second address, both of which are written in order of administrative level from high to low, that is, from broad range to specific location. The first preset rules of this embodiment provide different word-segmentation rules for the different administrative levels within an address. For example, the tokens for the four nationally used administrative levels (province; city; district or county; township or town) are usually obtained by segmenting against the national general address database: "Guangdong Province Foshan City Nanhai District Guicheng Town" is segmented as Guangdong Province/Foshan City/Nanhai District/Guicheng Town. Address information beyond these four administrative levels is segmented by semantic word segmentation.
Division module 2, configured to divide the first address into multiple first segments according to the first word-segmentation group, and to divide the second address into multiple second segments according to the second word-segmentation group.
In this embodiment, an address is divided into segments and/or administrative levels according to its word-segmentation group, and each segment or administrative level corresponds to one or more tokens. "First", "second", and similar terms in this embodiment serve only to distinguish, for example, the first segments of the first address from the second segments of the second address; they are not limiting, and similar terms elsewhere have the same effect and are not explained again. A word-segmentation group is the sequence of tokens of an actual address, arranged in the writing order of the original address. For example, a long name such as "certain city development zone" yields two tokens, "certain city/development zone", whereas segmentation groups tokens by administrative level, so "certain city development zone" belongs to a single segment.
First obtaining module 3, configured to obtain the matching result of all the first segments and all the second segments according to second preset rules.
In this embodiment, the first segments and the second segments are matched one by one according to the correspondence of administrative levels, to obtain their matching result. For example, the first segment at the province level of the first address is compared with the second segment at the province level of the second address, which improves the symmetry and reliability of the comparison.
Judgment module 4, configured to judge whether the first address and the second address are identical according to the matching result.
In this embodiment, the first address and the second address are compared one to one according to the correspondence of administrative levels. When the matching rate of the first address and the second address reaches a preset range, the two addresses are determined to be identical; otherwise they are different. Other embodiments of the present application require not only that the matching rate reach the preset range but also that the matching degree of the segments at specified administrative levels reach 100% before the first address and the second address are determined to be identical; otherwise they are different, which improves matching accuracy.
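The two judgment variants can be sketched in one function. The threshold of 0.9, the function name, and the parameter names are assumptions; the matching values and weights reuse the six-node example from this description.

```python
def is_same_address(match_values, weights, rate_threshold=0.9, required_nodes=()):
    """Judge two addresses identical from per-node matching values.

    required_nodes lists node indexes that must match 100% (the stricter variant).
    """
    rate = sum(v * w for v, w in zip(match_values, weights))
    if rate < rate_threshold:
        return False
    return all(match_values[i] == 1.0 for i in required_nodes)


weights = [0.1, 0.1, 0.1, 0.1, 0.5, 0.1]
match_values = [1, 1, 1, 1, 1, 0.5]

same_by_rate = is_same_address(match_values, weights)
same_strict = is_same_address(match_values, weights, required_nodes=(5,))
```

With a matching rate of about 0.95 the addresses pass the rate check, but they fail the stricter variant when the sixth node (index 5) is required to match completely.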
The first address of this embodiment is the address to be queried that is input by the user. Its data composition is not restricted, and matching can be computed for any address to be queried, which improves flexibility and freedom for the user. For example, the first address may consist of data arranged in order of six administrative levels (province; city; district or county; town or township; road, residential community, or building; house number), or of data lacking one or several of these levels. The preset matching condition of this embodiment includes the matching rate reaching a preset threshold, or the flag data in the first address reaching a 100% match, and the like. Flag data is the data in the first address that pinpoints a geographic location, such as the name of a residential community or a building; for example, "Jiangnan Mingju Community Rongyuan" contained in the first address is flag data. In another embodiment of the present application, the flag data of the first address is the data after the "town or township" administrative level and before the "house number".
Further, the word segmentation module 1 comprises:
First word-segmentation unit, configured to segment the range addresses of the first address and the second address according to the address dictionary pre-associated in the natural language processing model, to obtain the first word-segmentation part of the first address and the first word-segmentation part of the second address.
The range address of this embodiment includes at least one of the four administrative levels of province, city, district or county, and township or town. The range address is segmented with the pre-associated address dictionary, which is the dictionary corresponding to the national address database and is associated with the natural language processing model in advance in order to segment address names. To improve address matching precision, this embodiment adds a crawled address base to the open-source word-segmentation package jieba when performing word segmentation, uses it together with the national address base to correct the address to be segmented, and then segments by administrative level, which improves word-segmentation accuracy. The address dictionary is called for segmentation if the current address contains an administrative level covered by the address dictionary. For example, the address "Guangdong Province Foshan City Nanhai District Guicheng Town Jiangnan Mingju Community Rongyuan 1 306" contains the four administrative levels covered by the address dictionary, so these four levels are segmented according to the address dictionary, with the following result: Guangdong Province/Foshan City/Nanhai District/Guicheng Town/Jiangnan Mingju Community Rongyuan 1 306. The first word-segmentation part is then Guangdong Province/Foshan City/Nanhai District/Guicheng Town.
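Dictionary-based segmentation of the range address can be sketched as greedy prefix matching, level by level. This is not the jieba-based implementation mentioned in the text; the dictionary contents, the function name, and the romanized address are assumptions for illustration.

```python
# Hypothetical address dictionary: known names per administrative level.
ADDRESS_DICTIONARY = {
    "province": ["Guangdong Province"],
    "city": ["Foshan City"],
    "district": ["Nanhai District"],
    "town": ["Guicheng Town"],
}
LEVELS = ["province", "city", "district", "town"]


def segment_range_address(address):
    """Split off the range-address tokens; return (tokens, remaining text)."""
    rest = address.strip()
    tokens = []
    for level in LEVELS:
        # try longer names first so the longest dictionary entry wins
        for name in sorted(ADDRESS_DICTIONARY[level], key=len, reverse=True):
            if rest.startswith(name):
                tokens.append(name)
                rest = rest[len(name):].strip()
                break
    return tokens, rest


tokens, remainder = segment_range_address(
    "Guangdong Province Foshan City Nanhai District Guicheng Town "
    "Jiangnan Mingju Community Rongyuan 1 306"
)
```

The remainder is what the syntax models would segment next as the tag address and the detail address.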
Second word-segmentation unit, configured to segment the tag addresses of the first address and the second address according to the first syntax model in the natural language processing model, to obtain the second word-segmentation part of the first address and the second word-segmentation part of the second address.
The tag address of this embodiment includes information that pinpoints a geographic location, such as the name of a residential community or a building, for example "Jiangnan Mingju Community Rongyuan" in the address above. This embodiment segments the tag address according to the first syntax model in the natural language processing model, which includes but is not limited to patterns such as "so-and-so community" and "so-and-so building". For example, for "Guicheng Town Jiangnan Mingju Community Rongyuan 1 306", the corresponding second word-segmentation part is "Guicheng/Jiangnan Mingju Community/Rongyuan". In another embodiment of the present application, the first syntax model extracts the characters after "town or township" and before the "house number" as the tag address.
First composing unit, configured to form the first word-segmentation group of the first address from the first word-segmentation part and the second word-segmentation part of the first address, and to form the second word-segmentation group of the second address from the first word-segmentation part and the second word-segmentation part of the second address.
The first address or the second address of this embodiment includes a range address and a tag address, arranged from left to right. For example, the first address is "Guangdong Province Foshan City Nanhai District Guicheng Town Jiangnan Mingju Community Rongyuan" and the second address is "Guangdong Province Foshan City Nanhai District Guicheng Town Jiangnan Mingju Rongyuan"; the first word-segmentation group of the first address is "Guangdong Province/Foshan City/Nanhai District/Guicheng Town/Jiangnan Mingju Community/Rongyuan", and the second word-segmentation group of the second address is "Guangdong Province/Foshan City/Nanhai District/Guicheng Town/Jiangnan Mingju/Rongyuan".
Further, the first address and the second address each further include a detail address, and the word segmentation module 1 comprises:
Third word-segmentation unit, configured to segment the detail addresses of the first address and the second address according to the second syntax model in the natural language processing model, to obtain the third word-segmentation part of the first address and the third word-segmentation part of the second address.
The detail address of this embodiment is the specific "house number". It has only a small effect on matching the similarity of two addresses, and other embodiments may even ignore it, but certain application scenarios need precision down to the detail address to meet business requirements. The second syntax model of this embodiment includes but is not limited to patterns such as "so-and-so block", "so-and-so floor", and "so-and-so floor, so-and-so room".
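A very small stand-in for the second syntax model is sketched below. It assumes that the fields of interest in a detail address are its runs of digits, which the text does not state; the function name and sample strings are hypothetical.

```python
import re


def segment_detail_address(detail):
    """Extract the numeric fields of a detail address, in order of appearance."""
    return re.findall(r"\d+", detail)


fields_a = segment_detail_address("Block 1, Room 306")
fields_b = segment_detail_address("1 502")
```

The resulting field lists can then be compared position by position to obtain the numeric matching value.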
Second composing unit, configured to form the first word-segmentation group of the first address from the first, second, and third word-segmentation parts of the first address, and to form the second word-segmentation group of the second address from the first, second, and third word-segmentation parts of the second address.
The first address or the second address of this embodiment includes a range address, a tag address, and a detail address, arranged from left to right. For example, the first address is "Guangdong Province Foshan City Nanhai District Guicheng Town Jiangnan Mingju Community Rongyuan 1 306" and the second address is "Guangdong Province Foshan City Nanhai District Guicheng Town Jiangnan Mingju Rongyuan 1 502"; the first word-segmentation group of the first address is "Guangdong Province/Foshan City/Nanhai District/Guicheng Town/Jiangnan Mingju Community/Rongyuan/1/306", and the second word-segmentation group of the second address is "Guangdong Province/Foshan City/Nanhai District/Guicheng Town/Jiangnan Mingju/Rongyuan/1/502". The first address or the second address is then divided into segments or administrative levels according to these word-segmentation groups.
Further, the range address includes the four administrative levels of province, city, district or county, and township or town, and the first obtaining module 3 comprises:
Mapping unit, configured to map all the first segments and all the second segments, in order of administrative level from high to low, into two structure trees of identical structure, wherein each structure tree includes multiple nodes, and each node corresponds one to one with a first segment or a second segment.
In this embodiment, all the first segments of the first address and all the second segments of the second address are mapped, in order of administrative level from high to low, into two structure trees of identical structure, in which a node corresponds to at least one segment or to multiple tokens of the same administrative level. For example, the token "Guangdong Province", corresponding to the highest administrative level "province" contained in the first address, serves as the root node; the token "Foshan City", corresponding to the next-level child node "city", is connected next; and so on down to the end node "1 502". Depending on the specific address information, the administrative levels of the root node and the end node differ: the address may be a full address covering all administrative levels or a short address covering only some of them.
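The mapping can be sketched with a chain of nodes, one per administrative level, where each node has a single child. The class, function names, and level labels are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    level: str                     # administrative level, e.g. "province"
    value: str                     # segment assigned to the node
    child: Optional["Node"] = None


def build_tree(segments: List[Tuple[str, str]]) -> Optional[Node]:
    """Link (level, value) pairs from the highest level down; return the root."""
    root = None
    tail = None
    for level, value in segments:
        node = Node(level, value)
        if root is None:
            root = node
        else:
            tail.child = node
        tail = node
    return root


def pair_nodes(a: Optional[Node], b: Optional[Node]):
    """Walk two trees of the same structure and pair the corresponding nodes."""
    pairs = []
    while a is not None and b is not None:
        pairs.append((a.level, a.value, b.value))
        a, b = a.child, b.child
    return pairs


first_tree = build_tree([("province", "Guangdong Province"), ("city", "Foshan City")])
second_tree = build_tree([("province", "Guangdong Province"), ("city", "Foshan City")])
pairs = pair_nodes(first_tree, second_tree)
```

Pairing nodes level by level is what lets each administrative level use its own matching method and weight.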
First acquisition unit, configured to obtain the matching value of each node of the two structure trees.
In this embodiment, the node-to-node correspondence between the two structure trees is mapped according to the correspondence of administrative levels, and the matching value of each node is obtained from that correspondence; the matching value is the number of matching segments divided by the number of all segments corresponding to the node. For example, if the "province" node of the first address is assigned "Guangdong" and the "province" node of the second address is also assigned "Guangdong", the nodes match; otherwise they do not.
Second acquisition unit, configured to respectively obtain the first weight corresponding to the range address, the second weight corresponding to the tag address, and the third weight corresponding to the detail address.
In this embodiment, different weights are set because the segments of different administrative levels influence the address differently, which improves flexibility in meeting business requirements. For example, the second weight corresponding to the tag address is higher than the first weight corresponding to the range address.
Computing unit, configured to calculate matching rates as the matching value multiplied by the respective weight, to obtain the first matching rate of the range address, the second matching rate of the tag address, and the third matching rate of the detail address.
The matching rate in this embodiment is calculated as follows: the matching result of each segment multiplied by the configured weight of that segment gives the matching rate of the segment, and the matching rates of all segments are summed to obtain the matching result of the first address and the second address.
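Written as a formula, the calculation above takes the following form. The symbols are assumptions introduced here for illustration: n is the number of segments (nodes), m_i is the matching value of segment i, w_i is its configured weight, and R is the overall matching result.

```latex
R = \sum_{i=1}^{n} w_i \, m_i , \qquad 0 \le m_i \le 1
```

In the six-node example given in this description, the weights sum to 1, so R also lies between 0 and 1.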
Summing unit, configured to take the sum of the first matching rate, the second matching rate, and the third matching rate as the matching result of all the first segments and all the second segments.
Further, the first acquisition unit comprises:
First matching subunit, configured to match each first segment corresponding to the range address in the first address against each second segment corresponding to the range address in the second address, one to one according to the node correspondence, by exact full matching, to obtain the first matching values.
The matching methods for the nodes of different administrative levels differ in this embodiment. The four administrative levels of province, city, district or county, and township or town are matched exactly, by full character matching: two segments match only if their characters are 100% identical, and otherwise they do not match. For example, if the "province" node of the first address is "Guangdong" and the "province" node of the second address is also "Guangdong", the two nodes match.
Second matching subunit, configured to match each first segment corresponding to the tag address in the first address against each second segment corresponding to the tag address in the second address, one to one according to the node correspondence, by model keyword matching, to obtain the second matching values.
In this embodiment, the segments corresponding to the tag address are matched by NLP (Natural Language Processing) model matching, under which a containment relationship in either direction counts as a match. For example, "Jiangnan Mingju Community/Rongyuan" and "Jiangnan Mingju/Rongyuan" are not character-for-character identical, but "Jiangnan Mingju Community" contains the characters "Jiangnan Mingju", so the segments still have a one-to-one matching relationship.
Third matching subunit, configured to match each first segment corresponding to the detail address in the first address against each second segment corresponding to the detail address in the second address, one to one according to the node correspondence, by numeric matching, to obtain the third matching values.
The detail address of this embodiment comprises a first specified quantity of segments, of which a second specified quantity satisfy the matching relationship; the matching value of the detail address is then the second specified quantity divided by the first specified quantity.
Aggregating subunit, configured to aggregate the first matching values, the second matching values, and the third matching values to obtain the matching value of each node of the two structure trees.
For example, the word-segmentation group of the first address is: Guangdong/Foshan/Nanhai/Guicheng/Jiangnan Mingju Community/Rongyuan/1/306; the word-segmentation group of the second address is: Guangdong/Foshan/Nanhai/Guicheng/Jiangnan Mingju/Rongyuan/1/502. After segmentation, the first address and the second address are each divided into six administrative levels (province; city; district or county; town or township; road, residential community, or building; house number), which correspond to six nodes whose default weights are "0.1/0.1/0.1/0.1/0.5/0.1". The first four administrative levels use 100% character matching: for Guangdong/Foshan/Nanhai/Guicheng the matching results are 0.1*1, 0.1*1, 0.1*1, and 0.1*1. The fifth administrative level uses model matching based on character containment: the matching result of "Jiangnan Mingju Community/Rongyuan" and "Jiangnan Mingju/Rongyuan" is 0.5*1. The sixth administrative level uses fuzzy matching: between 1/306 and 1/502, only one of the two corresponding fields matches, since 306 and 502 do not match, so the matching value is 0.5 and the matching result is 0.5*0.1, i.e., 0.05. The matching rate of the first address and the second address is therefore 0.1+0.1+0.1+0.1+0.5+0.05=0.95.
Further, the first obtaining module 3 comprises:
Input unit, configured to input a specified quantity of training samples with pre-labeled similarity values into the natural language processing model for training.
Adjustment unit, configured to adjust the training parameters to first parameters, so that the similarity value output by the natural language processing model is consistent with the pre-labeled similarity value.
Corresponding unit, configured to map the weight values in the first parameters to the first weight, the second weight, and the third weight, respectively, according to the node correspondence.
The default weights of this embodiment are obtained by training the model. During training, the training parameters, which include the weight values, are adjusted continually until the similarity output by the model is consistent with the pre-labeled similarity value or falls within a preset deviation range, which determines each weight value. Other embodiments of the present application may also adjust one or more of the default weights according to the specific application scenario, so that the matching model better fits that scenario.
Further, the word segmentation module 1, comprising:
Call unit, for call address database according to third preset rules, respectively to first address and described
Second address carries out address correction.
The first address or the second address of this embodiment may be address data that does not conform to the national address database. Address correction, including address completion, removal of qualifiers and so on, can be performed by calling the address database. In address completion, a root node may be completed from a child node (for example, Nanhai District can be completed upward to Foshan City), or an intermediate node may be completed from the nodes before and after it (for example, given Foshan City and Guicheng Town, Nanhai District can be filled in between them).
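Both completion cases can be sketched with a child-to-parent lookup. The `PARENT` table below is a tiny hypothetical stand-in for the national address database, the town is rendered as Guicheng Town, and `complete_address` is an illustrative name, not an interface from the application.

```python
# Hypothetical excerpt of an address tree, stored as child -> parent.
PARENT = {
    "Guicheng Town": "Nanhai District",
    "Nanhai District": "Foshan City",
}


def chain_to_root(name):
    """Return the path from the highest known ancestor down to `name`."""
    chain = [name]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain[::-1]


def complete_address(levels):
    """Fill in missing ancestors and intermediate levels.

    levels: the segments that are present, highest level first.
    """
    if not levels:
        return []
    full = chain_to_root(levels[-1])
    # Every segment that was supplied must lie on the completed path.
    for segment in levels:
        if segment not in full:
            raise ValueError(f"{segment!r} is not an ancestor of {levels[-1]!r}")
    return full


print(complete_address(["Nanhai District"]))
print(complete_address(["Foshan City", "Guicheng Town"]))
```

The first call completes upward from a child node; the second fills in the intermediate node between the two supplied levels.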
Further, the address matching apparatus also comprises:
An index module, configured to index a specified quantity of unstructured address data pre-stored in the index server, so as to obtain the preset index structure.
The data pre-stored in the index server of this embodiment is unstructured data, stored in key-value columnar form. Unstructured data here refers to text, images, voice and the like held in column storage built on NoSQL storage technology. Because the data volume is very large, it must be stored and computed with a distributed NoSQL architecture; the index server combines NoSQL distributed storage with an index structure to achieve real-time, fast query and computation over massive data. NoSQL, i.e. non-relational databases, is an open-source technology. Elasticsearch is based on key-value storage and an inverted index, and performs most of its computation in memory, which enables fast real-time computation.
A receiving module, configured to receive an interface plug-in uploaded to a specified directory of the index server, wherein the interface plug-in is formed by packaging and encapsulating the preset matching algorithm.
The index server of this embodiment is an open-source component that supports plug-ins. The interface plug-in can extend its org.elasticsearch.plugins.Plugin class to implement the custom address matching algorithm component, which is loaded and ready for use after the index server is restarted.
A second obtaining module, configured to obtain the configuration parameters of the interface plug-in.
An establishing module, configured to establish a computational association between the preset index structure and the interface plug-in by running the configuration parameters.
In this embodiment, after the address matching algorithm has been developed, it is packaged, uploaded to the specified directory of the index server, and the relevant configuration parameters are set. By loading and running the configuration parameters, a computational association is established between the preset index structure and the interface plug-in, so that the address matching algorithm in the plug-in can be called to perform the matching computation for the first address in the preset index structure, thereby realizing address data query.
The index server of this embodiment is the open-source Elasticsearch component (Elasticsearch is used for distributed full-text search), a full-text search engine that provides distributed computing capability through a RESTful web interface and can query massive data quickly and in real time. The query steps include:
(1) Import the addresses of the massive address base into the underlying storage of Elasticsearch as key-value pairs through the Elasticsearch data import interface, and build an index on the keys.
(2) Adapt the address matching model to the custom extended search model of Elasticsearch, add it to the extension module of the Elasticsearch master node, and restart Elasticsearch, so that the address matching model can use the distributed storage and highly concurrent computation of Elasticsearch.
(3) Using the custom model, develop a one-to-many massive address matching interface on Elasticsearch.
(4) Develop an upper-layer interface on Elasticsearch so that a new address can be input and the massive address base and the custom model selected for matching. Elasticsearch then computes, quickly and in real time, the match between the new address and the addresses in the massive address base, and returns the top-N most similar addresses, where N can be set by the program and passed as a parameter.
By building an index structure over the massive data pre-stored in the index server, and combining the computing architecture of the Elasticsearch component with its powerful distributed computing capability, this embodiment achieves real-time, fast querying of the first address in the preset index structure.
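Step (4), returning the top-N most similar addresses, can be illustrated in isolation from Elasticsearch. The sketch below runs in plain Python; `top_n` and `toy_score` are hypothetical names, and `toy_score` (token overlap) is only a placeholder for the matching model, not the algorithm of the application.

```python
import heapq


def toy_score(query, candidate):
    """Placeholder similarity: share of whitespace-separated tokens in common."""
    a, b = set(query.split()), set(candidate.split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def top_n(query, candidates, score=toy_score, n=3):
    """Return the n best (score, candidate) pairs, highest score first."""
    scored = [(score(query, candidate), candidate) for candidate in candidates]
    return heapq.nlargest(n, scored, key=lambda pair: pair[0])


# Hypothetical address base.
ADDRESS_BASE = [
    "Foshan Nanhai Guicheng",
    "Foshan Nanhai",
    "Guangzhou Tianhe",
]

for score, address in top_n("Foshan Nanhai Guicheng", ADDRESS_BASE, n=2):
    print(f"{score:.2f}  {address}")
```

Passing `n` as an argument mirrors the remark that N is set by the caller.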
In this embodiment, the segments of the first address that correspond to different administrative levels use different matching methods and matching models, and each segment has its own matching weight. The first address is divided into six segments, corresponding to six administrative levels and to six nodes in the tree structure. The first four levels share the same matching model, which is one-to-one character matching; the fifth level uses a fuzzy matching model based on containment (one segment contains, or is contained in, the other); the sixth level uses a numeric matching model. A filtering mechanism is applied during the matching computation: the target segments corresponding to the four administrative levels "province/city, district/county/town, township, road" are first matched precisely, character by character. When the matching result for these target segments is below a preset threshold, it is determined that the preset index structure contains no address data meeting the preset matching condition for the first address, and the matching result is output directly, which reduces the amount of matching computation and improves response speed. With this filtering mechanism, at least 90% of the addresses can be filtered out, so that an address finally needs full matching against only the remaining 10% or so, which greatly saves computing resources.
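A minimal sketch of this filtering mechanism follows. Addresses are assumed to be already split into six segments; the coarse weights, the threshold of 0.35 and the sample data are hypothetical values chosen for illustration.

```python
# Assumed weights for the first four administrative levels.
COARSE_WEIGHTS = (0.1, 0.1, 0.1, 0.1)


def coarse_score(query, candidate):
    """Exact, character-for-character comparison of the first four segments."""
    return sum(
        weight
        for weight, q, c in zip(COARSE_WEIGHTS, query[:4], candidate[:4])
        if q == c
    )


def prefilter(query, candidates, threshold=0.35):
    """Keep only candidates whose coarse score reaches the preset threshold.

    Only the survivors need the more expensive fuzzy and numeric matching.
    """
    return [c for c in candidates if coarse_score(query, c) >= threshold]


# Hypothetical six-segment addresses.
QUERY = ("Guangdong", "Foshan", "Nanhai", "Guicheng", "Jiangnan Mingju", "1/306")
CANDIDATES = [
    ("Guangdong", "Foshan", "Nanhai", "Guicheng", "Jiangnan Mingju", "1/502"),
    ("Guangdong", "Foshan", "Chancheng", "Zumiao", "Rong Yuan", "2/101"),
    ("Guangdong", "Guangzhou", "Tianhe", "Shipai", "Jiangnan Mingju", "1/306"),
]

survivors = prefilter(QUERY, CANDIDATES)
print(len(survivors), "of", len(CANDIDATES), "candidates need full matching")
```

With the assumed threshold, a candidate survives only when all four coarse levels agree, so a matching building name under a different city is discarded early.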
Referring to Fig. 3, an embodiment of the application also provides a computer device, which may be a server and whose internal structure may be as shown in Fig. 3. The computer device includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the computer device provides computing and control capability. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores all the data required by the address matching process. The network interface of the computer device communicates with external terminals through a network connection. The computer program, when executed by the processor, implements the address matching method.
The processor executes the address matching method described above, in which the first address is an address to be retrieved that is input by a user and the second address is stored in the index server. The method includes: calling a preset matching algorithm and performing word segmentation on the first address and the second address respectively according to a first preset rule, to obtain a first participle group corresponding to the first address and a second participle group corresponding to the second address; dividing the first address into a plurality of first segments according to the first participle group, and dividing the second address into a plurality of second segments according to the second participle group; obtaining a matching result of all the first segments and all the second segments according to a second preset rule; and judging, according to the matching result, whether the first address and the second address are the same.
In the above computer device, the data pre-stored in the index server is unstructured data stored in key-value columnar form. Unstructured data here refers to text, images, voice and the like held in column storage built on NoSQL storage technology. Because the data volume is very large, it must be stored and computed with a distributed NoSQL architecture; the index server combines NoSQL distributed storage with an index structure to achieve real-time, fast query and computation over massive data. A configurable-weight address matching model based on hierarchical segmentation of addresses is proposed: a natural language processing model first performs word segmentation on the address name to form a participle group; the participle group is divided into segments by administrative level, and the segments are mapped to nodes in a tree structure. This takes full account of the tree structure of addresses: the address is graded and segmented by administrative level, each administrative-level segment is given its own weight, and the weights can be fine-tuned for the actual business scenario. By building an index structure over the massive data pre-stored in the index server, and combining the computing architecture of the Elasticsearch component with its powerful distributed computing capability, the first address can be queried quickly and in real time in the preset index structure. The segments for the first four administrative levels are matched exactly against the national province/city/district/county/town address base (tree-shaped), and partially missing levels are completed effectively. The default weights are obtained by training a model: the training parameters are adjusted continuously during training until the similarity output by the model is consistent with the pre-labelled similarity value, or falls within a preset deviation range. The training parameters include the individual weight values, so the weight values are determined in this way and the weight settings are more reliable.
In one embodiment, the first address and the second address each include a range address and a tag address. The step in which the processor calls the preset matching algorithm and performs word segmentation on the first address and the second address respectively according to the first preset rule, to obtain the first participle group corresponding to the first address and the second participle group corresponding to the second address, includes: performing word segmentation on the range addresses of the first address and the second address according to a pre-associated address dictionary in the natural language processing model, to obtain a first participle part corresponding to the first address and a first participle part corresponding to the second address; performing word segmentation on the tag addresses of the first address and the second address according to a first syntactic model in the natural language processing model, to obtain a second participle part corresponding to the first address and a second participle part corresponding to the second address; and forming the first participle group of the first address from its first participle part and second participle part, and forming the second participle group of the second address from its first participle part and second participle part.
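Dictionary-based segmentation of the range address can be sketched as follows. The text does not say how the pre-associated address dictionary is applied, so the sketch swaps in forward maximum matching, a common dictionary segmentation technique; the dictionary entries, the sample address string and the name `segment_range_address` are illustrative assumptions.

```python
# Hypothetical address dictionary (administrative names only).
ADDRESS_DICT = {"广东省", "佛山市", "南海区", "桂城镇"}
MAX_WORD_LEN = max(len(word) for word in ADDRESS_DICT)


def segment_range_address(text):
    """Forward maximum matching against the address dictionary.

    Returns (tokens, remainder): the dictionary words found at the start of
    the text, and the untouched rest (tag and details address), which would
    be handled by other models.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        # Try the longest possible dictionary word first.
        for size in range(min(MAX_WORD_LEN, len(text) - pos), 0, -1):
            word = text[pos:pos + size]
            if word in ADDRESS_DICT:
                tokens.append(word)
                pos += size
                break
        else:
            break  # no dictionary word starts here: range address has ended
    return tokens, text[pos:]


tokens, remainder = segment_range_address("广东省佛山市南海区桂城镇江南名居")
print(tokens, remainder)
```

Here the four administrative names form the first participle part, and the remainder is left for the syntactic models that handle tag and details addresses.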
In one embodiment, the first address and the second address each further include a details address. After the step in which the processor performs word segmentation on the tag addresses of the first address and the second address according to the first syntactic model in the natural language processing model, to obtain the second participle part corresponding to the first address and the second participle part corresponding to the second address, the method includes: performing word segmentation on the details addresses of the first address and the second address according to a second syntactic model in the natural language processing model, to obtain a third participle part corresponding to the first address and a third participle part corresponding to the second address; and forming the first participle group of the first address from its first, second and third participle parts, and forming the second participle group of the second address from its first, second and third participle parts.
In one embodiment, the range address includes four administrative levels, namely province, city/district, county, and township/town, and the tag address includes a residential community name or a building name. The step in which the processor obtains the matching result of all the first segments and all the second segments according to the second preset rule includes: mapping all the first segments and all the second segments, in descending order of administrative level, into two structure trees of identical structure, wherein each structure tree includes a plurality of nodes and each node corresponds one-to-one to a first segment or a second segment; obtaining the matching value of each pair of corresponding nodes in the two structure trees; obtaining a first weight corresponding to the range address, a second weight corresponding to the tag address and a third weight corresponding to the details address; computing matching rates by multiplying the matching values by the respective weights, to obtain a first matching rate for the range address, a second matching rate for the tag address and a third matching rate for the details address; and taking the sum of the first matching rate, the second matching rate and the third matching rate as the matching result of all the first segments and all the second segments.
In one embodiment, the step in which the processor obtains the matching values of the nodes of the two structure trees includes: performing exact full matching, one-to-one according to the node correspondence, between the first segments of the range address in the first address and the second segments of the range address in the second address, to obtain the first matching values; performing model keyword matching, one-to-one according to the node correspondence, between the first segments of the tag address in the first address and the second segments of the tag address in the second address, to obtain the second matching values; performing numeric matching, one-to-one according to the node correspondence, between the first segments of the details address in the first address and the second segments of the details address in the second address, to obtain the third matching values; and collecting the first matching values, the second matching values and the third matching values to obtain the matching values of the nodes of the two structure trees.
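The three kinds of per-node matching can be sketched as small functions that each return a matching value between 0 and 1. The function names are hypothetical, the keyword matcher is simplified to a containment test, and the numeric matcher compares number fields position by position, which reproduces the value 0.5 for 1/306 against 1/502 but is otherwise an assumption about how partial numeric matches are scored.

```python
import re


def exact_match(a, b):
    """Range-address nodes: full, character-for-character equality."""
    return 1.0 if a == b else 0.0


def containment_match(a, b):
    """Tag-address nodes: one name contains, or is contained in, the other."""
    if not a or not b:
        return 0.0
    return 1.0 if a in b or b in a else 0.0


def numeric_match(a, b):
    """Details-address nodes: share of number fields that agree by position."""
    fields_a = re.findall(r"\d+", a)
    fields_b = re.findall(r"\d+", b)
    if not fields_a or not fields_b:
        return 0.0
    hits = sum(x == y for x, y in zip(fields_a, fields_b))
    return hits / max(len(fields_a), len(fields_b))


print(exact_match("Foshan City", "Foshan City"))   # prints 1.0
print(numeric_match("1/306", "1/502"))             # prints 0.5
```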
In one embodiment, before the step in which the processor obtains the first weight corresponding to the range address, the second weight corresponding to the tag address and the third weight corresponding to the details address, the method includes: inputting a specified number of training samples with pre-labelled similarity values into the natural language processing model for training; adjusting the training parameters to a first parameter so that the similarity value output by the natural language processing model is consistent with the pre-labelled similarity value; and mapping the weight values in the first parameter, according to the node correspondence, to the first weight, the second weight and the third weight respectively.
Those skilled in the art will understand that the structure shown in Fig. 3 is only a block diagram of the part of the structure relevant to the solution of the application, and does not limit the computer device to which the solution is applied.
An embodiment of the application also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the address matching method, in which the first address is an address to be retrieved that is input by a user and the second address is stored in the index server. The method includes: calling a preset matching algorithm and performing word segmentation on the first address and the second address respectively according to a first preset rule, to obtain a first participle group corresponding to the first address and a second participle group corresponding to the second address; dividing the first address into a plurality of first segments according to the first participle group, and dividing the second address into a plurality of second segments according to the second participle group; obtaining a matching result of all the first segments and all the second segments according to a second preset rule; and judging, according to the matching result, whether the first address and the second address are the same.
With the above computer-readable storage medium, the data pre-stored in the index server is unstructured data stored in key-value columnar form. Unstructured data here refers to text, images, voice and the like held in column storage built on NoSQL storage technology. Because the data volume is very large, it must be stored and computed with a distributed NoSQL architecture; the index server combines NoSQL distributed storage with an index structure to achieve real-time, fast query and computation over massive data. A configurable-weight address matching model based on hierarchical segmentation of addresses is proposed: a natural language processing model first performs word segmentation on the address name to form a participle group; the participle group is divided into segments by administrative level, and the segments are mapped to nodes in a tree structure. This takes full account of the tree structure of addresses: the address is graded and segmented by administrative level, each administrative-level segment is given its own weight, and the weights can be fine-tuned for the actual business scenario. By building an index structure over the massive data pre-stored in the index server, and combining the computing architecture of the Elasticsearch component with its powerful distributed computing capability, the first address can be queried quickly and in real time in the preset index structure. The segments for the first four administrative levels are matched exactly against the national province/city/district/county/town address base (tree-shaped), and partially missing levels are completed effectively. The default weights are obtained by training a model: the training parameters are adjusted continuously during training until the similarity output by the model is consistent with the pre-labelled similarity value, or falls within a preset deviation range. The training parameters include the individual weight values, so the weight values are determined in this way and the weight settings are more reliable.
In one embodiment, the first address and the second address each include a range address and a tag address. The step in which the processor calls the preset matching algorithm and performs word segmentation on the first address and the second address respectively according to the first preset rule, to obtain the first participle group corresponding to the first address and the second participle group corresponding to the second address, includes: performing word segmentation on the range addresses of the first address and the second address according to a pre-associated address dictionary in the natural language processing model, to obtain a first participle part corresponding to the first address and a first participle part corresponding to the second address; performing word segmentation on the tag addresses of the first address and the second address according to a first syntactic model in the natural language processing model, to obtain a second participle part corresponding to the first address and a second participle part corresponding to the second address; and forming the first participle group of the first address from its first participle part and second participle part, and forming the second participle group of the second address from its first participle part and second participle part.
In one embodiment, the first address and the second address each further include a details address. After the step in which the processor performs word segmentation on the tag addresses of the first address and the second address according to the first syntactic model in the natural language processing model, to obtain the second participle part corresponding to the first address and the second participle part corresponding to the second address, the method includes: performing word segmentation on the details addresses of the first address and the second address according to a second syntactic model in the natural language processing model, to obtain a third participle part corresponding to the first address and a third participle part corresponding to the second address; and forming the first participle group of the first address from its first, second and third participle parts, and forming the second participle group of the second address from its first, second and third participle parts.
In one embodiment, the range address includes four administrative levels, namely province, city/district, county, and township/town, and the tag address includes a residential community name or a building name. The step in which the processor obtains the matching result of all the first segments and all the second segments according to the second preset rule includes: mapping all the first segments and all the second segments, in descending order of administrative level, into two structure trees of identical structure, wherein each structure tree includes a plurality of nodes and each node corresponds one-to-one to a first segment or a second segment; obtaining the matching value of each pair of corresponding nodes in the two structure trees; obtaining a first weight corresponding to the range address, a second weight corresponding to the tag address and a third weight corresponding to the details address; computing matching rates by multiplying the matching values by the respective weights, to obtain a first matching rate for the range address, a second matching rate for the tag address and a third matching rate for the details address; and taking the sum of the first matching rate, the second matching rate and the third matching rate as the matching result of all the first segments and all the second segments.
In one embodiment, the step in which the processor obtains the matching values of the nodes of the two structure trees includes: performing exact full matching, one-to-one according to the node correspondence, between the first segments of the range address in the first address and the second segments of the range address in the second address, to obtain the first matching values; performing model keyword matching, one-to-one according to the node correspondence, between the first segments of the tag address in the first address and the second segments of the tag address in the second address, to obtain the second matching values; performing numeric matching, one-to-one according to the node correspondence, between the first segments of the details address in the first address and the second segments of the details address in the second address, to obtain the third matching values; and collecting the first matching values, the second matching values and the third matching values to obtain the matching values of the nodes of the two structure trees.
In one embodiment, before the step in which the processor obtains the first weight corresponding to the range address, the second weight corresponding to the tag address and the third weight corresponding to the details address, the method includes: inputting a specified number of training samples with pre-labelled similarity values into the natural language processing model for training; adjusting the training parameters to a first parameter so that the similarity value output by the natural language processing model is consistent with the pre-labelled similarity value; and mapping the weight values in the first parameter, according to the node correspondence, to the first weight, the second weight and the third weight respectively.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, apparatus, article or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, apparatus, article or method. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, apparatus, article or method that includes the element.
The above are only preferred embodiments of the application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the application, or applied directly or indirectly in other related technical fields, is likewise included in the scope of patent protection of the application.
Claims (10)
1. An address matching method, characterized in that a first address is an address to be retrieved that is input by a user and a second address is stored in an index server, the method comprising:
calling a preset matching algorithm, and performing word segmentation on the first address and the second address respectively according to a first preset rule, to obtain a first participle group corresponding to the first address and a second participle group corresponding to the second address, wherein the preset matching algorithm includes word segmentation computation and matching computation;
dividing the first address into a plurality of first segments according to the first participle group, and dividing the second address into a plurality of second segments according to the second participle group;
obtaining a matching result of all the first segments and all the second segments according to a second preset rule;
judging, according to the matching result, whether the first address and the second address are the same.
2. The address matching method according to claim 1, characterized in that the first address and the second address each include a range address and a tag address, and the step of calling the preset matching algorithm and performing word segmentation on the first address and the second address respectively according to the first preset rule, to obtain the first participle group corresponding to the first address and the second participle group corresponding to the second address, comprises:
performing word segmentation on the range addresses of the first address and the second address according to a pre-associated address dictionary in a natural language processing model, to obtain a first participle part corresponding to the first address and a first participle part corresponding to the second address;
performing word segmentation on the tag addresses of the first address and the second address according to a first syntactic model in the natural language processing model, to obtain a second participle part corresponding to the first address and a second participle part corresponding to the second address;
forming the first participle group corresponding to the first address from the first participle part and the second participle part of the first address, and forming the second participle group corresponding to the second address from the first participle part and the second participle part of the second address.
3. The address matching method according to claim 2, wherein the first address and the second address each further include a detail address, and after the step of segmenting the tag addresses corresponding to the first address and the second address according to the first grammar model in the natural language processing model, to obtain the second word-segmentation part corresponding to the first address and the second word-segmentation part corresponding to the second address, the method comprises:
segmenting the detail addresses corresponding to the first address and the second address according to a second grammar model in the natural language processing model, to obtain a third word-segmentation part corresponding to the first address and a third word-segmentation part corresponding to the second address;
combining the first, second and third word-segmentation parts corresponding to the first address into the first word-segmentation group corresponding to the first address, and combining the first, second and third word-segmentation parts corresponding to the second address into the second word-segmentation group corresponding to the second address.
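As an illustration only, the three-part segmentation of claims 2 and 3 might be sketched as follows. The dictionary entries, the regular expressions standing in for the first and second grammar models, and the function name `segment_address` are all hypothetical; the claims do not specify how the natural language processing model is implemented.

```python
import re

# Hypothetical toy dictionary of administrative names, standing in for the
# pre-associated address dictionary of the claims.
REGION_DICT = ["广东省", "深圳市", "福田区", "沙头街道"]

# Hypothetical regular expressions standing in for the first grammar model
# (tag address) and the second grammar model (detail address).
LABEL_PATTERN = re.compile(r"^.+?(?:小区|花园|大厦|苑)")
DETAIL_PATTERN = re.compile(r"\d+(?:栋|号楼|单元|层|室|号)")


def segment_address(address):
    """Return (range_parts, tag_parts, detail_parts) for one address."""
    range_parts = []
    rest = address
    # Range address: repeatedly strip the longest dictionary entry that
    # prefixes the remaining text.
    matched = True
    while matched:
        matched = False
        for name in sorted(REGION_DICT, key=len, reverse=True):
            if rest.startswith(name):
                range_parts.append(name)
                rest = rest[len(name):]
                matched = True
                break
    # Tag address: shortest prefix that ends in a known building/community suffix.
    tag_match = LABEL_PATTERN.match(rest)
    tag_parts = [tag_match.group(0)] if tag_match else []
    if tag_match:
        rest = rest[tag_match.end():]
    # Detail address: every "number + unit word" token in what remains.
    detail_parts = DETAIL_PATTERN.findall(rest)
    return range_parts, tag_parts, detail_parts
```

Applying `segment_address` to both the first and the second address yields the two word-segmentation groups, each made of a range part, a tag part and a detail part.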
4. The address matching method according to claim 3, wherein the range address includes four administrative levels, namely province, city/district, county and township/town, the tag address includes a residential community name or a building name, and the step of obtaining the matching result of all the first segments and all the second segments according to the second preset rule comprises:
mapping all the first segments and all the second segments, in descending order of administrative level, into two structure trees of identical structure, wherein each structure tree includes a plurality of nodes, and each node corresponds one-to-one to a first segment or a second segment;
obtaining a matching value corresponding to each node of the two structure trees;
obtaining a first weight corresponding to the range address, a second weight corresponding to the tag address and a third weight corresponding to the detail address;
calculating matching rates by multiplying the matching values by the respective weights, to obtain a first matching rate corresponding to the range address, a second matching rate corresponding to the tag address and a third matching rate corresponding to the detail address;
taking the sum of the first matching rate, the second matching rate and the third matching rate as the matching result of all the first segments and all the second segments.
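A minimal sketch of the weighted combination in claim 4 is given below. The node names, the dictionary-based tree layout, the use of plain equality as the per-node matching value, and the averaging of node values within each address part are assumptions made for illustration; the claim only states that matching values are multiplied by their weights and the three matching rates are summed.

```python
# Hypothetical fixed tree shape: three address parts, each with named nodes
# ordered from the highest administrative level downwards.
LEVELS = {
    "range": ["province", "city", "county", "town"],
    "tag": ["community"],
    "detail": ["building", "unit", "room"],
}


def to_tree(segments):
    """Map a {node_name: segment} dict onto the fixed tree; missing nodes are None."""
    return {
        part: {node: segments.get(node) for node in nodes}
        for part, nodes in LEVELS.items()
    }


def match_result(tree_a, tree_b, weights):
    """Sum over address parts of weight * mean per-node matching value."""
    total = 0.0
    for part, nodes in LEVELS.items():
        values = [
            1.0
            if tree_a[part][node] is not None
            and tree_a[part][node] == tree_b[part][node]
            else 0.0
            for node in nodes
        ]
        # Assumption: node values inside one part are averaged before weighting.
        total += weights[part] * sum(values) / len(values)
    return total
```

With weights of 0.5, 0.3 and 0.2 for the range, tag and detail parts, two addresses that differ only in the room number would score 0.5 + 0.3 + 0.2 × 2/3 under this sketch.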
5. The address matching method according to claim 4, wherein the step of obtaining the matching value corresponding to each node of the two structure trees comprises:
performing exact full matching, node by node according to the node correspondence, between each first segment corresponding to the range address of the first address and each second segment corresponding to the range address of the second address, to obtain first matching values;
performing model keyword matching, node by node according to the node correspondence, between each first segment corresponding to the tag address of the first address and each second segment corresponding to the tag address of the second address, to obtain second matching values;
performing numeric matching, node by node according to the node correspondence, between each first segment corresponding to the detail address of the first address and each second segment corresponding to the detail address of the second address, to obtain third matching values;
aggregating the first matching values, the second matching values and the third matching values to obtain the matching value corresponding to each node of the two structure trees.
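The three per-node matching styles named in claim 5 could look roughly like this. The claim does not define "model keyword matching" or "numeric matching" in detail, so the suffix-stripping rule, the suffix list and the digit-extraction rule below are assumptions chosen only to make the distinction concrete.

```python
import re

# Hypothetical generic suffixes that carry no identifying information.
GENERIC_SUFFIXES = ("小区", "花园", "大厦", "苑")


def exact_match(a, b):
    """Range address nodes: the two segments must be identical."""
    return 1.0 if a == b else 0.0


def _core(name):
    """Strip one generic suffix so that only the identifying keyword remains."""
    for suffix in GENERIC_SUFFIXES:
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[:-len(suffix)]
    return name


def keyword_match(a, b):
    """Tag address nodes: compare the identifying keywords (assumed rule)."""
    core_a, core_b = _core(a), _core(b)
    return 1.0 if core_a and core_a == core_b else 0.0


def numeric_match(a, b):
    """Detail address nodes: compare only the numbers, ignoring unit words."""
    nums_a = [int(n) for n in re.findall(r"\d+", a)]
    nums_b = [int(n) for n in re.findall(r"\d+", b)]
    return 1.0 if nums_a and nums_a == nums_b else 0.0
```

Under these assumed rules, `numeric_match("3栋", "03号楼")` treats the two building designations as equal, while `exact_match("深圳市", "深圳")` does not tolerate the missing character.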
6. The address matching method according to claim 5, wherein before the step of obtaining the first weight corresponding to the range address, the second weight corresponding to the tag address and the third weight corresponding to the detail address, the method comprises:
inputting a specified number of training samples with pre-labeled similarity values into the natural language processing model for training;
adjusting the training parameters to first parameters such that the similarity values output by the natural language processing model are consistent with the pre-labeled similarity values;
assigning the weight values in the first parameters to the first weight, the second weight and the third weight respectively according to the node correspondence.
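One way to picture the weight training of claim 6 is a simple linear fit: each training sample carries the three per-part matching values and a pre-labeled similarity, and the weights are adjusted until the predicted similarity agrees with the label. The linear model, the batch gradient descent, the function name `fit_weights` and its hyperparameters are assumptions; the claim does not disclose the model or the optimizer.

```python
def fit_weights(samples, lr=0.1, epochs=2000):
    """Fit three weights so that sum(w_i * x_i) approximates the labeled similarity.

    samples: list of ((range_value, tag_value, detail_value), labeled_similarity).
    """
    weights = [0.0, 0.0, 0.0]
    n = len(samples)
    for _ in range(epochs):
        grads = [0.0, 0.0, 0.0]
        for features, target in samples:
            predicted = sum(w * x for w, x in zip(weights, features))
            error = predicted - target
            # Gradient of the mean squared error with respect to each weight.
            for i, x in enumerate(features):
                grads[i] += 2.0 * error * x / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights
```

The returned list would then supply the first, second and third weights used when combining the matching rates.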
7. The address matching method according to claim 2, wherein before the step of calling the preset matching algorithm to segment the first address and the second address respectively according to the first preset rule, to obtain the first word-segmentation group corresponding to the first address and the second word-segmentation group corresponding to the second address, the method comprises:
indexing a specified quantity of unstructured address data pre-stored in the index server to obtain a preset index structure;
receiving an interface plug-in uploaded to a specified directory of the index server, wherein the interface plug-in is formed by packaging and encapsulating the preset matching algorithm;
obtaining configuration parameters of the interface plug-in;
establishing a computational association between the preset index structure and the interface plug-in by running the configuration parameters.
8. An address matching device, wherein a first address is an address to be retrieved that is input by a user and a second address is stored in an index server, the device comprising:
a word segmentation module, configured to call a preset matching algorithm to segment the first address and the second address respectively according to a first preset rule, to obtain a first word-segmentation group corresponding to the first address and a second word-segmentation group corresponding to the second address, wherein the preset matching algorithm includes word-segmentation computation and matching computation;
a division module, configured to divide the first address into a plurality of first segments according to the first word-segmentation group, and to divide the second address into a plurality of second segments according to the second word-segmentation group;
a second obtaining module, configured to obtain a matching result of all the first segments and all the second segments according to a second preset rule;
a judgment module, configured to judge, according to the matching result, whether the first address and the second address are the same.
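The module structure of claim 8 can be pictured as a small pipeline in which each module is a pluggable callable. The class name, field names and the 0.9 threshold used by the judgment step are hypothetical; the claim says only that sameness is judged from the matching result.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AddressMatchingDevice:
    # Word segmentation module: address -> word-segmentation group.
    tokenize: Callable[[str], List[str]]
    # Division module: (address, word-segmentation group) -> segments.
    divide: Callable[[str, List[str]], List[str]]
    # Second obtaining module: (first segments, second segments) -> matching result.
    score: Callable[[List[str], List[str]], float]
    # Assumed cut-off used by the judgment module.
    threshold: float = 0.9

    def is_same(self, first_address: str, second_address: str) -> bool:
        """Judgment module: compare the matching result with the threshold."""
        first_segments = self.divide(first_address, self.tokenize(first_address))
        second_segments = self.divide(second_address, self.tokenize(second_address))
        return self.score(first_segments, second_segments) >= self.threshold
```

Keeping the modules as injected callables mirrors the claim's separation of word-segmentation computation from matching computation, so either can be swapped without touching the other.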
9. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910601364.8A CN110442603B (en) | 2019-07-03 | 2019-07-03 | Address matching method, device, computer equipment and storage medium |
PCT/CN2020/098804 WO2021000831A1 (en) | 2019-07-03 | 2020-06-29 | Address matching method and apparatus, computer device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910601364.8A CN110442603B (en) | 2019-07-03 | 2019-07-03 | Address matching method, device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110442603A (en) | 2019-11-12 |
CN110442603B (en) | 2024-01-19 |
Family
ID=68428771
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910601364.8A Active CN110442603B (en) | 2019-07-03 | 2019-07-03 | Address matching method, device, computer equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110442603B (en) |
WO (1) | WO2021000831A1 (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111144117A (en) * | 2019-12-26 | 2020-05-12 | 同济大学 | Knowledge graph Chinese address disambiguation method |
CN111563806A (en) * | 2020-07-20 | 2020-08-21 | 平安国际智慧城市科技股份有限公司 | Method, device, medium and electronic equipment for identifying merchant compliance in network platform |
CN112163070A (en) * | 2020-09-27 | 2021-01-01 | 杭州海康威视系统技术有限公司 | Location name matching method and device, electronic equipment and machine-readable storage medium |
WO2021000831A1 (en) * | 2019-07-03 | 2021-01-07 | 平安科技(深圳)有限公司 | Address matching method and apparatus, computer device and storage medium |
CN112256821A (en) * | 2020-09-23 | 2021-01-22 | 北京捷通华声科技股份有限公司 | Method, device, equipment and storage medium for complementing Chinese address |
CN112835899A (en) * | 2021-01-29 | 2021-05-25 | 上海寻梦信息技术有限公司 | Address library indexing method, address matching method and related equipment |
CN112835897A (en) * | 2021-01-29 | 2021-05-25 | 上海寻梦信息技术有限公司 | Geographic region division management method, data conversion method and related equipment |
CN113343688A (en) * | 2021-06-22 | 2021-09-03 | 南京星云数字技术有限公司 | Address similarity determination method and device and computer equipment |
CN113987114A (en) * | 2021-09-17 | 2022-01-28 | 上海燃气有限公司 | Address matching method and device based on semantic analysis and electronic equipment |
CN114064827A (en) * | 2020-08-05 | 2022-02-18 | 北京四维图新科技股份有限公司 | Position searching method, device and equipment |
CN114756654A (en) * | 2022-04-25 | 2022-07-15 | 广州城市信息研究所有限公司 | Dynamic place name and address matching method and device, computer equipment and storage medium |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113935293B (en) * | 2021-12-16 | 2022-03-22 | 湖南四方天箭信息科技有限公司 | Address splitting and complementing method and device, computer equipment and storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1487444A (en) * | 2002-09-13 | 2004-04-07 | 富士施乐株式会社 | Text statement comparing unit |
CN101257516A (en) * | 2008-03-11 | 2008-09-03 | 中兴通讯股份有限公司 | Method for correcting source address |
CN101770499A (en) * | 2009-01-07 | 2010-07-07 | 上海聚力传媒技术有限公司 | Information retrieval method in search engine and corresponding search engine |
CN102402533A (en) * | 2010-09-13 | 2012-04-04 | 方正国际软件有限公司 | Address matching method and system |
CN106874384A (en) * | 2017-01-10 | 2017-06-20 | 广东精规划信息科技股份有限公司 | A kind of isomery address standard handovers and matching process |
KR20180057853A (en) * | 2016-11-23 | 2018-05-31 | 잠쉬딘 허지무하메도브 | Method, system and computer program for converting addresses |
CN108763215A (en) * | 2018-05-30 | 2018-11-06 | 中智诚征信有限公司 | A kind of address storage method, device and computer equipment based on address participle |
CN109145169A (en) * | 2018-07-26 | 2019-01-04 | 浙江省测绘科学技术研究院 | A kind of address matching method based on statistics participle |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10216837B1 (en) * | 2014-12-29 | 2019-02-26 | Google Llc | Selecting pattern matching segments for electronic communication clustering |
CN110442603B (en) * | 2019-07-03 | 2024-01-19 | 平安科技(深圳)有限公司 | Address matching method, device, computer equipment and storage medium |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021000831A1 (en) * | 2019-07-03 | 2021-01-07 | 平安科技(深圳)有限公司 | Address matching method and apparatus, computer device and storage medium |
CN111144117A (en) * | 2019-12-26 | 2020-05-12 | 同济大学 | Knowledge graph Chinese address disambiguation method |
CN111144117B (en) * | 2019-12-26 | 2023-08-29 | 同济大学 | Method for disambiguating Chinese address of knowledge graph |
CN111563806A (en) * | 2020-07-20 | 2020-08-21 | 平安国际智慧城市科技股份有限公司 | Method, device, medium and electronic equipment for identifying merchant compliance in network platform |
CN114064827A (en) * | 2020-08-05 | 2022-02-18 | 北京四维图新科技股份有限公司 | Position searching method, device and equipment |
CN112256821A (en) * | 2020-09-23 | 2021-01-22 | 北京捷通华声科技股份有限公司 | Method, device, equipment and storage medium for complementing Chinese address |
CN112163070A (en) * | 2020-09-27 | 2021-01-01 | 杭州海康威视系统技术有限公司 | Location name matching method and device, electronic equipment and machine-readable storage medium |
CN112163070B (en) * | 2020-09-27 | 2024-02-27 | 杭州海康威视系统技术有限公司 | Place name matching method, place name matching device, electronic equipment and machine-readable storage medium |
CN112835899A (en) * | 2021-01-29 | 2021-05-25 | 上海寻梦信息技术有限公司 | Address library indexing method, address matching method and related equipment |
CN112835897A (en) * | 2021-01-29 | 2021-05-25 | 上海寻梦信息技术有限公司 | Geographic region division management method, data conversion method and related equipment |
CN112835897B (en) * | 2021-01-29 | 2024-03-15 | 上海寻梦信息技术有限公司 | Geographic area division management method, data conversion method and related equipment |
CN113343688A (en) * | 2021-06-22 | 2021-09-03 | 南京星云数字技术有限公司 | Address similarity determination method and device and computer equipment |
CN113987114A (en) * | 2021-09-17 | 2022-01-28 | 上海燃气有限公司 | Address matching method and device based on semantic analysis and electronic equipment |
CN114756654A (en) * | 2022-04-25 | 2022-07-15 | 广州城市信息研究所有限公司 | Dynamic place name and address matching method and device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110442603B (en) | 2024-01-19 |
WO2021000831A1 (en) | 2021-01-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110442603A (en) | Address matching method, apparatus, computer equipment and storage medium | |
CN109597856A (en) | A kind of data processing method, device, electronic equipment and storage medium | |
CN101299217B (en) | Method, apparatus and system for processing map information | |
CN103605706B (en) | A kind of resource retrieval method of knowledge based map | |
CN112347222B (en) | Method and system for converting non-standard address into standard address based on knowledge base reasoning | |
CN109657074B (en) | News knowledge graph construction method based on address tree | |
CN108268580A (en) | The answering method and device of knowledge based collection of illustrative plates | |
CN103838837B (en) | Remote sensing Metadata integration method based on semantic template | |
CN104866593A (en) | Database searching method based on knowledge graph | |
CN103440311A (en) | Method and system for identifying geographical name entities | |
CN111881290A (en) | Distribution network multi-source grid entity fusion method based on weighted semantic similarity | |
CN110457420A (en) | Point of interest location recognition methods, device, equipment and storage medium | |
CN112612908A (en) | Natural resource knowledge graph construction method and device, server and readable memory | |
CN104252507B (en) | A kind of business data matching process and device | |
CN109951377A (en) | A kind of good friend's group technology, device, computer equipment and storage medium | |
CN101639776A (en) | Database access and integration method and system thereof | |
CN110377751A (en) | Courseware intelligent generation method, device, computer equipment and storage medium | |
CN111666370A (en) | Semantic indexing method and device for multi-source heterogeneous space data | |
CN110619050A (en) | Intention recognition method and equipment | |
CN110347810A (en) | Method, apparatus, computer equipment and storage medium are answered in dialog mode retrieval | |
CN104794163A (en) | Entity set extension method | |
CN110119478A (en) | A kind of item recommendation method based on similarity of a variety of user feedback datas of combination | |
CN116414823A (en) | Address positioning method and device based on word segmentation model | |
CN112015908A (en) | Knowledge graph construction method and system, and query method and system | |
CN108345662A (en) | A kind of microblog data weighted statistical method of registering considering user distribution area differentiation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||