CN106598953A - Address resolution method and device - Google Patents

Info

Publication number
CN106598953A
Authority
CN
China
Prior art keywords
participle
result
semantic
word
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611239277.5A
Other languages
Chinese (zh)
Inventor
周长星
杨自强
毛政晖
韩永浩
张继伟
石磊
莫小良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Beyond Information Technology Services Ltd
Original Assignee
Shanghai Beyond Information Technology Services Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Beyond Information Technology Services Ltd
Priority to CN201611239277.5A
Publication of CN106598953A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

The embodiments of the invention provide an address resolution method and device, and relate to the technical field of information processing. The method comprises: segmenting an address to be matched to generate a first segmentation result, and segmenting each standard library address to generate a second segmentation result; merging the two segmentation results into a third segmentation result made up of third segmented words; calculating the semantic score of each third segmented word in the first segmentation result to generate a first semantic vector, and calculating the semantic score of each third segmented word in the second segmentation result to generate a second semantic vector; generating a first word order vector according to the third segmented words and the first segmentation result, and generating a second word order vector according to the third segmented words and the second segmentation result; calculating a semantic similarity according to the first semantic vector and the second semantic vector, and calculating a word order similarity according to the first word order vector and the second word order vector; and selecting, from the plurality of standard library addresses, the standard library address that matches the address to be matched according to the semantic similarity and the word order similarity. The address resolution method and device are simple and fast, and improve working efficiency.

Description

Address resolution method and device
Technical field
The present invention relates to the technical field of information processing, and in particular to an address resolution method and device.
Background
At present, power grid enterprises need to parse the fault addresses recorded in electric power work orders. A complete address should spell out the administrative hierarchy of the region it belongs to, but in practice writers habitually omit some administrative regions and, in order to describe the location accurately, often add repeated descriptions. In addition, staff carelessness leads to wrongly written characters or incomplete addresses. At this stage, relatively few fault addresses in electric power work orders are complete and standardized; most are incomplete and non-standard. Complete, standardized addresses can be parsed quickly, whereas incomplete and non-standard addresses can only be parsed by manual identification.
Because the number of incomplete and non-standard addresses is large, manual identification, although accurate, imposes a heavy workload on staff and is inefficient, and it cannot meet the needs of the operation monitoring business.
Summary of the invention
An object of the present invention is to provide an address resolution method and device, so as to solve the problems of heavy workload and low efficiency that exist in address resolution in the prior art.
To achieve the above object, the embodiments of the present invention adopt the following technical solutions:
In a first aspect, an embodiment of the present invention provides an address resolution method for selecting, from multiple standard library addresses, the standard library address that matches an address to be matched. The address resolution method includes: segmenting the address to be matched to generate a first segmentation result, the first segmentation result including first segmented words; segmenting each standard library address to generate a second segmentation result, the second segmentation result including second segmented words; merging the first segmentation result and the second segmentation result into a third segmentation result, the third segmentation result including third segmented words; calculating the semantic score of each third segmented word in the first segmentation result to generate a first semantic vector, and calculating the semantic score of each third segmented word in the second segmentation result to generate a second semantic vector; generating a first word order vector according to the third segmented words and the first segmentation result, and generating a second word order vector according to the third segmented words and the second segmentation result; calculating a semantic similarity according to the first semantic vector and the second semantic vector; calculating a word order similarity according to the first word order vector and the second word order vector; and selecting, from the multiple standard library addresses, the standard library address that matches the address to be matched according to the semantic similarity and the word order similarity.
In a second aspect, an embodiment of the present invention further provides an address resolution device for selecting, from multiple standard library addresses, the standard library address that matches an address to be matched. The address resolution device includes: a word segmentation module, configured to segment the address to be matched to generate a first segmentation result including first segmented words, and to segment each standard library address to generate a second segmentation result including second segmented words; a merging module, configured to merge the first segmentation result and the second segmentation result into a third segmentation result including third segmented words; a semantic vector generation module, configured to calculate the semantic score of each third segmented word in the first segmentation result to generate a first semantic vector, and to calculate the semantic score of each third segmented word in the second segmentation result to generate a second semantic vector; a word order vector generation module, configured to generate a first word order vector according to the third segmented words and the first segmentation result, and to generate a second word order vector according to the third segmented words and the second segmentation result; a semantic similarity calculation module, configured to calculate a semantic similarity according to the first semantic vector and the second semantic vector; a word order similarity calculation module, configured to calculate a word order similarity according to the first word order vector and the second word order vector; and a selection module, configured to select, from the multiple standard library addresses, the standard library address that matches the address to be matched according to the semantic similarity and the word order similarity.
Compared with the prior art, the present invention has the following beneficial effects. The address resolution method and device provided by the present invention segment the address to be matched to generate a first segmentation result, and segment each standard library address to generate a second segmentation result. The first segmentation result and the second segmentation result are merged into a third segmentation result, which includes third segmented words. The semantic score of each third segmented word in the first segmentation result is calculated to generate a first semantic vector, and the semantic score of each third segmented word in the second segmentation result is calculated to generate a second semantic vector. A first word order vector is generated according to the third segmented words and the first segmentation result, and a second word order vector is generated according to the third segmented words and the second segmentation result. A semantic similarity is calculated from the first and second semantic vectors, a word order similarity is calculated from the first and second word order vectors, and the standard library address that matches the address to be matched is selected from the multiple standard library addresses according to the semantic similarity and the word order similarity. The address resolution method and device provided by the embodiments of the present invention are simple and fast, can effectively improve working efficiency, and reduce workload.
To make the above objects, features and advantages of the present invention easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope; a person of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 shows a schematic diagram of the address resolution device provided by an embodiment of the present invention applied to a user terminal.
Fig. 2 shows a functional block diagram of the address resolution device provided by an embodiment of the present invention.
Fig. 3 shows a schematic flowchart of the address resolution method provided by an embodiment of the present invention.
Fig. 4 shows a detailed schematic flowchart of step S104 in Fig. 3.
Fig. 5 shows a detailed schematic flowchart of step S105 in Fig. 3.
Reference numerals: 100 - user terminal; 110 - address resolution device; 120 - memory; 130 - memory controller; 140 - processor; 150 - peripheral interface; 160 - display unit; 170 - input/output unit; 111 - word segmentation module; 112 - merging module; 113 - semantic vector generation module; 114 - word order vector generation module; 115 - semantic similarity calculation module; 116 - word order similarity calculation module; 117 - selection module.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments of the present invention described and illustrated in the drawings can generally be arranged and designed in a variety of configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. In the description of the present invention, the terms "first", "second" and so on are used only to distinguish descriptions and should not be understood as indicating or implying relative importance.
As shown in Fig. 1, the address resolution device 110 provided by an embodiment of the present invention is applied to a user terminal 100. The user terminal 100 may be, but is not limited to, a personal computer (PC), a smartphone, a tablet computer, a personal digital assistant (PDA), a mobile Internet device (MID), and so on. The user terminal 100 includes a memory 120, a memory controller 130, a processor 140, a peripheral interface 150, a display unit 160 and an input/output unit 170.
The memory 120, memory controller 130, processor 140, peripheral interface 150, display unit 160 and input/output unit 170 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these elements can be electrically connected to one another through one or more communication buses or signal lines. The address resolution device 110 includes at least one software function module that can be stored in the memory 120 in the form of software or firmware, or solidified in the operating system (OS) of the user terminal 100. The processor 140 is used to execute the executable modules stored in the memory 120, such as the software function modules and computer programs included in the address resolution device 110.
The memory 120 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), and so on. The memory 120 is used to store a program, and the processor 140 executes the program after receiving an execution instruction. Access to the memory 120 by the processor 140 and other possible components can be carried out under the control of the memory controller 130.
The processor 140 may be an integrated circuit chip with signal processing capability. The processor 140 can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and so on; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The peripheral interface 150 couples various input/output devices (such as the input/output unit 170 and the display unit 160) to the processor 140 and the memory 120. In some embodiments, the peripheral interface 150, the processor 140 and the memory controller 130 can be implemented in a single chip. In other embodiments, they can each be implemented by an independent chip.
The display unit 160 is used to provide an interactive interface or to display image data.
The input/output unit 170 is used for interaction between the user and the user terminal 100. In this embodiment, the input/output unit 170 may be, but is not limited to, a mouse, a keyboard, and so on.
Fig. 2 shows a functional block diagram of the address resolution device 110 provided by an embodiment of the present invention. The address resolution device 110 is applied to the user terminal 100 and is used to select, from multiple standard library addresses, the standard library address that matches the address to be matched. It includes a word segmentation module 111, a merging module 112, a semantic vector generation module 113, a word order vector generation module 114, a semantic similarity calculation module 115, a word order similarity calculation module 116 and a selection module 117. The multiple standard library addresses are stored in advance in the memory 120.
The word segmentation module 111 is used to segment the address to be matched to generate a first segmentation result that includes first segmented words, and to segment each standard library address to generate a second segmentation result that includes second segmented words. In this embodiment, given an address to be matched, segmenting it yields the first segmentation result Ti={w1, w2, ..., wn}, where w1, w2, ..., wn are the first segmented words and n is the number of first segmented words in Ti, that is, the vector length Len(Ti) of the first segmentation result Ti. Segmenting each of the multiple standard library addresses yields the second segmentation result Tj={k1, k2, ..., km}, where k1, k2, ..., km are the second segmented words and m is the number of second segmented words in Tj, that is, the vector length Len(Tj) of the second segmentation result Tj. For example, suppose the address to be matched is "Chongqing City Nan'an District Nan Ping South Road ten thousand buildings cell" and one address in the standard library is "Chongqing City Nan'an District Nanping South Road Wan Le cell". The first segmentation result generated from the address to be matched is Ti={Chongqing City, Nan'an District, Nan Ping South Road, ten thousand buildings, cell}, and the second segmentation result generated from the standard library address is Tj={Chongqing City, Nan'an District, Nanping South Road, Wan Le, cell}.
The merging module 112 is used to merge the first segmentation result and the second segmentation result into a third segmentation result that includes third segmented words. In this embodiment, the merging module 112 merges all first segmented words in the first segmentation result Ti with all second segmented words in the second segmentation result Tj, keeping only one copy where a first segmented word and a second segmented word are identical. This yields the third segmentation result T=Ti ∪ Tj={p1, p2, ..., px}, where p1, p2, ..., px are the third segmented words and x is the number of third segmented words in T, that is, the vector length Len(T) of the third segmentation result T. It follows that Len(T) ≤ Len(Tj)+Len(Ti). For example, T={Chongqing City, Nan'an District, Nanping South Road, Nan Ping South Road, ten thousand buildings, Wan Le, cell}.
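The merge described above can be sketched as an order-preserving union. This is only an illustration: the function name `merge_tokens` is hypothetical, and the element order it produces (all of Ti first, then the unseen words of Tj) is an assumption, since the text does not state how T is ordered and the example lists the words in a different order.

```python
def merge_tokens(first, second):
    """Union of two token lists, keeping a single copy of identical tokens.

    Ordering is an assumption: tokens appear in order of first occurrence,
    scanning `first` and then `second`.
    """
    merged = []
    seen = set()
    for token in list(first) + list(second):
        if token not in seen:
            seen.add(token)
            merged.append(token)
    return merged


# Segmentation results taken from the example in the text.
Ti = ["Chongqing City", "Nan'an District", "Nan Ping South Road",
      "ten thousand buildings", "cell"]
Tj = ["Chongqing City", "Nan'an District", "Nanping South Road",
      "Wan Le", "cell"]

T = merge_tokens(Ti, Tj)
print(len(T))  # 7 distinct words, consistent with Len(T) <= Len(Ti) + Len(Tj)
```

Any fixed ordering of T should serve, provided the same ordering is used when both the semantic vectors and the word order vectors are built.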
The semantic vector generation module 113 is used to calculate the semantic score of each third segmented word in the first segmentation result to generate the first semantic vector, and to calculate the semantic score of each third segmented word in the second segmentation result to generate the second semantic vector. In this embodiment, for each third segmented word (p1, p2, ..., px) in the third segmentation result T, the similarity between that third segmented word and each first segmented word (w1, w2, ..., wn) in the first segmentation result Ti is calculated in turn; preferably, the similarity takes a value between 0 and 1. The maximum of all these similarity values is called the semantic score Ci of the third segmented word in the first segmentation result Ti. In the same way, for each third segmented word in T, the similarity between that third segmented word and each second segmented word (k1, k2, ..., km) in the second segmentation result Tj is calculated in turn, and the maximum of all these similarity values is called the semantic score Cj of the third segmented word in the second segmentation result Tj. In this embodiment, the vector formed by the semantic scores Ci of every third segmented word of T in the first segmentation result Ti is called the first semantic vector and can be written Si={C1, C2, ..., Ci}; the vector formed by the semantic scores Cj of every third segmented word of T in the second segmentation result Tj is called the second semantic vector and can be written Sj={C1, C2, ..., Cj}.
In this embodiment, for each third segmented word in the third segmentation result T: when the third segmented word appears in the first segmentation result Ti, its semantic score Ci in Ti is a default value; when the third segmented word appears in the second segmentation result Tj, its semantic score Cj in Tj is the default value. In this embodiment the default value can be set to 1, that is, Ci=1 and Cj=1. When the third segmented word does not appear in the first segmentation result Ti, its semantic score Ci in Ti is a first preset value; when the third segmented word does not appear in the second segmentation result Tj, its semantic score Cj in Tj is the first preset value. In this embodiment the first preset value can be set to 0.2, but is not limited to this. For example, when Ti={Chongqing City, Nan'an District, Nan Ping South Road, ten thousand buildings, cell}, Tj={Chongqing City, Nan'an District, Nanping South Road, Wan Le, cell} and T={Chongqing City, Nan'an District, Nanping South Road, Nan Ping South Road, ten thousand buildings, Wan Le, cell}, the generated first semantic vector is Si={1, 1, 0.2, 1, 1, 0.2, 1} and the generated second semantic vector is Sj={1, 1, 1, 0.2, 0.2, 1, 1}. In this example, "Nanping South Road" and "Wan Le" do not appear in the first segmentation result Ti, so their semantic scores in Ti are 0.2; likewise, "Nan Ping South Road" and "ten thousand buildings" do not appear in the second segmentation result Tj, so their semantic scores in Tj are 0.2.
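The default-value rule above is small enough to sketch directly. The function name `semantic_vector` and the constant names are hypothetical; the values 1 and 0.2 and the word lists come from the example in the text, with T written in the order the example gives.

```python
DEFAULT_SCORE = 1.0   # default value when the word appears in the result
FIRST_PRESET = 0.2    # first preset value when the word does not appear


def semantic_vector(merged, tokens):
    """Score each word of the merged result against one segmentation result."""
    present = set(tokens)
    return [DEFAULT_SCORE if word in present else FIRST_PRESET
            for word in merged]


Ti = ["Chongqing City", "Nan'an District", "Nan Ping South Road",
      "ten thousand buildings", "cell"]
Tj = ["Chongqing City", "Nan'an District", "Nanping South Road",
      "Wan Le", "cell"]
# T in the order used by the example in the text.
T = ["Chongqing City", "Nan'an District", "Nanping South Road",
     "Nan Ping South Road", "ten thousand buildings", "Wan Le", "cell"]

Si = semantic_vector(T, Ti)
Sj = semantic_vector(T, Tj)
print(Si)  # [1.0, 1.0, 0.2, 1.0, 1.0, 0.2, 1.0]
print(Sj)  # [1.0, 1.0, 1.0, 0.2, 0.2, 1.0, 1.0]
```

With these inputs the sketch reproduces the two semantic vectors given in the example.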
The word order vector generation module 114 is used to generate the first word order vector according to the third segmented words and the first segmentation result, and to generate the second word order vector according to the third segmented words and the second segmentation result. In this embodiment, for each third segmented word (p1, p2, ..., px) in the third segmentation result T: when the third segmented word appears in the first segmentation result Ti, its word order qi (its position in Ti) is taken; when the third segmented word appears in the second segmentation result Tj, its word order qj (its position in Tj) is taken. When the third segmented word does not appear in the first segmentation result Ti, the first segmented word most similar to it is found and the similarity between the two is calculated; when this similarity is greater than a second preset value, the word order qi of that first segmented word in Ti is taken. In the same way, when the third segmented word does not appear in the second segmentation result Tj, the second segmented word most similar to it is found and the similarity between the two is calculated; when this similarity is greater than the second preset value, the word order qj of that second segmented word in Tj is taken. In this embodiment the second preset value can be set to 0.4, but is not limited to this. When the third segmented word does not appear in the first segmentation result Ti and its similarity to that first segmented word is less than the second preset value, its word order in Ti is set to empty and assigned null; when the third segmented word does not appear in the second segmentation result Tj and its similarity to that second segmented word is less than the second preset value, its word order in Tj is set to empty and assigned null. In this embodiment, the vector formed by the word orders qi is called the first word order vector and can be written ri={q1, q2, ..., qi}; the vector formed by the word orders qj is called the second word order vector and can be written rj={q1, q2, ..., qj}. For example, when Ti={Chongqing City, Nan'an District, Nan Ping South Road, ten thousand buildings, cell}, Tj={Chongqing City, Nan'an District, Nanping South Road, Wan Le, cell} and T={Chongqing City, Nan'an District, Nanping South Road, Nan Ping South Road, ten thousand buildings, Wan Le, cell}, the generated first word order vector is ri={1, 2, 3, 3, 4, null, 5} and the generated second word order vector is rj={1, 2, 3, 3, null, 4, 5}. In this example, the similarity between "Nanping South Road" and "Nan Ping South Road" is considered to be greater than the second preset value, so "Nanping South Road" takes the word order of "Nan Ping South Road" in the first segmentation result Ti, namely 3. The similarity between "Wan Le" and every segmented word in the first segmentation result Ti, including "ten thousand buildings", is considered to be less than the second preset value, so its word order is assigned null.
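The word order rule can be sketched as follows. The text does not say how the similarity between two segmented words is computed, so this sketch takes a similarity function as a parameter and, for illustration only, plugs in a hypothetical lookup table (`ASSUMED_SIMILARITY`) whose single entry, 0.8, is an invented value standing in for "greater than the second preset value". The names `word_order_vector` and `lookup_similarity` are also hypothetical, and `None` plays the role of null.

```python
SECOND_PRESET = 0.4  # second preset value from the text

# Hypothetical similarity values; the text only says this pair exceeds 0.4.
ASSUMED_SIMILARITY = {
    frozenset(("Nanping South Road", "Nan Ping South Road")): 0.8,
}


def lookup_similarity(a, b):
    """Stand-in word similarity: table lookup, 0.0 for unlisted pairs."""
    return ASSUMED_SIMILARITY.get(frozenset((a, b)), 0.0)


def word_order_vector(merged, tokens, similarity):
    """Position (1-based) of each merged word in `tokens`, or None (null).

    Assumes `tokens` is non-empty.
    """
    positions = {word: index + 1 for index, word in enumerate(tokens)}
    vector = []
    for word in merged:
        if word in positions:
            vector.append(positions[word])
            continue
        # Word is absent: fall back to the most similar word, if similar enough.
        best = max(tokens, key=lambda other: similarity(word, other))
        if similarity(word, best) > SECOND_PRESET:
            vector.append(positions[best])
        else:
            vector.append(None)
    return vector


Ti = ["Chongqing City", "Nan'an District", "Nan Ping South Road",
      "ten thousand buildings", "cell"]
Tj = ["Chongqing City", "Nan'an District", "Nanping South Road",
      "Wan Le", "cell"]
T = ["Chongqing City", "Nan'an District", "Nanping South Road",
     "Nan Ping South Road", "ten thousand buildings", "Wan Le", "cell"]

ri = word_order_vector(T, Ti, lookup_similarity)
rj = word_order_vector(T, Tj, lookup_similarity)
print(ri)  # [1, 2, 3, 3, 4, None, 5]
print(rj)  # [1, 2, 3, 3, None, 4, 5]
```

Under these assumed similarity values the sketch reproduces the word order vectors of the example. A real implementation would replace `lookup_similarity` with whatever word similarity measure the system actually uses.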
The semantic similarity calculation module 115 is used to calculate the semantic similarity according to the first semantic vector and the second semantic vector. In this embodiment, the semantic similarity is calculated by a formula over Si and Sj (the formula itself is not reproduced in this text), where Si is the first semantic vector and Sj is the second semantic vector. For example, a semantic similarity value can be calculated from the first semantic vector Si={1, 1, 0.2, 1, 1, 0.2, 1} and the second semantic vector Sj={1, 1, 1, 0.2, 0.2, 1, 1}.
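Because the formula is missing from this text, the following is only a sketch under an explicit assumption: it uses cosine similarity, a common way to compare two score vectors of equal length. It is not confirmed to be the formula of the patent, and the function name `cosine_similarity` is chosen here for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


# Semantic vectors from the example in the text.
Si = [1, 1, 0.2, 1, 1, 0.2, 1]
Sj = [1, 1, 1, 0.2, 0.2, 1, 1]

semantic_similarity = cosine_similarity(Si, Sj)
print(round(semantic_similarity, 3))  # 3.8 / 5.08, about 0.748
```

Under this assumption the two example vectors score about 0.748. A different formula would give a different value.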
The word order similarity calculation module 116 is used to calculate the word order similarity according to the first word order vector and the second word order vector. In this embodiment, the word order similarity is calculated by a formula over ri and rj (the formula itself is not reproduced in this text), where ri is the first word order vector and rj is the second word order vector. For example, a word order similarity value can be calculated from the first word order vector ri={1, 2, 3, 3, 4, null, 5} and the second word order vector rj={1, 2, 3, 3, null, 4, 5}.
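The word order formula is likewise missing from this text, so the sketch below rests on two labeled assumptions: it uses the normalized-difference measure 1 - ||ri - rj|| / ||ri + rj||, which is a widely used word order similarity, and it treats null entries as 0. Neither choice is confirmed by the source, and the function name is hypothetical.

```python
import math


def word_order_similarity(r1, r2):
    """Assumed measure: 1 - ||r1 - r2|| / ||r1 + r2||, with None read as 0."""
    a = [0 if x is None else x for x in r1]
    b = [0 if x is None else x for x in r2]
    diff_norm = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    sum_norm = math.sqrt(sum((x + y) ** 2 for x, y in zip(a, b)))
    if sum_norm == 0:
        return 1.0  # convention chosen here: two all-empty vectors are identical
    return 1.0 - diff_norm / sum_norm


# Word order vectors from the example in the text (None stands for null).
ri = [1, 2, 3, 3, 4, None, 5]
rj = [1, 2, 3, 3, None, 4, 5]

order_similarity = word_order_similarity(ri, rj)
print(round(order_similarity, 3))  # 1 - sqrt(32) / sqrt(224), about 0.622
```

Under these assumptions the example vectors score about 0.622; identical vectors score 1.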
The selection module 117 is used to select, from the multiple standard library addresses, the standard library address that matches the address to be matched according to the semantic similarity and the word order similarity. In this embodiment, after the semantic similarity and word order similarity between the address to be matched and each of the multiple standard library addresses have been calculated, multiple semantic similarity values and word order similarity values are obtained. For example, a third preset value can be set, and the standard library addresses whose semantic similarity and word order similarity are both greater than the third preset value can be selected. These matching standard library addresses are inserted into a result table, which can provide information support to departments such as the customer service center, the dispatching control center, the distribution emergency repair center and operation and maintenance, and helps allocate human resources reasonably.
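A minimal sketch of the threshold-based selection follows. The text gives no value for the third preset value, so `THIRD_PRESET = 0.6` is a hypothetical choice, and the candidate names and scores are invented purely to exercise the function.

```python
THIRD_PRESET = 0.6  # hypothetical; the text does not give a value


def select_matches(scored, threshold=THIRD_PRESET):
    """Keep addresses whose two similarities both exceed the threshold.

    `scored` holds (address, semantic_similarity, word_order_similarity) tuples.
    """
    return [address for address, semantic, order in scored
            if semantic > threshold and order > threshold]


# Invented candidates, for illustration only.
candidates = [
    ("standard address A", 0.75, 0.62),
    ("standard address B", 0.40, 0.90),
    ("standard address C", 0.55, 0.30),
]

result_table = select_matches(candidates)
print(result_table)  # ['standard address A']
```

Requiring both scores to pass the same threshold follows the wording of the example; separate thresholds, or a weighted combination of the two scores, would be alternative designs that the text does not describe.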
Fig. 3 shows a schematic flowchart of the address resolution method provided by an embodiment of the present invention. It should be noted that the address resolution method of the present invention is not limited to the specific order shown in Fig. 3 and described below. It should be understood that, in other embodiments, the order of some steps of the address resolution method can be exchanged according to actual needs, and some steps can be omitted or deleted. The flow shown in Fig. 3 is described in detail below.
Step S101: segment the address to be matched to generate a first segmentation result, the first segmentation result including first segmented words.
In this embodiment, given an address to be matched, segmentation yields the first segmentation result Ti={w1, w2, ..., wn}, where w1, w2, ..., wn are the first segmented words. For example, if the address to be matched is "Chongqing City Nan'an District Nan Ping South Road ten thousand buildings cell", the first segmentation result generated by segmentation is Ti={Chongqing City, Nan'an District, Nan Ping South Road, ten thousand buildings, cell}.
It can be understood that step S101 can be performed by the word segmentation module 111 described above.
Step S102: segment each standard library address to generate a second segmentation result, the second segmentation result including second segmented words.
In this embodiment, each of the multiple standard library addresses is segmented to obtain the second segmentation result Tj={k1, k2, ..., km}, where k1, k2, ..., km are the second segmented words. For example, if one address in the standard library is "Chongqing City Nan'an District Nanping South Road Wan Le cell", the second segmentation result generated by segmentation is Tj={Chongqing City, Nan'an District, Nanping South Road, Wan Le, cell}.
It can be understood that step S102 can be performed by the word segmentation module 111 described above.
Step S103, by the first participle result and second word segmentation result the 3rd word segmentation result is merged into, the 3rd point Word result includes the 3rd participle.
In the present embodiment, by all first participles in first participle result Ti and second word segmentation result Tj All second participles are merged, and for the identical first participle and the second participle only retain one, thus obtain the 3rd participle As a result T={ p1, p2 ..., px }, wherein, p1, p2 ..., px is the 3rd participle.For example, T={ Chongqing City, Nan'an District, Nanping south Road, Nan Ping South Road, ten thousand buildings, Wan Le, cell }.
It can be understood that step S103 may be performed by the merging module 112 described above.
Step S104: calculate the semantic score of each third segmented word in the first word segmentation result to generate a first semantic vector, and calculate the semantic score of each third segmented word in the second word segmentation result to generate a second semantic vector.
In this embodiment, for each third segmented word (p1, p2, ..., px) in the third word segmentation result T, the similarity between that third segmented word and each first segmented word (w1, w2, ..., wn) in the first word segmentation result Ti is calculated in turn, and the maximum of these similarities is the semantic score Ci of that third segmented word in Ti. The semantic scores Ci of all third segmented words in Ti form the first semantic vector Si = {C1, C2, ..., Ci}. In the same way, the semantic score Cj of each third segmented word in the second word segmentation result Tj is obtained, and these scores form the second semantic vector Sj = {C1, C2, ..., Cj}.
It can be understood that step S104 may be performed by the semantic vector generation module 113 described above.
As shown in Figure 4, in this embodiment, step S104 includes the following sub-steps:
Sub-step S1041: when a third segmented word appears in the first word segmentation result or the second word segmentation result, its semantic score in that result is a default value. In this embodiment, the default value may be set to 1; that is, the semantic score of the third segmented word in the first word segmentation result Ti is Ci = 1, and its semantic score in the second word segmentation result Tj is Cj = 1.
Sub-step S1042: when a third segmented word does not appear in the first word segmentation result or the second word segmentation result, its semantic score in that result is a first preset value. In this embodiment, the first preset value may be set to 0.2, but is not limited thereto; in that case the semantic score of the third segmented word in the first word segmentation result Ti is Ci = 0.2, and its semantic score in the second word segmentation result Tj is Cj = 0.2.
In this embodiment, sub-steps S1041 and S1042 generate the first semantic vector Si = {1, 1, 0.2, 1, 1, 0.2, 1} and the second semantic vector Sj = {1, 1, 1, 0.2, 0.2, 1, 1}.
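Sub-steps S1041 and S1042 amount to a membership test for each third segmented word. The following Python sketch illustrates this under stated assumptions: the function and constant names are hypothetical, the tokens are English renderings of the example addresses, and T is written in the order given in the example.

```python
DEFAULT_SCORE = 1.0   # default value from sub-step S1041
MISSING_SCORE = 0.2   # first preset value from sub-step S1042


def semantic_vector(merged, tokens):
    """Score each merged token: 1 if it occurs in `tokens`, otherwise 0.2."""
    present = set(tokens)
    return [DEFAULT_SCORE if p in present else MISSING_SCORE for p in merged]


# English renderings of the example tokens (illustrative only).
Ti = ["Chongqing City", "Nan'an District", "Nan Ping South Road", "Wan Lou", "Community"]
Tj = ["Chongqing City", "Nan'an District", "Nanping South Road", "Wan Le", "Community"]
T = ["Chongqing City", "Nan'an District", "Nanping South Road",
     "Nan Ping South Road", "Wan Lou", "Wan Le", "Community"]

Si = semantic_vector(T, Ti)
Sj = semantic_vector(T, Tj)
print(Si)  # [1.0, 1.0, 0.2, 1.0, 1.0, 0.2, 1.0]
print(Sj)  # [1.0, 1.0, 1.0, 0.2, 0.2, 1.0, 1.0]
```

With these inputs the sketch reproduces the two semantic vectors given in the embodiment.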
Step S105: generate a first word order vector from the third segmented words and the first word segmentation result, and generate a second word order vector from the third segmented words and the second word segmentation result.
It can be understood that step S105 may be performed by the word order vector generation module 114 described above.
As shown in Figure 5, in this embodiment, step S105 includes the following sub-steps:
Sub-step S1051: for each third segmented word (p1, p2, ..., px), when the third segmented word appears in the first word segmentation result or the second word segmentation result, determine the word order (position) at which it appears in the first word segmentation result and the word order at which it appears in the second word segmentation result.
Sub-step S1052: when the third segmented word does not appear in the first word segmentation result or the second word segmentation result, find the first segmented word or second segmented word that is similar to it; when the similarity between the third segmented word and that first or second segmented word is greater than a second preset value, determine the word order at which that first segmented word appears in the first word segmentation result, or the word order at which that second segmented word appears in the second word segmentation result. In this embodiment, the second preset value may be set to 0.4, but is not limited thereto.
Sub-step S1053: when the third segmented word does not appear in the first word segmentation result or the second word segmentation result and its similarity to the first or second segmented word is less than the second preset value, the word order of the third segmented word is set to empty. In this embodiment, it is assigned null.
In this embodiment, sub-steps S1051, S1052, and S1053 yield the first word order vector ri = {q1, q2, ..., qi} composed of word orders qi and the second word order vector rj = {q1, q2, ..., qj} composed of word orders qj. For example, the first word order vector is ri = {1, 2, 3, 3, 4, null, 5} and the second word order vector is rj = {1, 2, 3, 3, null, 4, 5}.
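Sub-steps S1051 to S1053 can be sketched as follows. This Python sketch is illustrative only: the patent text does not say how the similarity between two segmented words is measured, so the similarity function is passed in as a parameter, and the lookup table `TOY_SIMILARITY` holds made-up values chosen only to exercise each branch. All function names are hypothetical, and a similarity exactly equal to the threshold is treated as "not similar", which the text leaves open.

```python
def word_order_vector(merged, tokens, similarity, threshold=0.4):
    """Map each merged token to a 1-based position in `tokens`, or None."""
    vector = []
    for p in merged:
        if p in tokens:
            vector.append(tokens.index(p) + 1)       # S1051: exact occurrence
            continue
        # S1052: fall back to the most similar token in `tokens`.
        best = max(tokens, key=lambda w: similarity(p, w))
        if similarity(p, best) > threshold:
            vector.append(tokens.index(best) + 1)
        else:
            vector.append(None)                      # S1053: word order is null
    return vector


# Hypothetical similarity values, for illustration only.
TOY_SIMILARITY = {
    frozenset(["Nanping South Road", "Nan Ping South Road"]): 0.8,
    frozenset(["Wan Le", "Wan Lou"]): 0.3,
}


def toy_similarity(a, b):
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset([a, b]), 0.0)


Ti = ["Chongqing City", "Nan'an District", "Nan Ping South Road", "Wan Lou", "Community"]
Tj = ["Chongqing City", "Nan'an District", "Nanping South Road", "Wan Le", "Community"]
T = ["Chongqing City", "Nan'an District", "Nanping South Road",
     "Nan Ping South Road", "Wan Lou", "Wan Le", "Community"]

ri = word_order_vector(T, Ti, toy_similarity)
rj = word_order_vector(T, Tj, toy_similarity)
print(ri)  # [1, 2, 3, 3, 4, None, 5]
print(rj)  # [1, 2, 3, 3, None, 4, 5]
```

With the toy similarity values, the two road-name variants share position 3, while the two community names are too dissimilar and receive null, which matches the example vectors.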
Step S106: calculate the semantic similarity from the first semantic vector and the second semantic vector.
In this embodiment, the semantic similarity is computed by a formula (not reproduced in this text) in which Si is the first semantic vector and Sj is the second semantic vector. For example, the semantic similarity value can be calculated from the first semantic vector Si = {1, 1, 0.2, 1, 1, 0.2, 1} and the second semantic vector Sj = {1, 1, 1, 0.2, 0.2, 1, 1}.
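Because the formula itself does not survive in this text, the following Python sketch substitutes cosine similarity, a common way to compare two score vectors. That choice is an assumption and may not be the formula the patent uses; the function name is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors.

    Assumption: cosine similarity stands in for the patent's formula,
    which is not reproduced in the source text.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


Si = [1, 1, 0.2, 1, 1, 0.2, 1]
Sj = [1, 1, 1, 0.2, 0.2, 1, 1]

semantic_similarity = cosine_similarity(Si, Sj)
print(round(semantic_similarity, 3))  # 0.748
```

Under this assumed formula the example vectors give 3.8 / 5.08, about 0.748; identical vectors would give 1.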
It can be understood that step S106 may be performed by the semantic similarity calculation module 115 described above.
Step S107: calculate the word order similarity from the first word order vector and the second word order vector.
In this embodiment, the word order similarity is computed by a formula (not reproduced in this text) in which ri is the first word order vector and rj is the second word order vector. For example, the word order similarity value can be calculated from the first word order vector ri = {1, 2, 3, 3, 4, null, 5} and the second word order vector rj = {1, 2, 3, 3, null, 4, 5}.
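The word order formula is likewise missing from this text. As a hedged illustration, the Python sketch below uses a normalized-difference measure, 1 - |ri - rj| / |ri + rj|, that is often used for word order comparison. Both that formula and the treatment of null as 0 are assumptions, not taken from the patent; the function name is hypothetical.

```python
import math


def word_order_similarity(ri, rj):
    """Return 1 - ||ri - rj|| / ||ri + rj|| for two word order vectors.

    Assumptions: this formula stands in for the one missing from the source
    text, null (None) entries count as 0, and the vectors are not both all
    zero (otherwise the denominator would be zero).
    """
    a = [0 if q is None else q for q in ri]
    b = [0 if q is None else q for q in rj]
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    total = math.sqrt(sum((x + y) ** 2 for x, y in zip(a, b)))
    return 1.0 - diff / total


ri = [1, 2, 3, 3, 4, None, 5]
rj = [1, 2, 3, 3, None, 4, 5]

order_similarity = word_order_similarity(ri, rj)
print(round(order_similarity, 3))  # 0.622
```

Under these assumptions the example vectors differ only in the two null positions, giving 1 - sqrt(32) / sqrt(224), about 0.622.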
It can be understood that step S107 may be performed by the word order similarity calculation module 116 described above.
Step S108: select, from the multiple standard library addresses, the standard library address that matches the address to be matched according to the semantic similarity and the word order similarity.
In this embodiment, after the semantic similarity and word order similarity between the address to be matched and each of the multiple standard library addresses have been calculated, multiple semantic similarity values and word order similarity values are obtained. A third preset value may be set, and the standard library addresses whose semantic similarity and word order similarity are both greater than the third preset value may then be selected. These matched standard library addresses are inserted into a result table, which can provide information support to departments such as the customer service center, the dispatch control center, the distribution network emergency repair center, and maintenance, so that human resources can be allocated reasonably.
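The selection in step S108 can be sketched as a threshold filter over the per-address similarity values. In this Python sketch the candidate names, their similarity values, and the threshold of 0.7 are all hypothetical, since the text does not give a value for the third preset value; the function name is also hypothetical.

```python
def select_matches(candidates, threshold):
    """Keep addresses whose two similarities both exceed the threshold.

    `candidates` is a list of (address, semantic_similarity,
    word_order_similarity) tuples.
    """
    return [
        address
        for address, semantic, order in candidates
        if semantic > threshold and order > threshold
    ]


# Hypothetical candidates and similarity values, for illustration only.
candidates = [
    ("standard address A", 0.95, 0.90),
    ("standard address B", 0.75, 0.62),
    ("standard address C", 0.40, 0.85),
]

result_table = select_matches(candidates, threshold=0.7)
print(result_table)  # ['standard address A']
```

Only the first candidate passes, because the other two each fall below the threshold on one of the two measures.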
It can be understood that step S108 may be performed by the selection module 117 described above.
It should be noted that there may be one or more standard library addresses in this embodiment. When selecting the standard library address that matches the address to be matched, the semantic similarity and word order similarity are calculated between each standard library address and the address to be matched, and the standard library address that matches the address to be matched is then selected.
In summary, the address resolution method and device provided by the embodiments of the present invention perform word segmentation on the address to be matched to generate a first word segmentation result, and perform word segmentation on each standard library address to generate a second word segmentation result. The first word segmentation result and the second word segmentation result are merged into a third word segmentation result, which includes third segmented words. The semantic score of each third segmented word in the first word segmentation result is calculated to generate a first semantic vector, and the semantic score of each third segmented word in the second word segmentation result is calculated to generate a second semantic vector. A first word order vector is generated from the third segmented words and the first word segmentation result, and a second word order vector is generated from the third segmented words and the second word segmentation result. The semantic similarity is calculated from the first and second semantic vectors, the word order similarity is calculated from the first and second word order vectors, and the standard library address that matches the address to be matched is selected from the multiple standard library addresses according to the semantic similarity and the word order similarity. The address resolution method and device provided by the embodiments of the present invention are simple and fast, and can effectively improve work efficiency and reduce workload.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The foregoing describes only the preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention. It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings.

Claims (10)

1. An address resolution method for selecting, from multiple standard library addresses, a standard library address that matches an address to be matched, characterized in that the address resolution method comprises:
performing word segmentation on the address to be matched to generate a first word segmentation result, the first word segmentation result including first segmented words;
performing word segmentation on each of the standard library addresses to generate a second word segmentation result, the second word segmentation result including second segmented words;
merging the first word segmentation result and the second word segmentation result into a third word segmentation result, the third word segmentation result including third segmented words;
calculating the semantic score of the third segmented word in the first word segmentation result to generate a first semantic vector, and calculating the semantic score of the third segmented word in the second word segmentation result to generate a second semantic vector;
generating a first word order vector from the third segmented word and the first word segmentation result, and generating a second word order vector from the third segmented word and the second word segmentation result;
calculating a semantic similarity from the first semantic vector and the second semantic vector;
calculating a word order similarity from the first word order vector and the second word order vector;
selecting, from the multiple standard library addresses, the standard library address that matches the address to be matched according to the semantic similarity and the word order similarity.
2. The address resolution method of claim 1, characterized in that when the third segmented word appears in the first word segmentation result or the second word segmentation result, the semantic score of the third segmented word in the first word segmentation result or the second word segmentation result is a default value; and when the third segmented word does not appear in the first word segmentation result or the second word segmentation result, the semantic score of the third segmented word in the first word segmentation result or the second word segmentation result is a first preset value.
3. The address resolution method of claim 1, characterized in that when the third segmented word appears in the first word segmentation result or the second word segmentation result, the word order at which the third segmented word appears in the first word segmentation result is determined to generate the first word order vector, or the word order at which the third segmented word appears in the second word segmentation result is determined to generate the second word order vector;
when the third segmented word does not appear in the first word segmentation result or the second word segmentation result, the first segmented word or the second segmented word similar to the third segmented word is found, and when the similarity between the third segmented word and the first segmented word or the second segmented word is greater than a second preset value, the word order at which the first segmented word appears in the first word segmentation result is determined to generate the first word order vector, or the word order at which the second segmented word appears in the second word segmentation result is determined to generate the second word order vector;
when the third segmented word does not appear in the first word segmentation result or the second word segmentation result and the similarity between the third segmented word and the first segmented word or the second segmented word is less than the second preset value, the word order of the third segmented word is set to empty.
4. The address resolution method of claim 1, characterized in that the semantic similarity is computed by a formula (not reproduced in this text) in which Si is the first semantic vector and Sj is the second semantic vector.
5. The address resolution method of claim 1, characterized in that the word order similarity is computed by a formula (not reproduced in this text) in which ri is the first word order vector and rj is the second word order vector.
6. An address resolution device for selecting, from multiple standard library addresses, a standard library address that matches an address to be matched, characterized in that the address resolution device comprises:
a word segmentation module, configured to perform word segmentation on the address to be matched to generate a first word segmentation result, the first word segmentation result including first segmented words, and to perform word segmentation on each of the standard library addresses to generate a second word segmentation result, the second word segmentation result including second segmented words;
a merging module, configured to merge the first word segmentation result and the second word segmentation result into a third word segmentation result, the third word segmentation result including third segmented words;
a semantic vector generation module, configured to calculate the semantic score of the third segmented word in the first word segmentation result to generate a first semantic vector, and to calculate the semantic score of the third segmented word in the second word segmentation result to generate a second semantic vector;
a word order vector generation module, configured to generate a first word order vector from the third segmented word and the first word segmentation result, and to generate a second word order vector from the third segmented word and the second word segmentation result;
a semantic similarity calculation module, configured to calculate a semantic similarity from the first semantic vector and the second semantic vector;
a word order similarity calculation module, configured to calculate a word order similarity from the first word order vector and the second word order vector;
a selection module, configured to select, from the multiple standard library addresses, the standard library address that matches the address to be matched according to the semantic similarity and the word order similarity.
7. The address resolution device of claim 6, characterized in that when the third segmented word appears in the first word segmentation result or the second word segmentation result, the semantic score of the third segmented word in the first word segmentation result or the second word segmentation result is a default value; and when the third segmented word does not appear in the first word segmentation result or the second word segmentation result, the semantic score of the third segmented word in the first word segmentation result or the second word segmentation result is a first preset value.
8. The address resolution device of claim 6, characterized in that when the third segmented word appears in the first word segmentation result or the second word segmentation result, the word order at which the third segmented word appears in the first word segmentation result is determined to generate the first word order vector, or the word order at which the third segmented word appears in the second word segmentation result is determined to generate the second word order vector;
when the third segmented word does not appear in the first word segmentation result or the second word segmentation result, the first segmented word or the second segmented word similar to the third segmented word is found, and when the similarity between the third segmented word and the first segmented word or the second segmented word is greater than a second preset value, the word order at which the first segmented word appears in the first word segmentation result is determined to generate the first word order vector, or the word order at which the second segmented word appears in the second word segmentation result is determined to generate the second word order vector;
when the third segmented word does not appear in the first word segmentation result or the second word segmentation result and the similarity between the third segmented word and the first segmented word or the second segmented word is less than the second preset value, the word order of the third segmented word is set to empty.
9. The address resolution device of claim 6, characterized in that the semantic similarity is computed by a formula (not reproduced in this text) in which Si is the first semantic vector and Sj is the second semantic vector.
10. The address resolution device of claim 6, characterized in that the word order similarity is computed by a formula (not reproduced in this text) in which ri is the first word order vector and rj is the second word order vector.
CN201611239277.5A 2016-12-28 2016-12-28 Address resolution method and device Pending CN106598953A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611239277.5A CN106598953A (en) 2016-12-28 2016-12-28 Address resolution method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611239277.5A CN106598953A (en) 2016-12-28 2016-12-28 Address resolution method and device

Publications (1)

Publication Number Publication Date
CN106598953A true CN106598953A (en) 2017-04-26

Family

ID=58604811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611239277.5A Pending CN106598953A (en) 2016-12-28 2016-12-28 Address resolution method and device

Country Status (1)

Country Link
CN (1) CN106598953A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107577744A (en) * 2017-08-28 2018-01-12 苏州科技大学 Nonstandard Address automatic matching model, matching process and method for establishing model
CN109145073A (en) * 2018-08-28 2019-01-04 成都市映潮科技股份有限公司 A kind of address resolution method and device based on segmentation methods
CN109145095A (en) * 2017-06-16 2019-01-04 贵州小爱机器人科技有限公司 Information of place names matching process, information matching method, device and computer equipment
CN109254964A (en) * 2018-08-20 2019-01-22 中国平安人寿保险股份有限公司 Address Standardization method, apparatus, computer equipment and storage medium
CN109753555A (en) * 2018-11-30 2019-05-14 平安科技(深圳)有限公司 Word match method, apparatus, equipment and computer readable storage medium
CN110019575A (en) * 2017-08-04 2019-07-16 北京京东尚科信息技术有限公司 The method and apparatus that geographical address is standardized
CN110532546A (en) * 2019-07-29 2019-12-03 河北远东通信系统工程有限公司 A kind of automatic delivery method of alert merging geographical location and text similarity
CN111400433A (en) * 2019-01-02 2020-07-10 阿里巴巴集团控股有限公司 Address text processing method and device
CN111625732A (en) * 2020-05-25 2020-09-04 鼎富智能科技有限公司 Address matching method and device
CN112818685A (en) * 2021-01-29 2021-05-18 上海寻梦信息技术有限公司 Address matching method and device, electronic equipment and storage medium
CN112884390A (en) * 2019-11-29 2021-06-01 北京三快在线科技有限公司 Order processing method and device, readable storage medium and electronic equipment
CN113987114A (en) * 2021-09-17 2022-01-28 上海燃气有限公司 Address matching method and device based on semantic analysis and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
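孙亚夫 et al.: "Address matching technology based on word segmentation", Annual Conference of the China Association for Geographic Information System, 2007 *
殷耀明 et al.: "Sentence similarity calculation based on a relation vector model", Computer Engineering and Applications (《计算机工程与应用》) *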

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145095A (en) * 2017-06-16 2019-01-04 贵州小爱机器人科技有限公司 Information of place names matching process, information matching method, device and computer equipment
CN109145095B (en) * 2017-06-16 2024-03-29 贵州小爱机器人科技有限公司 Place name information matching method, information matching device and computer equipment
CN110019575A (en) * 2017-08-04 2019-07-16 北京京东尚科信息技术有限公司 The method and apparatus that geographical address is standardized
CN107577744A (en) * 2017-08-28 2018-01-12 苏州科技大学 Nonstandard Address automatic matching model, matching process and method for establishing model
CN109254964A (en) * 2018-08-20 2019-01-22 中国平安人寿保险股份有限公司 Address Standardization method, apparatus, computer equipment and storage medium
CN109145073A (en) * 2018-08-28 2019-01-04 成都市映潮科技股份有限公司 A kind of address resolution method and device based on segmentation methods
CN109753555A (en) * 2018-11-30 2019-05-14 平安科技(深圳)有限公司 Word match method, apparatus, equipment and computer readable storage medium
CN109753555B (en) * 2018-11-30 2023-07-07 平安科技(深圳)有限公司 Word matching method, device, equipment and computer readable storage medium
CN111400433B (en) * 2019-01-02 2023-04-11 阿里巴巴集团控股有限公司 Address text processing method and device
CN111400433A (en) * 2019-01-02 2020-07-10 阿里巴巴集团控股有限公司 Address text processing method and device
CN110532546A (en) * 2019-07-29 2019-12-03 河北远东通信系统工程有限公司 A kind of automatic delivery method of alert merging geographical location and text similarity
CN112884390A (en) * 2019-11-29 2021-06-01 北京三快在线科技有限公司 Order processing method and device, readable storage medium and electronic equipment
CN111625732A (en) * 2020-05-25 2020-09-04 鼎富智能科技有限公司 Address matching method and device
CN111625732B (en) * 2020-05-25 2023-06-23 鼎富智能科技有限公司 Address matching method and device
CN112818685A (en) * 2021-01-29 2021-05-18 上海寻梦信息技术有限公司 Address matching method and device, electronic equipment and storage medium
CN113987114A (en) * 2021-09-17 2022-01-28 上海燃气有限公司 Address matching method and device based on semantic analysis and electronic equipment

Similar Documents

Publication Publication Date Title
CN106598953A (en) Address resolution method and device
WO2021135919A1 (en) Machine learning-based sql statement security testing method and apparatus, device, and medium
CN107729924B (en) Picture review probability interval generation method and picture review determination method
CN112365202A (en) Method for screening evaluation factors of multi-target object and related equipment thereof
CN106168959A (en) Page layout method and device
KR20190017395A (en) Method for providing data management service having automatic cell merging function and providing service server for performing the same
CN113434542B (en) Data relationship identification method and device, electronic equipment and storage medium
CN111914101B (en) File association relationship abnormality identification method and device and computer equipment
CN113779269A (en) Power grid load data display method and device, electronic equipment and storage medium
CN113595246A (en) Microgrid state online monitoring method and device, computer equipment and storage medium
CN110443072B (en) Data signature method, data verification device and storage medium
CN108830663B (en) Electric power customer value evaluation method and system and terminal equipment
CN113365113B (en) Target node identification method and device
CN110399658A (en) Accelerated factor value calculating method, device, equipment and the storage medium of battery
CN113656187B (en) Public security big data computing power service system based on 5G
CN114123190A (en) Method and device for determining target region to which ammeter belongs, electronic equipment and storage medium
WO2022105120A1 (en) Text detection method and apparatus from image, computer device and storage medium
CN111738290B (en) Image detection method, model construction and training method, device, equipment and medium
CN114565105A (en) Data processing method and deep learning model training method and device
CN109696614A (en) Circuit test optimization method and device
CN114398434A (en) Structured information extraction method and device, electronic equipment and storage medium
CN114782668A (en) Model aggregation method, device and system and electronic equipment
CN111222739A (en) Task allocation method and task allocation system of nuclear power station
CN116992220B (en) Low-redundancy electricity consumption data intelligent acquisition method
CN115509909B (en) Test method, test device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170426