CN106598953A - Address resolution method and device - Google Patents
- Publication number
- CN106598953A (application CN201611239277.5A)
- Authority
- CN
- China
- Prior art keywords
- participle
- result
- semantic
- word
- vector
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
- G06F40/30—Semantic analysis
Abstract
The embodiments of the invention provide an address resolution method and device, relating to the technical field of information processing. The method comprises: segmenting an address to be matched to generate a first segmentation result, and segmenting each standard library address to generate a second segmentation result; calculating the semantic score of each third segment (a segment of the merged first and second segmentation results) in the first segmentation result to generate a first semantic vector, and calculating the semantic score of each third segment in the second segmentation result to generate a second semantic vector; generating a first word order vector according to the third segments and the first segmentation result, and a second word order vector according to the third segments and the second segmentation result; calculating a semantic similarity according to the first and second semantic vectors, and a word order similarity according to the first and second word order vectors; and selecting, from the plurality of standard library addresses, a standard library address matching the address to be matched according to the semantic similarity and the word order similarity. The address resolution method and device are simple and fast and improve working efficiency.
Description
Technical field
The present invention relates to the technical field of information processing, and in particular to an address resolution method and device.
Background art
At present, power grid enterprises need to parse the fault addresses recorded in electric power work orders. A complete address should spell out the administrative hierarchy of the region it belongs to, but in practice writers habitually omit some administrative regions and, in order to describe the address accurately, often add repeated descriptions. In addition, carelessness by staff leads to wrong characters or incomplete addresses. At this stage, few fault addresses in electric power work orders are complete and standardized; most are incomplete and non-standard. Complete, standardized addresses can be parsed quickly, but incomplete and non-standard addresses can only be parsed by manual recognition.
Because the number of incomplete, non-standard addresses is large, manual recognition, although accurate, imposes a heavy workload on staff, is inefficient, and cannot meet the demands of operation monitoring.
Summary of the invention
It is an object of the invention to provide an address resolution method and device, so as to solve the problems of heavy workload and low efficiency that exist in address resolution in the prior art.
To achieve this object, the technical solutions adopted by the embodiments of the present invention are as follows:
In a first aspect, an embodiment of the present invention provides an address resolution method for selecting, from a plurality of standard library addresses, the standard library address that matches an address to be matched. The address resolution method includes: segmenting the address to be matched to generate a first segmentation result, the first segmentation result including first segments; segmenting each standard library address to generate a second segmentation result, the second segmentation result including second segments; merging the first segmentation result and the second segmentation result into a third segmentation result, the third segmentation result including third segments; calculating the semantic score of each third segment in the first segmentation result to generate a first semantic vector, and calculating the semantic score of each third segment in the second segmentation result to generate a second semantic vector; generating a first word order vector according to the third segments and the first segmentation result, and generating a second word order vector according to the third segments and the second segmentation result; calculating a semantic similarity according to the first semantic vector and the second semantic vector; calculating a word order similarity according to the first word order vector and the second word order vector; and selecting, according to the semantic similarity and the word order similarity, the standard library address that matches the address to be matched from the plurality of standard library addresses.
In a second aspect, an embodiment of the present invention further provides an address resolution device for selecting, from a plurality of standard library addresses, the standard library address that matches an address to be matched. The address resolution device includes: a segmentation module, configured to segment the address to be matched to generate a first segmentation result, the first segmentation result including first segments, and to segment each standard library address to generate a second segmentation result, the second segmentation result including second segments; a merging module, configured to merge the first segmentation result and the second segmentation result into a third segmentation result, the third segmentation result including third segments; a semantic vector generation module, configured to calculate the semantic score of each third segment in the first segmentation result to generate a first semantic vector, and to calculate the semantic score of each third segment in the second segmentation result to generate a second semantic vector; a word order vector generation module, configured to generate a first word order vector according to the third segments and the first segmentation result, and a second word order vector according to the third segments and the second segmentation result; a semantic similarity calculation module, configured to calculate a semantic similarity according to the first semantic vector and the second semantic vector; a word order similarity calculation module, configured to calculate a word order similarity according to the first word order vector and the second word order vector; and a selection module, configured to select, according to the semantic similarity and the word order similarity, the standard library address that matches the address to be matched from the plurality of standard library addresses.
Compared with the prior art, the present invention has the following beneficial effects. The address resolution method and device provided by the present invention segment the address to be matched to generate a first segmentation result, and segment each standard library address to generate a second segmentation result. The first segmentation result and the second segmentation result are merged into a third segmentation result, which includes third segments. The semantic score of each third segment in the first segmentation result is calculated to generate a first semantic vector, and the semantic score of each third segment in the second segmentation result is calculated to generate a second semantic vector. A first word order vector is generated according to the third segments and the first segmentation result, and a second word order vector is generated according to the third segments and the second segmentation result. A semantic similarity is calculated from the first and second semantic vectors, a word order similarity is calculated from the first and second word order vectors, and the standard library address that matches the address to be matched is selected from the plurality of standard library addresses according to the semantic similarity and the word order similarity. The address resolution method and device provided by the embodiments of the present invention are simple and fast, and can effectively improve working efficiency and reduce workload.
To make the above objects, features and advantages of the present invention more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 is a schematic diagram of the address resolution device provided by an embodiment of the present invention applied to a user terminal.
Fig. 2 is a functional block diagram of the address resolution device provided by an embodiment of the present invention.
Fig. 3 is a schematic flowchart of the address resolution method provided by an embodiment of the present invention.
Fig. 4 is a detailed flowchart of step S104 in Fig. 3.
Fig. 5 is a detailed flowchart of step S105 in Fig. 3.
Reference numerals: 100 - user terminal; 110 - address resolution device; 120 - memory; 130 - storage controller; 140 - processor; 150 - peripheral interface; 160 - display unit; 170 - input/output unit; 111 - segmentation module; 112 - merging module; 113 - semantic vector generation module; 114 - word order vector generation module; 115 - semantic similarity calculation module; 116 - word order similarity calculation module; 117 - selection module.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. The components of the embodiments of the present invention that are described and illustrated in the drawings can generally be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. Meanwhile, in the description of the present invention, the terms "first", "second" and the like are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
As shown in Fig. 1, the address resolution device 110 provided by an embodiment of the present invention is applied to a user terminal 100. The user terminal 100 may be, but is not limited to, a personal computer (PC), a smartphone, a tablet computer, a personal digital assistant (PDA), a mobile Internet device (MID), and the like. The user terminal 100 includes a memory 120, a storage controller 130, a processor 140, a peripheral interface 150, a display unit 160 and an input/output unit 170.
The memory 120, storage controller 130, processor 140, peripheral interface 150, display unit 160 and input/output unit 170 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these elements may be connected to one another through one or more communication buses or signal lines. The address resolution device 110 includes at least one software function module that can be stored in the memory 120 in the form of software or firmware, or built into the operating system (OS) of the user terminal 100. The processor 140 is configured to execute the executable modules stored in the memory 120, such as the software function modules and computer programs included in the address resolution device 110.
The memory 120 may be, but is not limited to, a random access memory (Random Access Memory, RAM), a read-only memory (Read Only Memory, ROM), a programmable read-only memory (Programmable Read-Only Memory, PROM), an erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), an electrically erasable programmable read-only memory (Electric Erasable Programmable Read-Only Memory, EEPROM), and the like. The memory 120 is used to store a program, and the processor 140 executes the program after receiving an execution instruction. Access to the memory 120 by the processor 140 and other possible components may be carried out under the control of the storage controller 130.
The processor 140 may be an integrated circuit chip with signal processing capability. The processor 140 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The peripheral interface 150 couples various input/output devices (such as the input/output unit 170 and the display unit 160) to the processor 140 and the memory 120. In some embodiments, the peripheral interface 150, the processor 140 and the storage controller 130 may be implemented in a single chip. In other embodiments, they may each be implemented by a separate chip.
The display unit 160 is used to provide an interactive interface or to display image data.
The input/output unit 170 is used to enable interaction between the user and the user terminal 100. In the present embodiment, the input/output unit 170 may be, but is not limited to, a mouse, a keyboard, and the like.
Fig. 2 shows a functional block diagram of the address resolution device 110 provided by an embodiment of the present invention. The address resolution device 110 is applied to the user terminal 100 and is used to select, from a plurality of standard library addresses, the standard library address that matches an address to be matched. It includes a segmentation module 111, a merging module 112, a semantic vector generation module 113, a word order vector generation module 114, a semantic similarity calculation module 115, a word order similarity calculation module 116 and a selection module 117. The plurality of standard library addresses are stored in advance in the memory 120.
The segmentation module 111 is configured to segment the address to be matched to generate a first segmentation result, which includes first segments, and to segment each standard library address to generate a second segmentation result, which includes second segments. In the present embodiment, given an address to be matched, segmenting it yields the first segmentation result Ti = {w1, w2, ..., wn}, where w1, w2, ..., wn are the first segments and n is the number of first segments in Ti, that is, the vector length Len(Ti) of the first segmentation result. Segmenting each of the plurality of standard library addresses yields a second segmentation result Tj = {k1, k2, ..., km}, where k1, k2, ..., km are the second segments and m is the number of second segments in Tj, that is, the vector length Len(Tj) of the second segmentation result. For example, suppose the address to be matched is "Chongqing City Nan'an District Nan Ping South Road ten thousand buildings cell" and one address in the standard library is "Chongqing City Nan'an District Nanping South Road Wan Le cell". Segmenting the address to be matched yields Ti = {Chongqing City, Nan'an District, Nan Ping South Road, ten thousand buildings, cell}, and segmenting the standard library address yields Tj = {Chongqing City, Nan'an District, Nanping South Road, Wan Le, cell}.
The merging module 112 is configured to merge the first segmentation result and the second segmentation result into a third segmentation result, which includes third segments. In the present embodiment, the merging module 112 merges all first segments in the first segmentation result Ti with all second segments in the second segmentation result Tj; where a first segment and a second segment are identical, only one copy is kept. This yields the third segmentation result T = Ti ∪ Tj = {p1, p2, ..., px}, where p1, p2, ..., px are the third segments and x is the number of third segments in T, that is, the vector length Len(T) of the third segmentation result. It follows that Len(T) ≤ Len(Tj) + Len(Ti). For example, T = {Chongqing City, Nan'an District, Nanping South Road, Nan Ping South Road, ten thousand buildings, Wan Le, cell}.
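The merge step can be sketched as an order-preserving union. This is a minimal illustration in Python rather than the patent's implementation: the function name `merge_segments` is hypothetical, and the ordering of the merged list (segments of Ti first, then unseen segments of Tj) is an assumption, since the text does not say how T is ordered.

```python
def merge_segments(ti, tj):
    """Merge two segmentation results, keeping one copy of identical segments.

    Ordering is an assumption: segments of ti first, then unseen ones from tj.
    """
    merged = []
    seen = set()
    for segment in list(ti) + list(tj):
        if segment not in seen:
            seen.add(segment)
            merged.append(segment)
    return merged


ti = ["Chongqing City", "Nan'an District", "Nan Ping South Road",
      "ten thousand buildings", "cell"]
tj = ["Chongqing City", "Nan'an District", "Nanping South Road",
      "Wan Le", "cell"]
t = merge_segments(ti, tj)
```

With the example segments, `t` holds seven entries, so Len(T) = 7 ≤ Len(Ti) + Len(Tj) = 10. Because both vectors built later are indexed by the same merged list, the particular order chosen here only needs to be applied consistently.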
The semantic vector generation module 113 is configured to calculate the semantic score of each third segment in the first segmentation result to generate a first semantic vector, and to calculate the semantic score of each third segment in the second segmentation result to generate a second semantic vector. In the present embodiment, for each third segment (p1, p2, ..., px) in the third segmentation result T, the similarity between that third segment and each first segment (w1, w2, ..., wn) in the first segmentation result Ti is calculated in turn; preferably, the similarity takes a value between 0 and 1. The maximum of these similarities is called the semantic score Ci of the third segment in Ti. Likewise, for each third segment in T, the similarity between that third segment and each second segment (k1, k2, ..., km) in the second segmentation result Tj is calculated in turn, and the maximum of these similarities is called the semantic score Cj of the third segment in Tj. In the present embodiment, the vector formed by the semantic scores Ci of the third segments of T in Ti, one entry per third segment, is called the first semantic vector and may be written Si = {C1, C2, ..., Ci}; the vector formed by the semantic scores Cj of the third segments of T in Tj is called the second semantic vector and may be written Sj = {C1, C2, ..., Cj}.
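The maximum-similarity rule can be sketched as follows. The text does not name the similarity measure between two segments, so this sketch substitutes the ratio from Python's `difflib.SequenceMatcher` (a character-level measure between 0 and 1) purely as a stand-in; the names `segment_similarity`, `semantic_score` and `semantic_vector` are hypothetical.

```python
from difflib import SequenceMatcher


def segment_similarity(a, b):
    # Stand-in measure (assumption): character-level ratio between 0 and 1.
    return SequenceMatcher(None, a, b).ratio()


def semantic_score(segment, result):
    # Semantic score: the maximum similarity between the segment and any
    # segment of the given (non-empty) segmentation result.
    return max(segment_similarity(segment, other) for other in result)


def semantic_vector(merged, result):
    # One semantic score per segment of the merged (third) result.
    return [semantic_score(segment, result) for segment in merged]


ti = ["Chongqing City", "Nan'an District", "Nan Ping South Road",
      "ten thousand buildings", "cell"]
t = ["Chongqing City", "Nan'an District", "Nanping South Road",
     "Nan Ping South Road", "ten thousand buildings", "Wan Le", "cell"]
si = semantic_vector(t, ti)
```

A segment that appears verbatim in the result scores 1.0 under this stand-in, which agrees with the default value used in the embodiment.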
In the present embodiment, for each third segment in the third segmentation result T: when the third segment appears in the first segmentation result Ti, its semantic score Ci in Ti is a default value; when the third segment appears in the second segmentation result Tj, its semantic score Cj in Tj is the default value. In the present embodiment, the default value may be set to 1, that is, Ci = 1 and Cj = 1. When the third segment does not appear in Ti, its semantic score Ci in Ti is a first preset value; when the third segment does not appear in Tj, its semantic score Cj in Tj is the first preset value. In the present embodiment, the first preset value may be set to 0.2, but is not limited thereto. For example, when Ti = {Chongqing City, Nan'an District, Nan Ping South Road, ten thousand buildings, cell}, Tj = {Chongqing City, Nan'an District, Nanping South Road, Wan Le, cell} and T = {Chongqing City, Nan'an District, Nanping South Road, Nan Ping South Road, ten thousand buildings, Wan Le, cell}, the generated first semantic vector is Si = {1, 1, 0.2, 1, 1, 0.2, 1} and the generated second semantic vector is Sj = {1, 1, 1, 0.2, 0.2, 1, 1}. It can be understood that, in the present embodiment, "Nanping South Road" and "Wan Le" do not appear in the first segmentation result Ti, so their semantic score in Ti is 0.2; likewise, "Nan Ping South Road" and "ten thousand buildings" do not appear in the second segmentation result Tj, so their semantic score in Tj is 0.2.
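The default-value and first-preset-value rule reproduces the example vectors directly. The sketch below is a minimal Python illustration; the function name `semantic_vector_with_presets` and its parameter names are hypothetical, while the values 1 and 0.2 are the ones given in the embodiment.

```python
def semantic_vector_with_presets(merged, result, default=1.0, first_preset=0.2):
    # A segment present in the result gets the default value; an absent one
    # gets the first preset value.
    present = set(result)
    return [default if segment in present else first_preset
            for segment in merged]


ti = ["Chongqing City", "Nan'an District", "Nan Ping South Road",
      "ten thousand buildings", "cell"]
tj = ["Chongqing City", "Nan'an District", "Nanping South Road",
      "Wan Le", "cell"]
t = ["Chongqing City", "Nan'an District", "Nanping South Road",
     "Nan Ping South Road", "ten thousand buildings", "Wan Le", "cell"]

si = semantic_vector_with_presets(t, ti)  # [1.0, 1.0, 0.2, 1.0, 1.0, 0.2, 1.0]
sj = semantic_vector_with_presets(t, tj)  # [1.0, 1.0, 1.0, 0.2, 0.2, 1.0, 1.0]
```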
The word order vector generation module 114 is configured to generate a first word order vector according to the third segments and the first segmentation result, and a second word order vector according to the third segments and the second segmentation result. In the present embodiment, for each third segment (p1, p2, ..., px) in the third segmentation result T: when the third segment appears in the first segmentation result Ti, its word order qi, that is, its position in Ti, is determined; when the third segment appears in the second segmentation result Tj, its word order qj in Tj is determined. When the third segment does not appear in Ti, the first segment similar to it is found and the similarity between the two is calculated; when that similarity is greater than a second preset value, the word order qi of that first segment in Ti is used. Likewise, when the third segment does not appear in Tj, the second segment similar to it is found and the similarity between the two is calculated; when that similarity is greater than the second preset value, the word order qj of that second segment in Tj is used. In the present embodiment, the second preset value may be set to 0.4, but is not limited thereto. When the third segment does not appear in Ti and its similarity to the first segment is less than the second preset value, its word order in Ti is set to empty and assigned null; when the third segment does not appear in Tj and its similarity to the second segment is less than the second preset value, its word order in Tj is set to empty and assigned null. In the present embodiment, the vector formed by the word orders qi is called the first word order vector and may be written ri = {q1, q2, ..., qi}; the vector formed by the word orders qj is called the second word order vector and may be written rj = {q1, q2, ..., qj}. For example, when Ti = {Chongqing City, Nan'an District, Nan Ping South Road, ten thousand buildings, cell}, Tj = {Chongqing City, Nan'an District, Nanping South Road, Wan Le, cell} and T = {Chongqing City, Nan'an District, Nanping South Road, Nan Ping South Road, ten thousand buildings, Wan Le, cell}, the generated first word order vector is ri = {1, 2, 3, 3, 4, null, 5} and the generated second word order vector is rj = {1, 2, 3, 3, null, 4, 5}. It can be understood that, in the present embodiment, the similarity between "Nanping South Road" and "Nan Ping South Road" is considered greater than the second preset value, so the word order of "Nan Ping South Road" in the first segmentation result Ti, namely 3, is taken; meanwhile, the similarity between "Wan Le" and every segment in Ti, including "ten thousand buildings", is considered less than the second preset value, so null is assigned.
The semantic similarity calculation module 115 is configured to calculate the semantic similarity according to the first semantic vector and the second semantic vector. In the present embodiment, the semantic similarity is calculated by a formula over Si and Sj [formula not reproduced in this text], where Si is the first semantic vector and Sj is the second semantic vector. For example, a semantic similarity value can be calculated from the first semantic vector Si = {1, 1, 0.2, 1, 1, 0.2, 1} and the second semantic vector Sj = {1, 1, 1, 0.2, 0.2, 1, 1}.
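The formula for the semantic similarity is not legible in this text. As an illustration only, the sketch below assumes cosine similarity between the two semantic vectors, a common choice for comparing such vectors; it should not be read as the patent's stated formula. The function name `semantic_similarity` is hypothetical.

```python
import math


def semantic_similarity(si, sj):
    # Assumed measure: cosine similarity of the two semantic vectors.
    dot = sum(a * b for a, b in zip(si, sj))
    norm_i = math.sqrt(sum(a * a for a in si))
    norm_j = math.sqrt(sum(b * b for b in sj))
    return dot / (norm_i * norm_j)


si = [1, 1, 0.2, 1, 1, 0.2, 1]
sj = [1, 1, 1, 0.2, 0.2, 1, 1]
score = semantic_similarity(si, sj)
```

Under this assumption the example vectors give 3.8 / 5.08, roughly 0.748.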
The word order similarity calculation module 116 is configured to calculate the word order similarity according to the first word order vector and the second word order vector. In the present embodiment, the word order similarity is calculated by a formula over ri and rj [formula not reproduced in this text], where ri is the first word order vector and rj is the second word order vector. For example, a word order similarity value can be calculated from the first word order vector ri = {1, 2, 3, 3, 4, null, 5} and the second word order vector rj = {1, 2, 3, 3, null, 4, 5}.
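The word order similarity formula is also not legible in this text. The sketch below assumes a measure often used for word order comparison, 1 - ||ri - rj|| / ||ri + rj||, and additionally assumes that a null entry is treated as 0; both are assumptions for illustration, not statements of the patent's formula. The function name `word_order_similarity` is hypothetical.

```python
import math


def word_order_similarity(ri, rj):
    # Assumed measure: 1 - ||ri - rj|| / ||ri + rj||, with null (None) as 0.
    a = [0 if q is None else q for q in ri]
    b = [0 if q is None else q for q in rj]
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    total = math.sqrt(sum((x + y) ** 2 for x, y in zip(a, b)))
    if total == 0:
        return 0.0  # no positional information in either vector
    return 1.0 - diff / total


ri = [1, 2, 3, 3, 4, None, 5]
rj = [1, 2, 3, 3, None, 4, 5]
score = word_order_similarity(ri, rj)
```

Under these assumptions the example vectors give 1 - sqrt(32) / sqrt(224), roughly 0.622.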
The selection module 117 is configured to select, according to the semantic similarity and the word order similarity, the standard library address that matches the address to be matched from the plurality of standard library addresses. In the present embodiment, after the semantic similarity and the word order similarity between the address to be matched and each of the plurality of standard library addresses have been calculated, a plurality of semantic similarity values and word order similarity values are obtained. For example, a third preset value is set, and the standard library addresses whose semantic similarity and word order similarity are greater than the third preset value are selected; these matched standard library addresses are inserted into a result table, which can provide information support for departments such as the customer service center, the dispatching control center, the distribution network emergency repair center and maintenance, so that human resources can be allocated reasonably.
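The selection step can be sketched as a simple threshold filter. In this Python sketch the function name `select_matches`, the threshold 0.6 and the scored candidates are all made up for illustration; the text gives no concrete third preset value, and the sketch assumes that both similarities must individually exceed it.

```python
def select_matches(scored, third_preset=0.6):
    # scored: iterable of (address, semantic_similarity, word_order_similarity).
    # Keep addresses whose two similarities both exceed the third preset value.
    return [address for address, semantic, order in scored
            if semantic > third_preset and order > third_preset]


# Illustrative candidates only; the scores are not taken from the source.
scored = [
    ("standard address A", 0.75, 0.62),
    ("standard address B", 0.40, 0.90),
    ("standard address C", 0.65, 0.55),
]
result_table = select_matches(scored)
```

Only "standard address A" passes both thresholds here, so it is the single row placed in the result table.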
Fig. 3 shows a schematic flowchart of the address resolution method provided by an embodiment of the present invention. It should be noted that the address resolution method of the present invention is not limited to the particular order shown in Fig. 3 and described below. It should be understood that, in other embodiments, the order of some steps of the address resolution method may be exchanged according to actual needs, and some steps may be omitted or deleted. The flow shown in Fig. 3 is explained in detail below.
Step S101: segment the address to be matched to generate a first segmentation result, the first segmentation result including first segments.
In the present embodiment, given an address to be matched, segmentation yields the first segmentation result Ti = {w1, w2, ..., wn}, where w1, w2, ..., wn are the first segments. For example, if the address to be matched is "Chongqing City Nan'an District Nan Ping South Road ten thousand buildings cell", the first segmentation result generated by segmentation is Ti = {Chongqing City, Nan'an District, Nan Ping South Road, ten thousand buildings, cell}.
It can be understood that step S101 may be performed by the segmentation module 111 described above.
Step S102: segment each standard library address to generate a second segmentation result, the second segmentation result including second segments.
In the present embodiment, each of the plurality of standard library addresses is segmented to obtain a second segmentation result Tj = {k1, k2, ..., km}, where k1, k2, ..., km are the second segments. For example, if one address in the standard library is "Chongqing City Nan'an District Nanping South Road Wan Le cell", the second segmentation result generated by segmentation is Tj = {Chongqing City, Nan'an District, Nanping South Road, Wan Le, cell}.
It can be understood that step S102 may be performed by the segmentation module 111 described above.
Step S103: merge the first segmentation result and the second segmentation result into a third segmentation result, the third segmentation result including third segments.
In the present embodiment, all first segments in the first segmentation result Ti are merged with all second segments in the second segmentation result Tj; where a first segment and a second segment are identical, only one copy is kept. This yields the third segmentation result T = {p1, p2, ..., px}, where p1, p2, ..., px are the third segments. For example, T = {Chongqing City, Nan'an District, Nanping South Road, Nan Ping South Road, ten thousand buildings, Wan Le, cell}.
It can be understood that step S103 may be performed by the merging module 112 described above.
Step S104: calculate the semantic score of each third word segment in the first segmentation result to generate a first semantic vector, and calculate the semantic score of each third word segment in the second segmentation result to generate a second semantic vector.
In this embodiment, for each third word segment (p1, p2, ..., px) in the third segmentation result T, the similarity between that third word segment and each first word segment (w1, w2, ..., wn) in the first segmentation result Ti is calculated in turn, and the maximum of these similarities is the semantic score Ci of that third word segment in the first segmentation result Ti. The semantic scores Ci of all third word segments in T form the first semantic vector Si = {C1, C2, ..., Ci}. In the same way, the semantic score Cj of each third word segment in the second segmentation result Tj is obtained, and the semantic scores Cj form the second semantic vector Sj = {C1, C2, ..., Cj}.
It can be understood that step S104 may be performed by the semantic vector generation module 113 described above.
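The "maximum similarity" rule of step S104 can be sketched as follows. The text does not name the word-level similarity measure, so `word_similarity` below is a stand-in chosen for illustration (a character-level ratio from Python's `difflib`); the function names are likewise assumptions.

```python
from difflib import SequenceMatcher


def word_similarity(a, b):
    # Stand-in measure (assumption): character-level ratio in [0, 1].
    return SequenceMatcher(None, a, b).ratio()


def semantic_vector(third_segments, segmentation):
    """Score each third word segment by its best similarity to any segment
    of the given segmentation result."""
    return [
        max((word_similarity(p, w) for w in segmentation), default=0.0)
        for p in third_segments
    ]


print(semantic_vector(["ab", "xy"], ["ab", "cd"]))
```

A segment that occurs verbatim in the segmentation result scores 1.0 under this measure, and a segment sharing no characters with any entry scores 0.0.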
As shown in FIG. 4, in this embodiment step S104 includes the following sub-steps:
Sub-step S1041: when the third word segment appears in the first segmentation result or the second segmentation result, the semantic score of the third word segment in that segmentation result is a default value. In this embodiment, the default value may be set to 1, i.e. the semantic score of the third word segment in the first segmentation result Ti is Ci = 1, and its semantic score in the second segmentation result Tj is Cj = 1.
Sub-step S1042: when the third word segment does not appear in the first segmentation result or the second segmentation result, the semantic score of the third word segment in that segmentation result is a first preset value. In this embodiment, the first preset value may be set to 0.2, but is not limited thereto; in that case the semantic score of the third word segment in the first segmentation result Ti is Ci = 0.2, and its semantic score in the second segmentation result Tj is Cj = 0.2.
In this embodiment, sub-steps S1041 and S1042 generate the first semantic vector Si = {1, 1, 0.2, 1, 1, 0.2, 1} and the second semantic vector Sj = {1, 1, 1, 0.2, 0.2, 1, 1}.
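Sub-steps S1041 and S1042 amount to a membership test with two constants. The sketch below uses placeholder segment names p1 to p7; which of them belong to Ti and Tj is inferred from the example vectors Si and Sj, and the function and constant names are assumptions made for illustration.

```python
DEFAULT_SCORE = 1.0   # default value from sub-step S1041
FIRST_PRESET = 0.2    # first preset value from sub-step S1042


def semantic_vector_simple(third_segments, segmentation):
    """Assign the default value to segments present in the segmentation
    result and the first preset value to absent ones."""
    present = set(segmentation)
    return [DEFAULT_SCORE if p in present else FIRST_PRESET
            for p in third_segments]


# Placeholder segments; membership is inferred from the example vectors.
t = ["p1", "p2", "p3", "p4", "p5", "p6", "p7"]
ti = ["p1", "p2", "p4", "p5", "p7"]
tj = ["p1", "p2", "p3", "p6", "p7"]

si = semantic_vector_simple(t, ti)
sj = semantic_vector_simple(t, tj)
print(si)  # [1.0, 1.0, 0.2, 1.0, 1.0, 0.2, 1.0]
print(sj)  # [1.0, 1.0, 1.0, 0.2, 0.2, 1.0, 1.0]
```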
Step S105: generate a first word order vector according to the third word segments and the first segmentation result, and generate a second word order vector according to the third word segments and the second segmentation result.
It can be understood that step S105 may be performed by the word order vector generation module 114 described above.
As shown in FIG. 5, in this embodiment step S105 includes the following sub-steps:
Sub-step S1051: for each third word segment (p1, p2, ..., px), when the third word segment appears in the first segmentation result or the second segmentation result, determine the word order (position) at which the third word segment appears in the first segmentation result or in the second segmentation result.
Sub-step S1052: when the third word segment does not appear in the first segmentation result or the second segmentation result, find the first word segment or second word segment that is similar to the third word segment; when the similarity between the third word segment and that first or second word segment is greater than a second preset value, determine the word order at which that first word segment appears in the first segmentation result, or at which that second word segment appears in the second segmentation result. In this embodiment, the second preset value may be set to 0.4, but is not limited thereto.
Sub-step S1053: when the third word segment does not appear in the first segmentation result or the second segmentation result and its similarity to the first word segment or second word segment is less than the second preset value, the word order of the third word segment is set to empty. In this embodiment, it is assigned null.
In this embodiment, sub-steps S1051, S1052 and S1053 yield the first word order vector ri = {q1, q2, ..., qi} composed of the word orders qi, and the second word order vector rj = {q1, q2, ..., qj} composed of the word orders qj. For example, the first word order vector ri = {1, 2, 3, 3, 4, null, 5} and the second word order vector rj = {1, 2, 3, 3, null, 4, 5}.
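Sub-steps S1051 to S1053 can be sketched as a lookup with a similarity fallback. As a hedged illustration only: the word-level similarity measure is not specified in the text, so a `difflib` character ratio stands in for it, the English sample segments are hypothetical, and a similarity exactly equal to the second preset value (a case the text leaves open) is treated here as no match.

```python
from difflib import SequenceMatcher

SECOND_PRESET = 0.4  # second preset value from sub-step S1052


def word_similarity(a, b):
    # Stand-in measure (assumption): character-level ratio in [0, 1].
    return SequenceMatcher(None, a, b).ratio()


def word_order_vector(third_segments, segmentation, threshold=SECOND_PRESET):
    """Return 1-based positions of the third word segments, None for null."""
    positions = {}
    for index, segment in enumerate(segmentation, start=1):
        positions.setdefault(segment, index)
    vector = []
    for p in third_segments:
        if p in positions:                    # S1051: exact occurrence
            vector.append(positions[p])
            continue
        best_segment, best_sim = None, 0.0    # S1052: most similar segment
        for segment in segmentation:
            sim = word_similarity(p, segment)
            if sim > best_sim:
                best_segment, best_sim = segment, sim
        if best_segment is not None and best_sim > threshold:
            vector.append(positions[best_segment])
        else:                                 # S1053: no similar segment
            vector.append(None)
    return vector


# Hypothetical segments chosen so the pattern mirrors the example vectors.
t = ["city", "district", "south-road", "southroad", "tower", "plaza", "estate"]
ti = ["city", "district", "southroad", "tower", "estate"]
tj = ["city", "district", "south-road", "plaza", "estate"]
ri = word_order_vector(t, ti)
rj = word_order_vector(t, tj)
print(ri)
print(rj)
```

Here "south-road" and "southroad" are similar enough to share position 3 in both vectors, while "tower" and "plaza" have no sufficiently similar counterpart in the other address and are left null.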
Step S106: calculate the semantic similarity according to the first semantic vector and the second semantic vector.
In this embodiment, the semantic similarity is calculated by a formula [formula image not reproduced], where Si is the first semantic vector and Sj is the second semantic vector. For example, the semantic similarity value can be calculated from the first semantic vector Si = {1, 1, 0.2, 1, 1, 0.2, 1} and the second semantic vector Sj = {1, 1, 1, 0.2, 0.2, 1, 1}.
It can be understood that step S106 may be performed by the semantic similarity calculation module 115 described above.
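Because the formula itself is not available in this text, the following is only a sketch under an explicit assumption: it uses cosine similarity, a common way to compare two score vectors, and should not be read as the formula of this embodiment. The function name is hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # no information in at least one vector
    return dot / (norm_u * norm_v)


si = [1, 1, 0.2, 1, 1, 0.2, 1]
sj = [1, 1, 1, 0.2, 0.2, 1, 1]
semantic_sim = cosine_similarity(si, sj)
print(round(semantic_sim, 3))  # 0.748 under the cosine assumption
```

For the example vectors the dot product is 3.8 and each squared norm is 5.08, so the assumed measure gives 3.8 / 5.08.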
Step S107: calculate the word order similarity according to the first word order vector and the second word order vector.
In this embodiment, the word order similarity is calculated by a formula [formula image not reproduced], where ri is the first word order vector and rj is the second word order vector. For example, the word order similarity value can be calculated from the first word order vector ri = {1, 2, 3, 3, 4, null, 5} and the second word order vector rj = {1, 2, 3, 3, null, 4, 5}.
It can be understood that step S107 may be performed by the word order similarity calculation module 116 described above.
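The word order formula is likewise not available in this text. The sketch below assumes a form often used for word order similarity in the sentence-similarity literature, 1 - ||ri - rj|| / ||ri + rj||, and additionally assumes that a null word order counts as 0. Both choices are assumptions for illustration, not statements about this embodiment.

```python
import math


def word_order_similarity(r_i, r_j):
    """Assumed measure: 1 - ||ri - rj|| / ||ri + rj||, with null treated as 0."""
    u = [0 if q is None else q for q in r_i]
    v = [0 if q is None else q for q in r_j]
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(u, v)))
    if total == 0:
        return 0.0  # no usable positions in either vector
    return 1.0 - diff / total


ri = [1, 2, 3, 3, 4, None, 5]
rj = [1, 2, 3, 3, None, 4, 5]
order_sim = word_order_similarity(ri, rj)
print(round(order_sim, 3))  # 0.622 under the stated assumptions
```

Only the two null positions differ between the example vectors, so the assumed measure evaluates to 1 - sqrt(32 / 224).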
Step S108: select, from the plurality of standard library addresses, the standard library address that matches the address to be matched according to the semantic similarity and the word order similarity.
In this embodiment, after the semantic similarity and the word order similarity have been calculated between the address to be matched and the plurality of standard library addresses, a plurality of semantic similarity values and word order similarity values are obtained. A third preset value may be set, and the standard library addresses whose semantic similarity and word order similarity are both greater than the third preset value are selected and inserted into a result table. This can provide information support for departments such as the customer service center, the dispatching control center, the distribution repair center and maintenance, and helps allocate human resources reasonably.
It can be understood that step S108 may be performed by the selection module 117 described above.
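The selection in step S108 can be sketched as a filter over precomputed similarity pairs. The text gives no value for the third preset value, so the 0.6 used below, the candidate tuples and the function name are all hypothetical.

```python
def select_matches(candidates, third_preset):
    """Keep addresses whose semantic and word order similarity both exceed
    the third preset value.

    candidates: iterable of (address, semantic_similarity, word_order_similarity).
    """
    return [
        address
        for address, semantic_sim, order_sim in candidates
        if semantic_sim > third_preset and order_sim > third_preset
    ]


# Hypothetical similarity values for three standard library addresses.
candidates = [
    ("standard address A", 0.75, 0.62),
    ("standard address B", 0.90, 0.40),
    ("standard address C", 0.30, 0.95),
]
result_table = select_matches(candidates, third_preset=0.6)
print(result_table)  # ['standard address A']
```

Using a single threshold for both similarities follows the wording of the embodiment; a real deployment might tune this value on labelled address pairs.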
It should be noted that there may be one or more standard library addresses in this embodiment. When selecting the standard library address that matches the address to be matched, the semantic similarity and the word order similarity are calculated between each standard library address and the address to be matched, and the standard library address matching the address to be matched is then selected.
In summary, in the address resolution method and device provided by the embodiments of the present invention, the address to be matched is segmented to generate a first segmentation result, and each standard library address is segmented to generate a second segmentation result. The first segmentation result and the second segmentation result are merged into a third segmentation result, which includes third word segments. The semantic score of each third word segment in the first segmentation result is calculated to generate a first semantic vector, and the semantic score of each third word segment in the second segmentation result is calculated to generate a second semantic vector. A first word order vector is generated according to the third word segments and the first segmentation result, and a second word order vector is generated according to the third word segments and the second segmentation result. The semantic similarity is calculated from the first semantic vector and the second semantic vector, the word order similarity is calculated from the first word order vector and the second word order vector, and the standard library address matching the address to be matched is selected from the plurality of standard library addresses according to the semantic similarity and the word order similarity. The address resolution method and device provided by the embodiments of the present invention are simple and fast, and can effectively improve working efficiency and reduce workload.
It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention. It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
Claims (10)
1. An address resolution method for selecting, from a plurality of standard library addresses, a standard library address that matches an address to be matched, characterized in that the address resolution method comprises:
segmenting the address to be matched to generate a first segmentation result, the first segmentation result including first word segments;
segmenting each of the standard library addresses to generate a second segmentation result, the second segmentation result including second word segments;
merging the first segmentation result and the second segmentation result into a third segmentation result, the third segmentation result including third word segments;
calculating the semantic score of the third word segment in the first segmentation result to generate a first semantic vector, and calculating the semantic score of the third word segment in the second segmentation result to generate a second semantic vector;
generating a first word order vector according to the third word segment and the first segmentation result, and generating a second word order vector according to the third word segment and the second segmentation result;
calculating a semantic similarity according to the first semantic vector and the second semantic vector;
calculating a word order similarity according to the first word order vector and the second word order vector; and
selecting, from the plurality of standard library addresses, the standard library address that matches the address to be matched according to the semantic similarity and the word order similarity.
2. The address resolution method as claimed in claim 1, characterized in that, when the third word segment appears in the first segmentation result or the second segmentation result, the semantic score of the third word segment in the first segmentation result or the second segmentation result is a default value; and when the third word segment does not appear in the first segmentation result or the second segmentation result, the semantic score of the third word segment in the first segmentation result or the second segmentation result is a first preset value.
3. The address resolution method as claimed in claim 1, characterized in that, when the third word segment appears in the first segmentation result or the second segmentation result, the word order at which the third word segment appears in the first segmentation result is calculated to generate the first word order vector, or the word order at which the third word segment appears in the second segmentation result is calculated to generate the second word order vector;
when the third word segment does not appear in the first segmentation result or the second segmentation result, the first word segment or the second word segment similar to the third word segment is found, and when the similarity between the third word segment and that first word segment or second word segment is greater than a second preset value, the word order at which that first word segment appears in the first segmentation result is calculated to generate the first word order vector, or the word order at which that second word segment appears in the second segmentation result is calculated to generate the second word order vector;
when the third word segment does not appear in the first segmentation result or the second segmentation result and the similarity between the third word segment and the first word segment or the second word segment is less than the second preset value, the word order of the third word segment is set to empty.
4. The address resolution method as claimed in claim 1, characterized in that the semantic similarity is calculated by a formula [formula image not reproduced], where Si is the first semantic vector and Sj is the second semantic vector.
5. The address resolution method as claimed in claim 1, characterized in that the word order similarity is calculated by a formula [formula image not reproduced], where ri is the first word order vector and rj is the second word order vector.
6. An address resolution device for selecting, from a plurality of standard library addresses, a standard library address that matches an address to be matched, characterized in that the address resolution device comprises:
a word segmentation module, configured to segment the address to be matched to generate a first segmentation result, the first segmentation result including first word segments, and to segment each of the standard library addresses to generate a second segmentation result, the second segmentation result including second word segments;
a merging module, configured to merge the first segmentation result and the second segmentation result into a third segmentation result, the third segmentation result including third word segments;
a semantic vector generation module, configured to calculate the semantic score of the third word segment in the first segmentation result to generate a first semantic vector, and to calculate the semantic score of the third word segment in the second segmentation result to generate a second semantic vector;
a word order vector generation module, configured to generate a first word order vector according to the third word segment and the first segmentation result, and to generate a second word order vector according to the third word segment and the second segmentation result;
a semantic similarity calculation module, configured to calculate a semantic similarity according to the first semantic vector and the second semantic vector;
a word order similarity calculation module, configured to calculate a word order similarity according to the first word order vector and the second word order vector; and
a selection module, configured to select, from the plurality of standard library addresses, the standard library address that matches the address to be matched according to the semantic similarity and the word order similarity.
7. The address resolution device as claimed in claim 6, characterized in that, when the third word segment appears in the first segmentation result or the second segmentation result, the semantic score of the third word segment in the first segmentation result or the second segmentation result is a default value; and when the third word segment does not appear in the first segmentation result or the second segmentation result, the semantic score of the third word segment in the first segmentation result or the second segmentation result is a first preset value.
8. The address resolution device as claimed in claim 6, characterized in that, when the third word segment appears in the first segmentation result or the second segmentation result, the word order at which the third word segment appears in the first segmentation result is calculated to generate the first word order vector, or the word order at which the third word segment appears in the second segmentation result is calculated to generate the second word order vector;
when the third word segment does not appear in the first segmentation result or the second segmentation result, the first word segment or the second word segment similar to the third word segment is found, and when the similarity between the third word segment and that first word segment or second word segment is greater than a second preset value, the word order at which that first word segment appears in the first segmentation result is calculated to generate the first word order vector, or the word order at which that second word segment appears in the second segmentation result is calculated to generate the second word order vector;
when the third word segment does not appear in the first segmentation result or the second segmentation result and the similarity between the third word segment and the first word segment or the second word segment is less than the second preset value, the word order of the third word segment is set to empty.
9. The address resolution device as claimed in claim 6, characterized in that the semantic similarity is calculated by a formula [formula image not reproduced], where Si is the first semantic vector and Sj is the second semantic vector.
10. The address resolution device as claimed in claim 6, characterized in that the word order similarity is calculated by a formula [formula image not reproduced], where ri is the first word order vector and rj is the second word order vector.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611239277.5A CN106598953A (en) | 2016-12-28 | 2016-12-28 | Address resolution method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611239277.5A CN106598953A (en) | 2016-12-28 | 2016-12-28 | Address resolution method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106598953A true CN106598953A (en) | 2017-04-26 |
Family
ID=58604811
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611239277.5A Pending CN106598953A (en) | 2016-12-28 | 2016-12-28 | Address resolution method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106598953A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107577744A (en) * | 2017-08-28 | 2018-01-12 | 苏州科技大学 | Nonstandard Address automatic matching model, matching process and method for establishing model |
CN109145073A (en) * | 2018-08-28 | 2019-01-04 | 成都市映潮科技股份有限公司 | A kind of address resolution method and device based on segmentation methods |
CN109145095A (en) * | 2017-06-16 | 2019-01-04 | 贵州小爱机器人科技有限公司 | Information of place names matching process, information matching method, device and computer equipment |
CN109254964A (en) * | 2018-08-20 | 2019-01-22 | 中国平安人寿保险股份有限公司 | Address Standardization method, apparatus, computer equipment and storage medium |
CN109753555A (en) * | 2018-11-30 | 2019-05-14 | 平安科技(深圳)有限公司 | Word match method, apparatus, equipment and computer readable storage medium |
CN110019575A (en) * | 2017-08-04 | 2019-07-16 | 北京京东尚科信息技术有限公司 | The method and apparatus that geographical address is standardized |
CN110532546A (en) * | 2019-07-29 | 2019-12-03 | 河北远东通信系统工程有限公司 | A kind of automatic delivery method of alert merging geographical location and text similarity |
CN111400433A (en) * | 2019-01-02 | 2020-07-10 | 阿里巴巴集团控股有限公司 | Address text processing method and device |
CN111625732A (en) * | 2020-05-25 | 2020-09-04 | 鼎富智能科技有限公司 | Address matching method and device |
CN112818685A (en) * | 2021-01-29 | 2021-05-18 | 上海寻梦信息技术有限公司 | Address matching method and device, electronic equipment and storage medium |
CN112884390A (en) * | 2019-11-29 | 2021-06-01 | 北京三快在线科技有限公司 | Order processing method and device, readable storage medium and electronic equipment |
CN113987114A (en) * | 2021-09-17 | 2022-01-28 | 上海燃气有限公司 | Address matching method and device based on semantic analysis and electronic equipment |
-
2016
- 2016-12-28 CN CN201611239277.5A patent/CN106598953A/en active Pending
Non-Patent Citations (2)
Title |
---|
孙亚夫 等: "基于分词的地址匹配技术", 《中国地理信息系统协会年会.2007》 * |
殷耀明 等: "基于关系向量模型的句子相似度计算", 《计算机工程与应用》 * |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109145095A (en) * | 2017-06-16 | 2019-01-04 | 贵州小爱机器人科技有限公司 | Information of place names matching process, information matching method, device and computer equipment |
CN109145095B (en) * | 2017-06-16 | 2024-03-29 | 贵州小爱机器人科技有限公司 | Place name information matching method, information matching device and computer equipment |
CN110019575A (en) * | 2017-08-04 | 2019-07-16 | 北京京东尚科信息技术有限公司 | The method and apparatus that geographical address is standardized |
CN107577744A (en) * | 2017-08-28 | 2018-01-12 | 苏州科技大学 | Nonstandard Address automatic matching model, matching process and method for establishing model |
CN109254964A (en) * | 2018-08-20 | 2019-01-22 | 中国平安人寿保险股份有限公司 | Address Standardization method, apparatus, computer equipment and storage medium |
CN109145073A (en) * | 2018-08-28 | 2019-01-04 | 成都市映潮科技股份有限公司 | A kind of address resolution method and device based on segmentation methods |
CN109753555A (en) * | 2018-11-30 | 2019-05-14 | 平安科技(深圳)有限公司 | Word match method, apparatus, equipment and computer readable storage medium |
CN109753555B (en) * | 2018-11-30 | 2023-07-07 | 平安科技(深圳)有限公司 | Word matching method, device, equipment and computer readable storage medium |
CN111400433B (en) * | 2019-01-02 | 2023-04-11 | 阿里巴巴集团控股有限公司 | Address text processing method and device |
CN111400433A (en) * | 2019-01-02 | 2020-07-10 | 阿里巴巴集团控股有限公司 | Address text processing method and device |
CN110532546A (en) * | 2019-07-29 | 2019-12-03 | 河北远东通信系统工程有限公司 | A kind of automatic delivery method of alert merging geographical location and text similarity |
CN112884390A (en) * | 2019-11-29 | 2021-06-01 | 北京三快在线科技有限公司 | Order processing method and device, readable storage medium and electronic equipment |
CN111625732A (en) * | 2020-05-25 | 2020-09-04 | 鼎富智能科技有限公司 | Address matching method and device |
CN111625732B (en) * | 2020-05-25 | 2023-06-23 | 鼎富智能科技有限公司 | Address matching method and device |
CN112818685A (en) * | 2021-01-29 | 2021-05-18 | 上海寻梦信息技术有限公司 | Address matching method and device, electronic equipment and storage medium |
CN113987114A (en) * | 2021-09-17 | 2022-01-28 | 上海燃气有限公司 | Address matching method and device based on semantic analysis and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170426 |