CN110516228A - Named entity recognition method and apparatus, computer apparatus, and computer-readable storage medium - Google Patents
Named entity recognition method and apparatus, computer apparatus, and computer-readable storage medium
- Publication number
- CN110516228A (application number CN201910597499.1A)
- Authority
- CN
- China
- Prior art keywords
- character
- output
- result
- network model
- entity recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
Abstract
The present invention is applicable to the field of Internet technology and provides a named entity recognition method and apparatus, a computer apparatus, and a computer-readable storage medium. The method comprises: obtaining a forward output result from a first LSTM network model according to a first weighted sum value and a character vector; obtaining a reverse output result from a reversed first LSTM network model according to a second weighted sum value and a character vector; processing the forward output result and the reverse output result separately through a highway network layer to obtain a forward processing result and a reverse processing result, respectively; concatenating the forward processing result and the reverse processing result to obtain a concatenated output result; and processing the concatenated output result through a fully connected layer and a CRF layer to obtain the named entity recognition label of each character. The named entity recognition method provided by the invention can improve the accuracy of named entity recognition and thereby improve the overall recognition effect.
Description
Technical field
The invention belongs to the field of Internet technology, and more particularly relates to a named entity recognition method and apparatus, a computer apparatus, and a computer-readable storage medium.
Background technique
Existing natural language processing techniques generally adopt the bidirectional long short-term memory–conditional random field architecture (Bi-directional Long Short-Term Memory–Conditional Random Field, BiLSTM-CRF), which performs well on languages such as English. Applying this architecture to Chinese, however, requires word segmentation first, and segmentation errors propagate into the named entity recognition results, so its recognition effect on Chinese is unsatisfactory. Yue Zhang et al. proposed a Lattice LSTM structure that can exploit word information effectively, but the Lattice LSTM is not jointly trained with other semi-supervised techniques such as language models, and it is affected by the pre-trained word embedding dictionary built from large-scale automatically segmented text, so the accuracy of named entity recognition remains relatively low.
Summary of the invention
Embodiments of the present invention provide a named entity recognition method and apparatus, a computer apparatus, and a computer-readable storage medium, intended to solve the problem of relatively low accuracy of named entity recognition in the prior art.
The invention is realized as a named entity recognition method comprising the following procedure:
According to the forward character order of a sentence to be processed, sequentially inputting the character vectors of the sentence into a first long short-term memory (LSTM) network model; in the segment from the beginning character of the sentence to the current input character of the first LSTM network model, obtaining the first related words of the current input character; inputting the word vectors of the first related words into a second LSTM network model for processing to obtain first hidden outputs; performing a weighted summation over the first hidden outputs to obtain a first weighted sum value; inputting the first weighted sum value into the first LSTM network model; and obtaining, through the first LSTM network model, a forward output result according to the first weighted sum value and the character vector of the current input character;
According to the reverse character order of the sentence to be processed, sequentially inputting the character vectors of the sentence into a reversed first LSTM network model; in the segment from the final character of the sentence to the current input character of the reversed first LSTM network model, obtaining the second related words of the current input character; inputting the word vectors of the second related words into the second LSTM network model for processing to obtain second hidden outputs; performing a weighted summation over the second hidden outputs to obtain a second weighted sum value; inputting the second weighted sum value into the reversed first LSTM network model; and obtaining, through the reversed first LSTM network model, a reverse output result according to the second weighted sum value and the character vector of the current input character;
Processing the forward output result and the reverse output result separately through a highway network layer to obtain a forward processing result and a reverse processing result, respectively, and concatenating the forward processing result and the reverse processing result to obtain a concatenated output result;
Processing the concatenated output result through a fully connected layer and a conditional random field (CRF) layer to obtain the named entity recognition label of each character.
Further, performing a weighted summation over the first hidden outputs to obtain the first weighted sum value comprises the following procedure:
Inputting the first hidden outputs into a softmax function model, and calculating, through the softmax function model, the probability corresponding to each first hidden output;
Taking the probability corresponding to each first hidden output as the weight of that hidden output, and performing a weighted summation according to the weights and the first hidden outputs to obtain the first weighted sum result.
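The softmax weighting described above can be sketched in pure Python. This is an illustrative reconstruction rather than the patented implementation, and the vector sizes and score values are hypothetical:

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def weighted_sum(hidden_outputs, scores):
    """Weight each hidden vector by its softmax probability and sum.

    hidden_outputs: equal-length vectors (the "first hidden outputs",
    one per related word); scores: one scalar score per vector.
    """
    probs = softmax(scores)
    dim = len(hidden_outputs[0])
    return [sum(p * h[i] for p, h in zip(probs, hidden_outputs))
            for i in range(dim)]
```

With two hidden outputs and equal scores, each receives weight 0.5 and the result is their average, which is the weighted sum value passed back into the first LSTM network model.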
Further, processing the concatenated output result through the fully connected layer and the CRF layer to obtain the named entity recognition label of each character comprises the following procedure:
Inputting the concatenated output result into a fully connected layer of size 2a*c to obtain first output information, where a is the dimension of the output of the first LSTM network model and c is the number of named entity recognition labels;
Inputting the first output information into the CRF layer for processing to obtain the named entity recognition label of each character.
Further, after processing the concatenated output result through the fully connected layer and the conditional random field (CRF) layer to obtain the named entity recognition label of each character, the named entity recognition method comprises the following procedure:
Inputting the forward output result into a highway network layer to obtain second output information;
Inputting the second output information into a fully connected layer of size a*b to obtain the prediction probability of the character following the current character, where a is the dimension of the output of the first LSTM network model and b is the number of all characters in the corpus;
Correcting the named entity recognition label of the current character according to the prediction probability of the character following the current character.
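The a*b fully connected layer followed by a softmax, as used here to obtain a next-character distribution over the corpus vocabulary, can be sketched as follows. The weight matrix, bias values, and dimensions (a = 2 LSTM output units, b = 3 corpus characters) are hypothetical placeholders:

```python
import math

def linear(x, W, b):
    # Fully connected layer: W given as nested lists of shape (out, in).
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def next_char_distribution(h, W, b):
    """Map a forward LSTM output h (dimension a) to a probability
    distribution over the b characters of the corpus: a linear layer
    followed by a softmax, as a language-model head typically does."""
    logits = linear(h, W, b)
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]
```

A character whose predicted probability is high under this distribution supports the current labeling; how exactly the correction uses these probabilities is left unspecified in the text.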
The present invention also provides a named entity recognition apparatus, comprising:
a forward result processing module, configured to sequentially input the character vectors of a sentence to be processed into a first long short-term memory (LSTM) network model according to the forward character order of the sentence; obtain, in the segment from the beginning character of the sentence to the current input character of the first LSTM network model, the first related words of the current input character; input the word vectors of the first related words into a second LSTM network model for processing to obtain first hidden outputs; perform a weighted summation over the first hidden outputs to obtain a first weighted sum value; input the first weighted sum value into the first LSTM network model; and obtain, through the first LSTM network model, a forward output result according to the first weighted sum value and the character vector of the current input character;
a reverse result processing module, configured to sequentially input the character vectors of the sentence into a reversed first LSTM network model according to the reverse character order of the sentence; obtain, in the segment from the final character of the sentence to the current input character of the reversed first LSTM network model, the second related words of the current input character; input the word vectors of the second related words into the second LSTM network model for processing to obtain second hidden outputs; perform a weighted summation over the second hidden outputs to obtain a second weighted sum value; input the second weighted sum value into the reversed first LSTM network model; and obtain, through the reversed first LSTM network model, a reverse output result according to the second weighted sum value and the character vector of the current input character;
a concatenation module, configured to process the forward output result and the reverse output result separately through a highway network layer to obtain a forward processing result and a reverse processing result, respectively, and concatenate the forward processing result and the reverse processing result to obtain a concatenated output result;
a named entity processing module, configured to process the concatenated output result through a fully connected layer and a conditional random field (CRF) layer to obtain the named entity recognition label of each character.
Further, the forward result processing module comprises:
a calculation submodule, configured to input the first hidden outputs into a softmax function model and calculate, through the softmax function model, the probability corresponding to each first hidden output;
a processing submodule, configured to take the probability corresponding to each first hidden output as the weight of that hidden output and perform a weighted summation according to the weights and the first hidden outputs to obtain the first weighted sum result.
Further, the named entity processing module comprises:
a first input submodule, configured to input the concatenated output result into a fully connected layer of size 2a*c to obtain first output information, where a is the dimension of the output of the first LSTM network model and c is the number of named entity recognition labels;
a second input submodule, configured to input the first output information into the CRF layer for processing to obtain the named entity recognition label of each character.
Further, the named entity recognition apparatus further comprises:
a first input module, configured to input the forward output result into a highway network layer to obtain second output information;
a second input module, configured to input the second output information into a fully connected layer of size a*b to obtain the prediction probability of the character following the current character, where a is the dimension of the output of the first LSTM network model and b is the number of all characters in the corpus;
a correction module, configured to correct the named entity recognition label of the current character according to the prediction probability of the character following the current character.
The present invention also provides a computer apparatus comprising a processor, the processor being configured to implement the steps of the above named entity recognition method when executing a computer program stored in a memory.
The present invention also provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the steps of the above named entity recognition method when executed by a processor.
The named entity recognition method provided by the invention combines multiple types of LSTM models to obtain a forward output result and a reverse output result from character vectors and word vectors, and concatenates the forward output result and the reverse output result to obtain a concatenated output result. Since the forward output result contains the contextual information of the preceding part of the sentence and the reverse output result contains the contextual information of the following part, concatenating them yields the complete contextual semantics. Processing the concatenated output result through the fully connected layer and the CRF layer therefore produces more accurate named entity recognition labels for the characters, improving the overall effect of named entity recognition.
Description of the drawings
Fig. 1 is a flow chart of the named entity recognition method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the information flow provided by an embodiment of the present invention;
Fig. 3 is a flow chart of performing a weighted summation over the first hidden outputs to obtain the first weighted sum value, as described in step S101 of Fig. 1;
Fig. 4 is a flow chart of step S104 of Fig. 1;
Fig. 5 is a further flow chart of the named entity recognition method after step S103 of Fig. 1;
Fig. 6 is a structural schematic diagram of a named entity recognition apparatus provided by an embodiment of the present invention;
Fig. 7 is a structural schematic diagram of the forward result processing module provided by an embodiment of the present invention;
Fig. 8 is a structural schematic diagram of the named entity processing module provided by an embodiment of the present invention;
Fig. 9 is a structural schematic diagram of another named entity recognition apparatus provided by an embodiment of the present invention.
Specific embodiment
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present invention, not to limit it.
Fig. 1 shows the flow chart of the named entity recognition method provided by an embodiment of the present invention. The named entity recognition method comprises the following procedure:
Step S101: according to the forward character order of a sentence to be processed, sequentially input the character vectors of the sentence into a first long short-term memory (LSTM) network model; in the segment from the beginning character of the sentence to the current input character of the first LSTM network model, obtain the first related words of the current input character; input the word vectors of the first related words into a second LSTM network model for processing to obtain first hidden outputs; perform a weighted summation over the first hidden outputs to obtain a first weighted sum value; input the first weighted sum value into the first LSTM network model; and obtain, through the first LSTM network model, a forward output result according to the first weighted sum value and the character vector of the current input character.
In the present embodiment, the corpus is divided in advance into sentences, and each sentence is processed separately. Before processing, all characters and words occurring in the corpus are trained by methods such as word2vec to obtain a pre-trained character embedding dictionary and word embedding dictionary. Each entry of a dictionary is a character or word together with its corresponding dense vector. That is, word vectors represent a sentence after word segmentation, with the word as the vectorization unit, while character vectors take each character of the sentence as the vectorization unit. Word vectors and character vectors are illustrated below, for example:
In the present embodiment, the character vectors of the characters in the sentence to be processed can be determined from the pre-obtained character embedding dictionary. In the segment from the beginning character of the sentence to the current input character of the first LSTM network model, the first related words of the current input character are obtained, and their word vectors are determined from the pre-obtained word embedding dictionary.
In the present embodiment, the first long short-term memory (Long Short-Term Memory, LSTM) network model is a long short-term memory network model, a kind of recurrent neural network model over time. The first LSTM network model is the LSTM network model for processing character vectors, and the second LSTM network model is the LSTM network model for processing word vectors. Since word vectors are input into the second LSTM network model, the first hidden outputs contain lexical information.
It should be noted that inputting the character vectors into the first LSTM network model according to the forward character order of the sentence means inputting the character sequence of the sentence from front to back. For example, for the sentence "我喜欢长沙" ("I like Changsha"), the forward character order is '我', '喜', '欢', '长', '沙'.
To explain further, a first related word is a word within the segment from the beginning of the sentence to the current character that ends with the current character. First related words may or may not exist, and when they exist there may be one or more of them. For example, in the segment "长沙市橘子洲广场" ("Orange Isle Square, Changsha"), the words ending with the character "场" are "橘子洲广场" ("Orange Isle Square") and "广场" ("square"), so the first related words of the character "场" are "橘子洲广场" and "广场".
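Finding the related words of a character amounts to matching dictionary words that end at that character, which can be sketched in a few lines. The lexicon below is a hypothetical stand-in for the word embedding dictionary, and the segment is the "长沙市橘子洲广场" example from the text:

```python
def first_related_words(sentence, pos, lexicon):
    """Words in the lexicon that end at the character at index pos and
    start at or after the beginning of the sentence (length >= 2)."""
    segment = sentence[:pos + 1]
    return [segment[start:] for start in range(len(segment))
            if segment[start:] in lexicon and len(segment) - start > 1]

# Hypothetical lexicon; a real one would be the word embedding dictionary.
lexicon = {"长沙", "长沙市", "橘子", "橘子洲", "广场", "橘子洲广场"}
sentence = "长沙市橘子洲广场"
```

For the final character "场" (index 7) this returns "橘子洲广场" and "广场", matching the example; for a character with no matching word it returns an empty list.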
Specifically, the calculation differs according to whether the current input character has first related words. When the current input character has no first related words, the first LSTM network model can be constituted by formulas (1), (2), and (3), whose contents are as follows:
Formula (1):
$$\begin{bmatrix} i_j^c \\ o_j^c \\ f_j^c \\ \widetilde{c}_j^c \end{bmatrix} = \begin{bmatrix} \sigma \\ \sigma \\ \sigma \\ \tanh \end{bmatrix}\left( W^{cT} \begin{bmatrix} x_j^c \\ h_{j-1}^c \end{bmatrix} + b^c \right)$$
where $i_j^c$ denotes the input gate, $o_j^c$ the output gate, $f_j^c$ the forget gate, and $\widetilde{c}_j^c$ the state information updated at the current time step; $\sigma$ denotes the Sigmoid function, which is often used as a threshold function of neural networks and maps a variable into the interval (0, 1); $\tanh$ is likewise an activation function; $W^{cT}$ denotes the weight vectors, which are training parameters; $x_j^c$ denotes the character vector; $h_{j-1}^c$ denotes the output of the previous time step; and $b^c$ denotes the bias vector, a model training parameter.
Formula (2):
$$c_j^c = f_j^c \odot c_{j-1}^c + i_j^c \odot \widetilde{c}_j^c$$
where $c_j^c$ denotes the current new state information, $f_j^c$ the forget gate, $c_{j-1}^c$ the state information of the previous time step, $i_j^c$ the input gate, and $\widetilde{c}_j^c$ the state information updated at the current time step.
Formula (3):
$$h_j^c = o_j^c \odot \tanh(c_j^c)$$
where $h_j^c$ denotes the current hidden layer output, $o_j^c$ the output gate, and $\tanh(c_j^c)$ the current new state information passed through the activation function.
When the current input character has no first related words, the calculation proceeds through formulas (1), (2), and (3), and the output calculated by formula (3) finally serves as the output of the first LSTM network model.
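One step of this character LSTM can be sketched in pure Python. For brevity the sketch uses scalar states and gates, whereas in the model they are vectors; the weight and bias values are hypothetical training parameters:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def char_lstm_step(x, h_prev, c_prev, W, b):
    """One step of the character LSTM (in the spirit of formulas (1)-(3)).

    Gates are computed from the concatenation [x; h_prev] with per-gate
    weight rows W and biases b; states are scalars for illustration."""
    z = [x, h_prev]
    def affine(w_row, bias):
        return sum(wi * zi for wi, zi in zip(w_row, z)) + bias
    i = sigmoid(affine(W["i"], b["i"]))          # input gate
    o = sigmoid(affine(W["o"], b["o"]))          # output gate
    f = sigmoid(affine(W["f"], b["f"]))          # forget gate
    c_tilde = math.tanh(affine(W["c"], b["c"]))  # candidate state
    c = f * c_prev + i * c_tilde                 # state update
    h = o * math.tanh(c)                         # hidden output
    return h, c
```

With all parameters at zero, every gate evaluates to 0.5 and the candidate state to 0, so the step returns a zero state and output, which is a convenient sanity check.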
When the current input character has first related words, formulas (1) and (3) above can serve as the first LSTM network model, while formulas (4)-(10) can form the second LSTM network model. The contents of formulas (4)-(10) are as follows:
Formula (4):
$$\begin{bmatrix} i_{b,e}^w \\ f_{b,e}^w \\ \widetilde{c}_{b,e}^w \end{bmatrix} = \begin{bmatrix} \sigma \\ \sigma \\ \tanh \end{bmatrix}\left( W^{wT} \begin{bmatrix} x_{b,e}^w \\ h_b^c \end{bmatrix} + b^w \right)$$
where $i_{b,e}^w$ denotes the input gate, $f_{b,e}^w$ the forget gate, and $\widetilde{c}_{b,e}^w$ the state information updated at the current time step; $\sigma$ denotes the Sigmoid function, an activation function often used as a threshold function of neural networks that maps a variable into the interval (0, 1), and $\tanh$ is likewise an activation function; $x_{b,e}^w$ is the word vector of the word spanning character positions $b$ to $e$; $h_b^c$ is the character hidden output at the word's first character; $W^{wT}$ denotes the weight vectors, which are training parameters; and $b^w$ denotes the bias vector, a model training parameter.
Formula (5):
$$c_{b,e}^w = f_{b,e}^w \odot c_b^c + i_{b,e}^w \odot \widetilde{c}_{b,e}^w$$
where $c_{b,e}^w$ denotes the current new state information, $f_{b,e}^w$ the forget gate, $c_b^c$ the state information of formula (2) at time step $b$ of the character sequence, $i_{b,e}^w$ the input gate, and $\widetilde{c}_{b,e}^w$ the state information updated at the current time step.
Formula (6):
$$i_{b,e}^{l} = t\!\left(c_{b,e}^w\right) = \sigma\!\left( w^{h}\, c_{b,e}^w + b^{h} \right)$$
where $i_{b,e}^{l}$ denotes the input gate; $t$ denotes a non-linear conversion layer, i.e., a fully connected layer with an activation function; $w^{h}$ denotes the weight vectors; $c_{b,e}^w$ denotes the current new state information; and $b^{h}$ denotes the bias vector.
Formula (7):
$$c_j^c = \sum_{b' \in B_j} \alpha_{b',j}^w \odot c_{b',j}^w + \alpha_j^c \odot \widetilde{c}_j^c$$
where $c_j^c$ denotes the weighted sum of the output of the LSTM network model for input character vectors and the outputs of the LSTM network model for input related-word vectors; $B_j$ denotes the set of start positions $b'$ of all words in the current sentence $D$ that begin at $b'$ and end at $j$; $c_{b',j}^w$ denotes the word state and $\alpha_{b',j}^w$ its weight; and $\widetilde{c}_j^c$ denotes the character candidate state and $\alpha_j^c$ its weight.
Formula (8):
$$\alpha_{b,j}^w = \frac{\exp\!\left(i_{b,j}^{l}\right)}{\exp\!\left(i_j^c\right) + \sum_{b' \in B_j} \exp\!\left(i_{b',j}^{l}\right)}$$
where $\alpha_{b,j}^w$ denotes the weight of the word state, $i_{b,j}^{l}$ the word input gate, $B_j$ the set of start positions of all words in the current sentence $D$ that end at $j$, and $i_j^c$ the character input gate.
Formula (9):
$$\alpha_j^c = \frac{\exp\!\left(i_j^c\right)}{\exp\!\left(i_j^c\right) + \sum_{b' \in B_j} \exp\!\left(i_{b',j}^{l}\right)}$$
where $\alpha_j^c$ denotes the weight of the character state, $i_j^c$ the character input gate, $B_j$ the set of start positions of all words in the current sentence $D$ that end at $j$, and $i_{b',j}^{l}$ their word input gates.
Formula (10):
$$t = \sigma\!\left( W^{tT} x + b^{t} \right)$$
where $t$ denotes the model output, $\sigma$ the activation function, $W^{tT}$ the weight vectors to be trained, and $b^{t}$ the bias vector to be trained.
When first related words exist, the second LSTM network model formed by formulas (4)-(10) calculates the first weighted sum value of the first related words, and the first LSTM network model composed of formulas (1) and (3) then obtains the forward output result from the first weighted sum value and the character vector.
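The weighted combination of word states and the character candidate state described around formulas (7)-(9) can be sketched as follows, again with scalar states for brevity. The gate values are hypothetical; in the model they come from the trained input gates:

```python
import math

def lattice_cell(c_tilde, i_char, word_cells, word_gates):
    """Combine the character candidate state with word cell states.

    The character gate i_char and the word gates are normalised with a
    softmax, and the resulting weights mix the character candidate
    c_tilde with the word states (scalar-state sketch)."""
    logits = [i_char] + word_gates
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    alpha = [e / s for e in exps]   # alpha[0] weights the char candidate
    return alpha[0] * c_tilde + sum(a * cw
                                    for a, cw in zip(alpha[1:], word_cells))
```

With no related words the character candidate passes through unchanged; with one word at an equal gate value, character and word states are mixed half and half.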
Referring to Fig. 2, the sentence to be processed in Fig. 2 is "长沙市橘子洲广场" ("Orange Isle Square, Changsha"). The character vectors of the characters "长", "沙", "市", "橘", "子", "洲", "广", and "场" can each be obtained from the character embedding dictionary. When the current input character is "市", the first related word is "长沙市" ("Changsha City"); when the current input character is "场", the first related words are "橘子洲广场" and "广场". The word vectors of "长沙市", "橘子洲广场", and "广场" can be obtained from the word embedding dictionary. For example, when the current input character is "场", the word vectors of "橘子洲广场" and "广场" are input into the second LSTM network model for processing to obtain the first hidden outputs. A weighted summation over the first hidden outputs yields the first weighted sum value, which is input into the first LSTM network model. Through the first LSTM network model, the forward output result of "场" is obtained according to the first weighted sum value and the character vector of "场". Since the first hidden outputs contain lexical information, the first weighted sum value also contains lexical information, so the forward output result contains the contextual information of the preceding part of the sentence.
Step S102: according to the reverse character order of the sentence to be processed, sequentially input the character vectors of the sentence into a reversed first LSTM network model; in the segment from the final character of the sentence to the current input character of the reversed first LSTM network model, obtain the second related words of the current input character; input the word vectors of the second related words into the second LSTM network model for processing to obtain second hidden outputs; perform a weighted summation over the second hidden outputs to obtain a second weighted sum value; input the second weighted sum value into the reversed first LSTM network model; and obtain, through the reversed first LSTM network model, a reverse output result according to the second weighted sum value and the character vector of the current input character.
In the present embodiment, the reversed first LSTM network model is likewise a long short-term memory network model, a kind of recurrent neural network model over time. The reversed first LSTM network model receives the character vectors one by one according to the reverse character order of the sentence. Its specific calculation process is similar to that of the first LSTM network model and, to avoid repetition, is not repeated here.
It should be noted that inputting the character vectors into the reversed first LSTM network model according to the reverse character order of the sentence is the reversed input process, i.e., the character sequence of the sentence is input from back to front. For the sentence "我喜欢长沙" ("I like Changsha"), the forward input order into the first LSTM network model is '我', '喜', '欢', '长', '沙', whereas the reverse input order into the reversed first LSTM network model is '沙', '长', '欢', '喜', '我'. For forward input, when the first LSTM network model processes '欢', the hidden output corresponding to the word vector of "喜欢" ("like") is obtained through the second LSTM network model. For reverse input, when the reversed first LSTM network model processes '喜', the second hidden output corresponding to the word vector of "喜欢" is obtained through the second LSTM network model. Referring to Fig. 2, when the current input character is "橘", the second related word is "橘子洲广场", whose word vector is input into the second LSTM network model for processing to obtain the second hidden outputs. A weighted summation over the second hidden outputs yields the second weighted sum value, which is input into the reversed first LSTM network model.
Step S103: process the forward output result and the reverse output result separately through a highway network layer to obtain a forward processing result and a reverse processing result, respectively, and concatenate the forward processing result and the reverse processing result to obtain a concatenated output result.
In the present embodiment, the English name of the highway network layer is highway layer. A highway layer lets part of the information pass through: a vector keeps its dimension after passing through the highway layer, but each component is scaled differently according to the trained parameters, so only part of the information is retained. Processing the forward output result and the reverse output result separately through the highway layer to obtain the forward processing result and the reverse processing result comprises the following procedure: processing the forward output result through the highway network layer used for named entity recognition to obtain the forward processing result, and processing the reverse output result through the highway network layer used for named entity recognition to obtain the reverse processing result.
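A common highway-layer formulation consistent with this description is y = t ⊙ H(x) + (1 − t) ⊙ x, where t is a sigmoid transform gate; the sketch below assumes that standard form, with hypothetical weights, and is not taken from the patent itself:

```python
import math

def highway(x, Wh, bh, Wt, bt):
    """Highway layer sketch: y = t * H(x) + (1 - t) * x, component-wise.

    Dimensions are preserved; each component of x passes through in
    proportion to (1 - t), so only part of the information is retained."""
    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))
    def affine(W, b):
        return [sum(w * xi for w, xi in zip(row, x)) + bb
                for row, bb in zip(W, b)]
    h = [math.tanh(v) for v in affine(Wh, bh)]   # transformed signal H(x)
    t = [sigmoid(v) for v in affine(Wt, bt)]     # transform gate t
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]
```

Driving the gate toward zero (a strongly negative gate bias) makes the layer pass its input through unchanged, which is what makes highway layers easy to train when stacked.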
In the present embodiment, the forward output result and the reverse output result each pass through a highway network layer and are then concatenated. The concatenated output result contains the contextual information of the preceding part and of the following part of the sentence, i.e., the contextual information of the whole sentence, and can effectively represent its full semantics. It should be noted that concatenation simply means splicing: since the forward output result contains the contextual information of the preceding part and the reverse output result contains that of the following part, the concatenated output result contains the contextual information of the sentence.
To supplement further, Fig. 2 includes a first LSTM network model 201, a concatenation model 202, a language model 203, a named entity recognition model 204, a first highway network layer 205, and a second highway network layer 206. The first LSTM network model 201 is connected to the first highway network layer 205 and the second highway network layer 206, respectively; the first highway network layer 205 is connected to the language model 203; the second highway network layer 206 is connected to the concatenation model 202; and the concatenation model 202 is connected to the named entity recognition model 204. The language model 203 can be removed without preventing the named entity recognition model 204 from performing the named entity recognition task; it is only that, with the computation of the language model 203 added, the named entity recognition task performs better.
The named entity recognition model 204 includes a fully connected layer and a CRF layer. On the one hand, the fully connected layer increases the fitting capacity of the model; on the other hand, it changes the dimension of the vectors. For example, a fully connected layer whose input is 200 neurons and whose output is 20 neurons converts each 200-dimensional vector into a 20-dimensional vector, which is convenient for the subsequent CRF layer to process. The role of the CRF layer is to predict the label of each character in the sentence.
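As an illustrative sketch (not the patent's implementation), the dimension change performed by such a fully connected layer can be shown with NumPy; the sizes 200 and 20 follow the example in the text, and the weights here are random stand-ins for trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes from the text's example: 200 input neurons, 20 output neurons
# (20 would be the number of NER label types, k).
n_chars, d_in, d_out = 8, 200, 20

W = rng.normal(scale=0.01, size=(d_in, d_out))  # trainable weight matrix (random stand-in)
b = np.zeros(d_out)                             # trainable bias (random stand-in)

hidden = rng.normal(size=(n_chars, d_in))       # one 200-d vector per character
scores = hidden @ W + b                         # the fully connected layer

print(scores.shape)  # (8, 20): one k-dimensional score vector per character
```

Each row of `scores` is what the CRF layer then treats as the emission scores of one character.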
The processing flow of the named entity recognition task consists of data processing in the second highway network layer 206, the cascade model 202, the fully connected layer and the CRF layer; the CRF layer performs joint label prediction on the sentence, and its output is the prediction result of named entity recognition.
It should be noted that the forward output result is processed by the second highway network layer 206 to obtain the forward processing result, the backward output result is processed by a highway network layer to obtain the backward processing result, and the cascade model 202 cascades the forward processing result and the backward processing result. The cascaded result is then fed into the fully connected layer and the CRF layer, and the prediction result of named entity recognition is obtained through the CRF layer.
Step S104: the cascaded output result is processed by the fully connected layer and the conditional random field (CRF) layer to obtain the named entity recognition label of each character.
Before the CRF layer, the data has been processed into an n*k tensor, where n is the number of characters in the sentence and k is the number of different label types. For each character, the goal is to select one of the k labels as its final tag. Without a CRF layer, each character could also be tagged independently: simply pass its k-dimensional vector through a softmax function and pick the label with the highest probability as the label of that character. However, this ignores the dependencies between labels, so unreasonable label sequences may appear. The CRF layer uses a transition matrix to compute the cost of moving from one label to the next and evaluates the labeling scheme of the whole sentence. The detailed steps by which the CRF tags the NER label of each character are as follows:
Assume the n*k tensor is P, the output of the layer before the CRF layer, where $P_{i,j}$ is the score of labeling the i-th character of the sentence with the j-th label. For a predicted sequence $y=(y_1, y_2, \ldots, y_n)$, its score is defined by formula (11):
Formula (11): $s(X,y)=\sum_{i=0}^{n}A_{y_i,y_{i+1}}+\sum_{i=1}^{n}P_{i,y_i}$
where $s(X,y)$ is the score of labeling sentence X with sequence y, $A_{y_i,y_{i+1}}$ is the transition score from label $y_i$ to label $y_{i+1}$, and $P_{i,y_i}$ is the score of labeling the i-th character with $y_i$.
A softmax over all possible label sequences yields the probability of a given label sequence, expressed by formula (12):
Formula (12): $p(y\mid X)=\dfrac{e^{s(X,y)}}{\sum_{\tilde{y}\in Y_X} e^{s(X,\tilde{y})}}$
where $p(y\mid X)$ is the probability that sentence X is labeled with sequence y, $Y_X$ is the set of all possible label sequences, $e^{s(X,y)}$ is the exponential of the score of the current label sequence, and $\sum_{\tilde{y}\in Y_X} e^{s(X,\tilde{y})}$ is the sum of the exponentials of the scores of all label sequences.
During training, the log-likelihood of the correct label sequence is maximized, as given by formula (13):
Formula (13): $\log p(y\mid X)=s(X,y)-\log\sum_{\tilde{y}\in Y_X} e^{s(X,\tilde{y})}$
In formula (13), $p(y\mid X)$ is the probability that sentence X is labeled with sequence y, $Y_X$ is the set of all possible label sequences, $s(X,y)$ is the score of the current label sequence, and $\sum_{\tilde{y}\in Y_X} e^{s(X,\tilde{y})}$ is the sum of the exponentials of the scores of all label sequences.
In the training or prediction stage, the output sequence with the largest score is taken as the sequence of named entity recognition labels.
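The scoring and normalization of formulas (11) and (12) can be sketched as follows. This is a brute-force illustration with made-up emission and transition scores; a real CRF layer computes the partition sum with dynamic programming, and start/stop transitions are omitted here for simplicity:

```python
import numpy as np
from itertools import product

def sequence_score(P, A, y):
    """s(X, y) per formula (11): emission scores P[i, y_i] plus
    transition scores A[y_i, y_{i+1}] (start/stop transitions omitted)."""
    emit = sum(P[i, y[i]] for i in range(len(y)))
    trans = sum(A[y[i], y[i + 1]] for i in range(len(y) - 1))
    return emit + trans

def sequence_prob(P, A, y):
    """p(y | X) per formula (12), by enumerating all k^n label sequences
    (fine for tiny n; real CRF layers use the forward algorithm)."""
    n, k = P.shape
    all_scores = [sequence_score(P, A, seq) for seq in product(range(k), repeat=n)]
    log_z = np.logaddexp.reduce(all_scores)   # log of the partition sum
    return float(np.exp(sequence_score(P, A, y) - log_z))

rng = np.random.default_rng(1)
n, k = 4, 3                          # 4 characters, 3 label types (made-up sizes)
P = rng.normal(size=(n, k))          # emission tensor from the previous layer
A = rng.normal(size=(k, k))          # label transition matrix
total = sum(sequence_prob(P, A, y) for y in product(range(k), repeat=n))
print(round(total, 6))  # → 1.0  (the normalized probabilities sum to one)
```

In the prediction stage, the argmax over these scores (computed by the Viterbi algorithm in practice) gives the label sequence.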
The named entity recognition method provided by the invention obtains a forward output result and a backward output result from the character vectors and word vectors, and cascades them to obtain a cascaded output result. Since the forward output result contains the contextual information of the front part of the sentence and the backward output result contains the contextual information of the rear part, cascading the two yields the complete contextual semantics. Processing the cascaded output result through the fully connected layer and the conditional random field (CRF) layer therefore yields more accurate named entity recognition labels for the characters, improving the overall effect of named entity recognition.
Referring to Fig. 3, performing the weighted summation on the first hidden outputs in step S101 to obtain the first weighted sum value includes the following procedure:
Step S1011: the first hidden outputs are fed into a softmax function, and the probability corresponding to each first hidden output is calculated by the softmax function.
Step S1012: the probability corresponding to each first hidden output is used as its weight, and the first hidden outputs are summed with these weights to obtain the first weighted sum value.
In the present embodiment, the normalizing softmax function transforms one sequence of numbers into another sequence in which every element lies between 0 and 1: larger numbers are mapped close to 1 and smaller numbers close to 0, so the result can be used to express probabilities.
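Steps S1011 and S1012 amount to a softmax-weighted sum. The sketch below uses made-up hidden outputs, and uses each vector's mean as a stand-in for whatever scalar scoring the trained model actually applies:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))       # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical: three first hidden outputs (one per matched word), 5-dimensional.
hidden_outputs = np.array([[0.2, 1.0, -0.5, 0.3, 0.0],
                           [1.1, -0.2, 0.4, 0.0, 0.6],
                           [-0.3, 0.5, 0.9, -0.1, 0.2]])

# Step S1011: one scalar score per hidden output (its mean here, as a stand-in),
# normalized into probabilities by the softmax function.
weights = softmax(hidden_outputs.mean(axis=1))

# Step S1012: use the probabilities as weights and take the weighted sum.
first_weighted_sum = weights @ hidden_outputs

print(first_weighted_sum.shape)  # (5,): same dimension as one hidden output
```

The weighted sum keeps the dimension of a single hidden output, which is what allows it to be fed back into the first LSTM network model.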
As a supplementary note, the hidden outputs may also be processed by a highway network layer first, and the processed hidden outputs are then fed into the softmax function to obtain the probabilities corresponding to the first hidden outputs. It should be noted that a highway network layer lets part of the information pass through: a vector keeps its dimension after the highway network layer, but each component passes through to a different degree depending on the trained parameters, so only part of the information is retained. In this way, the resulting probabilities are more accurate and the effect is better.
It should further be noted that highway network layers improve the expressive power and generalization ability of the model. Since every word related to the current character passes through a highway network layer, and the layer lets different information through for each word, it expresses the importance of each word and which words the current character is most concerned with. The probabilities obtained for the first hidden outputs are then more accurate, which effectively ensures accurate subsequent NER.
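A minimal sketch of a highway layer, assuming the standard formulation T(x)·H(x) + (1−T(x))·x with random stand-in weights; it illustrates the text's point that the dimension is preserved while a learned gate decides how much of each component passes through:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway(x, W_h, b_h, W_t, b_t):
    """Highway layer: T(x)*H(x) + (1 - T(x))*x, so the input dimension is
    preserved and each component passes through to a learned degree."""
    h = np.tanh(x @ W_h + b_h)          # candidate transform H(x)
    t = sigmoid(x @ W_t + b_t)          # transform gate T(x), each entry in (0, 1)
    return t * h + (1.0 - t) * x        # carry the rest of x through unchanged

rng = np.random.default_rng(2)
d = 6
x = rng.normal(size=(d,))
W_h, b_h = rng.normal(scale=0.1, size=(d, d)), np.zeros(d)
W_t, b_t = rng.normal(scale=0.1, size=(d, d)), np.full(d, -1.0)  # bias gate toward carrying

y = highway(x, W_h, b_h, W_t, b_t)
print(y.shape == x.shape)  # → True: the dimension is unchanged, as the text notes
```

The negative gate bias is a common initialization choice that starts the layer close to the identity, so training decides how much information to transform.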
As a further supplement, performing the weighted summation on the second hidden outputs in step S102 to obtain the second weighted sum value includes the following procedure:
The second hidden outputs are fed into a softmax function, and the probability corresponding to each second hidden output is calculated by the softmax function.
The probability corresponding to each second hidden output is used as its weight, and the second hidden outputs are summed with these weights to obtain the second weighted sum value.
Referring to Fig. 4, step S104 may include the following steps:
Step S1041: the cascaded output result is fed into a 2a*c fully connected layer to obtain first output information, where a is the output dimension of the first LSTM network model and c is the number of named entity recognition labels.
Step S1042: the first output information is fed into the CRF layer for processing, and the named entity recognition label of each character is obtained.
In the present embodiment, the forward processing result output by the forward highway network layer used for named entity recognition and the backward processing result output by the backward highway network layer are cascaded before being connected to the fully connected layer and the CRF layer, which is why a 2a*c fully connected layer must be provided.
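Step S1041 can be sketched as concatenation followed by a 2a*c linear map; the sizes a=100 and c=20 are assumptions for illustration, and the weights are random stand-ins for trained parameters:

```python
import numpy as np

rng = np.random.default_rng(3)
n_chars, a, c = 7, 100, 20   # a: LSTM output dim, c: number of NER labels (assumed sizes)

fwd = rng.normal(size=(n_chars, a))    # forward processing result (after highway layer)
bwd = rng.normal(size=(n_chars, a))    # backward processing result (after highway layer)

cascaded = np.concatenate([fwd, bwd], axis=-1)   # cascading = concatenation, dim 2a

W = rng.normal(scale=0.01, size=(2 * a, c))      # the 2a*c fully connected layer
b = np.zeros(c)
label_scores = cascaded @ W + b                  # first output information, fed to the CRF layer

print(cascaded.shape, label_scores.shape)  # (7, 200) (7, 20)
```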
Referring to Fig. 5, after step S104 the named entity recognition method further includes the following steps:
Step S105: the forward output result is fed into a highway network layer to obtain second output information;
Step S106: the second output information is fed into an a*b fully connected layer to obtain the prediction probability of the character following the current character, where a is the output dimension of the first LSTM network model and b is the number of distinct characters in the corpus;
Step S107: the named entity recognition label of the current character is corrected according to the prediction probability of its next character.
For example, referring to Fig. 2, if the current character is "市", the probability of the next character (for instance "橘") is predicted from the sequence information and lexical information preceding "市". As another example, for the sentence "我喜欢长沙" ("I like Changsha"), after the character "喜" passes through the first LSTM network model and the first output information is produced, the probability that the next character is "欢" is predicted from that first output information.
As a supplementary note, in the present embodiment, steps S105 and S106 can be carried out after step S101; after step S106 obtains the prediction probability of the character following the current character, that prediction probability is stored. After step S104, step S107 is carried out, and the named entity recognition label of the current character is corrected according to the stored prediction probability of its next character.
In this way, the probability distribution of the next character can be predicted, and correcting the named entity recognition label of the current character according to that prediction probability improves the accuracy of the labels, so that the effect of named entity recognition is better.
The named entity recognition method provided by the embodiment of the present invention can combine multiple types of LSTM models to obtain a forward output result and a backward output result from the character vectors and word vectors, and cascades them to obtain a cascaded output result. Since the forward output result contains the contextual information of the front part of the sentence and the backward output result contains the contextual information of the rear part, cascading the two yields the complete contextual semantics, so that processing the cascaded output result through the fully connected layer and the conditional random field (CRF) layer yields more accurate named entity recognition labels for the characters, improving the overall effect of named entity recognition.
Fig. 6 shows a structural schematic diagram of a named entity recognition device 600 provided by an embodiment of the present invention; for ease of explanation, only the parts relevant to the present invention are shown. The named entity recognition device 600 comprises:
a forward result processing module 601, configured to feed the character vectors of a sentence to be processed into a first long short-term memory (LSTM) network model in the forward character order of the sentence; to obtain, within the segment from the beginning character of the sentence to the character currently being input into the first LSTM network model, the first correlation words of the current input character; to feed the word vectors of the first correlation words into a second LSTM network model for processing, obtaining first hidden outputs; to perform a weighted summation on the first hidden outputs to obtain a first weighted sum value; and to feed the first weighted sum value into the first LSTM network model, which obtains the forward output result from the first weighted sum value and the character vector of the current input character.
In the present embodiment, the corpus is first split into sentences, and each sentence is processed separately. Before processing, all characters and words occurring in the corpus are trained by methods such as word2vec, yielding a pretrained character embedding dictionary and a pretrained word embedding dictionary. Each dictionary maps every character or word to its corresponding dense vector. That is, word vectors represent a sentence segmented into words, with each word vectorized as a unit, while character vectors vectorize each individual character of the sentence. Word vectors and character vectors are described below.
In the present embodiment, the character vectors of the characters in the sentence to be processed can be determined from the pre-obtained character embedding dictionary; within the segment from the beginning character of the sentence to the character currently being input into the first LSTM network model, the first correlation words of the current input character are obtained, and their word vectors are determined from the pre-obtained word embedding dictionary.
In the present embodiment, the first LSTM network model is a long short-term memory (Long Short-Term Memory, LSTM) network model, a kind of temporal recurrent neural network model. The first LSTM network model is the LSTM network model that processes character vectors, and the second LSTM network model is the LSTM network model that processes word vectors. Since word vectors are fed into the second LSTM network model, the first hidden outputs contain lexical information.
It should be noted that feeding the character vectors into the first LSTM network model in the forward character order of the sentence means feeding the character sequence of the sentence from front to back. For example, for the sentence "我喜欢长沙" ("I like Changsha"), the forward character order is "我", "喜", "欢", "长", "沙".
As a further illustration, the first correlation words are the words, within the segment from the beginning of the sentence to the current character, that end with the current character. First correlation words may or may not exist, and where they exist there may be one or more of them. For example, in the segment "长沙市橘子洲广场" ("Orange Isle Square, Changsha"), the words ending with the character "场" are "橘子洲广场" ("Orange Isle Square") and "广场" ("square"), so the first correlation words of the character "场" are "橘子洲广场" and "广场".
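The lookup of first correlation words can be sketched as matching, at each character position, the lexicon words that end there. The small lexicon below is an illustrative stand-in for the pretrained word embedding dictionary:

```python
# Illustrative lexicon: stand-in for the pretrained word embedding dictionary.
lexicon = {"长沙", "长沙市", "橘子洲", "橘子洲广场", "广场"}
sentence = "长沙市橘子洲广场"
max_word_len = max(len(w) for w in lexicon)

def correlation_words(sentence, j):
    """Words in the segment [0, j] that END at character position j (0-based),
    matching the text's example: for '场' (j=7) the matches are
    '橘子洲广场' and '广场'."""
    out = []
    for b in range(max(0, j + 1 - max_word_len), j + 1):
        if sentence[b:j + 1] in lexicon:
            out.append(sentence[b:j + 1])
    return out

print(correlation_words(sentence, 7))  # → ['橘子洲广场', '广场']
```

A real implementation would typically use a trie over the lexicon instead of substring probing, but the matching rule is the same.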
Specifically, the computation differs according to whether the current input character has first correlation words. When the current input character has no first correlation words, the first LSTM network model can be formed from formulas (1), (2) and (3), whose contents are as follows:
Formula (1): $\begin{bmatrix} i_j^c \\ o_j^c \\ f_j^c \\ \tilde{c}_j^c \end{bmatrix} = \begin{bmatrix} \sigma \\ \sigma \\ \sigma \\ \tanh \end{bmatrix}\left( W^{cT}\begin{bmatrix} x_j^c \\ h_{j-1}^c \end{bmatrix} + b^c \right)$
where $i_j^c$ is the input gate, $o_j^c$ the output gate, $f_j^c$ the forget gate, and $\tilde{c}_j^c$ the state information updated at the current time step; σ is the Sigmoid function, often used as a threshold function of neural networks, which maps a variable into the interval (0, 1), and tanh is likewise an activation function (both are described in standard texts and reference works); $W^{cT}$ is the weight matrix, a training parameter; $x_j^c$ is the character vector; $h_{j-1}^c$ is the output of the previous time step; and $b^c$ is the bias vector, a training parameter.
Formula (2): $c_j^c = f_j^c \odot c_{j-1}^c + i_j^c \odot \tilde{c}_j^c$
where $c_j^c$ is the current new state information, $f_j^c$ the forget gate, $c_{j-1}^c$ the state information of the previous time step, $i_j^c$ the input gate, and $\tilde{c}_j^c$ the state information updated at the current time step.
Formula (3): $h_j^c = o_j^c \odot \tanh(c_j^c)$
where $h_j^c$ is the current hidden-layer output, $o_j^c$ the output gate, and $\tanh(c_j^c)$ the current new state information passed through the activation function.
When the current input character has no first correlation words, information flows in the order formula (1), formula (2), formula (3), and the output finally computed by formula (3) serves as the output of the first LSTM network model.
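This flow can be sketched as one character step per the reconstructed formulas (1)-(3), with random stand-in weights (the real parameters are learned in training):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def char_lstm_step(x_c, h_prev, c_prev, W, b):
    """One character step of the first LSTM network model per formulas (1)-(3),
    used when the current character has no first correlation word."""
    d = h_prev.size
    z = W @ np.concatenate([x_c, h_prev]) + b      # formula (1): joint projection
    i = sigmoid(z[0:d])                            # input gate
    o = sigmoid(z[d:2 * d])                        # output gate
    f = sigmoid(z[2 * d:3 * d])                    # forget gate
    c_tilde = np.tanh(z[3 * d:4 * d])              # candidate state update
    c = f * c_prev + i * c_tilde                   # formula (2): new cell state
    h = o * np.tanh(c)                             # formula (3): hidden output
    return h, c

rng = np.random.default_rng(4)
d_char, d_hid = 8, 6                               # assumed toy dimensions
W = rng.normal(scale=0.1, size=(4 * d_hid, d_char + d_hid))
b = np.zeros(4 * d_hid)
h, c = char_lstm_step(rng.normal(size=d_char), np.zeros(d_hid), np.zeros(d_hid), W, b)
print(h.shape, c.shape)  # (6,) (6,)
```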
When the current input character has first correlation words, the aforementioned formulas (1) and (3) can serve as the first LSTM network model, and formulas (4)-(10) can form the second LSTM network model. The contents of formulas (4)-(10) are as follows:
Formula (4): $\begin{bmatrix} i_{b,e}^w \\ f_{b,e}^w \\ \tilde{c}_{b,e}^w \end{bmatrix} = \begin{bmatrix} \sigma \\ \sigma \\ \tanh \end{bmatrix}\left( W^{wT}\begin{bmatrix} x_{b,e}^w \\ h_b^c \end{bmatrix} + b^w \right)$
where $i_{b,e}^w$ is the input gate, $f_{b,e}^w$ the forget gate, and $\tilde{c}_{b,e}^w$ the state information updated at the current time step; σ is the Sigmoid activation function, which maps a variable into the interval (0, 1), and tanh is likewise an activation function; $x_{b,e}^w$ is the word vector of the word spanning characters b to e, $h_b^c$ is the hidden output at the word's first character, $W^{wT}$ is the weight matrix, a training parameter, and $b^w$ is the bias vector, a training parameter.
Formula (5): $c_{b,e}^w = f_{b,e}^w \odot c_b^c + i_{b,e}^w \odot \tilde{c}_{b,e}^w$
where $c_{b,e}^w$ is the current new state information of the word, $f_{b,e}^w$ the forget gate, $c_b^c$ the state information from formula (2) at time step b of the character-sequence processing, $i_{b,e}^w$ the input gate, and $\tilde{c}_{b,e}^w$ the state information updated at the current time step.
Formula (6): $i_{b,e}^l = \sigma\left( W^{lT}\begin{bmatrix} x_e^c \\ c_{b,e}^w \end{bmatrix} + b^l \right)$
where $i_{b,e}^l$ is the link gate of the word, computed by a non-linear conversion layer, i.e. a fully connected layer with an activation function; $W^{lT}$ is its weight matrix, $x_e^c$ the character vector at the word's end position, $c_{b,e}^w$ the current new state information of the word, and $b^l$ its bias vector.
Formula (7): $c_j^c = \sum_{b\in\{b'\mid w_{b',j}\in D\}} \alpha_{b,j}^c \odot c_{b,j}^w + \alpha_j^c \odot \tilde{c}_j^c$
where $c_j^c$ is the weighted sum of the output of the LSTM network model that inputs character vectors and the outputs of the LSTM network model that inputs the related word vectors; b ranges over the set of all positions b' such that a word starting at b' and ending at j exists, $w_{b',j}$ denotes the word starting at b' and ending at j, and D denotes the current sentence's lexicon; $\alpha_{b,j}^c$ is the weight of the word state $c_{b,j}^w$, and $\alpha_j^c$ is the weight of the character candidate state $\tilde{c}_j^c$.
Formula (8): $\alpha_{b,j}^c = \dfrac{\exp\left(i_{b,j}^l\right)}{\exp\left(i_j^c\right) + \sum_{b''\in B_j}\exp\left(i_{b'',j}^l\right)}$
where $\alpha_{b,j}^c$ is the weight of the word state, $i_{b,j}^l$ the link gate of the word, $B_j$ the set of the starting positions of all words in the current sentence ending at j, and $i_j^c$ the input gate of the character.
Formula (9): $\alpha_j^c = \dfrac{\exp\left(i_j^c\right)}{\exp\left(i_j^c\right) + \sum_{b''\in B_j}\exp\left(i_{b'',j}^l\right)}$
where $\alpha_j^c$ is the weight of the character state, $i_j^c$ the input gate of the character, and $B_j$ the set of the starting positions of all words in the current sentence ending at j.
Formula (10): $t = \sigma\left(W^{T} h + b^{T}\right)$
where t is the model output, σ the activation function, $W^{T}$ the weight matrix to be trained, h the hidden output, and $b^{T}$ the bias vector to be trained.
When first correlation words exist, the first weighted sum value of the first correlation words is computed by the second LSTM network model formed by formulas (4)-(10), and the forward output result is then obtained by passing the first weighted sum value and the character vector through the first LSTM network model formed by formulas (1) and (3).
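The gate normalization of formulas (7)-(9) can be sketched as a dimension-wise softmax over the character's input gate and the link gates of the matched words, followed by the weighted sum of cell states; all values below are random stand-ins:

```python
import numpy as np

def merge_word_cells(i_char, c_tilde, word_gates, word_cells):
    """Formulas (7)-(9): normalize the character input gate and the link gates
    of all words ending at the current character with a dimension-wise softmax
    (formulas (8) and (9)), then take the weighted sum of the corresponding
    cell states (formula (7))."""
    gates = np.vstack([i_char] + word_gates)            # row 0: character gate
    alphas = np.exp(gates) / np.exp(gates).sum(axis=0)  # softmax per dimension
    cells = np.vstack([c_tilde] + word_cells)           # candidate + word cell states
    return (alphas * cells).sum(axis=0)                 # merged cell state c_j

rng = np.random.default_rng(5)
d = 6
i_char, c_tilde = rng.normal(size=d), rng.normal(size=d)
word_gates = [rng.normal(size=d), rng.normal(size=d)]   # e.g. for '橘子洲广场' and '广场'
word_cells = [rng.normal(size=d), rng.normal(size=d)]

c_j = merge_word_cells(i_char, c_tilde, word_gates, word_cells)
print(c_j.shape)  # (6,)
```

Because the weights in each dimension sum to one, the merged state stays on the same scale as its inputs regardless of how many words end at the current character.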
Referring to Fig. 2, the sentence to be processed in Fig. 2 is "长沙市橘子洲广场" ("Orange Isle Square, Changsha"). The character vectors of the characters "长", "沙", "市", "橘", "子", "洲", "广" and "场" can each be obtained from the character embedding dictionary. When the current input character is "市", the first correlation word is "长沙市" ("Changsha City"); when the current input character is "场", the first correlation words are "橘子洲广场" ("Orange Isle Square") and "广场" ("square"), and the word vectors of "长沙市", "橘子洲广场" and "广场" can be obtained from the word embedding dictionary. For example, when the current input character is "场", the word vectors of "橘子洲广场" and "广场" are fed into the second LSTM network model for processing, and the first hidden outputs are obtained. A weighted summation is performed on the first hidden outputs to obtain the first weighted sum value, which is fed into the first LSTM network model; the first LSTM network model then obtains the forward output result from the first weighted sum value and the character vector of the current input character.
a reverse result processing module 602, configured to feed the character vectors of the sentence to be processed into a reversed first LSTM network model in the reverse character order of the sentence; to obtain, within the segment from the final character of the sentence to the character currently being input into the reversed first LSTM network model, the second correlation words of the current input character; to feed the word vectors of the second correlation words into the second LSTM network model for processing, obtaining second hidden outputs; to perform a weighted summation on the second hidden outputs to obtain a second weighted sum value; and to feed the second weighted sum value into the reversed first LSTM network model, which obtains the backward output result from the second weighted sum value and the character vector of the current input character.
In the present embodiment, the reversed first LSTM network model is also a long short-term memory network model, a kind of temporal recurrent neural network model, into which the character vectors are fed in the reverse character order of the sentence to be processed. Its computation is similar to that of the first LSTM network model and, to avoid repetition, is not described again here.
It should be noted that feeding the character vectors into the reversed first LSTM network model in the reverse character order of the sentence is the reverse input process, i.e. the character sequence of the sentence is fed in from back to front. For the sentence "我喜欢长沙" ("I like Changsha"), the forward input order into the first LSTM network model is "我", "喜", "欢", "长", "沙", whereas the reverse input order into the reversed first LSTM network model is "沙", "长", "欢", "喜", "我". In the forward direction, when the first LSTM network model reaches "欢", the word vector of "喜欢" ("like") passes through the second LSTM network model and its corresponding hidden output is obtained.
As for feeding the character vectors of the sentence into the reversed first LSTM network model in the reverse character order, i.e. the reverse input: when the reversed first LSTM network model reaches "喜", the word vector of "喜欢" ("like") passes through the second LSTM network model and its corresponding second hidden output is obtained. Referring to Fig. 2, when the current input character is "橘", the second correlation word is "橘子洲广场" ("Orange Isle Square"): the word vector of "橘子洲广场" is fed into the second LSTM network model for processing, and the second hidden output is obtained. A weighted summation is performed on the second hidden outputs to obtain the second weighted sum value, which is fed into the reversed first LSTM network model; the reversed first LSTM network model obtains the backward output result of "橘" from the second weighted sum value and the character vector of "橘". Since the second hidden outputs contain lexical information, the second weighted sum value also contains lexical information, so the backward output result contains the contextual information of the rear part of the sentence.
a cascade module 603, configured to process the forward output result and the backward output result separately through a highway network layer (highway layer), obtaining a forward processing result and a backward processing result respectively, and to cascade the forward processing result and the backward processing result to obtain the cascaded output result.
In the present embodiment, the English name of the highway network layer is highway layer. A highway layer lets part of the information pass through: a vector keeps its dimension after the highway layer, but each component passes through to a different degree depending on the trained parameters, so only part of the information is retained. Processing the forward output result and the backward output result separately through the highway network layer to obtain the forward processing result and the backward processing result includes the following procedure: the forward output result is processed by the highway network layer used for named entity recognition to obtain the forward processing result, and the backward output result is processed by the highway network layer used for named entity recognition to obtain the backward processing result.
In the present embodiment, the forward output result and the backward output result each pass through a highway network layer and are then cascaded. The cascaded output result contains the contextual information of both the front part and the rear part of the sentence; that is, the cascaded output result contains the contextual information of the whole sentence and can effectively represent its full semantics.
It should be noted that cascading simply means concatenation: the forward output result contains the contextual information of the front part and the backward output result contains the contextual information of the rear part, so the cascaded output result contains the contextual information of the whole sentence.
As a further supplement, Fig. 2 includes the first LSTM network model 201, the cascade model 202, the language model 203, the named entity recognition model 204, the first highway network layer 205 and the second highway network layer 206. The first LSTM network model 201 is connected to the first highway network layer 205 and the second highway network layer 206 respectively; the first highway network layer 205 is connected to the language model 203, the second highway network layer 206 is connected to the cascade model 202, and the cascade model 202 is connected to the named entity recognition model 204. The language model 203 can be removed entirely without preventing the named entity recognition model 204 from performing the named entity recognition task; it is only that, with the language-model computation of the language model 203 added, the named entity recognition task performs better.
The named entity recognition model 204 includes a fully connected layer and a CRF layer. On the one hand, the fully connected layer increases the fitting capacity of the model; on the other hand, it changes the dimension of the vectors. For example, a fully connected layer whose input is 200 neurons and whose output is 20 neurons converts each 200-dimensional vector into a 20-dimensional vector, which is convenient for the subsequent CRF layer to process. The role of the CRF layer is to predict the label of each character in the sentence.
The processing flow of the named entity recognition task consists of data processing in the second highway network layer 206, the cascade model 202, the fully connected layer and the CRF layer; the CRF layer performs joint label prediction on the sentence, and its output is the prediction result of named entity recognition.
It should be noted that the forward output result is processed by the second highway network layer 206 to obtain the forward processing result, the backward output result is processed by a highway network layer to obtain the backward processing result, and the cascade model 202 cascades the forward processing result and the backward processing result. The cascaded result is then fed into the fully connected layer and the CRF layer, and the prediction result of named entity recognition is obtained through the CRF layer.
a named entity processing module 604, configured to process the cascaded output result through the fully connected layer and the conditional random field (CRF) layer to obtain the named entity recognition label of each character.
Before the CRF layer, the data has been processed into an n*k tensor, where n is the number of characters in the sentence and k is the number of different label types. For each character, the goal is to select one of the k labels as its final tag. Without a CRF layer, each character could also be tagged independently: simply pass its k-dimensional vector through a softmax function and pick the label with the highest probability as the label of that character. However, this ignores the dependencies between labels, so unreasonable label sequences may appear. The CRF layer uses a transition matrix to compute the cost of moving from one label to the next and evaluates the labeling scheme of the whole sentence. The detailed steps by which the CRF tags the NER label of each character are as follows:
Assume the n*k tensor is P, the output of the layer before the CRF layer, where $P_{i,j}$ is the score of labeling the i-th character of the sentence with the j-th label. For a predicted sequence $y=(y_1, y_2, \ldots, y_n)$, its score is defined by formula (11):
Formula (11): $s(X,y)=\sum_{i=0}^{n}A_{y_i,y_{i+1}}+\sum_{i=1}^{n}P_{i,y_i}$
where $s(X,y)$ is the score of labeling sentence X with sequence y, $A_{y_i,y_{i+1}}$ is the transition score from label $y_i$ to label $y_{i+1}$, and $P_{i,y_i}$ is the score of labeling the i-th character with $y_i$.
A softmax over all possible label sequences yields the probability of a given label sequence, expressed by formula (12):
Formula (12): $p(y\mid X)=\dfrac{e^{s(X,y)}}{\sum_{\tilde{y}\in Y_X} e^{s(X,\tilde{y})}}$
where $p(y\mid X)$ is the probability that sentence X is labeled with sequence y, $Y_X$ is the set of all possible label sequences, $e^{s(X,y)}$ is the exponential of the score of the current label sequence, and $\sum_{\tilde{y}\in Y_X} e^{s(X,\tilde{y})}$ is the sum of the exponentials of the scores of all label sequences.
During training, the log-likelihood of the correct label sequence is maximized, as given by formula (13):
Formula (13): $\log p(y\mid X)=s(X,y)-\log\sum_{\tilde{y}\in Y_X} e^{s(X,\tilde{y})}$
In formula (13), $p(y\mid X)$ is the probability that sentence X is labeled with sequence y, $Y_X$ is the set of all possible label sequences, $s(X,y)$ is the score of the current label sequence, and $\sum_{\tilde{y}\in Y_X} e^{s(X,\tilde{y})}$ is the sum of the exponentials of the scores of all label sequences.
In the training or prediction stage, the output sequence with the largest score is obtained as the named entity recognition labels.
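For a tiny sentence, formulas (12)–(13) and the prediction-stage argmax can be checked by brute force over all k^n label sequences. This is an illustration only, under assumed toy dimensions; a practical CRF layer uses the forward algorithm for the partition function and Viterbi decoding for the argmax rather than enumeration:

```python
import itertools
import numpy as np

def brute_force_crf(P, A):
    """Enumerate every label sequence y in Y_X, score it with formula (11)
    (start/stop transitions omitted here for brevity), and return the
    log-partition log(sum_y e^{s(X,y)}) plus the largest-score sequence."""
    n, k = P.shape
    scores = {}
    for y in itertools.product(range(k), repeat=n):
        s = sum(P[i, y[i]] for i in range(n))
        s += sum(A[y[i], y[i + 1]] for i in range(n - 1))
        scores[y] = s
    vals = np.array(list(scores.values()))
    log_z = np.logaddexp.reduce(vals)      # log of the denominator in formula (12)
    best = max(scores, key=scores.get)     # prediction stage: argmax sequence
    return log_z, best, scores

n, k = 3, 2
rng = np.random.default_rng(1)
P = rng.normal(size=(n, k))
A = rng.normal(size=(k, k))
log_z, best, scores = brute_force_crf(P, A)

# Formula (13): log-likelihood of a gold sequence y* is s(X, y*) - log_z
gold = (0, 1, 0)
log_likelihood = scores[gold] - log_z
print(best, log_likelihood)
```

Because p(y|X) ≤ 1, the log-likelihood is never positive, and training pushes s(X, y*) toward log_z for the correct sequence.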
The named entity recognition device provided by the present invention can obtain a forward output result and a reverse output result from the character vectors and word vectors, and cascade the forward output result with the reverse output result to obtain a cascaded output result. Because the forward output result contains the contextual information of the preceding part of the sentence and the reverse output result contains the contextual information of the following part, complete contextual semantics can be obtained after the forward output result and the reverse output result are cascaded. The cascaded output result is then processed by a fully connected layer and a conditional random field (CRF) layer, so the named entity recognition labels obtained for the characters are more accurate, improving the overall effect of named entity recognition.
Referring to Fig. 7, the forward result processing module 601 comprises:

A computation submodule 6011, configured to input the first hidden outputs into a softmax function model, and calculate the probabilities corresponding to the first hidden outputs through the softmax function model.

A processing submodule 6012, configured to take the probabilities corresponding to the first hidden outputs as the weights of the first hidden outputs, and perform a weighted summation of the first hidden outputs according to those weights, obtaining the first weighted sum result.
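The two submodules can be sketched together: softmax turns per-output scores into probabilities, and those probabilities weight the summation of the hidden outputs. The dimensions, the random hidden outputs, and the idea of a scalar relevance score per hidden output are assumptions for illustration; the patent does not fix these details:

```python
import numpy as np

def softmax(x):
    # subtract the max for numerical stability; outputs lie in (0, 1) and sum to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

def weighted_hidden_sum(hidden, scores):
    """Computation submodule: softmax over per-output scores -> probabilities.
    Processing submodule: use those probabilities as weights in a weighted sum."""
    weights = softmax(scores)           # one probability per hidden output
    return weights @ hidden, weights    # weighted sum value of shape (d,)

rng = np.random.default_rng(2)
hidden = rng.normal(size=(5, 8))   # 5 hidden outputs from the 2nd LSTM, dim 8 (assumed)
scores = rng.normal(size=5)        # scalar relevance score per hidden output (assumed)
summed, weights = weighted_hidden_sum(hidden, scores)
print(summed.shape, weights.sum())
```

The resulting weighted sum value is what gets fed back into the first LSTM network model alongside the character vector.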
In the present embodiment, the normalizing softmax function model transforms one number sequence into another number sequence in which every number lies between 0 and 1: larger numbers in the original sequence map close to 1 after the transformation, and smaller numbers map close to 0, so the result can be used to express the concept of probability.
It should be added that the hidden outputs may also be processed by a highway network layer first, with the processed hidden outputs then input into the softmax function model to obtain the probabilities corresponding to the first hidden outputs. It should be noted that a highway network layer lets only part of the information pass through: for a vector, the dimension remains unchanged after the highway network layer, but each component of the vector is altered according to the trained parameters, so only partial information is retained. In this way, the obtained probabilities are more accurate and the effect is better.

It should be further noted that a highway network layer can increase the expressive power and generalization ability of the model. Since the words relevant to each character pass through the highway network layer, and the highway network layer lets different information through for each word, it can express the importance of each word and which words the current character attends to most. In this way, the probabilities obtained for the first hidden outputs are more accurate, which effectively ensures accurate subsequent NER recognition.
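A minimal sketch of a standard highway layer illustrates the "same dimension in and out, only partial information passes" property described above. The weights below are random placeholders, not the patent's trained parameters, and the gate formulation follows the usual highway-network definition y = t·H(x) + (1−t)·x:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway(x, W_h, b_h, W_t, b_t):
    """Highway layer: y = t * H(x) + (1 - t) * x. The transform gate t decides,
    per component, how much transformed information passes through; the carry
    gate 1 - t lets the rest of the input flow through unchanged."""
    h = np.tanh(W_h @ x + b_h)      # candidate transformation H(x)
    t = sigmoid(W_t @ x + b_t)      # transform gate, each component in (0, 1)
    return t * h + (1.0 - t) * x    # input and output share the same dimension

d = 6
rng = np.random.default_rng(3)
x = rng.normal(size=d)
W_h, b_h = rng.normal(size=(d, d)), np.zeros(d)
W_t, b_t = rng.normal(size=(d, d)), np.full(d, -2.0)  # negative bias favors carrying x
y = highway(x, W_h, b_h, W_t, b_t)
print(y.shape)
```

With the gate driven toward 0 the layer passes the input through untouched, which is why the output keeps the input's dimension while selectively retaining information.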
It should be added that the reverse result processing module 602 is further configured to:

Input the second hidden outputs into the softmax function model, and calculate the probabilities corresponding to the second hidden outputs through the softmax function model.

Take the probabilities corresponding to the second hidden outputs as the weights of the second hidden outputs, and perform a weighted summation of the second hidden outputs according to those weights, obtaining the second weighted sum result.
Referring to Fig. 8, the named entity processing module 604 comprises:

A first input submodule 6041, configured to input the cascaded output result into a 2a*c fully connected layer to obtain the first output information, where a is the dimension of the output of the first LSTM network model and c is the number of labels for named entity recognition.

A second input submodule 6042, configured to input the first output information into the CRF layer for processing, obtaining the named entity recognition labels of the characters.
In the present embodiment, the forward processing result output by the forward highway network layer used for named entity recognition and the reverse processing result output by the reverse highway network layer need to be cascaded. Cascading here means splicing: after splicing, the result is fed into the fully connected layer and the CRF layer, which is why a 2a*c fully connected layer needs to be configured.
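The splicing and the 2a*c fully connected layer can be illustrated with toy dimensions (a = 4 and c = 5 are assumed values, and the weights are random placeholders):

```python
import numpy as np

a, c = 4, 5   # a: LSTM output dimension; c: number of NER labels (toy values)
rng = np.random.default_rng(4)

forward_result = rng.normal(size=a)    # forward highway-layer output
reverse_result = rng.normal(size=a)    # reverse highway-layer output

# Cascading means splicing: the concatenated vector has dimension 2a,
# so the fully connected layer must be 2a x c to map it to per-label scores.
cascaded = np.concatenate([forward_result, reverse_result])
W = rng.normal(size=(2 * a, c))
b = np.zeros(c)
first_output_info = cascaded @ W + b   # per-label scores fed to the CRF layer
print(cascaded.shape, first_output_info.shape)
```

The c scores per character produced here are exactly the emission scores P_{i,j} that the CRF layer consumes.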
Referring to Fig. 9, the named entity recognition device further comprises:

A first input module 605, configured to input the forward output result into a highway network layer to obtain the second output information;

A second input module 606, configured to input the second output information into an a*b fully connected layer to obtain the prediction probability of the next character after the current character, where a is the dimension of the output of the first LSTM network model and b is the number of all characters in the corpus;

A correction module 607, configured to correct the named entity recognition label of the current character according to the prediction probability of the next character after the current character.
It should be added that, in one embodiment, after the named entity processing module 604 obtains the named entity recognition labels of the characters, the first input module 605, the second input module 606 and the correction module 607 respectively perform the corresponding operations, completing the correction of the named entity recognition label of the current character.

In another embodiment, after the forward result processing module 601 obtains the forward output result, the first input module 605 and the second input module 606 respectively perform the corresponding operations to obtain and store the prediction probability of the next character after the current character. After the named entity processing module 604 obtains the named entity recognition labels of the characters, the correction module 607 corrects the named entity recognition label of the current character according to the stored prediction probability of the next character.
For example, referring to Fig. 2, if the current character is "city", the probability that the next character is "day" is predicted from the sequence information and lexical information preceding the character "city". As another example, take the sentence "I like Changsha": after the character "喜" (the first character of "like") outputs the first output information through the first LSTM network model, the probability that the next character is "欢" is predicted according to the first output information.

In this way, the probability distribution over the next character can be predicted, and the named entity recognition label of the current character can be corrected according to the prediction probability of its next character. This improves the accuracy of the named entity recognition labels, so that the effect of named entity recognition is better.
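The correction step can be sketched as follows: the a*b fully connected layer turns the forward output into a probability distribution over the b corpus characters, and a label whose predicted next character is implausible can be flagged for revision. The dimensions, the threshold, and the flagging rule itself are assumptions for illustration, since the patent does not spell out the exact correction rule:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

a, b = 4, 10   # a: LSTM output dimension; b: number of corpus characters (toy)
rng = np.random.default_rng(5)

second_output = rng.normal(size=a)     # forward output after the highway layer
W = rng.normal(size=(a, b))            # the a*b fully connected layer (random placeholder)
next_char_probs = softmax(second_output @ W)

# Hypothetical correction rule: if the actually observed next character was
# assigned very low probability, mark the current character's NER label as
# suspect so it can be revised (the exact rule is left open by the patent).
observed_next_char = 7
THRESHOLD = 0.01
label_is_suspect = next_char_probs[observed_next_char] < THRESHOLD
print(next_char_probs.shape, bool(label_is_suspect))
```

The point of the sketch is the shape of the pipeline: a language-model-style next-character distribution supplies an extra signal against which the NER label can be checked.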
The named entity recognition device provided by the embodiment of the present invention can, combining multiple types of LSTM models, obtain a forward output result and a reverse output result from the character vectors and word vectors, and cascade the forward output result with the reverse output result to obtain a cascaded output result. Because the forward output result contains the contextual information of the preceding part of the sentence and the reverse output result contains the contextual information of the following part, complete contextual semantics can be obtained after the two are cascaded. The cascaded output result is then processed by the fully connected layer and the conditional random field (CRF) layer, so the named entity recognition labels obtained for the characters are more accurate, improving the overall effect of named entity recognition.
An embodiment of the present invention provides a computer apparatus comprising a processor, the processor being configured to implement, when executing a computer program stored in a memory, the steps of the named entity recognition method provided by each of the above method embodiments.

Illustratively, the computer program may be divided into one or more modules, the one or more modules being stored in the memory and executed by the processor to carry out the present invention. The one or more modules may be a series of computer program instruction segments capable of completing specific functions, the instruction segments being used to describe the execution process of the computer program in the computer apparatus. For example, the computer program may be divided according to the steps of the named entity recognition method provided by each of the above method embodiments.
Those skilled in the art will understand that the above description of the computer apparatus is merely an example and does not constitute a limitation on the computer apparatus, which may include more or fewer components than described above, combine certain components, or use different components; for example, it may also include input/output devices, network access devices, buses, and the like.
The processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the computer apparatus, connecting the various parts of the entire computer apparatus through various interfaces and lines.
The memory may be used to store the computer program and/or modules; the processor implements the various functions of the computer apparatus by running or executing the computer program and/or modules stored in the memory and by calling data stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and application programs required for at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created according to use of the device (such as audio data or a phone book). In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
If the integrated modules/units of the computer apparatus are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present invention may also be completed by instructing the relevant hardware through a computer program; the computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of each of the above named entity recognition method embodiments can be implemented. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, etc. The computer-readable medium may include: any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, a software distribution medium, and the like.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the invention; any modifications, equivalent replacements, improvements, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.
Claims (10)
1. A named entity recognition method, characterized in that the named entity recognition method comprises:

sequentially inputting, according to the forward character order of a sentence to be processed, the character vectors of the sentence to be processed into a first long short-term memory (LSTM) network model; in the segment from the beginning character of the sentence to be processed to the current input character of the first LSTM network model, obtaining first related words of the current input character; inputting the word vectors of the first related words into a second LSTM network model for processing to obtain first hidden outputs; performing a weighted summation of the first hidden outputs to obtain a first weighted sum value; inputting the first weighted sum value into the first LSTM network model; and obtaining, through the first LSTM network model, a forward output result according to the first weighted sum value and the character vector of the current input character;

sequentially inputting, according to the reverse character order of the sentence to be processed, the character vectors of the sentence to be processed into a reverse first LSTM network model; in the segment from the final character of the sentence to be processed to the current input character of the reverse first LSTM network model, obtaining second related words of the current input character; inputting the word vectors of the second related words into the second LSTM network model for processing to obtain second hidden outputs; performing a weighted summation of the second hidden outputs to obtain a second weighted sum value; inputting the second weighted sum value into the reverse first LSTM network model; and obtaining, through the reverse first LSTM network model, a reverse output result according to the second weighted sum value and the character vector of the current input character;

processing the forward output result and the reverse output result respectively through a highway network layer (highway layer) to obtain a forward processing result and a reverse processing result respectively, and cascading the forward processing result and the reverse processing result to obtain a cascaded output result;

processing the cascaded output result through a fully connected layer and a conditional random field (CRF) layer to obtain named entity recognition labels of the characters.
2. The named entity recognition method according to claim 1, characterized in that performing a weighted summation of the first hidden outputs to obtain the first weighted sum value comprises the following process:

inputting the first hidden outputs into a softmax function model, and calculating the probabilities corresponding to the first hidden outputs through the softmax function model;

taking the probabilities corresponding to the first hidden outputs as the weights of the first hidden outputs, and performing a weighted summation of the first hidden outputs according to those weights to obtain the first weighted sum result.
3. The named entity recognition method according to claim 1, characterized in that processing the cascaded output result through the fully connected layer and the CRF layer to obtain the named entity recognition labels of the characters comprises the following process:

inputting the cascaded output result into a 2a*c fully connected layer to obtain first output information, where a is the dimension of the output of the first LSTM network model and c is the number of labels for named entity recognition;

inputting the first output information into the CRF layer for processing to obtain the named entity recognition labels of the characters.
4. The named entity recognition method according to any one of claims 1 to 3, characterized in that, after processing the cascaded output result through the fully connected layer and the conditional random field (CRF) layer to obtain the named entity recognition labels of the characters, the named entity recognition method comprises the following process:

inputting the forward output result into a highway network layer to obtain second output information;

inputting the second output information into an a*b fully connected layer to obtain the prediction probability of the next character after the current character, where a is the dimension of the output of the first LSTM network model and b is the number of all characters in the corpus;

correcting the named entity recognition label of the current character according to the prediction probability of the next character after the current character.
5. A named entity recognition device, characterized in that the named entity recognition device comprises:

a forward result processing module, configured to sequentially input, according to the forward character order of a sentence to be processed, the character vectors of the sentence to be processed into a first long short-term memory (LSTM) network model; in the segment from the beginning character of the sentence to be processed to the current input character of the first LSTM network model, obtain first related words of the current input character; input the word vectors of the first related words into a second LSTM network model for processing to obtain first hidden outputs; perform a weighted summation of the first hidden outputs to obtain a first weighted sum value; input the first weighted sum value into the first LSTM network model; and obtain, through the first LSTM network model, a forward output result according to the first weighted sum value and the character vector of the current input character;

a reverse result processing module, configured to sequentially input, according to the reverse character order of the sentence to be processed, the character vectors of the sentence to be processed into a reverse first LSTM network model; in the segment from the final character of the sentence to be processed to the current input character of the reverse first LSTM network model, obtain second related words of the current input character; input the word vectors of the second related words into the second LSTM network model for processing to obtain second hidden outputs; perform a weighted summation of the second hidden outputs to obtain a second weighted sum value; input the second weighted sum value into the reverse first LSTM network model; and obtain, through the reverse first LSTM network model, a reverse output result according to the second weighted sum value and the character vector of the current input character;

a cascade module, configured to process the forward output result and the reverse output result respectively through a highway network layer (highway layer) to obtain a forward processing result and a reverse processing result respectively, and cascade the forward processing result and the reverse processing result to obtain a cascaded output result;

a named entity processing module, configured to process the cascaded output result through a fully connected layer and a conditional random field (CRF) layer to obtain named entity recognition labels of the characters.
6. The named entity recognition device according to claim 5, characterized in that the forward result processing module comprises:

a computation submodule, configured to input the first hidden outputs into a softmax function model, and calculate the probabilities corresponding to the first hidden outputs through the softmax function model;

a processing submodule, configured to take the probabilities corresponding to the first hidden outputs as the weights of the first hidden outputs, and perform a weighted summation of the first hidden outputs according to those weights to obtain the first weighted sum result.
7. The named entity recognition device according to claim 5, characterized in that the named entity processing module comprises:

a first input submodule, configured to input the cascaded output result into a 2a*c fully connected layer to obtain first output information, where a is the dimension of the output of the first LSTM network model and c is the number of labels for named entity recognition;

a second input submodule, configured to input the first output information into the CRF layer for processing to obtain the named entity recognition labels of the characters.
8. The named entity recognition device according to any one of claims 5 to 7, characterized in that the named entity recognition device further comprises:

a first input module, configured to input the forward output result into a highway network layer to obtain second output information;

a second input module, configured to input the second output information into an a*b fully connected layer to obtain the prediction probability of the next character after the current character, where a is the dimension of the output of the first LSTM network model and b is the number of all characters in the corpus;

a correction module, configured to correct the named entity recognition label of the current character according to the prediction probability of the next character after the current character.
9. A computer apparatus, characterized in that the computer apparatus comprises a processor, the processor being configured to implement, when executing a computer program stored in a memory, the steps of the named entity recognition method according to any one of claims 1 to 4.
10. A computer-readable storage medium on which a computer program is stored, characterized in that: when the computer program is executed by a processor, the steps of the named entity recognition method according to any one of claims 1 to 4 are implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910597499.1A CN110516228A (en) | 2019-07-04 | 2019-07-04 | Name entity recognition method, device, computer installation and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910597499.1A CN110516228A (en) | 2019-07-04 | 2019-07-04 | Name entity recognition method, device, computer installation and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110516228A true CN110516228A (en) | 2019-11-29 |
Family
ID=68623642
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910597499.1A Withdrawn CN110516228A (en) | 2019-07-04 | 2019-07-04 | Name entity recognition method, device, computer installation and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110516228A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111191459A (en) * | 2019-12-25 | 2020-05-22 | 医渡云(北京)技术有限公司 | Text processing method and device, readable medium and electronic equipment |
CN111460820A (en) * | 2020-03-06 | 2020-07-28 | 中国科学院信息工程研究所 | Network space security domain named entity recognition method and device based on pre-training model BERT |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104615589A (en) * | 2015-02-15 | 2015-05-13 | 百度在线网络技术(北京)有限公司 | Named-entity recognition model training method and named-entity recognition method and device |
CN107797992A (en) * | 2017-11-10 | 2018-03-13 | 北京百分点信息科技有限公司 | Name entity recognition method and device |
CN109034264A (en) * | 2018-08-15 | 2018-12-18 | 云南大学 | Traffic accident seriousness predicts CSP-CNN model and its modeling method |
CN109388807A (en) * | 2018-10-30 | 2019-02-26 | 中山大学 | The method, apparatus and storage medium of electronic health record name Entity recognition |
CN109492227A (en) * | 2018-11-16 | 2019-03-19 | 大连理工大学 | It is a kind of that understanding method is read based on the machine of bull attention mechanism and Dynamic iterations |
CN109741732A (en) * | 2018-08-30 | 2019-05-10 | 京东方科技集团股份有限公司 | Name entity recognition method, name entity recognition device, equipment and medium |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104615589A (en) * | 2015-02-15 | 2015-05-13 | 百度在线网络技术(北京)有限公司 | Named-entity recognition model training method and named-entity recognition method and device |
CN107797992A (en) * | 2017-11-10 | 2018-03-13 | 北京百分点信息科技有限公司 | Name entity recognition method and device |
CN109034264A (en) * | 2018-08-15 | 2018-12-18 | 云南大学 | Traffic accident seriousness predicts CSP-CNN model and its modeling method |
CN109741732A (en) * | 2018-08-30 | 2019-05-10 | 京东方科技集团股份有限公司 | Name entity recognition method, name entity recognition device, equipment and medium |
CN109388807A (en) * | 2018-10-30 | 2019-02-26 | 中山大学 | The method, apparatus and storage medium of electronic health record name Entity recognition |
CN109492227A (en) * | 2018-11-16 | 2019-03-19 | 大连理工大学 | It is a kind of that understanding method is read based on the machine of bull attention mechanism and Dynamic iterations |
Non-Patent Citations (4)
Title |
---|
BAE JANGSEONG et al.: "Korean Semantic Role Labeling with Highway BiLSTM-CRFs", Korean Society for Language and Information: Conference Proceedings *
ZHAO DONGYANG 等: "A Joint Decoding Algorithm for Named Entity Recognition", 《2018 IEEE THIRD INTERNATIONAL CONFERENCE ON DATA SCIENCE IN CYBERSPACE (DSC)》 * |
ZHAO DONGYANG 等: "Chinese Name Entity Recognition Using Highway-LSTM-CRF", 《PROCEEDINGS OF THE 2018 INTERNATIONAL CONFERENCE ON ALGORITHMS, COMPUTING AND ARTIFICIAL INTELLIGENCE》 * |
ZHAO DONGYANG: "Named Entity Recognition Based on BiRHN and CRF", 《INTERNATIONAL CONFERENCE ON GREEN, PERVASIVE, AND CLOUD COMPUTING》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111191459A (en) * | 2019-12-25 | 2020-05-22 | 医渡云(北京)技术有限公司 | Text processing method and device, readable medium and electronic equipment |
CN111191459B (en) * | 2019-12-25 | 2023-12-12 | 医渡云(北京)技术有限公司 | Text processing method and device, readable medium and electronic equipment |
CN111460820A (en) * | 2020-03-06 | 2020-07-28 | 中国科学院信息工程研究所 | Network space security domain named entity recognition method and device based on pre-training model BERT |
CN111460820B (en) * | 2020-03-06 | 2022-06-17 | 中国科学院信息工程研究所 | Network space security domain named entity recognition method and device based on pre-training model BERT |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106202010B (en) | Method and apparatus based on deep neural network building Law Text syntax tree | |
CN110162749A (en) | Information extracting method, device, computer equipment and computer readable storage medium | |
CN109657226B (en) | Multi-linkage attention reading understanding model, system and method | |
CN108628823A (en) | In conjunction with the name entity recognition method of attention mechanism and multitask coordinated training | |
CN108416384A (en) | A kind of image tag mask method, system, equipment and readable storage medium storing program for executing | |
CN111753081A (en) | Text classification system and method based on deep SKIP-GRAM network | |
CN106294684A (en) | The file classification method of term vector and terminal unit | |
CN108959482A (en) | Single-wheel dialogue data classification method, device and electronic equipment based on deep learning | |
CN106326346A (en) | Text classification method and terminal device | |
CN113220876B (en) | Multi-label classification method and system for English text | |
CN110188195B (en) | Text intention recognition method, device and equipment based on deep learning | |
CN110222184A (en) | A kind of emotion information recognition methods of text and relevant apparatus | |
CN110188175A (en) | A kind of question and answer based on BiLSTM-CRF model are to abstracting method, system and storage medium | |
CN108763556A (en) | Usage mining method and device based on demand word | |
CN112800239B (en) | Training method of intention recognition model, and intention recognition method and device | |
CN111723569A (en) | Event extraction method and device and computer readable storage medium | |
CN108664512B (en) | Text object classification method and device | |
CN111460830B (en) | Method and system for extracting economic events in judicial texts | |
CN112836502B (en) | Financial field event implicit causal relation extraction method | |
CN112420191A (en) | Traditional Chinese medicine auxiliary decision making system and method | |
CN107357785A (en) | Theme feature word abstracting method and system, feeling polarities determination methods and system | |
CN111222318A (en) | Trigger word recognition method based on two-channel bidirectional LSTM-CRF network | |
CN113268561B (en) | Problem generation method based on multi-task joint training | |
CN112686049A (en) | Text auditing method, device, equipment and storage medium | |
CN110516228A (en) | Name entity recognition method, device, computer installation and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication |
Application publication date: 20191129 |