CN110516228A - Named entity recognition method and device, computer device, and computer-readable storage medium - Google Patents

Named entity recognition method and device, computer device, and computer-readable storage medium

Info

Publication number
CN110516228A
Authority
CN
China
Prior art keywords
character
output
result
network model
entity recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910597499.1A
Other languages
Chinese (zh)
Inventor
周忠诚
赵东阳
段炼
郭建京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Xinghan Shuzhi Technology Co Ltd
Original Assignee
Hunan Xinghan Shuzhi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Xinghan Shuzhi Technology Co Ltd filed Critical Hunan Xinghan Shuzhi Technology Co Ltd
Priority to CN201910597499.1A priority Critical patent/CN110516228A/en
Publication of CN110516228A publication Critical patent/CN110516228A/en
Withdrawn legal-status Critical Current


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/049 - Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Character Discrimination (AREA)

Abstract

The present invention is applicable to the field of Internet technology and provides a named entity recognition method and device, a computer device, and a computer-readable storage medium. The method comprises: obtaining a forward output result from a first weighted sum value and a character vector by means of a first LSTM network model; obtaining a backward output result from a second weighted sum value and a character vector by means of a reversed first LSTM network model; processing the forward output result and the backward output result separately through a highway network layer (highway layer) to obtain a forward processing result and a backward processing result respectively; concatenating the forward processing result and the backward processing result to obtain a concatenated output result; and processing the concatenated output result through a fully connected layer and a CRF layer to obtain the named entity recognition label of each character. The named entity recognition method provided by the present invention can improve the accuracy of named entity recognition and thereby improve the overall effect of named entity recognition.

Description

Named entity recognition method and device, computer device, and computer-readable storage medium
Technical field
The present invention belongs to the field of Internet technology, and in particular relates to a named entity recognition method and device, a computer device, and a computer-readable storage medium.
Background technique
Existing natural language processing techniques generally adopt the bidirectional long short-term memory-conditional random field architecture (Bi-directional Long Short-Term Memory-Conditional Random Field, BiLSTM-CRF), which works well on languages such as English. Applying this architecture to Chinese, however, requires word segmentation first, and segmentation errors propagate into the named entity recognition results, so its recognition performance on Chinese is unsatisfactory. Yue Zhang et al. proposed the Lattice LSTM structure, which can exploit word information effectively, but the Lattice LSTM is not jointly trained with other semi-supervised techniques such as language models, and it is affected by word embeddings pre-trained on large automatically segmented corpora, so the accuracy of its named entity recognition remains relatively low.
Summary of the invention
Embodiments of the present invention provide a named entity recognition method and device, a computer device, and a computer-readable storage medium, intended to solve the problem that the accuracy of named entity recognition in the prior art is relatively low.
The invention is realized as a named entity recognition method comprising the following process:
inputting, in the forward character order of a sentence to be processed, the character vectors of the sentence in sequence into a first long short-term memory (LSTM) network model; within the segment from the beginning character of the sentence to the character currently being input into the first LSTM network model, obtaining the first related words of the current input character; inputting the word vectors of the first related words into a second LSTM network model for processing to obtain first hidden outputs; weighting and summing the first hidden outputs to obtain a first weighted sum value; and inputting the first weighted sum value into the first LSTM network model, which obtains a forward output result from the first weighted sum value and the character vector of the current input character;
inputting, in the reverse character order of the sentence to be processed, the character vectors of the sentence in sequence into a reversed first LSTM network model; within the segment from the last character of the sentence to the character currently being input into the reversed first LSTM network model, obtaining the second related words of the current input character; inputting the word vectors of the second related words into the second LSTM network model for processing to obtain second hidden outputs; weighting and summing the second hidden outputs to obtain a second weighted sum value; and inputting the second weighted sum value into the reversed first LSTM network model, which obtains a backward output result from the second weighted sum value and the character vector of the current input character;
processing the forward output result and the backward output result separately through a highway network layer (highway layer) to obtain a forward processing result and a backward processing result respectively, and concatenating the forward processing result and the backward processing result to obtain a concatenated output result;
processing the concatenated output result through a fully connected layer and a conditional random field (CRF) layer to obtain the named entity recognition label of each character.
Further, the weighting and summing of the first hidden outputs to obtain the first weighted sum value includes the following process:
inputting the first hidden outputs into a softmax function model, and computing the probability corresponding to each first hidden output through the softmax function model;
taking the probability corresponding to each first hidden output as the weight of that hidden output, and performing a weighted summation over the first hidden outputs with these weights to obtain the first weighted sum result.
Further, the processing of the concatenated output result through the fully connected layer and the CRF layer to obtain the named entity recognition label of each character includes the following process:
inputting the concatenated output result into a 2a*c fully connected layer to obtain first output information, where a is the dimension of the output of the first LSTM network model and c is the number of named entity recognition labels;
inputting the first output information into the CRF layer for processing to obtain the named entity recognition label of each character.
Further, after the concatenated output result is processed through the fully connected layer and the conditional random field CRF layer and the named entity recognition label of each character is obtained, the named entity recognition method includes the following process:
inputting the forward output result into a highway network layer to obtain second output information;
inputting the second output information into an a*b fully connected layer to obtain the prediction probability of the next character after the current character, where a is the dimension of the output of the first LSTM network model and b is the number of distinct characters in the corpus;
correcting the named entity recognition label of the current character according to the prediction probability of its next character.
The present invention also provides a named entity recognition device, comprising:
a forward result processing module, configured to input, in the forward character order of a sentence to be processed, the character vectors of the sentence in sequence into a first long short-term memory (LSTM) network model; within the segment from the beginning character of the sentence to the character currently being input into the first LSTM network model, obtain the first related words of the current input character; input the word vectors of the first related words into a second LSTM network model for processing to obtain first hidden outputs; weight and sum the first hidden outputs to obtain a first weighted sum value; and input the first weighted sum value into the first LSTM network model, which obtains a forward output result from the first weighted sum value and the character vector of the current input character;
a backward result processing module, configured to input, in the reverse character order of the sentence to be processed, the character vectors of the sentence in sequence into a reversed first LSTM network model; within the segment from the last character of the sentence to the character currently being input into the reversed first LSTM network model, obtain the second related words of the current input character; input the word vectors of the second related words into the second LSTM network model for processing to obtain second hidden outputs; weight and sum the second hidden outputs to obtain a second weighted sum value; and input the second weighted sum value into the reversed first LSTM network model, which obtains a backward output result from the second weighted sum value and the character vector of the current input character;
a concatenation module, configured to process the forward output result and the backward output result separately through a highway network layer (highway layer) to obtain a forward processing result and a backward processing result respectively, and to concatenate the forward processing result and the backward processing result to obtain a concatenated output result;
a named entity processing module, configured to process the concatenated output result through a fully connected layer and a conditional random field (CRF) layer to obtain the named entity recognition label of each character.
Further, the forward result processing module includes:
a computation submodule, configured to input the first hidden outputs into a softmax function model and compute the probability corresponding to each first hidden output through the softmax function model;
a processing submodule, configured to take the probability corresponding to each first hidden output as the weight of that hidden output and to perform a weighted summation over the first hidden outputs with these weights, obtaining the first weighted sum result.
Further, the named entity processing module includes:
a first input submodule, configured to input the concatenated output result into a 2a*c fully connected layer to obtain first output information, where a is the dimension of the output of the first LSTM network model and c is the number of named entity recognition labels;
a second input submodule, configured to input the first output information into the CRF layer for processing, obtaining the named entity recognition label of each character.
Further, the named entity recognition device further includes:
a first input module, configured to input the forward output result into a highway network layer to obtain second output information;
a second input module, configured to input the second output information into an a*b fully connected layer to obtain the prediction probability of the next character after the current character, where a is the dimension of the output of the first LSTM network model and b is the number of distinct characters in the corpus;
a correction module, configured to correct the named entity recognition label of the current character according to the prediction probability of its next character.
The present invention also provides a computer device comprising a processor, the processor being configured to execute a computer program stored in a memory to implement the steps of the named entity recognition method described above.
The present invention also provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the steps of the named entity recognition method described above when executed by a processor.
The named entity recognition method provided by the present invention combines multiple types of LSTM models to obtain a forward output result and a backward output result from character vectors and word vectors, and concatenates the forward and backward output results to obtain a concatenated output result. Since the forward output result contains the contextual information of the preceding clause and the backward output result contains the contextual information of the following clause, concatenating them yields the complete contextual semantics. Processing the concatenated output result through the fully connected layer and the CRF layer therefore produces more accurate named entity recognition labels for the characters, improving the overall effect of named entity recognition.
Description of the drawings
Fig. 1 is an implementation flowchart of the named entity recognition method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the information flow provided by an embodiment of the present invention;
Fig. 3 is an implementation flowchart, provided by an embodiment of the present invention, of weighting and summing the first hidden outputs described in step S101 of Fig. 1 to obtain the first weighted sum value;
Fig. 4 is an implementation flowchart of step S104 of Fig. 1 provided by an embodiment of the present invention;
Fig. 5 is another flowchart of the named entity recognition method after step S103 of Fig. 1 provided by an embodiment of the present invention;
Fig. 6 is a structural schematic diagram of a named entity recognition device provided by an embodiment of the present invention;
Fig. 7 is a structural schematic diagram of the forward result processing module provided by an embodiment of the present invention;
Fig. 8 is a structural schematic diagram of the named entity processing module provided by an embodiment of the present invention;
Fig. 9 is a structural schematic diagram of another named entity recognition device provided by an embodiment of the present invention.
Specific embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present invention and not to limit it.
Fig. 1 shows the flowchart of the named entity recognition method provided by an embodiment of the present invention. The named entity recognition method includes the following process:
Step S101: in the forward character order of a sentence to be processed, input the character vectors of the sentence in sequence into a first long short-term memory (LSTM) network model; within the segment from the beginning character of the sentence to the character currently being input into the first LSTM network model, obtain the first related words of the current input character; input the word vectors of the first related words into a second LSTM network model for processing to obtain first hidden outputs; weight and sum the first hidden outputs to obtain a first weighted sum value; input the first weighted sum value into the first LSTM network model, which obtains a forward output result from the first weighted sum value and the character vector of the current input character.
In this embodiment, the corpus is divided in advance into sentences, and each sentence is processed separately. Before processing, all characters and words occurring in the corpus are trained with methods such as word2vec to obtain a pre-trained character embedding dictionary and a pre-trained word embedding dictionary. Each dictionary maps each character or word to its corresponding dense vector. That is, word vectors represent a sentence segmented into words, vectorized word by word, while character vectors vectorize the sentence character by character, as the following sketch illustrates:
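A rough illustration of the two embedding dictionaries follows; the sentence, the vector dimension, and the dictionary contents are illustrative assumptions, not values fixed by the patent:

```python
# A minimal sketch of the pre-trained character and word embedding dictionaries
# described above. All concrete values here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
dim = 50  # assumed embedding dimension

# Character embedding dictionary: one dense vector per character.
char_embed = {ch: rng.normal(size=dim) for ch in "长沙市橘子洲广场"}

# Word embedding dictionary: one dense vector per lexicon word.
word_embed = {w: rng.normal(size=dim)
              for w in ["长沙", "长沙市", "橘子洲", "广场", "橘子洲广场"]}

sentence = "长沙市橘子洲广场"
char_vectors = [char_embed[ch] for ch in sentence]  # character-by-character
word_vector = word_embed["橘子洲广场"]               # one vector for a whole word
print(len(char_vectors), word_vector.shape)          # 8 (50,)
```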
In this embodiment, the character vectors of the characters in the sentence to be processed can be determined from the pre-obtained character embedding dictionary; within the segment from the beginning character of the sentence to the character currently being input into the first LSTM network model, the first related words of the current input character are obtained, and their word vectors are determined from the pre-obtained word embedding dictionary.
In this embodiment, the first long short-term memory (LSTM) network model is a long short-term memory network model, a kind of recurrent neural network model over time. The first LSTM network model is the LSTM network model that processes character vectors, and the second LSTM network model is the LSTM network model that processes word vectors. Since word vectors are input into the second LSTM network model, the first hidden outputs contain lexical information.
It should be noted that inputting the character vectors into the first LSTM network model in the forward character order of the sentence means feeding the character sequence of the sentence from front to back. For example, for the sentence "我喜欢长沙" ("I like Changsha"), the forward character order is '我', '喜', '欢', '长', '沙'.
To explain further, a first related word is a word within the segment from the beginning of the sentence to the current character that ends with the current character. First related words may or may not exist; when they do exist, there may be one or more of them. For example, in the segment "长沙市橘子洲广场" ("Orange Isle Square, Changsha City"), the words ending with the character '场' are "橘子洲广场" ("Orange Isle Square") and "广场" ("square"), so the first related words of the character '场' are "橘子洲广场" and "广场".
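A minimal sketch of this lookup, assuming a small lexicon (the lexicon contents are an illustrative assumption):

```python
# Collect the "first related words" of the current input character: every
# lexicon word that ends exactly at the current position within the segment
# from the start of the sentence.
lexicon = {"长沙", "长沙市", "橘子洲", "广场", "橘子洲广场"}

def related_words(sentence: str, cur: int) -> list[str]:
    """Return lexicon words ending at character index `cur` (inclusive)."""
    return [sentence[b:cur + 1]
            for b in range(cur + 1)
            if sentence[b:cur + 1] in lexicon]

sentence = "长沙市橘子洲广场"
print(related_words(sentence, 7))  # for '场': ['橘子洲广场', '广场']
print(related_words(sentence, 2))  # for '市': ['长沙市']
```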
Specifically, the computation differs depending on whether the current input character has first related words. When the current input character has no first related word, the first LSTM network model can be constituted by formulas (1), (2), and (3):
Formula (1):

$$\begin{bmatrix} i_t^c \\ o_t^c \\ f_t^c \\ \tilde{c}_t^c \end{bmatrix} = \begin{bmatrix} \sigma \\ \sigma \\ \sigma \\ \tanh \end{bmatrix} \left( W^{cT} \begin{bmatrix} x_t^c \\ h_{t-1}^c \end{bmatrix} + b^c \right)$$

where $i_t^c$ denotes the input gate, $o_t^c$ the output gate, $f_t^c$ the forget gate, and $\tilde{c}_t^c$ the state information updated at the current time step; $\sigma$ denotes the Sigmoid function, often used as a threshold function of neural networks that maps a variable into the interval (0, 1), and $\tanh$ is also an activation function (both are described in standard texts); $W^{cT}$ denotes the weight matrix, a training parameter; $x_t^c$ denotes the character vector; $h_{t-1}^c$ denotes the output of the previous time step; and $b^c$ denotes the bias vector, a model training parameter.
Formula (2):

$$c_t^c = f_t^c \odot c_{t-1}^c + i_t^c \odot \tilde{c}_t^c$$

where $c_t^c$ denotes the current new state information, $f_t^c$ the forget gate, $c_{t-1}^c$ the state information of the previous time step, $i_t^c$ the input gate, and $\tilde{c}_t^c$ the state information updated at the current time step.
Formula (3):

$$h_t^c = o_t^c \odot \tanh(c_t^c)$$

where $h_t^c$ denotes the current hidden-layer output, $o_t^c$ the output gate, and $\tanh(c_t^c)$ the current new state information passed through the activation function.
When the current input character has no first related word, the computation proceeds from formula (1) through formula (2) to formula (3), and the output computed by formula (3) serves as the output of the first LSTM network model.
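The following NumPy rendering shows one step of this character-level cell as given by formulas (1)-(3); shapes and initialization are illustrative assumptions:

```python
# One step of the character LSTM cell in formulas (1)-(3).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def char_lstm_step(x_t, h_prev, c_prev, W, b):
    """Formulas (1)-(3). W has shape (4*d, d_x + d); b has shape (4*d,)."""
    d = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b  # formula (1), pre-activation
    i = sigmoid(z[0*d:1*d])                    # input gate
    o = sigmoid(z[1*d:2*d])                    # output gate
    f = sigmoid(z[2*d:3*d])                    # forget gate
    c_tilde = np.tanh(z[3*d:4*d])              # candidate state
    c = f * c_prev + i * c_tilde               # formula (2)
    h = o * np.tanh(c)                         # formula (3)
    return h, c

rng = np.random.default_rng(0)
d_x, d = 50, 100                               # assumed dimensions
h, c = char_lstm_step(rng.normal(size=d_x), np.zeros(d), np.zeros(d),
                      rng.normal(size=(4*d, d_x + d)) * 0.01, np.zeros(4*d))
print(h.shape, c.shape)  # (100,) (100,)
```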
When the current input character has first related words, formulas (1) and (3) above can serve as the first LSTM network model, and formulas (4)-(10) can form the second LSTM network model. Formulas (4)-(10) are as follows:
Formula (4):

$$\begin{bmatrix} i_{b,e}^w \\ f_{b,e}^w \\ \tilde{c}_{b,e}^w \end{bmatrix} = \begin{bmatrix} \sigma \\ \sigma \\ \tanh \end{bmatrix} \left( W^{wT} \begin{bmatrix} x_{b,e}^w \\ h_b^c \end{bmatrix} + b^w \right)$$

where $i_{b,e}^w$ denotes the input gate, $f_{b,e}^w$ the forget gate, and $\tilde{c}_{b,e}^w$ the state information updated at the current time step; $\sigma$ denotes the Sigmoid function, an activation function often used as a threshold function of neural networks that maps a variable into the interval (0, 1), and $\tanh$ is also an activation function; $x_{b,e}^w$ is the word vector of the word spanning character positions $b$ to $e$, and $h_b^c$ is the hidden output at the word's first character; $W^{wT}$ denotes the weight matrix, a training parameter, and $b^w$ denotes the bias vector, a model training parameter.
Formula (5):

$$c_{b,e}^w = f_{b,e}^w \odot c_b^c + i_{b,e}^w \odot \tilde{c}_{b,e}^w$$

where $c_{b,e}^w$ denotes the current new state information, $f_{b,e}^w$ the forget gate, $c_b^c$ the new state information from formula (2) when the character sequence is processed at time step $b$, $i_{b,e}^w$ the input gate, and $\tilde{c}_{b,e}^w$ the state information updated at the current time step.
Formula (6):

$$i_{b,e}^c = \sigma\!\left( W^{hT} \begin{bmatrix} x_e^c \\ c_{b,e}^w \end{bmatrix} + b^h \right)$$

where $i_{b,e}^c$ denotes an additional input gate obtained from a non-linear conversion layer, i.e., a fully connected layer with an activation function; $W^{hT}$ denotes the weight matrix; $c_{b,e}^w$ denotes the current new state information; $x_e^c$ denotes the character vector; and $b^h$ denotes the bias vector.
Formula (7):

$$c_e^c = \sum_{b' \in B} \alpha_{b',e}^c \odot c_{b',e}^w + \alpha_e^c \odot \tilde{c}_e^c, \qquad B = \{ b' \mid w_{b',e} \in D \}$$

where $c_e^c$ denotes the weighted sum of the output of the LSTM network model fed with the character vector and the outputs of the LSTM network model fed with the related word vectors; $B$ denotes the set of all $b'$ such that the span of the current sentence beginning at $b'$ and ending at the current character $e$ forms a word $w_{b',e}$ in the dictionary $D$; $c_{b',e}^w$ denotes the word state information beginning at $b'$ and ending at $e$; $\alpha_{b',e}^c$ denotes the weight of the word vector path; $\alpha_e^c$ denotes the weight of the character vector path; and $\tilde{c}_e^c$ denotes the candidate state of the character from formula (1).
Formula (8):

$$\alpha_{b,e}^c = \frac{\exp\!\big(i_{b,e}^c\big)}{\exp\!\big(i_e^c\big) + \sum_{b'' \in B''} \exp\!\big(i_{b'',e}^c\big)}, \qquad B'' = \{ b'' \mid w_{b'',e} \in D \}$$

where $\alpha_{b,e}^c$ denotes the weight of the word vector path; $i_{b,e}^c$ denotes the input gate of the word vector path; $b''$ ranges over $B''$, the set of all starting positions of words in the current sentence that end at $e$ and appear in the dictionary $D$; and $i_e^c$ denotes the input gate of the character vector.
Formula (9):

$$\alpha_e^c = \frac{\exp\!\big(i_e^c\big)}{\exp\!\big(i_e^c\big) + \sum_{b'' \in B''} \exp\!\big(i_{b'',e}^c\big)}$$

where $\alpha_e^c$ denotes the weight of the character vector path; $i_e^c$ denotes the input gate of the character vector; and $b''$ ranges over $B''$, the set of all starting positions of words in the current sentence that end at $e$ and appear in the dictionary $D$.
Formula (10):

$$t = \sigma\big( W^{tT} y + b^t \big)$$

where $t$ denotes the model output, $\sigma$ denotes the activation function, $y$ denotes the input to this layer, $W^{tT}$ denotes the weight vector to be trained, and $b^t$ denotes the bias vector to be trained.
When first related words exist, the first weighted sum value over the first related words is computed by the second LSTM network model formed by formulas (4)-(10). The forward output result is then obtained by passing the first weighted sum value and the character vector through the first LSTM network model composed of formulas (1) and (3).
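The merge step can be sketched as follows, with the gating of formulas (7)-(9) applied element-wise over the word cells and the character candidate state; all sizes and values are illustrative assumptions:

```python
# Fold word-level cell states into the character cell (formulas (7)-(9)).
import numpy as np

d = 4
rng = np.random.default_rng(1)
word_cells = rng.normal(size=(2, d))  # e.g. cells for "橘子洲广场" and "广场"
word_gates = rng.normal(size=(2, d))  # link gates, one vector per word
char_gate = rng.normal(size=d)        # character input gate
char_cand = rng.normal(size=d)        # candidate state from formula (1)

stacked = np.vstack([word_gates, char_gate[None]])       # (3, d)
e = np.exp(stacked - stacked.max(axis=0))
w = e / e.sum(axis=0)                                    # formulas (8)-(9)
c = (w[:2] * word_cells).sum(axis=0) + w[2] * char_cand  # formula (7)
print(c)  # merged cell state; formula (3) then yields the forward output
```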
Referring to Fig. 2, the sentence to be processed in Fig. 2 is "长沙市橘子洲广场" ("Orange Isle Square, Changsha City"). The character vectors of the characters '长', '沙', '市', '橘', '子', '洲', '广', '场' can each be obtained from the character embedding dictionary. When the current input character is '市', the first related word is "长沙市"; when the current input character is '场', the first related words are "橘子洲广场" and "广场", and the word vectors of "长沙市", "橘子洲广场", and "广场" can be obtained from the word embedding dictionary. For example, when the current input character is '场', the word vectors of "橘子洲广场" and "广场" are input into the second LSTM network model for processing to obtain the first hidden outputs. The first hidden outputs are weighted and summed to obtain the first weighted sum value, which is input into the first LSTM network model. The first LSTM network model then obtains the forward output result of '场' from the first weighted sum value and the character vector of '场'. Since the first hidden outputs contain lexical information, the first weighted sum value also contains lexical information, so the forward output result contains the contextual information of the preceding clause.
Step S102: in the reverse character order of the sentence to be processed, input the character vectors of the sentence in sequence into a reversed first LSTM network model; within the segment from the last character of the sentence to the character currently being input into the reversed first LSTM network model, obtain the second related words of the current input character; input the word vectors of the second related words into the second LSTM network model for processing to obtain second hidden outputs; weight and sum the second hidden outputs to obtain a second weighted sum value; input the second weighted sum value into the reversed first LSTM network model, which obtains a backward output result from the second weighted sum value and the character vector of the current input character.
In this embodiment, the reversed first LSTM network model is also a long short-term memory network model, a kind of recurrent neural network model over time. The reversed first LSTM network model receives the character vectors in the reverse character order of the sentence to be processed. Its computation is similar to that of the first LSTM network model and, to avoid repetition, is not detailed here.
It should be noted that inputting the character vectors into the reversed first LSTM network model in the reverse character order of the sentence means feeding the character sequence of the sentence from back to front. For the sentence "我喜欢长沙" ("I like Changsha"), the forward input order into the first LSTM network model is '我', '喜', '欢', '长', '沙', while the reverse input order into the reversed first LSTM network model is '沙', '长', '欢', '喜', '我'. In the forward direction, when the first LSTM network model processes '欢', the word vector of "喜欢" ("like") passes through the second LSTM network model to obtain the corresponding hidden output.
In the reverse direction, when the reversed first LSTM network model processes '喜', the word vector of "喜欢" passes through the second LSTM network model to obtain the corresponding second hidden output. Referring to Fig. 2, when the current input character is '橘', the second related word is "橘子洲广场", and the word vector of "橘子洲广场" is input into the second LSTM network model for processing to obtain the second hidden output. The second hidden outputs are weighted and summed to obtain the second weighted sum value, which is input into the reversed first LSTM network model; the reversed first LSTM network model obtains the backward output result of '橘' from the second weighted sum value and the character vector of '橘'. Since the second hidden outputs contain lexical information, the second weighted sum value also contains lexical information, so the backward output result contains the contextual information of the following clause.
Step S103: process the forward output result and the backward output result separately through a highway network layer (highway layer) to obtain a forward processing result and a backward processing result respectively, and concatenate the forward processing result and the backward processing result to obtain a concatenated output result.
In this embodiment, the highway network layer (in English, highway layer) lets part of the information pass through: a vector keeps its dimension after passing through the highway layer, but each component of the vector is transformed differently according to the trained parameters, so only part of the information is retained. Processing the forward output result and the backward output result separately through the highway layer to obtain the forward processing result and the backward processing result includes the following process: processing the forward output result through the highway network layer used for named entity recognition to obtain the forward processing result, and processing the backward output result through the highway network layer used for named entity recognition to obtain the backward processing result.
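A minimal highway-layer sketch matching this description, under the usual transform-gate formulation (the exact parameterization is an assumption; the patent does not spell it out):

```python
# Highway layer: the output keeps the input's dimension, and a learned gate
# decides per component how much information passes through unchanged.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway(y, W_H, b_H, W_T, b_T):
    t = sigmoid(W_T @ y + b_T)    # transform gate: how much to rewrite
    g = np.tanh(W_H @ y + b_H)    # non-linear transform of the input
    return t * g + (1.0 - t) * y  # gated mix; same dimension as y

rng = np.random.default_rng(2)
d = 8
y = rng.normal(size=d)
out = highway(y, rng.normal(size=(d, d)), np.zeros(d),
              rng.normal(size=(d, d)), np.zeros(d))
print(out.shape)  # (8,) -- dimension unchanged
```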
In this embodiment, the forward output result and the backward output result can each pass through a highway network layer and then be concatenated. The concatenated output result contains the contextual information of both the preceding part and the following part, i.e., the contextual information of the whole sentence, and can therefore effectively represent the semantic features of the full text.
It should be noted that concatenation simply means splicing: the forward output result contains the contextual information of the preceding part and the backward output result contains the contextual information of the following part, so the concatenated output result contains the contextual information of the sentence.
To add further explanation, Fig. 2 includes a first LSTM network model 201, a concatenation model 202, a language model 203, a named entity recognition model 204, a first highway network layer 205, and a second highway network layer 206. The first LSTM network model 201 is connected to the first highway network layer 205 and the second highway network layer 206 respectively; the first highway network layer 205 is connected to the language model 203, the second highway network layer 206 is connected to the concatenation model 202, and the concatenation model 202 is connected to the named entity recognition model 204. The language model 203 can be removed without preventing the named entity recognition model 204 from performing the named entity recognition task; adding the language-model computation of the language model 203 simply makes the named entity recognition task perform better.
The named entity recognition model 204 includes a fully connected layer and a CRF layer. The fully connected layer, on the one hand, increases the fitting capacity of the model and, on the other hand, changes the dimension of the vectors: for example, a fully connected layer with 200 input neurons and 20 output neurons converts each 200-dimensional vector into a 20-dimensional vector, which is convenient for the subsequent CRF processing. The role of the CRF layer is to predict the label of each character in the sentence.
The processing of the named entity recognition task comprises data processing in the second highway network layer 206, the concatenation model 202, the fully connected layer, and the CRF layer; joint label prediction over the sentence is performed by the CRF layer, and the output of the CRF layer is the prediction result of named entity recognition.
It should be noted that the forward output result is processed through the second highway network layer 206 to obtain the forward processing result, the backward output result is processed through a highway network layer to obtain the backward processing result, and the forward and backward processing results are concatenated in the concatenation model 202. The concatenated result is then input into the fully connected layer and the CRF layer, and the prediction result of named entity recognition is obtained from the CRF layer.
Step S104: process the concatenated output result through the fully connected layer and the conditional random field (CRF) layer to obtain the named entity recognition label of each character.
Before the CRF layer, the data has been processed into an n*k tensor, where n is the number of characters in the sentence and k is the number of distinct labels. For each character, the goal is to select one of the k labels as its final annotation. Without a CRF layer, each character could be labeled independently: pass the k-dimensional vector of each character through a softmax function model and choose the label with the largest probability as that character's label. This, however, ignores the dependencies between labels and may produce unreasonable label sequences. The CRF layer computes the cost of transitioning between labels through a transition matrix and scores the labeling scheme of the entire sentence. The detailed steps by which the CRF annotates the NER label of each character are:
Assume the n*k tensor is P, the processing result of the layer before the CRF layer, where P_{ij} is the score of labeling the i-th character of the sentence with the j-th label. For a predicted sequence y = (y1, y2, ..., yn), its score is defined according to formula (11):
Formula (11):

$$s(X, y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}$$

where $s(X, y)$ denotes the score of the sentence $X$ being labeled with the sequence $y$, $A_{y_i, y_{i+1}}$ denotes the transition score from label $y_i$ to label $y_{i+1}$, and $P_{i, y_i}$ denotes the score of the $i$-th character being labeled $y_i$.
A softmax function model applied over all possible label sequences produces the probability of a tag sequence, which can be expressed by formula (12):
Formula (12):

$$p(y \mid X) = \frac{e^{s(X, y)}}{\sum_{\tilde{y} \in Y_X} e^{s(X, \tilde{y})}}$$

where $p(y \mid X)$ denotes the probability that the sentence $X$ is labeled with the sequence $y$, $Y_X$ denotes all possible label sequences, $e^{s(X,y)}$ denotes the exponentiation with base $e$ of the score of the current tag sequence, and $\sum_{\tilde{y} \in Y_X} e^{s(X,\tilde{y})}$ denotes the sum of the exponentiations with base $e$ of the scores of all tag sequences.
During training, the log-likelihood of the correct label sequence is maximized, as determined according to formula (13):
Formula (13):

$$\log\big(p(y \mid X)\big) = s(X, y) - \log\Big(\sum_{\tilde{y} \in Y_X} e^{s(X, \tilde{y})}\Big)$$

where $p(y \mid X)$ denotes the probability that the sentence $X$ is labeled with the sequence $y$, $Y_X$ denotes all possible label sequences, $s(X, y)$ denotes the score of the current tag sequence, and $\sum_{\tilde{y} \in Y_X} e^{s(X,\tilde{y})}$ denotes the sum of the exponentiations with base $e$ of the scores of all tag sequences.
In the training or prediction phase, the output sequence with the largest score is taken as the named entity recognition labels.
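The sketch below implements the sequence score of formula (11) and max-score decoding (Viterbi); boundary start/end transitions are omitted for brevity, and all sizes are illustrative assumptions:

```python
# CRF scoring and decoding over emission scores P (n*k) and transitions A (k*k).
import numpy as np

def sequence_score(P, A, y):
    """Formula (11) without start/end transitions: emissions plus transitions."""
    s = P[np.arange(len(y)), y].sum()
    s += sum(A[y[i], y[i + 1]] for i in range(len(y) - 1))
    return s

def viterbi(P, A):
    """Return the label sequence with the largest score."""
    n, k = P.shape
    score = P[0].copy()
    back = np.zeros((n, k), dtype=int)
    for i in range(1, n):
        cand = score[:, None] + A + P[i][None, :]  # (k, k): prev -> cur
        back[i] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    y = [int(score.argmax())]
    for i in range(n - 1, 0, -1):
        y.append(int(back[i][y[-1]]))
    return y[::-1]

rng = np.random.default_rng(4)
P = rng.normal(size=(5, 3))  # 5 characters, 3 labels
A = rng.normal(size=(3, 3))  # label-to-label transition scores
best = viterbi(P, A)
print(best, sequence_score(P, A, np.array(best)))
```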
The named entity recognition method provided by the present invention obtains a forward output result and a backward output result from character vectors and word vectors, and concatenates them to obtain a concatenated output result. Since the forward output result contains the contextual information of the preceding clause and the backward output result contains the contextual information of the following clause, concatenating the forward and backward output results yields the complete contextual semantics. Processing the concatenated output result through the fully connected layer and the conditional random field CRF layer therefore produces more accurate named entity recognition labels for the characters, improving the overall effect of named entity recognition.
Referring to Fig. 3, the weighting and summing of the first hidden outputs in step S101 to obtain the first weighted sum value includes the following process:
Step S1011: input the first hidden outputs into a softmax function model, and compute the probability corresponding to each first hidden output through the softmax function model.
Step S1012: take the probability corresponding to each first hidden output as the weight of that hidden output, and perform a weighted summation over the first hidden outputs with these weights to obtain the first weighted sum result.
In this embodiment, the normalizing softmax function model converts one sequence of numbers into another sequence in which every number lies between 0 and 1: larger numbers map close to 1 and smaller numbers map close to 0, so the result can express the concept of probability.
As a supplementary note, the hidden outputs can also be processed through a highway network layer first, with the processed hidden outputs then input into the softmax function model to obtain the probabilities corresponding to the first hidden outputs. It should be noted that the highway network layer lets part of the information pass through: a vector keeps its dimension after the highway network layer, but each component is transformed differently according to the trained parameters, so only part of the information is retained. The probabilities obtained this way are more accurate and the effect is better.
It should be further explained that highway network layers increase the expressive and generalization ability of the model. Since each word related to a character passes through a highway network layer, and the highway network layer lets different information through for each word, the importance of each word and the words the current character attends to most can be expressed. In this way, the probabilities obtained for the first hidden outputs are more accurate, which effectively ensures accurate subsequent NER.
As an additional note, the weighting and summing of the second hidden outputs in step S102 to obtain the second weighted sum value includes the following process:
input the second hidden outputs into the softmax function model, and compute the probability corresponding to each second hidden output through the softmax function model;
take the probability corresponding to each second hidden output as the weight of that hidden output, and perform a weighted summation over the second hidden outputs with these weights to obtain the second weighted sum result.
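A sketch of this softmax weighting follows; since the patent does not spell out how each hidden output is reduced to a scalar before the softmax, a learned scoring vector is assumed here:

```python
# Weight the second LSTM's hidden outputs by softmax probabilities and sum.
import numpy as np

rng = np.random.default_rng(5)
hidden = rng.normal(size=(2, 100))  # e.g. outputs for "橘子洲广场" and "广场"
w_score = rng.normal(size=100)      # assumed learned scoring vector
scores = hidden @ w_score           # one scalar per hidden output
e = np.exp(scores - scores.max())
probs = e / e.sum()                 # softmax probabilities, used as weights
weighted_sum = (probs[:, None] * hidden).sum(axis=0)  # weighted sum value
print(probs, weighted_sum.shape)    # weights sum to 1; (100,)
```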
Referring to Fig. 4, step S104 can include the following steps:
Step S1041: input the concatenated output result into a 2a*c fully connected layer to obtain first output information, where a is the dimension of the output of the first LSTM network model and c is the number of named entity recognition labels.
Step S1042: input the first output information into the CRF layer for processing to obtain the named entity recognition label of each character.
In this embodiment, the forward processing result output by the forward highway network layer used for named entity recognition and the backward processing result output by the backward highway network layer need to be concatenated before being connected to the fully connected layer and the CRF layer, so a 2a*c fully connected layer must be provided.
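A shape-level sketch of steps S1041-S1042 (concrete sizes are illustrative assumptions):

```python
# Concatenate forward/backward processing results (dimension a each) into 2a
# features per character, then map through the 2a*c fully connected layer.
import numpy as np

n, a, c = 8, 100, 9  # characters, LSTM output dimension, NER label count
rng = np.random.default_rng(3)
fwd = rng.normal(size=(n, a))                # forward processing result
bwd = rng.normal(size=(n, a))                # backward processing result
concat = np.concatenate([fwd, bwd], axis=1)  # (n, 2a) concatenated output
W_fc = rng.normal(size=(2 * a, c)) * 0.01    # the 2a*c fully connected layer
emissions = concat @ W_fc                    # (n, c) first output information
print(emissions.shape)                       # per-character label scores -> CRF
```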
Referring to Fig. 5, after step S104, the named entity recognition method further includes the following steps:
Step S105: input the forward output result into a highway network layer to obtain second output information;
Step S106: input the second output information into an a*b fully connected layer to obtain the prediction probability of the next character after the current character, where a is the dimension of the output of the first LSTM network model and b is the number of distinct characters in the corpus;
Step S107: correct the named entity recognition label of the current character according to the prediction probability of its next character.
For example, referring to Fig. 2, if the current character is '市', the probability that the next character is '天' is predicted from the sequence information and lexical information before the character '市'. As another example, for the sentence "我喜欢长沙", after the character '喜' passes through the first LSTM network model and yields the first output information, the probability that the next character is '欢' is predicted from the first output information.
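A sketch of the language-model head in steps S105-S106; the corpus size and weights are illustrative assumptions:

```python
# Map the highway-processed forward output through the a*b fully connected
# layer and a softmax to get next-character probabilities over the corpus.
import numpy as np

a, b = 100, 5000                    # LSTM output dimension, corpus vocab size
rng = np.random.default_rng(6)
second_output = rng.normal(size=a)  # e.g. the processed forward output of '喜'
W_lm = rng.normal(size=(a, b)) * 0.01
logits = second_output @ W_lm       # the a*b fully connected layer
e = np.exp(logits - logits.max())
probs = e / e.sum()                 # P(next character | prefix so far)
print(int(probs.argmax()), float(probs.sum()))  # likeliest next char; 1.0
```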
As a supplementary note, in this embodiment, steps S105 and S106 can be performed after step S101, and after step S106 obtains the prediction probability of the next character of the current character, that prediction probability is stored. After step S104, step S107 is performed: the named entity recognition label of the current character is corrected according to the stored prediction probability of its next character.
In this way, the probability distribution of the next character can be predicted, and correcting the named entity recognition label of the current character according to the prediction probability of its next character improves the accuracy of the recognition labels, so the effect of named entity recognition is better.
The named entity recognition method provided by the embodiments of the present invention combines multiple types of LSTM models to obtain a forward output result and a backward output result from character vectors and word vectors, and concatenates the forward and backward output results to obtain a concatenated output result. Since the forward output result contains the contextual information of the preceding clause and the backward output result contains the contextual information of the following clause, concatenating them yields the complete contextual semantics. Processing the concatenated output result through the fully connected layer and the conditional random field CRF layer therefore produces more accurate named entity recognition labels for the characters, improving the overall effect of named entity recognition.
Fig. 6 shows a structural schematic diagram of a named entity recognition device 600 provided by an embodiment of the present invention; for convenience of explanation, only the parts relevant to this embodiment are shown. The named entity recognition device 600 comprises:
a forward result processing module 601, configured to input, in the forward character order of a sentence to be processed, the character vectors of the sentence in sequence into a first long short-term memory (LSTM) network model; within the segment from the beginning character of the sentence to the character currently being input into the first LSTM network model, obtain the first related words of the current input character; input the word vectors of the first related words into a second LSTM network model for processing to obtain first hidden outputs; weight and sum the first hidden outputs to obtain a first weighted sum value; and input the first weighted sum value into the first LSTM network model, which obtains a forward output result from the first weighted sum value and the character vector of the current input character.
In this embodiment, the corpus is divided in advance into sentences, and each sentence is processed separately. Before processing, all characters and words occurring in the corpus are trained with methods such as word2vec to obtain a pre-trained character embedding dictionary and a pre-trained word embedding dictionary. Each dictionary maps each character or word to its corresponding dense vector. That is, word vectors represent a sentence segmented into words, vectorized word by word, while character vectors vectorize the sentence character by character, as illustrated in the example given earlier.
In this embodiment, the character vectors of the characters in the sentence to be processed can be determined from the pre-obtained character embedding dictionary; within the segment from the beginning character of the sentence to the character currently being input into the first LSTM network model, the first related words of the current input character are obtained, and their word vectors are determined from the pre-obtained word embedding dictionary.
In this embodiment, the first long short-term memory (LSTM) network model is a long short-term memory network model, a kind of recurrent neural network model over time. The first LSTM network model is the LSTM network model that processes character vectors, and the second LSTM network model is the LSTM network model that processes word vectors. Since word vectors are input into the second LSTM network model, the first hidden outputs contain lexical information.
It should be noted that inputting the character vectors into the first LSTM network model in the forward character order of the sentence means feeding the character sequence of the sentence from front to back. For example, for the sentence "我喜欢长沙" ("I like Changsha"), the forward character order is '我', '喜', '欢', '长', '沙'.
To explain further, a first related word is a word within the segment from the beginning of the sentence to the current character that ends with the current character. First related words may or may not exist; when they do exist, there may be one or more of them. For example, in the segment "长沙市橘子洲广场", the words ending with the character '场' are "橘子洲广场" and "广场", so the first related words of the character '场' are "橘子洲广场" and "广场".
Specifically, the computation differs depending on whether the current input character has first related words. When the current input character has no first related word, the first LSTM network model can be constituted by formulas (1), (2), and (3):
Formula (1):

$$\begin{bmatrix} i_t^c \\ o_t^c \\ f_t^c \\ \tilde{c}_t^c \end{bmatrix} = \begin{bmatrix} \sigma \\ \sigma \\ \sigma \\ \tanh \end{bmatrix} \left( W^{cT} \begin{bmatrix} x_t^c \\ h_{t-1}^c \end{bmatrix} + b^c \right)$$

where $i_t^c$ denotes the input gate, $o_t^c$ the output gate, $f_t^c$ the forget gate, and $\tilde{c}_t^c$ the state information updated at the current time step; $\sigma$ denotes the Sigmoid function, often used as a threshold function of neural networks that maps a variable into the interval (0, 1), and $\tanh$ is also an activation function (both are described in standard texts); $W^{cT}$ denotes the weight matrix, a training parameter; $x_t^c$ denotes the character vector; $h_{t-1}^c$ denotes the output of the previous time step; and $b^c$ denotes the bias vector, a model training parameter.
Formula (2):

$$c_t^c = f_t^c \odot c_{t-1}^c + i_t^c \odot \tilde{c}_t^c$$

where $c_t^c$ denotes the current new state information, $f_t^c$ the forget gate, $c_{t-1}^c$ the state information of the previous time step, $i_t^c$ the input gate, and $\tilde{c}_t^c$ the state information updated at the current time step.
Formula (3):

$$h_t^c = o_t^c \odot \tanh(c_t^c)$$

where $h_t^c$ denotes the current hidden-layer output, $o_t^c$ the output gate, and $\tanh(c_t^c)$ the current new state information passed through the activation function.
When the current input character has no first related word, the information flows in sequence from formula (1) through formula (2) to formula (3), and the output finally computed by formula (3) serves as the output of the first LSTM network model.
When the current input character has first related words, formulas (1) and (3) above can serve as the first LSTM network model, and formulas (4)-(10) can form the second LSTM network model. Formulas (4)-(10) are as follows:
Formula (4):

$$\begin{bmatrix} i_{b,e}^w \\ f_{b,e}^w \\ \tilde{c}_{b,e}^w \end{bmatrix} = \begin{bmatrix} \sigma \\ \sigma \\ \tanh \end{bmatrix} \left( W^{wT} \begin{bmatrix} x_{b,e}^w \\ h_b^c \end{bmatrix} + b^w \right)$$

where $i_{b,e}^w$ denotes the input gate, $f_{b,e}^w$ the forget gate, and $\tilde{c}_{b,e}^w$ the state information updated at the current time step; $\sigma$ denotes the Sigmoid function, an activation function often used as a threshold function of neural networks that maps a variable into the interval (0, 1), and $\tanh$ is also an activation function; $x_{b,e}^w$ is the word vector of the word spanning character positions $b$ to $e$, and $h_b^c$ is the hidden output at the word's first character; $W^{wT}$ denotes the weight matrix, a training parameter, and $b^w$ denotes the bias vector, a model training parameter.
Formula (5):

$$c_{b,e}^w = f_{b,e}^w \odot c_b^c + i_{b,e}^w \odot \tilde{c}_{b,e}^w$$

where $c_{b,e}^w$ denotes the current new state information, $f_{b,e}^w$ the forget gate, $c_b^c$ the new state information from formula (2) when the character sequence is processed at time step $b$, $i_{b,e}^w$ the input gate, and $\tilde{c}_{b,e}^w$ the state information updated at the current time step.
Formula (6):

$$i_{b,e}^c = \sigma\!\left( W^{hT} \begin{bmatrix} x_e^c \\ c_{b,e}^w \end{bmatrix} + b^h \right)$$

where $i_{b,e}^c$ denotes an additional input gate obtained from a non-linear conversion layer, i.e., a fully connected layer with an activation function; $W^{hT}$ denotes the weight matrix; $c_{b,e}^w$ denotes the current new state information; $x_e^c$ denotes the character vector; and $b^h$ denotes the bias vector.
Formula (7):

$$c_e^c = \sum_{b' \in B} \alpha_{b',e}^c \odot c_{b',e}^w + \alpha_e^c \odot \tilde{c}_e^c, \qquad B = \{ b' \mid w_{b',e} \in D \}$$

where $c_e^c$ denotes the weighted sum of the output of the LSTM network model fed with the character vector and the outputs of the LSTM network model fed with the related word vectors; $B$ denotes the set of all $b'$ such that the span of the current sentence beginning at $b'$ and ending at the current character $e$ forms a word $w_{b',e}$ in the dictionary $D$; $c_{b',e}^w$ denotes the word state information beginning at $b'$ and ending at $e$; $\alpha_{b',e}^c$ denotes the weight of the word vector path; $\alpha_e^c$ denotes the weight of the character vector path; and $\tilde{c}_e^c$ denotes the candidate state of the character from formula (1).
Formula (8):

$$\alpha_{b,e}^c = \frac{\exp\!\big(i_{b,e}^c\big)}{\exp\!\big(i_e^c\big) + \sum_{b'' \in B''} \exp\!\big(i_{b'',e}^c\big)}, \qquad B'' = \{ b'' \mid w_{b'',e} \in D \}$$

where $\alpha_{b,e}^c$ denotes the weight of the word vector path; $i_{b,e}^c$ denotes the input gate of the word vector path; $b''$ ranges over $B''$, the set of all starting positions of words in the current sentence that end at $e$ and appear in the dictionary $D$; and $i_e^c$ denotes the input gate of the character vector.
Formula (9):

$$\alpha_e^c = \frac{\exp\!\big(i_e^c\big)}{\exp\!\big(i_e^c\big) + \sum_{b'' \in B''} \exp\!\big(i_{b'',e}^c\big)}$$

where $\alpha_e^c$ denotes the weight of the character vector path; $i_e^c$ denotes the input gate of the character vector; and $b''$ ranges over $B''$, the set of all starting positions of words in the current sentence that end at $e$ and appear in the dictionary $D$.
Formula (10):

$$t = \sigma\big( W^{tT} y + b^t \big)$$

where $t$ denotes the model output, $\sigma$ denotes the activation function, $y$ denotes the input to this layer, $W^{tT}$ denotes the weight vector to be trained, and $b^t$ denotes the bias vector to be trained.
When first related words exist, the first weighted sum value over the first related words is computed by the second LSTM network model formed by formulas (4)-(10). The forward output result is then obtained by passing the first weighted sum value and the character vector through the first LSTM network model composed of formulas (1) and (3).
Referring to Fig. 2, the sentence to be processed in Fig. 2 is "长沙市橘子洲广场". The character vectors of the characters '长', '沙', '市', '橘', '子', '洲', '广', '场' can each be obtained from the character embedding dictionary. When the current input character is '市', the first related word is "长沙市"; when the current input character is '场', the first related words are "橘子洲广场" and "广场", and the word vectors of "长沙市", "橘子洲广场", and "广场" can be obtained from the word embedding dictionary. For example, when the current input character is '场', the word vectors of "橘子洲广场" and "广场" are input into the second LSTM network model for processing to obtain the first hidden outputs. The first hidden outputs are weighted and summed to obtain the first weighted sum value, which is input into the first LSTM network model. The first LSTM network model then obtains the forward output result of '场' from the first weighted sum value and the character vector of '场'. Since the first hidden outputs contain lexical information, the first weighted sum value also contains lexical information, so the forward output result contains the contextual information of the preceding clause.
a backward processing module 602, configured to input, in the reverse character order of the sentence to be processed, the character vectors of the sentence in sequence into a reversed first LSTM network model; within the segment from the last character of the sentence to the character currently being input into the reversed first LSTM network model, obtain the second related words of the current input character; input the word vectors of the second related words into the second LSTM network model for processing to obtain second hidden outputs; weight and sum the second hidden outputs to obtain a second weighted sum value; and input the second weighted sum value into the reversed first LSTM network model, which obtains a backward output result from the second weighted sum value and the character vector of the current input character.
In this embodiment, the reversed first LSTM network model is also a long short-term memory network model, a kind of recurrent neural network model over time. The reversed first LSTM network model receives the character vectors in the reverse character order of the sentence to be processed. Its computation is similar to that of the first LSTM network model and, to avoid repetition, is not detailed here.
It should be noted that inputting the character vectors into the reversed first LSTM network model in the reverse character order of the sentence means feeding the character sequence of the sentence from back to front. For the sentence "我喜欢长沙", the forward input order into the first LSTM network model is '我', '喜', '欢', '长', '沙', while the reverse input order into the reversed first LSTM network model is '沙', '长', '欢', '喜', '我'. In the forward direction, when the first LSTM network model processes '欢', the word vector of "喜欢" passes through the second LSTM network model to obtain the corresponding hidden output.
Correspondingly, for the reversed input, when the reversed first LSTM network model processes '喜', the word vector of "喜欢" is passed through the second LSTM network model to obtain the corresponding second hidden output. Referring to Fig. 2, when the current input character is '橘', the second related word is "橘子洲广场"; the word vector of "橘子洲广场" is input into the second LSTM network model for processing, and the second hidden output is obtained. The second hidden outputs are weighted and summed to obtain the second weighted sum value, which is input into the reversed first LSTM network model. Through the reversed first LSTM network model, the reverse output result of '橘' is obtained from the second weighted sum value and the character vector of '橘'. Since the second hidden output contains lexical information, the second weighted sum value contains lexical information as well; the reverse output result therefore contains the contextual information of the latter part of the sentence.
Cascade module 603 is configured to process the forward output result and the reverse output result respectively through the highway network layer (highway layer) to obtain the forward processing result and the reverse processing result, and to cascade the forward processing result and the reverse processing result to obtain the cascaded output result.
In this embodiment, the highway network layer (highway layer) lets part of the information pass through: a vector keeps the same dimension after the highway layer, but each of its components is scaled differently according to the trained parameters, so only part of the information is retained. Processing the forward output result and the reverse output result respectively through the highway network layer to obtain the forward processing result and the reverse processing result includes the following: processing the forward output result through the highway network layer used for named entity recognition to obtain the forward processing result, and processing the reverse output result through that highway network layer to obtain the reverse processing result.
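A minimal sketch of a highway layer of this kind, assuming the standard gated form y = t * H(x) + (1 - t) * x; the choice of activations is an assumption:

import torch
import torch.nn as nn

class Highway(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.transform = nn.Linear(dim, dim)   # H(x)
        self.gate = nn.Linear(dim, dim)        # T(x)

    def forward(self, x):
        h = torch.relu(self.transform(x))
        t = torch.sigmoid(self.gate(x))        # per-component gate in (0, 1)
        return t * h + (1.0 - t) * x           # output keeps the dimension of x

Because the gate t is computed per component, each component of the vector can retain a different share of the input, which matches the behavior described above.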
In this embodiment, the forward output result and the reverse output result can each be passed through a highway network layer and then cascaded. The cascaded output result contains the contextual information of both the preceding part and the following part of the sentence, i.e. the full sentence context, and can effectively represent the semantic features of the whole text.
It should be noted that cascading simply means splicing (concatenation). The forward output result contains the contextual information of the preceding part and the reverse output result contains the contextual information of the following part, so the cascaded output result contains the contextual information of the whole sentence.
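Concretely, the cascade is a concatenation along the feature dimension; a minimal sketch with assumed shapes (8 characters of "长沙市橘子洲广场", 100-dimensional processing results):

import torch

forward_processed = torch.randn(8, 100)   # forward processing result, one row per character
reverse_processed = torch.randn(8, 100)   # reverse processing result
cascaded = torch.cat([forward_processed, reverse_processed], dim=-1)   # shape (8, 200)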
As a further note, Fig. 2 includes the first LSTM network model 201, the cascade model 202, the language model 203, the named entity recognition model 204, the first highway network layer 205 and the second highway network layer 206. The first LSTM network model 201 is connected to the first highway network layer 205 and the second highway network layer 206 respectively; the first highway network layer 205 is connected to the language model 203, the second highway network layer 206 is connected to the cascade model 202, and the cascade model 202 is connected to the named entity recognition model 204. The language model 203 can be removed entirely without preventing the named entity recognition model 204 from performing the named entity recognition task; with the language model 203 added to the calculation, however, the named entity recognition task performs better.
The named entity recognition model 204 includes a fully connected layer and a CRF layer. The fully connected layer increases the fitting capacity of the model on the one hand and changes the dimension of the vector on the other: for example, a fully connected layer with 200 input neurons and 20 output neurons converts each 200-dimensional vector into a 20-dimensional vector, which is convenient for the subsequent CRF layer to process. The role of the CRF layer is to predict the label of each character in the sentence.
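The 200-in/20-out example above corresponds to the following sketch (shapes assumed for illustration):

import torch
import torch.nn as nn

fc = nn.Linear(200, 20)    # 200 input neurons, 20 output neurons
x = torch.randn(8, 200)    # one 200-dimensional vector per character
label_scores = fc(x)       # shape (8, 20): per-character label scores for the CRF layer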
The processing flow of the named entity recognition task includes data processing in the second highway network layer 206, the cascade model 202, the fully connected layer and the CRF layer; the CRF layer performs joint label prediction over the sentence, and its output is the prediction result of named entity recognition.
It should be noted that the forward output result is processed through the second highway network layer 206 to obtain the forward processing result, the reverse output result is processed through a highway network layer to obtain the reverse processing result, and the forward processing result and the reverse processing result are cascaded in the cascade model 202. The cascaded result is then input into the fully connected layer and the CRF layer, and the prediction result of named entity recognition is obtained through the CRF layer.
Named entity processing module 604 is configured to process the cascaded output result through the fully connected layer and the conditional random field (CRF) layer to obtain the named entity recognition label of each character.
Before the CRF layer, the data has been processed into an n*k tensor, where n is the number of characters in the sentence and k is the number of distinct labels; for each character, the goal is to select one of the k labels as its final annotation. Without a CRF layer, labeling each character independently is also possible: apply a softmax function model to each character's k-dimensional vector and choose the label with the largest probability as that character's label. This, however, ignores the dependencies between labels and may therefore produce unreasonable label sequences. The CRF layer uses a transition matrix to compute the cost of moving between labels and scores the labeling scheme of the whole sentence. The detailed steps by which the CRF annotates the NER label of each character are as follows:
Assume the n*k tensor is P, the output of the layer preceding the CRF layer, where P_{i,j} is the score of labeling the i-th character of the sentence with the j-th label. For a predicted sequence y = (y_1, y_2, ..., y_n), its score is defined by formula (11):

Formula (11): s(X, y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}

where s(X, y) is the score of labeling sentence X with the sequence y, A_{y_i, y_{i+1}} is the transition score from label y_i to label y_{i+1}, and P_{i, y_i} is the score of labeling the i-th character with y_i.
A softmax function model is applied over all possible label sequences, producing the probability of a given label sequence, which can be expressed by formula (12):

Formula (12): p(y | X) = e^{s(X, y)} / \sum_{\tilde{y} \in Y_X} e^{s(X, \tilde{y})}

where p(y | X) is the probability that sentence X is labeled with the sequence y, Y_X is the set of all possible label sequences, e^{s(X, y)} is the exponential (base e) of the score of the current label sequence, and the denominator is the sum of the exponentials of the scores of all label sequences.
During training, the log-likelihood of the correct label sequence is maximized according to formula (13):

Formula (13): \log p(y | X) = s(X, y) - \log \sum_{\tilde{y} \in Y_X} e^{s(X, \tilde{y})}

where p(y | X) is the probability that sentence X is labeled with the sequence y, Y_X is the set of all possible label sequences, s(X, y) is the score of the current label sequence, and the second term is the logarithm of the sum of the exponentials (base e) of the scores of all label sequences.
At training or prediction time, the output sequence with the largest score is taken as the named entity recognition labels.
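A NumPy sketch of formulas (11)-(13) and of largest-score decoding follows, under assumptions: start/stop transition scores are omitted for brevity, and a Viterbi recursion stands in for scoring every sequence explicitly.

import numpy as np

def sequence_score(P, A, y):
    # Formula (11): s(X, y) = sum_i A[y_i, y_{i+1}] + sum_i P[i, y_i]
    emit = sum(P[i, y[i]] for i in range(len(y)))
    trans = sum(A[y[i], y[i + 1]] for i in range(len(y) - 1))
    return emit + trans

def log_likelihood(P, A, y):
    # Formula (13): log p(y|X) = s(X, y) - log sum over all sequences of e^s
    n, k = P.shape
    alpha = P[0].copy()                          # log-space forward variable
    for i in range(1, n):
        alpha = P[i] + np.logaddexp.reduce(alpha[:, None] + A, axis=0)
    log_Z = np.logaddexp.reduce(alpha)           # log of the partition sum in formula (12)
    return sequence_score(P, A, y) - log_Z

def viterbi(P, A):
    # Prediction: the label sequence with the largest score, as described above.
    n, k = P.shape
    delta, back = P[0].copy(), []
    for i in range(1, n):
        cand = delta[:, None] + A                # cand[p, j]: best score ending in p, then p -> j
        back.append(cand.argmax(axis=0))
        delta = P[i] + cand.max(axis=0)
    path = [int(delta.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1]

# Tiny usage: 3 characters, 2 labels; P is the n*k tensor before the CRF layer.
P = np.array([[2.0, 0.5], [0.2, 1.5], [1.0, 1.0]])
A = np.array([[0.5, -0.5], [-0.5, 0.5]])         # transition scores between the 2 labels
best = viterbi(P, A)
print(best, log_likelihood(P, A, best))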
The named entity recognition device provided by the present invention obtains the forward output result and the reverse output result according to the character vectors and word vectors, and cascades the two to obtain the cascaded output result. Since the forward output result contains the contextual information of the preceding part of the sentence and the reverse output result contains the contextual information of the following part, cascading them yields the complete contextual semantics. The cascaded output result is processed through the fully connected layer and the conditional random field CRF layer, so the resulting named entity recognition labels of the characters are more accurate, improving the overall effect of named entity recognition.
Referring to Fig. 7, the forward result processing module 601 includes:

Computation submodule 6011, configured to input the first hidden outputs into a softmax function model and to calculate, through the softmax function model, the probability corresponding to each first hidden output.

Processing submodule 6012, configured to use the probability corresponding to each first hidden output as its weight, and to weight and sum the first hidden outputs according to these weights to obtain the first weighted sum result.
In this embodiment, the normalizing softmax function model converts one sequence of numbers into another whose entries all lie between 0 and 1: originally larger numbers map close to 1 after the conversion and originally smaller numbers map close to 0, so the outputs can be used to express the concept of probability.
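For example (input values illustrative), larger inputs are pushed toward 1 and the outputs sum to 1:

import torch
import torch.nn.functional as F

probs = F.softmax(torch.tensor([5.0, 1.0, 0.5]), dim=0)
# probs is approximately tensor([0.9714, 0.0178, 0.0108]): the largest input dominates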
As a supplementary note, the hidden outputs can also be processed through a highway network layer first, with the processed hidden outputs then input into the softmax function model to obtain the probabilities corresponding to the first hidden outputs. It should be noted that the highway network layer lets part of the information pass through: a vector keeps its dimension after the highway network layer, but each component is scaled differently according to the trained parameters, so only part of the information is retained. The probabilities obtained in this way are more accurate and the effect is better.
It should further be explained that the highway network layer improves the expressive capacity and generalization capacity of the model. Since the words related to each character pass through the highway network layer, and the highway network layer passes different information for different words, it can express the importance of each word and which words the current character is most concerned with. The probabilities obtained for the first hidden outputs are therefore more accurate, which helps ensure accurate subsequent NER recognition.
As a supplementary note, the reverse result processing module 602 is also configured to:

input the second hidden outputs into the softmax function model and calculate, through the softmax function model, the probability corresponding to each second hidden output; and

use the probability corresponding to each second hidden output as its weight, and weight and sum the second hidden outputs according to these weights to obtain the second weighted sum result.
Referring to Fig. 8, the named entity processing module 604 includes:

First input submodule 6041, configured to input the cascaded output result into a 2a*c fully connected layer to obtain the first output information, where a is the output dimension of the first LSTM network model and c is the number of labels for named entity recognition.

Second input submodule 6042, configured to input the first output information into the CRF layer for processing to obtain the named entity recognition label of each character.
In this embodiment, the forward processing result output by the highway network layer used for named entity recognition and the reverse processing result output by the reversed highway network layer need to be cascaded; cascading means splicing, after which the fully connected layer and the CRF layer are connected, so a 2a*c fully connected layer is required.
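A sketch with assumed sizes: if each direction's LSTM output has a = 100 dimensions and there are c = 9 NER labels (e.g. BIO tags over four entity types plus O; both values are illustrative), the cascaded 2a-dimensional vector feeds a Linear(200, 9) layer:

import torch
import torch.nn as nn

a, c = 100, 9                              # assumed LSTM output dim and NER label count
fc = nn.Linear(2 * a, c)                   # the 2a*c fully connected layer
cascaded = torch.randn(8, 2 * a)           # cascaded output result for 8 characters
first_output_information = fc(cascaded)    # shape (8, 9), passed on to the CRF layer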
Referring to Fig. 9, the named entity recognition device further includes:

First input module 605, configured to input the forward output result into a highway network layer to obtain the second output information;

Second input module 606, configured to input the second output information into an a*b fully connected layer to obtain the prediction probability of the next character after the current character, where a is the output dimension of the first LSTM network model and b is the number of distinct characters in the corpus;

Correction module 607, configured to correct the named entity recognition label of the current character according to the prediction probability of its next character.
As a supplementary note, in one embodiment, after the named entity processing module 604 obtains the named entity recognition labels of the characters, the first input module 605, the second input module 606 and the correction module 607 perform the corresponding operations in turn to correct the named entity recognition label of the current character.
In another embodiment, after the forward result processing module 601 obtains the forward output result, the first input module 605 and the second input module 606 perform the corresponding operations to obtain and store the prediction probability of the next character after the current character. After the named entity processing module 604 obtains the named entity recognition labels of the characters, the correction module 607 corrects the named entity recognition label of the current character according to the stored prediction probability of its next character.
For example, referring to Fig. 2, if the current character is "市", the probability that the next character is "橘" is predicted from the preceding sequence information and lexical information. As another example, for the sentence "我喜欢长沙", after the character "喜" passes through the first LSTM network model, the probability that the next character is "欢" is predicted from its output.
In this way, the probability distribution of the next character can be predicted, and the named entity recognition label of the current character can be corrected according to the prediction probability of its next character, which improves the accuracy of the labels and thus the effect of named entity recognition.
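A sketch of this language-model branch, with assumed sizes (a = 100, b = 5000 are illustrative) and an illustrative correction signal:

import torch
import torch.nn as nn
import torch.nn.functional as F

a, b = 100, 5000                      # LSTM output dim; number of characters in the corpus
lm_head = nn.Linear(a, b)             # the a*b fully connected layer (module 606)
second_output = torch.randn(1, a)     # forward output of the current character, via a highway layer
next_char_probs = F.softmax(lm_head(second_output), dim=-1)   # distribution over the next character
# A low probability for the character that actually follows can flag the current
# character's NER label for correction, as described in the paragraph above.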
The named entity recognition device provided in this embodiment of the present invention can, by combining several types of LSTM, obtain the forward output result and the reverse output result according to the character vectors and word vectors, and cascade the two to obtain the cascaded output result. Since the forward output result contains the contextual information of the preceding part of the sentence and the reverse output result contains the contextual information of the following part, cascading them yields the complete contextual semantics; processing the cascaded output result through the fully connected layer and the conditional random field CRF layer therefore produces more accurate named entity recognition labels, improving the overall effect of named entity recognition.
An embodiment of the present invention provides a computer apparatus including a processor; when executing the computer program in the memory, the processor implements the steps of the named entity recognition method provided by the method embodiments above.
Illustratively, the computer program may be divided into one or more modules, which are stored in the memory and executed by the processor to complete the present invention. The one or more modules may be a series of computer program instruction segments capable of completing specific functions, the instruction segments describing the execution of the computer program in the computer apparatus. For example, the computer program may be divided according to the steps of the named entity recognition method provided by the method embodiments above.
Those skilled in the art will understand that the above description of the computer apparatus is only an example and does not constitute a limitation; the apparatus may include more or fewer components than described, combine certain components, or use different components, and may for example include input/output devices, network access devices, buses and the like.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the computer apparatus, connecting the various parts of the whole apparatus through various interfaces and lines.
The memory can be used to store the computer program and/or modules; the processor implements the various functions of the computer apparatus by running or executing the computer program and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area can store the operating system and the application programs required by at least one function (such as a sound playing function or an image playing function), and the data storage area can store data created according to the use of the device (such as audio data or a phone book). In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
If the integrated modules/units of the computer apparatus are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present invention may also be completed by instructing the relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above named entity recognition method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, certain intermediate forms, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like.
The foregoing are only preferred embodiments of the present invention and are not intended to limit the invention; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (10)

1. A named entity recognition method, characterized in that the named entity recognition method comprises:

sequentially inputting the character vectors of a sentence to be processed into a first long short-term memory (LSTM) network model according to the forward character order of the sentence; obtaining, within the segment from the beginning character of the sentence to the current input character of the first LSTM network model, the first related word of the current input character; inputting the word vector of the first related word into a second LSTM network model for processing to obtain a first hidden output; weighting and summing the first hidden outputs to obtain a first weighted sum value; inputting the first weighted sum value into the first LSTM network model; and obtaining, through the first LSTM network model, a forward output result according to the first weighted sum value and the character vector of the current input character;

sequentially inputting the character vectors of the sentence to be processed into a reversed first LSTM network model according to the reverse character order of the sentence; obtaining, within the segment from the final character of the sentence to the current input character of the reversed first LSTM network model, the second related word of the current input character; inputting the word vector of the second related word into the second LSTM network model for processing to obtain a second hidden output; weighting and summing the second hidden outputs to obtain a second weighted sum value; inputting the second weighted sum value into the reversed first LSTM network model; and obtaining, through the reversed first LSTM network model, a reverse output result according to the second weighted sum value and the character vector of the current input character;

processing the forward output result and the reverse output result respectively through a highway network layer (highway layer) to obtain a forward processing result and a reverse processing result, and cascading the forward processing result and the reverse processing result to obtain a cascaded output result; and

processing the cascaded output result through a fully connected layer and a conditional random field (CRF) layer to obtain the named entity recognition label of each character.
2. The named entity recognition method according to claim 1, characterized in that weighting and summing the first hidden outputs to obtain the first weighted sum value comprises:

inputting the first hidden outputs into a softmax function model and calculating, through the softmax function model, the probability corresponding to each first hidden output; and

using the probability corresponding to each first hidden output as its weight, and weighting and summing the first hidden outputs according to these weights to obtain the first weighted sum result.
3. The named entity recognition method according to claim 1, characterized in that processing the cascaded output result through the fully connected layer and the CRF layer to obtain the named entity recognition label of each character comprises:

inputting the cascaded output result into a 2a*c fully connected layer to obtain first output information, where a is the output dimension of the first LSTM network model and c is the number of labels for named entity recognition; and

inputting the first output information into the CRF layer for processing to obtain the named entity recognition label of each character.
4. The named entity recognition method according to any one of claims 1 to 3, characterized in that, after processing the cascaded output result through the fully connected layer and the conditional random field (CRF) layer to obtain the named entity recognition label of each character, the named entity recognition method comprises:

inputting the forward output result into a highway network layer to obtain second output information;

inputting the second output information into an a*b fully connected layer to obtain the prediction probability of the next character after the current character, where a is the output dimension of the first LSTM network model and b is the number of distinct characters in the corpus; and

correcting the named entity recognition label of the current character according to the prediction probability of its next character.
5. A named entity recognition device, characterized in that the named entity recognition device comprises:

a forward result processing module, configured to sequentially input the character vectors of a sentence to be processed into a first long short-term memory (LSTM) network model according to the forward character order of the sentence; to obtain, within the segment from the beginning character of the sentence to the current input character of the first LSTM network model, the first related word of the current input character; to input the word vector of the first related word into a second LSTM network model for processing to obtain a first hidden output; to weight and sum the first hidden outputs to obtain a first weighted sum value; to input the first weighted sum value into the first LSTM network model; and to obtain, through the first LSTM network model, a forward output result according to the first weighted sum value and the character vector of the current input character;

a reverse result processing module, configured to sequentially input the character vectors of the sentence to be processed into a reversed first LSTM network model according to the reverse character order of the sentence; to obtain, within the segment from the final character of the sentence to the current input character of the reversed first LSTM network model, the second related word of the current input character; to input the word vector of the second related word into the second LSTM network model for processing to obtain a second hidden output; to weight and sum the second hidden outputs to obtain a second weighted sum value; to input the second weighted sum value into the reversed first LSTM network model; and to obtain, through the reversed first LSTM network model, a reverse output result according to the second weighted sum value and the character vector of the current input character;

a cascade module, configured to process the forward output result and the reverse output result respectively through a highway network layer (highway layer) to obtain a forward processing result and a reverse processing result, and to cascade the forward processing result and the reverse processing result to obtain a cascaded output result; and

a named entity processing module, configured to process the cascaded output result through a fully connected layer and a conditional random field (CRF) layer to obtain the named entity recognition label of each character.
6. The named entity recognition device according to claim 5, characterized in that the forward result processing module comprises:

a computation submodule, configured to input the first hidden outputs into a softmax function model and to calculate, through the softmax function model, the probability corresponding to each first hidden output; and

a processing submodule, configured to use the probability corresponding to each first hidden output as its weight, and to weight and sum the first hidden outputs according to these weights to obtain the first weighted sum result.
7. The named entity recognition device according to claim 5, characterized in that the named entity processing module comprises:

a first input submodule, configured to input the cascaded output result into a 2a*c fully connected layer to obtain first output information, where a is the output dimension of the first LSTM network model and c is the number of labels for named entity recognition; and

a second input submodule, configured to input the first output information into the CRF layer for processing to obtain the named entity recognition label of each character.
8. The named entity recognition device according to any one of claims 5 to 7, characterized in that the named entity recognition device further comprises:

a first input module, configured to input the forward output result into a highway network layer to obtain second output information;

a second input module, configured to input the second output information into an a*b fully connected layer to obtain the prediction probability of the next character after the current character, where a is the output dimension of the first LSTM network model and b is the number of distinct characters in the corpus; and

a correction module, configured to correct the named entity recognition label of the current character according to the prediction probability of its next character.
9. A computer apparatus, characterized in that the computer apparatus comprises a processor, and the processor is configured, when executing the computer program in the memory, to implement the steps of the named entity recognition method according to any one of claims 1 to 4.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the named entity recognition method according to any one of claims 1 to 4.