CN112115721A - Named entity identification method and device - Google Patents
- Publication number
- CN112115721A (application CN202011039983.1A)
- Authority
- CN
- China
- Prior art keywords
- word
- text
- matrix
- feature matrix
- recognized
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F40/295 — Named entity recognition
- G06F40/289 — Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/30 — Semantic analysis
- G06F18/253 — Fusion techniques of extracted features
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
Abstract
An embodiment of the invention provides a named entity recognition method and device. The method comprises: inputting a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized; inputting a first word sequence matrix of the text to be recognized into a second training model to obtain a first word feature matrix of the text to be recognized, the dimension of the first character feature matrix being the same as that of the first word feature matrix; processing the first character feature matrix and the first word feature matrix to obtain a first character-word fusion feature matrix; and processing the fusion feature matrix through a third training model to obtain the named entity recognition result of the text to be recognized. Because the character feature matrix and the word feature matrix are fused, and the fused matrix is then processed, the accuracy of the recognition result of the text to be recognized is further improved.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a named entity identification method and apparatus.
Background
Named Entity Recognition (NER), also called "proper name recognition", refers to recognizing entities with specific meanings in text, mainly including names of people, places and organizations, proper nouns, and the like. NER marks the position and type of the relevant entities in a piece of natural language text and extracts the required entities, such as organization names, person names, and disease and symptom mentions in the medical field. It is widely applied in tasks such as knowledge graph construction, information extraction, information retrieval, machine translation, automatic question answering and public opinion monitoring, and is a foundation of natural language processing.
NER generally uses sequence labeling to determine entity boundaries and entity types. However, the first step of NER is to determine word boundaries, i.e., word segmentation, and Chinese text has no boundary markers, like the spaces in English text, that explicitly delimit words. Chinese also has special entity types beyond those defined for English, such as transliterated foreign person names and place names, and Chinese words are frequently polysemous. Existing Chinese named entity recognition methods therefore still have certain limitations.
In the prior art, characters, roots or words are mapped to single vectors, and NER is realized through training models such as a Convolutional Neural Network (CNN) or a Long Short-Term Memory network (LSTM). However, to strengthen the correlations between characters, or between characters and words, a great deal of manual intervention is needed to construct character or word features, which is time-consuming and labor-intensive. It is also difficult for such methods to guarantee the accuracy of named entity recognition in practice; in particular, for sentences containing long entities, the entity boundaries are harder to identify, resulting in lower recognition accuracy.
Therefore, there is a need for a method and an apparatus for identifying a named entity, which can improve the accuracy of identifying the named entity.
Disclosure of Invention
The embodiment of the invention provides a named entity identification method and device, which can improve the accuracy of named entity identification.
In a first aspect, an embodiment of the present invention provides a method for identifying a named entity, where the method includes:
inputting a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized; inputting the first word sequence matrix of the text to be recognized into a second training model to obtain a first word characteristic matrix of the text to be recognized; the dimension of the first character feature matrix is the same as the dimension of the first word feature matrix; processing the first character feature matrix and the first word feature matrix to obtain a first word fusion feature matrix; and processing the first word fusion characteristic matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
In this method, a character feature matrix and a word feature matrix are obtained with the first training model and the second training model. Because the character feature matrix and the word feature matrix have the same dimension, they can be fused, which on the one hand improves the accuracy of the recognition result of the text to be recognized, and on the other hand avoids the gradient explosion that the excessively high dimensionality caused by operations such as concatenating the two matrices can bring, which would reduce the running efficiency of the model. The character-word fusion feature matrix is then processed through the third training model, further improving the accuracy of the recognition result of the text to be recognized.
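The patent does not fix the exact fusion operator, but the contrast with concatenation can be sketched as follows; element-wise addition is one plausible dimension-preserving choice (the shapes and sizes here are illustrative assumptions):

```python
import numpy as np

def fuse(char_feats, word_feats):
    # Element-wise addition keeps the shared dimension, whereas
    # concatenation would double the feature width.
    assert char_feats.shape == word_feats.shape
    return char_feats + word_feats

seq_len, dim = 7, 128
char_feats = np.random.rand(seq_len, dim)   # first character feature matrix
word_feats = np.random.rand(seq_len, dim)   # first word feature matrix

fused = fuse(char_feats, word_feats)
concat = np.concatenate([char_feats, word_feats], axis=-1)
print(fused.shape)   # (7, 128): dimension unchanged after fusion
print(concat.shape)  # (7, 256): concatenation inflates the dimension
```

Keeping the fused dimension equal to the input dimension is what lets the downstream model stay small, in line with the efficiency argument above.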
Optionally, before the first character sequence matrix of the text to be recognized is input into the first training model to obtain the first character feature matrix of the text to be recognized, the method further includes: setting a first parameter of the first training model, the first parameter being used for acquiring the first character feature matrix with a preset dimension. Before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, the method further includes: setting a second parameter of the second training model, the second parameter being used for acquiring the first word feature matrix with the preset dimension.
In the method, the first parameters are respectively set for the first training model, and the second parameters are set for the second training model, so that the obtained character feature matrix and the obtained word feature matrix have the same dimension, the fusion processing of the first character feature matrix and the first word feature matrix is facilitated, and the accuracy of named entity recognition is further facilitated to be improved.
Optionally, before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, the method further includes: determining a first word vector corresponding to each word of the text to be recognized in a first mode; the first word vector of each word constitutes the first word sequence matrix; determining a second word vector corresponding to each word of the text to be recognized in a second mode; the first mode is different from the second mode; performing word segmentation on the text to be recognized to obtain each word segmentation of the text to be recognized; and carrying out same-dimension processing on the second word vector of each word in each participle, and determining the word vector of each participle so as to obtain the first word sequence matrix.
In the method, a first word vector and a second word vector of the text to be recognized are determined in a first mode and a second mode respectively, and further, according to each participle of the text to be recognized and the second word vector corresponding to each participle, the second word vector of a plurality of words of each participle is subjected to same-dimension processing to determine the word vector of each participle, and further, a first word sequence matrix is obtained. Therefore, the first word sequence matrix not only contains the word segmentation information of the text to be recognized, but also contains the semantic information of each word of the text to be recognized, and the global informativeness of the word is kept. Therefore, the accuracy of the named entity recognition result of the text to be recognized is improved.
Optionally, the first training model is a BERT (Bidirectional Encoder Representations from Transformers) model, and the second training model is a CNN (Convolutional Neural Network) model.
Optionally, the third training model includes a bidirectional long short-term memory (BiLSTM) model and a self-attention mechanism model. Processing the first word fusion feature matrix through the third training model to obtain the named entity recognition result of the text to be recognized includes: processing the first word fusion feature matrix through the BiLSTM model to enrich the semantic information of the text to be recognized corresponding to the matrix, obtaining a first word feature matrix; processing the first word feature matrix through the self-attention mechanism model to increase the weight of the corresponding named entities, obtaining a second word feature matrix; and obtaining the named entity recognition result of the text to be recognized according to the second word feature matrix.
In this method, the BiLSTM model enriches the semantic information contained in the first word fusion feature matrix corresponding to the text to be recognized, improving the reliability and accuracy of that semantic information. The self-attention mechanism model then increases the weight of the corresponding named entities in the first word feature matrix, so that the named entities occupy a prominent position in the second word feature matrix; when the named entities of the text to be recognized are obtained from the second word feature matrix, their recognition is therefore more definite, which increases the accuracy of the recognition result.
Optionally, the third training model further includes a conditional random field (CRF) model. After the first word feature matrix is processed through the self-attention mechanism model to increase the weight of the corresponding named entities and the second word feature matrix is obtained, the method further includes: performing sequence optimization on the second word feature matrix through the CRF model to obtain a third word feature matrix; and obtaining the named entity recognition result of the text to be recognized, in the optimal arrangement sequence, according to the third word feature matrix.
In the method, the sequence of the second word feature matrix is optimized through the CRF model, so that the sequence of the named entities obtained by identifying the text to be identified is the optimal sequence on the premise of improving the accuracy of the recognition result of the named entities.
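The patent does not spell out its CRF implementation. As a hedged sketch, a CRF layer conventionally selects the optimal tag sequence with Viterbi decoding over emission scores (from the upstream network) and tag-transition scores; the two-tag toy data below is purely illustrative:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    # emissions: (steps, tags) scores from the upstream network;
    # transitions[i, j]: score of moving from tag i to tag j.
    n_steps, n_tags = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((n_steps, n_tags), dtype=int)
    for t in range(1, n_steps):
        total = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    best = [int(score.argmax())]
    for t in range(n_steps - 1, 0, -1):
        best.append(int(backptr[t, best[-1]]))
    return best[::-1]

# Two tags (0 = "O", 1 = "B-ENT") and neutral transitions.
emissions = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
path = viterbi_decode(emissions, np.zeros((2, 2)))
print(path)  # [0, 1, 0]
```

With non-zero transition scores, the same decoder would penalize invalid tag orders (e.g., an inside tag with no begin tag), which is the "sequence optimization" role described above.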
Optionally, the method further includes: inputting a second word sequence matrix of a sample text into a first training model to obtain a second word feature matrix of the sample text, wherein the first training model is a trained model; inputting the second word sequence matrix of the sample text into an initial second training model to obtain a second word characteristic matrix of the sample text; the dimension of the second word feature matrix is the same as the dimension of the second word feature matrix; processing the second character feature matrix and the second word feature matrix to obtain a second word fusion feature matrix; processing the second word fusion feature matrix through an initial third training model to obtain a second named entity recognition result of the sample text; and if the second named entity recognition result does not meet the set condition, adjusting a second training model and a third training model according to the second named entity recognition result.
In this method, the already-trained first training model is used to train the untrained second and third training models, which improves the accuracy of the parameters of the second and third training models and the degree to which the first, second and third training models match one another. The recognition result of the text to be recognized is therefore more accurate.
In a second aspect, an embodiment of the present invention provides a named entity identifying apparatus, where the apparatus includes:
the acquisition module is used for inputting a first character sequence matrix of a text to be recognized into a first training model to acquire a first character feature matrix of the text to be recognized; inputting the first word sequence matrix of the text to be recognized into a second training model to obtain a first word characteristic matrix of the text to be recognized; the dimension of the first character feature matrix is the same as the dimension of the first word feature matrix;
the processing module is used for processing the first character feature matrix and the first word feature matrix to obtain a first word fusion feature matrix; and processing the first word fusion characteristic matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
In a third aspect, an embodiment of the present application further provides a computing device, including: a memory for storing a program; a processor for calling the program stored in said memory and executing the method as described in the various possible designs of the first aspect according to the obtained program.
In a fourth aspect, embodiments of the present application further provide a computer-readable non-transitory storage medium including a computer-readable program which, when read and executed by a computer, causes the computer to perform the method as described in the various possible designs of the first aspect.
These and other implementations of the present application will be more readily understood from the following description of the embodiments.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic diagram of an architecture for named entity recognition according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a named entity identification method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a BiLSTM model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a BiLSTM model according to an embodiment of the present invention;
fig. 5 is a schematic flowchart of a named entity recognition method according to an embodiment of the present invention;
fig. 6 is a schematic diagram of a named entity recognition apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 shows a system architecture for named entity recognition according to an embodiment of the present invention. The text to be recognized is input into a character feature training model 101, which obtains the first character feature matrix of the text and inputs it into a character-word feature fusion model 103. A word feature training model 102 likewise obtains the first word feature matrix of the text to be recognized and inputs it into the fusion model 103. The fusion model 103 fuses the first character feature matrix and the first word feature matrix, which have the same dimension, into a first word fusion feature matrix and inputs it into a global semantic training model 104. The global semantic training model 104 trains the first word fusion feature matrix to increase its global semantic information about the text to be recognized, obtaining a first word feature matrix, which it inputs into a named entity weight training model 105. The weight training model 105 trains the first word feature matrix to increase the weight of the named entities of the text to be recognized, obtaining a second word feature matrix, which it inputs into a named entity sequence training model 106. The sequence training model 106 trains the second word feature matrix to optimize the arrangement sequence of the named entities corresponding to it, obtaining a third word feature matrix, from which the named entity recognition result is obtained.
Based on this, an embodiment of the present application provides a procedure of a named entity recognition method, as shown in fig. 2, including:
Step 201, inputting a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized.
Step 202, inputting a first word sequence matrix of the text to be recognized into a second training model to obtain a first word feature matrix of the text to be recognized, the dimension of the first character feature matrix being the same as that of the first word feature matrix.
Step 203, processing the first character feature matrix and the first word feature matrix to obtain a first word fusion feature matrix.
Step 204, processing the first word fusion feature matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
In this method, a character feature matrix and a word feature matrix are obtained with the first training model and the second training model. Because the character feature matrix and the word feature matrix have the same dimension, they can be fused, which on the one hand improves the accuracy of the recognition result of the text to be recognized, and on the other hand avoids the gradient explosion that the excessively high dimensionality caused by operations such as concatenating the two matrices can bring, which would reduce the running efficiency of the model. The character-word fusion feature matrix is then processed through the third training model, further improving the accuracy of the recognition result of the text to be recognized.
Before the first character sequence matrix of the text to be recognized is input into the first training model to obtain the first character feature matrix of the text to be recognized, the embodiment of the application further provides a method for obtaining matching dimensions, which further includes:
setting a first parameter of the first training model, the first parameter being used for acquiring the first character feature matrix with a preset dimension; and, before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, setting a second parameter of the second training model, the second parameter being used for acquiring the first word feature matrix with the preset dimension. That is, by setting the first parameter of the first training model and the second parameter of the second training model, the matrices output by the two models can be made to have the same dimension; namely, the dimension of the first character feature matrix output by the first training model is the same as that of the first word feature matrix output by the second training model.
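As a minimal sketch of dimension matching (the encoder width of 768, the preset dimension of 256 and the projection matrix are illustrative assumptions, not values from the patent): the character branch can be projected to the preset dimension, while the word branch's CNN is configured, e.g. via its filter count, to emit that dimension directly.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, encoder_dim, preset_dim = 10, 768, 256  # hypothetical widths

# Character branch: encoder output projected to the preset dimension
# (one way a "first parameter" could fix the output width).
char_feats = rng.standard_normal((seq_len, encoder_dim))
W_proj = rng.standard_normal((encoder_dim, preset_dim))
char_feats = char_feats @ W_proj

# Word branch: a CNN whose filter count (the "second parameter") equals
# the preset dimension would emit features of this shape directly.
word_feats = rng.standard_normal((seq_len, preset_dim))

assert char_feats.shape == word_feats.shape  # now fusion-ready
print(char_feats.shape)  # (10, 256)
```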
The embodiment of the present application further provides a method for acquiring the first character sequence matrix and the first word sequence matrix. Before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, the method further includes: determining, in a first manner, a first word vector corresponding to each character of the text to be recognized, the first word vectors constituting the first character sequence matrix; determining, in a second manner different from the first, a second word vector corresponding to each character; performing word segmentation on the text to be recognized to obtain its participles; and performing same-dimension processing on the second word vectors of the characters in each participle to determine the word vector of each participle, thereby obtaining the first word sequence matrix. That is, the first word vectors obtained in the first manner form the set C_i = (c_i^1, c_i^2, ..., c_i^n), where C_i denotes the set of first word vectors of the i-th sentence and c_i^n the first word vector of the n-th character of the i-th sentence. Second word vectors are obtained in the second manner, and word segmentation yields the set S_i = (s_i^1, s_i^2, ..., s_i^m), where S_i denotes the set of word vectors of the i-th sentence and s_i^m the word vector of the m-th participle of the i-th sentence. If a word vector s_i^j corresponds to several second word vectors, then s_i^j may be obtained from them by, for example, addition (followed by averaging) or subtraction; the way of obtaining the word vector from the second word vectors is not specifically limited here.
Thus, the word vector is obtained according to the second word vector, so that the first word sequence matrix obtained according to the word vector not only contains the word segmentation information of the text to be recognized, but also contains the semantic information of each word of the text to be recognized, and the global informativeness of the word is kept. Therefore, the accuracy of the named entity recognition result of the text to be recognized is improved.
For example, suppose the text to be recognized is "Jiangsu Suzhou disease control center". The first word vectors of the characters obtained in the first manner are, for example: Jiang (159), Su (357), Zhou (489), Ji/disease (621), Kong/control (741), Zhong (963), Xin/heart (452); these first word vectors form the first character sequence matrix. The second word vectors obtained in the second manner are: Jiang (321), Su (355), Su (557), Zhou (499), Ji (622), Kong (451), Zhong (564), Xin (877). Segmenting the text to be recognized yields the participles "Jiangsu, Suzhou, disease control, center", and the word vector of each participle is obtained by adding and averaging the second word vectors of its characters: Jiangsu ((321 + 355) / 2 = 338), Suzhou ((557 + 499) / 2 = 528), disease control ((622 + 451) / 2 = 536.5), center ((564 + 877) / 2 = 720.5); these word vectors form the first word sequence matrix. The text to be recognized here is only an example and may also contain dates, symbols, and the like; it is not specifically limited. Averaging the second word vectors is likewise only one example of same-dimension processing for determining the word vector of each participle; other dimension-preserving operations, such as subtraction, may be used. The same-dimension processing method is not specifically limited here.
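The averaging step of this example can be sketched as follows, treating the scalar values from the example as 1-dimensional stand-ins for the second word vectors (the function name and the per-participle character counts are illustrative, not from the patent):

```python
import numpy as np

def participle_vectors(char_vectors, lengths):
    # Average the second word vectors of the characters inside each
    # participle; the result keeps the character-vector dimension.
    out, start = [], 0
    for n in lengths:
        out.append(np.mean(char_vectors[start:start + n], axis=0))
        start += n
    return np.stack(out)

# 1-d stand-ins for the second word vectors of the example:
# Jiang, Su, Su, Zhou, Ji, Kong, Zhong, Xin.
chars = np.array([[321.], [355.], [557.], [499.], [622.], [451.], [564.], [877.]])
# Participles "Jiangsu / Suzhou / disease control / center": 2 characters each.
words = participle_vectors(chars, [2, 2, 2, 2])
print(words.ravel())  # [338.  528.  536.5 720.5]
```

With real embeddings each row of `chars` would be a full vector, and the averaging would still leave the word vectors in the same space as the character vectors.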
The embodiment of the application provides a named entity recognition method in which the first training model is a BERT (Bidirectional Encoder Representations from Transformers) model and the second training model is a CNN (Convolutional Neural Network) model.
Here, the first training model is the BERT model, whose training process mainly uses a bidirectional Transformer as the encoder to encode the input vectors. Specifically, the BERT model splits the first character sequence matrix by characters; data exceeding max_length is truncated, and sequences shorter than max_length are padded with [PAD]. Here, max_length may refer to the row or column length of the first character sequence matrix and is set according to specific needs. The matrix is then labeled, marking the beginning, middle and end of the sentence, the characters or words, and so on, so as to memorize the structure of the text to be recognized. Further, to train a deep bidirectional representation, a cloze-style masked prediction task can be performed: samples are constructed by randomly masking x% of the text to be recognized, the sequence matrix of the masked sample (with some masked positions replaced by random words) together with the original sequence matrix is fed to an output softmax (logistic regression) layer, and the masked content is then predicted. For example, assume the original sentence is "my dog is hairy" and 15% of the token positions in the sentence are randomly selected for masking; suppose the fourth token, "hairy", is selected. Since the whole prediction process takes some time and each sample is repeatedly input into the model over multiple epochs, the text corresponding to the input matrix is handled as follows:
80% of the time: replace the target word with [MASK], for example: my dog is hairy --> my dog is [MASK].
10% of the time: replace the target word with a random word, for example: my dog is hairy --> my dog is apple.
10% of the time: leave the target word unchanged, for example: my dog is hairy --> my dog is hairy.
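The 80/10/10 masking scheme above can be sketched as follows (a minimal Python sketch for illustration; `mask_tokens` and the tiny stand-in `vocab` are assumptions, not the patent's implementation — a real model draws random replacements from its full vocabulary):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, vocab=("apple", "book", "cat")):
    """Apply the BERT-style 80/10/10 masking scheme to a token list."""
    out = list(tokens)
    labels = [None] * len(tokens)          # only masked positions get a label
    for i, tok in enumerate(tokens):
        if random.random() < mask_prob:
            labels[i] = tok                # the model must predict the original
            r = random.random()
            if r < 0.8:
                out[i] = "[MASK]"          # 80%: replace with [MASK]
            elif r < 0.9:
                out[i] = random.choice(vocab)  # 10%: random word
            # remaining 10%: keep the original token unchanged
    return out, labels
```

Because even the unchanged 10% still carry a prediction label, the encoder can never rely on the surface token at a masked position and must use the context.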
Thus, the Transformer encoder in the BERT model has to maintain a distributional contextual representation for every input token. That is, if the Transformer encoder already knew which word is to be predicted, the learning of context information would be lost; since the encoder cannot tell during training which word is to be predicted, it must judge the masked word by learning the information in the token's context, and such a model therefore has feature expression capability for the sentence.
The timing information and the position information of the samples in the above method can be characterized in the BERT model by the following formulas:

PE(pos, 2i) = sin(pos / 10000^(2i/d_model))

PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))

where pos denotes the position index of the word in the matrix, i denotes the dimension, 2i denotes the even dimensions therein, 2i+1 denotes the odd dimensions therein, and d_model is the encoding dimension; the even positions are encoded using a sine and the odd positions using a cosine.
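A minimal sketch of this sinusoidal position encoding (the function name and the pure-Python nested-list output are illustrative assumptions):

```python
import math

def positional_encoding(max_len, d_model):
    """Sinusoidal position encoding: sine at even dimensions, cosine at odd."""
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)          # even index 2i -> sine
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd index 2i+1 -> cosine
    return pe
```

Each position thus gets a unique, deterministic vector that can simply be added to the token embeddings.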
Here, the self-attention layer multiplies the input matrix by three parameter matrices, Wq, Wk and Wv, respectively, to transform it into query, key and values. The Transformer encoder performs linear transformations on the input matrix with Wq and Wk to obtain the matrix Q corresponding to the Query and the matrix K corresponding to the Key, and performs a linear transformation with Wv to obtain the matrix V corresponding to the Values. Softmax normalization is then performed by the following formula, processing the scores to lie between 0 and 1; these correlations are used to adjust the importance of each word and obtain a new expression for each word:

Attention(Q, K, V) = softmax(QK^T / √dk)V

where dk is the dimension of the key vectors and the scaling by √dk keeps the dot products in a stable range.
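The softmax-normalized attention described above can be sketched as follows (a NumPy sketch under assumed 2-D shapes; the function name is illustrative, not the patent's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V — the normalization maps scores into (0, 1)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ V, weights
```

The weight matrix returned alongside the output makes it easy to inspect how much each word attends to every other word.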
Then, in order to increase the semantic information of the output first word feature matrix, the Transformer's "multi-head" mode is adopted, with the formula:
MultiHead(Q,K,V)=Concat(head1,head2,…,headk)Wo (1)
As in formula (5), different attention results for Q, K and V are obtained by changing the W parameters (the three matrices Wq, Wk, Wv), and each such result serves as one head. In formula (1), the k heads are concatenated and then linearly transformed once with the parameter matrix Wo, and the resulting value is the multi-head attention result, i.e., MultiHead(Q, K, V). Finally, a fitting calculation is performed on MultiHead(Q, K, V) through a fully connected feed-forward network, with the formula:
FFN(Z) = max(0, ZW1 + b1)W2 + b2

where the output of FFN(Z) is X, i.e., the first word feature vector; Z is the input MultiHead(Q, K, V); W1 and W2 are the parameter matrices and b1 and b2 the bias vectors of the fully connected feed-forward network. The second training model is a CNN model, which further extracts the features of the first word sequence matrix.
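The multi-head concatenation of formula (1) and the feed-forward fitting can be sketched together (an illustrative NumPy sketch; the per-head parameter tuples and the randomly initialized `Wo` projection are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def attention(Q, K, V):
    """Plain scaled dot-product attention used inside each head."""
    s = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(s - s.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def multi_head(X, heads):
    """Each head applies its own (Wq, Wk, Wv); the results are concatenated
    and projected with Wo, as in MultiHead(Q,K,V) = Concat(head1..headk)Wo."""
    outs = [attention(X @ Wq, X @ Wk, X @ Wv) for Wq, Wk, Wv in heads]
    concat = np.concatenate(outs, axis=-1)
    Wo = rng.standard_normal((concat.shape[-1], X.shape[-1]))
    return concat @ Wo

def ffn(Z, W1, b1, W2, b2):
    """Position-wise feed-forward: FFN(Z) = max(0, Z W1 + b1) W2 + b2."""
    return np.maximum(0.0, Z @ W1 + b1) @ W2 + b2
```

With identity head parameters the sketch reduces to k copies of plain attention, which makes the shapes easy to verify.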
Finally, the matrix Ci output by the BERT model and the matrix Mi output by the CNN model are summed to obtain the first word fusion feature matrix Ri.
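The fusion step can be sketched as an element-wise sum (illustrative; `fuse` and the shape check are assumptions — the same-dimension processing of the two models guarantees the shapes match):

```python
import numpy as np

def fuse(C, M):
    """Element-wise sum of the BERT character features C_i and the CNN word
    features M_i; both models must be configured to the same output dimension."""
    assert C.shape == M.shape, "same-dimension processing must be applied first"
    return C + M
```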
The embodiment of the application provides a named entity identification method, wherein the third training model comprises a bidirectional long short-term memory network (Bi-directional Long Short-Term Memory, BiLSTM) model and a self-attention mechanism model. Processing the first word fusion feature matrix through the third training model to obtain the named entity recognition result of the text to be recognized comprises: processing the first word fusion feature matrix through the BiLSTM model to increase semantic information of the text to be recognized corresponding to the first word fusion feature matrix, obtaining a first word feature matrix;
Here, the first word fusion feature matrix Ri is processed by the BiLSTM model. Fig. 3 shows the internal structure of the BiLSTM model, which mainly comprises three gates, namely a forgetting gate (Forget gate), an input gate (Input gate) and an output gate (Output gate); the cell in the middle is called the memory cell and is used for storing the current memory state.
(1) Forgetting gate: the function of the forgetting gate is to determine which information in the memory cell is discarded. A sigmoid activation function normalizes the value, setting the weight to a value between 0 and 1; its data come from the current input, the hidden layer state at the previous moment and the memory cell at the previous moment. The forward propagation formula is:

ft = sigmoid(Wf·[ht-1, xt] + bf)

ft takes a value of 0 or 1, with 0 indicating complete discarding and 1 complete retention. ft is the output of the forgetting gate layer at time t, ht-1 is the hidden layer output vector at time t-1, xt is the input at time t, Wf is the weight matrix applied to the input in the forgetting gate, and bf is its bias vector.
(2) Input gate: the input gate determines what new content needs to be added. A sigmoid activation function performs the normalization, and a tanh function then creates a new candidate value vector Zt. The forward propagation formulas are:

it = σ(Wi·[ht-1, xt] + bi)

Zt = tanh(Wc·[ht-1, xt] + bc)

it takes a value of 0 or 1, with 0 indicating that the current content is not added and 1 indicating that it is newly added. it is the output of the input gate layer at time t, Wi is the weight matrix applied to the input in the input gate, and bi its bias vector; Wc is the weight matrix used to produce the candidate, bc its bias vector, and Zt is the candidate vector generated at time t. It should be noted that the data input into the forgetting gate and the input gate are the same; what distinguishes the two are their respective weight matrices and biases.
(3) Memory cell: the memory cell stores the memorized content. Once it has been determined whether the past memory is retained at the current moment (the value of ft) and whether the new content is remembered (the value of it), the memory cell is updated; that is, after the candidate vector is determined, a state update is performed based on the previously obtained outputs of the forgetting gate and the input gate, where Ct-1 is the state vector at time t-1 and Ct the state vector at time t. The formula is:

Ct = ft * Ct-1 + it * Zt
The cell update formula can be understood as follows: Ct-1 represents what the LSTM model remembers at time t-1. At time t, two questions arise: should the model continue to remember the previous (time t-1) content, and should it remember the new content? There are therefore four cases:

I. When ft = 0 and it = 0, Ct = 0: forget all past content and do not remember the new content;

II. When ft = 0 and it = 1, Ct = Zt: forget all past content but remember the new content;

III. When ft = 1 and it = 0, Ct = Ct-1: retain the previous content and ignore the new content;

IV. When ft = 1 and it = 1, Ct = Ct-1 + Zt: both retain the previous content and remember the new content.
In practice the sigmoid function is not binary (its output lies between 0 and 1), so ft and it actually decide how much of the past content to keep remembering and how much of the new content to remember, respectively; for example, ft = 1 means all past content is retained, while ft = 0.5 means half of the past content is forgotten, i.e. the past memory fades.
(4) Output gate: the output gate determines what content is output. For the current time t, Ot = 0 indicates no output and Ot = 1 indicates output; a third sigmoid function determines which part of the information needs to be output. The memory is then processed through a tanh function to obtain a value between -1 and 1, which is multiplied by the output of the sigmoid function to give the final output:

Ot = σ(Wo·[ht-1, xt] + bo)

ht = Ot * tanh(Ct)

Here, tanh(Ct) processes the content memorized in the memory cell at the current moment so that its value lies between -1 and 1. Ot is the output at time t, Wo is the weight matrix applied to the input in the output gate, bo is its bias vector, and ht is the hidden layer vector at time t. Thanks to these three gating mechanisms, the LSTM can effectively handle the long-term dependence problem and, to a certain extent, alleviates the problems of vanishing and exploding gradients.
The principle of the LSTM model can thus be summarized: at time t, first judge whether the past memory content is retained, then judge whether new content needs to be added, and finally, after the memory cell is updated, judge whether the content at the current moment needs to be output.
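The gate equations above can be sketched as one LSTM time step (an illustrative NumPy sketch; stacking the four gate pre-activations in a single matrix W is an implementation assumption, not the patent's code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W maps the concatenated [h_{t-1}, x_t] to the four
    gate pre-activations (forget, input, candidate, output) stacked together."""
    d = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0:d])            # forget gate f_t
    i = sigmoid(z[d:2*d])          # input gate i_t
    z_t = np.tanh(z[2*d:3*d])      # candidate vector Z_t
    o = sigmoid(z[3*d:4*d])        # output gate O_t
    c = f * c_prev + i * z_t       # C_t = f_t*C_{t-1} + i_t*Z_t
    h = o * np.tanh(c)             # h_t = O_t * tanh(C_t)
    return h, c
```

With all-zero weights, every gate evaluates to 0.5 and the cell simply halves its previous state, which matches the "fading memory" reading of ft = 0.5 above.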
Finally, by the above method, as shown in fig. 4, two opposite LSTM layers, forward and backward, are set up: the forward LSTM layer processes the sequence in order and its output represents the past information, while the backward LSTM layer processes the sequence in reverse and its output represents the future information. Combining the forward and backward outputs gives the output of the BiLSTM layer, i.e. the first word feature matrix. In this way, the BiLSTM model increases the semantic information contained in the first word fusion feature matrix corresponding to the text to be recognized and improves the reliability and accuracy of that semantic information.
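The forward/backward combination can be sketched as follows (illustrative; `step_fn` is a stand-in for a full LSTM step, and per-position concatenation is assumed as the way of combining the two directions):

```python
import numpy as np

def bilstm(sequence, step_fn, d):
    """Run a (stand-in) recurrent step forward and backward over the sequence
    and concatenate both hidden states per position: past + future context."""
    def run(seq):
        h = np.zeros(d)
        outs = []
        for x in seq:
            h = step_fn(x, h)
            outs.append(h)
        return outs
    fwd = run(sequence)                 # left-to-right: past information
    bwd = run(sequence[::-1])[::-1]     # right-to-left: future information
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
```

Each output position thus has dimension 2d, carrying context from both directions of the sentence.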
Then, the first word feature matrix is further processed through the self-attention mechanism model to increase the weight of the corresponding named entities in the first word feature matrix, obtaining a second word feature matrix;

that is, the first word feature matrix Gi is input into the self-attention formula to obtain the output second word feature matrix Xi. Finally, the named entity recognition result of the text to be recognized is obtained according to the second word feature matrix Xi. In this way, the self-attention mechanism model increases the weight of the named entities in the first word feature matrix Gi, so that the named entities are more prominent in the second word feature matrix Xi; when the named entities of the text to be recognized are obtained on the basis of Xi, the recognition is therefore more definite and the accuracy of the named entity recognition result is improved.
The embodiment of the application also provides a method for performing sequence optimization on the second word feature matrix Xi, wherein the third training model further comprises a CRF model (Conditional Random Field);

after processing the first word feature matrix through the self-attention mechanism to increase the weight of the corresponding named entities and obtaining the second word feature matrix, the method further comprises: performing sequence optimization on the second word feature matrix through the CRF model to obtain a third word feature matrix; and obtaining the named entity recognition result of the text to be recognized in the optimal arrangement order according to the third word feature matrix. For the second word feature matrix, assume that K is the score matrix output through the self-attention mechanism, with size n × k, where n is the number of words, k is the number of labels, and Kij is the score of the j-th label of the i-th word. For a predicted sequence Y = (y1, y2, …, yn), its score function is obtained:

s(X, Y) = Σi A[yi, yi+1] + Σi K[i, yi]
where A represents the transition score matrix and Aij represents the score of label i transitioning to label j. The probability of generating the prediction sequence Y is:

P(Y|X) = exp(s(X, Y)) / ΣY′∈YX exp(s(X, Y′))
Taking logarithms on both sides gives the likelihood function of the prediction sequence:

log P(Y|X) = s(X, Y) − log ΣY′∈YX exp(s(X, Y′))
In the formula, Y′ represents a candidate tag sequence and YX represents the set of all possible tag sequences. The final output optimal sequence is:

Y* = argmaxY′∈YX s(X, Y′)
Therefore, by performing sequence optimization on the second word feature matrix through the CRF model, the sequence of named entities obtained by recognizing the text to be recognized is the optimal sequence Y*, which improves the accuracy of the named entity recognition result.
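The optimal sequence Y* can be recovered with Viterbi decoding, the standard decoder for a CRF layer (the patent does not spell out the decoder, so this NumPy sketch is an assumption):

```python
import numpy as np

def viterbi(K, A):
    """Find the highest-scoring label sequence given emission scores
    K (n x k, K[i][j] = score of label j at word i) and transition scores
    A (k x k, A[i][j] = score of moving from label i to label j)."""
    n, k = K.shape
    score = K[0].copy()                 # best score ending in each label
    back = np.zeros((n, k), dtype=int)  # backpointers for path recovery
    for t in range(1, n):
        total = score[:, None] + A + K[t][None, :]  # all prev->cur combos
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(n - 1, 0, -1):       # walk the backpointers
        path.append(int(back[t][path[-1]]))
    return path[::-1], float(score.max())
```

Dynamic programming makes the search linear in sentence length instead of exponential in the number of candidate sequences.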
The embodiment of the application also provides a model training method, which comprises the following steps: inputting a second character sequence matrix of a sample text into the first training model to obtain a second character feature matrix of the sample text, wherein the first training model is a trained model; inputting a second word sequence matrix of the sample text into an initial second training model to obtain a second word feature matrix of the sample text, the dimension of the second character feature matrix being the same as the dimension of the second word feature matrix; processing the second character feature matrix and the second word feature matrix to obtain a second word fusion feature matrix; processing the second word fusion feature matrix through an initial third training model to obtain a second named entity recognition result of the sample text; and if the second named entity recognition result does not meet the set condition, adjusting the second training model and the third training model according to the second named entity recognition result. That is, when the text to be recognized is to be recognized through the combined model of the first training model, the second training model and the third training model, the sample text may first be recognized through the trained first training model together with the untrained second and third training models, and the relevant parameters of the second and third training models are continuously adjusted during this recognition of the sample text, until the training of the combined model formed by the first training model, the second training model and the third training model is finished.
Based on the above method flow, an embodiment of the present application provides a flow of a named entity identification method, as shown in fig. 5, including:
Step 504: inputting the second word fusion feature matrix into the untrained third training model to obtain a second named entity recognition result of the sample text.

Step 505: adjusting the relevant parameters of the second training model and the third training model according to the second named entity recognition result, and re-executing steps 501 to 505 until the obtained second named entity recognition result reaches a preset accuracy rate.

Step 508: performing same-dimension processing on the first character feature matrix and the first word feature matrix to obtain a first word fusion feature matrix.

Step 509: inputting the first word fusion feature matrix into the third training model to obtain the named entity recognition result.
It should be noted that, in the above-mentioned flow, steps 501 to 504 are to train the second training model and the third training model through the trained first training model to obtain a mature combined model of the first training model, the second training model and the third training model. Steps 501 to 504 in the above flow may be executed in a loop until it is determined that the recognition accuracy of the combined model of the current first training model, the current second training model, and the current third training model reaches the required accuracy. And executing the steps 506 to 509 according to the models obtained in the steps 501 to 505 to obtain the named entity recognition result of the text to be recognized.
The accuracy of the recognition results of several types of named entities under the above method is provided here, including activity name (activity_name), address (address), index data (data), organization name (organization_name) and time (time). The evaluation indexes used in the application are precision (P), recall (R) and the F1 value. The specific formulas are:

P = correct / (correct + spurious)

R = correct / (correct + missing)

F1 = 2PR / (P + R)
where P is the ratio of named entities correctly labeled by the method to the total number of entities recognized in the text to be recognized; correct is the number of entities labeled correctly; spurious is the number of entities recognized in error; missing is the number of correct entities that were not recognized; R is the ratio of correctly labeled named entities to the total number of entities in the test set; F1 is the harmonic mean of P and R. The application also provides the precision (P), recall (R) and F1 value of the named entities recognized in a government affairs report, as shown in the following table:
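The P, R and F1 computation can be sketched as follows (illustrative; the count names follow the definitions above):

```python
def entity_prf(correct, spurious, missing):
    """Precision, recall and F1 from entity counts:
    correct  = entities labeled correctly,
    spurious = entities produced by the system that are wrong (false positives),
    missing  = gold entities the system failed to find (false negatives)."""
    p = correct / (correct + spurious) if correct + spurious else 0.0
    r = correct / (correct + missing) if correct + missing else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```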
based on the same concept, an embodiment of the present invention provides a named entity recognition apparatus, and fig. 6 is a schematic diagram of the named entity recognition apparatus provided in the embodiment of the present application, as shown in fig. 6, including:
an obtaining module 601, configured to input a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized, and to input a first word sequence matrix of the text to be recognized into a second training model to obtain a first word feature matrix of the text to be recognized; the dimension of the first character feature matrix is the same as the dimension of the first word feature matrix;
a processing module 602, configured to process the first character feature matrix and the first word feature matrix to obtain a first word fusion feature matrix, and to process the first word fusion feature matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
Optionally, the processing module 602 is further configured to: setting a first parameter of the first training model, wherein the first parameter is used for acquiring the first character feature matrix with a preset dimensionality; before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, the method further comprises the following steps: and setting a second parameter of the second training model, wherein the second parameter is used for acquiring the first word feature matrix of the preset dimensionality.
Optionally, the processing module 602 is further configured to: determine a first word vector corresponding to each word of the text to be recognized in a first mode, the first word vector of each word constituting the first word sequence matrix; determine a second word vector corresponding to each word of the text to be recognized in a second mode, the first mode being different from the second mode; perform word segmentation on the text to be recognized to obtain each word segment of the text to be recognized; and perform same-dimension processing on the second word vectors of the words in each word segment to determine the word vector of each word segment, thereby obtaining the first word sequence matrix.
Optionally, the first training model is a BERT model (Bidirectional Encoder Representations from Transformers); the second training model is a CNN model (Convolutional Neural Network).
Optionally, the third training model comprises a bidirectional long short-term memory network (Bi-directional Long Short-Term Memory, BiLSTM) model and a self-attention mechanism model; the processing module 602 is specifically configured to: process the first word fusion feature matrix through the BiLSTM model to increase semantic information of the text to be recognized corresponding to the first word fusion feature matrix, obtaining a first word feature matrix; process the first word feature matrix through the self-attention mechanism model to increase the weight of the corresponding named entities in the first word feature matrix, obtaining a second word feature matrix; and obtain the named entity recognition result of the text to be recognized according to the second word feature matrix.
Optionally, the third training model further comprises a CRF model (Conditional Random Field); the processing module 602 is further configured to: perform sequence optimization on the second word feature matrix through the CRF model to obtain a third word feature matrix; and obtain the named entity recognition result of the text to be recognized in the optimal arrangement order according to the third word feature matrix.
Optionally, the processing module 602 is further configured to: input a second character sequence matrix of a sample text into the first training model to obtain a second character feature matrix of the sample text, wherein the first training model is a trained model; input a second word sequence matrix of the sample text into an initial second training model to obtain a second word feature matrix of the sample text, the dimension of the second character feature matrix being the same as the dimension of the second word feature matrix; process the second character feature matrix and the second word feature matrix to obtain a second word fusion feature matrix; process the second word fusion feature matrix through an initial third training model to obtain a second named entity recognition result of the sample text; and if the second named entity recognition result does not meet the set condition, adjust the second training model and the third training model according to the second named entity recognition result.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
Claims (10)
1. A named entity recognition method, comprising:
inputting a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized;
inputting the first word sequence matrix of the text to be recognized into a second training model to obtain a first word characteristic matrix of the text to be recognized; the dimension of the first character feature matrix is the same as the dimension of the first word feature matrix;
processing the first character feature matrix and the first word feature matrix to obtain a first word fusion feature matrix;
and processing the first word fusion characteristic matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
2. The method of claim 1, wherein before inputting the first character sequence matrix of the text to be recognized into the first training model to obtain the first character feature matrix of the text to be recognized, the method further comprises:
setting a first parameter of the first training model, wherein the first parameter is used for acquiring the first character feature matrix with a preset dimensionality;
before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, the method further comprises the following steps:
and setting a second parameter of the second training model, wherein the second parameter is used for acquiring the first word feature matrix of the preset dimensionality.
3. The method of claim 1, wherein before inputting the first word sequence matrix of the text to be recognized into the second training model to obtain the first word feature matrix of the text to be recognized, the method further comprises:
determining a first word vector corresponding to each word of the text to be recognized in a first mode; the first word vector of each word constitutes the first word sequence matrix;
determining a second word vector corresponding to each word of the text to be recognized in a second mode; the first mode is different from the second mode;
performing word segmentation on the text to be recognized to obtain each word segment of the text to be recognized;

and carrying out same-dimension processing on the second word vector of each word in each word segment, and determining the word vector of each word segment, so as to obtain the first word sequence matrix.
4. The method of claim 1, wherein the first training model is a BERT model (Bidirectional Encoder Representations from Transformers); the second training model is a CNN model (Convolutional Neural Network).
5. The method of any of claims 1 to 4, wherein the third training model comprises a bidirectional long short-term memory network (Bi-directional Long Short-Term Memory, BiLSTM) model and a self-attention mechanism model;
processing the first word fusion feature matrix through a third training model to obtain a named entity recognition result of the text to be recognized, wherein the named entity recognition result comprises the following steps:
processing the first word fusion feature matrix through the BiLSTM model to increase semantic information of the text to be recognized corresponding to the first word fusion feature matrix to obtain a first word feature matrix;
processing the first word feature matrix through the self-attention mechanism model to increase the weight of the corresponding named entity in the first word feature matrix to obtain a second word feature matrix;
and acquiring the named entity recognition result of the text to be recognized according to the second word feature matrix.
6. The method of claim 5, wherein the third training model further comprises a CRF model (conditional random field);
processing the first word feature matrix through the attention mechanism to increase the weight of the corresponding named entity in the first word feature matrix, and after obtaining a second word feature matrix, the method further includes:
performing sequence optimization on the second word characteristic matrix through the CRF model to obtain a third word characteristic matrix;
and acquiring the named entity recognition result of the text to be recognized in the optimal arrangement order according to the third word feature matrix.
7. The method of any of claims 1-6, further comprising:
inputting a second character sequence matrix of a sample text into the first training model to obtain a second character feature matrix of the sample text, wherein the first training model is a trained model;
inputting a second word sequence matrix of the sample text into an initial second training model to obtain a second word feature matrix of the sample text; the dimension of the second character feature matrix is the same as the dimension of the second word feature matrix;
processing the second character feature matrix and the second word feature matrix to obtain a second word fusion feature matrix;
processing the second word fusion feature matrix through an initial third training model to obtain a second named entity recognition result of the sample text;
and if the second named entity recognition result does not meet the set condition, adjusting a second training model and a third training model according to the second named entity recognition result.
8. An apparatus for named entity recognition, the apparatus comprising:
the acquisition module is used for inputting a first character sequence matrix of a text to be recognized into a first training model to acquire a first character feature matrix of the text to be recognized; inputting the first word sequence matrix of the text to be recognized into a second training model to obtain a first word characteristic matrix of the text to be recognized; the dimension of the first character feature matrix is the same as the dimension of the first word feature matrix;
the processing module is used for processing the first character feature matrix and the first word feature matrix to obtain a first word fusion feature matrix; and processing the first word fusion characteristic matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
9. A computer-readable storage medium, characterized in that the storage medium stores a program which, when run on a computer, causes the computer to carry out the method of any one of claims 1 to 7.
10. A computer device, comprising:
a memory for storing a computer program;
a processor for calling a computer program stored in said memory to execute the method of any of claims 1 to 7 in accordance with the obtained program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011039983.1A CN112115721B (en) | 2020-09-28 | 2020-09-28 | Named entity recognition method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112115721A | 2020-12-22 |
CN112115721B | 2024-05-17 |
Family
ID=73798679
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011039983.1A | Named entity recognition method and device | 2020-09-28 | 2020-09-28 |
Country Status (1)
Country | Link |
---|---|
CN | CN112115721B |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017162134A1 * | 2016-03-22 | 2017-09-28 | Sony Corporation | Electronic device and method for text processing |
CN108536679A * | 2018-04-13 | 2018-09-14 | Tencent Technology (Chengdu) Co., Ltd. | Named entity recognition method, apparatus, device and computer-readable storage medium |
CN111191453A * | 2019-12-25 | 2020-05-22 | The 15th Research Institute of China Electronics Technology Group Corporation | Named entity recognition method based on adversarial training |
CN111310470A * | 2020-01-17 | 2020-06-19 | Xi'an Jiaotong University | Chinese named entity recognition method fusing character and word features |
WO2020133039A1 * | 2018-12-27 | 2020-07-02 | Shenzhen UBTech Technology Co., Ltd. | Entity recognition method and apparatus in dialogue corpus, and computer device |
- 2020-09-28: Application CN202011039983.1A filed in China; granted as CN112115721B (status: active)
Non-Patent Citations (2)
Title |
---|
Xie Teng; Yang Jun'an; Liu Hui: "Chinese Entity Recognition Based on the BERT-BiLSTM-CRF Model", Computer Systems & Applications, no. 07 * |
Zhao Ping; Sun Lianying; Wan Ying; Ge Na: "Chinese Scenic Spot Named Entity Recognition Based on BERT+BiLSTM+CRF", Computer Systems & Applications, no. 06 * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112699683A * | 2020-12-31 | 2021-04-23 | Datang Ronghe Communications Co., Ltd. | Named entity identification method and device fusing neural network and rule |
CN112487820A * | 2021-02-05 | 2021-03-12 | Nanjing University of Posts and Telecommunications | Chinese medical named entity recognition method |
CN112487820B * | 2021-02-05 | 2021-05-25 | Nanjing University of Posts and Telecommunications | Chinese medical named entity recognition method |
CN112802570A * | 2021-02-07 | 2021-05-14 | Chengdu Yanhua Western Health Medical Information Industry Research Institute Co., Ltd. | Named entity recognition system and method for electronic medical record |
CN112949310A * | 2021-03-01 | 2021-06-11 | AInnovation (Shanghai) Technology Co., Ltd. | Model training method, traditional Chinese medicine name recognition method and device and network model |
CN113051500A * | 2021-03-25 | 2021-06-29 | Wuhan University | Phishing website identification method and system fusing multi-source data |
CN113051500B * | 2021-03-25 | 2022-08-16 | Wuhan University | Phishing website identification method and system fusing multi-source data |
CN113449524A * | 2021-04-01 | 2021-09-28 | Shandong Yingxin Computer Technology Co., Ltd. | Named entity identification method, system, equipment and medium |
CN112989834A * | 2021-04-15 | 2021-06-18 | Hangzhou Yizhi Intelligent Technology Co., Ltd. | Named entity identification method and system based on flat grid enhanced linear converter |
CN113268538A * | 2021-05-17 | 2021-08-17 | Harbin Institute of Technology (Weihai) | Complex equipment fault tracing method and system based on domain knowledge graph |
CN114417873A * | 2022-01-17 | 2022-04-29 | iSoftStone Information Technology (Group) Co., Ltd. | Few-sample entity identification method, device, medium and equipment |
CN114970666A * | 2022-03-29 | 2022-08-30 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Spoken language processing method and device, electronic equipment and storage medium |
CN114970666B * | 2022-03-29 | 2023-08-29 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Spoken language processing method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112115721B | 2024-05-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112115721B (en) | Named entity recognition method and device | |
CN111444726B (en) | Chinese semantic information extraction method and device based on long-short-term memory network of bidirectional lattice structure | |
CN111738003B (en) | Named entity recognition model training method, named entity recognition method and medium | |
CN111310471B (en) | Travel named entity identification method based on BBLC model | |
CN111985239B (en) | Entity identification method, entity identification device, electronic equipment and storage medium | |
Badjatiya et al. | Attention-based neural text segmentation | |
CN112270193A (en) | Chinese named entity identification method based on BERT-FLAT | |
CN112487820B (en) | Chinese medical named entity recognition method | |
CN111966812B (en) | Automatic question answering method based on dynamic word vector and storage medium | |
CN111145914B (en) | Method and device for determining text entity of lung cancer clinical disease seed bank | |
CN114385803B (en) | Extraction type reading understanding method based on external knowledge and fragment selection | |
CN114548101A (en) | Event detection method and system based on backtracking sequence generation method | |
CN113641809A (en) | XLNET-BiGRU-CRF-based intelligent question answering method | |
CN115600597A (en) | Named entity identification method, device and system based on attention mechanism and intra-word semantic fusion and storage medium | |
CN113836891A (en) | Method and device for extracting structured information based on multi-element labeling strategy | |
CN112685561A (en) | Small sample clinical medical text post-structuring processing method across disease categories | |
CN111914553A (en) | Financial information negative subject judgment method based on machine learning | |
CN115238026A (en) | Medical text subject segmentation method and device based on deep learning | |
CN114330328A (en) | Tibetan word segmentation method based on Transformer-CRF | |
CN114239584A (en) | Named entity identification method based on self-supervision learning | |
CN116362242A (en) | Small sample slot value extraction method, device, equipment and storage medium | |
CN112733526B (en) | Extraction method for automatically identifying tax collection object in financial file | |
Wu et al. | Analyzing the Application of Multimedia Technology Assisted English Grammar Teaching in Colleges | |
CN114925695A (en) | Named entity identification method, system, equipment and storage medium | |
CN114372467A (en) | Named entity extraction method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||