CN112115721A - Named entity identification method and device - Google Patents

Named entity identification method and device

Info

Publication number
CN112115721A
CN112115721A CN202011039983.1A
Authority
CN
China
Prior art keywords
word
text
matrix
feature matrix
recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011039983.1A
Other languages
Chinese (zh)
Other versions
CN112115721B (en)
Inventor
于腾
葛通
李晓雨
孙凯
徐文权
潘汉祺
胡永利
申彦明
陈维强
孙永良
于涛
王玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense TransTech Co Ltd
Original Assignee
Hisense TransTech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense TransTech Co Ltd filed Critical Hisense TransTech Co Ltd
Priority to CN202011039983.1A priority Critical patent/CN112115721B/en
Publication of CN112115721A publication Critical patent/CN112115721A/en
Application granted granted Critical
Publication of CN112115721B publication Critical patent/CN112115721B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Machine Translation (AREA)
  • Character Discrimination (AREA)

Abstract

The embodiment of the invention provides a named entity identification method and device, wherein the method comprises the following steps: inputting a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized; inputting a first word sequence matrix of the text to be recognized into a second training model to obtain a first word feature matrix of the text to be recognized, the dimension of the first character feature matrix being the same as the dimension of the first word feature matrix; processing the first character feature matrix and the first word feature matrix to obtain a first character-word fusion feature matrix; and processing the first character-word fusion feature matrix through a third training model to obtain a named entity recognition result of the text to be recognized. Because the character feature matrix and the word feature matrix are fused, and the fused character-word feature matrix is then further processed, the accuracy of the recognition result for the text to be recognized is improved.

Description

Named entity identification method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a named entity identification method and apparatus.
Background
Named Entity Recognition (NER), also called "proper name recognition", refers to recognizing entities with specific meanings in text, mainly including names of people, places and organizations, proper nouns, and so on. NER marks the position and type of the relevant entities in a piece of natural language text and extracts the required entities, such as organization names, person names, or identifiers of diseases and symptoms in the medical field. It is widely applied in tasks such as knowledge graph construction, information extraction, information retrieval, machine translation, automatic question answering and public opinion monitoring, and is a foundation of natural language processing.
NER generally uses sequence labeling to locate entity boundaries and determine entity types. However, the first step of NER is to determine the boundaries of words, i.e., segmentation, and Chinese text has no boundary markers such as the spaces that explicitly delimit words in English text. Chinese also contains special entity types beyond those defined in English, such as transliterated foreign person names and place names, and Chinese words are often polysemous. Existing Chinese named entity recognition methods therefore still have certain limitations.
In the prior art, characters, roots or words are mapped into single vectors, and NER is realized through training models such as a Convolutional Neural Network (CNN) or a Long Short-Term Memory network (LSTM). However, in order to strengthen the correlations between characters, or between characters and words, a great deal of manual intervention is needed to construct character or word features, which is time-consuming and labor-intensive. It is also difficult for such methods to guarantee the accuracy of named entity recognition in practice. Especially for sentences with long entities, identifying the entity boundaries is more difficult, resulting in lower accuracy of named entity recognition.
Therefore, there is a need for a method and an apparatus for identifying a named entity, which can improve the accuracy of identifying the named entity.
Disclosure of Invention
The embodiment of the invention provides a named entity identification method and device, which can improve the accuracy of named entity identification.
In a first aspect, an embodiment of the present invention provides a method for identifying a named entity, where the method includes:
inputting a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized; inputting a first word sequence matrix of the text to be recognized into a second training model to obtain a first word feature matrix of the text to be recognized, the dimension of the first character feature matrix being the same as the dimension of the first word feature matrix; processing the first character feature matrix and the first word feature matrix to obtain a first character-word fusion feature matrix; and processing the first character-word fusion feature matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
In the method, a character feature matrix and a word feature matrix are obtained using the first training model and the second training model. Because the dimension of the character feature matrix is the same as that of the word feature matrix, the two matrices can be fused directly: on the one hand, this improves the accuracy of the recognition result for the text to be recognized; on the other hand, it avoids the excessively high dimensionality that would result from concatenating the character feature matrix and the word feature matrix, which can cause gradient explosion and reduce the running efficiency of the model. Processing the character-word fusion feature matrix through the third training model further improves the accuracy of the recognition result, as sketched below.
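A minimal sketch of the fusion step, using NumPy; the two feature matrices are random stand-ins for the outputs of the first and second training models, and all names and sizes are illustrative assumptions rather than the patent's implementation:

```python
import numpy as np

def fuse(char_features: np.ndarray, word_features: np.ndarray) -> np.ndarray:
    # Both matrices must share the same shape (seq_len x dim) so that
    # element-wise addition -- rather than concatenation -- can fuse them
    # without inflating the dimensionality.
    assert char_features.shape == word_features.shape
    return char_features + word_features  # character-word fusion feature matrix

C = np.random.randn(8, 128)  # first character feature matrix, 8 tokens x 128 dims
M = np.random.randn(8, 128)  # first word feature matrix, same dimensions
R = fuse(C, M)               # would then be passed to the third training model
```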
Optionally, before the first character sequence matrix of the text to be recognized is input into the first training model to obtain the first character feature matrix of the text to be recognized, the method further includes: setting a first parameter of the first training model, wherein the first parameter is used for obtaining the first character feature matrix with a preset dimensionality; and before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, the method further includes: setting a second parameter of the second training model, wherein the second parameter is used for obtaining the first word feature matrix with the preset dimensionality.
In the method, the first parameter is set for the first training model and the second parameter is set for the second training model, so that the obtained character feature matrix and word feature matrix have the same dimension. This facilitates the fusion of the first character feature matrix and the first word feature matrix, and thereby helps to improve the accuracy of named entity recognition.
Optionally, before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, the method further includes: determining, in a first manner, a first character vector corresponding to each character of the text to be recognized, the first character vectors of the characters constituting the first character sequence matrix; determining, in a second manner different from the first manner, a second character vector corresponding to each character of the text to be recognized; performing word segmentation on the text to be recognized to obtain the segmented words of the text to be recognized; and carrying out same-dimension processing on the second character vectors of the characters in each segmented word to determine the word vector of each segmented word, so as to obtain the first word sequence matrix.
In the method, the first character vectors and second character vectors of the text to be recognized are determined in the first manner and the second manner respectively. Then, according to the segmented words of the text to be recognized and their corresponding second character vectors, the second character vectors of the characters in each segmented word are subjected to same-dimension processing to determine the word vector of each segmented word, from which the first word sequence matrix is obtained. The first word sequence matrix therefore contains not only the word segmentation information of the text to be recognized but also the semantic information of each character, preserving the global information of the characters and improving the accuracy of the named entity recognition result. A sketch of this step follows.
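A minimal sketch of the same-dimension processing, assuming averaging is the chosen operation; the segmentation lengths and vector width are illustrative:

```python
import numpy as np

def word_sequence_matrix(second_char_vectors: np.ndarray, seg_lengths) -> np.ndarray:
    """Average the second character vectors of each segmented word
    (one possible same-dimension operation) to form its word vector."""
    rows, start = [], 0
    for n in seg_lengths:                      # e.g. [2, 2, 2, 2] for four two-character words
        rows.append(second_char_vectors[start:start + n].mean(axis=0))
        start += n
    return np.stack(rows)                      # one row per word, same width as a character vector

chars = np.random.randn(8, 128)                # second character vectors of an 8-character sentence
S = word_sequence_matrix(chars, [2, 2, 2, 2])  # first word sequence matrix, shape (4, 128)
```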
Optionally, the first training model is a BERT (Bidirectional Encoder Representations from Transformers) model, and the second training model is a CNN (Convolutional Neural Network) model.
Optionally, the third training model includes a bidirectional Long Short-Term Memory network (BiLSTM) model and a self-attention mechanism model. Processing the first character-word fusion feature matrix through the third training model to obtain the named entity recognition result of the text to be recognized includes: processing the first character-word fusion feature matrix through the BiLSTM model to increase the semantic information of the text to be recognized carried by the matrix, obtaining a first character-word feature matrix; processing the first character-word feature matrix through the self-attention mechanism model to increase the weight of the corresponding named entities in the matrix, obtaining a second character-word feature matrix; and obtaining the named entity recognition result of the text to be recognized according to the second character-word feature matrix.
In the method, the BiLSTM model increases the semantic information that the first character-word fusion feature matrix carries about the text to be recognized, improving the reliability and accuracy of that semantic information. Furthermore, the self-attention mechanism model increases the weight of the corresponding named entities in the first character-word feature matrix, so that the named entities occupy prominent positions in the second character-word feature matrix. When the named entities of the text to be recognized are obtained from the second character-word feature matrix, their recognition is therefore more definite, which increases the accuracy of the named entity recognition result.
Optionally, the third training model further includes a CRF (conditional random field) model. After processing the first character-word feature matrix through the self-attention mechanism model to increase the weight of the corresponding named entities and obtain the second character-word feature matrix, the method further includes: performing sequence optimization on the second character-word feature matrix through the CRF model to obtain a third character-word feature matrix; and obtaining, according to the third character-word feature matrix, the named entity recognition result of the text to be recognized in the optimal arrangement sequence.
In the method, the CRF model optimizes the sequence of the second character-word feature matrix, so that, on the premise of improving the accuracy of the recognition result, the sequence of named entities recognized from the text to be recognized is the optimal sequence.
Optionally, the method further includes: inputting a second character sequence matrix of a sample text into the first training model to obtain a second character feature matrix of the sample text, wherein the first training model is an already trained model; inputting a second word sequence matrix of the sample text into an initial second training model to obtain a second word feature matrix of the sample text, the dimension of the second character feature matrix being the same as the dimension of the second word feature matrix; processing the second character feature matrix and the second word feature matrix to obtain a second character-word fusion feature matrix; processing the second character-word fusion feature matrix through an initial third training model to obtain a second named entity recognition result of the sample text; and if the second named entity recognition result does not meet a set condition, adjusting the second training model and the third training model according to the second named entity recognition result.
In the method, the untrained second and third training models are trained with the help of the already mature first training model, which improves the accuracy of the parameters of the second and third training models and the degree to which the three models match one another. The recognition result for the text to be recognized is therefore more accurate.
In a second aspect, an embodiment of the present invention provides a named entity identifying apparatus, where the apparatus includes:
the acquisition module is used for inputting a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized, and for inputting a first word sequence matrix of the text to be recognized into a second training model to obtain a first word feature matrix of the text to be recognized, the dimension of the first character feature matrix being the same as the dimension of the first word feature matrix;
the processing module is used for processing the first character feature matrix and the first word feature matrix to obtain a first character-word fusion feature matrix, and for processing the first character-word fusion feature matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
In a third aspect, an embodiment of the present application further provides a computing device, including: a memory for storing a program; a processor for calling the program stored in said memory and executing the method as described in the various possible designs of the first aspect according to the obtained program.
In a fourth aspect, embodiments of the present application further provide a computer-readable non-transitory storage medium including a computer-readable program which, when read and executed by a computer, causes the computer to perform the method as described in the various possible designs of the first aspect.
These and other implementations of the present application will be more readily understood from the following description of the embodiments.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic diagram of an architecture for named entity recognition according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a named entity identification method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of the LSTM cell in the BiLSTM model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the BiLSTM model according to an embodiment of the present invention;
fig. 5 is a schematic flowchart of a named entity recognition method according to an embodiment of the present invention;
fig. 6 is a schematic diagram of a named entity recognition apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 shows a system architecture for named entity recognition according to an embodiment of the present invention. A text to be recognized is input into a character feature training model 101 and a word feature training model 102. The character feature training model 101 inputs the first character feature matrix of the text to be recognized into a character-word feature fusion model 103, and the word feature training model 102 inputs the first word feature matrix of the text to be recognized into the character-word feature fusion model 103. The character-word feature fusion model 103 fuses the first character feature matrix and the first word feature matrix, which have the same dimension, to obtain a first character-word fusion feature matrix, and inputs it into a global semantic training model 104. The global semantic training model 104 processes the first character-word fusion feature matrix to increase the global semantic information it carries about the text to be recognized, obtaining a first character-word feature matrix, which it inputs into a named entity weight training model 105. The named entity weight training model 105 processes the first character-word feature matrix to increase the weight of the named entities of the text to be recognized in the matrix, obtaining a second character-word feature matrix, which it inputs into a named entity sequence training model 106. The named entity sequence training model 106 processes the second character-word feature matrix to optimize the arrangement sequence of the named entities of the text to be recognized, obtaining a third character-word feature matrix, from which the named entity recognition result is obtained.
Based on this, an embodiment of the present application provides a process of a named entity identification method, as shown in fig. 2, including:
step 201, inputting a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized;
step 202, inputting the first word sequence matrix of the text to be recognized into a second training model to obtain a first word feature matrix of the text to be recognized; the dimension of the first character feature matrix is the same as the dimension of the first word feature matrix;
step 203, processing the first character feature matrix and the first word feature matrix to obtain a first character-word fusion feature matrix;
step 204, processing the first character-word fusion feature matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
In the method, a character feature matrix and a word feature matrix are obtained using the first training model and the second training model. Because the dimension of the character feature matrix is the same as that of the word feature matrix, the two matrices can be fused directly: on the one hand, this improves the accuracy of the recognition result for the text to be recognized; on the other hand, it avoids the excessively high dimensionality that would result from concatenating the character feature matrix and the word feature matrix, which can cause gradient explosion and reduce the running efficiency of the model. Processing the character-word fusion feature matrix through the third training model further improves the accuracy of the recognition result.
Before the first character sequence matrix of the text to be recognized is input into the first training model to obtain the first character feature matrix of the text to be recognized, the embodiment of the application further provides a method of obtaining matching dimensions, which includes:
setting a first parameter of the first training model, the first parameter being used to obtain the first character feature matrix with a preset dimensionality; and, before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, setting a second parameter of the second training model, the second parameter being used to obtain the first word feature matrix with the same preset dimensionality. That is, by setting the first parameter of the first training model and the second parameter of the second training model, the matrices output by the two models can be made to have the same dimension: the dimensionality of the first character feature matrix output by the first training model equals that of the first word feature matrix output by the second training model, for example as in the following sketch.
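A small illustrative sketch of such dimension matching, assuming the two encoders natively emit different widths and are projected onto a shared preset width; the sizes and the use of linear projection heads are assumptions, not the patent's prescription:

```python
import torch.nn as nn

PRESET_DIM = 256                        # the preset dimensionality shared by both matrices

# Illustrative "first parameter" / "second parameter": projection heads mapping
# each encoder's native output width onto the same preset width, so that the
# character and word feature matrices can later be fused element-wise.
char_head = nn.Linear(768, PRESET_DIM)  # e.g. on top of a 768-dim BERT output
word_head = nn.Linear(300, PRESET_DIM)  # e.g. on top of 300-dim word features
```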
The embodiment of the application further provides a method for obtaining the first character sequence matrix and the first word sequence matrix. Before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, the method further includes: determining, in a first manner, the first character vector corresponding to each character of the text to be recognized, the first character vectors constituting the first character sequence matrix; determining, in a second manner different from the first manner, the second character vector corresponding to each character of the text to be recognized; performing word segmentation on the text to be recognized to obtain its segmented words; and carrying out same-dimension processing on the second character vectors of the characters in each segmented word to determine the word vector of each segmented word, so as to obtain the first word sequence matrix. That is, the first character vectors are obtained in the first manner, and the first character sequence matrix is obtained from the first character vectors of the text to be recognized:

C_i = {c_i^1, c_i^2, …, c_i^n}

where C_i denotes the set of first character vectors of the i-th sentence and c_i^n denotes the first character vector of the n-th character of the i-th sentence. The second character vectors e_i^1, …, e_i^n are obtained in the second manner, and word segmentation is performed on the text to be recognized to obtain its segmented words, whose word vectors form

S_i = {s_i^1, s_i^2, …, s_i^m}

where S_i denotes the set of word vectors of the i-th sentence and s_i^m denotes the word vector of the m-th segmented word. If the segmented word corresponding to s_i^j consists of the characters with second character vectors e_i^p, …, e_i^q, then

s_i^j = (e_i^p + … + e_i^q) / (q − p + 1), or s_i^j = e_i^p + … + e_i^q.

The manner of obtaining the word vector from the second character vectors here may be averaging, addition, or another dimension-preserving operation; it is not specifically limited. Because the word vectors are obtained from the second character vectors, the first word sequence matrix obtained from them contains not only the word segmentation information of the text to be recognized but also the semantic information of each character, preserving the global information of the characters. The accuracy of the named entity recognition result of the text to be recognized is thereby improved.
For example, suppose the text to be recognized is "Jiangsu Suzhou disease control center" (eight characters). The first character vectors obtained in the first manner are, respectively: Jiang (159), Su (357), Zhou (489), Ji (621), Kong (741), Zhong (963), Xin (452); these first character vectors form the first character sequence matrix. The second character vectors obtained in the second manner are, respectively: Jiang (321), Su (355), Su (557), Zhou (499), Ji (622), Kong (451), Zhong (564), Xin (877). Word segmentation of the text to be recognized gives the segmented words "Jiangsu, Suzhou, disease control, center", and the word vector of each segmented word is obtained by adding and averaging the second character vectors of its characters: Jiangsu (321 + 355) / 2 = 338, Suzhou (557 + 499) / 2 = 528, disease control (622 + 451) / 2 = 536.5, center (564 + 877) / 2 = 720.5; these word vectors form the first word sequence matrix. The text to be recognized here is only an example and may also contain dates, symbols, and so on; it is not specifically limited. Likewise, averaging the second character vectors is only one example of same-dimension processing for determining the word vectors of the segmented words; any operation that leaves the dimension unchanged, such as subtraction, may be used, and the processing method is not specifically limited here. The arithmetic of this example is reproduced in the sketch below.
The embodiment of the application provides a named entity identification method in which the first training model is a BERT (Bidirectional Encoder Representations from Transformers) model and the second training model is a CNN (Convolutional Neural Network) model.
Here, the first training model is a BERT model, whose training process mainly encodes the input vectors using a bidirectional Transformer as the encoder. Specifically, the BERT model divides the first character sequence matrix C_i by characters; if the sequence exceeds max_length, the excess data is truncated, and if it does not reach max_length, it is padded with [PAD]. Here max_length may refer to the row length or column length of the first character sequence matrix and is set according to specific needs. The first character sequence matrix is then labeled, marking the beginning, middle and end of the sentence, the characters or words, and so on, so as to memorize the structure of the text to be recognized. Further, to train a deep bidirectional representation, a cloze-style masked prediction task is performed: samples are constructed by randomly masking a proportion of the text to be recognized, the first character sequence matrix of the masked sample (with some masked positions replaced by random characters) together with the original first character sequence matrix is fed to an output softmax (logistic regression) layer, and the masked content is then predicted. For example, assume the original sentence is "my dog is hairy" and 15% of the token positions in the sentence are randomly selected for masking, say the fourth token, "hairy". Because each sample is fed into the model repeatedly over multiple training epochs, the masking of the selected position is varied as follows:
80% of the time: replace the target word with [MASK], for example: my dog is hairy --> my dog is [MASK].
10% of the time: replace the target word with a random word, for example: my dog is hairy --> my dog is apple.
10% of the time: leave the target word unchanged, for example: my dog is hairy --> my dog is hairy.
In this way, the Transformer encoder in the BERT model must maintain a distributional contextual representation of every input token: if the encoder already knew which word it had to predict, it would stop learning from the context, whereas if it cannot tell which word is to be predicted during training, it must judge the word from the information in the token's context. A model trained this way acquires feature-expression capability for the sentence.
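A sketch of this 80/10/10 masking rule; the tokenization, rates and vocabulary here are illustrative:

```python
import random

def mask_tokens(tokens, vocab, pick_rate=0.15):
    """Select ~15% of positions; of those, 80% become [MASK],
    10% become a random word, 10% stay unchanged."""
    out, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if random.random() < pick_rate:
            targets[i] = tok                   # the model must predict the original token here
            r = random.random()
            if r < 0.8:
                out[i] = "[MASK]"
            elif r < 0.9:
                out[i] = random.choice(vocab)  # random replacement
            # else: keep the token as-is
    return out, targets

masked, targets = mask_tokens("my dog is hairy".split(), vocab=["apple", "blue", "run"])
```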
The timing and position information of the samples is characterized in the BERT model by the following formulas:

PE(pos, 2i) = sin(pos / 10000^(2i / d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))

where pos denotes the position index of the token in the matrix, i denotes the dimension index, 2i denotes the even dimensions and 2i + 1 the odd dimensions, and d_model denotes the encoding width; the even dimensions are encoded using sine and the odd dimensions using cosine.
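A sketch of this sinusoidal encoding, assuming an even d_model:

```python
import numpy as np

def positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal position encoding: sine on even dimensions, cosine on odd."""
    pos = np.arange(max_len)[:, None]             # position index of each token
    i = np.arange(0, d_model, 2)[None, :]         # even dimension indices
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = positional_encoding(max_len=128, d_model=256)   # one row per position
```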
Wherein, softmax (logistic regression model) multiplies the input matrix by three parameters, Wq, Wv, Wk, respectively, to transform the matrix to obtain query, key, and values. Performing linear transformation on the input matrix according to Wq and Wk respectively through a Transformer encoder to obtain a matrix K corresponding to a matrix Q, Key corresponding to Query; and performing linear transformation on the input matrix according to Wv through a Transformer Decoder to obtain a matrix V corresponding to the Values. The softmax normalization process was further performed by the following formula, and the data was processed to be between 0 and 1. The importance of each word is adjusted using these correlations to obtain a new expression for each word:
Figure BDA0002706315080000111
wherein the content of the first and second substances,
Figure BDA0002706315080000112
is a regulatory factor.
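A sketch of this scaled dot-product attention for a single sentence (no batching), with random matrices standing in for Q, K and V:

```python
import numpy as np

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for 2-D arrays of shape (seq_len, d)."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # token-to-token correlations
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows are softmax distributions
    return weights @ V

Q = K = V = np.random.randn(8, 64)                  # self-attention: Q, K, V from the same input
out = attention(Q, K, V)                            # new expression for each token
```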
Then, in order to increase semantic information of the output first word feature matrix, a Transformal "multi-head" mode is adopted, and the formula is as follows:
MultiHead(Q,K,V)=Concat(head1,head2,…,headk)Wo (1)
Figure BDA0002706315080000113
as in formula (5), different attition results of Q, K, V are obtained by changing the W parameters (three parameters Wq, Wv, Wk), and the obtained result is used as a head. In the formula (1), k heads are spliced and then the parameters W are usedoThe value obtained by performing a linear transformation once is obtained as a result of multi-head attention, i.e., MultiHead (Q, K, V). Finally, a fitting calculation is performed on the MultiHead (Q, K, V) through the fully connected feedforward network, and the application formula is as follows:
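A sketch of the multi-head computation, reusing the `attention` function sketched above; the per-head parameter matrices are random stand-ins:

```python
import numpy as np

def multi_head(X, heads_params, Wo):
    """One (Wq, Wk, Wv) triple per head; heads are concatenated and
    transformed by Wo, as in formulas (1) and (2)."""
    heads = [attention(X @ Wq, X @ Wk, X @ Wv) for Wq, Wk, Wv in heads_params]
    return np.concatenate(heads, axis=-1) @ Wo

X = np.random.randn(8, 64)
params = [tuple(np.random.randn(64, 16) for _ in range(3)) for _ in range(4)]  # 4 heads
Wo = np.random.randn(4 * 16, 64)
out = multi_head(X, params, Wo)   # shape (8, 64)
```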
Finally, a fitting calculation is performed on MultiHead(Q, K, V) through a fully connected feed-forward network, applying the formula:

FFN(Z) = max(0, Z · W1 + b1) · W2 + b2

where Z is the input MultiHead(Q, K, V), W1 and W2 are the parameters of the fully connected feed-forward network, b1 and b2 are bias vectors, and the output is the first character feature matrix C_i. The second training model is a CNN model, which further extracts the features of the first word sequence matrix S_i and outputs the first word feature matrix M_i processed by the CNN model. Finally, the first character-word fusion feature matrix R_i is obtained by summing the C_i output by the BERT model and the M_i output by the CNN model:

R_i = C_i + M_i
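A sketch of the CNN branch and of the fusion, assuming a 1-D convolution along the sequence whose output width is chosen to match the BERT feature width; the layer sizes and equal sequence lengths are illustrative assumptions:

```python
import torch
import torch.nn as nn

class WordCNN(nn.Module):
    """Illustrative second training model: a 1-D convolution over the
    first word sequence matrix S_i."""
    def __init__(self, in_dim=256, out_dim=256, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(in_dim, out_dim, kernel, padding=kernel // 2)

    def forward(self, S):                  # S: (batch, seq_len, in_dim)
        M = self.conv(S.transpose(1, 2))   # convolve along the sequence axis
        return M.transpose(1, 2)           # (batch, seq_len, out_dim)

S = torch.randn(1, 8, 256)    # first word sequence matrix
C = torch.randn(1, 8, 256)    # first character feature matrix (BERT output stand-in)
R = C + WordCNN()(S)          # first character-word fusion feature matrix R_i
```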
The embodiment of the application provides a named entity identification method in which the third training model includes a bidirectional Long Short-Term Memory network (BiLSTM) model and a self-attention mechanism model. Processing the first character-word fusion feature matrix through the third training model to obtain the named entity recognition result of the text to be recognized includes: processing the first character-word fusion feature matrix through the BiLSTM model to increase the semantic information of the text to be recognized carried by the matrix, obtaining the first character-word feature matrix.
Here, the first character-word fusion feature matrix R_i is processed by the BiLSTM model. Fig. 3 shows the internal structure of the LSTM cell underlying the BiLSTM model, which mainly includes three gates, namely a forget gate, an input gate and an output gate; the cell in the middle, called the memory cell, stores the current memory state.
(1) Forget gate: the forget gate determines which information is discarded from the memory cell. A sigmoid activation function normalizes the value, setting the weight to a value between 0 and 1; its input comes from the current input, the hidden state at the previous time step and the memory cell at the previous time step. The forward-propagation formula is:

f_t = sigmoid(W_f · [h_{t-1}, x_t] + b_f)

where f_t, taking a value of 0 or 1 in the extreme (0 meaning complete discarding and 1 complete retention), is the output of the forget gate layer at time t, h_{t-1} denotes the hidden-layer output vector at time t-1, x_t denotes the input at time t, W_f denotes the weight matrix of the forget gate for the input, and b_f denotes its bias vector.
(2) Input gate: the input gate determines what content needs to be added. A sigmoid activation function performs normalization, and a new candidate vector C̃_t is then created through a tanh function. The forward-propagation formulas are:

i_t = sigmoid(W_i · [h_{t-1}, x_t] + b_i)
C̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)

where i_t, taking a value of 0 or 1 in the extreme (0 meaning the current content is not added and 1 that it is added), is the output of the input gate layer at time t, W_i denotes the weight matrix of the input gate for the input, b_i its bias vector, W_c the weight matrix of the candidate transform, b_c its bias vector, and C̃_t the candidate vector generated at time t. It should be noted that the forget gate and the input gate receive the same input data; what distinguishes the two are their respective weight matrices and biases.
(3) Memory cell: the memory cell stores the memorized content. After the candidate vector is determined, the state is updated based on the previously obtained outputs of the forget gate (whether the past memory is retained at the current time, i.e. the value of f_t) and the input gate (whether the new content is remembered, i.e. the value of i_t), where C_{t-1} is the state vector at time t-1 and C_t the state vector at time t. The formula is:

C_t = f_t * C_{t-1} + i_t * C̃_t

The cell-update formula can be understood as follows: C_{t-1} represents what the LSTM model remembers at time t-1; at time t the model faces two questions — should it continue to remember the previous (time t-1) content, and should it remember the new content? There are therefore four cases:
I. when f_t = 0 and i_t = 0, C_t = 0, i.e. all past content is forgotten and no new content is remembered;
II. when f_t = 0 and i_t = 1, C_t = C̃_t, i.e. the past content is completely forgotten, but the new content is remembered;
III. when f_t = 1 and i_t = 0, C_t = C_{t-1}, i.e. the previous content is retained and the new content is ignored;
IV. when f_t = 1 and i_t = 1, C_t = C_{t-1} + C̃_t, i.e. the previous content is retained and the new content is remembered as well.
Since the sigmoid function is not binary (it takes values between 0 and 1), f_t and i_t in fact decide how much of the past content to keep remembering and how much of the new content to remember, respectively; for example, f_t = 1 indicates that all past content is retained, and f_t = 0.5 indicates forgetting half of the past content, i.e. fading the past memory.
(4) Output gate: the output gate determines what content is output at the current time t: O_t = 0 indicates no output and O_t = 1 indicates output, the third sigmoid function determining which part of the information needs to be output. The memorized content is then processed through a tanh function to obtain a value between -1 and 1, which is multiplied by the output of the sigmoid function to produce the final output:

O_t = sigmoid(W_o · [h_{t-1}, x_t] + b_o)
h_t = O_t * tanh(C_t)

where tanh(C_t) processes the content memorized in the memory cell at the current time so that its values lie between -1 and 1, O_t is the output of the output gate at time t, W_o denotes the weight matrix of the output gate for the input, b_o denotes its bias vector, and h_t is the hidden-layer vector at time t. Because of these three gating mechanisms, the LSTM can effectively handle long-term dependencies and, to a certain extent, solves the problems of vanishing and exploding gradients.
To summarize the principle of the LSTM model: at time t, it first decides whether to retain the past memory content, then decides whether new content needs to be added, and finally, after the memory cell has been updated, decides whether the content at the current time needs to be output. A sketch of one such step follows.
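A minimal sketch of one LSTM step following the gate equations above; the parameter shapes are illustrative, and a real implementation would use a library cell:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One step of the LSTM cell: forget gate, input gate, memory update, output gate."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])       # how much past memory to keep
    i_t = sigmoid(W["i"] @ z + b["i"])       # how much new content to add
    C_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate memory
    C_t = f_t * C_prev + i_t * C_tilde       # updated memory cell
    o_t = sigmoid(W["o"] @ z + b["o"])       # what to output
    h_t = o_t * np.tanh(C_t)                 # hidden state at time t
    return h_t, C_t

H, D = 4, 3
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(H, H + D)) for k in "fico"}
b = {k: np.zeros(H) for k in "fico"}
h, C = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), W, b)
```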
Finally, by the above method, two opposite LSTM layers, forward and backward, are set up, as shown in fig. 4: the forward LSTM layer processes the sequence in order and its output represents the past information, while the backward LSTM layer processes the sequence in reverse and its output represents the future information. Combining the forward and backward directions gives the output of the BiLSTM layer, the first character-word feature matrix. In this way, the BiLSTM model increases the semantic information that the first character-word fusion feature matrix carries about the text to be recognized, improving the reliability and accuracy of that semantic information.
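A sketch of such a BiLSTM layer using the library implementation, with the forward and backward outputs combined by concatenation; the sizes are illustrative:

```python
import torch
import torch.nn as nn

bilstm = nn.LSTM(input_size=256, hidden_size=128,
                 bidirectional=True, batch_first=True)
R = torch.randn(1, 8, 256)   # first character-word fusion feature matrix R_i
G, _ = bilstm(R)             # G: (1, 8, 256) = forward 128 dims + backward 128 dims
```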
Then, the first character-word feature matrix is further processed through the self-attention mechanism model to increase the weight of the corresponding named entities, obtaining the second character-word feature matrix. That is, the first character-word feature matrix G_i is input into the scaled dot-product attention formula given above, with the query, key and value matrices all derived from G_i, to obtain the output second character-word feature matrix X_i. Finally, the named entity recognition result of the text to be recognized is obtained according to the second character-word feature matrix X_i. Because the self-attention mechanism model increases the weight of the named entities in the first character-word feature matrix G_i, the named entities occupy prominent positions in the second character-word feature matrix X_i, so that when the named entities of the text to be recognized are obtained from X_i, their recognition is more definite and the accuracy of the recognition result is improved.
The embodiment of the application also provides a method for performing sequence optimization on the second character-word feature matrix X_i; to this end, the third training model further includes a CRF (conditional random field) model.
After the first character-word feature matrix has been processed through the self-attention mechanism model to increase the weight of the corresponding named entities and the second character-word feature matrix has been obtained, the method further includes: performing sequence optimization on the second character-word feature matrix through the CRF model to obtain the third character-word feature matrix; and obtaining, according to the third character-word feature matrix, the named entity recognition result of the text to be recognized in the optimal arrangement sequence. For the second character-word feature matrix X, let K be the score matrix output through the self-attention mechanism, of size n × k, where n is the number of tokens and k the number of labels, and K_{i,j} is the score of the j-th label of the i-th token. For a predicted label sequence Y = (y_1, y_2, …, y_n), its score function is:

s(X, Y) = Σ_{i=0..n} A_{y_i, y_{i+1}} + Σ_{i=1..n} K_{i, y_i}

where A denotes the transition score matrix and A_{i,j} the score of label i transitioning to label j. The probability of generating the prediction sequence Y is:

P(Y | X) = exp(s(X, Y)) / Σ_{Y′ ∈ Y_X} exp(s(X, Y′))

Taking the logarithm of both sides gives the likelihood function of the prediction sequence:

log P(Y | X) = s(X, Y) − log Σ_{Y′ ∈ Y_X} exp(s(X, Y′))

where Y denotes the actual tag sequence and Y_X denotes all possible tag sequences. The final output optimal sequence is:

Y* = argmax_{Y′ ∈ Y_X} s(X, Y′)

Therefore, by performing sequence optimization on the second character-word feature matrix through the CRF model, the sequence of named entities recognized from the text to be recognized is the optimal sequence Y*, on the premise of improving the accuracy of the named entity recognition result.
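A sketch of the CRF score and of Viterbi decoding of the optimal sequence Y* (boundary transitions are omitted for brevity):

```python
import numpy as np

def crf_score(K, A, y):
    """s(X, Y): emission scores K[i, y_i] plus transition scores A[y_{i-1}, y_i]."""
    y = np.asarray(y)
    return K[np.arange(len(y)), y].sum() + A[y[:-1], y[1:]].sum()

def viterbi_decode(K, A):
    """Return the highest-scoring label sequence Y* by dynamic programming."""
    n, k = K.shape
    score = K[0].copy()                             # best score ending in each label
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        total = score[:, None] + A + K[t][None, :]  # all (previous, current) label pairs
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    y = [int(score.argmax())]
    for t in range(n - 1, 0, -1):                   # backtrack
        y.append(int(back[t][y[-1]]))
    return y[::-1]

K = np.random.randn(8, 5)    # scores of 5 labels for 8 tokens
A = np.random.randn(5, 5)    # label-to-label transition scores
best = viterbi_decode(K, A)  # optimal sequence Y*
```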
The embodiment of the application also provides a model training method, which includes: inputting a second character sequence matrix of a sample text into the first training model to obtain a second character feature matrix of the sample text, wherein the first training model is an already trained model; inputting a second word sequence matrix of the sample text into an initial second training model to obtain a second word feature matrix of the sample text, the dimension of the second character feature matrix being the same as the dimension of the second word feature matrix; processing the second character feature matrix and the second word feature matrix to obtain a second character-word fusion feature matrix; processing the second character-word fusion feature matrix through an initial third training model to obtain a second named entity recognition result of the sample text; and if the second named entity recognition result does not meet a set condition, adjusting the second training model and the third training model according to the second named entity recognition result. That is, if texts to be recognized are to be recognized through the combined model of the first, second and third training models, sample texts may first be recognized through the already trained first training model together with the untrained second and third training models, and the relevant parameters of the second and third training models are continuously adjusted during this process, completing the training of the combined model formed by the three models.
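A hypothetical training step illustrating this arrangement, with the first model frozen and only the second and third models updated; all module and function names are assumptions for the sketch:

```python
import torch

def train_step(bert, cnn, third_model, optimizer, loss_fn,
               char_ids, word_seq, gold_tags):
    with torch.no_grad():             # first training model: already trained, kept fixed
        C = bert(char_ids)            # second character feature matrix
    M = cnn(word_seq)                 # second word feature matrix (same dimension as C)
    R = C + M                         # second character-word fusion feature matrix
    scores = third_model(R)           # BiLSTM + self-attention (+ CRF) scores
    loss = loss_fn(scores, gold_tags)
    optimizer.zero_grad()
    loss.backward()                   # gradients flow only into cnn and third_model
    optimizer.step()
    return loss.item()
```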
Based on the above method flow, an embodiment of the present application provides a flow of a named entity identification method, as shown in fig. 5, including:
step 501, obtaining a trained first training model, inputting a second word sequence matrix of a sample text into the first training model, and obtaining a second word feature matrix of the sample text, wherein a first parameter for adjusting dimensions in the first training model is set as a parameter value capable of obtaining a preset dimension.
Step 502, obtaining an untrained second training model, inputting a second word sequence matrix of the sample text into the initial second training model, and obtaining a second word feature matrix of the sample text, wherein a second parameter for adjusting the dimension in the second training model is set as a parameter value capable of obtaining a preset dimension.
Step 503, obtaining a second character-word fusion feature matrix according to the second character feature matrix and the second word feature matrix, which have the same dimension.
Step 504, inputting the second character-word fusion feature matrix into an untrained third training model, and obtaining a second named entity recognition result of the sample text.
Step 505, adjusting the relevant parameters of the second training model and the third training model according to the second named entity recognition result, and re-executing steps 501 to 505 until the obtained second named entity recognition result reaches a preset accuracy rate.
Step 506, obtaining the first character vector of each character of the text to be recognized in the first manner, forming the first character sequence matrix from the first character vectors of the text to be recognized, and inputting the first character sequence matrix into the first training model to obtain the first character feature matrix of the text to be recognized.
Step 507, obtaining the second character vector of each character of the text to be recognized in the second manner, and performing word segmentation on the text to be recognized to obtain its segmented words; according to the segmented words of the text to be recognized, carrying out same-dimension processing on the second character vectors of the characters in each segmented word, and determining the word vector of each segmented word, so as to obtain the first word sequence matrix; and inputting the first word sequence matrix into the second training model to obtain the first word feature matrix.
Step 508, fusing the first character feature matrix and the first word feature matrix, which have the same dimension, to obtain the first character-word fusion feature matrix.
Step 509, inputting the first character-word fusion feature matrix into the third training model to obtain the named entity recognition result.
It should be noted that, in the above flow, steps 501 to 505 train the second training model and the third training model through the already trained first training model, so as to obtain a mature combined model of the first, second and third training models. Steps 501 to 505 may be executed in a loop until the recognition accuracy of the combined model of the current first, second and third training models reaches the required accuracy. Steps 506 to 509 are then executed with the models obtained in steps 501 to 505 to obtain the named entity recognition result of the text to be recognized.
The accuracy of the recognition results of several types of named entities under the above method is reported here, covering activity name (activity_name), address (address), index data (data), organization name (organization_name) and time (time). The evaluation indices used are precision (P), recall (R) and the F1 value, with the following formulas:

P = correct / (correct + spurious)

R = correct / (correct + missing)

F1 = 2 × P × R / (P + R)
where P is the ratio of the named entities correctly labeled by the method to the total number of entities recognized in the text to be recognized; correct is the number of entities labeled correctly; spurious is the number of entities recognized in error; missing is the number of real entities that were not recognized; R is the ratio of the correctly labeled named entities to the total number of entities in the test set; and F1 is the harmonic mean of P and R. The application also reports the precision (P), recall (R) and F1 value of the named entities identified in government affairs reports, as shown in the following table:
[Table: precision (P), recall (R) and F1 values per entity type]
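A sketch of computing these metrics from entity counts:

```python
def ner_metrics(correct: int, spurious: int, missing: int):
    """Precision, recall and F1 from counts of correct, spurious
    (wrongly recognized) and missing (unrecognized) entities."""
    p = correct / (correct + spurious) if correct + spurious else 0.0
    r = correct / (correct + missing) if correct + missing else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

print(ner_metrics(correct=90, spurious=10, missing=20))  # (0.9, 0.818..., 0.857...)
```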
based on the same concept, an embodiment of the present invention provides a named entity recognition apparatus, and fig. 6 is a schematic diagram of the named entity recognition apparatus provided in the embodiment of the present application, as shown in fig. 6, including:
an obtaining module 601, configured to input a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized, and to input a first word sequence matrix of the text to be recognized into a second training model to obtain a first word feature matrix of the text to be recognized, the dimension of the first character feature matrix being the same as the dimension of the first word feature matrix;
a processing module 602, configured to process the first character feature matrix and the first word feature matrix to obtain a first character-word fusion feature matrix, and to process the first character-word fusion feature matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
Optionally, the processing module 602 is further configured to: set a first parameter of the first training model, the first parameter being used to obtain the first character feature matrix with a preset dimensionality; and, before the first word sequence matrix of the text to be recognized is input into the second training model to obtain the first word feature matrix of the text to be recognized, set a second parameter of the second training model, the second parameter being used to obtain the first word feature matrix with the preset dimensionality.
Optionally, the processing module 602 is further configured to: determine, in a first manner, the first character vector corresponding to each character of the text to be recognized, the first character vectors constituting the first character sequence matrix; determine, in a second manner different from the first manner, the second character vector corresponding to each character of the text to be recognized; perform word segmentation on the text to be recognized to obtain its segmented words; and carry out same-dimension processing on the second character vectors of the characters in each segmented word to determine the word vector of each segmented word, so as to obtain the first word sequence matrix.
Optionally, the first training model is a BERT (Bidirectional Encoder Representations from Transformers) model, and the second training model is a CNN (Convolutional Neural Network) model.
Optionally, the third training model includes a bidirectional Long Short-Term Memory network (BiLSTM) model and a self-attention mechanism model, and the processing module 602 is specifically configured to: process the first character-word fusion feature matrix through the BiLSTM model to increase the semantic information of the text to be recognized carried by the matrix, obtaining the first character-word feature matrix; process the first character-word feature matrix through the self-attention mechanism model to increase the weight of the corresponding named entities, obtaining the second character-word feature matrix; and obtain the named entity recognition result of the text to be recognized according to the second character-word feature matrix.
Optionally, the third training model further includes a CRF (conditional random field) model, and the processing module 602 is further configured to: perform sequence optimization on the second character-word feature matrix through the CRF model to obtain the third character-word feature matrix; and obtain, according to the third character-word feature matrix, the named entity recognition result of the text to be recognized in the optimal arrangement sequence.
Optionally, the processing module 602 is further configured to: input a second character sequence matrix of a sample text into the first training model to obtain a second character feature matrix of the sample text, wherein the first training model is an already trained model; input a second word sequence matrix of the sample text into an initial second training model to obtain a second word feature matrix of the sample text, the dimension of the second character feature matrix being the same as the dimension of the second word feature matrix; process the second character feature matrix and the second word feature matrix to obtain a second character-word fusion feature matrix; process the second character-word fusion feature matrix through an initial third training model to obtain a second named entity recognition result of the sample text; and, if the second named entity recognition result does not meet a set condition, adjust the second training model and the third training model according to the second named entity recognition result.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A named entity recognition method, comprising:
inputting a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized;
inputting a first word sequence matrix of the text to be recognized into a second training model to obtain a first word feature matrix of the text to be recognized; wherein the dimension of the first character feature matrix is the same as the dimension of the first word feature matrix;
processing the first character feature matrix and the first word feature matrix to obtain a first character-word fusion feature matrix; and
processing the first character-word fusion feature matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
2. The method of claim 1, wherein before inputting the first character sequence matrix of the text to be recognized into the first training model to obtain the first character feature matrix of the text to be recognized, the method further comprises:
setting a first parameter of the first training model, wherein the first parameter is used for obtaining the first character feature matrix with a preset dimension; and
wherein before inputting the first word sequence matrix of the text to be recognized into the second training model to obtain the first word feature matrix of the text to be recognized, the method further comprises:
setting a second parameter of the second training model, wherein the second parameter is used for obtaining the first word feature matrix with the preset dimension.
3. The method of claim 1, wherein before inputting the first word sequence matrix of the text to be recognized into the second training model to obtain the first word feature matrix of the text to be recognized, the method further comprises:
determining, in a first manner, a first character vector corresponding to each character of the text to be recognized, the first character vectors of the characters constituting the first character sequence matrix;
determining, in a second manner, a second character vector corresponding to each character of the text to be recognized, the second manner being different from the first manner;
performing word segmentation on the text to be recognized to obtain the segmented words of the text to be recognized; and
performing same-dimension processing on the second character vectors of the characters in each segmented word to determine the word vector of each segmented word, thereby obtaining the first word sequence matrix.
4. The method of claim 1, wherein the first training model is a BERT (Bidirectional Encoder Representations from Transformers) model, and the second training model is a CNN (Convolutional Neural Network) model.
5. The method of any one of claims 1 to 4, wherein the third training model comprises a bidirectional long short-term memory (BiLSTM, Bi-directional Long Short-Term Memory) model and a self-attention mechanism model; and
processing the first character-word fusion feature matrix through the third training model to obtain the named entity recognition result of the text to be recognized comprises:
processing the first character-word fusion feature matrix through the BiLSTM model to enrich the semantic information of the text to be recognized carried by the matrix, to obtain a first character-word feature matrix;
processing the first character-word feature matrix through the self-attention mechanism model to increase the weights of the named entities in the matrix, to obtain a second character-word feature matrix; and
obtaining the named entity recognition result of the text to be recognized according to the second character-word feature matrix.
6. The method of claim 5, wherein the third training model further comprises a CRF (Conditional Random Field) model; and
after processing the first character-word feature matrix through the self-attention mechanism model to obtain the second character-word feature matrix, the method further comprises:
performing sequence optimization on the second character-word feature matrix through the CRF model to obtain a third character-word feature matrix; and
obtaining, according to the third character-word feature matrix, the named entity recognition result of the text to be recognized with the optimal label sequence.
7. The method of any one of claims 1 to 6, further comprising:
inputting a second character sequence matrix of a sample text into the first training model to obtain a second character feature matrix of the sample text, wherein the first training model is an already trained model;
inputting a second word sequence matrix of the sample text into an initial second training model to obtain a second word feature matrix of the sample text, wherein the dimension of the second character feature matrix is the same as the dimension of the second word feature matrix;
processing the second character feature matrix and the second word feature matrix to obtain a second character-word fusion feature matrix;
processing the second character-word fusion feature matrix through an initial third training model to obtain a second named entity recognition result of the sample text; and
if the second named entity recognition result does not meet a set condition, adjusting the second training model and the third training model according to the second named entity recognition result.
8. A named entity recognition apparatus, comprising:
an acquisition module, configured to input a first character sequence matrix of a text to be recognized into a first training model to obtain a first character feature matrix of the text to be recognized, and to input a first word sequence matrix of the text to be recognized into a second training model to obtain a first word feature matrix of the text to be recognized, wherein the dimension of the first character feature matrix is the same as the dimension of the first word feature matrix; and
a processing module, configured to process the first character feature matrix and the first word feature matrix to obtain a first character-word fusion feature matrix, and to process the first character-word fusion feature matrix through a third training model to obtain a named entity recognition result of the text to be recognized.
9. A computer-readable storage medium, characterized in that the storage medium stores a program which, when run on a computer, causes the computer to carry out the method of any one of claims 1 to 7.
10. A computer device, comprising:
a memory for storing a computer program;
a processor, configured to call the computer program stored in the memory and execute the method of any one of claims 1 to 7 according to the obtained program.
CN202011039983.1A 2020-09-28 2020-09-28 Named entity recognition method and device Active CN112115721B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011039983.1A CN112115721B (en) 2020-09-28 2020-09-28 Named entity recognition method and device

Publications (2)

Publication Number Publication Date
CN112115721A 2020-12-22
CN112115721B CN112115721B (en) 2024-05-17

Family

ID=73798679

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011039983.1A Active CN112115721B (en) 2020-09-28 2020-09-28 Named entity recognition method and device

Country Status (1)

Country Link
CN (1) CN112115721B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017162134A1 (en) * 2016-03-22 2017-09-28 索尼公司 Electronic device and method for text processing
CN108536679A (en) * 2018-04-13 2018-09-14 腾讯科技(成都)有限公司 Name entity recognition method, device, equipment and computer readable storage medium
WO2020133039A1 (en) * 2018-12-27 2020-07-02 深圳市优必选科技有限公司 Entity identification method and apparatus in dialogue corpus, and computer device
CN111191453A (en) * 2019-12-25 2020-05-22 中国电子科技集团公司第十五研究所 Named entity recognition method based on confrontation training
CN111310470A (en) * 2020-01-17 2020-06-19 西安交通大学 Chinese named entity recognition method fusing word and word features

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
谢腾; 杨俊安; 刘辉: "Chinese Entity Recognition Based on the BERT-BiLSTM-CRF Model", 计算机系统应用 (Computer Systems &amp; Applications), no. 07 *
赵平; 孙连英; 万莹; 葛娜: "Named Entity Recognition of Chinese Scenic Spots Based on BERT+BiLSTM+CRF", 计算机系统应用 (Computer Systems &amp; Applications), no. 06 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112699683A (en) * 2020-12-31 2021-04-23 大唐融合通信股份有限公司 Named entity identification method and device fusing neural network and rule
CN112487820A (en) * 2021-02-05 2021-03-12 南京邮电大学 Chinese medical named entity recognition method
CN112487820B (en) * 2021-02-05 2021-05-25 南京邮电大学 Chinese medical named entity recognition method
CN112802570A (en) * 2021-02-07 2021-05-14 成都延华西部健康医疗信息产业研究院有限公司 Named entity recognition system and method for electronic medical record
CN112949310A (en) * 2021-03-01 2021-06-11 创新奇智(上海)科技有限公司 Model training method, traditional Chinese medicine name recognition method and device and network model
CN113051500A (en) * 2021-03-25 2021-06-29 武汉大学 Phishing website identification method and system fusing multi-source data
CN113051500B (en) * 2021-03-25 2022-08-16 武汉大学 Phishing website identification method and system fusing multi-source data
CN113449524A (en) * 2021-04-01 2021-09-28 山东英信计算机技术有限公司 Named entity identification method, system, equipment and medium
CN112989834A (en) * 2021-04-15 2021-06-18 杭州一知智能科技有限公司 Named entity identification method and system based on flat grid enhanced linear converter
CN113268538A (en) * 2021-05-17 2021-08-17 哈尔滨工业大学(威海) Complex equipment fault tracing method and system based on domain knowledge graph
CN114417873A (en) * 2022-01-17 2022-04-29 软通动力信息技术(集团)股份有限公司 Few-sample entity identification method, device, medium and equipment
CN114970666A (en) * 2022-03-29 2022-08-30 北京百度网讯科技有限公司 Spoken language processing method and device, electronic equipment and storage medium
CN114970666B (en) * 2022-03-29 2023-08-29 北京百度网讯科技有限公司 Spoken language processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112115721B (en) 2024-05-17

Similar Documents

Publication Publication Date Title
CN112115721B (en) Named entity recognition method and device
CN111444726B (en) Chinese semantic information extraction method and device based on long-short-term memory network of bidirectional lattice structure
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
Badjatiya et al. Attention-based neural text segmentation
CN111310471B (en) Travel named entity identification method based on BBLC model
CN111985239B (en) Entity identification method, entity identification device, electronic equipment and storage medium
CN112270193A (en) Chinese named entity identification method based on BERT-FLAT
CN109858041B (en) Named entity recognition method combining semi-supervised learning with user-defined dictionary
CN112487820B (en) Chinese medical named entity recognition method
CN112163429B (en) Sentence correlation obtaining method, system and medium combining cyclic network and BERT
CN114548101B (en) Event detection method and system based on backtracking sequence generation method
CN111966812A (en) Automatic question answering method based on dynamic word vector and storage medium
CN115587594B (en) Unstructured text data extraction model training method and system for network security
CN111145914B (en) Method and device for determining text entity of lung cancer clinical disease seed bank
CN112685561A (en) Small sample clinical medical text post-structuring processing method across disease categories
CN115238026A (en) Medical text subject segmentation method and device based on deep learning
CN113836891A (en) Method and device for extracting structured information based on multi-element labeling strategy
CN113641809A (en) XLNET-BiGRU-CRF-based intelligent question answering method
Chao et al. Variational connectionist temporal classification
CN116362242A (en) Small sample slot value extraction method, device, equipment and storage medium
CN116306606A (en) Financial contract term extraction method and system based on incremental learning
CN116108840A (en) Text fine granularity emotion analysis method, system, medium and computing device
CN115600597A (en) Named entity identification method, device and system based on attention mechanism and intra-word semantic fusion and storage medium
CN112733526B (en) Extraction method for automatically identifying tax collection object in financial file
CN114925695A (en) Named entity identification method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant