CN111209738A - Multi-task named entity recognition method combining text classification - Google Patents

Multi-task named entity recognition method combining text classification

Info

Publication number
CN111209738A
CN111209738A (application CN201911417834.1A)
Authority
CN
China
Prior art keywords
task
layer
word
vector
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911417834.1A
Other languages
Chinese (zh)
Other versions
CN111209738B (en)
Inventor
庄越挺
浦世亮
汤斯亮
纪睿
王凯
吴飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201911417834.1A priority Critical patent/CN111209738B/en
Publication of CN111209738A publication Critical patent/CN111209738A/en
Application granted granted Critical
Publication of CN111209738B publication Critical patent/CN111209738B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention discloses a multi-task named entity recognition method combining text classification. The method comprises the following steps: (1) constructing a text classifier with a convolutional neural network to measure text similarity; (2) selecting a suitable threshold and deciding, by comparing the text classification result with the threshold, whether the auxiliary task data set participates in updating the shared layer parameters; (3) concatenating character vectors of the text with pre-trained word vectors as input feature vectors; (4) in the sharing layer, modeling the input feature vector of each word in a sentence with a bidirectional LSTM to learn features common to all tasks; (5) training each task in turn in the task layer, passing the output of the sharing layer to the bidirectional LSTM network of the main-task or auxiliary-task private layer, then decoding the labels of the whole sentence with a linear-chain conditional random field and labeling the entities in the sentence. The invention has been evaluated on data sets from multiple biomedical domains and can effectively improve named entity recognition in specialized domains where corpora are difficult to acquire and labeling is costly.

Description

Multi-task named entity recognition method combining text classification
Technical Field
The invention relates to natural language processing, in particular to a multitask named entity recognition method combining text classification.
Background
Natural Language Processing (NLP) is an interdisciplinary field combining linguistics and computer science. Named Entity Recognition (NER) is a basic task in natural language processing that aims to recognize and classify proper nouns and meaningful quantitative phrases in natural language text. With the rise of information extraction and big data, named entity recognition has received increasing attention and has become an important component of natural language processing applications such as public opinion analysis, information retrieval, automatic question answering and machine translation. How to identify named entities automatically, accurately and quickly from massive internet text has gradually become a hot problem for both academia and industry.
Named entity recognition techniques, which aim to identify entity mentions and their categories in documents of a particular domain (e.g., biomedicine), have become an important component of document classification, retrieval, and content analysis in such domains. Taking the biomedical field as an example, while the number of biomedical documents, clinical records, etc. is growing rapidly, new biomedical entities and their acronyms and synonyms are growing just as fast. However, existing learning-based named entity recognition systems rely heavily on labeled data, which is costly to produce, and in the biomedical field labeling requires professional domain knowledge. How to make use of published data sets without additional manual labeling of new data sets has therefore become a research focus.
Neural network models are currently the mainstream technique for recognizing named entities in text; however, such models typically need a large amount of labeled data for training. Due to the lack of training data in the biomedical field, neural network models often perform poorly there.
Aiming at this difficulty in the prior art, a multi-task named entity recognition method combining text classification for a specific domain is provided. Although data for a particular domain is often limited, there is usually some data in related domains. In the biomedical field, for example, there are related data sets such as disease data sets, drug data sets and species data sets. The purpose of this method is to use such data to help improve the target task. The method is based on the assumption that if two data sets can facilitate each other, or one can facilitate the target task, they should overlap in semantic space. Where the overlapping parts of the two data sets are semantically close, sentences of the auxiliary task that are close to the target task are used when training the target task, while sentences that are not semantically close are not. The framework used is multi-task learning: if a sentence of an auxiliary task is semantically close to the target task, both the sharing layer and the task layer are updated; otherwise, only the task layer is updated. Experiments on multiple data sets in biomedicine and related fields show that the effect on the target task can be effectively improved in most cases.
Disclosure of Invention
The invention aims to use data sets from related domains to improve performance in a target domain without additionally labeling new data sets, and provides a multi-task named entity recognition method combining text classification for a specific domain.
The technical scheme adopted by the invention is as follows:
a multitask named entity recognition method combining text classification comprises the following steps:
s1: constructing a text classifier by using a convolutional neural network, and measuring the similarity of texts;
s2: selecting a threshold, and determining whether the auxiliary task data set participates in updating of the shared layer parameters according to the comparison between the text classification result and the threshold;
s3: cascading character vectors of the text and pre-trained word vectors to serve as input feature vectors;
s4: in a sharing layer, modeling an input feature vector of each word in a sentence by using a bidirectional LSTM, and learning common features of each task;
s5: training each task in turn on the task layer, transmitting the output of the sharing layer to the bidirectional LSTM neural network in the main task private layer or the auxiliary task private layer, then using the linear chain random field to decode the label of the whole sentence, and labeling the entity in the sentence.
The steps can be realized in the following way:
in step S1, a text classifier is constructed by using a convolutional neural network, and the specific steps of measuring the similarity of the text are as follows:
s11: input each word of a sentence and convert it into a word vector of dimension k through a word embedding module; let the word vector of the i-th word in the sentence be $x_i \in \mathbb{R}^k$; if the sentence length is n, the sentence is represented as:
$x_{1:n} = [x_1; x_2; \dots; x_n]$ (1)
s12: let the convolution kernel be $w \in \mathbb{R}^{h \times k}$, where $h \times k$ is the dimension of the convolution kernel; convolution over the window $x_{i:i+h-1}$ yields the feature $c_i$:
$c_i = f(w \cdot x_{i:i+h-1} + b)$ (2)
where b represents the bias; the features constructed over the sentence of length n are:
$c = [c_1; c_2; \dots; c_{n-h+1}]$ (3)
s13: apply max pooling over c and take the maximum as the feature corresponding to convolution kernel w:
$\hat{c} = \max\{c\}$ (4)
s14: repeat the above operations with multiple convolution kernels $w_1, w_2, \dots, w_s$, concatenate the resulting features $\hat{c}_1, \hat{c}_2, \dots, \hat{c}_s$, feed them into a fully connected network, and classify with a Softmax function; the Softmax function is defined as follows:
$S_i = \dfrac{e^{V_i}}{\sum_{j=1}^{M} e^{V_j}}$ (5)
where V is the input of the Softmax function and $V_i$ is the i-th element of the input vector; S is the output of the Softmax function and $S_i$, the i-th element of the output vector, is the probability that the input sentence belongs to the i-th category; M is the number of categories.
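As an illustration of steps s11–s14, the sketch below shows one possible convolutional text classifier in PyTorch. The embedding dimension, kernel widths, kernel count and class count are illustrative placeholders rather than values fixed by the method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """Convolutional text classifier sketch for steps s11-s14."""
    def __init__(self, vocab_size, embed_dim=128, kernel_sizes=(3, 4, 5),
                 num_kernels=100, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)                    # s11: word embedding
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_kernels, k) for k in kernel_sizes])   # s12: convolution kernels
        self.fc = nn.Linear(num_kernels * len(kernel_sizes), num_classes)   # s14: fully connected layer

    def forward(self, token_ids):                  # token_ids: (batch, n)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, n)
        feats = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]  # s13: max pooling over time
        logits = self.fc(torch.cat(feats, dim=1))  # concatenate the per-kernel features
        return F.softmax(logits, dim=1)            # probability of each of the M categories
```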
In step S2, a threshold is selected, and for the data set of the auxiliary task, the specific step of determining whether to participate in updating the shared layer parameter according to the comparison between the text classification result and the threshold is as follows:
s21: setting m data sets, wherein the first data set is set as a main task, and the rest m-1 data sets are auxiliary tasks;
s22: after the text classifier is trained, each sentence passed through the classifier produces a vector $k \in \mathbb{R}^M$; the first component of this vector is denoted $k_0$, and each data set takes the average of $k_0$ over all of its sentences as the threshold for that data set;
s23: when the multi-task named entity recognition model is trained, the data of the main task updates the sharing layer by default;
s24: the data of the auxiliary task is first passed through the text classifier; if the classifier output $k_0$ is larger than the threshold, both the task layer and the sharing layer are updated; otherwise, only the task layer is updated.
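To make s21–s24 concrete, here is a rough sketch of how an auxiliary-task batch could be gated before updating the shared layer. All names (`classifier`, `thresholds`, `shared`, `task_layers`, `optimizers`, and the batch fields) are hypothetical, introduced only for this illustration.

```python
def training_step(batch, task_id, classifier, thresholds, shared, task_layers, optimizers):
    """One multi-task step: auxiliary-task data updates the shared layer only when the
    text classifier judges it close enough to the main task (sketch of s23-s24)."""
    update_shared = True
    if task_id != 0:                                    # task 0 is the main task (s23)
        k0 = classifier(batch.token_ids)[:, 0].mean()   # first Softmax component of the batch
        update_shared = k0.item() > thresholds[task_id] # compare with the data set's threshold (s24)

    shared_out = shared(batch.features)
    if not update_shared:
        shared_out = shared_out.detach()                # block gradients from reaching the shared layer
    loss = task_layers[task_id](shared_out, batch.labels)

    optimizers[task_id].zero_grad()
    loss.backward()
    optimizers[task_id].step()
    return loss.item()
```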
In step S3, the step of concatenating the character vector of the text and the pre-trained word vector as the input feature vector is as follows:
s31: a natural language processing tool is used to split the document into sentences and words; the sentences, words and labels are collected into a sentence table, a word table and a label table, and the characters appearing in the word table are collected into a character table;
s32: let C be the character table and d the dimension of each character vector; the character vector matrix then lies in $\mathbb{R}^{d \times |C|}$;
s33: let the vector of the i-th character of the word t be $t_i \in \mathbb{R}^d$; the word is denoted $t_{1:l} = [t_1; t_2; \dots; t_l]$, where l is the length of the word t;
s34: a kernel $w \in \mathbb{R}^{h \times d}$ of height h is used for the convolution; a bias value b is added and a nonlinearity is applied to the whole convolution result to obtain the feature map $f_t$, whose i-th element $f_t(i)$ is given by formula (6):
$f_t(i) = \tanh(w \cdot t_{i:i+h-1} + b)$ (6)
s35: take $y_t = \max_i f_t(i)$ as the feature of the word t corresponding to convolution kernel w;
s36: repeat the above operations with multiple convolution kernels $w_1, w_2, \dots, w_q$, concatenate the resulting features $y_t^{(1)}, y_t^{(2)}, \dots, y_t^{(q)}$, and then concatenate this character feature with the pre-trained word vector of the word t to form the input feature vector of t.
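A minimal sketch of steps s31–s36 in PyTorch follows. The character embedding size, kernel widths and kernel count are illustrative, and `pretrained_word_vectors` is assumed to be a float tensor of pre-trained embeddings with one row per word in the word table.

```python
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Character-level CNN features concatenated with pre-trained word vectors (s31-s36)."""
    def __init__(self, num_chars, pretrained_word_vectors, char_dim=30,
                 kernel_sizes=(3,), num_kernels=30):
        super().__init__()
        self.char_embed = nn.Embedding(num_chars, char_dim)                       # s32: character vectors
        self.char_convs = nn.ModuleList(
            [nn.Conv1d(char_dim, num_kernels, k, padding=k - 1) for k in kernel_sizes])  # s34
        self.word_embed = nn.Embedding.from_pretrained(pretrained_word_vectors, freeze=False)

    def forward(self, word_ids, char_ids):
        # char_ids: (batch, seq_len, max_word_len); flatten the words for the character CNN
        b, s, l = char_ids.shape
        chars = self.char_embed(char_ids.reshape(b * s, l)).transpose(1, 2)  # (b*s, char_dim, l)
        feats = [torch.tanh(conv(chars)).max(dim=2).values
                 for conv in self.char_convs]                                # s35: max over positions
        char_feat = torch.cat(feats, dim=1).reshape(b, s, -1)                # s36: per-word char feature
        word_feat = self.word_embed(word_ids)                                # pre-trained word vectors
        return torch.cat([word_feat, char_feat], dim=-1)                     # concatenated input vector
```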
In step S4, in the sharing layer, the specific steps of modeling the input feature vector of each word in the sentence by using the bidirectional LSTM and learning the common features of each task are as follows:
s41: define $x_t$ as the input feature vector at time t and $h_t$ as the hidden state vector storing all useful information up to time t; $\sigma$ denotes the sigmoid function and $*$ denotes element-wise multiplication; $U_i, U_f, U_c, U_o$ are the weight matrices applied to the input $x_t$ in the different gates, $W_i, W_f, W_c, W_o$ are the weight matrices applied to the hidden state, and $b_i, b_f, b_c, b_o$ are bias vectors;
s42: the forget gate at time t is computed as shown in equation (7):
$f_t = \sigma(W_f h_{t-1} + U_f x_t + b_f)$ (7)
$f_t$ determines the proportion of the cell state at time t-1 that needs to be forgotten;
s43: the information to be stored into the cell state up to time t is computed as shown in equations (8) and (9):
$i_t = \sigma(W_i h_{t-1} + U_i x_t + b_i)$ (8)
$\tilde{C}_t = \tanh(W_c h_{t-1} + U_c x_t + b_c)$ (9)
where $\tilde{C}_t$ is the candidate vector to be added to the cell state at time t, and $i_t$ determines the proportion of $\tilde{C}_t$ that is stored;
s44: the results of the two previous steps are combined to produce the new cell state, as shown in equation (10):
$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$ (10)
where $C_t$ is the cell state at time t;
s45: the output at time t is computed as shown in equations (11) and (12):
$o_t = \sigma(W_o h_{t-1} + U_o x_t + b_o)$ (11)
$h_t = o_t * \tanh(C_t)$ (12)
where $o_t$ determines the proportion of the cell state emitted as output at time t, and $h_t$ is the hidden state vector at time t, used as the output information at time t;
s46: the hidden state $h_t$ above stores all past information; a hidden state $g_t$ storing future information is computed in the same way in the reverse direction, and the two hidden states are concatenated to form the final output vector.
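In practice the gate equations (7)–(12) are provided by standard library layers; the sketch below uses PyTorch's built-in bidirectional LSTM as the shared layer, with an illustrative hidden size.

```python
import torch.nn as nn

class SharedBiLSTM(nn.Module):
    """Shared layer: bidirectional LSTM over the per-word input feature vectors (s41-s46)."""
    def __init__(self, input_dim, hidden_dim=200):
        super().__init__()
        # bidirectional=True computes a forward state h_t (past context) and a backward
        # state g_t (future context) and concatenates them, matching step s46
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, features):            # features: (batch, seq_len, input_dim)
        output, _ = self.lstm(features)     # output: (batch, seq_len, 2 * hidden_dim)
        return output
```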
In step S5, each task is trained in turn at the task layer, the output of the shared layer is transmitted to the bidirectional LSTM neural network in the main task private layer or the auxiliary task private layer, the entire sentence is tag-decoded by using the linear chain random field, and the entity in the sentence is labeled as follows:
s51: the output of the sharing layer is used as input and is transmitted into a bidirectional LSTM private layer of a main task or an auxiliary task, and then the output of the bidirectional LSTM private layer is used as the input of a conditional random field;
s52: let $z = \{z_1, z_2, \dots, z_n\}$ denote the input sequence of the conditional random field, where n is the length of the input sequence and $z_i$ is the input vector of the i-th word; $y = \{y_1, y_2, \dots, y_n\}$ is a label sequence, and $Y(z)$ denotes the set of all possible output label sequences for z;
s53: for a label sequence y, its score is defined as:
$s(z, y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}$ (13)
where A is the transition score matrix and $A_{j,k}$ represents the score of the transition from label j to label k; P is the score matrix output by the previous network layer, and $P_{j,k}$ is the score of the k-th label for the j-th word;
s54: for an input sequence z, the probability of the label sequence y is defined as:
$p(y \mid z) = \dfrac{e^{s(z, y)}}{\sum_{\tilde{y} \in Y(z)} e^{s(z, \tilde{y})}}$ (14)
during training, the log probability of the correct label sequence is maximized;
s55: at decoding time, the sequence $y^*$ with the highest score is taken as the final output sequence, as shown in equation (15):
$y^* = \arg\max_{\tilde{y} \in Y(z)} s(z, \tilde{y})$ (15)
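For illustration, the score of equation (13) and the decoding of equation (15) can be sketched as below; start/end transition terms are omitted for brevity, and in practice an off-the-shelf CRF layer would typically be used instead of this hand-rolled version.

```python
import torch

def crf_score(emissions, transitions, tags):
    """s(z, y) for one tag sequence: emission plus transition scores (eq. 13),
    with start/end transitions omitted.  emissions: (n, num_tags) matrix P;
    transitions: (num_tags, num_tags) matrix A; tags: (n,) gold labels."""
    score = emissions[torch.arange(len(tags)), tags].sum()   # sum of P_{i, y_i}
    score += transitions[tags[:-1], tags[1:]].sum()          # sum of A_{y_i, y_{i+1}}
    return score

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence y* (eq. 15) by dynamic programming."""
    n, num_tags = emissions.shape
    score = emissions[0].clone()               # best score ending in each tag at position 0
    backpointers = []
    for i in range(1, n):
        # total[prev, cur] = best score so far ending in prev + A[prev, cur] + P[i, cur]
        total = score.unsqueeze(1) + transitions + emissions[i].unsqueeze(0)
        score, best_prev = total.max(dim=0)
        backpointers.append(best_prev)
    best_tag = int(score.argmax())
    path = [best_tag]
    for bp in reversed(backpointers):
        best_tag = int(bp[best_tag])
        path.append(best_tag)
    return list(reversed(path))
```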
compared with the prior art, the invention has the beneficial effects that:
1. The invention provides a multi-task named entity recognition method combining text classification for a specific domain. Aiming at the lack of labeled data in specific domains (such as the biomedical field), the method makes full use of multi-task learning and explores related-domain data sets to improve the named entity recognition accuracy of the target domain.
2. The method uses a text classification model to measure the relevance between related-domain data and the target task: related-domain data with high relevance to the target task participates in updating the shared layer parameters, while data with low relevance only participates in updating its own task layer parameters. This prevents irrelevant data from interfering with the training of the target task and makes effective use of relevant data to improve the target task.
Drawings
FIG. 1 is a schematic diagram of a text classification model based on a convolutional neural network;
FIG. 2 is a schematic diagram of a bi-directional LSTM neural network;
FIG. 3 is a block diagram of a method for multi-tasking named entity recognition for federated text classification;
FIG. 4 is a training flow of a method for multi-task named entity recognition with joint text classification.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
The invention mainly realizes a multi-task named entity recognition method combining text classification for a specific domain. Aiming at the lack of labeled data in specific domains (such as the biomedical field), the method makes full use of multi-task learning and explores related-domain data sets to improve the named entity recognition accuracy of the target domain. The invention uses the text classification model based on a convolutional neural network shown in FIG. 1 to measure the relevance between related-domain data and the target task. The result of concatenating the character feature vector and the word vector is input into the bidirectional LSTM neural network shown in FIG. 2 and then into the task layer of the main task or an auxiliary task; the overall framework of the multi-task model is shown in FIG. 3.
The invention discloses a multi-task named entity recognition method based on combined text classification, which comprises the following specific steps:
s1: and constructing a text classifier by using a convolutional neural network, and measuring the similarity of the text.
In this embodiment, the sub-steps of specifically implementing S1 are as follows:
s11: input each word of a sentence and convert it into a word vector of dimension k through a word embedding module; let the word vector of the i-th word in the sentence be $x_i \in \mathbb{R}^k$; if the sentence length is n, the sentence is represented as:
$x_{1:n} = [x_1; x_2; \dots; x_n]$ (1)
s12: let the convolution kernel be $w \in \mathbb{R}^{h \times k}$, where $h \times k$ is the dimension of the convolution kernel; convolution over the window $x_{i:i+h-1}$ yields the feature $c_i$:
$c_i = f(w \cdot x_{i:i+h-1} + b)$ (2)
where b represents the bias; the features constructed over the sentence of length n are:
$c = [c_1; c_2; \dots; c_{n-h+1}]$ (3)
s13: apply max pooling over c and take the maximum as the feature corresponding to convolution kernel w:
$\hat{c} = \max\{c\}$ (4)
s14: repeat the above operations with multiple convolution kernels $w_1, w_2, \dots, w_s$, concatenate the resulting features $\hat{c}_1, \hat{c}_2, \dots, \hat{c}_s$, feed them into a fully connected network, and classify with a Softmax function; the Softmax function is defined as follows:
$S_i = \dfrac{e^{V_i}}{\sum_{j=1}^{M} e^{V_j}}$ (5)
where V is the input of the Softmax function and $V_i$ is the i-th element of the input vector; S is the output of the Softmax function and $S_i$, the i-th element of the output vector, is the probability that the input sentence belongs to the i-th category; M is the number of categories.
S2: and selecting a proper threshold, and determining whether the auxiliary task data set participates in the update of the shared layer parameters according to the comparison between the text classification result and the threshold.
In this embodiment, the sub-steps of specifically implementing S2 are as follows:
s21: setting m data sets, wherein the first data set is set as a main task, and the rest m-1 data sets are auxiliary tasks;
s22: after the text classifier is trained, each sentence passed through the classifier produces a vector $k \in \mathbb{R}^M$; the first component of this vector is denoted $k_0$, and each data set takes the average of $k_0$ over all of its sentences as the threshold for that data set;
s23: when the multi-task named entity recognition model is trained, the data of the main task updates the sharing layer by default;
s24: the data of the auxiliary task is first passed through the text classifier; if the classifier output $k_0$ is larger than the threshold, both the task layer and the sharing layer are updated; otherwise, only the task layer is updated.
S3: and cascading character vectors of the text and pre-trained word vectors to serve as input feature vectors.
In this embodiment, the sub-steps of specifically implementing S3 are as follows:
s31: a natural language processing tool is used to split the document into sentences and words; the sentences, words and labels are collected into a sentence table, a word table and a label table, and the characters appearing in the word table are collected into a character table;
s32: let C be the character table and d the dimension of each character vector; the character vector matrix then lies in $\mathbb{R}^{d \times |C|}$;
s33: let the vector of the i-th character of the word t be $t_i \in \mathbb{R}^d$; the word is denoted $t_{1:l} = [t_1; t_2; \dots; t_l]$, where l is the length of the word t;
s34: a kernel $w \in \mathbb{R}^{h \times d}$ of height h is used for the convolution; a bias value b is added and a nonlinearity is applied to the whole convolution result to obtain the feature map $f_t$, whose i-th element $f_t(i)$ is given by formula (6):
$f_t(i) = \tanh(w \cdot t_{i:i+h-1} + b)$ (6)
s35: take $y_t = \max_i f_t(i)$ as the feature of the word t corresponding to convolution kernel w;
s36: repeat the above operations with multiple convolution kernels $w_1, w_2, \dots, w_q$, concatenate the resulting features $y_t^{(1)}, y_t^{(2)}, \dots, y_t^{(q)}$, and then concatenate this character feature with the pre-trained word vector of the word t to form the input feature vector of t.
S4: in the sharing layer, the input feature vector of each word in the sentence is modeled by using bidirectional LSTM, and the common features of all tasks are learned.
In this embodiment, the sub-steps of specifically implementing S4 are as follows:
s41: define $x_t$ as the input feature vector at time t and $h_t$ as the hidden state vector storing all useful information up to time t; $\sigma$ denotes the sigmoid function and $*$ denotes element-wise multiplication; $U_i, U_f, U_c, U_o$ are the weight matrices applied to the input $x_t$ in the different gates, $W_i, W_f, W_c, W_o$ are the weight matrices applied to the hidden state, and $b_i, b_f, b_c, b_o$ are bias vectors;
s42: the forget gate at time t is computed as shown in equation (7):
$f_t = \sigma(W_f h_{t-1} + U_f x_t + b_f)$ (7)
$f_t$ determines the proportion of the cell state at time t-1 that needs to be forgotten;
s43: the information to be stored into the cell state up to time t is computed as shown in equations (8) and (9):
$i_t = \sigma(W_i h_{t-1} + U_i x_t + b_i)$ (8)
$\tilde{C}_t = \tanh(W_c h_{t-1} + U_c x_t + b_c)$ (9)
where $\tilde{C}_t$ is the candidate vector to be added to the cell state at time t, and $i_t$ determines the proportion of $\tilde{C}_t$ that is stored;
s44: the results of the two previous steps are combined to produce the new cell state, as shown in equation (10):
$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$ (10)
where $C_t$ is the cell state at time t;
s45: the output at time t is computed as shown in equations (11) and (12):
$o_t = \sigma(W_o h_{t-1} + U_o x_t + b_o)$ (11)
$h_t = o_t * \tanh(C_t)$ (12)
where $o_t$ determines the proportion of the cell state emitted as output at time t, and $h_t$ is the hidden state vector at time t, used as the output information at time t;
s46: the hidden state $h_t$ above stores all past information; a hidden state $g_t$ storing future information is computed in the same way in the reverse direction, and the two hidden states are concatenated to form the final output vector.
S5: training each task in turn on the task layer, transmitting the output of the sharing layer to the bidirectional LSTM neural network in the main task private layer or the auxiliary task private layer, then using the linear chain random field to decode the label of the whole sentence, and labeling the entity in the sentence.
In this embodiment, the sub-steps of specifically implementing S5 are as follows:
s51: the output of the sharing layer is used as input and is transmitted into a bidirectional LSTM private layer of a main task or an auxiliary task, and then the output of the bidirectional LSTM private layer is used as the input of a conditional random field;
s52: let $z = \{z_1, z_2, \dots, z_n\}$ denote the input sequence of the conditional random field, where n is the length of the input sequence and $z_i$ is the input vector of the i-th word; $y = \{y_1, y_2, \dots, y_n\}$ is a label sequence, and $Y(z)$ denotes the set of all possible output label sequences for z;
s53: for a label sequence y, its score is defined as:
$s(z, y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}$ (13)
where A is the transition score matrix and $A_{j,k}$ represents the score of the transition from label j to label k; P is the score matrix output by the previous network layer, and $P_{j,k}$ is the score of the k-th label for the j-th word;
s54: for an input sequence z, the probability of the label sequence y is defined as:
$p(y \mid z) = \dfrac{e^{s(z, y)}}{\sum_{\tilde{y} \in Y(z)} e^{s(z, \tilde{y})}}$ (14)
during training, the log probability of the correct label sequence is maximized;
s55: at decoding time, the sequence $y^*$ with the highest score is taken as the final output sequence, as shown in equation (15):
$y^* = \arg\max_{\tilde{y} \in Y(z)} s(z, \tilde{y})$ (15)
The method is applied in the following embodiment; the specific steps and parameter definitions are as described above and are not all repeated here. The embodiment mainly shows the concrete implementation and its technical effects.
Examples
Taking 3 public data sets of the cellular component category in the biomedical field (BioNLP13CG, BioNLP13PC and CRAFT) as an example, the method is applied to these 3 data sets for named entity recognition; the specific parameters and practices in each step are as follows.
Training the text classifier:
1. each word in the input sentence is converted into a word vector of dimension 128 by the word embedding module; a sentence of length n may be represented as $x_{1:n} = [x_1; x_2; \dots; x_n]$;
2. convolution kernels of three sizes (3, 4 and 5) are used, with 100 kernels of each size, to construct features over the sentence of length n; the resulting features are recorded as c;
3. the pooling layer selects the maximum of the features, $\hat{c} = \max\{c\}$;
4. all the features are concatenated and fed into a fully connected network, and classification is performed with a Softmax function, which completes the text classifier. The text classifier is trained with a batch size of 64, dropout of 0.5 and an initial learning rate of 0.001;
selecting a proper threshold value:
5. for example, the named entity recognition task of BioNLP13CG is used as the main task and the other two tasks as auxiliary tasks; after the text classifier is trained, each sentence of the data sets BioNLP13PC and CRAFT produces one vector k through the text classifier. The first component of the vector k is denoted $k_0$, and each of the two data sets takes the average of $k_0$ over all of its sentences as its threshold;
6. during multi-task model training, the data of BioNLP13CG updates the shared layer by default. The data of BioNLP13PC and CRAFT are first processed by the text classifier; when the classifier output $k_0$ is larger than the corresponding threshold, both the task layer and the shared layer are updated; otherwise, only the task layer is updated;
extracting character feature vectors of the text, and cascading the character feature vectors and the pre-trained word vectors as input feature vectors:
7. and performing sentence segmentation and word segmentation on the document by adopting a natural language processing tool, and performing statistics on sentences, words and labels to form a sentence table, a vocabulary table and a label table. Counting characters in the word list to form a character list;
8. let C be the character table and d the dimension of each character vector; the character vector matrix then lies in $\mathbb{R}^{d \times |C|}$;
9. let the vector of the i-th character of the word t be $t_i \in \mathbb{R}^d$; the word is denoted $t_{1:l} = [t_1; t_2; \dots; t_l]$, where l is the length of the word t;
10. a kernel $w \in \mathbb{R}^{h \times d}$ of height h is used for the convolution; a bias value b is added and a nonlinearity is applied to the whole convolution result to obtain the feature map $f_t$, whose i-th element $f_t(i)$ is given by formula (6). Take $y_t = \max_i f_t(i)$ as the feature of the word t corresponding to convolution kernel w.
11. multiple convolution kernels $w_1, w_2, \dots, w_q$ are applied in the same way, the resulting features are spliced together, and the result is concatenated with the pre-trained 100-dimensional GloVe word vector of the word t (the publicly released Stanford GloVe vectors trained on a 6-billion-token corpus) to form the input feature vector of t.
At the sharing layer, the input feature vector for each word in the sentence is modeled using bi-directional LSTM:
12. in the sharing layer, the input feature vector obtained in step 11 is fed into a bidirectional LSTM; the parameters of the bidirectional LSTM network are updated with the Adam optimization algorithm using a batch size of 10 and dropout of 0.5, the initial learning rate is 0.015, and after each iteration the learning rate is updated as $\eta_e = \eta_0 / (1 + d \cdot e)$, where the decline rate d is 0.05 and e is the iteration number;
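Assuming the decay takes the standard form lr_e = lr_0 / (1 + d·e) (the exact formula in the original figure is not visible, so this is an inference from the stated decline rate and iteration count), the schedule of step 12 can be written as:

```python
def decayed_learning_rate(initial_lr=0.015, decay_rate=0.05, epoch=0):
    """Assumed learning-rate schedule for step 12: lr_e = lr_0 / (1 + d * e)."""
    return initial_lr / (1.0 + decay_rate * epoch)

# the first few epochs: 0.015, ~0.01429, ~0.01364, ~0.01304
for e in range(4):
    print(e, round(decayed_learning_rate(epoch=e), 5))
```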
13. define $x_t$ as the input feature vector at time t and $h_t$ as the hidden state vector storing all useful information up to time t; $\sigma$ denotes the sigmoid function and $*$ denotes element-wise multiplication; $U_i, U_f, U_c, U_o$ are the weight matrices applied to the input $x_t$ in the different gates, $W_i, W_f, W_c, W_o$ are the weight matrices applied to the hidden state, and $b_i, b_f, b_c, b_o$ are bias vectors;
14. the forget gate at time t is computed as:
$f_t = \sigma(W_f h_{t-1} + U_f x_t + b_f)$
15. the information to be stored into the cell state up to time t is computed as:
$i_t = \sigma(W_i h_{t-1} + U_i x_t + b_i)$
$\tilde{C}_t = \tanh(W_c h_{t-1} + U_c x_t + b_c)$
where $\tilde{C}_t$ is the candidate vector that can be added to the cell state at time t;
16. the results of the two previous steps are combined to produce the new cell state:
$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$
where $C_t$ is the cell state vector at time t;
17. the output at time t is computed and $h_t$ is updated as:
$o_t = \sigma(W_o h_{t-1} + U_o x_t + b_o)$
$h_t = o_t * \tanh(C_t)$
where $o_t$ is the output gate at time t and $h_t$ is the hidden layer vector at time t;
18. the hidden state $h_t$ above stores all past information; a hidden state $g_t$ storing future information is computed in the same way in the reverse direction, and the two hidden states are concatenated to form the final output vector.
Training each task in turn at the task level:
19. the output of BioNLP13CG at the shared layer is used as input to be transmitted into the bidirectional LSTM network of the private layer of the main task, and the output of BioNLP13PC and CRAFT at the shared layer is used as input to be transmitted into the bidirectional LSTM networks of the private layers of the auxiliary task 1 and the auxiliary task 2 respectively. Taking the output of the bidirectional LSTM as the input of the conditional random field;
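To illustrate step 19, a task-private layer can be sketched as a task-specific bidirectional LSTM over the shared-layer output whose emissions feed the conditional random field; the module below is a hypothetical assembly with illustrative dimensions, and a library CRF layer could be substituted for the scoring and decoding functions sketched earlier.

```python
import torch
import torch.nn as nn

class TaskPrivateLayer(nn.Module):
    """Per-task private layer (step 19 sketch): a task-specific BiLSTM over the shared-layer
    output, producing the emission matrix P consumed by the CRF; A is the transition matrix."""
    def __init__(self, shared_dim, hidden_dim=100, num_tags=10):
        super().__init__()
        self.lstm = nn.LSTM(shared_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.emission = nn.Linear(2 * hidden_dim, num_tags)                # score matrix P
        self.transitions = nn.Parameter(torch.zeros(num_tags, num_tags))   # transition matrix A

    def forward(self, shared_out):              # shared_out: (batch, seq_len, shared_dim)
        private_out, _ = self.lstm(shared_out)
        return self.emission(private_out)       # emissions handed to the CRF for decoding
```

One such module would be instantiated per task (the main task and each auxiliary task), all consuming the output of the same shared BiLSTM.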
and (3) carrying out entity labeling on each word by using a conditional random field:
20. let $z = \{z_1, z_2, \dots, z_n\}$ denote the input sequence of the conditional random field, where n is the length of the input sequence and $z_i$ is the input vector of the i-th word; $y = \{y_1, y_2, \dots, y_n\}$ is a possible output label sequence of z, and $Y(z)$ denotes the set of all such sequences;
21. for a label sequence y, its score is defined as:
$s(z, y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}$
where A is the transition score matrix and $A_{j,k}$ represents the score of the transition from label j to label k; P is the score matrix output by the previous network layer, and $P_{j,k}$ is the score of the k-th label for the j-th word.
22. for an input sequence z, the probability of the label sequence y is defined as:
$p(y \mid z) = \dfrac{e^{s(z, y)}}{\sum_{\tilde{y} \in Y(z)} e^{s(z, \tilde{y})}}$
during training, the log probability of the correct label sequence is maximized;
23. at decoding time, the sequence $y^*$ with the highest score is taken as the final output sequence:
$y^* = \arg\max_{\tilde{y} \in Y(z)} s(z, \tilde{y})$
24. and identifying the position of the marked words in the original file, and neatly feeding back the marking result to the user, so that the marking accuracy can be calculated. The following table was achieved:
data set Single task BioNLP13CG BioNLP13PC CRAFT
BioNLP13CG 74.72 77.11 77.65 69.16
BioNLP13PC 88.17 78.16 89.12 77.23
CRAFT 64.24 61.53 62.31 64.72
The "Single task" column gives the accuracy of each of the 3 data sets when its named entity recognition task is trained on its own. The "BioNLP13CG" column gives the accuracy when BioNLP13CG is the main task and the remaining 2 data sets are auxiliary tasks; the "BioNLP13PC" and "CRAFT" columns are defined in the same way.
The experimental results show that the accuracy of the main task in the multi-task model is generally higher than that of the corresponding single task, so the method effectively improves the accuracy of the target task.

Claims (6)

1. A multitask named entity recognition method combining text classification is characterized by comprising the following steps:
s1: constructing a text classifier by using a convolutional neural network, and measuring the similarity of texts;
s2: selecting a threshold, and determining whether the auxiliary task data set participates in updating of the shared layer parameters according to the comparison between the text classification result and the threshold;
s3: cascading character vectors of the text and pre-trained word vectors to serve as input feature vectors;
s4: in a sharing layer, modeling an input feature vector of each word in a sentence by using a bidirectional LSTM, and learning common features of each task;
s5: training each task in turn on the task layer, transmitting the output of the sharing layer to the bidirectional LSTM neural network in the main task private layer or the auxiliary task private layer, then using the linear chain random field to decode the label of the whole sentence, and labeling the entity in the sentence.
2. The method for multi-task named entity recognition through combined text classification according to claim 1, wherein in step S1, a text classifier is constructed by using a convolutional neural network, and the specific steps for measuring the similarity of texts are as follows:
s11: input each word of a sentence and convert it into a word vector of dimension k through a word embedding module; let the word vector of the i-th word in the sentence be $x_i \in \mathbb{R}^k$; if the sentence length is n, the sentence is represented as:
$x_{1:n} = [x_1; x_2; \dots; x_n]$ (1)
s12: let the convolution kernel be $w \in \mathbb{R}^{h \times k}$, where $h \times k$ is the dimension of the convolution kernel; convolution over the window $x_{i:i+h-1}$ yields the feature $c_i$:
$c_i = f(w \cdot x_{i:i+h-1} + b)$ (2)
where b represents the bias; the features constructed over the sentence of length n are:
$c = [c_1; c_2; \dots; c_{n-h+1}]$ (3)
s13: apply max pooling over c and take the maximum as the feature corresponding to convolution kernel w:
$\hat{c} = \max\{c\}$ (4)
s14: repeat the above operations with multiple convolution kernels $w_1, w_2, \dots, w_s$, concatenate the resulting features $\hat{c}_1, \hat{c}_2, \dots, \hat{c}_s$, feed them into a fully connected network, and classify with a Softmax function; the Softmax function is defined as follows:
$S_i = \dfrac{e^{V_i}}{\sum_{j=1}^{M} e^{V_j}}$ (5)
where V is the input of the Softmax function and $V_i$ is the i-th element of the input vector; S is the output of the Softmax function and $S_i$, the i-th element of the output vector, is the probability that the input sentence belongs to the i-th category; M is the number of categories.
3. The method as claimed in claim 1, wherein the step S2 of selecting the threshold, and the specific steps of determining whether the auxiliary task data set participates in the update of the shared layer parameter according to the comparison between the text classification result and the threshold are as follows:
s21: setting m data sets, wherein the first data set is set as a main task, and the rest m-1 data sets are auxiliary tasks;
s22: after the text classifier is trained, each sentence passed through the classifier produces a vector $k \in \mathbb{R}^M$; the first component of this vector is denoted $k_0$, and each data set takes the average of $k_0$ over all of its sentences as the threshold for that data set;
s23: when the multi-task named entity recognition model is trained, the data of the main task updates the sharing layer by default;
s24: the data of the auxiliary task is first passed through the text classifier; if the classifier output $k_0$ is larger than the threshold, both the task layer and the sharing layer are updated; otherwise, only the task layer is updated.
4. The method for multi-task named entity recognition through combined text classification as claimed in claim 1, wherein in step S3, the step of concatenating the character vector of the text and the pre-trained word vector as the input feature vector comprises:
s31: a natural language processing tool is used to split the document into sentences and words; the sentences, words and labels are collected into a sentence table, a word table and a label table, and the characters appearing in the word table are collected into a character table;
s32: let C be the character table and d the dimension of each character vector; the character vector matrix then lies in $\mathbb{R}^{d \times |C|}$;
s33: let the vector of the i-th character of the word t be $t_i \in \mathbb{R}^d$; the word is denoted $t_{1:l} = [t_1; t_2; \dots; t_l]$, where l is the length of the word t;
s34: a kernel $w \in \mathbb{R}^{h \times d}$ of height h is used for the convolution; a bias value b is added and a nonlinearity is applied to the whole convolution result to obtain the feature map $f_t$, whose i-th element $f_t(i)$ is given by formula (6):
$f_t(i) = \tanh(w \cdot t_{i:i+h-1} + b)$ (6)
s35: take $y_t = \max_i f_t(i)$ as the feature of the word t corresponding to convolution kernel w;
s36: repeat the above operations with multiple convolution kernels $w_1, w_2, \dots, w_q$, concatenate the resulting features $y_t^{(1)}, y_t^{(2)}, \dots, y_t^{(q)}$, and then concatenate this character feature with the pre-trained word vector of the word t to form the input feature vector of t.
5. The method as claimed in claim 1, wherein in step S4, the step of learning the common features of each task by modeling the input feature vector of each word in the sentence with bidirectional LSTM in the sharing layer comprises the following steps:
s41: define $x_t$ as the input feature vector at time t and $h_t$ as the hidden state vector storing all useful information up to time t; $\sigma$ denotes the sigmoid function and $*$ denotes element-wise multiplication; $U_i, U_f, U_c, U_o$ are the weight matrices applied to the input $x_t$ in the different gates, $W_i, W_f, W_c, W_o$ are the weight matrices applied to the hidden state, and $b_i, b_f, b_c, b_o$ are bias vectors;
s42: the forget gate at time t is computed as shown in equation (7):
$f_t = \sigma(W_f h_{t-1} + U_f x_t + b_f)$ (7)
$f_t$ determines the proportion of the cell state at time t-1 that needs to be forgotten;
s43: the information to be stored into the cell state up to time t is computed as shown in equations (8) and (9):
$i_t = \sigma(W_i h_{t-1} + U_i x_t + b_i)$ (8)
$\tilde{C}_t = \tanh(W_c h_{t-1} + U_c x_t + b_c)$ (9)
where $\tilde{C}_t$ is the candidate vector to be added to the cell state at time t, and $i_t$ determines the proportion of $\tilde{C}_t$ that is stored;
s44: the results of the two previous steps are combined to produce the new cell state, as shown in equation (10):
$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$ (10)
where $C_t$ is the cell state at time t;
s45: the output at time t is computed as shown in equations (11) and (12):
$o_t = \sigma(W_o h_{t-1} + U_o x_t + b_o)$ (11)
$h_t = o_t * \tanh(C_t)$ (12)
where $o_t$ determines the proportion of the cell state emitted as output at time t, and $h_t$ is the hidden state vector at time t, used as the output information at time t;
s46: the hidden state $h_t$ above stores all past information; a hidden state $g_t$ storing future information is computed in the same way in the reverse direction, and the two hidden states are concatenated to form the final output vector.
6. The method as claimed in claim 1, wherein in step S5, the steps of training each task in turn at task layer, transmitting the output of the shared layer to the bi-directional LSTM neural network in the main task private layer or the auxiliary task private layer, then using linear chain random field to decode the label of the whole sentence, and labeling the entity in the sentence are as follows:
s51: the output of the sharing layer is used as input and is transmitted into a bidirectional LSTM private layer of a main task or an auxiliary task, and then the output of the bidirectional LSTM private layer is used as the input of a conditional random field;
s52: let $z = \{z_1, z_2, \dots, z_n\}$ denote the input sequence of the conditional random field, where n is the length of the input sequence and $z_i$ is the input vector of the i-th word; $y = \{y_1, y_2, \dots, y_n\}$ is a label sequence, and $Y(z)$ denotes the set of all possible output label sequences for z;
s53: for a label sequence y, its score is defined as:
$s(z, y) = \sum_{i=0}^{n} A_{y_i, y_{i+1}} + \sum_{i=1}^{n} P_{i, y_i}$ (13)
where A is the transition score matrix and $A_{j,k}$ represents the score of the transition from label j to label k; P is the score matrix output by the previous network layer, and $P_{j,k}$ is the score of the k-th label for the j-th word;
s54: for an input sequence z, the probability of the label sequence y is defined as:
$p(y \mid z) = \dfrac{e^{s(z, y)}}{\sum_{\tilde{y} \in Y(z)} e^{s(z, \tilde{y})}}$ (14)
during training, the log probability of the correct label sequence is maximized;
s55: at decoding time, the sequence $y^*$ with the highest score is taken as the final output sequence, as shown in equation (15):
$y^* = \arg\max_{\tilde{y} \in Y(z)} s(z, \tilde{y})$ (15)
CN201911417834.1A 2019-12-31 2019-12-31 Multi-task named entity recognition method combining text classification Active CN111209738B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911417834.1A CN111209738B (en) 2019-12-31 2019-12-31 Multi-task named entity recognition method combining text classification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911417834.1A CN111209738B (en) 2019-12-31 2019-12-31 Multi-task named entity recognition method combining text classification

Publications (2)

Publication Number Publication Date
CN111209738A true CN111209738A (en) 2020-05-29
CN111209738B CN111209738B (en) 2021-03-26

Family

ID=70786490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911417834.1A Active CN111209738B (en) 2019-12-31 2019-12-31 Multi-task named entity recognition method combining text classification

Country Status (1)

Country Link
CN (1) CN111209738B (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110119050A1 (en) * 2009-11-18 2011-05-19 Koen Deschacht Method for the automatic determination of context-dependent hidden word distributions
CN106569998A (en) * 2016-10-27 2017-04-19 浙江大学 Text named entity recognition method based on Bi-LSTM, CNN and CRF
CN108153895A (en) * 2018-01-06 2018-06-12 国网福建省电力有限公司 A kind of building of corpus method and system based on open data
CN108228568A (en) * 2018-01-24 2018-06-29 上海互教教育科技有限公司 A kind of mathematical problem semantic understanding method
CN108229582A (en) * 2018-02-01 2018-06-29 浙江大学 Entity recognition dual training method is named in a kind of multitask towards medical domain
CN108415977A (en) * 2018-02-09 2018-08-17 华南理工大学 One is read understanding method based on the production machine of deep neural network and intensified learning
CN108536679A (en) * 2018-04-13 2018-09-14 腾讯科技(成都)有限公司 Name entity recognition method, device, equipment and computer readable storage medium
CN108664589A (en) * 2018-05-08 2018-10-16 苏州大学 Text message extracting method, device, system and medium based on domain-adaptive
CN108595708A (en) * 2018-05-10 2018-09-28 北京航空航天大学 A kind of exception information file classification method of knowledge based collection of illustrative plates
CN109255119A (en) * 2018-07-18 2019-01-22 五邑大学 A kind of sentence trunk analysis method and system based on the multitask deep neural network for segmenting and naming Entity recognition
CN109766417A (en) * 2018-11-30 2019-05-17 浙江大学 A kind of construction method of the literature annals question answering system of knowledge based map
CN110046709A (en) * 2019-04-22 2019-07-23 成都新希望金融信息有限公司 A kind of multi-task learning model based on two-way LSTM
CN110134954A (en) * 2019-05-06 2019-08-16 北京工业大学 A kind of name entity recognition method based on Attention mechanism
CN110162795A (en) * 2019-05-30 2019-08-23 重庆大学 A kind of adaptive cross-cutting name entity recognition method and system

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GAMAL CRICHTON等: "A neural network multi-task learning approach to biomedical named entity recognition", 《BMC BIOINFORMATICS》 *
TUNG TRAN等: "A Multi-Task Learning Framework for Extracting Drugs and Their Interactions from Drug Labels", 《HTTPS://ARXIV.ORG/PDF/1905.07464.PDF》 *
XI WANG等: "Multitask learning for biomedical named entity recognition with cross-sharing structure", 《BMC BIOINFORMATICS》 *
奚雪峰等: "面向自然语言处理的深度学习研究", 《自动化学报》 *
陈伟等: "基于BiLSTM-CRF的关键词自动抽取", 《计算机科学》 *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859936A (en) * 2020-07-09 2020-10-30 大连理工大学 Cross-domain establishment oriented legal document professional jurisdiction identification method based on deep hybrid network
CN112085251A (en) * 2020-08-03 2020-12-15 广州数说故事信息科技有限公司 Consumer product research and development combined concept recommendation method and system
CN113743111A (en) * 2020-08-25 2021-12-03 国家计算机网络与信息安全管理中心 Financial risk prediction method and device based on text pre-training and multi-task learning
CN112039997A (en) * 2020-09-03 2020-12-04 重庆邮电大学 Triple-feature-based Internet of things terminal identification method
CN112052684A (en) * 2020-09-07 2020-12-08 南方电网数字电网研究院有限公司 Named entity identification method, device, equipment and storage medium for power metering
CN112541355A (en) * 2020-12-11 2021-03-23 华南理工大学 Few-sample named entity identification method and system with entity boundary class decoupling
CN112541355B (en) * 2020-12-11 2023-07-18 华南理工大学 Entity boundary type decoupling few-sample named entity recognition method and system
CN113064993A (en) * 2021-03-23 2021-07-02 南京视察者智能科技有限公司 Design method, optimization method and labeling method of automatic text classification labeling system based on big data
CN113064993B (en) * 2021-03-23 2023-07-21 南京视察者智能科技有限公司 Design method, optimization method and labeling method of automatic text classification labeling system based on big data
CN113204970A (en) * 2021-06-07 2021-08-03 吉林大学 BERT-BilSTM-CRF named entity detection model and device
CN113254617B (en) * 2021-06-11 2021-10-22 成都晓多科技有限公司 Message intention identification method and system based on pre-training language model and encoder
CN113254617A (en) * 2021-06-11 2021-08-13 成都晓多科技有限公司 Message intention identification method and system based on pre-training language model and encoder
CN113255342A (en) * 2021-06-11 2021-08-13 云南大学 Method and system for identifying product name of 5G mobile service
CN113255342B (en) * 2021-06-11 2022-09-30 云南大学 Method and system for identifying product name of 5G mobile service
CN114048749A (en) * 2021-11-19 2022-02-15 重庆邮电大学 Chinese named entity recognition method suitable for multiple fields
CN114048749B (en) * 2021-11-19 2024-02-02 北京第一因科技有限公司 Chinese named entity recognition method suitable for multiple fields
CN114036933A (en) * 2022-01-10 2022-02-11 湖南工商大学 Information extraction method based on legal documents
CN114036933B (en) * 2022-01-10 2022-04-22 湖南工商大学 Information extraction method based on legal documents
CN115688777B (en) * 2022-09-28 2023-05-05 北京邮电大学 Named entity recognition system for nested and discontinuous entities of Chinese financial text
CN115688777A (en) * 2022-09-28 2023-02-03 北京邮电大学 Named entity recognition system for nested and discontinuous entities of Chinese financial text
CN116074317A (en) * 2023-02-20 2023-05-05 王春辉 Service resource sharing method and server based on big data
CN116074317B (en) * 2023-02-20 2024-03-26 新疆八达科技发展有限公司 Service resource sharing method and server based on big data

Also Published As

Publication number Publication date
CN111209738B (en) 2021-03-26

Similar Documents

Publication Publication Date Title
CN111209738B (en) Multi-task named entity recognition method combining text classification
CN113011533B (en) Text classification method, apparatus, computer device and storage medium
CN106980683B (en) Blog text abstract generating method based on deep learning
CN110209806B (en) Text classification method, text classification device and computer readable storage medium
Wang et al. Mapping customer needs to design parameters in the front end of product design by applying deep learning
CN110188272B (en) Community question-answering website label recommendation method based on user background
Sun et al. Sentiment analysis for Chinese microblog based on deep neural networks with convolutional extension features
CN110807320B (en) Short text emotion analysis method based on CNN bidirectional GRU attention mechanism
CN110765260A (en) Information recommendation method based on convolutional neural network and joint attention mechanism
CN112002411A (en) Cardiovascular and cerebrovascular disease knowledge map question-answering method based on electronic medical record
CN106569998A (en) Text named entity recognition method based on Bi-LSTM, CNN and CRF
CN110674252A (en) High-precision semantic search system for judicial domain
CN110489523B (en) Fine-grained emotion analysis method based on online shopping evaluation
CN112001186A (en) Emotion classification method using graph convolution neural network and Chinese syntax
CN112328900A (en) Deep learning recommendation method integrating scoring matrix and comment text
CN107818084B (en) Emotion analysis method fused with comment matching diagram
CN106708929B (en) Video program searching method and device
CN112884551B (en) Commodity recommendation method based on neighbor users and comment information
CN113591483A (en) Document-level event argument extraction method based on sequence labeling
KR102155768B1 (en) Method for providing question and answer data set recommendation service using adpative learning from evoloving data stream for shopping mall
CN111274790A (en) Chapter-level event embedding method and device based on syntactic dependency graph
CN112966068A (en) Resume identification method and device based on webpage information
CN111274829A (en) Sequence labeling method using cross-language information
CN111582506A (en) Multi-label learning method based on global and local label relation
CN111222318A (en) Trigger word recognition method based on two-channel bidirectional LSTM-CRF network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant