CN113360667A - Biomedical trigger word detection and named entity identification method based on multitask learning - Google Patents
- Publication number: CN113360667A
- Application number: CN202110617440.1A
- Authority
- CN
- China
- Prior art keywords: word, sequence, trigger, named entity, ner
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
Abstract
The invention discloses a biomedical trigger word detection and named entity identification method based on multitask learning, which comprises the following steps: 1, preprocessing an unstructured biomedical text by word segmentation and sentence segmentation, and labeling the preprocessed text to generate a standard data set; 2, constructing a multitask-learning neural network model for biomedical trigger word detection and named entity recognition; 3, training the neural network model and updating its parameters; and 4, predicting unlabeled data with the trained optimal model so as to identify the trigger words and named entities it contains. The method detects trigger words and recognizes named entities in biomedical text simultaneously, effectively improving recognition accuracy while reducing the demand on computing resources.
Description
Technical Field
The invention relates to the field of biomedical text mining, in particular to a biomedical trigger word detection and named entity identification method based on multi-task learning.
Background
A named entity is a specific noun or noun phrase in text that carries particularly important meaning. Named entity recognition can be divided into general-domain and domain-specific entity recognition. In the general domain, entities can be divided into organization names, person names, place names, and the like. In the biomedical domain, entities can be classified into cell entities, gene entities, protein entities, drug entities, disease entities, and the like. Compared with named entity recognition in the general domain, named entity recognition in the biomedical domain is more difficult because of entity nesting, word ambiguity, and the like. Accurate identification of biomedical entities can promote the further development of information extraction and natural language processing techniques. In the biomedical domain, named entity recognition technology can extract structured biomedical entity information from large numbers of unstructured documents, which facilitates the construction of biomedical knowledge graphs and databases.
The current popular named entity recognition methods can be mainly divided into rule-based methods, traditional machine learning-based methods, and deep learning-based methods. The rule-based approach relies primarily on manually formulated rules, including domain-specific name dictionaries and syntactic vocabulary patterns, to identify entities in text, and does not need a data set with label annotations. The traditional machine learning-based approach mainly relies on manually designed linguistic features, such as prefix and suffix features, lexical features, and syntactic features, to train a traditional machine learning algorithm to recognize named entities. In recent years, exploiting the advantage of deep neural networks in automatically extracting the internal features of data, many named entity recognition methods based on deep learning have been proposed. However, current named entity recognition methods can mostly perform only a single, independent entity recognition task, and their extraction of semantic features from the text is insufficient, so the recognition effect of existing methods is poor.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, and provides a biomedical trigger word detection and named entity identification method based on multi-task learning, so that trigger words and named entities in biomedical text can be detected simultaneously, recognition accuracy can be effectively improved, and the requirement on computing resources is reduced.
In order to achieve the purpose, the invention adopts the technical scheme that:
the invention relates to a biomedical trigger word detection and named entity identification method based on multitask learning, which is characterized by comprising the following steps of:
step 1, preprocessing unstructured biomedical texts:
performing word segmentation and sentence segmentation on the unstructured biomedical text to obtain a label-free training data set consisting of n sentence sequences, recorded as S = {S_1, S_2, ..., S_i, ..., S_n}; wherein S_i denotes the i-th sentence sequence, w_{i,j} denotes the j-th word sequence in the i-th sentence sequence, and c_{i,j,k} denotes the k-th character of the j-th word sequence w_{i,j} of the i-th sentence sequence S_i; n denotes the total number of sentences in the training data set, m denotes the total number of words in one sentence, and K denotes the total number of characters in one word;
step 2, labeling the training data set S:
step 2.1, setting the trigger word categories and the named entity recognition categories as L_td = {L_td^1, ..., L_td^n, ...} and L_ner = {L_ner^1, ..., L_ner^n, ...} respectively; wherein L_td^n denotes the n-th trigger word category and L_ner^n denotes the n-th entity category;
step 2.2, adding labels of the trigger word category and the named entity category simultaneously to all word sequences of the i-th sentence sequence S_i in the training data set S, thereby obtaining a labeled training data set S_td for the trigger word detection task and a labeled training data set S_ner for the named entity recognition task; wherein (w_{i,j}, y_td^{i,j}) denotes the j-th word sequence w_{i,j} of the i-th sentence sequence S_i together with its corresponding trigger word category, and (w_{i,j}, y_ner^{i,j}) denotes the j-th word sequence w_{i,j} together with its corresponding entity category;
Step 3, word vector pre-training:
obtaining biomedical documents and performing word segmentation and sentence segmentation to obtain a label-free training data set consisting of n′ sentence sequences, recorded as S′ = {S′_1, S′_2, ..., S′_i′, ..., S′_n′}; wherein S′_i′ denotes the i′-th sentence sequence; the label-free training data set S′ is trained with the Word2Vec tool, which is based on a language model, to obtain the pre-trained word vector matrix M;
step 4, data processing of the MTL-TD-NER model, a multitask-learning neural network for biomedical trigger word detection and named entity recognition; the MTL-TD-NER model consists of a word vector encoding layer based on hybrid embedding, a feature extraction layer based on bidirectional LSTM, and classification layers based on conditional random fields;
step 4.1, using the pre-trained word vector matrix M, converting the i-th sentence sequence S_i in text form into word-level word vector data of dimension V, recorded as X_i = {x_{i,1}, ..., x_{i,j}, ..., x_{i,m}}; wherein x_{i,j} denotes the word-level word vector of the j-th word w_{i,j};
step 4.2, processing the data with the word vector encoding layer based on hybrid embedding, which is composed of bidirectional long short-term memory (LSTM) network units;
step 4.2.1, inputting each character of the j-th word sequence w_{i,j} into a character-level bidirectional LSTM unit, which is trained on the probability of generating the corresponding word from all of its characters;
step 4.2.2, extracting the hidden-layer outputs of the bidirectional LSTM unit at the first character and the last character of the j-th word sequence w_{i,j} and concatenating them to form the character-level vector r_{i,j} of the j-th word sequence;
step 4.3, concatenating the word-level word vector x_{i,j} and the character-level vector r_{i,j} of the j-th word to obtain the hybrid-encoded word vector e_{i,j} of the j-th word sequence w_{i,j}, thereby obtaining the word vector sequence E_i = {e_{i,1}, ..., e_{i,m}} of the i-th sentence sequence S_i;
step 4.4, data processing of the feature extraction layer based on bidirectional LSTM:
inputting the word vector sequence E_i of the i-th sentence sequence S_i into a forward LSTM network layer and into a backward LSTM network layer; finally, the hidden-layer state outputs of the forward and backward LSTM networks at the j-th word vector e_{i,j} are concatenated as the context feature h_{i,j} of the word vector at the j-th position, thereby obtaining the feature sequence H_i = {h_{i,1}, ..., h_{i,m}} of the i-th sentence sequence S_i;
step 4.5, data processing of the classification layers based on conditional random fields:
step 4.5.1, constructing a classification layer for the trigger word detection task and a classification layer for the named entity recognition task, each taking a conditional random field as its basic unit; constructing a parallel information transformation layer Transform_td for the classification layer of the trigger word detection task, and a parallel information transformation layer Transform_ner for the classification layer of the named entity recognition task;
step 4.5.2, inputting the feature sequence H_i into the classification layer of the trigger word detection task to obtain the output O_td; then inputting O_td into Transform_td to obtain the trigger word feature information F_td; finally adding F_td to the feature sequence H_i to obtain the overall entity features, which are input into the classification layer of the named entity recognition task to obtain the final entity recognition result y_ner;
step 4.5.3, inputting the feature sequence H_i into the classification layer of the named entity recognition task to obtain the output O_ner; then inputting O_ner into Transform_ner to obtain the entity feature information F_ner; finally adding F_ner to the feature sequence H_i to obtain the overall trigger word features, which are input into the classification layer of the trigger word detection task to obtain the final trigger word recognition result y_td;
Step 5, training an MTL-TD-NER model to obtain an optimal trigger word detection and named entity recognition model:
step 5.1, setting parameter variables of the model:
the batch size is B, the current iteration number is epoch_now, the maximum iteration number is epoch_max, the number of consecutive iterations in which the loss of the model does not decrease is epoch_no, and the maximum number of such iterations for the early-stopping strategy is epoch_es;
Step 5.2, parameter initialization:
initializing each parameter of a word vector layer based on mixed coding embedding, a feature extraction layer based on bidirectional LSTM and a classification layer based on conditional random field by adopting a uniform distribution method;
step 5.3, starting from epoch_now, inputting batches of size B from the training data set S into the MTL-TD-NER model each time, and computing the loss between the output labels of the model and the correct labels in the training data set S with equation (1), so as to update the parameters in the model:

loss = λ · loss_td + μ · loss_ner (1)

in equation (1), loss_td and loss_ner are the loss functions of the trigger word detection and named entity recognition tasks, given by:

loss_td = −score(y_td) + log Σ_{ȳ_td ∈ Y_td} exp(score(ȳ_td)) (2)

loss_ner = −score(y_ner) + log Σ_{ȳ_ner ∈ Y_ner} exp(score(ȳ_ner)) (3)

in equations (2) and (3), y_td and y_ner are the trigger word label sequence and the named entity label sequence; score(y_td) and score(y_ner) are the scores of the trigger word and named entity label sequences output by the MTL-TD-NER model for the input sentence S_i; λ and μ are hyper-parameters used to balance the importance of the two tasks; Y_td denotes the set of all possible trigger word label sequences, Y_ner denotes the set of all possible entity label sequences, ȳ_td denotes a trigger word label sequence in Y_td, and ȳ_ner denotes an entity label sequence in Y_ner;
step 5.4, if epoch_now is less than epoch_max and epoch_no is less than epoch_es, adding 1 to epoch_now and returning to step 5.3; if epoch_now is greater than or equal to epoch_max, or epoch_no equals epoch_es, the optimal multitask-learning-based trigger word detection and named entity recognition network model is obtained;
step 6, identifying unlabeled data with the optimal trigger word detection and named entity recognition network model, so as to obtain the trigger word label and the named entity label of each word in the unlabeled data.
Compared with the prior art, the invention has the beneficial effects that:
1. the method is different from the traditional named entity identification method based on rules and machine learning, realizes an end-to-end neural network model, avoids the manual design of various rules such as lexical and syntactic rules and the manual extraction of linguistic features, and simplifies the implementation of trigger word detection and named entity identification.
2. The invention designs a neural network model to simultaneously process the trigger word detection task and the named entity recognition task, and adopts a hard parameter sharing mode to enable the two tasks to share the same word vector coding layer based on mixed coding embedding and the feature extraction layer based on bidirectional LSTM, thereby accelerating the training process of the model and improving the operation efficiency of the model.
3. The invention utilizes an information conversion layer to convert the mutually beneficial information between the trigger word and the named entity, can better mine useful characteristic information, and respectively inputs the useful characteristic information into the classification layers to help each other to better identify the trigger word and the named entity.
4. According to the method, the trigger detection task and the named entity recognition task are trained simultaneously under the multi-task learning framework, so that data enhancement can be performed implicitly, regularization is introduced, the risk of over-fitting is effectively avoided, and the recognition accuracy is improved.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
In this embodiment, a biomedical trigger word detection and named entity recognition method based on multi-task learning mainly uses a word vector coding layer based on hybrid embedding and a feature extraction layer based on bidirectional LSTM as a common part of two tasks, and then two classification layers based on conditional random fields are respectively constructed to simultaneously perform trigger word detection and named entity recognition, specifically as shown in fig. 1, according to the following steps:
step 1, preprocessing unstructured biomedical texts:
performing word segmentation and sentence segmentation on the unstructured biomedical text to obtain a label-free training data set consisting of n sentence sequences, recorded as S = {S_1, S_2, ..., S_i, ..., S_n}; wherein S_i denotes the i-th sentence sequence, w_{i,j} denotes the j-th word sequence in the i-th sentence sequence, and c_{i,j,k} denotes the k-th character of the j-th word sequence w_{i,j} of the i-th sentence sequence S_i; n denotes the total number of sentences in the training data set, m denotes the total number of words in one sentence, and K denotes the total number of characters in one word;
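The segmentation in step 1 can be sketched with simple regular expressions; this is a naive illustration (real biomedical tokenizers handle abbreviations, hyphenated gene names, and chemical formulas much more carefully), and all names here are illustrative:

```python
import re

def preprocess(text):
    """Split raw biomedical text into a list of sentences,
    each a list of word tokens (the label-free data set S)."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    # Naive word split: alphanumeric runs and single punctuation marks.
    return [re.findall(r"\w+|[^\w\s]", s) for s in sentences if s]

S = preprocess("IL-2 activates T cells. This regulates NF-kB expression.")
print(len(S))   # number of sentence sequences n
print(S[0])
```

Each inner list corresponds to one sentence sequence S_i; its length is the m of that sentence.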
step 2, labeling the training data set S:
step 2.1, setting the trigger word categories and the named entity recognition categories as L_td = {L_td^1, ..., L_td^n, ...} and L_ner = {L_ner^1, ..., L_ner^n, ...} respectively; wherein L_td^n denotes the n-th trigger word category and L_ner^n denotes the n-th entity category;
step 2.2, adding labels of the trigger word category and the named entity category simultaneously to all word sequences of the i-th sentence sequence S_i in the training data set S, thereby obtaining a labeled training data set S_td for the trigger word detection task and a labeled training data set S_ner for the named entity recognition task; wherein (w_{i,j}, y_td^{i,j}) denotes the j-th word sequence w_{i,j} of the i-th sentence sequence S_i together with its corresponding trigger word category, and (w_{i,j}, y_ner^{i,j}) denotes the j-th word sequence w_{i,j} together with its corresponding entity category;
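Step 2.2 gives every word two labels at once, one per task. A minimal sketch with a hypothetical BIO-style scheme (the trigger and entity class names below are assumptions; the patent leaves the label inventory open):

```python
# Hypothetical gold annotations for one sentence (word -> tag).
sentence = ["IL-2", "activates", "T", "cells"]
trigger_tags = {"activates": "B-Positive_regulation"}              # assumed trigger class
entity_tags  = {"IL-2": "B-Protein", "T": "B-Cell", "cells": "I-Cell"}  # assumed entity classes

# Build the two labeled data sets S_td and S_ner for this sentence;
# words outside any annotation get the outside tag "O".
S_td  = [(w, trigger_tags.get(w, "O")) for w in sentence]
S_ner = [(w, entity_tags.get(w, "O")) for w in sentence]
print(S_td)
print(S_ner)
```

Both lists are aligned word by word, which is what lets the model share one encoder over the two tasks.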
Step 3, word vector pre-training:
in order for word-level word vectors to contain rich linguistic information, a large number of biomedical documents are downloaded from the PubMed database and segmented into words and sentences to obtain a label-free training data set consisting of n′ sentence sequences, recorded as S′ = {S′_1, S′_2, ..., S′_i′, ..., S′_n′}; wherein S′_i′ denotes the i′-th sentence sequence; S′ is then trained with the Word2Vec tool, which is based on a language model, to obtain the pre-trained word vector matrix M;
step 4, data processing of the MTL-TD-NER model, a multitask-learning neural network for biomedical trigger word detection and named entity recognition; the MTL-TD-NER model consists of a word vector encoding layer based on hybrid embedding, a feature extraction layer based on bidirectional LSTM, and classification layers based on conditional random fields;
step 4.1, using the pre-trained word vector matrix M, converting the i-th sentence sequence S_i in text form into word-level word vector data of dimension V, recorded as X_i = {x_{i,1}, ..., x_{i,j}, ..., x_{i,m}}; wherein x_{i,j} denotes the word-level word vector of the j-th word w_{i,j};
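Step 4.1 is an embedding lookup: each word indexes a row of the pre-trained matrix M. A numpy sketch (the vocabulary, the dimension V, and the random M are illustrative stand-ins for the Word2Vec output):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"<unk>": 0, "IL-2": 1, "activates": 2, "T": 3, "cells": 4}
V = 8                                   # word vector dimension (illustrative)
M = rng.normal(size=(len(vocab), V))    # stand-in for the pre-trained matrix M

def to_word_vectors(sentence):
    """Convert a sentence (list of words) to its word-level vectors,
    mapping unknown words to the <unk> row."""
    idx = [vocab.get(w, vocab["<unk>"]) for w in sentence]
    return M[idx]                       # shape (m, V): one row per word

X = to_word_vectors(["IL-2", "activates", "T", "cells"])
print(X.shape)
```

The result has one row per word, i.e. the sequence x_{i,1} ... x_{i,m} for sentence S_i.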
step 4.2, processing the data with the word vector encoding layer based on hybrid embedding, which is composed of bidirectional long short-term memory (LSTM) network units;
step 4.2.1, in order to obtain the character-level feature information of the word, inputting each character of the j-th word sequence w_{i,j} into a character-level bidirectional LSTM unit, which is trained on the probability of generating the corresponding word from all of its characters;
step 4.2.2, extracting the hidden-layer outputs of the bidirectional LSTM unit at the first character and the last character of the j-th word sequence w_{i,j} and concatenating them to form the character-level vector r_{i,j} of the j-th word sequence;
step 4.3, concatenating the word-level word vector x_{i,j} and the character-level vector r_{i,j} of the j-th word to obtain the hybrid-encoded word vector e_{i,j} of the j-th word sequence w_{i,j}, thereby obtaining the word vector sequence E_i = {e_{i,1}, ..., e_{i,m}} of the i-th sentence sequence S_i;
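Steps 4.2 and 4.3 build each word's character-level vector from the boundary hidden states of a character BiLSTM and concatenate it with the word-level vector. The join itself is simple; the sketch below uses placeholder values where a real implementation would take the character BiLSTM's hidden states:

```python
import numpy as np

V, C = 8, 6                              # word- and character-level dims (illustrative)
word_vec = np.ones(V)                    # x_{i,j}: word-level vector from matrix M
char_fwd_last  = np.full(C // 2, 2.0)    # placeholder: forward state at last character
char_bwd_first = np.full(C // 2, 3.0)    # placeholder: backward state at first character

# Character-level vector: concatenation of the two boundary hidden states.
char_vec = np.concatenate([char_fwd_last, char_bwd_first])
# Hybrid-encoded word vector e_{i,j} = [x_{i,j} ; r_{i,j}].
hybrid = np.concatenate([word_vec, char_vec])
print(hybrid.shape)
```

Doing this for every word in the sentence yields the sequence E_i fed to the shared BiLSTM encoder.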
step 4.4, data processing of the feature extraction layer based on bidirectional LSTM:
in order to obtain context feature information for the whole sentence, inputting the word vector sequence E_i of the i-th sentence sequence S_i into a forward LSTM network layer and into a backward LSTM network layer; finally, the hidden-layer state outputs of the forward and backward LSTM networks at the j-th word vector e_{i,j} are concatenated as the context feature h_{i,j} of the word vector at the j-th position, thereby obtaining the feature sequence H_i = {h_{i,1}, ..., h_{i,m}} of the i-th sentence sequence S_i;
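The forward/backward pass and per-position concatenation of step 4.4 can be sketched in numpy with a toy tanh recurrence standing in for the LSTM cells (illustrative only: the LSTM gates are omitted; the point is how forward and backward states combine into H_i):

```python
import numpy as np

rng = np.random.default_rng(1)
m, d, h = 4, 5, 3                     # sentence length, input dim, hidden dim
E = rng.normal(size=(m, d))           # hybrid word vectors e_{i,1..m}
Wx = rng.normal(size=(d, h)) * 0.1    # input-to-hidden weights (shared toy params)
Wh = rng.normal(size=(h, h)) * 0.1    # hidden-to-hidden weights

def run_rnn(inputs):
    """Simple tanh recurrence standing in for one LSTM direction."""
    state, states = np.zeros(h), []
    for x in inputs:
        state = np.tanh(x @ Wx + state @ Wh)
        states.append(state)
    return states

fwd = run_rnn(E)                      # left-to-right hidden states
bwd = run_rnn(E[::-1])[::-1]          # right-to-left, re-aligned to positions
H = np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])
print(H.shape)                        # one 2h-dimensional feature per word
```

H here plays the role of the feature sequence H_i that both classification layers consume.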
step 4.5, data processing of the classification layers based on conditional random fields:
step 4.5.1, constructing a classification layer for the trigger word detection task and a classification layer for the named entity recognition task, each taking a conditional random field as its basic unit so that label dependencies are handled well; meanwhile, considering that trigger words and entities are correlated and mutually reinforcing, constructing a parallel information transformation layer Transform_td for the classification layer of the trigger word detection task and a parallel information transformation layer Transform_ner for the classification layer of the named entity recognition task;
step 4.5.2, inputting the feature sequence H_i into the classification layer of the trigger word detection task to obtain the output O_td; then inputting O_td into Transform_td to obtain the trigger word feature information F_td; finally adding F_td to the feature sequence H_i to obtain the overall entity features, which are input into the classification layer of the named entity recognition task to obtain the final entity recognition result y_ner;
step 4.5.3, inputting the feature sequence H_i into the classification layer of the named entity recognition task to obtain the output O_ner; then inputting O_ner into Transform_ner to obtain the entity feature information F_ner; finally adding F_ner to the feature sequence H_i to obtain the overall trigger word features, which are input into the classification layer of the trigger word detection task to obtain the final trigger word recognition result y_td;
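Steps 4.5.2 and 4.5.3 pass one task's classification-layer output through an information transformation layer and add the result back onto the shared features before the other task's classifier sees them. The patent does not specify the exact form of Transform_td / Transform_ner; the sketch below assumes a learned linear map that projects label scores back into feature space:

```python
import numpy as np

rng = np.random.default_rng(2)
m, f, k_td = 4, 6, 5                   # words, feature dim, trigger label count
H = rng.normal(size=(m, f))            # shared feature sequence H_i
O_td = rng.normal(size=(m, k_td))      # trigger classification-layer output
W_transform = rng.normal(size=(k_td, f)) * 0.1   # Transform_td (assumed linear)

F_td = O_td @ W_transform              # trigger word feature information F_td
H_ner_in = H + F_td                    # overall entity features for the NER head
print(H_ner_in.shape)                  # same shape as H, ready for the NER classifier
```

The symmetric direction (O_ner through Transform_ner into the trigger head) follows the same pattern with the roles swapped.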
Step 5, training an MTL-TD-NER model to obtain an optimal trigger word detection and named entity recognition model:
step 5.1, setting parameter variables of the model:
setting the batch size B to 50, the current iteration number epoch_now to 0, the maximum iteration number epoch_max to 100, the counter epoch_no of consecutive iterations in which the model loss does not decrease to 0, and the early-stopping maximum epoch_es to 15;
step 5.2, parameter initialization:
initializing each parameter of a word vector layer based on mixed coding embedding, a feature extraction layer based on bidirectional LSTM and a classification layer based on conditional random field by adopting a uniform distribution method;
step 5.3, starting from epoch_now, inputting batches of size B from the training data set S into the MTL-TD-NER model each time, and computing the loss between the output labels of the model and the correct labels in the training data set S with equation (1), so as to update the parameters in the model:

loss = λ · loss_td + μ · loss_ner (1)

in equation (1), loss_td and loss_ner are the loss functions of the trigger word detection and named entity recognition tasks, given by:

loss_td = −score(y_td) + log Σ_{ȳ_td ∈ Y_td} exp(score(ȳ_td)) (2)

loss_ner = −score(y_ner) + log Σ_{ȳ_ner ∈ Y_ner} exp(score(ȳ_ner)) (3)

in equations (2) and (3), y_td and y_ner are the trigger word label sequence and the named entity label sequence; score(y_td) and score(y_ner) are the scores of the trigger word and named entity label sequences output by the MTL-TD-NER model for the input sentence S_i; λ and μ are hyper-parameters, both set to 1 here, used to balance the importance of the two tasks; Y_td denotes the set of all possible trigger word label sequences, Y_ner denotes the set of all possible entity label sequences, ȳ_td denotes a trigger word label sequence in Y_td, and ȳ_ner denotes an entity label sequence in Y_ner;
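The combined objective of step 5.3 is a weighted sum of the two task losses, with hyper-parameters balancing task importance (both 1 in this embodiment). A minimal sketch (the parameter names `lam` and `mu` and the loss values are illustrative):

```python
def multitask_loss(loss_td, loss_ner, lam=1.0, mu=1.0):
    """Weighted sum of the trigger word detection and NER losses.
    lam and mu balance the importance of the two tasks (both 1 here)."""
    return lam * loss_td + mu * loss_ner

loss = multitask_loss(0.8, 1.2)
print(loss)
```

Because the encoder is shared, back-propagating this single scalar updates both task heads and the common layers in one step.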
step 5.4, if epoch_now is less than epoch_max and epoch_no is less than epoch_es, adding 1 to epoch_now and returning to step 5.3; if epoch_now is greater than or equal to epoch_max, or epoch_no equals epoch_es, the optimal multitask-learning-based trigger word detection and named entity recognition network model is obtained;
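The iteration control of steps 5.3 and 5.4 amounts to a standard early-stopping loop: run at most epoch_max epochs and stop once the loss has failed to decrease for epoch_es consecutive epochs. A sketch driven by a synthetic loss sequence instead of real training:

```python
def train_with_early_stopping(losses, epoch_max=100, epoch_es=15):
    """Return the number of epochs actually run over a given loss sequence."""
    best, epoch_no = float("inf"), 0
    for epoch_now, loss in enumerate(losses[:epoch_max], start=1):
        if loss < best:
            best, epoch_no = loss, 0    # loss improved: reset the counter
        else:
            epoch_no += 1               # consecutive epochs without improvement
        if epoch_no == epoch_es:
            break                       # early stop
    return epoch_now

# Synthetic losses: improve for 5 epochs, then plateau.
losses = [1.0 - 0.1 * i for i in range(5)] + [0.6] * 50
print(train_with_early_stopping(losses, epoch_es=15))
```

With the embodiment's settings (epoch_max = 100, epoch_es = 15), the plateau above triggers the early stop at epoch 20.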
step 6, identifying unlabeled data with the optimal trigger word detection and named entity recognition network model, so as to obtain the trigger word label and the named entity label of each word in the unlabeled data.
The foregoing describes a biomedical trigger word detection and named entity recognition method based on multitask learning. The method avoids using two different models to perform trigger word detection and named entity recognition separately, instead designing a multitask learning framework that performs the two tasks simultaneously. Experiments with the MTL-TD-NER model on a data set verify the effectiveness of the proposed multitask learning framework and show that it has clear advantages in both trigger word detection and named entity recognition.
Claims (1)
1. A biomedical trigger word detection and named entity recognition method based on multitask learning is characterized by comprising the following steps:
step 1, preprocessing unstructured biomedical texts:
performing word segmentation and sentence segmentation on the unstructured biomedical text to obtain a label-free training data set consisting of n sentence sequences, recorded as S = {S_1, S_2, ..., S_i, ..., S_n}; wherein S_i denotes the i-th sentence sequence, w_{i,j} denotes the j-th word sequence in the i-th sentence sequence, and c_{i,j,k} denotes the k-th character of the j-th word sequence w_{i,j} of the i-th sentence sequence S_i; n denotes the total number of sentences in the training data set, m denotes the total number of words in one sentence, and K denotes the total number of characters in one word;
step 2, labeling the training data set S:
step 2.1, setting the trigger word categories and the named entity recognition categories as L_td = {L_td^1, ..., L_td^n, ...} and L_ner = {L_ner^1, ..., L_ner^n, ...} respectively; wherein L_td^n denotes the n-th trigger word category and L_ner^n denotes the n-th entity category;
step 2.2, adding labels of the trigger word category and the named entity category simultaneously to all word sequences of the i-th sentence sequence S_i in the training data set S, thereby obtaining a labeled training data set S_td for the trigger word detection task and a labeled training data set S_ner for the named entity recognition task; wherein (w_{i,j}, y_td^{i,j}) denotes the j-th word sequence w_{i,j} of the i-th sentence sequence S_i together with its corresponding trigger word category, and (w_{i,j}, y_ner^{i,j}) denotes the j-th word sequence w_{i,j} together with its corresponding entity category;
Step 3, word vector pre-training:
obtaining biomedical documents and performing word segmentation and sentence segmentation to obtain a label-free training data set consisting of n′ sentence sequences, recorded as S′ = {S′_1, S′_2, ..., S′_i′, ..., S′_n′}; wherein S′_i′ denotes the i′-th sentence sequence; the label-free training data set S′ is trained with the Word2Vec tool, which is based on a language model, to obtain the pre-trained word vector matrix M;
step 4, data processing of the MTL-TD-NER model, a multitask-learning neural network for biomedical trigger word detection and named entity recognition; the MTL-TD-NER model consists of a word vector encoding layer based on hybrid embedding, a feature extraction layer based on bidirectional LSTM, and classification layers based on conditional random fields;
step 4.1, using the pre-trained word vector matrix M, converting the i-th sentence sequence S_i in text form into word-level word vector data of dimension V, recorded as X_i = {x_{i,1}, ..., x_{i,j}, ..., x_{i,m}}; wherein x_{i,j} denotes the word-level word vector of the j-th word w_{i,j};
step 4.2, processing the data with the word vector encoding layer based on hybrid embedding, which is composed of bidirectional long short-term memory (LSTM) network units;
step 4.2.1, inputting each character of the j-th word sequence w_{i,j} into a character-level bidirectional LSTM unit, which is trained on the probability of generating the corresponding word from all of its characters;
step 4.2.2, extracting the hidden-layer outputs of the bidirectional LSTM unit at the first character and the last character of the j-th word sequence w_{i,j} and concatenating them to form the character-level vector r_{i,j} of the j-th word sequence;
step 4.3, concatenating the word-level word vector x_{i,j} and the character-level vector r_{i,j} of the j-th word to obtain the hybrid-encoded word vector e_{i,j} of the j-th word sequence w_{i,j}, thereby obtaining the word vector sequence E_i = {e_{i,1}, ..., e_{i,m}} of the i-th sentence sequence S_i;
step 4.4, data processing of the feature extraction layer based on bidirectional LSTM:
inputting the word vector sequence E_i of the i-th sentence sequence S_i into a forward LSTM network layer and into a backward LSTM network layer; finally, the hidden-layer state outputs of the forward and backward LSTM networks at the j-th word vector e_{i,j} are concatenated as the context feature h_{i,j} of the word vector at the j-th position, thereby obtaining the feature sequence H_i = {h_{i,1}, ..., h_{i,m}} of the i-th sentence sequence S_i;
step 4.5, data processing of the classification layers based on conditional random fields:
step 4.5.1, constructing a classification layer for the trigger word detection task and a classification layer for the named entity recognition task, each taking a conditional random field as its basic unit; constructing a parallel information transformation layer Transform_td for the classification layer of the trigger word detection task, and a parallel information transformation layer Transform_ner for the classification layer of the named entity recognition task;
Step 4.5.2, feature sequenceInputting the input into a classification layer of a trigger word detection task to obtain an output OtdThen output OtdInput to information transformationtdObtain the characteristic information F of the trigger wordtdFinally, the feature information F is obtainedtdAnd the characteristic sequence HiAdding the entity integral characteristics to obtain an entity integral characteristic, inputting the entity integral characteristic into a classification layer of a named entity identification task to obtain a final entity identification result
Step 4.5.3, feature sequenceInput to namingIn the classification layer of the entity recognition task, an output O is obtainednerThen output OnerInput to information transformationnerTo obtain the characteristic information F of the entitynerFinally, the feature information F is obtainednerAnd the characteristic sequence HiAdding the overall characteristics of the trigger words to obtain the overall characteristics of the trigger words, inputting the overall characteristics into a classification layer of a trigger word detection task, and obtaining a final trigger word recognition result
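The cross-task feature sharing of Steps 4.5.2 and 4.5.3 can be sketched as below. The pre-CRF outputs and the transfer layers transform_td / transform_ner are each stood in for by a single linear map — an assumption, since the patent text does not spell out their internals:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 6, 16            # hypothetical sentence length and feature width

H_i = rng.normal(size=(n, d))          # feature sequence from the BiLSTM

# Stand-ins (assumed linear maps) for the classification layers' outputs
# and for the parallel transfer layers transform_td / transform_ner.
W_td_out  = rng.normal(size=(d, d))
W_ner_out = rng.normal(size=(d, d))
W_transform_td  = rng.normal(size=(d, d))
W_transform_ner = rng.normal(size=(d, d))

# Step 4.5.2: the trigger-word branch informs the NER branch.
O_td = H_i @ W_td_out                  # output of the TD classification layer
F_td = O_td @ W_transform_td           # trigger-word feature information
entity_input = F_td + H_i              # overall entity features for the NER CRF

# Step 4.5.3: the NER branch informs the trigger-word branch, symmetrically.
O_ner = H_i @ W_ner_out
F_ner = O_ner @ W_transform_ner
trigger_input = F_ner + H_i            # overall trigger-word features

print(entity_input.shape, trigger_input.shape)
```

The element-wise addition keeps both branches' inputs at the same width d, which is what allows the two classification layers to run in parallel over the shared feature sequence.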
Step 5: train the MTL-TD-NER model to obtain the optimal trigger-word detection and named-entity recognition model:
Step 5.1: set the parameter variables of the model:
the batch size is B; the current iteration number is epoch_now; the maximum iteration number is epoch_max; the number of consecutive iterations in which the model loss does not decrease is epoch_no; and the maximum number of such iterations allowed by the early-stopping strategy is epoch_es;
Step 5.2: parameter initialization:
initialize the parameters of the word-vector layer based on mixed-coding embedding, the feature extraction layer based on the bidirectional LSTM, and the classification layers based on the conditional random field with a uniform-distribution method;
Step 5.3: starting from epoch_now, input batches of size B from the training data set S into the MTL-TD-NER model each time, and compute the loss between the model's output labels and the correct labels in S with Eq. (1), so as to update the parameters of the model;
In Eq. (1), loss_td and loss_ner are the loss functions of the trigger-word detection task and the named-entity recognition task, given by Eqs. (2) and (3):
In Eqs. (2) and (3), y_td and y_ner are the trigger-word tag sequence and the named-entity tag sequence; score(y_td) and score(y_ner) are the scores of the trigger-word tag sequence and the named-entity tag sequence output when the i-th sentence sequence S_i is input into the MTL-TD-NER model; λ and a second weight are hyper-parameters used to balance the relative importance of the two tasks; Y_td denotes the set of all possible trigger-word tag sequences and Y_ner the set of all possible entity tag sequences; ỹ_td denotes one trigger-word tag sequence in Y_td, and ỹ_ner one entity tag sequence in Y_ner;
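Equations (1)-(3) are not reproduced in this text, but the description matches the standard CRF negative log-likelihood per task, combined as a weighted sum. A brute-force sketch under that assumption (the weights `lam`/`mu` and all sizes are hypothetical; real implementations replace the enumeration with the forward algorithm):

```python
import itertools
import numpy as np

def seq_score(emissions, transitions, tags):
    """Score of one tag sequence: emission terms plus transition terms."""
    s = sum(emissions[t, tag] for t, tag in enumerate(tags))
    s += sum(transitions[a, b] for a, b in zip(tags, tags[1:]))
    return s

def crf_nll(emissions, transitions, gold):
    """Negative log-likelihood of the gold sequence: log-sum-exp over all
    possible tag sequences minus the gold score (brute force, fine for
    this tiny example)."""
    n, k = emissions.shape
    all_scores = [seq_score(emissions, transitions, y)
                  for y in itertools.product(range(k), repeat=n)]
    log_z = np.logaddexp.reduce(all_scores)
    return log_z - seq_score(emissions, transitions, gold)

rng = np.random.default_rng(3)
em_td, em_ner = rng.normal(size=(4, 3)), rng.normal(size=(4, 3))
tr = rng.normal(size=(3, 3))

loss_td = crf_nll(em_td, tr, (0, 1, 2, 0))
loss_ner = crf_nll(em_ner, tr, (1, 0, 0, 2))

lam, mu = 0.5, 0.5          # hypothetical task-balancing hyper-parameters
loss = lam * loss_td + mu * loss_ner
print(loss_td >= 0, loss_ner >= 0)
```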
Step 5.4: if epoch_now is less than epoch_max and epoch_no is less than epoch_es, add 1 to epoch_now and return to Step 5.3; if epoch_now is greater than or equal to epoch_max, or epoch_no equals epoch_es, the optimal multi-task-learning-based trigger-word detection and named-entity recognition network model is obtained;
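The control flow of Steps 5.3-5.4 is a plain early-stopping loop; a minimal sketch, where the `losses` list stands in for the per-epoch training loss and the function name `train_with_early_stopping` is illustrative:

```python
def train_with_early_stopping(losses, epoch_max, epoch_es):
    """Stop when epoch_now reaches epoch_max, or when the loss has failed
    to decrease for epoch_es consecutive epochs (Step 5.4)."""
    best = float("inf")
    epoch_no = 0            # consecutive epochs without improvement
    epoch_now = 0           # current iteration number
    for loss in losses:
        epoch_now += 1
        if loss < best:
            best, epoch_no = loss, 0
        else:
            epoch_no += 1
        if epoch_now >= epoch_max or epoch_no >= epoch_es:
            break
    return epoch_now

# The loss stops improving after epoch 3; with epoch_es = 2 the loop
# halts at epoch 5, before epoch_max is reached.
stopped = train_with_early_stopping([0.9, 0.7, 0.5, 0.6, 0.6, 0.4],
                                    epoch_max=10, epoch_es=2)
print(stopped)  # 5
```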
Step 6: use the optimal trigger-word detection and named-entity recognition network model to label the unlabeled data, thereby obtaining the trigger-word tag and the named-entity tag of each word in the unlabeled data.
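Turning the per-word tags of Step 6 into entity or trigger-word spans is a standard decoding pass. A sketch assuming a BIO tag scheme (an assumption — the patent only states that each word receives a trigger-word tag and a named-entity tag; the example labels are illustrative):

```python
def bio_decode(words, tags):
    """Group per-word BIO tags into (type, text) spans."""
    spans, cur, cur_type = [], [], None
    for w, t in zip(words, tags):
        if t.startswith("B-"):                      # new span begins
            if cur:
                spans.append((cur_type, " ".join(cur)))
            cur, cur_type = [w], t[2:]
        elif t.startswith("I-") and cur_type == t[2:]:
            cur.append(w)                           # span continues
        else:                                       # "O" or inconsistent tag
            if cur:
                spans.append((cur_type, " ".join(cur)))
            cur, cur_type = [], None
    if cur:
        spans.append((cur_type, " ".join(cur)))
    return spans

words = "IL-2 gene expression requires NF-kB".split()
ner_tags = ["B-Protein", "I-Protein", "O", "O", "B-Protein"]
entities = bio_decode(words, ner_tags)
print(entities)  # [('Protein', 'IL-2 gene'), ('Protein', 'NF-kB')]
```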
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110617440.1A CN113360667B (en) | 2021-05-31 | 2021-05-31 | Biomedical trigger word detection and named entity identification method based on multi-task learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113360667A true CN113360667A (en) | 2021-09-07 |
CN113360667B CN113360667B (en) | 2022-07-26 |
Family
ID=77531522
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110617440.1A Active CN113360667B (en) | 2021-05-31 | 2021-05-31 | Biomedical trigger word detection and named entity identification method based on multi-task learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113360667B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104965819A (en) * | 2015-07-12 | 2015-10-07 | 大连理工大学 | Biomedical event trigger word identification method based on syntactic word vector |
CN105512209A (en) * | 2015-11-28 | 2016-04-20 | 大连理工大学 | Biomedicine event trigger word identification method based on characteristic automatic learning |
CN108628970A (en) * | 2018-04-17 | 2018-10-09 | 大连理工大学 | A kind of biomedical event joint abstracting method based on new marking mode |
CN111222318A (en) * | 2019-11-19 | 2020-06-02 | 陈一飞 | Trigger word recognition method based on two-channel bidirectional LSTM-CRF network |
WO2020193966A1 (en) * | 2019-03-26 | 2020-10-01 | Benevolentai Technology Limited | Name entity recognition with deep learning |
Non-Patent Citations (4)
Title |
---|
YAN WANG et al.: "Biomedical event trigger detection based on bidirectional LSTM and CRF", 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) * |
YANSEN SU et al.: "EMODMI: A Multi-Objective Optimization Based Method to Identify Disease Modules", IEEE Transactions on Emerging Topics in Computational Intelligence * |
HE Xinyu: "Research on Key Issues of Biomedical Event Extraction Based on Text Mining", China Doctoral Dissertations Full-text Database * |
SU Yansen et al.: "Research on Multi-Objective Path Planning for Underwater Crawling Robots", Journal of Hefei University of Technology (Natural Science) * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113553853A (en) * | 2021-09-16 | 2021-10-26 | 南方电网数字电网研究院有限公司 | Named entity recognition method and device, computer equipment and storage medium |
CN114580422A (en) * | 2022-03-14 | 2022-06-03 | 昆明理工大学 | Named entity identification method combining two-stage classification of neighbor analysis |
CN114580422B (en) * | 2022-03-14 | 2022-12-13 | 昆明理工大学 | Named entity identification method combining two-stage classification of neighbor analysis |
Also Published As
Publication number | Publication date |
---|---|
CN113360667B (en) | 2022-07-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111444726B (en) | Chinese semantic information extraction method and device based on long-short-term memory network of bidirectional lattice structure | |
CN110502749B (en) | Text relation extraction method based on double-layer attention mechanism and bidirectional GRU | |
CN108984526B (en) | Document theme vector extraction method based on deep learning | |
Yao et al. | An improved LSTM structure for natural language processing | |
CN112115238B (en) | Question-answering method and system based on BERT and knowledge base | |
CN112541356B (en) | Method and system for recognizing biomedical named entities | |
CN108388560A (en) | GRU-CRF meeting title recognition methods based on language model | |
CN114943230B (en) | Method for linking entities in Chinese specific field by fusing common sense knowledge | |
CN111950283B (en) | Chinese word segmentation and named entity recognition system for large-scale medical text mining | |
CN111738007A (en) | Chinese named entity identification data enhancement algorithm based on sequence generation countermeasure network | |
CN113360667B (en) | Biomedical trigger word detection and named entity identification method based on multi-task learning | |
Gao et al. | Named entity recognition method of Chinese EMR based on BERT-BiLSTM-CRF | |
CN112784604A (en) | Entity linking method based on entity boundary network | |
CN111914556A (en) | Emotion guiding method and system based on emotion semantic transfer map | |
CN114676255A (en) | Text processing method, device, equipment, storage medium and computer program product | |
CN114818717A (en) | Chinese named entity recognition method and system fusing vocabulary and syntax information | |
Xu et al. | Sentence segmentation for classical Chinese based on LSTM with radical embedding | |
CN113094502A (en) | Multi-granularity takeaway user comment sentiment analysis method | |
CN112800184A (en) | Short text comment emotion analysis method based on Target-Aspect-Opinion joint extraction | |
CN115545021A (en) | Clinical term identification method and device based on deep learning | |
CN112101014A (en) | Chinese chemical industry document word segmentation method based on mixed feature fusion | |
CN115169349A (en) | Chinese electronic resume named entity recognition method based on ALBERT | |
Liu et al. | Improved Chinese sentence semantic similarity calculation method based on multi-feature fusion | |
CN114564953A (en) | Emotion target extraction model based on multiple word embedding fusion and attention mechanism | |
CN116522165B (en) | Public opinion text matching system and method based on twin structure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||