CN111078889A - Method for extracting relationships among medicines based on attention of various entities and improved pre-training language model - Google Patents


Info

Publication number
CN111078889A
CN111078889A
Authority
CN
China
Prior art keywords: drug, sentence, attention, entity, vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911330114.1A
Other languages
Chinese (zh)
Other versions
CN111078889B (en)
Inventor
李丽双 (Li Lishuang)
朱燏 (Zhu Yu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology filed Critical Dalian University of Technology
Priority to CN201911330114.1A priority Critical patent/CN111078889B/en
Publication of CN111078889A publication Critical patent/CN111078889A/en
Application granted granted Critical
Publication of CN111078889B publication Critical patent/CN111078889B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage

Abstract

The invention belongs to the technical field of computer natural language processing and provides a method for extracting drug-drug relationships based on multiple entity attention mechanisms and an improved pre-trained language model. Several different entity attention mechanisms are used within a neural network to strengthen the network's understanding of complex drug names: drug-entity attention, attention over the difference between the two entity vectors, and an attention mechanism based on entity description documents. At the same time, the input of the pre-trained language model is improved so that its output better suits the drug-drug relationship extraction task. The method addresses the problem that overly complex drug names prevent a deep learning model from understanding drug-relationship descriptions, and improves the level of drug-relationship recognition.

Description

Method for extracting relationships among medicines based on attention of various entities and improved pre-training language model
Technical Field
The invention belongs to the technical field of computer natural language processing, relates to a method for extracting relationships between medicines from biomedical texts, and particularly relates to a method for extracting relationships between medicines based on an improved pre-training language model and multiple entity attention mechanisms.
Background
Drug-drug interactions (DDIs) refer to the combined effects that occur when two or more drugs are taken simultaneously or within a certain period. As medical researchers continue to study drug-drug interactions in depth, a great deal of valuable information lies buried in the exponentially growing unstructured biomedical literature. Much drug information is currently found in drug-related open databases such as DrugBank (Wishart, D.S. et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 2017, 46, D1074-D1082). How to automatically extract structured drug-drug relationships from massive unstructured biomedical documents is a problem that researchers urgently need to solve.
Relation extraction is one of the common tasks in natural language processing: mining the relationship between two specific entities in text with a machine learning model. Drug-drug relationship extraction is a very typical relation extraction task and one of the most closely watched tasks in the biomedical field. In recent years, the DDIExtraction2011 (Segura-Bedmar, I. et al. The 1st DDIExtraction-2011 challenge task: extraction of drug-drug interactions from biomedical texts. In Proceedings of the 1st Challenge Task on Drug-Drug Interaction Extraction 2011, Huelva, Spain, 7 September 2011) and DDIExtraction2013 (Segura-Bedmar, I. et al. SemEval-2013 Task 9: Extraction of drug-drug interactions from biomedical texts (DDIExtraction 2013). In Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013)) shared tasks have greatly promoted research on drug-drug relationship extraction.
At present, researchers mainly use the corpus of the DDIExtraction2013 task to evaluate DDI extraction models. The difficulty of this task is to classify the relationships between drugs described in biomedical text into 5 classes: Mechanism, Effect, Advice, Int and Negative. The Mechanism type describes a pharmacokinetic relationship between two drugs. The Effect type indicates that the two drugs influence each other's efficacy. The Advice type describes a recommendation or suggestion about using the two drugs together. The Int type states that two drugs have a specific relationship that the literature does not further describe. The Negative type indicates that there is no relationship between the two drugs. For example, in the illustrative sentence "Pantoprazole had a much weaker effect on clopidogrel's pharmacokinetics and on platelet reactivity during concomitant use", the Mechanism relationship holds between the drugs "pantoprazole" and "clopidogrel". In the example sentence "Codeine in combination with other narcotic analgesics, general anesthetics, phenothiazines, tranquilizers, sedative-hypnotics, or other CNS depressants (including alcohol) has additive depressant effects", the Effect relationship holds between "codeine" and "narcotic analgesics". The second example also shows that, besides the two related drugs, the sentence mentions other drugs such as "anesthetics", "phenothiazines" and "alcohol". Drugs in the sentence that are not part of the candidate relationship can interfere with judging the current drug pair and make the model's decision harder. In addition, drug names are often quite complex, which also makes it difficult for a model to understand the meaning of a drug entity in a sentence from its name.
Currently, two types of methods are mainly used for this task. The first is traditional machine learning; the second is deep learning (LeCun Y et al. Deep learning [J]. Nature, 2015, 521(7553): 436-444). Traditional machine learning methods extract a large number of lexical and syntactic features from the raw text and feed them to a classifier such as an SVM or a random forest. Chowdhury et al. (Chowdhury M et al. FBK-irst: A multi-phase kernel based approach for drug-drug interaction detection and classification that exploits linguistic information [C]// 7th International Workshop on Semantic Evaluation, Atlanta, Georgia, USA, 2013: 351-) proposed a multi-phase kernel-based approach exploiting linguistic information. Björne et al. (Björne J, Kaewphan S, Salakoski T. UTurku: drug named entity recognition and drug-drug interaction extraction using SVM classification and domain knowledge [C]// Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). 2013: 651-659) adopted shortest dependency path information as the input of an SVM model and fused knowledge of related fields. Thomas et al. (Thomas P, Neves M, Rocktäschel T, et al. WBI-DDI: drug-drug interaction extraction using majority voting [C]// Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). 2013: 628-635) used a majority-voting-based kernel method for classification. In general, traditional machine-learning methods need large, carefully designed feature sets to perform well, but designing and extracting those feature sets requires much manpower.
In recent years, more and more deep models have been applied to natural language processing tasks with good results. Quan et al. (Quan C, Hua L, Sun X, et al. Multichannel convolutional neural network for biological relation extraction [J]. BioMed Research International, 2016, 2016: 1-10) proposed a multichannel CNN model that takes word vectors obtained by several pre-training methods as input. Asada et al. (Asada M, Miwa M, Sasaki Y. Enhancing drug-drug interaction extraction from texts by molecular structure information [J]. Proceedings of the 56th Annual Meeting of the ACL, 2018: 680-685) proposed fusing molecular information into a CNN and a graph convolutional neural network (GCNN) to extract DDIs. Recurrent neural networks (RNNs) are better suited than CNNs to processing time-series data and are better at capturing the sequence features of sentences. Zhang et al. (Zhang Y, Zheng W, Lin H, et al. Drug-drug interaction extraction via hierarchical RNNs on sequence and shortest dependency paths [J]. Bioinformatics, 2017, 34(5): 828-835) proposed a hierarchical RNN method combining shortest dependency paths (SDPs) and sentence sequences for DDI extraction. Some researchers have also combined the two model families to extract DDIs. Sun et al. (Sun X, Dong K, Ma L, et al. Drug-drug interaction extraction via recurrent hybrid convolutional neural networks with an improved focal loss [J]. Entropy, 2019, 21(1): 37) proposed a recurrent hybrid convolutional neural network (RHCNN) for DDI extraction.
Although various approaches have been proposed, there is still much room to improve DDI extraction performance. To prevent complex drug names from hurting model performance, previous work often replaced the drug names in a sentence with specific placeholder words, which loses part of the useful information. Furthermore, previous work mostly relies on syntactic features such as dependency paths to improve performance, and since these features depend on specific tools to generate, model performance is also limited by those tools.
Disclosure of Invention
The invention does not depend on any lexical or syntactic information; it simplifies the model input through improved BioBERT pre-trained word vectors and multiple entity attention mechanisms, makes better use of drug-name information, and reaches the current leading performance.
The technical scheme of the invention is as follows:
a method for extracting relationships among medicines based on multi-entity attention and an improved pre-training language model comprises the following steps:
(I) Text preprocessing
Preprocessing the corpus: (1) first convert all text to lower case, then remove punctuation marks and non-English characters; (2) because drug-drug relationship extraction does not involve quantitative analysis, replace all numbers in the text with the word "num"; (3) a sentence may contain several drug entities, and one instance is generated for each pair of drug entities, giving a total of n(n-1)/2 instances, where n is the number of drug entities in the sentence; (4) replace the two target entities in each instance with "drug1" and "drug2", and replace the non-target entities with "drug0"; (5) set the maximum sentence length the model can handle, and if a sentence in an instance does not reach the maximum length, pad it with the character "0".
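The five preprocessing steps can be sketched as follows; the function and variable names are illustrative, not from the patent, and the example sentence is a toy:

```python
import itertools
import re

def make_instances(tokens, entity_spans, max_len=250):
    """One classification instance per drug-entity pair, per steps (1)-(5)."""
    # (1)-(2): lowercase, drop punctuation/non-English chars, map numbers to "num"
    def normalize(tok):
        tok = re.sub(r"[^a-z0-9]", "", tok.lower())
        return "num" if tok.isdigit() else tok

    instances = []
    # (3): one instance per unordered entity pair -> n*(n-1)/2 instances
    for i, j in itertools.combinations(range(len(entity_spans)), 2):
        inst = [normalize(t) for t in tokens]
        # (4): mark the target pair, blind every other drug mention
        for k, pos in enumerate(entity_spans):
            inst[pos] = "drug1" if k == i else "drug2" if k == j else "drug0"
        inst = [t for t in inst if t]            # drop tokens emptied by cleaning
        inst = (inst + ["0"] * max_len)[:max_len]  # (5): pad to the fixed length
        instances.append(inst)
    return instances

sent = "Pantoprazole had a weaker effect on clopidogrel than omeprazole .".split()
insts = make_instances(sent, entity_spans=[0, 6, 8])
print(len(insts))  # 3 drug entities -> 3 pairs
```

With three drug entities the sentence yields 3·2/2 = 3 instances, each padded to the maximum length.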
(II) obtaining sentence preliminary coding by using improved BioBERT model
The improved BioBERT is adopted as the word-vector encoding so that the word vectors generalize better. As shown in FIG. 2, the BioBERT model, like BERT, consists of 12 layers of Transformer blocks, and the output of each Transformer layer is fed to the next layer. In the improved model, the output vectors of the last four Transformer layers are averaged, and this average replaces BioBERT's original output. For a preprocessed sentence X = {x1, x2, ..., xm} (m is the sentence length), encoding with the modified BioBERT yields the sentence's vector representation V = BioBERT(X);
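The layer-averaging modification can be sketched as follows, assuming the per-layer hidden states are already available (for instance from a BERT implementation that exposes all hidden states); the toy values below are only to make the averaging visible:

```python
import numpy as np

def improved_biobert_output(layer_outputs):
    """Average the outputs of the last four Transformer layers, replacing the
    usual top-layer output, per the improved BioBERT described above.
    `layer_outputs`: list of 12 arrays, each (sentence_len, hidden_dim)."""
    assert len(layer_outputs) == 12
    return np.mean(np.stack(layer_outputs[-4:]), axis=0)

# toy example: 12 layers, a 5-word sentence, BioBERT hidden size 768
layers = [np.full((5, 768), float(i)) for i in range(12)]
V = improved_biobert_output(layers)
print(V.shape, V[0, 0])  # (5, 768) 9.5 -> mean of layers 8, 9, 10, 11
```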
(III) Obtaining the semantic representation of the sentence with a bidirectional gated recurrent unit
To incorporate context information into the sentence encoding, a Bi-GRU is used to further encode the sentence. For each word vi in V, forward and backward GRU encoding yield its forward and backward representations, which are concatenated to give each word's final representation hi, with hi of dimension 2·dh, where dh is the dimension of the GRU unit output; the sentence encoding vector is then H = {h1, h2, ..., hm};
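A toy, pure-NumPy sketch of the bidirectional GRU encoding (untrained random weights, biases omitted for brevity; the small dimensions are illustrative rather than the paper's 768):

```python
import numpy as np

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU state update (biases omitted)."""
    z = 1 / (1 + np.exp(-(x @ Wz + h @ Uz)))   # update gate
    r = 1 / (1 + np.exp(-(x @ Wr + h @ Ur)))   # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
    return (1 - z) * h + z * h_tilde

def bi_gru(V, d_h, rng):
    """Encode V (m x d_in) forward and backward, concatenate per word."""
    d_in = V.shape[1]
    def run(seq):
        params = [rng.standard_normal(s) * 0.1
                  for s in [(d_in, d_h), (d_h, d_h)] * 3]  # Wz,Uz,Wr,Ur,Wh,Uh
        h, out = np.zeros(d_h), []
        for x in seq:
            h = gru_step(x, h, *params)
            out.append(h)
        return out
    fwd = run(V)
    bwd = run(V[::-1])[::-1]
    # each h_i concatenates the forward and backward states: dim 2*d_h
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])

rng = np.random.default_rng(0)
V = rng.standard_normal((5, 8))   # toy sentence: m=5 words, input dim 8
H = bi_gru(V, d_h=4, rng=rng)
print(H.shape)  # (5, 8)
```

In the paper's setting the GRU output dimension equals the BioBERT dimension (768), so each hi has dimension 1536, matching Table 4.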
(IV) enhancing the weight of an entity in a sentence by using a plurality of entity attention mechanisms
The sentence encoding vector H is processed through three different entity attention mechanisms to enhance the model's understanding of drug entities; all three use the same underlying attention model but are fed different drug-entity information, so that the neural network exploits drug-entity information from different angles;
these three attention mechanisms are described separately below;
(4.1) drug description document attention
Wikipedia and DrugBank are selected as the sources of drug-entity description documents. For the set E = {e1, e2, ..., ek} of all drug entities in the corpus, where k is the total number of drug entities, the drug description documents are converted by a Doc2Vec model into the set of document vectors K = Doc2Vec(E), where de is the length of each document vector;
(4.2) attention of drug entities
The drug-entity word vectors are fed to the attention mechanism as features; the drug-entity information consists of the vectors he1 and he2 in the sentence encoding H that correspond to the two drug entities in the candidate relationship;
(4.3) attention between drug entities
The difference between the two drug entities serves as mutual information between the two drugs and is fed to the attention mechanism; the inter-drug information is the difference of the two entity vectors, i.e. he12 = he1 - he2;
Each of the three kinds of entity information is fed, together with the sentence encoding vector H, into the attention mechanism to obtain an entity-information-weighted sentence representation; the attention mechanism is shown in equations (1)-(3):
M = tanh([HWs, RWp] + b)   (1)
α = softmax(M)   (2)
r = Hα^T   (3)
where R is the sequence obtained by expanding the entity feature to the same length as the sentence; Ws and Wp are the parameter matrices of the attention mechanism, with da the matrix dimension; b is a bias; the output r of the attention mechanism has the same dimension as the GRU unit output;
With the above attention mechanism, entity-weighted sentence vector representations based on the three features are obtained, as shown in equations (4)-(8):
rk1 = attention(H, k1)   (4)
rk2 = attention(H, k2)   (5)
re1=attention(H,he1) (6)
re2=attention(H,he2) (7)
re12=attention(H,he12) (8)
where k1 and k2 are the two drug-description document vectors from the set K, rk1 and rk2 are the attention results obtained from the two drug-entity document description vectors, re1 and re2 are the attention results obtained from the two drug entities, and re12 is the attention result obtained from the difference of the two entities. These attention results are concatenated with the last element hm of the sentence encoding vector H to obtain the final sentence representation vector O, as shown in equation (9):

O = [rk1; rk2; re1; re2; re12; hm]   (9)
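Equations (1)-(9) can be sketched as below. The patent's equation images omit the exact tensor shapes, so collapsing M to one weight per word by averaging over the attention dimension before the softmax is an assumption of this sketch, as are the toy dimensions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def entity_attention(H, feat, Ws, Wp, b):
    """Equations (1)-(3): weight sentence encoding H (m x d) by an entity
    feature vector `feat`, expanded to the sentence length."""
    m = H.shape[0]
    R = np.tile(feat, (m, 1))                                  # expand to length m
    M = np.tanh(np.concatenate([H @ Ws, R @ Wp], axis=1) + b)  # (1)
    alpha = softmax(M.mean(axis=1))                            # (2), one weight/word
    return H.T @ alpha                                         # (3): r = H alpha^T

rng = np.random.default_rng(1)
m, d, d_a = 6, 8, 4                      # toy sizes: sentence 6, hidden 8
H = rng.standard_normal((m, d))
h_e1, h_e2 = H[1], H[4]                  # the two target drug-entity vectors
Ws, Wp = rng.standard_normal((d, d_a)), rng.standard_normal((d, d_a))
b = np.zeros(2 * d_a)

# equations (6)-(8), then (9): concatenate with the last sentence element
r_e1 = entity_attention(H, h_e1, Ws, Wp, b)
r_e2 = entity_attention(H, h_e2, Ws, Wp, b)
r_e12 = entity_attention(H, h_e1 - h_e2, Ws, Wp, b)
O = np.concatenate([r_e1, r_e2, r_e12, H[-1]])
print(O.shape)  # (32,)
```

The document-attention results rk1 and rk2 of equations (4)-(5) would be produced the same way, with the Doc2Vec vectors k1 and k2 as the feature input.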
(V) obtaining the final medicine relation classification by utilizing a Softmax classifier
After the entity-information-weighted sentence representation is obtained, the dimension of the sentence representation vector is compressed by a feed-forward neural network layer, and the result is finally fed to a Softmax layer to obtain the final classification result;
the model output layer sends the output O of the multi-entity attention layer as the final classification feature to the full-connection layer for classification, and the probability P (y ═ C) of y belonging to the DDI type of the C (C ∈ C) of the candidate drug-drug relation is shown as the formula (10):
P(y) = Softmax(OWO + b)   (10)
where WO and b are a weight matrix and a bias, the activation function of the fully connected layer is Softmax, and C is the set of DDI type labels. Finally, equation (11) selects the category label ŷ with the highest probability, i.e. the relationship type of the candidate drug-drug pair:

ŷ = argmax P(y = c), c ∈ C   (11)
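A sketch of equations (10)-(11); the weights here are random placeholders standing in for the trained fully connected layer:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def classify(O, W_O, b, labels):
    """Equations (10)-(11): project the sentence vector O to DDI-type
    probabilities and pick the most probable label."""
    P = softmax(O @ W_O + b)                 # (10)
    return labels[int(np.argmax(P))], P      # (11): y_hat = argmax_c P(y=c)

labels = ["negative", "effect", "mechanism", "advice", "int"]
rng = np.random.default_rng(2)
O = rng.standard_normal(32)                  # placeholder attention-layer output
W_O, b = rng.standard_normal((32, 5)), np.zeros(5)
y_hat, P = classify(O, W_O, b, labels)
print(y_hat in labels, round(float(P.sum()), 6))  # True 1.0
```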
Beneficial effects of the invention: Table 1 compares this extraction method with other DDI extraction methods; all methods are evaluated on the DDIExtraction2013 corpus. The F1 score of the invention is 80.9%, a 5.4% improvement over the previous best result. The system also achieves the highest precision and recall, raised to 81.0% and 80.9% respectively.
TABLE 1 comparison of the Effect of the invention with other DDI extraction methods
(Table 1 appears as an image in the original publication.)
Drawings
FIG. 1 is a diagram of a neural network model architecture employed by the present invention.
FIG. 2 is a schematic representation of the improvement of the BioBERT model according to the present invention.
Detailed Description
The following describes in detail embodiments of the present invention in conjunction with the constructed neural network model of the present invention.
The overall model structure of the invention is shown in FIG. 1. The DDI corpus to be processed is preprocessed, the descriptions of the drugs mentioned in the text are retrieved from DrugBank and Wikipedia, and these descriptions are converted into vectors with a Doc2Vec tool. For sentences in the DDI corpus, the invention obtains vector representations through the modified BioBERT model and a bidirectional GRU network. Finally, the discrimination result is obtained through a feed-forward neural network and a softmax layer. The specific implementation flow is described below.
First, preprocessing the corpus
The pretreatment work comprises the following steps:
(1) removing punctuation marks and non-English characters in the corpus, and separating each word by a blank;
(2) uniformly converting the text into lower case characters;
(3) uniformly replacing related numbers in the corpus into num;
(4) when a sentence in the corpus contains several drug entities, all drug entities are combined pairwise; if the sentence contains n drug entities, a total of n(n-1)/2 instances are generated. In addition, the invention replaces the two drug entities of the candidate relationship in each instance with "drug1" and "drug2", and the other drugs in the instance with "drug0".
(5) The maximum sentence length the model can handle is set; if a sentence in an instance does not reach the maximum length, it is padded with the character "0".
Second, encoding the sentences
The sentence coding is divided into the following two steps:
(1) preliminary coding of sentences by modified BioBERT
The present invention encodes each word in a sentence as a word vector through the modified BioBERT. For a preprocessed sentence X = {x1, x2, ..., xn} (n is the sentence length), this yields the sentence's vector representation V = BioBERT(X). BioBERT was pre-trained on two biomedical databases, PMC and PubMed.
(2) Context semantic coding of sentences by Bi-GRU
For each word vi in V, the invention obtains its forward and backward representations by forward and backward GRU encoding, then concatenates the two to obtain each word's final representation hi of dimension 2·dh, where dh is the dimension of the GRU unit output. The sentence encoding is then H = {h1, h2, ..., hn}. The output dimension of the GRU units equals the output dimension of the BioBERT model.
Third, encoding the drug description documents
The invention uses the browser automation framework Selenium as a crawler to dynamically crawl the abstract of each entity from Wikipedia and DrugBank. During crawling, not every entity has a definite corresponding abstract: for example, "anticonvulsant drugs" is not a specific drug but the general name of a class of drugs, so no entry can be found for that entity; in such cases the class name of the entity is used in place of the whole entity, i.e. the abstract retrieved with "anticonvulsant drugs" as the keyword serves as the abstract of the whole entity. If a small number of entities still have no corresponding abstract after this processing, the entity name itself is used as its abstract.
For the set E = {e1, e2, ..., ek} of all drug entities in the corpus, the drug description documents of the corpus are converted by a Doc2Vec model into the set of document vectors K = Doc2Vec(E), where de is the length of each document vector.
Fourth, multiple entity attention mechanisms
The three kinds of entity information used by the method are drug description information, drug-entity information, and inter-drug information. The drug description information is the set K of drug-description document vectors; the drug-entity information consists of the vectors he1 and he2 corresponding to the two related drug entities in the sentence encoding H; the inter-drug information is the difference of the two entity vectors, i.e. he12 = he1 - he2. The dimensions of all three kinds of entity information equal the output dimension of the GRU unit.
The three kinds of entity information are each fed, together with the sentence representation H, into the attention mechanism (equations (1)-(3)) to obtain entity-information-weighted sentence representations, as shown in equations (4)-(8), where rk1 and rk2 are the attention results obtained from the two drug-entity document description vectors, re1 and re2 the attention results obtained from the two drug entities, and re12 the attention result obtained from their difference. These attention results are concatenated with the last element hn of the sentence vector sequence H to obtain the final sentence representation vector O, as shown in equation (9). The output of the attention mechanism has the same dimension as the output of the GRU unit.
Five, output
The model output layer feeds the output O of the multi-entity attention layer, as the final classification feature, to a fully connected layer for classification; the probability P(y = c) that a candidate drug-drug relation belongs to DDI type c (c ∈ C) is given by equation (10).
where WO and b are a weight matrix and a bias, the activation function of the fully connected layer is Softmax, and C = {negative, effect, mechanism, advice, int} is the set of DDI type labels. Finally, equation (11) selects the category label ŷ with the highest probability, i.e. the relationship type of the candidate drug-drug pair.
After the model is built through the five steps above, the invention trains it and tests its performance on the DDIExtraction2013 corpus, with a 9:1 split between training and test sets. Table 2 summarizes the DDIExtraction2013 corpus: it consists of 792 texts from the DrugBank database and 233 abstracts from the MedLine database, with 5 drug relationship types in total: Negative, Effect, Mechanism, Advice and Int.
TABLE 2 number of relationships in DDIExtraction2013 corpus
Type        DDI-DrugBank   DDI-MedLine   Total
Effect      1855 (39.4%)   214 (65.4%)   2069 (41.1%)
Mechanism   1539 (32.7%)   86 (26.3%)    1625 (32.3%)
Advice      1035 (22.0%)   15 (4.6%)     1050 (20.9%)
Int         272 (5.8%)     12 (3.7%)     284 (5.6%)
Total       4701           327           5028
The invention generates additional instances by pairing the drugs in the corpus pairwise. In the training instances thus obtained, however, the number of Negative instances is extremely large, and this class imbalance can greatly hurt model performance. To address the imbalance among the drug-relationship instances in the corpus, the invention removes negative instances according to the following three rules:
1. if two drugs in a drug pair appear in the same relationship, the corresponding instance is filtered out.
2. If two drugs in a drug pair have the same name, or one is an abbreviation for the other, the corresponding instance is filtered out.
3. If one drug in a drug pair is a special case of the other drug, the corresponding instance is filtered out.
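As a sketch, the three filtering rules above might be implemented as follows; rule 1's condition is supplied as a precomputed flag, and the abbreviation and special-case checks are crude illustrative stand-ins for whatever matching the authors actually used:

```python
def filter_negative(pairs):
    """Rule-based negative-instance filtering sketched from rules 1-3 above.
    Each pair is (drug1, drug2, rule1_match), where rule1_match is assumed
    to be detected upstream."""
    def initials(s):
        return "".join(w[0] for w in s.split())

    kept = []
    for d1, d2, rule1_match in pairs:
        a, b = d1.lower(), d2.lower()
        if rule1_match:                            # rule 1
            continue
        if a == b:                                 # rule 2: same name
            continue
        if a == initials(b) or b == initials(a):   # rule 2: crude initialism check
            continue
        if a in b or b in a:                       # rule 3: one is a special case
            continue
        kept.append((d1, d2))
    return kept

pairs = [
    ("aspirin", "aspirin", False),               # dropped: rule 2 (same name)
    ("human growth hormone", "hgh", False),      # dropped: rule 2 (abbreviation)
    ("insulin", "insulin glargine", False),      # dropped: rule 3 (special case)
    ("ibuprofen", "naproxen", True),             # dropped: rule 1
    ("warfarin", "aspirin", False),              # kept
]
print(filter_negative(pairs))  # [('warfarin', 'aspirin')]
```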
The corpus instance statistics after deleting negative instances are shown in Table 3. This rule-based negative-instance removal alleviates the imbalance among instances to some extent.
TABLE 3 data set by example Generation and negative deletion
(Table 3 appears as an image in the original publication.)
The evaluation metric adopted by the invention is the F1 score, as shown in equation (12):

F1 = 2PR / (P + R)   (12)

where P denotes precision and R denotes recall, computed by equations (13)-(14):

P = TP / (TP + FP)   (13)

R = TP / (TP + FN)   (14)
where TP represents the number of predicted positive and actual positive instances, FP represents the number of predicted positive and actual negative instances, FN represents the number of predicted negative and actual positive instances, and TN represents the number of predicted negative and actual negative instances.
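A minimal sketch of computing these metrics, micro-averaged over the four positive DDI types with Negative treated as the null class (this aggregation choice is an assumption about how equations (12)-(14) are applied to the 5-class task):

```python
def prf1(y_true, y_pred, positive_types):
    """Micro-averaged precision, recall and F1 per equations (12)-(14)."""
    tp = sum(t == p and t in positive_types for t, p in zip(y_true, y_pred))
    fp = sum(p in positive_types and t != p for t, p in zip(y_true, y_pred))
    fn = sum(t in positive_types and t != p for t, p in zip(y_true, y_pred))
    P = tp / (tp + fp) if tp + fp else 0.0    # (13)
    R = tp / (tp + fn) if tp + fn else 0.0    # (14)
    F1 = 2 * P * R / (P + R) if P + R else 0.0  # (12)
    return P, R, F1

pos = {"effect", "mechanism", "advice", "int"}
y_true = ["effect", "negative", "advice", "int", "mechanism"]
y_pred = ["effect", "advice", "advice", "negative", "mechanism"]
print(prf1(y_true, y_pred, pos))  # tp=3, fp=1, fn=1 -> (0.75, 0.75, 0.75)
```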
The invention implements the model with the Keras library on a TensorFlow backend. The model parameter settings are shown in Table 4.
TABLE 4 parameter set of inventive model
Parameter                                Value
Doc2Vec vector dimension                 200
BioBERT vector dimension                 768
BiGRU layer output dimension             1536
Maximum sentence length                  250
Attention layer output dimension         1536
Multilayer perceptron output dimension   256
In the training phase, the invention uses early stopping: if model performance on the validation set does not improve for 10 consecutive epochs, training stops, and the model that performed best on the validation set is selected as the final model to predict the test set. All hyperparameters are tuned on the validation set via grid search. The learning rate during training is set to 0.001, and the model processes 128 instances per batch.
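The early-stopping procedure described above can be sketched framework-independently; `train_epoch` and `val_score` are hypothetical callables standing in for one Keras training epoch (learning rate 0.001, batch size 128) and a validation-set evaluation:

```python
def train_with_early_stopping(train_epoch, val_score, patience=10, max_epochs=200):
    """Stop when the validation score has not improved for `patience`
    consecutive epochs; keep and return the best-scoring model state."""
    best_state, best_score, stale = None, float("-inf"), 0
    for epoch in range(max_epochs):
        state = train_epoch(epoch)
        score = val_score(state)
        if score > best_score:
            best_state, best_score, stale = state, score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_state, best_score

# toy run: validation F1 rises then plateaus, so training stops early
scores = [0.5, 0.6, 0.7, 0.7] + [0.65] * 50
best, f1 = train_with_early_stopping(lambda e: e, lambda s: scores[s], patience=10)
print(best, f1)  # 2 0.7
```

In Keras the same behavior is usually obtained with the `EarlyStopping` callback; the explicit loop here only makes the stopping rule visible.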

Claims (1)

1. A method for extracting relationships among medicines based on attention of various entities and an improved pre-training language model is characterized by comprising the following steps:
(I) Text preprocessing
Preprocessing the corpus: (1) first convert all text to lower case, then remove punctuation marks and non-English characters; (2) replace all numbers in the text with the word "num"; (3) a sentence may contain several drug entities, and one instance is generated for each pair of drug entities, giving a total of n(n-1)/2 instances, where n is the number of drug entities in the sentence; (4) replace the two target entities in each instance with "drug1" and "drug2", and replace the non-target entities with "drug0"; (5) set the maximum sentence length the model can handle, and if a sentence in an instance does not reach the maximum length, pad it with the character "0";
(II) obtaining sentence preliminary coding by using improved BioBERT model
The improved BioBERT is adopted as the word-vector encoding. The BioBERT model, like BERT, consists of 12 layers of Transformer blocks, and the output of each Transformer layer is fed to the next layer; in the improved BioBERT model, the output vectors of the last four Transformer layers are averaged, and this average replaces BioBERT's original output; for a preprocessed sentence X = {x1, x2, ..., xm}, where m is the sentence length, encoding with the modified BioBERT yields the sentence's vector representation V = BioBERT(X);
(III) Obtaining the semantic representation of the sentence with a bidirectional gated recurrent unit
To incorporate context information into the sentence encoding, a Bi-GRU is used to further encode the sentence; for each word vi in V, forward and backward GRU encoding yield its forward and backward representations, which are concatenated to give each word's final representation hi of dimension 2·dh, where dh is the dimension of the GRU unit output; the sentence encoding vector is then H = {h1, h2, ..., hm};
(IV) enhancing the weight of an entity in a sentence by using a plurality of entity attention mechanisms
The sentence encoding vector H is processed through three different entity attention mechanisms to enhance the model's understanding of drug entities;
(4.1) drug description document attention
Wikipedia and DrugBank are selected as the sources of drug-entity description documents; for the set E = {e1, e2, ..., ek} of all drug entities in the corpus, where k is the total number of drug entities, the drug description documents are converted by a Doc2Vec model into the set of document vectors K = Doc2Vec(E), where de is the length of each document vector;
(4.2) drug entity attention
The drug entity word vectors are sent to the attention mechanism as features; the drug entity information consists of the vectors he1 and he2 corresponding to the two related drug entities in the sentence encoding vector H;
(4.3) attention between drug entities
The difference between the two drug entity vectors is used as mutual information between the two drugs and sent to the attention mechanism; the inter-drug information is thus he12 = he1 − he2;
The three kinds of entity information are each sent, together with the sentence encoding vector H, into the attention mechanism to obtain entity-information-weighted sentence representations; the attention mechanism is shown in equations (1)-(3):
M = tanh([HWs, RWp] + b)    (1)
α = softmax(M)    (2)
r = Hα^T    (3)
wherein R is the sequence obtained by expanding the given feature to the length of the sentence; Ws and Wp are parameter matrices of the attention mechanism, with da the matrix dimension; b is a bias; the output of the attention mechanism is the weighted sentence representation r;
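Equations (1)-(3) can be sketched as follows; the score vector w that collapses the matrix M to one weight per token is an assumption, since the claim does not spell out how M is reduced before the softmax:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention(H, feat, Ws, Wp, w, b):
    """Eqs (1)-(3): the feature vector is expanded (tiled) to the
    sentence length, scored jointly with H, and the resulting weights
    alpha are used to pool H into a single vector r."""
    m = H.shape[0]
    R = np.tile(feat, (m, 1))                                  # expand feature to sentence length
    M = np.tanh(np.concatenate([H @ Ws, R @ Wp], axis=1) + b)  # eq (1)
    alpha = softmax(M @ w)                                     # eq (2), via assumed score vector w
    return H.T @ alpha                                         # eq (3): r = H alpha^T

# toy dimensions: m tokens, d = 2*dh encoding size, de feature size
rng = np.random.default_rng(2)
m, d, de, da = 5, 8, 6, 4
H = rng.standard_normal((m, d))        # sentence encoding from the Bi-GRU
feat = rng.standard_normal(de)         # one entity feature (e.g. a document vector)
Ws = rng.standard_normal((d, da))
Wp = rng.standard_normal((de, da))
w = rng.standard_normal(2 * da)
b = rng.standard_normal(2 * da)
r = attention(H, feat, Ws, Wp, w, b)   # entity-weighted sentence vector
```

The same function would be called once per feature (k1, k2, he1, he2, he12) to produce the five attention results used later.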
Using the above attention mechanism, entity-weighted sentence vector representations based on the three features are obtained, as shown in equations (4)-(8):
rk1 = attention(H, k1)    (4)
rk2 = attention(H, k2)    (5)
re1 = attention(H, he1)    (6)
re2 = attention(H, he2)    (7)
re12 = attention(H, he12)    (8)
wherein k1 and k2 are the two drug description document vectors taken from the set K of drug description document vectors; rk1 and rk2 are the attention results obtained from the two drug entity document description vectors; re1 and re2 are the attention results obtained from the two drug entities; re12 is the attention result obtained from the difference of the two drug entities; these attention results are concatenated with the last element hm of the sentence encoding vector H to obtain the final sentence representation vector O, as shown in equation (9):
O = [rk1; rk2; re1; re2; re12; hm]    (9)
(V) obtaining the final drug relation classification by using a Softmax classifier
After the entity-information-weighted sentence representation is obtained, the dimension of the sentence representation vector is compressed by a feedforward neural network layer and finally sent to a Softmax layer to obtain the final classification result;
the model output layer sends the output O of the multi-entity attention layer, as the final classification feature, to the fully connected layer for classification; the probability P(y = c) that the candidate drug-drug relation y belongs to DDI type c (c ∈ C) is given by equation (10):
P(y = c) = softmax(OWO + b)    (10)
wherein WO and b are the weight matrix and bias, the activation function of the fully connected layer is Softmax, and C is the set of DDI type labels; finally, the category label with the highest probability is computed using equation (11), and this label is the relation type of the candidate drug-drug pair:
ŷ = argmax(c∈C) P(y = c)    (11)
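A toy sketch of equations (10)-(11); the DDIExtraction-style label set and the zero/constant parameters are illustrative assumptions chosen so the predicted label is easy to verify by hand:

```python
import numpy as np

def classify(O, WO, b, labels):
    """Eq (10): fully connected layer followed by a softmax over DDI
    types; eq (11): return the highest-probability label."""
    logits = O @ WO + b
    e = np.exp(logits - logits.max())   # numerically stable softmax
    P = e / e.sum()
    return labels[int(np.argmax(P))], P

# assumed DDI label set for illustration (DDIExtraction 2013 style)
labels = ["false", "advise", "effect", "mechanism", "int"]
O = np.zeros(10)                          # entity-weighted sentence vector
WO = np.zeros((10, len(labels)))          # output-layer weight matrix
b = np.array([0.0, 2.0, 0.0, 0.0, 0.0])   # bias steers this toy example
pred, P = classify(O, WO, b, labels)      # pred is the argmax label
```

With these constants the logits equal the bias, so the softmax puts the largest probability on the second label.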
CN201911330114.1A 2019-12-20 2019-12-20 Method for extracting relationship between medicines based on various attentions and improved pre-training Active CN111078889B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911330114.1A CN111078889B (en) 2019-12-20 2019-12-20 Method for extracting relationship between medicines based on various attentions and improved pre-training

Publications (2)

Publication Number Publication Date
CN111078889A true CN111078889A (en) 2020-04-28
CN111078889B CN111078889B (en) 2021-01-05

Family

ID=70316460

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911330114.1A Active CN111078889B (en) 2019-12-20 2019-12-20 Method for extracting relationship between medicines based on various attentions and improved pre-training

Country Status (1)

Country Link
CN (1) CN111078889B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160275250A1 (en) * 2015-03-17 2016-09-22 Biopolicy Innovations Inc. Drug formulary document parsing and comparison system and method
CN108733792A (en) * 2018-05-14 2018-11-02 北京大学深圳研究生院 A kind of entity relation extraction method
CN109902171A (en) * 2019-01-30 2019-06-18 中国地质大学(武汉) Text Relation extraction method and system based on layering knowledge mapping attention model
CN110580340A (en) * 2019-08-29 2019-12-17 桂林电子科技大学 neural network relation extraction method based on multi-attention machine system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LING LUO et al.: "An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition", Data and Text Mining *
LI Lishuang et al.: "Drug relation extraction incorporating dependency information and an attention mechanism", Journal of Chinese Information Processing *
JIANG Zhenchao: "Biomedical relation extraction based on word representation and deep learning", Wanfang Data *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111798954A (en) * 2020-06-11 2020-10-20 西北工业大学 Drug combination recommendation method based on time attention mechanism and graph convolution network
CN111814460A (en) * 2020-07-06 2020-10-23 四川大学 External knowledge-based drug interaction relation extraction method and system
CN111949792A (en) * 2020-08-13 2020-11-17 电子科技大学 Medicine relation extraction method based on deep learning
CN111949792B (en) * 2020-08-13 2022-05-31 电子科技大学 Medicine relation extraction method based on deep learning
CN112256939A (en) * 2020-09-17 2021-01-22 青岛科技大学 Text entity relation extraction method for chemical field
CN112256939B (en) * 2020-09-17 2022-09-16 青岛科技大学 Text entity relation extraction method for chemical field
CN112667808A (en) * 2020-12-23 2021-04-16 沈阳新松机器人自动化股份有限公司 BERT model-based relationship extraction method and system
CN112820375A (en) * 2021-02-04 2021-05-18 闽江学院 Traditional Chinese medicine recommendation method based on multi-graph convolution neural network
CN112528621A (en) * 2021-02-10 2021-03-19 腾讯科技(深圳)有限公司 Text processing method, text processing model training device and storage medium
CN112860816A (en) * 2021-03-01 2021-05-28 三维通信股份有限公司 Construction method and detection method of interaction relation detection model of drug entity pair
CN113241128A (en) * 2021-04-29 2021-08-10 天津大学 Molecular property prediction method based on molecular space position coding attention neural network model
CN113241128B (en) * 2021-04-29 2022-05-13 天津大学 Molecular property prediction method based on molecular space position coding attention neural network model
CN113642319A (en) * 2021-07-29 2021-11-12 北京百度网讯科技有限公司 Text processing method and device, electronic equipment and storage medium
CN113806531A (en) * 2021-08-26 2021-12-17 西北大学 Drug relationship classification model construction method, drug relationship classification method and system
CN113806531B (en) * 2021-08-26 2024-02-27 西北大学 Drug relationship classification model construction method, drug relationship classification method and system
CN114048727B (en) * 2021-11-22 2022-07-29 北京富通东方科技有限公司 Medical field-oriented relationship extraction method
CN114048727A (en) * 2021-11-22 2022-02-15 北京富通东方科技有限公司 Medical field-oriented relation extraction method
CN114925678A (en) * 2022-04-21 2022-08-19 电子科技大学 Drug entity and relationship combined extraction method based on high-level interaction mechanism
CN114925678B (en) * 2022-04-21 2023-05-26 电子科技大学 Pharmaceutical entity and relationship joint extraction method based on high-level interaction mechanism
CN117408247A (en) * 2023-12-15 2024-01-16 南京邮电大学 Intelligent manufacturing triplet extraction method based on relational pointer network
CN117408247B (en) * 2023-12-15 2024-03-29 南京邮电大学 Intelligent manufacturing triplet extraction method based on relational pointer network

Also Published As

Publication number Publication date
CN111078889B (en) 2021-01-05

Similar Documents

Publication Publication Date Title
CN111078889B (en) Method for extracting relationship between medicines based on various attentions and improved pre-training
CN112199511B (en) Cross-language multi-source vertical domain knowledge graph construction method
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
CN111159223B (en) Interactive code searching method and device based on structured embedding
CN109446338B (en) Neural network-based drug disease relation classification method
US20220147836A1 (en) Method and device for text-enhanced knowledge graph joint representation learning
CN110825721B (en) Method for constructing and integrating hypertension knowledge base and system in big data environment
CN111950285B (en) Medical knowledge graph intelligent automatic construction system and method with multi-mode data fusion
CN110532328B (en) Text concept graph construction method
CN106844658A (en) A kind of Chinese text knowledge mapping method for auto constructing and system
CN110287323B (en) Target-oriented emotion classification method
CN105512209A (en) Biomedicine event trigger word identification method based on characteristic automatic learning
CN111625659A (en) Knowledge graph processing method, device, server and storage medium
CN110189831A (en) A kind of case history knowledge mapping construction method and system based on dynamic diagram sequences
CN113806563A (en) Architect knowledge graph construction method for multi-source heterogeneous building humanistic historical material
Gao et al. Detecting disaster-related tweets via multimodal adversarial neural network
CN115019906B (en) Drug entity and interaction combined extraction method for multi-task sequence labeling
CN115269865A (en) Knowledge graph construction method for auxiliary diagnosis
WO2020074017A1 (en) Deep learning-based method and device for screening for keywords in medical document
CN113988075A (en) Network security field text data entity relation extraction method based on multi-task learning
CN113742493A (en) Method and device for constructing pathological knowledge map
CN113707339A (en) Method and system for concept alignment and content inter-translation among multi-source heterogeneous databases
CN116244448A (en) Knowledge graph construction method, device and system based on multi-source data information
Frisoni et al. Unsupervised Descriptive Text Mining for Knowledge Graph Learning.
Tianxiong et al. Identifying chinese event factuality with convolutional neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant