CN111078889A - Method for extracting relationships among medicines based on attention of various entities and improved pre-training language model - Google Patents
- Publication number
- CN111078889A (application CN201911330114.1A)
- Authority
- CN
- China
- Prior art keywords
- drug
- sentence
- attention
- entity
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H70/00—ICT specially adapted for the handling or processing of medical references
- G16H70/40—ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
Abstract
The invention belongs to the technical field of computer natural language processing and provides a method for extracting drug-drug relationships based on multiple entity attention mechanisms and an improved pre-training language model. Several different entity attention mechanisms are used within a neural network to strengthen its understanding of complex drug names, including entity-mark attention, attention on the difference between the two entity marks, and an attention mechanism based on entity description documents. At the same time, the input of the pre-training language model is improved so that its output is better suited to the drug-drug relation extraction task. The method solves the problem that overly complex drug names prevent a deep learning model from understanding them when processing drug-relation documents, and raises the level of drug-relation recognition.
Description
Technical Field
The invention belongs to the technical field of computer natural language processing, relates to a method for extracting drug-drug relationships from biomedical texts, and particularly relates to a method for extracting drug-drug relationships based on an improved pre-training language model and multiple entity attention mechanisms.
Background
Drug-drug interactions (DDIs) refer to the combined effects of two or more drugs taken simultaneously or within a certain period of time. As medical researchers continue to study drug-drug interactions in depth, a great deal of valuable information lies buried in the exponentially growing unstructured biomedical literature. Much of the information on drugs is currently found in drug-related open databases such as DrugBank (Wishart, D.S. et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 2017, 46, D1074-D1082). How to automatically extract structured drug-drug relationships from massive unstructured biomedical documents is a problem that researchers urgently need to solve.
Relation extraction is one of the common tasks in natural language processing; it mines the relationship between two specific entities in text through a machine learning model. Drug-drug relation extraction is a very typical relation extraction task and one of the most closely watched tasks in the biomedical field. In recent years, the DDIExtraction2011 (Segura-Bedmar, I. et al. The 1st DDIExtraction-2011 challenge task: extraction of drug-drug interactions from biomedical texts. In Proceedings of the 1st Challenge Task on Drug-Drug Interaction Extraction 2011, Huelva, Spain, 7 September 2011) and DDIExtraction2013 (Segura-Bedmar, I. et al. SemEval-2013 Task 9: extraction of drug-drug interactions from biomedical texts (DDIExtraction 2013). Atlanta, Georgia, USA, 2013) shared tasks have strongly promoted research on drug-drug relation extraction.
At present, researchers mainly use the corpus of the DDIExtraction2013 task to evaluate the performance of DDI extraction models. The difficulty of this task is to classify the relationships between drugs described in biomedical text into 5 classes: the Mechanism, Effect, Advise, Int and Negative types. The Mechanism type describes a pharmacokinetic relationship between two drugs. The Effect type indicates that two drugs affect each other's efficacy. The Advise type describes a recommendation or suggestion about the combined use of two drugs. The Int type indicates that two drugs have some relationship that is not further described in the literature. The Negative type indicates that there is no interaction between the two drugs. For example, in the sentence "Pantoprazole had a much weaker effect on clopidogrel's pharmacokinetics and on platelet reactivity during concomitant use", the Mechanism relationship holds between the drugs "pantoprazole" and "clopidogrel". In the sentence "Codeine in combination with other narcotic analgesics, general anesthetics, phenothiazines, tranquilizers, sedative-hypnotics, or other CNS depressants (including alcohol) has additive depressant effects", the Effect relationship holds between the drugs "codeine" and "narcotic analgesics". The second example also shows that, besides the two drugs in the target relation, other drugs such as "anesthetics", "phenothiazines" and "alcohol" are mentioned in the sentence. Drugs in the sentence that are not in the target relation can interfere with judging the current drug relation, increasing the difficulty for the model. In addition, drug names tend to be quite complex, which also makes it hard for a model to understand the meaning of a drug entity in a sentence from its name.
Currently, two types of methods are mainly used for such tasks: traditional machine learning and deep learning (LeCun Y et al. Deep learning [J]. Nature, 2015, 521(7553): 436-444). Traditional machine learning methods extract large numbers of lexical and syntactic features from the raw text and feed them to a discriminator such as an SVM or a random forest. Chowdhury et al. (Chowdhury M et al. FBK-irst: a multi-phase kernel based approach for drug-drug interaction detection and classification that exploits linguistic information [C]// 7th International Workshop on Semantic Evaluation, Atlanta, Georgia, USA, 2013: 351-355) proposed a multi-phase kernel-based approach. Björne et al. (Björne J, Kaewphan S, Salakoski T. UTurku: drug named entity recognition and drug-drug interaction extraction using SVM classification and domain knowledge [C]// Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). 2013: 651-659) used shortest dependency path information as the input of an SVM model and fused domain knowledge. Thomas et al. (Thomas P, Neves M, Rocktäschel T, et al. WBI-DDI: drug-drug interaction extraction using majority voting [C]// Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). 2013: 628-635) used a voting-based kernel method for classification. In general, traditional machine-learning-based methods need a large number of carefully designed features to improve model performance, but designing and extracting these feature sets requires considerable manual effort.
In recent years, more and more deep models have been applied to natural language processing tasks with good results. Quan et al. (Quan C, Hua L, Sun X, et al. Multichannel convolutional neural network for biological relation extraction [J]. BioMed Research International, 2016, 2016: 1-10) proposed a multichannel CNN model that takes word vectors obtained by several pre-training methods as input. Asada et al. (Asada M, Miwa M, Sasaki Y. Enhancing drug-drug interaction extraction from texts by molecular structure information [C]// Proceedings of the 56th Annual Meeting of the ACL, 2018: 680-685) proposed fusing molecular structure information into a CNN and a graph convolutional neural network (GCNN) to extract DDIs. Recurrent neural networks (RNNs) are better suited than CNNs to processing time-series data and are better at capturing the sequential features of sentences. Zhang et al. (Zhang Y, Zheng W, Lin H, et al. Drug-drug interaction extraction via hierarchical RNNs on sequence and shortest dependency paths [J]. Bioinformatics, 2018, 34(5): 828-835) proposed a hierarchical RNN method that combines shortest dependency paths (SDPs) and sentence sequences for DDI extraction. Some researchers have also combined the two kinds of models to extract DDIs. Sun et al. (Sun X, Dong K, Ma L, et al. Drug-drug interaction extraction via recurrent hybrid convolutional neural networks with an improved focal loss [J]. Entropy, 2019, 21(1): 37) proposed a recurrent hybrid convolutional neural network (RHCNN) for DDI extraction.
Although various approaches have been proposed, there is still much room to improve the performance of DDI extraction models. To avoid the influence of complicated drug names on model performance, previous work often replaced drug names in sentences with fixed placeholder words, which loses part of the useful information. Furthermore, previous work mostly relied on syntactic features such as dependency paths to improve performance; since these features are generated by specific tools, model performance is also limited by those tools.
Disclosure of Invention
The invention does not depend on any lexical or syntactic information; it simplifies the model input through improved BioBERT pre-trained word vectors and multiple entity attention mechanisms, makes better use of drug-name information, and achieves performance at the current leading level.
The technical scheme of the invention is as follows:
a method for extracting relationships among medicines based on multi-entity attention and an improved pre-training language model comprises the following steps:
text preprocessing
Preprocessing the corpus: (1) first convert all text to lower case, then remove punctuation marks and non-English characters; (2) because drug-drug relation extraction does not involve quantitative analysis, replace all numbers in the text with the word "num"; (3) a sentence may contain several drug entities; one instance is generated for each pair of drug entities, giving n(n-1)/2 instances in total, where n is the number of drug entities in the sentence; (4) replace the two target entities in each instance with "drug1" and "drug2" and all non-target entities with "drug0"; (5) set the maximum sentence length the model can handle, and pad sentences shorter than the maximum length with the character "0".
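The preprocessing steps above can be sketched as follows; the helper names (`normalize`, `make_instances`) and the plain-string entity matching are illustrative assumptions, since the real pipeline works on the corpus's entity annotations:

```python
import itertools
import re

def normalize(text):
    """Steps (1)-(2): lowercase, strip punctuation and non-English
    characters, and replace every number with the word 'num'."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    text = re.sub(r"\d+", "num", text)
    return " ".join(text.split())

def make_instances(sentence, entities):
    """Steps (3)-(4): one instance per entity pair (n*(n-1)/2 in total),
    with the target pair masked as drug1/drug2 and other entities as drug0."""
    instances = []
    for e1, e2 in itertools.combinations(entities, 2):
        text = normalize(sentence)
        for ent in entities:
            mask = "drug1" if ent is e1 else "drug2" if ent is e2 else "drug0"
            text = text.replace(normalize(ent), mask)
        instances.append(text)
    return instances
```

For a sentence with three drug mentions this yields three instances, e.g. `make_instances("Aspirin interacts with Warfarin and Ibuprofen.", ["Aspirin", "Warfarin", "Ibuprofen"])` masks a different pair in each instance. Padding to the maximum length (step (5)) is left to the batching code.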
(II) obtaining sentence preliminary coding by using improved BioBERT model
The improved BioBERT is adopted to encode word vectors, giving them better generalization performance. As shown in FIG. 2, the BioBERT model, like BERT, consists of 12 Transformer layers, and the output of each Transformer layer is fed to the next layer; the output vectors of the last four Transformer layers are averaged, and this average replaces the original BioBERT output. For a preprocessed sentence X = {x1, x2, ..., xm}, where m is the sentence length, encoding with the improved BioBERT yields the vector representation of the sentence, V = BioBERT(X);
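A minimal numpy sketch of the layer-averaging idea, with constant stand-in arrays in place of real Transformer outputs (the shapes and values here are purely illustrative):

```python
import numpy as np

# Stand-ins for the 12 Transformer layer outputs of BioBERT, each of shape
# (sentence_length, hidden_size); tiny constant arrays for illustration.
layer_outputs = [np.full((5, 4), float(i)) for i in range(12)]

# Improved BioBERT output: the element-wise average of the last four
# layers replaces the usual final-layer output.
V = np.mean(layer_outputs[-4:], axis=0)
```

With real BioBERT this corresponds to requesting all hidden states from the model and averaging the last four before any downstream layer sees them.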
(III) utilizing bidirectional gating recursion unit to obtain semantic representation of sentence
To incorporate context information into the sentence encoding, a Bi-GRU is used to further encode the sentence. For each word vi in V, a forward and a backward GRU encoding are obtained and concatenated to give the final representation of each word, hi ∈ R^(2dh), where dh is the output dimension of a GRU unit. The sentence encoding vector is then H = {h1, h2, ..., hm};
(IV) enhancing the weight of an entity in a sentence by using a plurality of entity attention mechanisms
The sentence encoding vector H is processed by three different entity attention mechanisms to strengthen the model's understanding of the drug entities. The three mechanisms use the same basic attention model but take different drug-entity information as input, so the neural network exploits drug-entity information from different angles. The three attention mechanisms are described below.
(4.1) drug description document attention
Wikipedia and DrugBank are selected as the sources of drug-entity description documents. For the set E = {e1, e2, ..., ek} of all drug entities in the corpus, where k is the total number of drug entities, the drug description documents are converted by a Doc2Vec model into the set of description-document vectors K = Doc2Vec(E), with each document vector of length de;
(4.2) attention of drug entities
The word vectors of the drug entities are fed into the attention mechanism as features; the drug-entity information consists of the vectors he1 and he2 in the sentence encoding vector H that correspond to the two drug entities of the candidate relation;
(4.3) attention between drug entities
The difference between the two drug entities is used as mutual information between the two drugs and fed into the attention mechanism; the inter-drug information is the difference of the two drug-entity vectors, i.e. he12 = he1 - he2;
Each of the three kinds of entity information is fed, together with the sentence encoding vector H, into the attention mechanism to obtain an entity-information-weighted sentence representation; the attention mechanism is shown in equations (1)-(3):
M = tanh([HWs, RWp] + b) (1)
α = softmax(M) (2)
r = Hα^T (3)
where R is the sequence obtained by expanding one of the three features to the sentence length, Ws and Wp are the parameter matrices of the attention mechanism with dimension da, b is a bias, and r is the output of the attention mechanism;
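A simplified numpy sketch of equations (1)-(3); since the patent does not fully specify all matrix shapes, the collapse of M to a single score per token before the softmax is an illustrative assumption:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention(H, feature, Ws, Wp, b):
    """Sketch of Eqs. (1)-(3): expand the entity feature to the sentence
    length (R), score it against the encoding H, and return the
    attention-weighted sentence vector r."""
    m = H.shape[0]
    R = np.tile(feature, (m, 1))                               # expand feature to sentence length
    M = np.tanh(np.concatenate([H @ Ws, R @ Wp], axis=1) + b)  # Eq. (1)
    alpha = softmax(M.sum(axis=1))                             # Eq. (2), one weight per token
    return alpha @ H                                           # Eq. (3), weighted sum over tokens
```

Calling it with H of shape (m, d) and any of the three features (a description-document vector, an entity vector, or their difference) returns a d-dimensional weighted sentence vector.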
Using the above attention mechanism, entity-weighted sentence vector representations based on the three kinds of features are obtained, as shown in equations (4)-(8):

rk1 = attention(H, k1) (4)
rk2 = attention(H, k2) (5)
re1 = attention(H, he1) (6)
re2 = attention(H, he2) (7)
re12 = attention(H, he12) (8)
where k1 and k2 are the description-document vectors of the two drugs from the set K; rk1 and rk2 are the attention results obtained from the two drug-description-document vectors; re1 and re2 are the attention results obtained from the two drug entities; and re12 is the attention result obtained from the difference of the two drug entities. These attention results are concatenated with the last element hm of the sentence encoding vector H to give the final sentence representation vector O, as shown in equation (9):

O = [rk1; rk2; re1; re2; re12; hm] (9)
(V) obtaining the final medicine relation classification by utilizing a Softmax classifier
After the entity-information-weighted sentence representation is obtained, its dimension is compressed by a one-layer feedforward neural network, and the result is sent to a Softmax layer to obtain the final classification result;
The model output layer sends the output O of the multi-entity attention layer as the final classification feature to a fully connected layer for classification; the probability P(y = c) that a candidate drug-drug relation belongs to DDI type c (c ∈ C) is shown in equation (10):
P(y = c) = Softmax(OWO + b) (10)
where WO and b are a weight matrix and a bias, the activation function of the fully connected layer is Softmax, and C is the set of DDI type labels. Finally, the class label with the highest probability, computed with equation (11), is taken as the relation type of the candidate drug-drug pair:

y* = argmax(c∈C) P(y = c) (11)
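Equations (10)-(11) amount to a fully connected layer with a Softmax activation over the five DDI types, followed by an argmax; a small numpy sketch with toy weights (the weight values are illustrative only):

```python
import numpy as np

LABELS = ["negative", "effect", "mechanism", "advice", "int"]

def classify(O, W_O, b):
    """Sketch of Eqs. (10)-(11): fully connected layer + Softmax over the
    five DDI types, then argmax to pick the predicted relation type."""
    logits = O @ W_O + b
    e = np.exp(logits - logits.max())
    probs = e / e.sum()
    return LABELS[int(np.argmax(probs))], probs

O = np.ones(4)                             # toy sentence representation
W_O = np.zeros((4, 5)); W_O[:, 2] = 1.0    # toy weights favouring class 2
label, probs = classify(O, W_O, np.zeros(5))
```

Here the toy weights make class 2 ("mechanism") win; in the real model W_O and b are learned and O is the concatenated attention output.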
The beneficial effects of the invention: Table 1 compares the extraction method of the invention with other DDI extraction methods, where all methods are tested on the DDIExtraction2013 corpus. The F1 score of the invention is 80.9%, a 5.4% improvement over the previous best result. The system also achieves the highest precision and recall, rising to 81.0% and 80.9% respectively.
TABLE 1 comparison of the Effect of the invention with other DDI extraction methods
Drawings
FIG. 1 is a diagram of a neural network model architecture employed by the present invention.
FIG. 2 is a schematic representation of the improvement of the BioBERT model according to the present invention.
Detailed Description
The following describes in detail embodiments of the present invention in conjunction with the constructed neural network model of the present invention.
The overall model structure of the invention is shown in FIG. 1. The DDI corpus to be processed is preprocessed; the textual definitions of the drugs involved are retrieved from DrugBank and Wikipedia, and these drug descriptions are converted into vectors with a Doc2Vec tool. For sentences in the DDI corpus, the invention obtains vector representations through the improved BioBERT model and a bidirectional GRU network. Finally, the discrimination result is obtained through a feedforward neural network and a softmax layer. The specific implementation flow is described below.
First, preprocessing the corpus
The pretreatment work comprises the following steps:
(1) removing punctuation marks and non-English characters in the corpus, and separating each word by a blank;
(2) uniformly converting the text into lower case characters;
(3) uniformly replacing all numbers in the corpus with "num";
(4) when a sentence in the corpus contains several drug entities, all drug entities are combined pairwise; if the sentence contains n drug entities, n(n-1)/2 instances are generated in total. In addition, the invention replaces the two drug entities of the candidate pair in each instance with "drug1" and "drug2", and the other drugs in the instance with "drug0".
(5) The set model is able to handle the maximum length of a sentence and if the sentence in the instance does not reach the maximum length, it is filled in with the character "0".
Coding of sentences
The sentence coding is divided into the following two steps:
(1) preliminary coding of sentences by modified BioBERT
The invention encodes each word in a sentence as a word vector through the improved BioBERT. For a preprocessed sentence X = {x1, x2, ..., xn}, where n is the sentence length, this yields the vector representation of the sentence, V = BioBERT(X). BioBERT was pre-trained on two biomedical text databases, PMC and PubMed.
(2) Context semantic coding of sentences by Bi-GRU
For each word vi in V, the invention obtains a forward and a backward GRU encoding and concatenates them to get the final representation of each word, hi ∈ R^(2dh), where dh is the output dimension of a GRU unit. The sentence encoding is then H = {h1, h2, ..., hn}. The output dimension of the GRU units is identical to the output dimension of the BioBERT model.
Coding of medicine description document
The invention uses the browser automation framework Selenium as a crawler to dynamically crawl the abstract of each entity from Wikipedia and DrugBank. During crawling, not every entity has a definite matching abstract: some mentions, for example anticonvulsant drugs, are not a specific drug but the general name of a class of drugs, so no entry for the exact mention can be found. In such cases the name of the drug class is used in place of the full entity, i.e. the abstract retrieved with the class name as keyword serves as the abstract of the entity. For the small number of entities that still have no corresponding abstract after this processing, the entity name itself is used as the abstract.
For the set E = {e1, e2, ..., ek} of all drug entities in the corpus, the drug description documents are converted by a Doc2Vec model into the set of description-document vectors K = Doc2Vec(E), with each document vector of length de.
Attention mechanism for four or more kinds of entities
The three kinds of entity information adopted by the method are drug description information, drug entity information, and inter-drug information. The drug description information is the set K of drug-description-document vectors; the drug entity information consists of the vectors he1 and he2 corresponding to the two drug entities of the candidate relation in the sentence sequence encoding H; the inter-drug information is the difference of the two drug-entity vectors, i.e. he12 = he1 - he2. The dimension of all three kinds of entity information is the same as the output dimension of the GRU units.
Each of the three kinds of entity information is fed, together with the sentence representation H, into the attention mechanism (equations (1)-(3)) to obtain entity-information-weighted sentence representations, as shown in equations (4)-(8), where rk1 and rk2 are the attention results obtained from the two drug-description-document vectors, re1 and re2 are the attention results obtained from the two drug entities, and re12 is the attention result obtained from the difference of the two drug entities. These attention results are concatenated with the last element hn of the sentence encoding H to give the final sentence representation vector O, as shown in equation (9). The output of the attention mechanism has the same dimension as the output of the GRU units.
Five, output
The model output layer sends the output O of the multi-entity attention layer as the final classification feature to a fully connected layer for classification; the probability P(y = c) that a candidate drug-drug relation belongs to DDI type c (c ∈ C) is shown in equation (10).
where WO and b are a weight matrix and a bias, the activation function of the fully connected layer is Softmax, and C = {negative, effect, mechanism, advice, int} is the set of DDI type labels. Finally, the class label y* with the highest probability is computed using equation (11) and is the relation type of the candidate drug-drug pair.
After the model is realized through the above five steps, the invention trains it and tests its performance on the DDIExtraction2013 corpus. The training and test sets are split 9:1. Table 2 summarizes the DDIExtraction2013 corpus: it consists of 792 texts from the DrugBank database and 233 abstracts from the MedLine database, with 5 drug-relation types in total, namely Negative, Effect, Mechanism, Advice and Int.
TABLE 2 number of relationships in DDIExtraction2013 corpus
| Type | DDI-DrugBank | DDI-MedLine | Total |
|---|---|---|---|
| Effect | 1855 (39.4%) | 214 (65.4%) | 2069 (41.1%) |
| Mechanism | 1539 (32.7%) | 86 (26.3%) | 1625 (32.3%) |
| Advice | 1035 (22%) | 15 (4.6%) | 1050 (20.9%) |
| Int | 272 (5.8%) | 12 (3.7%) | 284 (5.6%) |
| Total | 4701 | 327 | 5028 |
The invention generates candidate instances by pairing the drugs in the corpus. However, among the training instances obtained this way, the number of Negative instances is extremely large, and this class imbalance greatly affects model performance. To mitigate the imbalance among the drug-relation instances in the corpus, the invention removes negative instances according to the following three rules:
1. if two drugs in a drug pair appear in the same relationship, the corresponding instance is filtered out.
2. If two drugs in a drug pair have the same name, or one is an abbreviation for the other, the corresponding instance is filtered out.
3. If one drug in a drug pair is a special case of the other drug, the corresponding instance is filtered out.
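Rules 2 and 3 can be approximated with simple string tests; rule 1 needs sentence structure and is omitted here, and the `is_abbreviation` heuristic below is an illustrative assumption rather than the patent's exact test:

```python
def is_abbreviation(short, long_name):
    """Crude abbreviation check: the short form equals the initials of a
    multi-word long form (an illustrative heuristic)."""
    words = long_name.split()
    initials = "".join(w[0] for w in words)
    return len(words) > 1 and short.lower() == initials.lower()

def keep_pair(d1, d2):
    """Return False for candidate pairs the negative-filtering rules drop."""
    a, b = d1.lower(), d2.lower()
    if a == b:
        return False                                   # rule 2: identical names
    if is_abbreviation(d1, d2) or is_abbreviation(d2, d1):
        return False                                   # rule 2: one is an abbreviation
    if a in b or b in a:
        return False                                   # rule 3: one is a special case of the other
    return True
```

For example, a pair of distinct drugs is kept, while a name paired with its own abbreviation or with a more specific form of itself is filtered out.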
The corpus instance statistics after deleting negative instances are shown in Table 3. This rule-based negative-instance deletion alleviates the class imbalance among instances to a certain extent.
TABLE 3 data set by example Generation and negative deletion
The evaluation metric adopted by the invention is the F1 score, as shown in equation (12):

F1 = 2PR / (P + R) (12)
where P denotes precision and R denotes recall; they are computed as in equations (13)-(14):

P = TP / (TP + FP) (13)
R = TP / (TP + FN) (14)
where TP represents the number of predicted positive and actual positive instances, FP represents the number of predicted positive and actual negative instances, FN represents the number of predicted negative and actual positive instances, and TN represents the number of predicted negative and actual negative instances.
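Equations (12)-(14) can be computed directly from the four counts; a minimal sketch:

```python
def precision_recall_f1(tp, fp, fn):
    """Eqs. (12)-(14): precision, recall, and their harmonic mean F1."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

(TN is not needed for these three metrics.)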
The invention implements the model using the Keras library with a TensorFlow backend. The model parameters are listed in Table 4.
TABLE 4 parameter set of inventive model
| Parameter name | Parameter value |
|---|---|
| Doc2Vec vector dimension | 200 |
| BioBERT vector dimension | 768 |
| BiGRU layer output dimension | 1536 |
| Maximum sentence length | 250 |
| Attention layer output dimension | 1536 |
| Multilayer perceptron output dimension | 256 |
In the training phase, the invention uses early stopping: if model performance on the validation set does not improve for 10 consecutive epochs, training stops, and the model that performed best on the validation set is selected as the final model for predicting the test set. All hyperparameters are tuned on the validation set by grid search. The learning rate during training is set to 0.001, and the model processes 128 instances per batch.
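The early-stopping rule can be sketched as a small pure-Python loop over per-epoch validation scores; the function name and the list-of-scores interface are illustrative (in practice this is Keras's `EarlyStopping` callback):

```python
def best_epoch_with_patience(val_scores, patience=10):
    """Stop once `patience` consecutive epochs bring no improvement on the
    validation set, and keep the best-scoring epoch seen so far."""
    best_score, best_epoch, wait = float("-inf"), -1, 0
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            best_score, best_epoch, wait = score, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                break
    return best_epoch, best_score
```

With patience 2 and scores `[0.70, 0.78, 0.77, 0.76, 0.75]`, training stops after two non-improving epochs and the epoch-1 model is kept.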
Claims (1)
1. A method for extracting relationships among medicines based on attention of various entities and an improved pre-training language model is characterized by comprising the following steps:
text preprocessing
Preprocessing the corpus: (1) first convert all text to lower case, then remove punctuation marks and non-English characters; (2) replace all numbers in the text with the word "num"; (3) a sentence may contain several drug entities; one instance is generated for each pair of drug entities, giving n(n-1)/2 instances in total, where n is the number of drug entities in the sentence; (4) replace the two target entities in each instance with "drug1" and "drug2" and all non-target entities with "drug0"; (5) set the maximum sentence length the model can handle, and pad sentences shorter than the maximum length with the character "0";
(II) obtaining sentence preliminary coding by using improved BioBERT model
The improved BioBERT is adopted to encode word vectors; the BioBERT model, like BERT, consists of 12 Transformer layers, and the output of each Transformer layer is fed to the next layer; in the BioBERT model the output vectors of the last four Transformer layers are averaged, and this average replaces the original BioBERT output; for a preprocessed sentence X = {x1, x2, ..., xm}, where m is the sentence length, encoding with the improved BioBERT yields the vector representation of the sentence, V = BioBERT(X);
(III) utilizing bidirectional gating recursion unit to obtain semantic representation of sentence
To incorporate context information into the sentence encoding, a Bi-GRU is used to further encode the sentence; for each word vi in V, a forward and a backward GRU encoding are obtained and concatenated to give the final representation of each word, hi ∈ R^(2dh), where dh is the output dimension of a GRU unit; the sentence encoding vector is then H = {h1, h2, ..., hm};
(IV) enhancing the weight of an entity in a sentence by using a plurality of entity attention mechanisms
The sentence encoding vector H is processed by three different entity attention mechanisms to strengthen the model's understanding of the drug entities;
(4.1) drug description document attention
Wikipedia and DrugBank are selected as the sources of drug-entity description documents; for the set E = {e1, e2, ..., ek} of all drug entities in the corpus, where k is the total number of drug entities, the drug description documents are converted by a Doc2Vec model into the set of description-document vectors K = Doc2Vec(E), with each document vector of length de;
(4.2) attention of drug entities
The word vectors of the drug entities are fed into the attention mechanism as features; the drug-entity information consists of the vectors he1 and he2 in the sentence encoding vector H that correspond to the two drug entities of the candidate relation;
(4.3) attention between drug entities
The difference between the two drug entities is used as mutual information between the two drugs and fed into the attention mechanism; the inter-drug information is the difference of the two drug-entity vectors, i.e. he12 = he1 - he2;
Each of the three kinds of entity information is fed, together with the sentence encoding vector H, into the attention mechanism to obtain an entity-information-weighted sentence representation; the attention mechanism is shown in equations (1)-(3):
M = tanh([HWs, RWp] + b) (1)
α = softmax(M) (2)
r = Hα^T (3)
where R is the sequence obtained by expanding one of the three features to the sentence length, Ws and Wp are the parameter matrices of the attention mechanism with dimension da, b is a bias, and r is the output of the attention mechanism;
Using the above attention mechanism, entity-weighted sentence vector representations based on the three kinds of features are obtained, as shown in equations (4)-(8):

rk1 = attention(H, k1) (4)
rk2 = attention(H, k2) (5)
re1 = attention(H, he1) (6)
re2 = attention(H, he2) (7)
re12 = attention(H, he12) (8)
where k1 and k2 are the description-document vectors of the two drugs from the set K; rk1 and rk2 are the attention results obtained from the two drug-description-document vectors; re1 and re2 are the attention results obtained from the two drug entities; and re12 is the attention result obtained from the difference of the two drug entities; these attention results are concatenated with the last element hm of the sentence encoding vector H to give the final sentence representation vector O, as shown in equation (9):

O = [rk1; rk2; re1; re2; re12; hm] (9)
(V) Obtaining the final drug relation classification using a Softmax classifier

After the entity-information-weighted sentence representation is obtained, its dimension is compressed through one layer of feedforward neural network, and it is finally sent to the Softmax layer to obtain the final classification result;

the model output layer sends the output O of the multi-entity attention layer, as the final classification feature, to the fully connected layer for classification; the probability P(y = c) that the candidate drug-drug relation y belongs to DDI type c (c ∈ C) is shown in equation (10):

P(y = c) = softmax(OWO + b) (10)

where WO and b are the weight matrix and bias, the activation function of the fully connected layer is Softmax, and C is the set of DDI type labels; finally, the category label ŷ with the highest probability, namely the relation type of the candidate drug-drug pair, is computed using equation (11):

ŷ = argmax(c ∈ C) P(y = c) (11)
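The classification step of equations (10)-(11) can be sketched as follows; the DDI label set (borrowed from the DDIExtraction shared task), the dimension of O, and the random weights are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Assumed DDI label set C (DDIExtraction-style types, not from the patent text)
labels = ["advise", "effect", "mechanism", "int", "negative"]

d_o = 24                                        # dimension of sentence vector O
O = rng.standard_normal(d_o)                    # final feature from attention layer
W_O = rng.standard_normal((d_o, len(labels))) * 0.1
b = np.zeros(len(labels))

P = softmax(O @ W_O + b)               # eq. (10): probability over DDI types
y_hat = labels[int(np.argmax(P))]      # eq. (11): highest-probability label
print(y_hat, float(P.max()))
```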
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911330114.1A CN111078889B (en) | 2019-12-20 | 2019-12-20 | Method for extracting relationship between medicines based on various attentions and improved pre-training |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111078889A true CN111078889A (en) | 2020-04-28 |
CN111078889B CN111078889B (en) | 2021-01-05 |
Family
ID=70316460
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911330114.1A Active CN111078889B (en) | 2019-12-20 | 2019-12-20 | Method for extracting relationship between medicines based on various attentions and improved pre-training |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111078889B (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111798954A (en) * | 2020-06-11 | 2020-10-20 | 西北工业大学 | Drug combination recommendation method based on time attention mechanism and graph convolution network |
CN111814460A (en) * | 2020-07-06 | 2020-10-23 | 四川大学 | External knowledge-based drug interaction relation extraction method and system |
CN111949792A (en) * | 2020-08-13 | 2020-11-17 | 电子科技大学 | Medicine relation extraction method based on deep learning |
CN112256939A (en) * | 2020-09-17 | 2021-01-22 | 青岛科技大学 | Text entity relation extraction method for chemical field |
CN112528621A (en) * | 2021-02-10 | 2021-03-19 | 腾讯科技(深圳)有限公司 | Text processing method, text processing model training device and storage medium |
CN112667808A (en) * | 2020-12-23 | 2021-04-16 | 沈阳新松机器人自动化股份有限公司 | BERT model-based relationship extraction method and system |
CN112820375A (en) * | 2021-02-04 | 2021-05-18 | 闽江学院 | Traditional Chinese medicine recommendation method based on multi-graph convolution neural network |
CN112860816A (en) * | 2021-03-01 | 2021-05-28 | 三维通信股份有限公司 | Construction method and detection method of interaction relation detection model of drug entity pair |
CN113241128A (en) * | 2021-04-29 | 2021-08-10 | 天津大学 | Molecular property prediction method based on molecular space position coding attention neural network model |
CN113642319A (en) * | 2021-07-29 | 2021-11-12 | 北京百度网讯科技有限公司 | Text processing method and device, electronic equipment and storage medium |
CN113806531A (en) * | 2021-08-26 | 2021-12-17 | 西北大学 | Drug relationship classification model construction method, drug relationship classification method and system |
CN114048727A (en) * | 2021-11-22 | 2022-02-15 | 北京富通东方科技有限公司 | Medical field-oriented relation extraction method |
CN114925678A (en) * | 2022-04-21 | 2022-08-19 | 电子科技大学 | Drug entity and relationship combined extraction method based on high-level interaction mechanism |
CN117408247A (en) * | 2023-12-15 | 2024-01-16 | 南京邮电大学 | Intelligent manufacturing triplet extraction method based on relational pointer network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160275250A1 (en) * | 2015-03-17 | 2016-09-22 | Biopolicy Innovations Inc. | Drug formulary document parsing and comparison system and method |
CN108733792A (en) * | 2018-05-14 | 2018-11-02 | 北京大学深圳研究生院 | A kind of entity relation extraction method |
CN109902171A (en) * | 2019-01-30 | 2019-06-18 | 中国地质大学(武汉) | Text Relation extraction method and system based on layering knowledge mapping attention model |
CN110580340A (en) * | 2019-08-29 | 2019-12-17 | 桂林电子科技大学 | neural network relation extraction method based on multi-attention machine system |
Non-Patent Citations (3)
Title |
---|
LING LUO等: "An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition", 《DATA AND TEXT MINING》 * |
LI LISHUANG et al.: "Drug relation extraction with an attention mechanism fusing dependency information", 《Journal of Chinese Information Processing》 *
JIANG ZHENCHAO: "Biomedical relation extraction based on word representations and deep learning", 《Wanfang Data》 *
Also Published As
Publication number | Publication date |
---|---|
CN111078889B (en) | 2021-01-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111078889B (en) | Method for extracting relationship between medicines based on various attentions and improved pre-training | |
CN112199511B (en) | Cross-language multi-source vertical domain knowledge graph construction method | |
CN110298037B (en) | Convolutional neural network matching text recognition method based on enhanced attention mechanism | |
CN111159223B (en) | Interactive code searching method and device based on structured embedding | |
CN109446338B (en) | Neural network-based drug disease relation classification method | |
US20220147836A1 (en) | Method and device for text-enhanced knowledge graph joint representation learning | |
CN110825721B (en) | Method for constructing and integrating hypertension knowledge base and system in big data environment | |
CN111950285B (en) | Medical knowledge graph intelligent automatic construction system and method with multi-mode data fusion | |
CN110532328B (en) | Text concept graph construction method | |
CN106844658A (en) | A kind of Chinese text knowledge mapping method for auto constructing and system | |
CN110287323B (en) | Target-oriented emotion classification method | |
CN105512209A (en) | Biomedicine event trigger word identification method based on characteristic automatic learning | |
CN111625659A (en) | Knowledge graph processing method, device, server and storage medium | |
CN110189831A (en) | A kind of case history knowledge mapping construction method and system based on dynamic diagram sequences | |
CN113806563A (en) | Architect knowledge graph construction method for multi-source heterogeneous building humanistic historical material | |
Gao et al. | Detecting disaster-related tweets via multimodal adversarial neural network | |
CN115019906B (en) | Drug entity and interaction combined extraction method for multi-task sequence labeling | |
CN115269865A (en) | Knowledge graph construction method for auxiliary diagnosis | |
WO2020074017A1 (en) | Deep learning-based method and device for screening for keywords in medical document | |
CN113988075A (en) | Network security field text data entity relation extraction method based on multi-task learning | |
CN113742493A (en) | Method and device for constructing pathological knowledge map | |
CN113707339A (en) | Method and system for concept alignment and content inter-translation among multi-source heterogeneous databases | |
CN116244448A (en) | Knowledge graph construction method, device and system based on multi-source data information | |
Frisoni et al. | Unsupervised Descriptive Text Mining for Knowledge Graph Learning. | |
Tianxiong et al. | Identifying chinese event factuality with convolutional neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||