CN110020671B - Drug relationship classification model construction and classification method based on dual-channel CNN-LSTM network

Info

Publication number
CN110020671B
Authority
CN
China
Prior art keywords
text
drug
medicine
layer
preprocessed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910174269.4A
Other languages
Chinese (zh)
Other versions
CN110020671A (en
Inventor
孙霞
马龙
张蕾
冯筠
吴楠楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest University
Original Assignee
Northwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwest University
Priority to CN201910174269.4A
Publication of CN110020671A
Application granted
Publication of CN110020671B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a method for constructing a drug relationship classification model based on a dual-channel CNN-LSTM network. The method comprises: preprocessing an original drug text set; reversing the order of each preprocessed drug text in the preprocessed drug text set to obtain a reverse-order text set; taking the preprocessed drug text set as the forward-order text set; and training a neural network to obtain a drug relationship classification model. The neural network comprises a forward-order text feature extraction layer and a reverse-order text feature extraction layer arranged in parallel, followed by a feature fusion layer and a classification layer. The forward-order and reverse-order text feature extraction layers each comprise convolution blocks and a long short-term memory (LSTM) neural network block arranged in sequence. By constructing a dual-channel CNN-LSTM network, the invention uses the CNN to extract the local features of the drug texts and the LSTM to extract their global features, so that the extracted drug relationship features are richer and the classification accuracy is improved.

Description

Drug relationship classification model construction and classification method based on dual-channel CNN-LSTM network
Technical Field
The invention relates to a method for constructing and classifying a drug relationship classification model, in particular to a method for constructing and classifying a drug relationship classification model based on a dual-channel CNN-LSTM network.
Background
A drug relationship (drug-drug interaction) refers to the combined effect of two or more drugs administered simultaneously or within a certain period of time. Such effects can be classified as synergistic, antagonistic, or non-interacting. Antagonistic effects between drugs pose serious health risks to patients. The drug relationship extraction (DDIE) task is a typical relation extraction task in natural language processing. It aims to detect and identify the semantic relationship between a pair of drugs, and it is important for reducing drug safety accidents and promoting the development of biomedical technology.
In recent years, experts in biomedicine and text mining have devoted great effort to the DDIE task and have proposed many methods, which fall mainly into three categories: rule pattern-based methods, statistical machine learning-based methods, and deep learning-based methods. Rule pattern-based methods can identify entity relationships in a target text in a targeted way, but they have three serious disadvantages: (1) a large amount of manpower and material resources must be spent studying the target text, otherwise the information extraction quality of the formulated rules cannot be guaranteed; (2) formulating the rules requires domain experts to supply a large amount of prior knowledge, and different experts may, through subjective judgment, formulate rule sets with inconsistent standards; (3) because the rules are closely tied to domain knowledge, they are suitable only for information extraction within that domain and generally generalize poorly. For these reasons, rule pattern-based methods have not attracted wide attention from researchers. Statistical machine learning-based methods work well, but they require elaborate and cumbersome feature engineering to extract a suitable feature set. Moreover, the quality of the extracted features depends on existing natural language processing tools; because of the noise and cost of these tools, the extracted features are disordered and their quality is difficult to guarantee, so the classification accuracy is not high.
Disclosure of Invention
The invention aims to provide a method for constructing and classifying a drug relationship classification model based on a dual-channel CNN-LSTM network, which is used for solving the problem of low accuracy of drug relationship classification caused by disorder of features extracted by a drug relationship classification method in the prior art.
In order to realize the task, the invention adopts the following technical scheme:
a drug relationship classification model construction method based on a dual-channel CNN-LSTM network is implemented according to the following steps:
step 1, obtaining an original drug text set;
labeling the drug relationship in each original drug text in the original drug text set to obtain a drug relationship label set;
step 2, preprocessing the original drug text set to obtain a preprocessed drug text set;
the preprocessing comprises text normalization, text length fixing and text vector mapping;
step 3, reversing the order of each preprocessed drug text in the preprocessed drug text set to obtain a reverse-order text set;
taking the preprocessed drug text set as the forward-order text set;
step 4, taking the forward-order text set and the reverse-order text set as input and the drug relationship label set as output, training a neural network to obtain a drug relationship classification model;
the neural network comprises a forward-order text feature extraction layer and a reverse-order text feature extraction layer arranged in parallel, followed in sequence by a feature fusion layer and a classification layer;
the forward-order text feature extraction layer and the reverse-order text feature extraction layer each comprise convolution blocks and a long short-term memory neural network block arranged in sequence.
Further, the number of convolution blocks is set to 4.
Furthermore, each convolution block comprises a batch normalization sublayer, a convolution sublayer, an activation function sublayer and a pooling sublayer arranged in sequence.
Further, the activation function in the activation function sublayer is a ReLU function.
Further, the feature fusion layer comprises a full connection layer.
Further, the classification layer comprises a Softmax function layer.
A drug relationship classification method based on a dual-channel CNN-LSTM network, in which a drug text to be classified is processed according to the following steps:
step A, preprocessing the drug text to be classified using the method of step 2 in claim 1 to obtain a preprocessed drug text;
step B, inputting the preprocessed drug text into the drug relationship classification model of any one of claims 1 to 6 to obtain a classification result.
Compared with the prior art, the invention has the following technical characteristics:
1. the method constructs a dual-channel CNN-LSTM network in which the CNN extracts the local features of the drug texts and the LSTM extracts their global features, so that the extracted drug relationship features are richer and the classification accuracy is improved;
2. the method sends the forward-order text and the reverse-order text of each drug relationship text into its own CNN-LSTM channel to complete feature extraction; compared with a single-channel LSTM network, the extracted drug relationship features are more comprehensive and the classification accuracy is improved;
3. by extracting drug text feature vectors, the method simplifies the process of extracting drug feature vectors and improves the accuracy of drug relationship classification;
4. the method takes as input the original drug relationship text containing several drug entities, requires neither manual intervention nor knowledge of the related field, does not require complex text features to be extracted by hand, and has strong generalization ability.
Drawings
FIG. 1 is a diagram of a drug classification model provided in one embodiment of the present invention;
fig. 2 is a diagram of the internal structure of a convolution block provided in one embodiment of the present invention.
Detailed Description
The terms appearing in the detailed description are explained first:
Long short-term memory neural network (LSTM): an LSTM network consists of an input gate, a forget gate, an output gate and a memory cell. Through this gating mechanism the LSTM can effectively learn long-term dependencies in its input data, and it is widely used for processing sequential information such as text data and trajectory data.
Convolutional neural network (CNN): a feed-forward neural network that includes convolution computations and has a deep structure.
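For reference, the gating mechanism named in the LSTM definition above is commonly written as follows. This is the standard textbook formulation, not a formula taken from the patent; the symbols $x_t$ (input at step $t$), $h_t$ (hidden state), $c_t$ (memory cell), the weight matrices $W$, $U$ and the biases $b$ are notation assumed here for illustration.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate memory)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(memory cell update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

Here $\sigma$ is the logistic sigmoid and $\odot$ is element-wise multiplication.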
Example one
As shown in fig. 1, in this embodiment, a method for constructing a drug relationship classification model based on a dual-channel CNN-LSTM network is disclosed, where the method is performed according to the following steps:
step 1, obtaining an original medicine text set;
labeling the drug relationship in each original drug text in the original drug text set to obtain a drug relationship label set;
The biomedical texts used in this embodiment can be obtained from biomedical documents, papers and similar sources. An acquired text may be a part or the whole of a document or paper, but the semantic expression of the text must be complete.
Each original drug text must contain at least two target drug name words, that is, the drug words involved in the drug relationship classification; all remaining words are other words. For example, in this embodiment the original drug text is: "Some quinolones, including ciprofloxacin, have been associated with transient elevations in serum creatinine in patients receiving cyclosporine concomitantly", in which "quinolones", "ciprofloxacin" and "cyclosporine" are drug name words and the remaining words are other words.
In the data set used here, text length ranges from 0 to 150 words, with most texts between 20 and 60 words long, and backward dependence phenomena (e.g., grammatical phenomena such as attributive clauses) account for 46% of the data set.
There are 5 drug relationship labels: advice, effect, mechanism, int (forward) and false (irrelevant).
Step 2, preprocessing the original medicine text set to obtain a preprocessed medicine text set;
the preprocessing comprises text normalization, text length fixation and text vector mapping;
In this embodiment, the original drug text set is preprocessed using the drug text processing method of the patent "drug relation classification method based on multilayer convolutional neural network". The original drug texts in the set differ in format and length, and drug name words are complex and uncommon, so errors are easily introduced when a neural network is used for classification. The acquired original drug texts therefore need to be preprocessed. The preprocessing includes normalizing the word forms of all words in the original drug text, that is, unifying the forms of all words, and renaming the target drug name words in a unified form that replaces the original names. The specific operations are as follows:
step 2.1, normalizing all words in the original drug text set, and replacing each target drug name word with a name in the unified form, to obtain a normalized drug text set;
wherein the normalization comprises word form normalization and naming normalization;
To make the classification of drug texts more accurate and reduce the introduction of errors, word form normalization is applied to every word in each original drug text, converting the words into a uniform format. This is repeated until every word of every original drug text in the original drug text set has been normalized.
To improve the generalization of the neural network, all target drug name words in each drug text are renamed in a unified form, "X + serial number", where X may be any English word, such as "day" or "interaction", and the serial number is an ordinal written out in English, such as "one", "two" or "three". Each original target drug name word is replaced by its unified name, so that the replaced target drug name words take the form "drug" followed by a serial number. The numbering in one drug text does not affect any other drug text. The result is the normalized drug text set.
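The naming normalization described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `blind_drug_names`, the `SERIALS` list, the concatenated spelling ("drugone") and the shortened example sentence are all assumptions made for the example.

```python
import re

# English serial numbers used as suffixes (assumed list; extend as needed).
SERIALS = ["one", "two", "three", "four", "five"]

def blind_drug_names(text, drug_names, prefix="drug"):
    """Replace each listed drug name with prefix + English serial number."""
    result = text
    for index, name in enumerate(drug_names):
        placeholder = prefix + SERIALS[index]
        # Whole-word, case-insensitive replacement of the drug mention.
        pattern = r"\b" + re.escape(name) + r"\b"
        result = re.sub(pattern, placeholder, result, flags=re.IGNORECASE)
    return result

# Made-up sentence that reuses the drug names from the embodiment.
sentence = "Some quinolones, including ciprofloxacin, interact with cyclosporine."
blinded = blind_drug_names(sentence, ["quinolones", "ciprofloxacin", "cyclosporine"])
print(blinded)
```

Because the serial numbers restart for every text, the same placeholder can stand for different drugs in different texts, which is what keeps the texts from influencing one another.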
Step 2.2, unifying the length of each drug text in the normalized drug text set to obtain a drug text set with a fixed length;
The length of each drug text in the normalized drug text set is fixed to n. Texts shorter than n are padded; the padding may consist of all zeros or of random numbers. A drug text can then be represented as:
$$S = w_1 w_2 w_3 \ldots w_n$$
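A minimal sketch of the length-fixing step follows. The helper name `fix_length` and the `"<pad>"` placeholder are hypothetical; the placeholder stands for a position that is later filled with all zeros. Truncating texts longer than n is an assumption, since the text only describes padding.

```python
def fix_length(tokens, n, pad_token="<pad>"):
    """Truncate or right-pad a token list so that it has exactly n entries."""
    if len(tokens) >= n:
        return tokens[:n]          # assumed handling of over-long texts
    return tokens + [pad_token] * (n - len(tokens))

padded = fix_length(["drugone", "may", "increase", "drugtwo"], 6)
print(padded)
```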
step 2.3, carrying out vector mapping on each fixed-length medicine text in the fixed-length medicine text set to obtain a preprocessed medicine text set;
since the neural network cannot directly process the text in the natural language form as an input, the method for mapping the drug text into the text vector in the digital form comprises the following steps:
a. constructing a word vector table, wherein the word vector table is composed of words and corresponding digital word vectors;
Each word in the word vector table corresponds to a unique numeric word vector, and the table should include as many words as possible so that it covers more of the vocabulary.
To obtain more meaningful word vectors, this embodiment uses the GloVe (Global Vectors for Word Representation) word vector table provided by the NLP research group at Stanford University. This table contains 2196016 word vectors, each of dimension 300. If a word in the input text is not in the word vector table, every dimension of its word vector is initialized to 0.
b. And mapping each medicine text with fixed length in the medicine text set in a table look-up mode to obtain a preprocessed medicine text set.
Each word in a drug text of length n is mapped to a d-dimensional word vector by looking it up in the word vector table, so that an original drug text S of length n is mapped to an (n × d)-dimensional text vector:
Figure BDA0001989036140000081
A set of m original drug texts S of length n is thus mapped to m × (n × d) values; that is, the preprocessed drug text set contains m text vectors of dimension (n × d).
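The table lookup can be sketched in a few lines. The function name `map_text` and the toy three-dimensional table are assumptions for illustration; the embodiment uses 300-dimensional GloVe vectors. Words missing from the table receive an all-zero vector, as described above.

```python
def map_text(tokens, table, d):
    """Map a list of n tokens to an n x d matrix using a word vector table."""
    zero = [0.0] * d
    # Unknown words fall back to the all-zero vector.
    return [list(table.get(token, zero)) for token in tokens]

# Toy word vector table with d = 3 (illustrative values only).
toy_table = {
    "drugone": [0.1, 0.2, 0.3],
    "increases": [0.4, 0.5, 0.6],
}
matrix = map_text(["drugone", "increases", "unknownword"], toy_table, d=3)
print(matrix)
```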
Step 3, performing reverse order operation on each preprocessed drug text in the preprocessed drug text set to obtain a reverse order text set;
taking the preprocessed medicine text set as a positive sequence text set;
In this scheme, to make the extracted features more comprehensive, both the forward-order drug text and the reverse-order drug text are used to train the neural network and obtain the classification model.
Reversing a drug text means reversing the order of the entries in its text vector. For example, the one-dimensional vector [0.21 0.35 0.62 0.85 0.96] becomes, in reverse order: [0.96 0.85 0.62 0.35 0.21].
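The reverse order operation can be illustrated as below. The helper name `reverse_order` is hypothetical. For an (n × d) text matrix, the sketch assumes that the reversal is applied along the word axis, so that each word vector stays intact while the word order flips; the text above only gives the one-dimensional case.

```python
def reverse_order(sequence):
    """Return a new list holding the entries of `sequence` in reverse order."""
    return list(reversed(sequence))

# One-dimensional example from the embodiment.
reversed_vector = reverse_order([0.21, 0.35, 0.62, 0.85, 0.96])
print(reversed_vector)

# Text matrix with one row per word (assumed layout): rows are reversed, columns are not.
forward_text = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
reverse_text = reverse_order(forward_text)
print(reverse_text)
```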
Step 4, taking the positive sequence text set and the negative sequence text set as input, taking the drug relationship label set as output, training a neural network, and obtaining a drug relationship classification model;
the neural network comprises a forward-order text feature extraction layer and a reverse-order text feature extraction layer arranged in parallel, followed by a feature fusion layer and a classification layer;
the forward-order text feature extraction layer and the reverse-order text feature extraction layer have the same structure, and each comprises convolution blocks and a long short-term memory neural network block arranged in sequence.
In this embodiment, to improve the accuracy of drug relationship classification, the structure of the neural network is redesigned as shown in fig. 1. The convolution blocks extract the local features of the text, which are then sent to the LSTM to additionally extract the global features and sequential features of the text. Such a network, however, handles only forward-order text well; when it encounters text with backward modification, such as a postposed attributive, its processing ability remains weak. Therefore two identical feature extraction layers are used to process the forward order and the reverse order of the input text respectively, and the extracted forward-order and reverse-order features are combined to obtain the final text features, which are then output to the classification layer for classification.
In this embodiment, if fewer than 4 convolution blocks are used, the extracted local features are not accurate enough; if more than 4 are used, overfitting occurs and feature extraction fails. As a preferred embodiment, 4 convolution blocks are therefore provided.
Optionally, each convolution block includes a batch normalization sublayer, a convolution sublayer, an activation function sublayer, and a pooling sublayer arranged in sequence.
In this embodiment, as shown in fig. 2, both the forward-order drug text and the reverse-order drug text, on entering a convolution block, first pass through the batch normalization layer. Its function is to make the input data follow a normal distribution; samples that follow a normal distribution train much faster, and accuracy also improves.
In this embodiment, the normalized data are sent to the convolution layer for the convolution operation, with the number of convolution filters set to 128.
The data then pass through the activation function, which removes meaningless data after convolution; as a preferred embodiment, the activation function is the ReLU function.
The convolution and activation operations are repeated, and the resulting data are sent to the pooling layer, which uses max pooling. For example, with a pooling window of size 2×2, the window slides over the convolved and activated data; at each position the largest number in the window is selected as its representative, so that there are as many representatives as window positions, and these representatives then stand in for the original data. The benefit is that the amount of data is reduced without losing the text features, which speeds up network training.
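The max pooling operation described above can be sketched in plain Python. The function name `max_pool_2x2` and the sample values are illustrative assumptions, and the sketch assumes the window moves in non-overlapping steps of 2, which the text does not state.

```python
def max_pool_2x2(grid):
    """Non-overlapping 2x2 max pooling over a list-of-lists grid."""
    rows, cols = len(grid), len(grid[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            window = (grid[r][c], grid[r][c + 1],
                      grid[r + 1][c], grid[r + 1][c + 1])
            pooled_row.append(max(window))  # keep the largest value as representative
        pooled.append(pooled_row)
    return pooled

activations = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 1, 4],
]
pooled = max_pool_2x2(activations)
print(pooled)
```

The 4×4 input shrinks to 2×2, one representative per window position, which is the data reduction the paragraph refers to.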
After passing through 4 identical convolution blocks, the forward-order drug text and the reverse-order drug text enter the long short-term memory neural network block, which extracts the global features and sequential features present in the drug relationship text.
In this embodiment, the number of nodes in the long-short term memory neural network block is set to 64.
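One feature extraction channel (4 convolution blocks followed by an LSTM with 64 nodes) might be sketched in PyTorch as follows. Only the block order, the 128 filters, the 4 blocks and the 64 LSTM nodes come from the text. The class names, the use of one-dimensional convolution and pooling along the word axis, the kernel size of 3, the padding, the pooling size of 2 and the input length of 64 words are all assumptions made for this example.

```python
import torch
from torch import nn

class ConvBlock(nn.Module):
    """Batch normalization -> convolution -> ReLU -> max pooling."""
    def __init__(self, in_channels, filters=128, kernel_size=3):
        super().__init__()
        self.norm = nn.BatchNorm1d(in_channels)
        self.conv = nn.Conv1d(in_channels, filters, kernel_size,
                              padding=kernel_size // 2)  # assumed "same" padding
        self.act = nn.ReLU()
        self.pool = nn.MaxPool1d(2)                      # assumed pooling size

    def forward(self, x):  # x: (batch, channels, length)
        return self.pool(self.act(self.conv(self.norm(x))))

class Channel(nn.Module):
    """Four convolution blocks followed by an LSTM with 64 hidden units."""
    def __init__(self, embed_dim=300, filters=128, hidden=64, blocks=4):
        super().__init__()
        layers = [ConvBlock(embed_dim, filters)]
        layers += [ConvBlock(filters, filters) for _ in range(blocks - 1)]
        self.convs = nn.Sequential(*layers)
        self.lstm = nn.LSTM(filters, hidden, batch_first=True)

    def forward(self, x):  # x: (batch, length, embed_dim)
        local = self.convs(x.transpose(1, 2))        # (batch, filters, length / 16)
        _, (last_hidden, _) = self.lstm(local.transpose(1, 2))
        return last_hidden[-1]                       # (batch, hidden)

channel = Channel().eval()
with torch.no_grad():
    features = channel(torch.randn(2, 64, 300))      # 2 texts, 64 words, 300-d vectors
print(features.shape)
```

In the dual-channel design, two such channels with the same structure would be instantiated, one fed the forward-order text and one the reverse-order text.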
Optionally, the feature fusion layer comprises a fully-connected layer.
In this embodiment, after the forward-order drug text and the reverse-order drug text have passed through the CNN-LSTM network, the forward-order text features and the reverse-order text features are obtained, and both are sent to the fully connected layer at the same time. For example, if there are 100 forward-order text features and 100 reverse-order text features, a fully connected layer with 200 nodes in the first layer and 100 nodes in the second layer is constructed. The forward-order text features are sent to the first 100 nodes of the first layer and the reverse-order text features to the last 100 nodes of the first layer, and in this way the 200 features are fused together.
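The 200-to-100 fusion example above can be sketched without any framework. The helper name `dense`, the random feature values and the random placeholder weights are assumptions; in the trained model the weights would be learned.

```python
import random

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum of all inputs per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

random.seed(0)
forward_features = [random.random() for _ in range(100)]
reverse_features = [random.random() for _ in range(100)]

# First layer: 200 nodes, forward-order features first, reverse-order features last.
fused_input = forward_features + reverse_features

# Second layer: 100 nodes, each connected to all 200 inputs (placeholder weights).
weights = [[random.uniform(-0.1, 0.1) for _ in range(200)] for _ in range(100)]
biases = [0.0] * 100
fused_features = dense(fused_input, weights, biases)
print(len(fused_input), len(fused_features))
```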
Optionally, the classification layer comprises a Softmax function layer.
In this embodiment, the fully connected layer and the Softmax function layer form the last part of the drug relationship classification algorithm. They output the drug relationship label as a numeric vector with one entry per class, from which the final drug relationship classification result is determined. Each output node of the fully connected layer and the Softmax function layer represents one drug relationship class; the label finally output by the classifier gives the probability that a given drug entity pair belongs to each class, with each probability value in [0,1]. For example, suppose there are 2 drug relationships, positive (a relationship exists) and negative (no relationship exists), so that the Softmax function layer has 2 output nodes. If the numeric vector output by the Softmax function layer is p[positive, negative] = [0.1, 0.9], then the probability of positive is 0.1 and the probability of negative is 0.9, and the decision is made on the basis of these probability values. In this example there are 5 drug relationships: advice, effect, mechanism, int (forward) and false (irrelevant).
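The Softmax function referred to above turns the scores of the output nodes into probabilities that sum to 1. A minimal sketch follows; the input scores are chosen here so that the result reproduces the [0.1, 0.9] example and are not taken from the patent.

```python
import math

def softmax(scores):
    """Convert raw output-node scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Two output nodes: [positive, negative]; scores picked so that p is about [0.1, 0.9].
probabilities = softmax([0.0, math.log(9.0)])
print(probabilities)
```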
The hierarchical convolutional recurrent neural network is trained with the above input and output to obtain the drug relationship classification hierarchical convolutional recurrent neural network, where the drug relationship texts and all drug relationship labels are in numeric vector form. The network is trained N times, and the network with the best performance after the N training runs is taken as the drug relationship classification hierarchical convolutional recurrent neural network, where N ≥ 1.
The training set of the classification hierarchical convolutional recurrent neural network has two parts: the preprocessed drug text set that is input to the network, and the drug relationship labels between the target drug name words in the original drug text corresponding to each preprocessed drug text. Together, the labels form the drug relationship label set that serves as the target output of the network. The test set likewise has two parts. The difference is that during testing only the preprocessed drug text set is input to the trained network, which produces a set of predicted drug classification results from the input drug text data and the trained model parameters. These predictions are then compared with the true drug relationship labels, and the performance of the network is evaluated from the comparison.
In this example, the DDIExtraction 2013 drug relationship data set is used as the drug relationship text to train and test the classification hierarchical convolutional recurrent neural network, with 80% of the whole data set used as the training set and 20% as the test set; that is, the training set consists of 27792 drug relationship text samples and the test set of 6409 drug relationship text samples. The network is then trained 10 times on the training set, and the model that performs best among the 10 training runs is selected as the final drug relationship hierarchical convolutional recurrent neural network.
Example two
A drug relationship classification method based on a dual-channel CNN-LSTM network, in which a drug text to be classified is processed according to the following steps:
step A, preprocessing the drug text to be classified using the method of step 2 in embodiment 1 to obtain a preprocessed drug text;
step B, inputting the preprocessed drug text into the drug relationship classification model of embodiment 1 to obtain a classification result.
Once the final drug relationship hierarchical convolutional recurrent neural network has been trained, the model can predict the drug relationship in any drug relationship text. A drug text whose drug relationship is unknown is input to the network, and the drug relationship with the highest probability in the numeric vector output by the network is taken as the classification result for that text.
In this embodiment, the drug text to be classified is "Some quinolones, including ciprofloxacin, have been associated with transient elevations in serum creatinine in patients receiving cyclosporine concomitantly", where the first target drug name is quinolones and the second target drug name is cyclosporine. Drug relationship classification is performed by the trained drug relationship hierarchical convolutional recurrent neural network, and the output drug relationship label in numeric vector form is:
P[mechanism,advice,effect,int,false]=[0.02,0.09,0.1,0.67,0.12]
That is, for the two target drugs quinolones and cyclosporine, the probability of a mechanism relationship is 2%, of an advice relationship 9%, of an effect relationship 10%, of an int relationship 67%, and of a false relationship 12%. The int relationship has the highest probability, 67%, so the drug relationship hierarchical convolutional recurrent neural network classifies the relationship between the two target drugs quinolones and cyclosporine as int (forward).
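Selecting the relationship with the highest probability from the output vector above can be sketched as follows. The function name `predict_label` is hypothetical; the label order and the probability values are those of the vector P given above.

```python
LABELS = ["mechanism", "advice", "effect", "int", "false"]

def predict_label(probabilities, labels=LABELS):
    """Return the label with the highest probability, together with that probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]

label, probability = predict_label([0.02, 0.09, 0.1, 0.67, 0.12])
print(label, probability)
```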
Compared with the drug relation classification algorithm in the prior art, the performance of the drug relation classification method based on the two-channel CNN-LSTM network provided by the scheme is compared with that of the drug classification algorithm in the prior art, as the accuracy, the recall rate and the F value are higher when the performance of one drug relation classification method is evaluated, the better the performance of a drug relation classification model is, and as can be seen from the table 1, the drug relation hierarchical convolutional recurrent neural network provided by the invention is obviously superior to other methods in three indexes of the accuracy, the recall rate and the F value, which proves that the drug relation classification method based on the hierarchical bidirectional convolutional recurrent neural network provided by the invention has the optimal classification performance in the aspect of the drug relation classification problem.
Table 1. Comparison of the drug relationship classification method provided by the present invention with other drug relationship classification methods
[Table 1 is available only as an image in the source: Figure BDA0001989036140000141]
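For readers unfamiliar with the three indexes used in the comparison, the following Python sketch shows how they are commonly computed from prediction counts. It rests on two assumptions that the text does not confirm: that the "accuracy" reported next to recall is precision, and that the F value is the F1 score (the harmonic mean of precision and recall). The counts in the example are hypothetical and are not taken from Table 1.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true/false positive counts.

    tp: correctly predicted relations
    fp: predicted relations that are wrong
    fn: true relations that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only.
p, r, f = precision_recall_f1(tp=70, fp=30, fn=20)
print(round(p, 4), round(r, 4), round(f, 4))
```

Under this definition the F value rewards methods that balance the two other indexes, which is why a method must lead on both precision and recall to lead clearly on F.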

Claims (5)

1. A method for constructing a drug relationship classification model based on a dual-channel CNN-LSTM network, characterized by comprising the following steps:
step 1, obtaining an original drug text set;
labeling the drug relationship in each original drug text in the original drug text set to obtain a drug relationship label set;
step 2, preprocessing the original drug text set to obtain a preprocessed drug text set;
the preprocessing comprises text normalization, text length fixing and text vector mapping;
step 3, performing a reverse order operation on each preprocessed drug text in the preprocessed drug text set to obtain a reverse order text set;
taking the preprocessed drug text set as a forward order text set;
step 4, taking the forward order text set and the reverse order text set as input and the drug relationship label set as output, training a neural network to obtain the drug relationship classification model;
the neural network comprises a forward order text feature extraction layer and a reverse order text feature extraction layer arranged in parallel, followed in sequence by a feature fusion layer and a classification layer;
the forward order text feature extraction layer and the reverse order text feature extraction layer each comprise convolution blocks and a long short-term memory (LSTM) neural network block arranged in sequence;
the number of the convolution blocks is 4;
each convolution block comprises a batch normalization sublayer, a convolution sublayer, an activation function sublayer and a pooling sublayer which are arranged in sequence.
2. The method for constructing a drug relationship classification model based on a dual-channel CNN-LSTM network as claimed in claim 1, wherein the activation function in the activation function sub-layer is a ReLU function.
3. The method for constructing a drug relationship classification model based on a dual-channel CNN-LSTM network as claimed in claim 1, wherein the feature fusion layer comprises a fully connected layer.
4. The method for constructing the drug relationship classification model based on the dual-channel CNN-LSTM network as claimed in claim 1, wherein the classification layer comprises a Softmax function layer.
5. A drug relationship classification method based on a dual-channel CNN-LSTM network, characterized in that a drug text to be classified is processed according to the following steps:
step A, preprocessing the drug text to be classified by the method of step 2 in claim 1 to obtain a preprocessed drug text;
step B, inputting the preprocessed drug text into the drug relationship classification model of any one of claims 1 to 4 to obtain a classification result.
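As an informal illustration of steps 2 and 3 of claim 1 (text normalization, text length fixing, text vector mapping, and the reverse order operation), the following Python sketch prepares the two inputs of the dual-channel network. It is a minimal sketch under stated assumptions, not the patented implementation: the lowercase word tokenization, the `<pad>` and `<unk>` tokens, right-padding, and the mapping of words to integer ids (a stand-in for a word-vector lookup) are all assumptions, and every function name is hypothetical.

```python
import re

PAD = "<pad>"  # assumed padding token
UNK = "<unk>"  # assumed token for out-of-vocabulary words


def normalize(text):
    """Text normalization (assumed form): lowercase and split into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fix_length(tokens, max_len):
    """Text length fixing: truncate or right-pad to exactly max_len tokens."""
    return tokens[:max_len] + [PAD] * max(0, max_len - len(tokens))


def build_vocab(token_lists):
    """Assign an integer id to every token seen in the corpus."""
    vocab = {PAD: 0, UNK: 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def to_ids(tokens, vocab):
    """Text vector mapping (stand-in): map tokens to ids for an embedding lookup."""
    return [vocab.get(tok, vocab[UNK]) for tok in tokens]


def preprocess(texts, max_len):
    """Return the forward order set, the reverse order set, and the vocabulary."""
    token_lists = [fix_length(normalize(t), max_len) for t in texts]
    vocab = build_vocab(token_lists)
    forward = [to_ids(toks, vocab) for toks in token_lists]
    # Step 3: the reverse order set is each preprocessed text read backwards.
    reverse = [list(reversed(ids)) for ids in forward]
    return forward, reverse, vocab


forward, reverse, vocab = preprocess(["Quinolones interact with cyclosporine."], 6)
print(forward)  # [[2, 3, 4, 5, 0, 0]]
print(reverse)  # [[0, 0, 5, 4, 3, 2]]
```

In this sketch the reversal is applied after length fixing, following the wording of step 3, which reverses the preprocessed text; as a result, padding appears at the start of the reverse order sequence. Each of the two sequences would then feed its own feature extraction channel.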
CN201910174269.4A 2019-03-08 2019-03-08 Drug relationship classification model construction and classification method based on dual-channel CNN-LSTM network Active CN110020671B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910174269.4A CN110020671B (en) 2019-03-08 2019-03-08 Drug relationship classification model construction and classification method based on dual-channel CNN-LSTM network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910174269.4A CN110020671B (en) 2019-03-08 2019-03-08 Drug relationship classification model construction and classification method based on dual-channel CNN-LSTM network

Publications (2)

Publication Number Publication Date
CN110020671A CN110020671A (en) 2019-07-16
CN110020671B true CN110020671B (en) 2023-04-18

Family

ID=67189365

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910174269.4A Active CN110020671B (en) 2019-03-08 2019-03-08 Drug relationship classification model construction and classification method based on dual-channel CNN-LSTM network

Country Status (1)

Country Link
CN (1) CN110020671B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111243682A (en) * 2020-01-10 2020-06-05 京东方科技集团股份有限公司 Method, device, medium and apparatus for predicting toxicity of drug
CN111444961B (en) * 2020-03-26 2023-08-18 国家计算机网络与信息安全管理中心黑龙江分中心 Method for judging attribution of Internet website through clustering algorithm
CN111898364B (en) * 2020-07-30 2023-09-26 平安科技(深圳)有限公司 Neural network relation extraction method, computer equipment and readable storage medium
CN111933225B (en) * 2020-09-27 2021-01-05 平安科技(深圳)有限公司 Drug classification method and device, terminal equipment and storage medium
CN112860816A (en) * 2021-03-01 2021-05-28 三维通信股份有限公司 Construction method and detection method of interaction relation detection model of drug entity pair
CN113806531B (en) * 2021-08-26 2024-02-27 西北大学 Drug relationship classification model construction method, drug relationship classification method and system
CN114678141A (en) * 2022-03-17 2022-06-28 中国科学院深圳理工大学(筹) Method, apparatus and medium for predicting drug-pair interaction relationship
CN115376668B (en) * 2022-08-30 2024-03-08 温州城市智慧健康有限公司 Big data business analysis method and system applied to intelligent medical treatment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108363774A (en) * 2018-02-09 2018-08-03 西北大学 A kind of drug relationship sorting technique based on multilayer convolutional neural networks

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170308790A1 (en) * 2016-04-21 2017-10-26 International Business Machines Corporation Text classification by ranking with convolutional neural networks
CN108763216A (en) * 2018-06-01 2018-11-06 河南理工大学 A kind of text emotion analysis method based on Chinese data collection
CN109299264A (en) * 2018-10-12 2019-02-01 深圳市牛鼎丰科技有限公司 File classification method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN110020671A (en) 2019-07-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant