CN114841261A - Incremental width and deep learning drug response prediction method, medium, and apparatus - Google Patents
- Publication number
- CN114841261A (application number CN202210464986.2A)
- Authority
- CN
- China
- Prior art keywords
- drug
- width
- learning
- sequence
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/251—Fusion techniques of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/10—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention provides a drug response prediction method, medium and apparatus based on incremental width and deep learning. The method comprises the following steps: performing text encoding and position encoding on the drug sequence to construct a drug information code; inputting the drug information code into a Transformer encoder to mine the structural features of the drug, inputting gene expression data into a multilayer perceptron to learn a feature representation of the genes, and splicing the drug features and gene features together to form drug-gene feature pairs; and inputting the feature pairs into a width learning system (broad learning system) to obtain predicted drug sensitivity regression values. The method solves the problem of poor drug representation; the width learning system fuses the drug and gene expression features, improving the accuracy of drug sensitivity prediction results; and the network weights are updated through an incremental learning algorithm, improving model performance without retraining the whole model.
Description
Technical Field
The invention relates to the technical field of drug response prediction, in particular to a drug response prediction method, medium and apparatus based on incremental width and deep learning.
Background
Cancer is a major disease threatening human health and life, and realizing personalized treatment for cancer patients is one of the most prominent research fields of precision medicine. In recent years, with the rapid development of pharmacogenomics and computational models, drug response prediction technology has gradually brought more convenience to personalized treatment research. Drug response prediction aims to extract and integrate gene expression information of drugs and cell lines to predict the sensitivity of cell lines to drugs. The half-maximal inhibitory concentration (IC50) reflects the drug response sensitivity of cell lines and is a commonly used drug response prediction index. Most traditional drug response prediction methods are based on machine learning, such as support vector machines, Bayesian multi-task multi-kernel learning, random forests and simple neural network models. These methods rely on prior knowledge and feature engineering to obtain drug and gene features, which are then combined into new features to predict the drug sensitivity of cell lines. In the face of complex, high-dimensional and noisy data, the prediction performance and generalization of these methods offer no advantage.
With the development of artificial intelligence, deep learning has made remarkable breakthroughs in problems such as drug response prediction and drug development. Deep learning methods for predicting drug response can be broadly classified into two types: unsupervised or semi-supervised methods, and end-to-end supervised methods. Unsupervised or semi-supervised drug response prediction models generally use an autoencoder to perform feature dimensionality reduction on data such as drug text sequences and the methylation, copy number variation and transcriptome data of cell lines, and the learned features are used to train a classifier to predict the sensitivity of cell lines to drugs. End-to-end supervised methods exploit the modularity of deep learning networks, adopting models such as convolutional neural networks, deep encoders, ensemble deep neural networks and graph neural networks to extract features from different types of data, and feed the learned drug and gene representations into a prediction classifier for training. Compared with traditional machine learning methods, deep learning methods improve prediction performance and generalization, but still cannot meet the requirements of clinical trials. Most current deep-learning-based drug response prediction algorithms have considerable limitations.
Firstly, feature engineering methods that adopt only chemical descriptors or molecular fingerprints give little consideration to the representation of drug structure; they cannot distinguish different atoms in a drug molecule or the different interaction information among their associated chemical bonds, and easily lose hidden drug structure information. Secondly, the fusion of drug features and gene features is done in a single fixed way, and the performance improvement of classifiers built from multilayer neural network models is limited. Thirdly, when new data arrive, existing systems need to retrain the whole model, greatly increasing time cost. Fourthly, in a real clinical experimental environment, all training samples cannot be acquired at once owing to privacy and property protection and differing data acquisition periods; current drug response prediction models also lack the ability to incrementally learn multiple batches of data.
In conclusion, current deep-learning-based drug response prediction methods leave room for improvement.
Disclosure of Invention
To overcome the disadvantages and shortcomings of the prior art, an object of the present invention is to provide a drug response prediction method, medium and apparatus based on incremental width and deep learning. The method learns the structural features of the SMILES sequence of a drug through a Transformer encoder, solving the problem that different atoms in drug molecules and the different interaction information among their associated chemical bonds cannot be distinguished; a width learning system is adopted to fuse the drug and gene expression features, improving the accuracy of drug sensitivity prediction results; and the network weights are updated through an incremental learning algorithm, improving model performance without retraining the whole model.
To achieve this purpose, the invention is realized by the following technical scheme: a drug response prediction method of incremental width and deep learning, comprising the following steps:
S1, performing text encoding and position encoding on the SMILES sequence of the drug to obtain a text code T_i and a position code P_i, thereby constructing a drug information code E_i; where i = 1, 2, ..., L, and L denotes the maximum drug string sequence length;
S2, inputting the drug information code E_i into an IBDT model; the IBDT model comprises a Transformer encoder, a multilayer perceptron and a width learning system;
the drug information code E_i is input into the Transformer encoder to mine the drug feature D_F; simultaneously, the gene expression data G_o are input into the multilayer perceptron to learn the gene feature G_F; the drug feature D_F and the gene feature G_F are spliced together to form the drug-gene feature pair X_DG; the feature pairs X_DG are input into the width learning system to obtain predicted drug sensitivity regression values;
the IBDT model is characterized in that parameters of the IBDT model are fixed after initial training, feature nodes and enhanced nodes are added to the width learning system by using an added sample subsequently, and the output weight W of the learning system is dynamically updated through an incremental learning algorithm DG The IBDT model of (1).
Preferably, in step S1, the text encoding of the SMILES sequence of the drug means: decomposing the SMILES sequence of the drug into single atomic symbols and small-molecule sequences according to chemical prior knowledge, the atomic symbols and small-molecule sequences being expressed in word-vector form;

the position encoding of the SMILES sequence of the drug means: encoding the position information of the drug with a dictionary lookup matrix;

the text code T_i and the position code P_i are added to obtain the drug information code E_i:

E_i = T_i + P_i
Preferably, the text encoding of the SMILES sequence of the drug comprises:

decomposing the SMILES sequence of the drug into single atomic symbols and small-molecule sequences according to chemical prior knowledge; each single atomic symbol or small-molecule sequence is treated as a string word, and a vocabulary set D containing strings of different granularities is constructed; the Torchtext tool library is then used to count and label a corpus containing the SMILES sequences of all drugs, and each SMILES sequence is expressed as a sequence string S = {S_1, ..., S_L}, represented by one-hot vectors, where S_i denotes a word in the vocabulary set D; the text code of each drug is expressed as:

T_i = W_T · s_i

where W_T denotes a trainable word-vector matrix and s_i denotes the one-hot vector of the i-th string of the sequence string S;

the position encoding of the SMILES sequence of the drug comprises:

P_i = W_P · p_i

where W_P denotes a weight matrix and p_i denotes the one-hot vector of the i-th position of the sequence string S.
Preferably, the drug information code E_i passes through the multi-head self-attention layer of the Transformer encoder, where it is mapped by linear transformations into a Query matrix, a Key matrix and a Value matrix, denoted Q, K and V respectively:

Q = E·W_q, K = E·W_k, V = E·W_v

where W_q, W_k, W_v denote learnable weight matrices; the attention relationships between the subsequences S_i are obtained with the attention calculation formula:

Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V

where d_k denotes the dimension of the drug information code E_i; the output A of the multi-head self-attention layer is then fed into a fully-connected feedforward neural network:

F = max(0, A·W_1 + β_1)·W_2 + β_2

where F denotes the output of the feedforward neural network; W_1, W_2 denote learnable weight matrices; β_1, β_2 denote biases; finally, the output F is input into a multilayer perceptron to obtain the drug feature D_F:

D_F = σ_3(σ_2(σ_1(F·W_1^D + b_1^D)·W_2^D + b_2^D)·W_3^D + b_3^D)

where σ_1, σ_2, σ_3 denote nonlinear activation functions; W_1^D, W_2^D, W_3^D denote learnable weight matrices; b_1^D, b_2^D, b_3^D denote the biases.
The gene expression data G_o are input into a multilayer perceptron to obtain the gene feature G_F; the drug feature D_F and the gene feature G_F are spliced and integrated into the drug-gene feature pair X_DG = [D_F | G_F]. The raw gene expression data are taken from the Cancer Cell Line Encyclopedia (CCLE) data set.
Preferably, inputting the gene expression data G_o into the multilayer perceptron means: the multilayer perceptron comprises three hidden layers and three activation layers; the gene feature G_F is:

G_F = σ_6(σ_5(σ_4(G_o·W_1^G + b_1^G)·W_2^G + b_2^G)·W_3^G + b_3^G)

where σ_4, σ_5, σ_6 denote nonlinear activation functions; W_1^G, W_2^G, W_3^G denote learnable weight matrices; b_1^G, b_2^G, b_3^G denote the biases.
Preferably, during initial training of the IBDT model, the feature pairs X_DG formed by the samples are input into the width learning system and mapped into n groups of feature nodes Z^n = [Z_1, ..., Z_n] and m groups of enhancement nodes H^m = [H_1, ..., H_m].

All the feature nodes and enhancement nodes are combined to obtain the input matrix A_DG:

A_DG = [Z^n | H^m]

The weight W_DG between the feature pairs X_DG and the output Y, i.e. the weight between the feature pairs and the true drug sensitivity, is computed by a pseudo-inverse and ridge-regression learning algorithm:

W_DG = A_DG^+ · Y

A_DG^+ = (λI + A_DG^T·A_DG)^(-1)·A_DG^T

where A_DG^+ denotes the pseudo-inverse of the input matrix A_DG; λ denotes a non-negative number tending to 0 in the ridge regression; I denotes an identity matrix.
Preferably, after the initial training of the IBDT model, adding feature nodes and enhancement nodes to the width learning system with the newly added samples and dynamically updating the output weight W_DG of the learning system through the incremental learning algorithm comprises:

for a newly added sample X_a, first generating its feature pair X_a^DG with the fixed IBDT model parameters; then, in the width learning system alone, mapping the feature pair X_a^DG of the newly added sample into new feature nodes and enhancement nodes, and combining all feature nodes and enhancement nodes to obtain the input matrix A_a^DG corresponding to the newly added sample; the input matrix of the model is updated as:

A_DG ← [A_DG; A_a^DG]

the pseudo-inverse between the newly added feature pairs and the newly added sample outputs is calculated, the weight information of the newly added samples is obtained with the incremental learning algorithm, and this weight information is merged into the output weight W_DG of the width learning system, dynamically updating the model output weights:

W_DG ← W_DG + B_a·(Y_a − A_a^DG·W_DG)

where Y_a denotes the label values of the newly added samples and B_a denotes the block of the updated pseudo-inverse [A_DG; A_a^DG]^+ corresponding to the newly added rows.
A storage medium storing a computer program that, when executed by a processor, causes the processor to perform the above drug response prediction method of incremental width and deep learning.

A computing device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the above drug response prediction method of incremental width and deep learning.
Compared with the prior art, the invention has the following advantages and beneficial effects:
according to the method, when a newly added sample is faced, the network weight can be updated through the incremental learning algorithm without retraining the whole model, and the performance of the model is improved. The model learns the structural characteristics of the SMILES sequence of the drug through a Transformer encoder, and solves the problem that different atoms in drug molecules and different action information among related chemical bonds of the atoms cannot be distinguished; a width learning system is adopted to fuse the drug expression and gene expression characteristics, and the accuracy of the drug sensitivity prediction result is improved.
Drawings
FIG. 1 is a schematic flow diagram of the drug response prediction method of incremental width and deep learning according to the present invention;

FIG. 2 is an architecture diagram of the IBDT model in the drug response prediction method of incremental width and deep learning according to the present invention;

FIG. 3 is a schematic flow diagram of incremental learning in the drug response prediction method of incremental width and deep learning according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Example one
The invention provides a drug response prediction method based on an incremental width learning system and a Transformer model. First, text encoding and position encoding are performed on the SMILES sequence of a drug to construct a drug information code; the drug information code is input into a Transformer encoder to mine the structural features of the drug, while gene expression data are input into a multilayer perceptron to learn a feature representation of the genes; the drug features and gene features are spliced together to form drug-gene feature pairs, which are input into a width learning system for training to obtain the final model; the trained model is then used for drug response prediction. The method learns the structured features of the drug SMILES sequence through the Transformer encoder, solving the problem that different atoms in drug molecules and the different interaction information among their associated chemical bonds cannot be distinguished; the width learning system fuses the drug and gene expression features, improving the accuracy of drug sensitivity prediction results. For newly added samples, the model does not need to be retrained as a whole; the network weights are updated through an incremental learning algorithm, improving model performance.
The flow of the drug response prediction method of incremental width and deep learning in this embodiment is shown in FIG. 1 and comprises the following steps:
S1, performing text encoding and position encoding on the SMILES sequence of the drug to obtain a text code T_i and a position code P_i, thereby constructing a drug information code E_i; where i = 1, 2, ..., L, and L denotes the maximum drug string sequence length.

Text encoding of the SMILES sequence of a drug means: decomposing the SMILES sequence of the drug into single atomic symbols and small-molecule sequences according to chemical prior knowledge, the atomic symbols and small-molecule sequences being expressed in word-vector form; in this form, the symbols and sequences represent the SMILES sequence, i.e. the text encoding of the drug.
the text encoding of the SMILES sequence of the drug comprises:
decomposing the SMILES sequence of the drug into single atomic symbols and small-molecule sequences according to chemical prior knowledge; each single atomic symbol or small-molecule sequence is treated as a string word, and a vocabulary set D containing strings of different granularities is constructed; the Torchtext tool library is then used to count and label a corpus containing all drug SMILES sequences, and each SMILES sequence is expressed as a sequence string S = {S_1, ..., S_L}, represented by one-hot vectors, where S_i denotes a word in the vocabulary set D; the text code of each drug is expressed as:

T_i = W_T · s_i

where W_T denotes a trainable word-vector matrix and s_i denotes the one-hot vector of the i-th string of the sequence string S;
for capturing the position information of the medicines, the invention generates position codes for each medicine, and codes the position information of the medicines by utilizing a dictionary lookup matrix:
wherein, W P A matrix of weights is represented by a matrix of weights,a one-hot vector representing the ith position of the sequence string S.
The text code T_i and the position code P_i are added to obtain the drug information code E_i:

E_i = T_i + P_i.
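To make step S1 concrete, here is a minimal NumPy sketch of the text-plus-position encoding E_i = T_i + P_i. The regex tokenizer, toy vocabulary, sequence length and random embedding matrices are illustrative assumptions standing in for the chemistry-aware decomposition, the Torchtext-built vocabulary set D, and the trainable matrices W_T and W_P described above.

```python
import re
import numpy as np

def tokenize_smiles(smiles):
    # Simplified stand-in for the chemistry-aware decomposition:
    # bracket atoms and two-letter elements (Br, Cl) become one token,
    # every other character is a single-symbol token.
    return re.findall(r"(\[[^\]]+\]|Br|Cl|.)", smiles)

rng = np.random.default_rng(0)
vocab = {t: i for i, t in enumerate(["<pad>", "C", "O", "N", "=", "(", ")", "Cl"])}
L, d_model = 8, 16                             # toy max length and embedding size
W_T = rng.normal(size=(len(vocab), d_model))   # word-vector matrix (trainable in practice)
W_P = rng.normal(size=(L, d_model))            # dictionary lookup matrix for positions

def encode_drug(smiles):
    ids = [vocab.get(t, 0) for t in tokenize_smiles(smiles)[:L]]
    ids += [0] * (L - len(ids))                # pad to the maximum length L
    T = W_T[ids]                               # text code T_i
    P = W_P[np.arange(L)]                      # position code P_i
    return T + P                               # drug information code E_i

E = encode_drug("C(=O)Cl")
print(E.shape)  # (8, 16)
```

In a full implementation W_T and W_P would be learned jointly with the Transformer encoder rather than drawn at random.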
S2, inputting the drug information code E_i into the IBDT model; the IBDT model comprises a Transformer encoder, a multilayer perceptron and a width learning system, as shown in FIG. 2. The drug information code E_i is input into the Transformer encoder to mine the drug feature D_F; simultaneously, the gene expression data G_o are input into the multilayer perceptron to learn the gene feature G_F; the drug feature D_F and the gene feature G_F are spliced together to form the drug-gene feature pair X_DG. The Transformer encoder module can learn the different interaction information between the different atoms in the drug SMILES sequence and their associated chemical bonds, generating a drug feature representation with structured information. The multilayer perceptron module learns a feature representation of the genes. The width learning system module integrates the drug features and gene features, reduces training time cost, and improves the prediction performance of the model.
Specifically, the drug information code E_i passes through the multi-head self-attention layer of the Transformer encoder, where it is mapped by linear transformations into a Query matrix, a Key matrix and a Value matrix, denoted Q, K and V respectively:

Q = E·W_q, K = E·W_k, V = E·W_v

where W_q, W_k, W_v denote learnable weight matrices; the attention relationships between the subsequences S_i are obtained with the attention calculation formula:

Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V

where d_k denotes the dimension of the drug information code E_i; the output A of the multi-head self-attention layer is then fed into a fully-connected feedforward neural network:

F = max(0, A·W_1 + β_1)·W_2 + β_2

where F denotes the output of the feedforward neural network; W_1, W_2 denote learnable weight matrices; β_1, β_2 denote biases; finally, the output F is input into a multilayer perceptron to obtain the output of the Transformer encoder, i.e. the drug feature D_F:

D_F = σ_3(σ_2(σ_1(F·W_1^D + b_1^D)·W_2^D + b_2^D)·W_3^D + b_3^D)

where σ_1, σ_2, σ_3 denote nonlinear activation functions; W_1^D, W_2^D, W_3^D denote learnable weight matrices; b_1^D, b_2^D, b_3^D denote the biases.
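The self-attention mapping described above can be sketched as a single head in NumPy; the random weights stand in for the learnable W_q, W_k, W_v, and multi-head splitting, residual connections and layer normalization are omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(E, W_q, W_k, W_v):
    # Map the drug information code E (L x d) to Query/Key/Value and
    # apply scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    Q, K, V = E @ W_q, E @ W_k, E @ W_v
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))  # attention between positions
    return weights @ V

rng = np.random.default_rng(1)
L, d = 8, 16
E = rng.normal(size=(L, d))                    # toy drug information code
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
A = self_attention(E, W_q, W_k, W_v)
print(A.shape)  # (8, 16)
```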
The gene expression data G_o are input into a multilayer perceptron to obtain the gene feature G_F; the multilayer perceptron comprises three hidden layers and three activation layers; the gene feature G_F is:

G_F = σ_6(σ_5(σ_4(G_o·W_1^G + b_1^G)·W_2^G + b_2^G)·W_3^G + b_3^G)

where σ_4, σ_5, σ_6 denote nonlinear activation functions; W_1^G, W_2^G, W_3^G denote learnable weight matrices; b_1^G, b_2^G, b_3^G denote the biases. The raw gene expression data are taken from the Cancer Cell Line Encyclopedia (CCLE) data set.
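The three-layer gene perceptron can be sketched as below; the ReLU activations and the layer sizes are illustrative assumptions, since the source does not fix the activation functions σ_4 to σ_6 or the hidden dimensions.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def gene_mlp(G_o, params):
    # Three hidden layers, each followed by a nonlinear activation,
    # producing the gene feature G_F from expression data G_o.
    h = G_o
    for W, b in params:
        h = relu(h @ W + b)
    return h

rng = np.random.default_rng(2)
dims = [1000, 256, 64, 32]          # toy sizes: 1000 genes -> 32-d feature
params = [(rng.normal(scale=0.01, size=(m, n)), np.zeros(n))
          for m, n in zip(dims, dims[1:])]
G_o = rng.normal(size=(4, 1000))    # 4 cell lines (e.g. CCLE expression rows)
G_F = gene_mlp(G_o, params)
print(G_F.shape)  # (4, 32)
```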
The drug feature D_F and the gene feature G_F are spliced and integrated into the drug-gene feature pair X_DG = [D_F | G_F].
The feature pairs X_DG are input into the width learning system to obtain the predicted drug sensitivity regression values.
After initial training, the parameters of the IBDT model are fixed; subsequently, feature nodes and enhancement nodes are added to the width learning system with the newly added samples, and the output weight W_DG of the learning system is dynamically updated through an incremental learning algorithm.
When the IBDT model is initially trained, the feature pairs X_DG formed by the samples are input into the width learning system and mapped into n groups of feature nodes Z^n = [Z_1, ..., Z_n] and m groups of enhancement nodes H^m = [H_1, ..., H_m]. All the feature nodes and enhancement nodes are combined to obtain the input matrix A_DG:

A_DG = [Z^n | H^m]

The weight W_DG between the feature pairs X_DG and the output Y, i.e. the weight between the feature pairs and the true drug sensitivity, is computed by a pseudo-inverse and ridge-regression learning algorithm:

W_DG = A_DG^+ · Y

A_DG^+ = (λI + A_DG^T·A_DG)^(-1)·A_DG^T

where A_DG^+ denotes the pseudo-inverse of the input matrix A_DG; λ denotes a non-negative number tending to 0 in the ridge regression; I denotes an identity matrix.
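The initial width-learning training step can be sketched as follows. The tanh node mappings, node counts and regularization constant are illustrative assumptions; the final solve mirrors the ridge-regression solution W_DG = A_DG^+ · Y.

```python
import numpy as np

def bls_train(X_DG, Y, n_feat=3, n_enh=2, k=10, lam=1e-6, seed=0):
    # Map feature pairs into n_feat groups of feature nodes and n_enh
    # groups of enhancement nodes, then solve the output weights by
    # ridge regression (pseudo-inverse with regularizer lam).
    rng = np.random.default_rng(seed)
    Z = np.hstack([np.tanh(X_DG @ rng.normal(size=(X_DG.shape[1], k)))
                   for _ in range(n_feat)])            # feature nodes Z^n
    H = np.hstack([np.tanh(Z @ rng.normal(size=(Z.shape[1], k)))
                   for _ in range(n_enh)])             # enhancement nodes H^m
    A = np.hstack([Z, H])                              # input matrix A_DG
    W = np.linalg.solve(lam * np.eye(A.shape[1]) + A.T @ A, A.T @ Y)
    return W                                           # output weights W_DG

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 12))   # 100 drug-gene feature pairs (toy)
Y = rng.normal(size=(100, 1))    # drug-sensitivity regression targets (toy)
W_DG = bls_train(X, Y)
print(W_DG.shape)  # (50, 1)
```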
After the IBDT model is initially trained, its parameters are fixed, and newly added samples generate drug-gene feature pairs through the Transformer encoder and multilayer perceptron of the IBDT model. As shown in FIG. 3, for a newly added sample X_a, its feature pair X_a^DG is first generated with the fixed IBDT model parameters; then, in the width learning system alone, the feature pair X_a^DG of the newly added sample is mapped into new feature nodes and enhancement nodes, enriching the feature space of the original model; all feature nodes and enhancement nodes are combined to obtain the input matrix A_a^DG corresponding to the newly added sample, and the input matrix of the model is updated as:

A_DG ← [A_DG; A_a^DG]

The output weights of the network are dynamically updated through the incremental learning algorithm; new knowledge is learned and the knowledge base is updated without retraining the whole network.
The pseudo-inverse between the newly added feature pairs and the newly added sample outputs is calculated, the weight information of the newly added samples is obtained with the incremental learning algorithm, and this weight information is merged into the output weight W_DG of the width learning system, dynamically updating the model output weights:

W_DG ← W_DG + B_a·(Y_a − A_a^DG·W_DG)

where Y_a denotes the label values of the newly added samples and B_a denotes the block of the updated pseudo-inverse [A_DG; A_a^DG]^+ corresponding to the newly added rows.
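A sketch of the incremental step: new batches are folded into the output weights without retraining. As an assumption, it uses the block recursive-least-squares (Woodbury) form of the regularized pseudo-inverse update, which yields exactly the weights that full retraining on all samples would give.

```python
import numpy as np

def init_state(A, Y, lam=1e-6):
    # P is the inverse regularized Gram matrix; W the output weights.
    P = np.linalg.inv(lam * np.eye(A.shape[1]) + A.T @ A)
    return P, P @ A.T @ Y

def incremental_update(P, W, A_a, Y_a):
    # Fold the new batch (A_a, Y_a) into (P, W) via the Woodbury
    # identity, so the whole model never has to be retrained.
    S = np.linalg.inv(np.eye(A_a.shape[0]) + A_a @ P @ A_a.T)
    P_new = P - P @ A_a.T @ S @ A_a @ P
    W_new = W + P_new @ A_a.T @ (Y_a - A_a @ W)
    return P_new, W_new

rng = np.random.default_rng(4)
A, Y = rng.normal(size=(50, 20)), rng.normal(size=(50, 1))      # initial batch
A_a, Y_a = rng.normal(size=(10, 20)), rng.normal(size=(10, 1))  # newly added batch
P, W = init_state(A, Y)
P, W = incremental_update(P, W, A_a, Y_a)

# Incrementally updated weights match retraining on all 60 samples:
_, W_full = init_state(np.vstack([A, A_a]), np.vstack([Y, Y_a]))
print(np.allclose(W, W_full))  # True
```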
A purely deep-learning-based drug response prediction model, a deep learning and Transformer-based model, a width learning and deep Transformer-based model, and an incremental width and deep Transformer-based model were each trained and tested. The test results show that introducing the Transformer model better extracts the interaction information between different atoms and associated chemical bonds in drug molecules; the width learning system better fuses drug and gene features and improves the prediction performance of the model; and introducing incremental learning further improves the performance of the prediction model.
The method is effectively based on the incremental width and deep learning model. By learning structured drug information encodings with the Transformer encoder, it solves the problem that the different interaction information between atoms in a drug molecule and their associated chemical bonds cannot otherwise be distinguished. A width learning system is adopted to fuse the drug features and gene features, improving the accuracy of the model's predictions, and the dynamic expandability of the width learning system is exploited to learn new knowledge from new samples without retraining the whole network, further improving model performance. Reasonable drug response prediction with the disclosed method helps biologists conduct in vitro clinical tests, supports the design and study of new drugs, and greatly benefits medical scientists in designing personalized cancer treatment regimens.
Example two
The present embodiment is a storage medium storing a computer program which, when executed by a processor, causes the processor to execute the incremental width and deep learning drug response prediction method according to the first embodiment.
Example three
This embodiment is a computing device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the incremental width and deep learning drug response prediction method according to the first embodiment.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (9)
1. An incremental width and deep learning drug response prediction method, characterized by comprising the following steps:
S1. Performing text encoding and position encoding on the SMILES sequence of the drug to obtain a text encoding T_i and a position encoding P_i, thereby constructing a drug information encoding E_i, where i = 1, 2, …, L and L denotes the maximum drug string sequence length;
S2. Inputting the drug information encoding E_i into the IBDT model, where the IBDT model comprises a Transformer encoder, a multilayer perceptron, and a width learning system;
the drug information encoding E_i is input into the Transformer encoder to mine the drug feature D_F; at the same time, the gene expression data G_o is input into the multilayer perceptron to learn the gene feature G_F; the drug feature D_F and the gene feature G_F are spliced together to form the drug-gene feature pair X_DG; the feature pair X_DG is input into the width learning system to obtain the predicted drug sensitivity regression value;
after initial training of the IBDT model, its parameters are fixed; newly added samples are subsequently used to add feature nodes and enhancement nodes to the width learning system, and the output weights W_DG of the width learning system are dynamically updated through an incremental learning algorithm.
2. The incremental width and deep learning drug response prediction method of claim 1, wherein: in step S1, text-encoding the SMILES sequence of the drug means: decomposing the SMILES sequence of the drug into single atom symbols and small-molecule sequences according to chemical prior knowledge, with the atom symbols and small-molecule sequences expressed in word-vector form;
position-encoding the SMILES sequence of the drug means: encoding the position information of the drug using a dictionary-lookup matrix;
the text encoding T_i and the position encoding P_i are added to obtain the drug information encoding E_i:
E_i = T_i + P_i.
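As a rough illustration of E_i = T_i + P_i, the sketch below builds the text encoding as a row lookup in a word-vector matrix (equivalent to multiplying a one-hot vector by W_T) and the position encoding as a dictionary-lookup matrix indexed by position; the vocabulary size, sequence length, embedding dimension, and token ids are assumed values, not the patent's.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab_size, max_len, d_model = 60, 100, 32    # assumed sizes
W_T = rng.normal(size=(vocab_size, d_model))  # trainable word-vector matrix W_T
W_P = rng.normal(size=(max_len, d_model))     # dictionary-lookup position matrix

token_ids = np.array([5, 12, 12, 7])          # a hypothetical tokenised SMILES fragment
T = W_T[token_ids]                            # text encoding T_i (row lookup == one-hot @ W_T)
P = W_P[np.arange(len(token_ids))]            # position encoding P_i
E = T + P                                     # drug information encoding E_i = T_i + P_i
print(E.shape)                                # → (4, 32)
```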
3. The incremental width and deep learning drug response prediction method of claim 2, wherein: the text encoding of the SMILES sequence of the drug comprises:
decomposing the SMILES sequence of the drug into single atom symbols and small-molecule sequences according to chemical prior knowledge; regarding each single atom symbol or small-molecule sequence as a string token and constructing a vocabulary set D containing strings of different granularities; then using the Torchtext tool library to count and label a corpus containing the SMILES sequences of all drugs, expressing each SMILES sequence as a sequence string S = {S_1, …, S_L} represented by one-hot vectors, where S_i denotes a token in the vocabulary set D; the text encoding of each drug is expressed as T_i = W_T s_i, where W_T denotes a trainable word-vector matrix and s_i denotes the one-hot vector of the i-th token of the sequence string S;
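A hypothetical tokenizer in the spirit of this decomposition might split a SMILES string into multi-character atom symbols (Cl, Br), bracketed groups, and single characters before building the vocabulary set D; the regex and token granularity here are illustrative assumptions, not the patent's actual rules (which rely on chemical prior knowledge and Torchtext):

```python
import re

# hypothetical tokenizer: try two-letter atoms and bracket atoms before single chars
SMILES_TOKEN = re.compile(r"Cl|Br|\[[^\]]+\]|.")

def tokenize(smiles: str):
    return SMILES_TOKEN.findall(smiles)

tokens = tokenize("CC(=O)Oc1ccccc1C(=O)O")          # aspirin SMILES
vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}  # vocabulary set D
ids = [vocab[t] for t in tokens]                    # integer ids, later one-hot encoded
print(tokens[:5])                                   # → ['C', 'C', '(', '=', 'O']
```

Because alternation in the regex is tried left to right, "Cl" and "Br" are kept whole instead of being split into two single-character tokens.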
the position encoding of the SMILES sequence of the drug comprises: looking up the position of each token in the dictionary-lookup matrix to obtain P_i.
4. The incremental width and deep learning drug response prediction method of claim 1, wherein: the drug information encoding E_i passes through the multi-head self-attention layer of the Transformer encoder and is mapped by linear transformations into a Query matrix, a Key matrix, and a Value matrix, denoted Q, K, and V respectively:
Q = E W_q, K = E W_k, V = E W_v,
where W_q, W_k, W_v denote learnable weight matrices; the attention calculation formula then yields the output encoding the attention relationships between the subsequences S_i:
Attention(Q, K, V) = softmax(Q K^T / √d_k) V,
where d_k denotes the dimension of the drug information encoding E_i; the output of the multi-head self-attention layer is then fed into a fully connected feedforward neural network:
F = max(0, O W_1 + β_1) W_2 + β_2,
where O denotes the attention output, F denotes the output of the feedforward neural network, W_1 and W_2 denote learnable weight matrices, and β_1, β_2 denote the biases; finally, the output F is input into a multilayer perceptron with nonlinear activation functions σ_1, σ_2, σ_3 and corresponding learnable weight matrices and biases to obtain the drug feature D_F.
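The attention and feedforward computation can be sketched as follows, assuming a single attention head, scaled dot-product attention, and ReLU in the feedforward network; the sequence length, dimensions, and random weights are placeholders, not the patent's settings:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable softmax
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(2)
L_seq, d_k = 6, 16                           # sequence length and model dimension (assumed)
E = rng.normal(size=(L_seq, d_k))            # drug information encoding E
W_q, W_k, W_v = (rng.normal(size=(d_k, d_k)) for _ in range(3))

Q, K, V = E @ W_q, E @ W_k, E @ W_v          # linear maps to Query / Key / Value
O = softmax(Q @ K.T / np.sqrt(d_k)) @ V      # scaled dot-product attention output

# position-wise feedforward: F = max(0, O W1 + b1) W2 + b2
W1, W2 = rng.normal(size=(d_k, 4 * d_k)), rng.normal(size=(4 * d_k, d_k))
b1, b2 = np.zeros(4 * d_k), np.zeros(d_k)
F = np.maximum(O @ W1 + b1, 0) @ W2 + b2
print(F.shape)                               # → (6, 16)
```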
The gene expression data G_o is input into a multilayer perceptron to obtain the gene feature G_F; the drug feature D_F and the gene feature G_F are spliced and integrated to form the drug-gene feature pair X_DG = [D_F | G_F].
5. The incremental width and deep learning drug response prediction method of claim 4, wherein: inputting the gene expression data G_o into the multilayer perceptron means that: the multilayer perceptron comprises three hidden layers and three activation layers, and the gene feature G_F is obtained by passing G_o through these layers in turn.
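A minimal sketch of this gene branch, assuming ReLU activations and arbitrary hidden widths (the patent specifies only three hidden layers and three activation layers), followed by the splicing X_DG = [D_F | G_F] with a stand-in drug feature:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

rng = np.random.default_rng(3)
n_genes = 1000                            # assumed dimension of the expression profile G_o
sizes = [n_genes, 512, 256, 128]          # three hidden layers; widths are assumptions
Ws = [rng.normal(scale=0.01, size=(a, b)) for a, b in zip(sizes, sizes[1:])]
bs = [np.zeros(b) for b in sizes[1:]]

def gene_mlp(G_o):
    h = G_o
    for W, b in zip(Ws, bs):              # each hidden layer followed by an activation layer
        h = relu(h @ W + b)
    return h                              # gene feature G_F

G_o = rng.normal(size=(1, n_genes))       # one cell line's expression vector
G_F = gene_mlp(G_o)
D_F = rng.normal(size=(1, 128))           # stand-in drug feature from the Transformer branch
X_DG = np.concatenate([D_F, G_F], axis=1) # drug-gene feature pair X_DG = [D_F | G_F]
print(X_DG.shape)                         # → (1, 256)
```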
6. The incremental width and deep learning drug response prediction method of claim 1, wherein: during initial training of the IBDT model, the feature pairs X_DG formed from the samples are input into the width learning system, which maps out n groups of feature nodes and m groups of enhancement nodes; all the feature nodes and enhancement nodes are combined to obtain the input matrix A_DG;
the weights W_DG between the feature pairs X_DG and the outputs Y are computed by a pseudo-inverse and ridge regression learning algorithm:
W_DG = A_DG^+ Y
A_DG^+ = (λI + A_DG^T A_DG)^(-1) A_DG^T
where A_DG^+ denotes the pseudo-inverse of the input matrix A_DG; λ denotes a non-negative number in the ridge regression tending to 0; and I denotes an identity matrix.
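The width learning step can be sketched as follows, assuming tanh mappings for the feature and enhancement nodes and small node-group sizes (all assumptions for illustration); the ridge pseudo-inverse uses a small λ as in the claim:

```python
import numpy as np

rng = np.random.default_rng(4)
N, d = 200, 64                            # samples and dimension of X_DG (assumed)
X_DG = rng.normal(size=(N, d))            # drug-gene feature pairs
Y = rng.normal(size=(N, 1))               # drug sensitivity values

n, m, k = 5, 3, 10                        # n feature groups, m enhancement groups, k nodes each
Z = [np.tanh(X_DG @ rng.normal(size=(d, k))) for _ in range(n)]    # feature nodes
Zn = np.hstack(Z)
H = [np.tanh(Zn @ rng.normal(size=(n * k, k))) for _ in range(m)]  # enhancement nodes
A_DG = np.hstack([Zn] + H)                # input matrix: all feature and enhancement nodes

lam = 1e-6                                # ridge parameter λ tending to 0
A_pinv = np.linalg.solve(lam * np.eye(A_DG.shape[1]) + A_DG.T @ A_DG, A_DG.T)
W_DG = A_pinv @ Y                         # output weights W_DG = A_DG^+ Y
print(W_DG.shape)                         # → (80, 1)
```

Only W_DG is solved for in closed form; the random node mappings are never back-propagated, which is what makes width learning fast to train and to extend.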
7. The incremental width and deep learning drug response prediction method of claim 6, wherein: in the training of the IBDT model, subsequently using newly added samples to add feature nodes and enhancement nodes to the width learning system, and dynamically updating the output weights W_DG of the learning system through the incremental learning algorithm, comprises:
for a newly added sample X_a, first generating a feature pair using the fixed IBDT model parameters; then, in the width learning system, mapping out new feature nodes and enhancement nodes from the feature pair of sample X_a alone; combining all the feature nodes and enhancement nodes to obtain the input matrix corresponding to the newly added sample, and updating the input matrix of the model accordingly;
calculating the pseudo-inverse between the newly added feature pair and the output of the newly added sample, and obtaining the weight information of the newly added sample with the incremental learning algorithm; and merging the weight information of the newly added sample into the output weights W_DG of the width learning system to dynamically update the model output weights, where Y_a denotes the label value of the newly added sample.
8. A storage medium storing a computer program which, when executed by a processor, causes the processor to perform the incremental width and deep learning drug response prediction method of any one of claims 1-7.
9. A computing device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the incremental width and deep learning drug response prediction method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210464986.2A CN114841261B (en) | 2022-04-29 | 2022-04-29 | Incremental width and depth learning drug response prediction methods, media and devices |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114841261A true CN114841261A (en) | 2022-08-02 |
CN114841261B CN114841261B (en) | 2024-08-02 |
Family
ID=82567755
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210464986.2A Active CN114841261B (en) | 2022-04-29 | 2022-04-29 | Incremental width and depth learning drug response prediction methods, media and devices |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114841261B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115761250A (en) * | 2022-11-21 | 2023-03-07 | 北京科技大学 | Compound inverse synthesis method and device |
CN116403657A (en) * | 2023-03-20 | 2023-07-07 | 本源量子计算科技(合肥)股份有限公司 | Drug response prediction method and device, storage medium and electronic device |
CN117275608A (en) * | 2023-09-08 | 2023-12-22 | 浙江大学 | Cooperative attention-based method and device for cooperative prediction of interpretable anticancer drugs |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190272468A1 (en) * | 2018-03-05 | 2019-09-05 | The Board Of Trustees Of The Leland Stanford Junior University | Systems and Methods for Spatial Graph Convolutions with Applications to Drug Discovery and Molecular Simulation |
CN113764038A (en) * | 2021-08-31 | 2021-12-07 | 华南理工大学 | Method for constructing myelodysplastic syndrome whitening gene prediction model |
CN114220496A (en) * | 2021-11-30 | 2022-03-22 | 华南理工大学 | Deep learning-based inverse synthesis prediction method, device, medium and equipment |
WO2022087540A1 (en) * | 2020-10-23 | 2022-04-28 | The Regents Of The University Of California | Visible neural network framework |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |