CN114841261A - Increment width and deep learning drug response prediction method, medium, and apparatus - Google Patents


Info

Publication number
CN114841261A
Authority
CN
China
Prior art keywords
drug
width
learning
sequence
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210464986.2A
Other languages
Chinese (zh)
Other versions
CN114841261B (en)
Inventor
陈俊龙
詹永康
孟献兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202210464986.2A priority Critical patent/CN114841261B/en
Publication of CN114841261A publication Critical patent/CN114841261A/en
Application granted granted Critical
Publication of CN114841261B publication Critical patent/CN114841261B/en
Legal status: Active (granted)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/251 Fusion techniques of input or preprocessed data
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H 20/10 ICT specially adapted for therapies or health-improving plans relating to drugs or medications, e.g. for ensuring correct administration to patients
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation


Abstract

The invention provides a method, medium, and apparatus for drug response prediction with incremental width (broad) learning and deep learning. The method comprises the following steps: performing text encoding and position encoding on the drug's SMILES sequence to construct a drug information code; inputting the drug information code into a Transformer encoder to mine the drug's structural features, inputting gene expression data into a multilayer perceptron to learn a gene feature representation, and concatenating the drug features and gene features into a drug-gene feature pair; and inputting the feature pair into a width learning system to obtain a predicted drug sensitivity regression value. The method addresses the problem of poor drug representation; the width learning system fuses the drug and gene feature representations, improving the accuracy of drug sensitivity prediction; and the network weights are updated through an incremental learning algorithm, improving model performance without retraining the whole model.

Description

Increment width and deep learning drug response prediction method, medium, and apparatus
Technical Field
The invention relates to the technical field of drug response prediction, and in particular to a drug response prediction method, medium, and apparatus based on incremental width and deep learning.
Background
Cancer is a major disease threatening human health and life, and realizing personalized treatment for cancer patients is one of the most prominent research fields of precision medicine. In recent years, with the rapid development of pharmacogenomics and computational models, drug response prediction technology has gradually brought more convenience to personalized treatment research. Drug response prediction aims to extract and integrate information about drugs and the gene expression of cell lines in order to predict the sensitivity of cell lines to drugs. The half-maximal inhibitory concentration (IC50) reflects the drug response sensitivity of cell lines and is a commonly used drug response prediction index. Most traditional drug response prediction methods are based on machine learning, such as support vector machines, Bayesian multi-task multi-kernel learning, random forests, and simple neural network models. These methods rely on prior knowledge and feature engineering to obtain drug and gene features, which are then combined into new features to predict the drug sensitivity of cell lines. Faced with complex, high-dimensional, and noisy data, the prediction and generalization performance of these methods is not competitive.
With the development of artificial intelligence, deep learning has made remarkable breakthroughs in problems such as drug response prediction and drug development. Methods for predicting drug response with deep learning can be broadly classified into two types: unsupervised or semi-supervised methods, and end-to-end supervised methods. Unsupervised or semi-supervised drug response prediction models generally use an autoencoder to perform dimension-reducing feature learning on data such as drug text sequences and the methylation, copy number variation, and transcriptome data of cell lines; the learned features are used to train a classifier that predicts the sensitivity of cell lines to drugs. End-to-end supervised methods exploit the modularity of deep learning networks, adopting models such as convolutional neural networks, deep encoders, ensemble deep neural networks, and graph neural networks to extract features from different types of data, and feeding the learned drug and gene representations into a prediction classifier for training. Compared with traditional machine learning methods, deep learning methods improve prediction performance and generalization, but still cannot meet the requirements of clinical trials. Most current deep-learning-based drug response prediction algorithms have considerable limitations.
First, feature engineering methods that adopt only chemical descriptors or molecular fingerprints give insufficient consideration to the representation of drug structure: they cannot distinguish different atoms in a drug molecule or the different interaction information between their associated chemical bonds, and easily lose hidden drug structure information. Second, the fusion of drug features and gene features is simplistic, and the performance gain of a classifier built from a multilayer neural network model is limited. Third, existing modeled systems must retrain the whole model when facing newly added data, greatly increasing the time cost. Fourth, in real clinical experimental environments, all training samples cannot be acquired at once, owing to differing privacy and property protections and data collection periods; current drug response prediction models also lack the ability to learn incrementally from multiple batches of data.
In conclusion, current deep-learning-based drug response prediction methods leave room for improvement.
Disclosure of Invention
To overcome the disadvantages and shortcomings of the prior art, it is an object of the present invention to provide a drug response prediction method, medium, and apparatus based on incremental width and deep learning. The method learns structural features of the drug's SMILES sequence through a Transformer encoder, solving the problem of being unable to distinguish different atoms in drug molecules and the different interaction information between their associated chemical bonds; a width learning system fuses the drug and gene feature representations, improving the accuracy of drug sensitivity prediction; and the network weights are updated through an incremental learning algorithm, improving model performance without retraining the whole model.
To achieve this purpose, the invention is realized by the following technical scheme: a drug response prediction method with incremental width and deep learning, comprising the following steps:
S1. Perform text encoding and position encoding on the drug's SMILES sequence to obtain a text code T_i and a position code P_i, thereby constructing a drug information code E_i, where i = 1, 2, …, L and L denotes the maximum drug string sequence length;
S2. Input the drug information code E_i into an IBDT model; the IBDT model comprises a Transformer encoder, a multilayer perceptron, and a width learning system;
Input the drug information code E_i into the Transformer encoder to mine the drug features D_F; simultaneously input the gene expression data G_o into the multilayer perceptron to learn the gene features G_F; concatenate the drug features D_F and gene features G_F into a drug-gene feature pair X_DG; and input the feature pair X_DG into the width learning system to obtain the predicted drug sensitivity regression value;
After initial training, the parameters of the IBDT model are fixed; subsequently, newly added samples are used to add feature nodes and enhancement nodes to the width learning system, and the output weight W_DG of the width learning system is dynamically updated through an incremental learning algorithm.
Preferably, in step S1, text encoding of the drug's SMILES sequence means: decomposing the SMILES sequence into single-atom symbols and small-molecule sequences according to chemical prior knowledge, where the atom symbols and small-molecule sequences are expressed in word-vector form.

Position encoding of the drug's SMILES sequence means: encoding the drug's position information with a dictionary-lookup matrix.

The text code T_i and the position code P_i are added to obtain the drug information code E_i:

E_i = T_i + P_i
Preferably, text encoding the SMILES sequence of the drug comprises: decomposing the SMILES sequence into single-atom symbols and small-molecule sequences according to chemical prior knowledge; treating each single-atom symbol and small-molecule sequence as a string word and constructing a vocabulary set D containing strings of different granularities; then using the Torchtext tool library to count and annotate a corpus containing the SMILES sequences of all drugs. Each SMILES sequence is expressed as a sequence string S = {S_1, …, S_L} represented by one-hot vectors, where S_i denotes a word in the vocabulary set D. The text code for each drug is expressed as:

T_i = W_T · O_i^S

where W_T denotes a trainable word-vector matrix and O_i^S denotes the one-hot vector of the i-th string of the sequence string S.
Position encoding the SMILES sequence of the drug comprises:

P_i = W_P · O_i^P

where W_P denotes a weight matrix and O_i^P denotes the one-hot vector of the i-th position of the sequence string S.
Preferably, the drug information code E_i passes through the multi-head self-attention layer of the Transformer encoder, where it is mapped by linear transformation into a Query matrix, a Key matrix, and a Value matrix, denoted Q, K, and V respectively:

Q = E·W_q, K = E·W_k, V = E·W_v

where W_q, W_k, W_v denote learnable weight matrices. The output capturing the attention relationships between the subsequences S_i is obtained with the attention formula:

Attention(Q, K, V) = softmax(Q·K^T / sqrt(d_k))·V

where d_k denotes the dimension of the drug information code E_i. The output A of the multi-head self-attention layer is then fed into a fully connected feedforward neural network:

F = σ(A·W_1 + β_1)·W_2 + β_2

where F denotes the output of the feedforward neural network, W_1 and W_2 denote learnable weight matrices, and β_1 and β_2 denote biases. Finally, the output F is input into a multilayer perceptron to obtain the drug features D_F:

D_F = σ_3(σ_2(σ_1(F·W'_1 + b'_1)·W'_2 + b'_2)·W'_3 + b'_3)

where σ_1, σ_2, σ_3 denote nonlinear activation functions, W'_1, W'_2, W'_3 denote learnable weight matrices, and b'_1, b'_2, b'_3 denote the corresponding biases.
The gene expression data G_o are input into a multilayer perceptron to obtain the gene features G_F; the drug features D_F and gene features G_F are concatenated and integrated into the drug-gene feature pair X_DG = [D_F | G_F]. The raw gene expression data are taken from the Cancer Cell Line Encyclopedia (CCLE) dataset.
Preferably, inputting the gene expression data G_o into the multilayer perceptron means: the multilayer perceptron comprises three hidden layers and three activation layers; the gene features G_F are:

G_F = σ_6(σ_5(σ_4(G_o·W_4 + b_4)·W_5 + b_5)·W_6 + b_6)

where σ_4, σ_5, σ_6 denote nonlinear activation functions, W_4, W_5, W_6 denote learnable weight matrices, and b_4, b_5, b_6 denote the corresponding biases.
Preferably, during initial training of the IBDT model, the feature pairs X_DG formed from the samples are input into the width learning system and mapped into n groups of feature nodes Z^n = [Z_1, …, Z_n] and m groups of enhancement nodes H^m = [H_1, …, H_m]; all feature nodes and enhancement nodes are combined into the input matrix A_DG:

A_DG = [Z_1, …, Z_n | H_1, …, H_m]

The weights W_DG between the feature pairs X_DG and the output Y, i.e., between the features and the true drug sensitivity, are computed with the pseudo-inverse and ridge-regression learning algorithm:

W_DG = A_DG^+ · Y

A_DG^+ = (λI + A_DG^T·A_DG)^{-1}·A_DG^T

where A_DG^+ denotes the pseudo-inverse of the input matrix A_DG, λ denotes a non-negative ridge-regression parameter tending to 0, and I denotes the identity matrix.
Preferably, in IBDT model training, subsequently using added samples to add feature nodes and enhancement nodes to the width learning system and dynamically updating the output weight W_DG through the incremental learning algorithm comprises:

For a newly added sample X_a, first generate its feature pair X_DG^a using the fixed IBDT model parameters; then, within the width learning system, separately map the feature pair X_DG^a of the added sample X_a into new feature nodes and enhancement nodes, and combine all feature nodes and enhancement nodes into the input matrix A_DG^a corresponding to the newly added sample. The input matrix of the model is updated as:

A'_DG = [A_DG ; A_DG^a]

The pseudo-inverse between the newly added feature pairs and the newly added sample output is computed, and the weight information of the newly added samples is obtained with the incremental learning algorithm:

(A'_DG)^+ = [A_DG^+ - B·D^T | B]

where

D^T = A_DG^a·A_DG^+

B = C^+ if C ≠ 0, and B = A_DG^+·D·(I + D^T·D)^{-1} if C = 0,

with C = A_DG^a - D^T·A_DG.

The weight information of the newly added samples is merged into the output weight W_DG of the width learning system, dynamically updating the model output weight:

W'_DG = W_DG + B·(Y_a - A_DG^a·W_DG)

where Y_a denotes the label values of the newly added samples.
A storage medium storing a computer program which, when executed by a processor, causes the processor to perform the above drug response prediction method with incremental width and deep learning.

A computing device comprising a processor and a memory for storing a program executable by the processor, wherein, when executing the program stored in the memory, the processor implements the above drug response prediction method with incremental width and deep learning.
Compared with the prior art, the invention has the following advantages and beneficial effects:
according to the method, when a newly added sample is faced, the network weight can be updated through the incremental learning algorithm without retraining the whole model, and the performance of the model is improved. The model learns the structural characteristics of the SMILES sequence of the drug through a Transformer encoder, and solves the problem that different atoms in drug molecules and different action information among related chemical bonds of the atoms cannot be distinguished; a width learning system is adopted to fuse the drug expression and gene expression characteristics, and the accuracy of the drug sensitivity prediction result is improved.
Drawings
FIG. 1 is a schematic flow diagram of a method for incremental width and deep learning drug response prediction in accordance with the present invention;
FIG. 2 is an architecture diagram of the IBDT model in the drug response prediction method of incremental Width and deep learning according to the present invention;
FIG. 3 is a schematic flow chart of incremental learning in the method for predicting drug response by incremental width and deep learning according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Example one
The invention provides a drug response prediction method based on an incremental width learning system and a Transformer model. First, text encoding and position encoding are performed on the drug's SMILES sequence to construct a drug information code. The drug information code is input into a Transformer encoder to mine the drug's structural features, while gene expression data are input into a multilayer perceptron to learn a gene feature representation; the drug features and gene features are concatenated into a drug-gene feature pair, the feature pair is input into the width learning system for training to obtain the final model, and the trained model is used for drug response prediction. The method learns structural features of the drug's SMILES sequence through the Transformer encoder, solving the problem of being unable to distinguish different atoms in drug molecules and the different interaction information between their associated chemical bonds; the width learning system fuses the drug and gene feature representations, improving the accuracy of drug sensitivity prediction. For newly added samples, the whole model does not need to be retrained: the network weights are updated through an incremental learning algorithm, improving model performance.
The flow of the method for predicting the drug response of increment width and deep learning in the embodiment is shown in fig. 1, and comprises the following steps:
S1. Perform text encoding and position encoding on the drug's SMILES sequence to obtain a text code T_i and a position code P_i, thereby constructing a drug information code E_i, where i = 1, 2, …, L and L denotes the maximum drug string sequence length.
Text encoding the drug's SMILES sequence means decomposing it, according to chemical prior knowledge, into single-atom symbols and small-molecule sequences, which are expressed in word-vector form; these symbols and sequences, as word vectors, represent the SMILES sequence, i.e., the drug text encoding.
the text encoding of the SMILES sequence of the drug comprises:
decomposing the SMILES sequence of the drug into a single atomic symbol and a small molecule sequence according to chemical prior knowledge; regarding a single atomic symbol and a small molecule sequence as a character string word, constructing a vocabulary set D containing character strings with different granularity, and then using a Torchtext tool library to perform statistics and labeling on a corpus containing all medicine SMILES sequencesNote that the SMILES sequence is expressed as a sequence string S ═ S 1 ,...,S L And are represented by a one-hot vector, where S i Represents a word in the vocabulary set D; the text code for each drug is expressed as:
Figure BDA0003623542430000081
wherein, W T Representing a matrix of word vectors that can be trained,
Figure BDA0003623542430000082
a one-hot vector representing the ith character string of the sequence string S;
To capture the drugs' position information, the invention generates a position code for each drug, encoding the position information with a dictionary-lookup matrix:

P_i = W_P · O_i^P

where W_P denotes a weight matrix and O_i^P denotes the one-hot vector of the i-th position of the sequence string S.
The text code T_i and the position code P_i are added to obtain the drug information code E_i:

E_i = T_i + P_i
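As an illustration of step S1, the sketch below builds a drug information code E = T + P from a SMILES string. The toy vocabulary, simplified tokenizer, and random embedding matrices are stand-in assumptions, not the patent's actual Torchtext-based pipeline or trained W_T.

```python
import numpy as np

rng = np.random.default_rng(0)

def tokenize_smiles(smiles, multi_char=("Cl", "Br")):
    """Split a SMILES string into atom/fragment tokens (chemical priors heavily simplified)."""
    tokens, i = [], 0
    while i < len(smiles):
        if smiles[i:i + 2] in multi_char:   # two-character atom symbols first
            tokens.append(smiles[i:i + 2]); i += 2
        else:
            tokens.append(smiles[i]); i += 1
    return tokens

def encode_drug(tokens, vocab, W_T, W_P):
    """E_i = T_i + P_i: word-vector lookup plus dictionary position lookup."""
    ids = [vocab[t] for t in tokens]
    T = W_T[ids]                 # text encoding T, one row per token
    P = W_P[:len(ids)]           # position encoding P by lookup
    return T + P                 # drug information code E

vocab = {"C": 0, "N": 1, "O": 2, "=": 3, "(": 4, ")": 5, "Cl": 6}
L, d = 16, 8                              # max sequence length, embedding dimension
W_T = rng.normal(size=(len(vocab), d))    # word-vector matrix (trainable; random here)
W_P = rng.normal(size=(L, d))             # dictionary-lookup position matrix
E = encode_drug(tokenize_smiles("CC(=O)Cl"), vocab, W_T, W_P)  # acetyl chloride
```

In a trained model W_T and W_P would be learned jointly with the encoder; here they only illustrate the lookup-and-add construction of E.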
S2. Input the drug information code E_i into the IBDT model; the IBDT model comprises a Transformer encoder, a multilayer perceptron, and a width learning system, as shown in FIG. 2. The drug information code E_i is input into the Transformer encoder to mine the drug features D_F; simultaneously, the gene expression data G_o are input into the multilayer perceptron to learn the gene features G_F; the drug features D_F and gene features G_F are concatenated into the drug-gene feature pair X_DG. The Transformer encoder module can learn the different interaction information between different atoms in the drug SMILES sequence and their associated chemical bonds, generating a drug feature representation with structural information. The multilayer perceptron module learns a feature representation of the genes. The width learning system module integrates the drug features and gene features, reduces training time cost, and improves the prediction performance of the model.
In particular, the drug information code E_i passes through the multi-head self-attention layer of the Transformer encoder, where it is mapped by linear transformation into a Query matrix, a Key matrix, and a Value matrix, denoted Q, K, and V respectively:

Q = E·W_q, K = E·W_k, V = E·W_v

where W_q, W_k, W_v denote learnable weight matrices. The output capturing the attention relationships between the subsequences S_i is obtained with the attention formula:

Attention(Q, K, V) = softmax(Q·K^T / sqrt(d_k))·V

where d_k denotes the dimension of the drug information code E_i. The output A of the multi-head self-attention layer is then fed into a fully connected feedforward neural network:

F = σ(A·W_1 + β_1)·W_2 + β_2

where F denotes the output of the feedforward neural network, W_1 and W_2 denote learnable weight matrices, and β_1 and β_2 denote biases. Finally, the output F is input into a multilayer perceptron to obtain the output of the Transformer encoder, namely the drug features D_F:

D_F = σ_3(σ_2(σ_1(F·W'_1 + b'_1)·W'_2 + b'_2)·W'_3 + b'_3)

where σ_1, σ_2, σ_3 denote nonlinear activation functions, W'_1, W'_2, W'_3 denote learnable weight matrices, and b'_1, b'_2, b'_3 denote the corresponding biases.
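A minimal numerical sketch of the encoder computations above, using single-head attention, ReLU in the feedforward network, random weights, and toy dimensions; the patent's multi-head layout and trained parameters are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(E, W_q, W_k, W_v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V over the code E (L x d_k)."""
    Q, K, V = E @ W_q, E @ W_k, E @ W_v
    d_k = E.shape[1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def feed_forward(A, W1, b1, W2, b2):
    """F = ReLU(A W1 + b1) W2 + b2, the fully connected feedforward network."""
    return np.maximum(A @ W1 + b1, 0.0) @ W2 + b2

L, d = 7, 8                                    # toy sequence length and model dimension
E = rng.normal(size=(L, d))                    # drug information codes E_i stacked as rows
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
A = self_attention(E, W_q, W_k, W_v)           # self-attention output
F = feed_forward(A, rng.normal(size=(d, 16)), np.zeros(16),
                 rng.normal(size=(16, d)), np.zeros(d))
```

Each row of the attention weights sums to 1, so A mixes the value vectors of all positions; F keeps the (L, d) shape expected by the following perceptron.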
The gene expression data G_o are input into a multilayer perceptron, which comprises three hidden layers and three activation layers, to obtain the gene features G_F:

G_F = σ_6(σ_5(σ_4(G_o·W_4 + b_4)·W_5 + b_5)·W_6 + b_6)

where σ_4, σ_5, σ_6 denote nonlinear activation functions, W_4, W_5, W_6 denote learnable weight matrices, and b_4, b_5, b_6 denote the corresponding biases. The raw gene expression data are taken from the Cancer Cell Line Encyclopedia (CCLE) dataset.
The drug features D_F and gene features G_F are concatenated and integrated into the drug-gene feature pair X_DG = [D_F | G_F].
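The gene branch and the concatenation into X_DG can be sketched as follows. The layer widths, tanh activations, random weights, and the stand-in drug feature are illustrative assumptions, not the patent's configuration.

```python
import numpy as np

rng = np.random.default_rng(2)

def mlp3(x, Ws, bs, act=np.tanh):
    """Three hidden/activation layers, as in the gene branch (activations assumed tanh)."""
    for W, b in zip(Ws, bs):
        x = act(x @ W + b)
    return x

G_o = rng.normal(size=(1, 512))                 # one gene expression profile (dim. illustrative)
dims = [512, 256, 128, 64]                      # assumed layer widths
Ws = [rng.normal(scale=0.05, size=(a, b)) for a, b in zip(dims, dims[1:])]
bs = [np.zeros(b) for b in dims[1:]]
G_F = mlp3(G_o, Ws, bs)                         # gene features G_F

D_F = rng.normal(size=(1, 64))                  # stand-in for the encoder's drug features
X_DG = np.concatenate([D_F, G_F], axis=1)       # feature pair X_DG = [D_F | G_F]
```

The concatenation keeps both modalities side by side, leaving their fusion to the width learning system rather than to another dense layer.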
The feature pairs X_DG are input into the width learning system to obtain predicted drug sensitivity regression values.
After initial training, the parameters of the IBDT model are fixed; subsequently, newly added samples are used to add feature nodes and enhancement nodes to the width learning system, and the output weight W_DG of the width learning system is dynamically updated through the incremental learning algorithm.
During initial training of the IBDT model, the feature pairs X_DG formed from the samples are input into the width learning system and mapped into n groups of feature nodes Z^n = [Z_1, …, Z_n] and m groups of enhancement nodes H^m = [H_1, …, H_m]; all feature nodes and enhancement nodes are combined into the input matrix A_DG:

A_DG = [Z_1, …, Z_n | H_1, …, H_m]

The weights W_DG between the feature pairs X_DG and the output Y, i.e., between the features and the true drug sensitivity, are computed with the pseudo-inverse and ridge-regression learning algorithm:

W_DG = A_DG^+ · Y

A_DG^+ = (λI + A_DG^T·A_DG)^{-1}·A_DG^T

where A_DG^+ denotes the pseudo-inverse of the input matrix A_DG, λ denotes a non-negative ridge-regression parameter tending to 0, and I denotes the identity matrix.
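Under the simplifying assumptions of tanh node mappings and a single collapsed group each of feature and enhancement nodes (the patent uses n and m groups), the width learning system's training step reduces to the ridge pseudo-inverse above; all dimensions below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)

def map_nodes(X, We, be, Wh, bh):
    """Feature nodes Z = phi(X We + be), enhancement nodes H = xi(Z Wh + bh), A = [Z | H]."""
    Z = np.tanh(X @ We + be)                 # feature nodes (n groups collapsed into one)
    H = np.tanh(Z @ Wh + bh)                 # enhancement nodes (m groups collapsed)
    return np.concatenate([Z, H], axis=1)

def ridge_weights(A, Y, lam=1e-6):
    """W = A^+ Y with A^+ = (lam*I + A^T A)^{-1} A^T (ridge-regression pseudo-inverse)."""
    return np.linalg.solve(lam * np.eye(A.shape[1]) + A.T @ A, A.T @ Y)

N, d_x, d_z, d_h = 200, 128, 32, 16
X_DG = rng.normal(size=(N, d_x))             # drug-gene feature pairs
Y = rng.normal(size=(N, 1))                  # IC50-style regression targets
We, be = rng.normal(size=(d_x, d_z)), rng.normal(size=d_z)
Wh, bh = rng.normal(size=(d_z, d_h)), rng.normal(size=d_h)
A_DG = map_nodes(X_DG, We, be, Wh, bh)
W_DG = ridge_weights(A_DG, Y)
Y_hat = A_DG @ W_DG                          # predicted drug-sensitivity values
```

Because W_DG comes from a single linear solve rather than gradient descent, training the output layer is fast, which is the property the incremental update below exploits.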
After the IBDT model is initially trained, its parameters are fixed, and newly added samples generate drug-gene feature pairs through the Transformer encoder and multilayer perceptron of the IBDT model. As shown in FIG. 3, for a newly added sample X_a, its feature pair X_DG^a is first generated using the fixed IBDT model parameters; then, within the width learning system, the feature pair X_DG^a of the added sample X_a is separately mapped into new feature nodes and enhancement nodes, enriching the feature space of the original model; all feature nodes and enhancement nodes are combined into the input matrix A_DG^a corresponding to the newly added sample. The input matrix of the model is updated as:

A'_DG = [A_DG ; A_DG^a]
the output weights of the network are dynamically updated through an incremental learning algorithm, new knowledge is learned, a knowledge base is updated, and the whole network does not need to be retrained.
The pseudo-inverse between the newly added feature pairs and the newly added sample output is computed, and the weight information of the newly added samples is obtained with the incremental learning algorithm:

(A'_DG)^+ = [A_DG^+ - B·D^T | B]

where

D^T = A_DG^a·A_DG^+

B = C^+ if C ≠ 0, and B = A_DG^+·D·(I + D^T·D)^{-1} if C = 0,

with C = A_DG^a - D^T·A_DG.

The weight information of the newly added samples is merged into the output weight W_DG of the width learning system, dynamically updating the model output weight:

W'_DG = W_DG + B·(Y_a - A_DG^a·W_DG)

where Y_a denotes the label values of the newly added samples.
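A sketch of this incremental update, following the standard broad-learning-system formulas for newly added inputs; the patent's own equation images are not legible here, so the notation and the tiny dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def ridge_pinv(A, lam=1e-8):
    """Ridge pseudo-inverse: (lam*I + A^T A)^{-1} A^T, with lam tending to 0."""
    return np.linalg.solve(lam * np.eye(A.shape[1]) + A.T @ A, A.T)

# --- initial training on the first batch of mapped nodes ---
A = rng.normal(size=(50, 12))        # A_DG: combined feature/enhancement nodes
Y = rng.normal(size=(50, 1))         # drug-sensitivity labels
A_pinv = ridge_pinv(A)
W = A_pinv @ Y                       # W_DG = A_DG^+ Y

# --- a new batch arrives: update W without retraining on all data ---
A_a = rng.normal(size=(5, 12))       # nodes mapped from the newly added samples
Y_a = rng.normal(size=(5, 1))        # their label values
D_T = A_a @ A_pinv                   # D^T = A_a A^+
C = A_a - D_T @ A                    # part of A_a outside the old row space
if np.linalg.norm(C) > 1e-6:
    B = ridge_pinv(C)                # B = C^+
else:                                # new rows already lie in the old row space
    B = A_pinv @ D_T.T @ np.linalg.inv(np.eye(len(A_a)) + D_T @ D_T.T)
W_new = W + B @ (Y_a - A_a @ W)      # dynamically updated output weight
```

With exact pseudo-inverses, W_new coincides with retraining on the stacked matrix [A; A_a], which is what makes the update attractive when sample batches arrive over time.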
Drug response prediction models based purely on deep learning, on deep learning with a Transformer, on width learning with a deep Transformer, and on the incremental width and deep Transformer were each trained and tested. The results show that introducing the Transformer model better extracts the interaction information between different atoms and their associated chemical bonds in drug molecules; the width learning system better fuses the drug and gene features and improves the prediction performance of the model; and introducing incremental learning further improves the performance of the prediction model.
The method is effectively based on an incremental width and deep learning model: through the Transformer encoder's learning of the structured drug information codes, it solves the problem of being unable to distinguish different atoms in drug molecules and the different interaction information between their associated chemical bonds; the width learning system fuses the drug and gene features, improving the accuracy of the model's predictions; and, exploiting the dynamic extensibility of the width learning system, new knowledge is learned from new samples without retraining the whole network, improving model performance. Reasonable drug response prediction with this method helps biologists conduct in vitro clinical tests, aids the design and research of new drugs, and greatly benefits medical scientists in designing personalized cancer treatment plans.
Embodiment 2
This embodiment is a storage medium storing a computer program which, when executed by a processor, causes the processor to perform the incremental width and deep learning drug response prediction method of Embodiment 1.
Embodiment 3
This embodiment is a computing device comprising a processor and a memory storing a program executable by the processor; when the processor executes the program stored in the memory, it implements the incremental width and deep learning drug response prediction method of Embodiment 1.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to them; any change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the present invention should be construed as an equivalent and is intended to fall within the scope of the present invention.

Claims (9)

1. A method for predicting drug response with incremental width and deep learning, comprising the following steps:
S1, text-coding and position-coding the SMILES sequence of the drug to obtain a text code T_i and a position code P_i, thereby constructing a drug information code E_i, where i = 1, 2, …, L and L denotes the maximum drug string sequence length;
S2, inputting the drug information codes E_i into an IBDT model, the IBDT model comprising a Transformer encoder, a multilayer perceptron, and a width learning system;
inputting the drug information codes E_i into the Transformer encoder to mine the drug features D_F, while inputting the gene expression data G_o into a multilayer perceptron to learn the gene features G_F; splicing the drug features D_F and the gene features G_F together to form drug-gene feature pairs X_DG; and inputting the feature pairs X_DG into the width learning system to obtain predicted drug sensitivity regression values;
wherein the parameters of the IBDT model are fixed after initial training, feature nodes and enhancement nodes are subsequently added to the width learning system using newly added samples, and the output weight W_DG of the width learning system is dynamically updated through an incremental learning algorithm.
2. The incremental width and deep learning drug response prediction method of claim 1, wherein in step S1, text-coding the SMILES sequence of the drug means: decomposing the SMILES sequence of the drug into single atom symbols and small-molecule sequences according to chemical prior knowledge, the atom symbols and small-molecule sequences being expressed in word-vector form;
position-coding the SMILES sequence of the drug means: encoding the position information of the drug using a dictionary lookup matrix;
the text code T_i and the position code P_i are added to obtain the drug information code E_i:
E_i = T_i + P_i
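As an illustrative sketch (not part of the claims), the additive text-plus-position encoding of step S1 can be written in a few lines of numpy; the vocabulary size, embedding dimension, and token ids below are invented for the example, and a row lookup into the embedding matrix is equivalent to multiplying it by a one-hot vector:

```python
import numpy as np

rng = np.random.default_rng(0)

L = 5        # max SMILES sequence length (padded)
V = 64       # vocabulary size (atom symbols + small-molecule substrings)
d = 8        # embedding dimension

W_T = rng.normal(size=(V, d))    # trainable word-vector matrix (text encoding)
W_P = rng.normal(size=(L, d))    # position lookup matrix (position encoding)

token_ids = np.array([3, 17, 17, 5, 0])   # toy tokenised SMILES sequence

T = W_T[token_ids]               # text codes T_i (row lookup == one-hot product)
P = W_P[np.arange(L)]            # position codes P_i
E = T + P                        # drug information codes E_i = T_i + P_i
```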
3. The incremental width and deep learning drug response prediction method of claim 2, wherein the text coding of the SMILES sequence of the drug comprises:
decomposing the SMILES sequence of the drug into single atom symbols and small-molecule sequences according to chemical prior knowledge; treating each single atom symbol and small-molecule sequence as a character-string word, constructing a word set D containing character strings of different granularities, then using the Torchtext tool library to count and label a corpus containing the SMILES sequences of all drugs, and expressing each SMILES sequence as a sequence string S = {S_1, …, S_L} represented by one-hot vectors, where S_i denotes a word in the word set D; the text code of each drug is expressed as:
T_i = W_T O_i^S
where W_T denotes a trainable word-vector matrix and O_i^S denotes the one-hot vector of the i-th character string of the sequence string S;
the position coding of the SMILES sequence of the drug comprises:
P_i = W_P O_i^P
where W_P denotes a weight matrix and O_i^P denotes the one-hot vector of the i-th position of the sequence string S.
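The decomposition into single atom symbols and small-molecule substrings can be sketched with a regular expression; the pattern below is a common community tokenisation heuristic (two-letter elements and bracket atoms kept whole), not the exact rule or word set used by the patent:

```python
import re

# Keep bracket atoms ([nH], [O-]) and two-letter elements (Br, Cl, ...) whole
# instead of splitting them into single characters.
SMILES_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|Si|Se|se|@@|[BCNOPSFIbcnops]|[0-9]|[=#\-\+\(\)/\\%@\.])"
)

def tokenize_smiles(smiles: str):
    tokens = SMILES_PATTERN.findall(smiles)
    # The tokens must reassemble the input, or the pattern missed a symbol.
    assert "".join(tokens) == smiles, "tokeniser dropped characters"
    return tokens

print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
```

Each resulting token would then be looked up in the word set D and replaced by its one-hot index before embedding.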
4. The incremental width and deep learning drug response prediction method of claim 1, wherein the drug information codes E_i, passing through the multi-head self-attention layer of the Transformer encoder, are mapped by linear transformations into a Query matrix, a Key matrix, and a Value matrix, denoted Q, K, and V respectively:
Q = E W_q, K = E W_k, V = E W_v
where W_q, W_k, W_v denote learnable weight matrices; the output expressing the attention relationships between the subsequences S_i is obtained with the attention calculation formula:
Attention(Q, K, V) = softmax(Q K^T / √d_k) V
where d_k denotes the dimension of the drug information codes E_i; the output A of the multi-head self-attention layer is then fed into a fully connected feedforward neural network:
F = σ(A W_1 + β_1) W_2 + β_2
where F denotes the output of the feedforward neural network, W_1 and W_2 denote learnable weight matrices, and β_1, β_2 denote the biases; finally, the output F is input into a multilayer perceptron to obtain the drug features D_F:
D_F = σ_3(σ_2(σ_1(F W_1^D + β_1^D) W_2^D + β_2^D) W_3^D + β_3^D)
where σ_1, σ_2, σ_3 denote nonlinear activation functions, W_1^D, W_2^D, W_3^D denote learnable weight matrices, and β_1^D, β_2^D, β_3^D denote the corresponding biases.
The gene expression data G_o are input into a multilayer perceptron to obtain the gene features G_F; the drug features D_F and gene features G_F are spliced and integrated to form the drug-gene feature pair X_DG = [D_F | G_F].
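The single-head core of the attention computation in claim 4 can be sketched in numpy (random matrices stand in for the learned weights W_q, W_k, W_v; multi-head splitting and the feedforward sub-layer are omitted for brevity):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable row-wise softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
L, d_k = 5, 8                          # sequence length, code dimension

E = rng.normal(size=(L, d_k))          # drug information codes E_i
W_q, W_k, W_v = (rng.normal(size=(d_k, d_k)) for _ in range(3))

Q, K, V = E @ W_q, E @ W_k, E @ W_v
weights = softmax(Q @ K.T / np.sqrt(d_k))   # L x L attention weights
attn = weights @ V                          # Attention(Q, K, V)
```

Row i of `weights` expresses how strongly position i attends to every other token of the SMILES sequence, which is what lets the model relate an atom to its associated chemical bonds.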
5. The incremental width and deep learning drug response prediction method of claim 4, wherein inputting the gene expression data G_o into the multilayer perceptron means: the multilayer perceptron comprises three hidden layers and three activation layers, and the gene features G_F are:
G_F = σ_6(σ_5(σ_4(G_o W_1^G + β_1^G) W_2^G + β_2^G) W_3^G + β_3^G)
where σ_4, σ_5, σ_6 denote activation functions, W_1^G, W_2^G, W_3^G denote learnable weight matrices, and β_1^G, β_2^G, β_3^G denote the corresponding biases.
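A three-hidden-layer perceptron of the kind described in claim 5 is a short composition of affine maps and activations; the layer widths and ReLU activations below are illustrative assumptions, not the patented configuration:

```python
import numpy as np

def mlp3(x, Ws, bs, acts):
    """Three hidden layers with activations sigma_4..sigma_6 applied in turn."""
    h = x
    for W, b, act in zip(Ws, bs, acts):
        h = act(h @ W + b)
    return h

rng = np.random.default_rng(0)
g, h1, h2, h3 = 32, 64, 32, 16        # gene-expression dim and hidden widths
Ws = [rng.normal(size=s) for s in [(g, h1), (h1, h2), (h2, h3)]]
bs = [np.zeros(h1), np.zeros(h2), np.zeros(h3)]
relu = lambda x: np.maximum(x, 0)

G_o = rng.normal(size=g)              # one cell line's gene expression vector
G_F = mlp3(G_o, Ws, bs, [relu, relu, relu])
```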
6. The incremental width and deep learning drug response prediction method of claim 1, wherein during initial training of the IBDT model, the feature pairs X_DG formed from the samples are input into the width learning system and mapped into n groups of feature nodes
Z^n = [Z_1, …, Z_n]
and m groups of enhancement nodes
H^m = [H_1, …, H_m];
all the feature nodes and enhancement nodes are combined to obtain the input matrix A_DG:
A_DG = [Z^n | H^m]
The weights W_DG between the feature pairs X_DG and the outputs Y are computed by a pseudo-inverse and ridge-regression learning algorithm:
W_DG = A_DG^+ Y
A_DG^+ = (λI + A_DG^T A_DG)^{-1} A_DG^T
where A_DG^+ denotes the pseudo-inverse of the input matrix A_DG; λ denotes a non-negative number tending to 0 in the ridge regression; and I denotes an identity matrix.
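A minimal numpy sketch of this initial width-learning step follows; the node counts, tanh activations, and random mapping weights are illustrative assumptions standing in for the trained IBDT front end, and with λ → 0 the ridge solution coincides with the Moore-Penrose pseudo-inverse:

```python
import numpy as np

rng = np.random.default_rng(0)
N, Fdim, n, m, k = 200, 16, 4, 3, 8   # samples, feature dim, node groups, nodes/group

X_DG = rng.normal(size=(N, Fdim))     # drug-gene feature pairs
Y = rng.normal(size=(N, 1))           # drug-sensitivity regression targets

# n groups of feature nodes Z_i and m groups of enhancement nodes H_j
Z = [np.tanh(X_DG @ rng.normal(size=(Fdim, k)) + rng.normal(size=k))
     for _ in range(n)]
Zn = np.hstack(Z)
H = [np.tanh(Zn @ rng.normal(size=(n * k, k)) + rng.normal(size=k))
     for _ in range(m)]
A_DG = np.hstack([Zn] + H)            # input matrix A_DG = [Z^n | H^m]

lam = 1e-8                            # ridge parameter, tending to 0
A_pinv = np.linalg.solve(lam * np.eye(A_DG.shape[1]) + A_DG.T @ A_DG, A_DG.T)
W_DG = A_pinv @ Y                     # output weights W_DG = A_DG^+ Y
```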
7. The incremental width and deep learning drug response prediction method of claim 6, wherein, in the IBDT model training, subsequently using the newly added samples to add feature nodes and enhancement nodes to the width learning system and dynamically updating the output weight W_DG of the learning system through the incremental learning algorithm comprises:
for a newly added sample X_a, first generating its feature pair X_DG^a with the fixed IBDT model parameters; then, in the width learning system, separately mapping the feature pair X_DG^a of the newly added sample X_a into new feature nodes and enhancement nodes, and combining all these feature nodes and enhancement nodes to obtain the input matrix A_DG^a corresponding to the newly added sample; the input matrix of the model is updated as:
A_DG^new = [A_DG ; A_DG^a]
The pseudo-inverse between the newly added feature pairs and the newly added sample outputs is calculated, and the weight information of the newly added samples is obtained with the incremental learning algorithm:
(A_DG^new)^+ = [A_DG^+ − B D^T | B]
where
B = C^+, if C ≠ 0; B = A_DG^+ D (I + D^T D)^{-1}, if C = 0
C = A_DG^a − D^T A_DG
and
D^T = A_DG^a A_DG^+
The weight information of the newly added samples is merged into the output weight W_DG of the width learning system, dynamically updating the model output weights:
W_DG^new = W_DG + B (Y_a − A_DG^a W_DG)
where Y_a denotes the label values of the newly added samples.
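As a numerical sanity check (illustrative only, with random matrices standing in for the nodes mapped from old and new samples), this Greville-style incremental update can be verified against retraining on the stacked data from scratch; when the old input matrix has full column rank, C vanishes and the second branch applies:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, Na = 120, 24, 8             # old samples, total node count, new samples

A = rng.normal(size=(N, M))       # existing input matrix A_DG
Y = rng.normal(size=(N, 1))       # drug-sensitivity labels
A_pinv = np.linalg.pinv(A)
W = A_pinv @ Y                    # initial output weights W_DG = A_DG^+ Y

Aa = rng.normal(size=(Na, M))     # nodes mapped from the newly added samples
Ya = rng.normal(size=(Na, 1))     # their label values Y_a

# Incremental update: only the output weights change, the front end is fixed.
Dt = Aa @ A_pinv                  # D^T
C = Aa - Dt @ A
if np.linalg.norm(C) > 1e-8:
    B = np.linalg.pinv(C)         # B = C^+ when C != 0
else:
    D = Dt.T                      # full-column-rank case: C = 0
    B = A_pinv @ D @ np.linalg.inv(np.eye(Na) + Dt @ D)
W_new = W + B @ (Ya - Aa @ W)     # merged output weights W_DG^new
```

The update reproduces the weights obtained by recomputing the pseudo-inverse on the stacked matrix [A_DG ; A_DG^a], which is exactly why the whole network need not be retrained.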
8. A storage medium storing a computer program which, when executed by a processor, causes the processor to perform the incremental width and deep learning drug response prediction method of any one of claims 1-7.
9. A computing device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the incremental width and deep learning drug response prediction method of any one of claims 1-7.
CN202210464986.2A 2022-04-29 2022-04-29 Incremental width and depth learning drug response prediction methods, media and devices Active CN114841261B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210464986.2A CN114841261B (en) 2022-04-29 2022-04-29 Incremental width and depth learning drug response prediction methods, media and devices


Publications (2)

Publication Number Publication Date
CN114841261A true CN114841261A (en) 2022-08-02
CN114841261B CN114841261B (en) 2024-08-02


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115761250A (en) * 2022-11-21 2023-03-07 北京科技大学 Compound inverse synthesis method and device
CN116403657A (en) * 2023-03-20 2023-07-07 本源量子计算科技(合肥)股份有限公司 Drug response prediction method and device, storage medium and electronic device
CN117275608A (en) * 2023-09-08 2023-12-22 浙江大学 Cooperative attention-based method and device for cooperative prediction of interpretable anticancer drugs

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190272468A1 (en) * 2018-03-05 2019-09-05 The Board Of Trustees Of The Leland Stanford Junior University Systems and Methods for Spatial Graph Convolutions with Applications to Drug Discovery and Molecular Simulation
CN113764038A (en) * 2021-08-31 2021-12-07 华南理工大学 Method for constructing myelodysplastic syndrome whitening gene prediction model
CN114220496A (en) * 2021-11-30 2022-03-22 华南理工大学 Deep learning-based inverse synthesis prediction method, device, medium and equipment
WO2022087540A1 (en) * 2020-10-23 2022-04-28 The Regents Of The University Of California Visible neural network framework





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant