CN116206688A - Multi-mode information fusion model and method for DTA prediction - Google Patents
- Publication number
- CN116206688A (application CN202310188140.5A)
- Authority
- CN
- China
- Prior art keywords
- target
- information
- drug
- modal
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/30—Drug targeting using structural data; Docking or binding prediction
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention provides a multi-modal information fusion model and method for DTA prediction. The model comprises a drug molecular structure information encoder, a target structure information encoder, a multi-modal balancing module and a drug-target fusion module. The drug molecular structure information encoder encodes drug string modality information with a Transformer model and extracts drug graph modality features with a GIN model. The target structure information encoder encodes target string modality information with a Transformer model and extracts target graph modality features with a GCN model. The multi-modal balancing module uses a contrastive learning method to balance and integrate the drug string and graph modality information, and likewise the target string and graph modality information. The drug-target fusion module concatenates the two modality features of the drug and the target obtained from the multi-modal balancing module for DTA prediction.
Description
Technical Field
The invention relates to the technical field of drug-target binding affinity (DTA) prediction, and in particular to a multi-modal information fusion model and method for DTA prediction.
Background
Drug discovery is the process of identifying potential novel drugs. It involves pharmacology, chemistry, biology and other fields, and generally consumes enormous economic and time costs. It has been estimated that developing a new drug costs about 2.6 billion dollars and that obtaining FDA approval can take 17 years. Over the years, with the development of computer technology, computer-aided drug discovery has become a trend, so there is an urgent need for computational models that advance the drug discovery process. Successful identification of drug-target interactions is a key step in drug discovery, and accurately quantifying the affinity of those interactions is even more important for drug development. DTA denotes the binding strength between a drug molecule and a target; in general, the more strongly a compound binds to a target, the more likely the compound is to affect the biological function of the target and to be a suitable drug candidate. Therefore, building a computational model that accurately predicts DTA can accelerate the screening of drug molecules, minimize unnecessary in vitro screening experiments, and is of great significance for drug research and development.
Many computational methods and models for DTA prediction have been proposed. Traditional molecular docking techniques predict the binding pose and binding affinity of drugs and targets by computer simulation based on the 3D structures of the target and compound molecules. Many mature molecular docking algorithms have been developed into software, such as Gold and Dock, but these techniques are very time consuming. With the development of computer technology, molecular dynamics simulation techniques appeared; for example, Elanie et al. combined a fast geometric docking algorithm with molecular-mechanics interaction energy assessment, computing the potential of each ligand atom for scoring. Such methods are more flexible and their predictions more accurate, but their computational and time costs remain high.
Most early machine learning methods were based on matrix calculations and predicted affinity from structural similarity, which greatly reduced costs. For example, He et al. proposed a method called SimBoost that predicts continuous binding affinity values for compounds and proteins. Li et al. proposed a random-forest-based molecular docking method that predicts by applying the Kronecker product of similarity matrices. However, these methods rely excessively on structural data features of the molecule, and acquiring such data is difficult and time consuming. With the rapid development of deep learning in the big data era, convolutional neural networks (CNNs), graph neural networks (GNNs) and their variants have been applied in drug discovery. Since the structural information of drugs and targets plays a critical role in DTA prediction, most existing DTA prediction methods are based on that structural information, and they can be classified into string-based and graph-based methods.
String-modality methods learn features from sequence data. For example, DeepDTA uses CNNs for feature extraction from one-dimensional representations of target sequences and drug SMILES. WideDTA builds on this by computing complementary protein domain, motif and maximum common substructure word information, and introduces a word-based sequence representation for DTA prediction. In contrast, AttentionDTA focuses on important key subsequences in drug and target sequences and introduces a bilateral multi-head attention mechanism to predict DTA. These methods attend only to the string modality of the drug SMILES and target information, which ignores spatial structure and hydrogen atom information. Furthermore, only a fixed string length is considered during embedding, which loses some useful information. To address these drawbacks, graph-modality methods have been developed. GraphDTA represents the drug molecular structure as a graph, extracting features from the drug molecular graph with GNNs and from the target sequence with a CNN. DGraphDTA performs DTA prediction using a drug molecular graph and a target structure graph, extracting features with a graph convolutional network (GCN) model. However, the drug molecular graph lacks the contextual semantic information of the string and the positional arrangement of atoms, and the target structure graph considers only the spatial structure of the target, ignoring the order of target residues and the positional information of peptide-chain residues. Therefore, the multi-modal information of drug and target structures needs to be considered systematically to obtain complete information that better predicts DTA.
Multi-modal techniques can systematically consider information from a variety of modalities. In the last decade, information fusion technology has successfully fused multi-modal information, and the use of multi-modal information has attracted researchers' attention. For example, Tuan et al. proposed a method to detect fake news by fusing multi-modal features from text and visual data, and Mou et al. proposed a deep learning model to fuse data of multiple modalities, including eye data, vehicle data and environmental data. Multi-modal information fusion has thus been widely used in various fields, and it can likewise be applied to drug discovery. For example, Deng et al. developed Graph2MDA, a variational-graph-autoencoder-based embedding method that incorporates various attributes and features of microorganisms and drugs to predict microorganism-drug associations. Lyu et al. considered the potential correlations between drugs and multi-modal data such as targets and enzymes, and designed the dual-channel MDNN framework to obtain multi-modal characterizations of drugs. These drug discovery methods use embedded representations of different drug properties expressed in different modalities, without simultaneously attending to multiple modalities of a single property. Moreover, existing DTA methods consider only the single structural property of the drug and the target, not the multiple attribute information of their different modalities.
Disclosure of Invention
The invention aims to provide a multi-modal information fusion model and method for DTA prediction. The model embeds the string and graph modality information of drugs and targets and balances the feature representations of the different modalities through a contrastive learning method, so as to output richer information for DTA prediction.
In order to solve the above technical problems, the invention adopts the following technical solution. A multi-modal information fusion model for DTA prediction comprises: a drug molecular structure information encoder, a target structure information encoder, a multi-modal balancing module and a drug-target fusion module;
the drug molecular structure information encoder encodes drug string modality information using a Transformer model, and extracts drug graph modality features using a GIN model;
the target structure information encoder encodes target string modality information using a Transformer model, and extracts target graph modality features using a GCN model;
the multi-modal balancing module uses a contrastive learning method to balance and integrate the drug string and graph modality information, and to balance and integrate the target string and graph modality information;
the drug-target fusion module concatenates the two modality features of the drug and the target obtained by the multi-modal balancing module for DTA prediction.
As another aspect of the present invention, a multi-modal information fusion method for DTA prediction includes:
step S1, embedding of the string modality;
the drug SMILES code is treated as a character string, the string is integer-encoded, and positional encodings are added to the integer codes to obtain a vector representation; feature extraction is then performed on the vector with a Transformer model to obtain the final vector representation of the SMILES string;
the target sequence is treated as a character string, the string is integer-encoded, and positional encodings are added to the integer codes to obtain a vector representation; feature extraction is then performed on the vector with a Transformer model to obtain the final vector representation of the target string;
step S2, embedding of the graph modality;
each atom is taken as a node of the drug molecular graph, the connections between atoms define its adjacency matrix, and atom attributes serve as the attribute features of the graph's nodes; the drug molecular graph and its node feature vectors are taken as input, and node embedding through a GIN model yields the representation vector of the drug molecular graph;
each residue is taken as a node of the target structure graph, the contact probability of each residue pair defines its adjacency matrix, and each residue position is scored from sequence alignment results to serve as the attribute features of the graph's nodes; the target structure graph and its node feature vectors are taken as input, and node embedding through a GCN model yields the representation vector of the target structure graph;
step S3, contrastive learning of multi-modal representations and representation fusion;
feature representations are learned by maximizing the consistency between the string modality and the graph modality; the final representations of the two modalities of the drug and the target are obtained respectively, and are then concatenated to obtain the drug and target modality information for DTA prediction.
Further, in step S1, after the drug and target strings are integer-encoded, the positional information of the string modality is captured using the arrangement order of the drug atoms and target residues; abstract features of different levels are learned from the input through a Transformer model, and a max pooling layer is then applied to obtain the final vector representations of the drug and target strings.
Further, in step S1, the positional information of the string modality is represented using the following formulas:

PE(pos,2i) = sin(pos / 10000^(2i/d_model)) (1)

PE(pos,2i+1) = cos(pos / 10000^(2i/d_model)) (2)

where pos is the position of a character in the string, i indexes the dimensions of the character encoding, and d_model is the dimension of the encoding.
Still further, the Transformer model includes an MSA layer and an MLP block. The MSA layer is expressed as:

z_m = MSA(LN(z_{l-1})) + z_{l-1}, l = 1…L (3)

where z_{l-1} is the input of the MSA layer, z_m is the output of the MSA layer, LN denotes the normalization layer, and L is the number of layers;

the MLP block contains two CNN layers and one normalization layer, and is expressed as:

z_l = MLP(LN(z_m)) + z_m, l = 1…L (4)

where z_l is the output of the MLP block.
Still further, the GIN model includes five GIN layers, each followed by a batch normalization layer, with a global max pooling layer after the last batch normalization layer. Each GIN layer uses a multi-layer perceptron model to update the representation of node x_i as follows:

x_i^(k) = MLP^(k)((1 + e) · x_i^(k-1) + Σ_{j∈N(i)} x_j^(k-1)) (5)

where k indexes the GIN layer, e is a learnable parameter or fixed scalar, and N(i) is the neighbor set of node i.
Still further, the GCN model includes three GCN layers, each activated by a ReLU function, with a global max pooling layer after the last GCN layer. Each GCN layer performs a convolution operation as follows:

H^(l+1) = σ(D̃^(-1/2) Ã D̃^(-1/2) H^(l) W^(l)) (6)

where Ã is the adjacency matrix of the target structure graph (with self-loops added), D̃ is the degree matrix of the target structure graph, σ is the activation function, and W^(l) is a learnable weight matrix.
Preferably, in step S3, when learning feature representations by maximizing the consistency of the string modality and the graph modality, any one drug or target is fixed as an anchor a, the set of that drug's or target's own modalities forms the positive sample set P, and all modalities of other drugs or targets are regarded as negative samples N. Positive pairs (a, p) and negative pairs (a, n) are generated, and contrastive learning forces all modality representations of the anchor a to be consistent while distinguishing them from the modality representations of the negative samples. The loss is computed as:

Loss = Σ_i ( −1/|P(i)| · Σ_{p∈P(i)} log( exp(sim(i,p)/T) / (exp(sim(i,p)/T) + Σ_{n∈N(i)} exp(sim(i,n)/T)) ) ) (7)

where, for each sample i, P(i) is its positive sample set, |P(i)| is the number of positive samples, p is one of the positive samples, N(i) is the negative sample set, n is one of the negative samples, sim(·,·) is a similarity function, and T is the temperature coefficient.
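As a rough illustration, the loss in Eq. (7) for a single anchor can be sketched in NumPy as below. Cosine similarity is assumed for sim(·,·), and the variable names are illustrative; the patent does not fix the similarity function:

```python
import numpy as np

def contrastive_loss(anchor, positives, negatives, T=0.1):
    """Contrastive loss for one anchor in the form of Eq. (7): pull the
    anchor's own modality representations together, push representations
    of other drugs/targets away. sim is cosine similarity (an assumption)."""
    def sim(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

    neg = sum(np.exp(sim(anchor, n) / T) for n in negatives)
    loss = 0.0
    for p in positives:
        pos = np.exp(sim(anchor, p) / T)
        loss += -np.log(pos / (pos + neg))
    return loss / len(positives)

a = np.array([1.0, 0.0])
loss_aligned = contrastive_loss(a, [np.array([1.0, 0.1])], [np.array([-1.0, 0.0])])
loss_opposed = contrastive_loss(a, [np.array([-1.0, 0.0])], [np.array([1.0, 0.1])])
```

A well-balanced embedding drives the loss toward zero when the anchor's modalities agree (loss_aligned), while disagreement keeps it large (loss_opposed).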
Compared with traditional models, the multi-modal information fusion model and method provided by the invention have superior feature-capturing capability and can provide richer, more complete modality information for DTA prediction. Specifically, the invention fully considers the string and graph modality information of the drug and the target: the string modality comprises the drug SMILES and the target sequence, and the graph modality comprises the drug molecular graph and the target structure graph. Different deep learning models are selected to extract the data features of the different modalities, the feature representations of the different modalities are balanced through a contrastive learning method, and the arrangement order of drug atoms and target residues is used to capture the positional information of the string modality, so that more useful feature information can be extracted from the SMILES and the target sequence to effectively improve DTA prediction.
Drawings
Fig. 1 is a flowchart of the application of the multi-modal information fusion model according to the present invention to DTA prediction;
FIG. 2 is a diagram of the contrastive learning framework in the multi-modal information fusion method according to the present invention;
FIG. 3 is a graph comparing MSE and CI indicators for various models on a Davis dataset in an evaluation experiment according to the present invention;
FIG. 4 is a graph comparing MSE and CI indicators for various models on a KIBA dataset in an evaluation experiment according to the present invention.
Detailed Description
The invention will be further described with reference to examples and drawings, to which reference is made, but which are not intended to limit the scope of the invention.
Traditional drug-target affinity prediction methods based on biological experiments cannot meet the requirements of drug R&D in the big data era. Although deep-learning-based drug-target binding affinity prediction models have been successful, these models consider only single-modality features of drug and target information, which are not complete or rich enough. In fact, information from the different modalities of the drug and the target is complementary, and fusing it yields more valuable information. On this basis, the invention designs a multi-modal information fusion model called FMDTA for DTA prediction, detailed as follows.
1. FMDTA
Because data of the string modality and the graph modality have respective advantages and disadvantages, the invention combines and fuses the information of the two modalities to obtain more complete information. FMDTA consists of four parts: a drug molecular structure information encoder, a target structure information encoder, a multi-modal balancing module and a drug-target fusion module. Wherein:
the drug molecular structure information encoder uses a transducer model to encode drug string modal information, and uses a GIN model to extract drug map modal information features.
The target structure information encoder encodes the target character string modal information by using a transducer model, and extracts drug pattern modal information features by using a GCN model.
The multi-modal balancing module uses a contrastive learning method to balance and integrate the drug string and graph modality information, and the target string and graph modality information.
The drug-target fusion module concatenates the two modality features of the drug and the target obtained by the multi-modal balancing module for DTA prediction.
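A minimal sketch of what the fusion module does, assuming 128-dimensional per-modality features as in the embedding settings described below; the random vectors and the linear head are illustrative stand-ins for the encoder outputs and the MLP predictor:

```python
import numpy as np

# Hypothetical 128-dim per-modality features as produced by the two encoders.
rng = np.random.default_rng(0)
drug_str, drug_graph = rng.normal(size=128), rng.normal(size=128)
target_str, target_graph = rng.normal(size=128), rng.normal(size=128)

# The fusion module concatenates the balanced modality features of the drug
# and the target; a regression head then outputs the predicted affinity.
fused = np.concatenate([drug_str, drug_graph, target_str, target_graph])
W, b = rng.normal(size=(fused.size,)) * 0.01, 0.0
affinity = float(fused @ W + b)  # linear head stands in for the MLP predictor
```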
2. The workflow of FMDTA is shown in FIG. 1. The multi-modal information fusion method provided by the invention mainly comprises three steps:
step S1, embedding a character string mode
S11, embedding of drug SMILES
Each drug SMILES code is treated as a character string, integer-encoded, and the integers are used as input. It should be noted that the model needs to be trained before practical application; to facilitate training of the FMDTA provided by the invention, each SMILES string is preferably cut or padded to a fixed length of 100 characters, and these integer sequences are used as inputs to an embedding layer that returns a 128-dimensional vector representation. Abstract features of different levels are then learned from the input using a Transformer model.
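The integer encoding with fixed-length padding can be sketched as follows; the character vocabulary here is hypothetical, since the patent does not enumerate its token set:

```python
# Hypothetical character vocabulary; the patent does not specify the token set.
SMILES_VOCAB = {ch: idx + 1
                for idx, ch in enumerate("()[]=#+-.0123456789BCFHINOPSclnors@/\\")}
MAX_LEN = 100  # fixed length stated in the text

def encode_smiles(smiles: str) -> list[int]:
    """Integer-encode a SMILES string, then cut or pad to MAX_LEN (0 = padding)."""
    ids = [SMILES_VOCAB.get(ch, 0) for ch in smiles][:MAX_LEN]
    return ids + [0] * (MAX_LEN - len(ids))

codes = encode_smiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
```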
Considering that the arrangement sequence of SMILES character strings is critical to the characteristic expression of drug molecules, the invention integrates the position information of SMILES, and the position coding is expressed by adopting the following formula:
PE(pos,2i) = sin(pos / 10000^(2i/d_model)) (1)

PE(pos,2i+1) = cos(pos / 10000^(2i/d_model)) (2)

where pos is the position of a character in the SMILES string, i indexes the dimensions of the character encoding, and d_model is the dimension of the encoding.
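Eqs. (1)-(2) can be computed directly in NumPy; the 100-token length and 128-dimensional embedding below follow the sizes stated in S11:

```python
import numpy as np

def positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding from Eqs. (1)-(2):
    PE(pos, 2i) = sin(pos / 10000^(2i/d_model)),
    PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))."""
    pe = np.zeros((max_len, d_model))
    pos = np.arange(max_len)[:, None]             # (max_len, 1)
    i = np.arange(0, d_model, 2)                  # even dimension indices
    angle = pos / np.power(10000.0, i / d_model)  # (max_len, d_model/2)
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

# Encodings for a SMILES string padded to 100 tokens with 128-dim embedding.
pe = positional_encoding(100, 128)
```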
After the positional encodings are obtained, they are added to the drug SMILES encoding information to yield the complete encoding, and abstract features of different levels are then learned from the input through a Transformer model.
For the Transformer model, the invention follows the original Transformer design, which consists of a multi-head self-attention (MSA) layer and an MLP block. The MSA layer is expressed as:
z_m = MSA(LN(z_{l-1})) + z_{l-1}, l = 1…L (3)

where z_{l-1} is the input of the MSA layer, z_m is the output of the MSA layer, LN denotes the normalization layer, and L is the number of layers.
After the MSA layer, the MLP block contains two CNN layers and one linear normalization layer, the function of which is expressed as:
z_l = MLP(LN(z_m)) + z_m, l = 1…L (4)

where z_l is the output of the MLP block.
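A single-head, pre-norm sketch of Eqs. (3)-(4) in NumPy; dense layers stand in for the two CNN layers of the MLP block, and the weights are random placeholders:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def encoder_block(z, Wq, Wk, Wv, W1, W2):
    """One pre-norm encoder block per Eqs. (3)-(4):
    z_m = MSA(LN(z)) + z ; z_out = MLP(LN(z_m)) + z_m.
    A single attention head is used for brevity (the model uses multi-head MSA)."""
    h = layer_norm(z)
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1])) @ v      # self-attention
    z_m = attn + z                                          # Eq. (3)
    z_out = np.maximum(layer_norm(z_m) @ W1, 0) @ W2 + z_m  # Eq. (4), ReLU MLP
    return z_out

rng = np.random.default_rng(0)
d = 8
z = rng.normal(size=(5, d))                     # 5 tokens, d-dim embeddings
Ws = [rng.normal(size=(d, d)) * 0.1 for _ in range(5)]
out = encoder_block(z, *Ws)
```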
Finally, a max pooling layer is applied to obtain a final vector representation of the drug string.
S12, embedding of target sequence
Each target sequence is treated as a character string and input after integer encoding. Similarly, for ease of training, each target sequence is cut or padded to a fixed length of 1000 residues, and these integer sequences are used as inputs to an embedding layer that returns a 128-dimensional vector representation. Likewise, because the arrangement order of target residues is critical to expressing the structural features of the target, the positional information is encoded, abstract features of different levels are learned from the input with a Transformer model, and a max pooling layer is then used to obtain the final vector representation of the target string.
Step S2, embedding of graph modalities
S21, embedding of the drug molecular graph
The drug molecular graph data in the invention come from GraphDTA [Jiang, M.; Li, Z.; Zhang, S.; Wang, S.; Wang, X.; Yuan, Q.; Wei, Z. Drug-target affinity prediction using graph neural network and contact maps. RSC Advances 2020, 10, 20701-20712]. Each atom is used as a node of the drug molecular graph, and the connections between atoms define its adjacency matrix. The relevant attributes of the atoms serve as the attribute features of the graph's nodes. The drug molecular graph and its node feature vectors are taken as input, and node embedding is performed through a GIN (Graph Isomorphism Network) model.
Specifically, the GIN model includes five GIN layers, each of which uses a multi-layer perceptron (MLP) model to update the representation of node x_i as follows:

x_i^(k) = MLP^(k)((1 + e) · x_i^(k-1) + Σ_{j∈N(i)} x_j^(k-1)) (5)

where k indexes the GIN layer, e is a learnable parameter or fixed scalar, and N(i) is the neighbor set of node i.
In the GIN model, a batch normalization layer follows each GIN layer, and a global max pooling layer is added at the end to obtain the representation vector of the drug molecular graph.
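The GIN update above can be sketched numerically. This is a minimal NumPy illustration of the aggregation rule, with a single linear+ReLU stage standing in for the full MLP and a toy 3-atom molecule; the patent's actual layer widths and MLP depth are not reproduced.

```python
import numpy as np

def gin_layer(X: np.ndarray, A: np.ndarray, W: np.ndarray, eps: float = 0.0) -> np.ndarray:
    """One GIN update: x_i' = MLP((1+eps)*x_i + sum of neighbor features).
    X: node features (n, d); A: adjacency matrix (n, n); W: weights (d, d_out).
    A single linear+ReLU stage stands in for the MLP here."""
    aggregated = (1.0 + eps) * X + A @ X   # self term plus neighbor sum
    return np.maximum(aggregated @ W, 0.0)

# Toy molecule: atoms 0-1 and 1-2 bonded; one-hot atom features,
# identity weights so the aggregation structure stays visible.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
out = gin_layer(np.eye(3), A, np.eye(3))
print(out)
```

Each output row is the atom's own feature plus the sum of its neighbors' features, which is exactly the (1+ε)-weighted sum the formula prescribes at ε = 0.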
S22, embedding of the target structure graph
The target structure graph data in the invention are taken from DGraphDTA [Jiang, M.; Li, Z.; Zhang, S.; Wang, S.; Wang, X.; Yuan, Q.; Wei, Z. Drug–target affinity prediction using graph neural network and contact maps. RSC Advances 2020, 10, 20701–20712]. The invention takes each residue as a node of the target structure graph and uses the contact probability of each residue pair as the adjacency matrix of the target graph, so that the spatial information of the protein is well preserved. Each residue position is scored from the sequence alignment results and used as the feature vector of the residue node. The target structure graph and its node feature vectors are then taken as input, and node embedding is performed by a GCN model.
Specifically, the GCN model comprises three GCN layers, each of which performs one convolution operation:

H^(l+1) = σ( D̃^(-1/2) Ã D̃^(-1/2) H^(l) W^(l) )

where Ã is the adjacency matrix of the target structure graph (with self-loops), D̃ is the degree matrix of the target structure graph, σ is the activation function, and W is the learnable weight matrix.
Each GCN layer of the GCN model is activated by a ReLU function, and a global max pooling layer is added after the last GCN layer to obtain the representation vector of the target structure graph.
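The convolution above can be sketched in NumPy. This is a toy-sized illustration of the normalized propagation rule with ReLU as σ; the patent's actual layer widths are assumptions not reproduced here.

```python
import numpy as np

def gcn_layer(A: np.ndarray, H: np.ndarray, W: np.ndarray) -> np.ndarray:
    """One GCN convolution: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W).
    A: adjacency (n, n); H: node features (n, d); W: weights (d, d_out)."""
    A_hat = A + np.eye(A.shape[0])                        # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W, 0.0)

# Two residues in contact with each other, one-hot features, identity weights.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
out = gcn_layer(A, np.eye(2), np.eye(2))
print(out)
```

With both residues connected and self-loops added, every normalized entry becomes 0.5, i.e. each residue averages itself with its neighbor.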
S3, contrastive learning and fusion of multi-modal representations
When extracting features from data of different modalities, on the one hand, since the internal structures of different drugs or targets may contain many similar functional groups or residues, the encoded features of different drugs/targets can become blurred and similar. On the other hand, because the embedding methods of the two modalities differ substantially, the feature representations of the two modalities of the same drug/target differ markedly, while the feature representations of different drugs/targets within the same modality differ little. A suitable fusion method therefore needs to be chosen to balance the information embedded in the different modalities, so that the multi-modal information can complement itself.
To balance the information embedded in the different modalities, the invention adopts contrastive learning to capture the interaction between modalities. As shown in FIG. 2, in FMDTA we learn the feature representations by maximizing the consistency of the string and graph modalities. Specifically, for drug i, fixing the drug itself as anchor a, a set P of positive samples consisting of the drug's own modalities is obtained, while the representations of all modalities of the other drugs are treated as negative samples N, generating positive pairs (a, p) and negative pairs (a, n). Contrastive learning then forces all modal representations of the anchor a to agree with one another and to be distinguished from all modal representations of the negative samples. During contrastive learning, the embedded representation of the drug (target) is iteratively updated by computing the loss, so that the representations of the two modalities of the same drug (target) move closer while those of different drugs (targets), within or across modalities, move further apart; this balances the embedded representations and yields the final embedding. The loss is computed as:

L = Σ_i ( −1/|P(i)| ) Σ_{p∈P(i)} log( exp(z_i·z_p/T) / Σ_{a∈P(i)∪N(i)} exp(z_i·z_a/T) )

where, for each sample i, z_i is its embedded representation, P(i) is its positive sample set, |P(i)| is the number of positive samples, p is one of the positives, N(i) is the negative sample set, n is a negative therein, and T is the temperature coefficient.
The embedded information of the different modalities of the target is learned by the same contrastive procedure.
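The contrastive objective can be sketched as follows. This is a plain NumPy version of the standard supervised contrastive form reconstructed from the symbol definitions above; rows sharing a label play the role of the two modalities of one drug/target, and all other rows are negatives. Variable names and the toy embeddings are illustrative.

```python
import numpy as np

def supcon_loss(Z: np.ndarray, labels, T: float = 0.5) -> float:
    """Supervised contrastive loss over L2-normalized embeddings Z (n, d).
    Rows with equal labels are each other's positives; the rest are negatives."""
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    sim = Z @ Z.T / T                       # temperature-scaled similarities
    n = len(labels)
    loss = 0.0
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        others = [j for j in range(n) if j != i]
        denom = np.sum(np.exp(sim[i, others]))
        loss += -sum(np.log(np.exp(sim[i, p]) / denom) for p in pos) / len(pos)
    return loss / n

labels = [0, 0, 1, 1]
Z_aligned = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])  # modalities agree
Z_mixed = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])    # modalities disagree
print(supcon_loss(Z_aligned, labels), supcon_loss(Z_mixed, labels))
```

The aligned embeddings, where the two "modalities" of each item already agree, incur a lower loss than the mismatched ones, which is exactly the pressure the fusion module exploits.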
After the final representations of the two modalities of the drug and the target are obtained, the drug and target modal information used for DTA prediction is obtained by concatenating the two, after which the DTA can be predicted through two fully connected layers.
3. To verify the feasibility and superiority of the model and method of the invention, a validation experiment is carried out next.
3.1 Experimental data
This experimental evaluation uses the benchmark data from DeepDTA [Öztürk, H.; Özgür, A.; Ozkirimli, E. DeepDTA: deep drug–target binding affinity prediction. Bioinformatics 2018, 34, i821–i829], comprising the Davis [Davis, M.I.; Hunt, J.P.; Herrgard, S.; Ciceri, P.; Wodicka, L.M.; Pallares, G.; Hocker, M.; Treiber, D.K.; Zarrinkar, P.P. Comprehensive analysis of kinase inhibitor selectivity. Nature Biotechnology 2011, 29, 1046–1051] and KIBA [Tang, J.; Szwajda, A.; Shakyawar, S.; Xu, T.; Hintsanen, P.; Wennerberg, K.; Aittokallio, T. Making sense of large-scale kinase inhibitor bioactivity data sets: a comparative and integrative analysis. Journal of Chemical Information and Modeling 2014, 54, 735–743] data sets.
The Davis dataset contains 72 compounds and 442 proteins together with their corresponding affinity values, measured as Kd values (kinase dissociation constants) and ranging from 5.0 to 10.8. The average SMILES string length of the compounds is 64, and the average target sequence length is 788.
The KIBA dataset combines kinase inhibitor bioactivities from different sources (e.g., Ki, Kd and IC50) and contains the binding affinities of 2,116 drugs and 229 targets, measured as KIBA scores ranging from 0.0 to 17.2. The average SMILES string length of the compounds is 58, and the average target sequence length is 728. The data are summarized in Table 1.
Table 1 Dataset statistics
3.2 evaluation index
This experiment uses Mean Squared Error (MSE), the Concordance Index (CI) and the r_m^2 index as the evaluation metrics of model performance.
MSE is an evaluation metric commonly used in regression tasks; it measures the difference between the predicted value p_i and the actual value y_i of each sample by averaging the squared differences. The smaller the MSE, the better the effect. The formula for MSE is:

MSE = (1/n) Σ_{i=1}^{n} (p_i − y_i)^2
CI is a metric that evaluates the consistency between the rank of the predicted values and the rank of the true values; the higher the value, the better the predictive effect of the model. CI is computed as:

CI = (1/Z) Σ_{d_x > d_y} h(b_x − b_y)

where b_x is the predicted value for the larger affinity d_x, b_y is the predicted value for the smaller affinity d_y, and Z is a normalization constant; h(x) is the step function:

h(x) = 1 if x > 0; 0.5 if x = 0; 0 if x < 0
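The CI computation can be sketched directly from the formula above; the function name is illustrative and the pairwise loop is the naive O(n²) form rather than an optimized one.

```python
def concordance_index(y_true, y_pred):
    """CI: fraction of affinity pairs whose predicted order matches the true order.
    Uses h(x) = 1 if x > 0, 0.5 if x == 0, 0 if x < 0; Z counts comparable pairs."""
    num, Z = 0.0, 0
    n = len(y_true)
    for i in range(n):
        for j in range(n):
            if y_true[i] > y_true[j]:          # only pairs with a strict true ordering
                Z += 1
                diff = y_pred[i] - y_pred[j]
                num += 1.0 if diff > 0 else (0.5 if diff == 0 else 0.0)
    return num / Z

print(concordance_index([1, 2, 3], [1, 2, 3]))  # perfectly ordered predictions give 1.0
```

A perfectly anti-ordered prediction yields 0.0, and constant predictions yield 0.5, matching the step-function definition.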
The r_m^2 index is used to evaluate the external predictive performance of a QSAR (quantitative structure–activity relationship) model. Its formula is:

r_m^2 = r^2 × (1 − √(r^2 − r_0^2))

where r^2 and r_0^2 are the squared correlation coefficients between predicted and observed values with and without intercept, respectively.
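The r_m^2 computation is a one-liner once r^2 and r_0^2 are known. The formula below is the standard r_m^2 form (the original equation image is not reproduced in this text), and the input values are hypothetical.

```python
import math

def rm2(r2: float, r02: float) -> float:
    """r_m^2 = r^2 * (1 - sqrt(|r^2 - r0^2|)): the standard external-validation
    index, where r^2 / r0^2 are the squared correlation coefficients between
    predicted and observed values with and without intercept."""
    return r2 * (1.0 - math.sqrt(abs(r2 - r02)))

print(rm2(0.81, 0.80))  # hypothetical r^2 and r0^2
```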
For training, the experimental evaluation used a server with two Intel(R) Xeon(R) Gold 5215 2.5 GHz CPUs, 256 GB RAM and three NVIDIA GV100GL GPUs. The hyper-parameters used in the experiments are shown in Table 2.
TABLE 2 Hyper-parameters
Hyper-parameters | Setting |
Learning rate | 0.0005 |
Batch size | 512 |
Optimizer | Adam |
GIN layers | 5 |
GCN layers | 3 |
Transformer_embedding_dim | 128 |
GNNS_embedding_dim | 128 |
3.3 evaluation strategy
The experiments use 5-fold cross validation to verify the performance of the model. In each fold, the dataset is randomly split into training, validation and test sets at a ratio of 3:1:1. Finally, the model selected on the validation set is evaluated on the test set.
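One plausible way to produce the 3:1:1 partition per fold is a seeded random shuffle; the patent does not specify its exact fold assignment, so the helper below is an assumption for illustration.

```python
import random

def split_311(indices, seed=0):
    """Randomly partition indices into train/validation/test at a 3:1:1 ratio.
    The seeded shuffle and split points are illustrative, not the patent's scheme."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    n = len(idx)
    a, b = 3 * n // 5, 4 * n // 5
    return idx[:a], idx[a:b], idx[b:]

train, val, test = split_311(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```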
3.4 model comparison
In this experiment, classical DTA prediction models are divided into two classes and compared against FMDTA. The methods that consider only the string modality are DeepDTA, WideDTA and AttentionDTA; these models all use CNNs to encode the strings.
DeepDTA: consists of two independent CNN modules that learn feature representations of the drug SMILES and the target sequence respectively, and feeds the representations into three fully connected layers for DTA prediction.
WideDTA: introduces protein domain and motif information and maximum common substructure words on top of DeepDTA to improve DTA prediction.
AttentionDTA: introduces a two-sided multi-head attention mechanism to focus on the key subsequences of drug and protein sequences that matter when predicting their affinity.
The methods that consider the graph modality are GraphDTA, DGraphDTA, DeepGLSTM and SAG-DTA, all of which use GNNs to encode the graphs.
GraphDTA: introduces the drug molecular graph, uses GNNs to learn the molecular-graph features of the drug and a CNN to learn the target sequence representation, then concatenates the two parts and predicts the DTA through fully connected layers.
DGraphDTA: introduces the target structure graph on top of GraphDTA, uses GNNs to learn the features of drugs and targets, and concatenates the two parts to predict the DTA through fully connected layers.
SAG-DTA: drug molecular graph characterization was improved by a self-attention mechanism on the basis of GraphDTA for DTA prediction.
DeepGLSTM: a method based on graph convolutional networks and LSTMs that encodes drugs and targets respectively to predict the DTA.
TABLE 3 results of DTA predictive model on DAVIS dataset
Table 3 gives the performance of FMDTA and the baseline methods on the Davis dataset. To keep the comparison fair, the same hyper-parameters and evaluation metrics were used for all models. FMDTA is clearly better than the methods that rely solely on the string or graph modality on all three metrics, with an MSE of 0.195, a CI of 0.909 and an r_m^2 of 0.748. Furthermore, the graph-modality methods are generally superior to the string-modality methods, because the fixed-length truncation of the string-modality data inevitably loses much information. Although the graph-modality data are relatively complete, features tend to be over-smoothed during graph embedding. It is therefore necessary to fuse the multimodal information of the drug and the target by a suitable method so that the modalities supplement each other.
To further test the generalization of the proposed method, this experiment evaluates the model on the KIBA dataset with the same hyper-parameters as on the Davis dataset. As shown in Table 4, FMDTA still performs excellently: its MSE is 0.133, its CI is 0.899 and its r_m^2 is 0.801. These results demonstrate the effectiveness of the model of the invention for DTA prediction and its good generalization ability.
TABLE 4 results of DTA predictive model on KIBA dataset
3.5 ablation study
3.5.1 validity analysis of multimodal information fusion
The model of the invention was compared with DeepDTA, GraphDTA and DGraphDTA on the Davis dataset. To ensure a fair comparison, identical modules in the model and the baselines use the same parameters. FMDTA (w/o PC) denotes the variant that simply concatenates and fuses the two modalities of the drug and the target, without string positional encoding or contrastive learning.
As can be seen from FIG. 3, fusing the information of both modalities is superior to using single-modality information alone. Combining the structural information of the drug and the target yields more complete structural and spatial information, overcoming the limitation of DeepDTA, which uses only the string modality. The string-modality information of the drug SMILES and the target sequence captures the ordering and contextual semantic information of the strings, overcoming the limitation of DGraphDTA, which uses only structural information. GraphDTA uses the drug molecular structure graph and the target sequence, but still only single-modality information for each side, so its information is incomplete. This shows that the information of the two modalities can supplement each other, and fusing them effectively improves the model's DTA prediction. Meanwhile, GraphDTA outperforms DeepDTA and DGraphDTA, indicating that drawing on different modalities for the drug and the target protein information also works better.
To further verify the effectiveness of multimodal information fusion on the DTA task, FMDTA (w/o PC) was compared with DeepDTA, GraphDTA and DGraphDTA on the KIBA dataset. FIG. 4 shows the performance of FMDTA (w/o PC) and the unimodal models on the KIBA dataset. The fusion of multimodal information is equally excellent on KIBA, indicating that the method is rational and effective.
3.5.2 validity analysis of string position information coding and multimodal contrast learning
This section aims to demonstrate the importance of the positional encoding of the string modality and of multi-modal contrastive learning. As described above, the order of the atoms in the SMILES and the order of the target sequence residues are critical to the molecular representation. It is equally important to learn the feature representation by maximizing the consistency of the string and graph modalities. A comprehensive ablation study is therefore deployed to explore the necessity of each individual module, comparing on the Davis and KIBA datasets.
The three variants of FMDTA are:
(1) FMDTA (w/o PC): FMDTA without the string-modality positional encoding component and the multi-modal contrastive learning component; it directly concatenates the two separate modalities of the drug and the target protein.
(2) FMDTA (w/o C): FMDTA without the multi-modal contrastive learning component; it extracts the positional information of the string modality but does not account for the interaction between modality features.
(3) FMDTA (w/o P): FMDTA without the string-modality positional encoding; it balances and fuses the string and graph modality features through the contrastive learning component.
Table 5 results of ablation study on positional information and contrast learning
As the results in Table 5 show, adding the string-modality positional encoding module improves the MSE by 0.5% and the CI by 0.7% on the Davis dataset. On the KIBA dataset the MSE improvement is not significant, while the CI improves by 0.3%. After adding the multi-modal contrastive representation learning module, the MSE and CI improve significantly by 2% and 1% respectively on the Davis dataset, and by 0.4% and 0.7% respectively on the KIBA dataset. Adding both the positional encoding module and the contrastive learning module brings the most notable improvement, with MSE gains of 2.8% and 1.5% and CI gains of 0.7% and 1.2% on the Davis and KIBA datasets, respectively. From this it can be concluded that:
(1) Adding the positional encoding of the strings captures more complete information and improves model performance.
(2) Contrastive learning lets information from different modalities interact during encoding, yielding a more balanced representation of the features of the different modalities and a more flexible model.
The foregoing embodiments are preferred embodiments of the present invention, and in addition, the present invention may be implemented in other ways, and any obvious substitution is within the scope of the present invention without departing from the concept of the present invention.
In order to facilitate understanding of the improvements of the present invention over the prior art, some of the figures and descriptions of the present invention have been simplified, and some other elements have been omitted from this document for clarity, as will be appreciated by those of ordinary skill in the art.
Claims (8)
1. The multi-modal information fusion model for DTA prediction is characterized by comprising a drug molecular structure information encoder, a target structure information encoder, a multi-modal balancing module and a drug target fusion module;
the drug molecular structure information encoder encodes the drug string modal information using a Transformer model and extracts the drug graph modal information features using a GIN model;
the target structure information encoder encodes the target string modal information using a Transformer model and extracts the target graph modal information features using a GCN model;
the multi-modal balancing module balances and integrates the drug string and graph modal information, and the target string and graph modal information, using a contrastive learning method;
the drug–target fusion module concatenates the two modal features of the drug and the target obtained by the multi-modal balancing module for DTA prediction.
2. A multi-modal information fusion method for DTA prediction, comprising:
step S1, embedding a character string mode;
taking the drug SMILES code as a character string, integer-encoding the string, merging the positional encodings to obtain a vector representation, and extracting features from the vector with a Transformer model to obtain the final vector representation of the SMILES string;
taking the target sequence as a character string, integer-encoding the string, merging the positional encodings to obtain a vector representation, and extracting features from the vector with a Transformer model to obtain the final vector representation of the target string;
step S2, embedding a graph mode;
taking each atom as a node of the drug molecular graph, the bonds between atoms as the adjacency matrix of the drug molecular graph, and the atom attributes as the attribute features of the drug molecular graph nodes; taking the drug molecular graph and its node feature vectors as input, and embedding the nodes through a GIN model to obtain the representation vector of the drug molecular graph;
taking each residue as a node of the target structure graph, the contact probability of each residue pair as the adjacency matrix of the target structure graph, and the score of each residue position from the sequence alignment results as the attribute feature of the target structure graph nodes; taking the target structure graph and its node feature vectors as input, and embedding the nodes through a GCN model to obtain the representation vector of the target structure graph;
step S3, comparing and learning multi-mode representation and fusing representation;
learning the feature representations by maximizing the consistency of the string modality and the graph modality, obtaining the final representations of the two modalities of the drug and the target respectively, and then concatenating them to obtain the drug and target modal information for DTA prediction.
3. The multi-modal information fusion method for DTA prediction as recited in claim 2, wherein: in step S1, after the drug and target strings are integer-encoded, the positional information of the string modality is captured using the arrangement of drug atoms and target residues; abstract features at different levels are learned from the input by a Transformer model, and a max pooling layer is applied to obtain the final vector representations of the drug and target strings.
4. The multi-modal information fusion method for DTA prediction as recited in claim 3, wherein: in step S1, the positional information of the string modality is represented by the following formulas:

PE_(pos,2i) = sin(pos / 10000^(2i/d_model)) (1)

PE_(pos,2i+1) = cos(pos / 10000^(2i/d_model)) (2)

where pos is the position of a character in the string, i is the dimension index of the character encoding, and d_model is the encoding dimension of the character.
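Equations (1)–(2) can be sketched in NumPy. The function name and the toy dimensions (the patent's Transformer_embedding_dim of 128 and the 1000-character length) are taken from the document; everything else is illustrative.

```python
import numpy as np

def positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding per equations (1)-(2):
    even dimensions get sin(pos/10000^(2i/d_model)), odd dimensions the cosine."""
    pos = np.arange(max_len)[:, None]        # character positions in the string
    i = np.arange(0, d_model, 2)[None, :]    # even embedding dimensions
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = positional_encoding(1000, 128)
print(pe.shape, pe[0, 0], pe[0, 1])  # at position 0: sin(0) = 0, cos(0) = 1
```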
5. The multi-modal information fusion method for DTA prediction as recited in claim 4, wherein: the Transformer model comprises an MSA (multi-head self-attention) layer and an MLP block, the function of the MSA layer being expressed as:

z_m = MSA(LN(z_{l-1})) + z_{l-1}, l = 1...L (3)

where z_{l-1} is the input of the MSA layer, z_m is the output of the MSA layer, LN denotes the normalization layer, and L is the number of layers;

the MLP block contains two CNN layers and one normalization layer, its function being expressed as:

z_l = MLP(LN(z_m)) + z_m, l = 1...L (4)

where z_l represents the output of the MLP block.
6. The multi-modal information fusion method for DTA prediction as recited in claim 5, wherein: the GIN model comprises five GIN layers, each followed by a batch normalization layer, the last batch normalization layer being connected to a global max pooling layer; each GIN layer updates the node feature x_i through a multi-layer perceptron model as:

x_i^(k) = MLP^(k)( (1 + ε) · x_i^(k-1) + Σ_{j∈N(i)} x_j^(k-1) )

where k indexes the GIN layer, ε is a learnable parameter or fixed scalar, and N(i) is the neighbor set of node i.
7. The multi-modal information fusion method for DTA prediction as recited in claim 6, wherein: the GCN model comprises three GCN layers, each activated by a ReLU function, with a global max pooling layer connected after the last GCN layer; each GCN layer performs one convolution operation:

H^(l+1) = σ( D̃^(-1/2) Ã D̃^(-1/2) H^(l) W^(l) )

where Ã is the adjacency matrix of the target structure graph, D̃ is its degree matrix, σ is the activation function, and W is the learnable weight matrix.
8. The multi-modal information fusion method for DTA prediction as recited in claim 7, wherein: in step S3, when learning the feature representations by maximizing the consistency of the string and graph modalities, any drug or target is fixed as anchor a, a set of positive samples P consisting of the multiple modalities of that drug or target itself is obtained, the representations of all modalities of other drugs or targets are treated as negative samples N, and positive pairs (a, p) and negative pairs (a, n) are generated; contrastive learning forces all modal representations of the anchor a to agree and to be distinguished from all modal representations of the negative samples, the Loss being computed as:

L = Σ_i ( −1/|P(i)| ) Σ_{p∈P(i)} log( exp(z_i·z_p/T) / Σ_{a∈P(i)∪N(i)} exp(z_i·z_a/T) )
where, for each sample i, z_i is its embedded representation, P(i) is its positive sample set, |P(i)| is the number of positive samples, p is one of the positive samples, N(i) is the negative sample set, n is one of the negative samples, and T is the temperature coefficient.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310188140.5A CN116206688A (en) | 2023-03-02 | 2023-03-02 | Multi-mode information fusion model and method for DTA prediction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310188140.5A CN116206688A (en) | 2023-03-02 | 2023-03-02 | Multi-mode information fusion model and method for DTA prediction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116206688A true CN116206688A (en) | 2023-06-02 |
Family
ID=86510874
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310188140.5A Pending CN116206688A (en) | 2023-03-02 | 2023-03-02 | Multi-mode information fusion model and method for DTA prediction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116206688A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116825234A (en) * | 2023-08-30 | 2023-09-29 | 江西农业大学 | Multi-mode information fusion medicine molecule activity prediction method and electronic equipment |
CN116825234B (en) * | 2023-08-30 | 2023-11-07 | 江西农业大学 | Multi-mode information fusion medicine molecule activity prediction method and electronic equipment |
CN117132591A (en) * | 2023-10-24 | 2023-11-28 | 杭州宇谷科技股份有限公司 | Battery data processing method and system based on multi-mode information |
CN117132591B (en) * | 2023-10-24 | 2024-02-06 | 杭州宇谷科技股份有限公司 | Battery data processing method and system based on multi-mode information |
CN118097665A (en) * | 2024-04-25 | 2024-05-28 | 云南大学 | Chemical molecular structure identification method, device and medium based on multi-stage sequence |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhao et al. | HyperAttentionDTI: improving drug–protein interaction prediction by sequence-based deep learning with attention mechanism | |
Chen et al. | Alchemy: A quantum chemistry dataset for benchmarking ai models | |
Yuan et al. | FusionDTA: attention-based feature polymerizer and knowledge distillation for drug-target binding affinity prediction | |
CN116206688A (en) | Multi-mode information fusion model and method for DTA prediction | |
Li et al. | Protein contact map prediction based on ResNet and DenseNet | |
Cheng et al. | IIFDTI: predicting drug–target interactions through interactive and independent features based on attention mechanism | |
CN113421658B (en) | Drug-target interaction prediction method based on neighbor attention network | |
Kim et al. | Bayesian neural network with pretrained protein embedding enhances prediction accuracy of drug-protein interaction | |
Pan et al. | SubMDTA: drug target affinity prediction based on substructure extraction and multi-scale features | |
CN116013428A (en) | Drug target general prediction method, device and medium based on self-supervision learning | |
Shen et al. | Clustering-driven deep adversarial hashing for scalable unsupervised cross-modal retrieval | |
CN115472221A (en) | Protein fitness prediction method based on deep learning | |
CN118038995B (en) | Method and system for predicting small open reading window coding polypeptide capacity in non-coding RNA | |
Tian et al. | GTAMP-DTA: Graph transformer combined with attention mechanism for drug-target binding affinity prediction | |
Zhou et al. | Accurate and definite mutational effect prediction with lightweight equivariant graph neural networks | |
Hu et al. | Improving Protein-Protein Interaction Prediction Using Protein Language Model and Protein Network Features | |
Enireddy et al. | OneHotEncoding and LSTM-based deep learning models for protein secondary structure prediction | |
CN116646001A (en) | Method for predicting drug target binding based on combined cross-domain attention model | |
Wang et al. | Sparse imbalanced drug-target interaction prediction via heterogeneous data augmentation and node similarity | |
Jha et al. | Prediction of Protein-Protein Interactions Using Vision Transformer and Language Model | |
Zhijian et al. | GDGRU-DTA: predicting drug-target binding affinity based on GNN and double GRU | |
Zhang et al. | A Multi-perspective Model for Protein–Ligand-Binding Affinity Prediction | |
Zhang et al. | GANs for molecule generation in drug design and discovery | |
Halsana et al. | DensePPI: A Novel Image-Based Deep Learning Method for Prediction of Protein–Protein Interactions | |
Zhang et al. | An end-to-end method for predicting compound-protein interactions based on simplified homogeneous graph convolutional network and pre-trained language model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||