CN116206775A - Multi-dimensional characteristic fusion medicine-target interaction prediction method - Google Patents

Multi-dimensional characteristic fusion medicine-target interaction prediction method

Info

Publication number
CN116206775A
Authority
CN
China
Prior art keywords
target
drug
information
medicine
interaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310038717.4A
Other languages
Chinese (zh)
Inventor
车超
姚奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University
Original Assignee
Dalian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University filed Critical Dalian University
Priority to CN202310038717.4A priority Critical patent/CN116206775A/en
Publication of CN116206775A publication Critical patent/CN116206775A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00ICT specially adapted for the handling or processing of medical references
    • G16H70/40ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/10ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medicinal Chemistry (AREA)
  • Theoretical Computer Science (AREA)
  • Epidemiology (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Toxicology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a drug-target interaction prediction method that fuses multi-dimensional features. Information on interacting drugs and targets, together with related diseases and drug side effects, is extracted from medical databases and used to construct a heterogeneous network. Topological features of the heterogeneous network are extracted with a heterogeneous graph attention neural network. The SMILES molecular sequence of each drug is represented as a molecular graph, from which drug feature information is extracted; target feature information is extracted at the same time. The extracted drug and target feature information is fused into the message-passing process of the heterogeneous graph attention neural network for training; the model is saved and then used to predict drug-target relationships. The invention makes effective use of the biological feature information of drugs and targets, predicts drug-target relationships with higher accuracy, improves the efficiency and accuracy of verifying drug-target relationships, shortens the drug development cycle, and greatly reduces the cost of developing new drugs.

Description

Multi-dimensional characteristic fusion medicine-target interaction prediction method
Technical Field
The invention relates to the technical field of medical artificial intelligence, and in particular to a drug-target interaction prediction method that fuses multi-dimensional features.
Background
New drug development is a lengthy and expensive process, typically taking 10-17 years from initial idea to market, with capital investment of between 7 and 27 billion dollars. Discovering new indications for existing drugs, by contrast, offers low development cost and short development time. Reusing existing drugs to treat common and rare diseases is therefore becoming increasingly attractive. Predicting drug-target interactions is an essential step in identifying new candidate compounds with potential therapeutic effects. A drug acts in the human body by interacting with various targets; it can enhance or inhibit their function and thereby exert a regulatory effect that treats a given disease. Identifying drug-target interactions thus helps to explain a drug's mechanism of action and plays a vital role in discovering new targets and in drug repositioning.
Currently, structure-based methods, ligand-similarity-based methods and network-based methods are the main approaches to drug-target interaction prediction. Structure-based methods generally require the three-dimensional structure of the protein, and they often perform poorly on proteins whose structure is unknown. Ligand-similarity-based methods predict from knowledge of known ligands; if the compound of interest is not represented in the target's ligand library, such methods cannot yield a reliable prediction. Network-based methods make full use of potential correlations between drugs and targets and have become a mainstream technique for drug-target interaction prediction. Inspired by information propagation and clustering tasks in deep learning, drug-target prediction can mine large-scale data with graph neural networks, among which graph convolutional network-based methods stand out. The large heterogeneous network formed by drugs and related data holds a great deal of useful hidden information, and processing it with a graph convolutional network can effectively mine the potential associations in the network, which benefits drug discovery research. However, these methods generally ignore biological knowledge, such as the structural properties encoded in compound sequences, so they cannot capture the latent features in the data, and model performance still has considerable room for improvement.
Disclosure of Invention
The invention aims to provide a graph neural network model that integrates the molecular structure information of drugs and the biological structure information of proteins; the model automatically predicts the interaction relationship between a drug and a target, improving verification efficiency and reducing verification cost.
To achieve the above purpose, the technical scheme of the application is a drug-target interaction prediction method fusing multi-dimensional features, comprising:
step 1: extracting information on interacting drugs and targets, and on related diseases and drug side effects, from medical databases, preprocessing the information, and constructing a heterogeneous network;
step 2: extracting network topology features of the heterogeneous network using a heterogeneous graph attention neural network;
step 3: representing the SMILES molecular sequence of each drug as a molecular graph, and extracting drug structure feature information using a molecular attention Transformer network;
step 4: embedding the target sequence information, processing it with a convolutional neural network and a bidirectional long short-term memory network (BiLSTM), and extracting target structure feature information;
step 5: fusing the extracted drug structure feature information and target structure feature information into the message-passing process of the heterogeneous graph attention neural network;
step 6: training and optimizing a prediction model using a cross-entropy loss function, and then saving the prediction model;
step 7: loading the prediction model, inputting information on the drug and target to be predicted, predicting the relationship between them, and outputting the prediction result.
Further, step 1 is specifically implemented as follows:
step 1.1: screening drug information and target information from the medical database, and deleting drugs and targets that have no interaction relationship;
step 1.2: obtaining from the medical database the SMILES molecular sequence of each drug and the sequence information of each target, and using them as the biological feature representations of the drug and the target, respectively;
step 1.3: extracting information on the diseases and drug side effects related to the drugs and targets;
step 1.4: referring to FIG. 2 (a), taking the extracted drugs, targets, diseases and drug side effects as nodes and the associations between them as edges, constructing a heterogeneous network G = (V, E), where V is the node set and E is the edge set;
step 1.5: organizing each interacting drug-target pair into the form <drug number, target number, label>, with the label set to 1;
step 1.6: at a positive-to-negative ratio of 1:10, randomly sampling drug-target pairs with no known relationship as negative examples, with the label set to 0.
Further, step 2 is specifically implemented as follows:
the heterogeneous network has an initial node embedding function $f_0: V \to \mathbb{R}^d$, where $f_0(v)$ is the d-dimensional embedding of each node v; the aggregation of neighbor node information for node v is defined as:
Figure BDA0004050414580000041
where $\sigma(\cdot)$ is the nonlinear activation function used in the propagation of one neural network layer, K is the number of attention layers, $N_v$ is the set of all neighbor nodes of node v, W is a shared weight parameter, a is the weight vector of the attention mechanism, and AconC is a new type of adaptively learnable activation function.
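The formula image is not reproduced in this text, so the following is only a sketch of how an attention-weighted neighbor aggregation with an AconC activation might look. It assumes the published ACON-C form, (p1 - p2) x sigmoid(beta (p1 - p2) x) + p2 x, and assumes that AconC is applied to the raw attention score where a standard graph attention network would use LeakyReLU; `tanh` stands in for the unspecified σ(·), and only a single attention head is shown.

```python
import numpy as np

def acon_c(x, p1=1.0, p2=0.0, beta=1.0):
    """ACON-C activation: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x."""
    z = (p1 - p2) * x
    return z / (1.0 + np.exp(-beta * z)) + p2 * x

def attention_aggregate(h, neighbors, v, W, a):
    """One attention head: weighted sum of transformed neighbor embeddings of node v."""
    hv = W @ h[v]
    msgs = np.stack([W @ h[u] for u in neighbors[v]])
    # Unnormalized score per neighbor from the concatenated pair [W h_v || W h_u].
    scores = np.array([acon_c(a @ np.concatenate([hv, m])) for m in msgs])
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                      # softmax over the neighborhood N_v
    return np.tanh(alpha @ msgs), alpha       # tanh stands in for sigma(.)

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 3))                   # 4 nodes, d = 3 (toy values)
neighbors = {0: [1, 2, 3]}
W = rng.normal(size=(3, 3))
a = rng.normal(size=6)
new_h0, alpha = attention_aggregate(h, neighbors, 0, W, a)
```

In ACON-C the parameters p1, p2 and beta are learned during training; they are fixed here so that the sketch stays self-contained.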
Further, step 3 is specifically implemented as follows:
step 3.1: each drug's SMILES molecular sequence is converted into a molecular graph by calling the RDKit library from Python, where the vertices and edges of the graph represent the atoms and chemical bonds of the drug, respectively; each drug molecule is represented by a feature matrix and an adjacency matrix, and each row of the feature matrix holds the attributes of one atom;
step 3.2: because SMILES sequences differ in length, a maximum length of 100 characters is chosen to create an effective representation, which covers at least 90% of the compounds in the dataset; sequences longer than the maximum length are truncated, and shorter sequences are padded with 0s;
step 3.3: referring to FIG. 2 (b), a molecular attention Transformer network is used to extract the drug feature representation $S_{drug}$; the calculation formula of the molecular multi-head self-attention layer is as follows:
Figure BDA0004050414580000042
where Figure BDA0004050414580000043 represents the adjacency matrix of the molecular graph and Figure BDA0004050414580000044 represents the distances between atoms; $Q_i = XW_i^q$, $K_i = XW_i^k$ and $V_i = XW_i^v$ are the query, key and value matrices, respectively, where each W is a learnable parameter, $i \in (1, \ldots, h)$, and h is the number of attention heads; $\lambda_a$, $\lambda_d$ and $\lambda_g$ are scalar parameters weighting the self-attention, distance and adjacency matrices, respectively.
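Because the formula image is missing, the sketch below assumes the form used by the published Molecule Attention Transformer, in which one head computes (λ_a softmax(Q K^T / sqrt(d_k)) + λ_d g(D) + λ_g A) V. The choice g(D) = softmax(-D), the toy 3-atom matrices and all numeric values are assumptions for illustration. In practice A would come from the RDKit molecular graph of step 3.1 (for example via `Chem.MolFromSmiles` and `Chem.GetAdjacencyMatrix`), which is left out here to keep the block dependency-light.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def molecule_attention_head(X, Wq, Wk, Wv, D, A, lam_a, lam_d, lam_g):
    """One head: (lam_a * softmax(QK^T / sqrt(d_k)) + lam_d * g(D) + lam_g * A) V."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    self_attn = softmax(Q @ K.T / np.sqrt(K.shape[1]))
    dist_term = softmax(-D)   # assumed g(D): nearer atoms get larger weights
    return (lam_a * self_attn + lam_d * dist_term + lam_g * A) @ V

# Toy 3-atom chain (hypothetical values, not taken from the patent).
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)        # adjacency
D = np.array([[0.0, 1.5, 2.4], [1.5, 0.0, 1.4], [2.4, 1.4, 0.0]])   # distances
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))                    # one row of atom features per atom
Wq, Wk, Wv = (rng.normal(size=(4, 2)) for _ in range(3))
out = molecule_attention_head(X, Wq, Wk, Wv, D, A, 0.4, 0.3, 0.3)
```

Setting two of the three scalars to zero isolates one information source, which is a convenient way to check each term separately.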
Further, step 4 is specifically implemented as follows:
step 4.1: randomly initializing an index table for all amino acids appearing in the target sequences, the index table having size 26 × 100; mapping the amino acids of each target sequence through the index table to construct the embedding matrix of that sequence; the length of the embedding matrix is the maximum target sequence length, which is set to 1000; during model training the embedding vectors are continuously optimized, so the contents of the index table change as the model is optimized;
step 4.2: referring to FIG. 2 (c), using a convolutional neural network and a bidirectional long short-term memory network to extract the feature information in the target sequence.
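The index table of step 4.1 can be sketched as a plain lookup. The 25-symbol alphabet with index 0 reserved for padding (giving the stated 26 rows), the initialization range and the example sequence are assumptions; the source states only the 26 × 100 table size and the maximum length of 1000.

```python
import random

SYMBOLS = "ACBEDGFIHKMLONQPSRUTWVYXZ"        # assumed 25-symbol amino acid alphabet
INDEX = {ch: i + 1 for i, ch in enumerate(SYMBOLS)}   # 0 is reserved for padding
MAX_LEN, EMB_DIM = 1000, 100

def encode(seq, max_len=MAX_LEN):
    """Map residues to integer ids, truncating or zero-padding to max_len."""
    ids = [INDEX.get(ch, 0) for ch in seq[:max_len]]  # unknown symbols map to 0
    return ids + [0] * (max_len - len(ids))

# Randomly initialized index table: 26 rows (padding + 25 symbols) x 100 columns.
rng = random.Random(0)
table = [[rng.uniform(-0.05, 0.05) for _ in range(EMB_DIM)]
         for _ in range(len(SYMBOLS) + 1)]

def embed(seq):
    """Return the 1000 x 100 embedding matrix of one target sequence."""
    return [table[i] for i in encode(seq)]

ids = encode("MKTAYIAK")                     # hypothetical short sequence
matrix = embed("MKTAYIAK")
```

In a trainable model the table would be a learnable embedding layer updated by backpropagation, which is what the source means by the index table changing as the model is optimized.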
Further, step 4.2 is specifically implemented as follows:
step 4.2.1: taking the embedding matrix obtained in step 4.1 as the input of the convolutional neural network; target sequences shorter than the embedding matrix length are automatically padded with empty tokens; each CNN block uses three consecutive one-dimensional convolution layers whose number of convolution kernels increases with depth: the second layer uses twice as many kernels as the first layer, and the third layer uses three times as many;
step 4.2.2: using a BiLSTM layer to receive the output of the convolution layers; the final output is the protein structure feature, denoted $S_{protein}$, given by:
Figure BDA0004050414580000051
where w and m are the weight matrix and the convolution window size, respectively, h is the LSTM hidden state, and x is the feature representation of the protein sequence.
Further, step 5 is specifically implemented as follows:
referring to FIG. 2 (d), the drug structure feature vector $S_{drug}$ obtained in step 3 and the target structure feature vector $S_{protein}$ obtained in step 4 are concatenated in the message-passing stage of the heterogeneous graph attention neural network; the node embedding update of formula (1) becomes:
Figure BDA0004050414580000061
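The update formula itself is not reproduced in this text. As a rough illustration of what concatenating structural features into a node embedding can look like, the sketch below joins a topological embedding with a structural feature vector and projects the result back to d dimensions. The projection matrix `W_fuse`, the `tanh` nonlinearity and all vector sizes are hypothetical and are not taken from the patent.

```python
import numpy as np

def fuse(h_topo, s_struct, W_fuse):
    """Concatenate topology and structure features, then project to d dimensions."""
    z = np.concatenate([h_topo, s_struct])
    return np.tanh(W_fuse @ z)

rng = np.random.default_rng(0)
h_drug = rng.normal(size=8)        # topological embedding of a drug node (d = 8)
s_drug = rng.normal(size=5)        # structural feature vector S_drug (size assumed)
W_fuse = rng.normal(size=(8, 13))  # hypothetical projection back to d = 8
fused = fuse(h_drug, s_drug, W_fuse)
```

The same operation would apply to a target node with $S_{protein}$ in place of $S_{drug}$.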
Further, step 6 is specifically implemented as follows:
step 6.1: referring to FIG. 2 (e), after the feature representations of the drug and the target are obtained, drug-target interactions are predicted by the inner product method; given a drug node u and a protein node v, with $f_u$ and $f_v$ denoting their features, the probability of interaction between u and v is:
$P = \sigma\left(f_u^{\top} f_v\right)$ (5)
where $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the sigmoid function and P is the interaction prediction score between u and v;
step 6.2: training and optimizing the prediction model using a cross-entropy loss function, evaluating its performance by 10-fold cross-validation, and saving the best-performing prediction model as $Model_{best}$.
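Formula (5), the cross-entropy loss and the 10-fold split of step 6 can be sketched with the standard library alone. The function names and the unshuffled, interleaved fold assignment are assumptions made for illustration; the source does not say how the folds were drawn.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def interaction_probability(f_u, f_v):
    """Formula (5): P = sigmoid(f_u^T f_v)."""
    return sigmoid(sum(a * b for a, b in zip(f_u, f_v)))

def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    for i in range(k):
        test = list(range(i, n, k))          # interleaved folds, no shuffling
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        yield train, test

p = interaction_probability([1.0, -1.0], [1.0, 1.0])   # orthogonal features
loss = binary_cross_entropy([0.5, 0.5], [1, 0])
```

Orthogonal feature vectors give an inner product of 0 and therefore a probability of exactly 0.5, which is the natural decision boundary for this score.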
Further, step 7 is specifically implemented as follows:
loading the prediction model $Model_{best}$ saved in step 6.2, inputting the drug-target information of the validation data into the prediction model, determining whether an interaction exists between each drug and target, and outputting the corresponding evaluation metrics.
By adopting the above technical scheme, the invention achieves the following technical effects: it adopts a deep learning model, uses the information on drugs, targets, diseases and drug side effects in medical databases, combines the structural features of drugs and targets, and automatically predicts drug-target interactions. The method effectively extracts the feature information in drug molecules and protein structures, achieves higher accuracy and robustness in predicting drug-target relationships, improves the efficiency and accuracy of verifying them, shortens the drug development cycle, greatly reduces the cost of developing new drugs, and provides an important foundation for new drug development and drug repurposing.
Drawings
FIG. 1 is a flow chart of a method for predicting drug-target interactions that incorporates multidimensional features;
FIG. 2 is a diagram of a model structure of a drug-target interaction prediction method incorporating multidimensional features.
Detailed Description
The embodiment of the invention is implemented on the premise of the technical scheme of the invention, and a detailed implementation mode and a specific operation process are provided, but the protection scope of the invention is not limited to the following embodiment.
The present invention is described in detail below with reference to examples so that those skilled in the art can practice the same with reference to the present specification.
Example 1
In this embodiment, Windows is used as the development environment, PyCharm as the development platform and Python as the development language, and the drug-target interaction prediction method fusing multi-dimensional features is used to predict drug-target interaction relationships.
In this embodiment, the drug-target interaction prediction method fusing multi-dimensional features comprises the following steps:
708 drugs, 1512 proteins, 5603 diseases and 4192 drug side effects are extracted from the DrugBank database, the PubChem database, the HPRD database, the Comparative Toxicogenomics Database and the SIDER database; the existing drug-target interactions, 1923 in total, are marked as positive examples with the data label set to 1; 19230 pairs are randomly selected from the drug-target pairs not marked as positive to form the negative examples, with the data label set to 0; a heterogeneous network is constructed from the obtained data;
the heterogeneous network, the drug SMILES sequences and the protein sequences are taken as inputs, and a prediction model is trained and saved to obtain prediction scores for drug-target interactions together with the evaluation metrics, which are the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPR).
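The two evaluation metrics can be computed as in the sketch below. AUROC is computed from its pairwise-ranking definition, and AUPR is approximated by average precision, which is one common estimator; the source does not state which estimator was used, and the scores and labels are made-up toy values.

```python
def auroc(scores, labels):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]     # toy prediction scores
labels = [1, 0, 1, 0, 0, 0]                 # toy ground-truth labels
roc = auroc(scores, labels)                 # 7 of 8 positive-negative pairs ordered correctly
ap = average_precision(scores, labels)      # (1/1 + 2/3) / 2
```

With a 1:10 class ratio such as the one used here, AUPR is usually the more demanding of the two metrics, because it is sensitive to false positives among the top-ranked pairs.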
Following the above steps, the invention is compared with the EEG-DTI, NeoDTI, DTINet, MSCMF and HNM models on drug-target relationship prediction. As can be seen from Table 1, the method proposed herein is significantly better than the other methods in both AUROC and AUPR.
Table 1 comparison of different models for drug-target relationship prediction results
Figure BDA0004050414580000081
Example 2
In this embodiment, Windows is used as the development environment, PyCharm as the development platform and Python as the development language, and the drug-target interaction prediction method fusing multi-dimensional features is used to predict potential therapeutic drugs for COVID-19.
In this embodiment, the drug-target interaction prediction method fusing multi-dimensional features comprises the following steps:
146 targets closely related to COVID-19 and 708 candidate drugs are extracted from the Comparative Toxicogenomics Database and the DrugBank database; the SMILES sequences of the drugs and the sequence structures of the related targets are obtained from the PubChem database; 1456 diseases and 4192 drug side effects related to the drugs and targets are extracted from the HPRD database and the SIDER database; a heterogeneous network is constructed from the acquired data;
the heterogeneous network and the drug and protein sequence data are taken as input, and the saved prediction model is loaded to obtain a prediction score Score for the interaction between each drug and each target; the prediction scores are sorted in descending order and the 10 highest-confidence candidate drugs are extracted for each target, with the further requirement that their confidence scores be greater than 0.5. After this processing only 15 targets meet the requirement, and 150 candidate drugs are finally obtained.
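The screening rule can be sketched as follows. The sketch assumes that a target is kept only when all of its top-ranked drugs exceed the threshold, which is one reading consistent with 15 targets × 10 drugs = 150 candidates; the function name, the nested-dictionary layout and the toy scores are hypothetical.

```python
def screen_candidates(scores, top_k=10, threshold=0.5):
    """scores: {target: {drug: score}} -> {target: [(drug, score), ...]}."""
    selected = {}
    for target, drug_scores in scores.items():
        ranked = sorted(drug_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        # Keep a target only if all of its top-k drugs clear the threshold.
        if len(ranked) == top_k and all(s > threshold for _, s in ranked):
            selected[target] = ranked
    return selected

# Toy scores with top_k=2 so the example stays small.
toy = {
    "T1": {"D1": 0.9, "D2": 0.8, "D3": 0.1},
    "T2": {"D1": 0.7, "D2": 0.4, "D3": 0.3},
}
kept = screen_candidates(toy, top_k=2)
```

Here T1 is kept because both of its top-2 scores exceed 0.5, while T2 is dropped because its second-ranked drug scores only 0.4.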
Of the 150 drugs screened, 54 have appeared in COVID-19 clinical studies; part of the data is shown in Table 2. The method thus allows candidate drugs to be found quickly and in a more targeted way for subsequent wet-lab experiments.
TABLE 2 screening of therapeutic drugs related to COVID-19 according to the present invention
Figure BDA0004050414580000091
Figure BDA0004050414580000101
The foregoing descriptions of specific exemplary embodiments of the present invention are presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain the specific principles of the invention and its practical application to thereby enable one skilled in the art to make and utilize the invention in various exemplary embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Claims (9)

1. A drug-target interaction prediction method fusing multi-dimensional features, comprising:
step 1: extracting information on interacting drugs and targets, and on related diseases and drug side effects, from medical databases, preprocessing the information, and constructing a heterogeneous network;
step 2: extracting network topology features of the heterogeneous network using a heterogeneous graph attention neural network;
step 3: representing the SMILES molecular sequence of each drug as a molecular graph, and extracting drug structure feature information using a molecular attention Transformer network;
step 4: embedding the target sequence information, processing it with a convolutional neural network and a bidirectional long short-term memory network (BiLSTM), and extracting target structure feature information;
step 5: fusing the extracted drug structure feature information and target structure feature information into the message-passing process of the heterogeneous graph attention neural network;
step 6: training and optimizing a prediction model using a cross-entropy loss function, and then saving the prediction model;
step 7: loading the prediction model, inputting information on the drug and target to be predicted, predicting the relationship between them, and outputting the prediction result.
2. The drug-target interaction prediction method fusing multi-dimensional features according to claim 1, wherein step 1 is specifically implemented as follows:
step 1.1: screening drug information and target information from the medical database, and deleting drugs and targets that have no interaction relationship;
step 1.2: obtaining from the medical database the SMILES molecular sequence of each drug and the sequence information of each target, and using them as the biological feature representations of the drug and the target, respectively;
step 1.3: extracting information on the diseases and drug side effects related to the drugs and targets;
step 1.4: taking the extracted drugs, targets, diseases and drug side effects as nodes and the associations between them as edges, constructing a heterogeneous network G = (V, E), where V is the node set and E is the edge set;
step 1.5: organizing each interacting drug-target pair into the form <drug number, target number, label>, with the label set to 1;
step 1.6: at a given ratio, randomly sampling drug-target pairs with no known relationship as negative examples, with the label set to 0.
3. The drug-target interaction prediction method fusing multi-dimensional features according to claim 1, wherein step 2 is specifically implemented as follows:
the heterogeneous network has an initial node embedding function $f_0: V \to \mathbb{R}^d$, where $f_0(v)$ is the d-dimensional embedding of each node v; the aggregation of neighbor node information for node v is defined as:
Figure FDA0004050414570000021
where $\sigma(\cdot)$ is the nonlinear activation function used in the propagation of one neural network layer, K is the number of attention layers, $N_v$ is the set of all neighbor nodes of node v, W is a shared weight parameter, a is the weight vector of the attention mechanism, and AconC is a new type of adaptively learnable activation function.
4. The drug-target interaction prediction method fusing multi-dimensional features according to claim 1, wherein step 3 is specifically implemented as follows:
step 3.1: each drug's SMILES molecular sequence is converted into a molecular graph by calling the RDKit library from Python, where the vertices and edges of the graph represent the atoms and chemical bonds of the drug, respectively; each drug molecule is represented by a feature matrix and an adjacency matrix, and each row of the feature matrix holds the attributes of one atom;
step 3.2: selecting a maximum SMILES sequence length of 100 characters, where longer sequences are truncated and shorter sequences are padded with 0;
step 3.3: extracting the drug feature representation $S_{drug}$ using a molecular attention Transformer network; the calculation formula of the molecular multi-head self-attention layer is as follows:
Figure FDA0004050414570000031
where Figure FDA0004050414570000032 represents the adjacency matrix of the molecular graph and Figure FDA0004050414570000033 represents the distances between atoms; $Q_i = XW_i^q$, $K_i = XW_i^k$ and $V_i = XW_i^v$ are the query, key and value matrices, respectively, where each W is a learnable parameter, $i \in (1, \ldots, h)$, and h is the number of attention heads; $\lambda_a$, $\lambda_d$ and $\lambda_g$ are scalar parameters weighting the self-attention, distance and adjacency matrices, respectively.
5. The drug-target interaction prediction method fusing multi-dimensional features according to claim 1, wherein step 4 is specifically implemented as follows:
step 4.1: randomly initializing an index table for all amino acids appearing in the target sequences; mapping the amino acids of each target sequence through the index table to construct the embedding matrix of that sequence; the length of the embedding matrix is the maximum target sequence length;
step 4.2: extracting the feature information in the target sequence using a convolutional neural network and a bidirectional long short-term memory network.
6. The drug-target interaction prediction method fusing multi-dimensional features according to claim 5, wherein step 4.2 is specifically implemented as follows:
step 4.2.1: taking the embedding matrix obtained in step 4.1 as the input of the convolutional neural network; target sequences shorter than the embedding matrix length are automatically padded with empty tokens; each CNN block uses three consecutive one-dimensional convolution layers whose number of convolution kernels increases with depth, i.e. the second layer uses twice as many kernels as the first layer and the third layer uses three times as many;
step 4.2.2: using a BiLSTM layer to receive the output of the convolution layers; the final output is the protein structure feature, denoted $S_{protein}$, given by:
Figure FDA0004050414570000041
where w and m are the weight matrix and the convolution window size, respectively, h is the LSTM hidden state, and x is the feature representation of the protein sequence.
7. The drug-target interaction prediction method fusing multi-dimensional features according to claim 1, wherein step 5 is specifically implemented as follows:
the drug structure feature vector $S_{drug}$ obtained in step 3 and the target structure feature vector $S_{protein}$ obtained in step 4 are concatenated in the message-passing stage of the heterogeneous graph attention neural network; the node embedding update of formula (1) becomes:
Figure FDA0004050414570000042
/>
8. The method for predicting drug-target interaction with multi-dimensional feature fusion according to claim 1, wherein the specific implementation process of step 6 comprises:
step 6.1: after the representations of the drug and the target are obtained, predicting drug-target interactions by an inner product; given a drug node u and a protein node v with feature representations $f_u$ and $f_v$, the probability of interaction between u and v is:
$P = \sigma\left(f_u^{T} f_v\right)$ (5)
wherein
Figure FDA0004050414570000051
is the sigmoid (S-shaped) function, and P represents the interaction prediction score between u and v;
step 6.2: training and optimizing the prediction model with a cross-entropy loss function, evaluating the performance of the prediction model by 10-fold cross-validation, and saving the best-performing prediction model as $Model_{best}$.
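Steps 6.1 and 6.2 can be illustrated with a short plain-Python sketch. It computes the inner-product score of equation (5), a binary cross-entropy loss, and index splits for 10-fold cross-validation. The function names, the `eps` smoothing term and the shuffling seed are assumptions made for illustration; the claims do not specify how the folds are drawn or how the loss is stabilised.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def interaction_probability(f_u, f_v):
    """Equation (5): P = sigmoid(f_u^T f_v)."""
    return sigmoid(sum(a * b for a, b in zip(f_u, f_v)))


def cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in zip(probs, labels))
    return -total / len(labels)


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists so that every sample is tested exactly once."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


# Toy drug and protein embeddings (hypothetical values).
p_zero = interaction_probability([0.0, 0.0], [1.0, 2.0])   # inner product 0
p_high = interaction_probability([1.0, 2.0], [1.5, 0.5])   # inner product 2.5
loss = cross_entropy([p_zero, p_high], [0, 1])
splits = list(k_fold_indices(25))
```

An inner product of zero gives a probability of exactly 0.5, and larger inner products push the score towards 1. In a full training loop, the model with the best validation result across the folds would be the one kept as the saved model.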
9. The method for predicting drug-target interaction with multi-dimensional feature fusion according to claim 8, wherein the specific implementation process of step 7 comprises:
loading the prediction model $Model_{best}$ saved in step 6.2, inputting the drug-target information in the validation data into the prediction model, determining whether an interaction exists between the drug and the target, and outputting the corresponding evaluation metrics.
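The claim does not name the evaluation metrics. As a hedged example, the sketch below computes two metrics commonly reported for binary interaction prediction, accuracy and ROC AUC, from predicted scores and true labels. The choice of metrics, the 0.5 decision threshold and the sample values are assumptions, not part of the claims.

```python
def accuracy(scores, labels, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the 0/1 label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative pair")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


scores = [0.9, 0.8, 0.35, 0.6, 0.2]  # hypothetical model outputs for five pairs
labels = [1, 1, 1, 0, 0]             # 1 = interaction, 0 = no interaction
acc = accuracy(scores, labels)
auc = roc_auc(scores, labels)
```

The pairwise AUC shown here is quadratic in the number of samples, which is fine for a small validation set; a rank-based computation would be preferable for large ones.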
CN202310038717.4A 2023-01-13 2023-01-13 Multi-dimensional characteristic fusion medicine-target interaction prediction method Pending CN116206775A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310038717.4A CN116206775A (en) 2023-01-13 2023-01-13 Multi-dimensional characteristic fusion medicine-target interaction prediction method

Publications (1)

Publication Number Publication Date
CN116206775A true CN116206775A (en) 2023-06-02

Family

ID=86516594

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310038717.4A Pending CN116206775A (en) 2023-01-13 2023-01-13 Multi-dimensional characteristic fusion medicine-target interaction prediction method

Country Status (1)

Country Link
CN (1) CN116206775A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116894180A (en) * 2023-09-11 2023-10-17 南京航空航天大学 Product manufacturing quality prediction method based on different composition attention network
CN116894180B (en) * 2023-09-11 2023-11-24 南京航空航天大学 Product manufacturing quality prediction method based on different composition attention network

Similar Documents

Publication Publication Date Title
US11462304B2 (en) Artificial intelligence engine architecture for generating candidate drugs
CN112136145A (en) Attention filtering for multi-instance learning
CN110021341B (en) Heterogeneous network-based GPCR (GPCR-based drug and targeting pathway) prediction method
US11403316B2 (en) Generating enhanced graphical user interfaces for presentation of anti-infective design spaces for selecting drug candidates
CN113140254B (en) Meta-learning drug-target interaction prediction system and prediction method
CN113688248B (en) Medical event identification method and system under condition of small sample weak labeling
CN112562791A (en) Drug target action depth learning prediction system based on knowledge graph, computer equipment and storage medium
CN112131399A (en) Old medicine new use analysis method and system based on knowledge graph
CN114882970B (en) Medicine interaction effect prediction method based on pre-training model and molecular diagram
CN116206775A (en) Multi-dimensional characteristic fusion medicine-target interaction prediction method
CN115376704A (en) Medicine-disease interaction prediction method fusing multi-neighborhood correlation information
CN112837743B (en) Drug repositioning method based on machine learning
CN113764034A (en) Method, device, equipment and medium for predicting potential BGC in genome sequence
CN116646001B (en) Method for predicting drug target binding based on combined cross-domain attention model
CN113284627A (en) Medication recommendation method based on patient characterization learning
CN116741408A (en) Method for multi-view self-attention prediction of drug to disease association
CN116630062A (en) Medical insurance fraud detection method, system and storage medium
AU2021104604A4 (en) Drug target prediction method for keeping consistency of chemical properties and functions of drugs
Wang et al. Predicting polypharmacy side effects based on an enhanced domain knowledge graph
CN114999566A (en) Drug repositioning method and system based on word vector characterization and attention mechanism
CN114300036A (en) Genetic variation pathogenicity prediction method and device, storage medium and computer equipment
US11915832B2 (en) Apparatus and method for processing multi-omics data for discovering new drug candidate substance
Iraji et al. Druggable protein prediction using a multi-canal deep convolutional neural network based on autocovariance method
CN115458061B (en) Medicine-protein interaction prediction method and system
Halsana et al. DensePPI: A Novel Image-based Deep Learning method for Prediction of Protein-Protein Interactions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination