CN115472305A - Method and system for predicting microorganism-drug association effect - Google Patents

Info

Publication number
CN115472305A
Authority
CN
China
Prior art keywords
drug
microorganism
attribute
matrix
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210938454.8A
Other languages
Chinese (zh)
Inventor
黄浩楠
刘冬宁
蔡阅霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202210938454.8A
Publication of CN115472305A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Chemical & Material Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Public Health (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioethics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Toxicology (AREA)
  • Primary Health Care (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for predicting a microorganism-drug association effect, comprising the following steps: S1, constructing a microorganism-drug association network Net1 from a microorganism-drug association database; S2, constructing an interaction network Net2 from the microorganism-drug association database; S3, constructing a multi-modal attribute graph of the microorganisms and drugs from the comprehensive similarity of the drugs, the topology of the drug network, the functional similarity of the microorganisms, and their genome sequences; S4, establishing a graph neural network model with regularization; S5, obtaining embedded representations Z1 and Z2, and inputting Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network; and S6, acquiring a data set to be predicted and predicting the microorganism-drug association effects in it with the trained graph neural network. The invention addresses the inability of the prior art to construct interpretable node features for microorganisms and drugs, and takes into account the sparsity of existing microorganism-drug association data sets.

Description

Method and system for predicting microorganism-drug association effect
Technical Field
The invention relates to the technical field of bioinformatics, in particular to a method for predicting a microorganism-drug association effect.
Background
In recent years, medical research has focused on the relationship between imbalance of the microbial community and drug efficacy and toxicity. However, a comprehensive understanding of the complex mechanisms by which microbial communities interact with drugs in the human body is still lacking. New drug development currently faces two major challenges. On the one hand, few new kinds of antibiotics are being discovered: most work has focused on optimizing or combining known compounds, target species are difficult to culture under laboratory conditions, and most candidate drugs fail during experiments. On the other hand, the number of drug-resistant bacteria is increasing at an alarming rate. A growing number of studies show that microorganisms and drugs interact closely, and some microorganism-drug associations have been confirmed by culture experiments, but these are not enough to elucidate the complex interaction mechanisms between human microorganisms and drugs. Therefore, an efficient method for systematically exploring possible associations between microorganisms and drugs is urgently needed.
Two types of computational methods currently exist for predicting the relationship of microorganisms to drugs.
The first category relies mainly on similarity measures, such as the HMDAKATZ method, which uses the KATZ measure. Such measures are too simple to adequately reflect similarity, which leads to inaccurate identification of associations.
The second category uses graph learning, which exploits the rich semantic information in graph-structured data and has better predictive power than the earlier similarity-based methods. Two approaches to learning graph features are currently common: meta-paths and graph convolutional networks (GCN).
The meta-path approach mainly uses the edge information of microorganism-drug associations for prediction. It combines metapath2vec with neural-network-based recommendation to learn low-dimensional embedded representations of microorganisms and drugs. This does improve the predictive ability of the model, but it relies too heavily on edge information: when a new drug or a new microorganism is introduced, no edge information exists for it, so prediction fails.
Compared with meta-paths, the GCN approach can capture node information as well as edge information, so using GCNs to predict microorganism-drug associations has attracted considerable interest. Long et al. first applied a GCN encoder to microorganism-drug association prediction in the GCNMDA method and introduced a conditional random field into the GCN hidden layer. There is also a node-level attention method, EGATMDA, which learns node (i.e., microorganism and drug) embeddings that effectively preserve the target neighbors of the graph and only the relevant information. However, existing methods fail to construct node features that contain biological information.
In summary, existing methods for predicting microorganism-drug association effects cannot construct rich, interpretable node features for microorganisms and drugs. How to provide a prediction method that can construct such features is therefore a technical problem that urgently needs to be solved in this field.
Disclosure of Invention
The invention provides a method for predicting a microorganism-drug association effect. It aims to solve the problem that the prior art cannot construct interpretable node features for microorganisms and drugs, and it takes into account the sparsity of existing microorganism-drug association data sets.
To achieve this purpose, the technical scheme is as follows:
A method for predicting a microorganism-drug association effect, comprising the following steps:
S1. constructing a microorganism-drug association network from a microorganism-drug association database, the association network being called Net1;
S2. retrieving microorganism-microorganism interactions from the microorganism database in the microorganism-drug association database, and retrieving drug-drug interactions from the drug database in the microorganism-drug association database; constructing an interaction network from the microorganism-microorganism interactions and the drug-drug interactions, the interaction network being called Net2;
S3. constructing a topological attribute network of the drugs from the drug database, and constructing microbial gene sequences from the microorganism database; constructing a microorganism-drug multi-modal attribute graph from the comprehensive similarity attribute and the drug network topology attribute of the drugs in the drug database, and the functional similarity attribute and the genome sequence attribute of the microorganisms in the microorganism database;
S4. establishing a graph neural network model with regularization from Net1, Net2 and the microorganism-drug multi-modal attribute graph;
S5. inputting Net1 and Net2, together with the microorganism-drug multi-modal attribute graph, into the graph neural network model to obtain embedded representations Z1 and Z2; inputting Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network;
S6. acquiring a data set to be predicted, and predicting the association effect of the microorganisms and drugs in that data set with the trained graph neural network.
The invention constructs a microorganism-drug association network Net1 and an interaction network Net2 from a microorganism-drug association database; establishes a graph neural network model with regularization; inputs Net1 and Net2, together with the microorganism-drug multi-modal attribute graph, into the model to obtain embedded representations Z1 and Z2; and inputs Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network. In this way, interpretable node features of microorganisms and drugs are constructed, and the sparsity of existing microorganism-drug association data sets is taken into account.
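As an illustration of step S1 only, the sketch below builds a 0/1 association matrix for Net1 from a list of known drug-microorganism pairs and reports how sparse it is. The function name `build_association_matrix` and the toy identifiers are hypothetical; the patent does not specify a storage format or a database schema.

```python
import numpy as np

def build_association_matrix(pairs, drugs, microbes):
    """Return an nd x nm 0/1 matrix with a 1 for every known (drug, microbe) pair."""
    drug_index = {d: i for i, d in enumerate(drugs)}
    microbe_index = {m: j for j, m in enumerate(microbes)}
    A = np.zeros((len(drugs), len(microbes)), dtype=np.int8)
    for drug, microbe in pairs:
        A[drug_index[drug], microbe_index[microbe]] = 1
    return A

# Toy data (hypothetical identifiers, not taken from any real database).
drugs = ["drug_a", "drug_b", "drug_c"]
microbes = ["microbe_x", "microbe_y"]
known_pairs = [("drug_a", "microbe_x"), ("drug_c", "microbe_y")]

A = build_association_matrix(known_pairs, drugs, microbes)
density = A.sum() / A.size  # fraction of drug-microbe pairs with a known association
```

A low `density` is the sparsity that the method is said to take into account.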
Preferably, in step S3, the specific steps of constructing the topological attribute network of the drugs from the drug database, constructing the microbial gene sequences from the microorganism database, and constructing the microorganism-drug multi-modal attribute graph from the comprehensive similarity attribute and the drug network topology attribute of the drugs in the drug database and the functional similarity attribute and the genome sequence attribute of the microorganisms in the microorganism database are as follows:
S301. constructing a similarity feature matrix of the drugs from the drug similarity attributes in the drug database, and constructing a topological attribute network of the drugs from the drug database, thereby obtaining a second attribute feature matrix of the drugs;
S302. constructing a similarity feature matrix of the microorganisms from the functional similarity attributes of the microorganisms in the microorganism database, and constructing microbial gene sequences from the microorganism database, thereby obtaining a second attribute feature matrix of the microorganisms;
S303. constructing a microorganism-drug similarity feature network from the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms;
S304. constructing a microorganism-drug second attribute feature network from the second attribute feature matrix of the drugs and the second attribute feature matrix of the microorganisms;
S305. combining the microorganism-drug similarity feature network with the microorganism-drug second attribute feature network to obtain the microorganism-drug multi-modal attribute graph.
Further, in step S301, the specific steps of constructing the similarity feature matrix of the drugs from the drug similarity attributes in the drug database, and of constructing the topological attribute network of the drugs from the drug database to obtain the second attribute feature matrix of the drugs, are as follows:
A1. calculating the similarity attribute of the drugs in the drug database with the SIMCOMP2 tool to obtain the molecular structure similarity matrix of the drugs, $DS_{struct}(d_i, d_j)$;
A2. representing the drug-drug interaction profiles in Net2 by a matrix DIP, and obtaining the normalized kernel bandwidth:
Figure BDA0003784609010000031
where $\mu$ denotes the normalized kernel bandwidth, $\mu'$ is the original bandwidth, set to 1, $DIP(d_i)$ denotes the interaction profile of drug $d_i$ with the other drugs, and nd is the number of drugs in Net1;
A3. expressing the similarity feature matrix of the drugs as $S_d(d_i, d_j)$:
Figure BDA0003784609010000041
A4. constructing the drug network topology attribute from the drug database by random walk with restart: the random walk and restart are repeated on the drug network until convergence, which yields a probability distribution vector for each drug, and these vectors form the second attribute feature matrix of the drugs, $F_d \in \mathbb{R}^{nd \times nd}$.
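The bandwidth formula in step A2 is an image that is not reproduced in this text. The sketch below assumes the Gaussian interaction profile kernel that is commonly used with such profiles, in which the original bandwidth is divided by the mean squared norm of the profiles. The function name `gip_kernel` and the toy matrix are hypothetical, and the combination with $DS_{struct}$ in step A3 is deliberately not sketched because that formula is also missing here.

```python
import numpy as np

def gip_kernel(profiles, raw_bandwidth=1.0):
    """Gaussian kernel over interaction profiles with a normalized bandwidth.

    profiles: (n, p) 0/1 matrix whose row i is the interaction profile DIP(d_i).
    Returns the (n, n) kernel matrix and the normalized bandwidth mu.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    # Assumed normalization: mu = mu' / ((1/n) * sum_i ||DIP(d_i)||^2).
    mu = raw_bandwidth / mean_sq_norm
    # Pairwise squared Euclidean distances between profiles.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return np.exp(-mu * sq_dist), mu

# Toy drug-drug interaction profiles for three drugs (hypothetical data).
DIP = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]])
K, mu = gip_kernel(DIP)
```

Drugs with identical profiles (the second and third rows here) receive a kernel value of 1, and the value decays as the profiles diverge.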
Further, in step A4, the formula of the random walk with restart is:
Figure BDA0003784609010000042
wherein
Figure BDA0003784609010000043
represents the probability that the i-th node of the drug network moves to other nodes at time t+1, $\theta$ is the restart probability, T is the transition probability matrix, $p_i^{(0)} \in \mathbb{R}^{n \times 1}$ is the starting probability vector of the i-th node of the drug network, and $p_i^{(t)} \in \mathbb{R}^{n \times 1}$ represents the probability that the i-th node of the drug network moves to other nodes at time t.
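The update formula itself is an image that is not reproduced here. A minimal sketch, assuming the standard random-walk-with-restart update $p_i^{(t+1)} = (1-\theta)\,T\,p_i^{(t)} + \theta\,p_i^{(0)}$ with a column-normalized transition matrix, is shown below. The function name, the default restart probability and the toy network are hypothetical.

```python
import numpy as np

def random_walk_with_restart(adjacency, restart_prob=0.5, tol=1e-10, max_iter=1000):
    """Run a random walk with restart from every node at once.

    Returns an (n, n) matrix whose row i is the converged distribution for a
    walker that restarts at node i (a candidate for the matrix F_d).
    """
    adjacency = np.asarray(adjacency, dtype=float)
    n = adjacency.shape[0]
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # avoid division by zero for isolated nodes
    T = adjacency / col_sums           # each column sums to 1 (or 0 if isolated)
    P0 = np.eye(n)                     # column i is the start vector p_i^(0)
    P = P0.copy()
    for _ in range(max_iter):
        P_next = (1 - restart_prob) * T @ P + restart_prob * P0
        converged = np.abs(P_next - P).max() < tol
        P = P_next
        if converged:
            break
    return P.T

# Toy drug network: a path 0 - 1 - 2 (hypothetical data).
drug_net = np.array([[0, 1, 0],
                     [1, 0, 1],
                     [0, 1, 0]])
F_d = random_walk_with_restart(drug_net, restart_prob=0.5)
```

Because every start vector is handled as a column of one matrix, a single loop produces the distribution vectors of all drugs.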
Further, in step S302, the specific steps of constructing the similarity feature matrix of the microorganisms from the functional similarity attributes of the microorganisms in the microorganism database, and of constructing the microbial gene sequences from the microorganism database to obtain the second attribute feature matrix of the microorganisms, are as follows:
B1. calculating the functional similarity attribute of the microorganisms in the microorganism database with the Kamneva tool to obtain the similarity feature matrix of the microorganisms, $S_m \in \mathbb{R}^{nm \times nm}$, where nm is the number of microorganisms in Net1; the similarity between microorganism $m_i$ and microorganism $m_j$ is denoted $S_m(m_i, m_j)$;
B2. encoding the original gene sequences of the microorganisms in the microorganism database to obtain the microbial gene sequences;
B3. padding all encoded microbial gene sequences with zeros so that all padded sequences have the same length;
B4. analyzing all padded microbial gene sequences by principal component analysis to obtain a k-dimensional matrix, which is taken as the second attribute feature matrix of the microorganisms, $F_m \in \mathbb{R}^{nm \times k}$.
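Steps B2 to B4 can be sketched as follows. The integer code for each nucleotide is an assumption made for illustration (the text does not say which encoding is used), principal component analysis is carried out here with a plain SVD, and the function names and toy sequences are hypothetical.

```python
import numpy as np

# Hypothetical integer encoding; 0 is reserved for padding.
NUCLEOTIDE_CODE = {"A": 1, "C": 2, "G": 3, "T": 4}

def encode_and_pad(sequences):
    """Encode each sequence as integers and right-pad with zeros to equal length."""
    max_len = max(len(seq) for seq in sequences)
    encoded = np.zeros((len(sequences), max_len))
    for row, seq in enumerate(sequences):
        encoded[row, :len(seq)] = [NUCLEOTIDE_CODE[base] for base in seq.upper()]
    return encoded

def pca_reduce(matrix, k):
    """Project the rows of `matrix` onto its first k principal components."""
    centered = matrix - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# Toy sequences of unequal length (hypothetical data).
sequences = ["ACGT", "ACG", "TTGACA", "GG"]
padded = encode_and_pad(sequences)   # shape (4, 6)
F_m = pca_reduce(padded, k=2)        # shape (4, 2), one row per microorganism
```

Real genome sequences are far longer than these toy strings, so the padded matrix would be wide and the reduction to k columns is what keeps the feature matrix manageable.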
Further, in step S303, the specific steps of constructing the microorganism-drug similarity feature network from the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms are as follows:
C1. constructing the microorganism-drug similarity feature network $X_{similarity}$ from the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms:
Figure BDA0003784609010000051
C2. constructing the microorganism-drug second attribute feature network $X_{secondary}$ from the second attribute feature matrix of the drugs and the second attribute feature matrix of the microorganisms:
Figure BDA0003784609010000052
C3. combining the microorganism-drug similarity feature network and the microorganism-drug second attribute feature network to obtain the microorganism-drug multi-modal attribute graph X:
$X = [X_{similarity}, X_{secondary}]$.
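The two block formulas in C1 and C2 are images that are not reproduced here. The sketch below assumes a block-diagonal layout, with the drug matrix in the top-left block and the microorganism matrix in the bottom-right block, followed by the column-wise concatenation of C3. The helper `block_diag2`, the placeholder matrices and the choice k = nm are all assumptions made for illustration.

```python
import numpy as np

def block_diag2(top_left, bottom_right):
    """Place two matrices on the diagonal of a larger zero matrix."""
    r1, c1 = top_left.shape
    r2, c2 = bottom_right.shape
    out = np.zeros((r1 + r2, c1 + c2))
    out[:r1, :c1] = top_left
    out[r1:, c1:] = bottom_right
    return out

nd, nm = 3, 2                      # toy numbers of drugs and microorganisms
rng = np.random.default_rng(0)

# Placeholder inputs standing in for S_d, S_m, F_d and F_m (not real data).
S_d, S_m = np.eye(nd), np.eye(nm)
F_d = rng.random((nd, nd))
F_m = rng.random((nm, nm))         # assumes k = nm so both networks are square

X_similarity = block_diag2(S_d, S_m)           # (nd + nm) x (nd + nm)
X_secondary = block_diag2(F_d, F_m)            # (nd + nm) x (nd + nm)
X = np.hstack([X_similarity, X_secondary])     # (nd + nm) x 2 * (nd + nm)
```

Under these assumptions each of the nd + nm nodes gets a feature vector of length 2 × (nd + nm).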
Furthermore, in step S4, the specific steps of establishing the regularized graph neural network model from Net1, Net2 and the microorganism-drug multi-modal attribute graph are as follows:
S401. establishing a microorganism-drug feature matrix from the microorganism-drug multi-modal attribute graph and constructing a microorganism-drug heterogeneous matrix, in which $v_i$ denotes any node (a microorganism or a drug); the heterogeneous matrix is expressed as:
Figure BDA0003784609010000053
where Y is the microorganism-drug feature matrix and
Figure BDA0003784609010000054
represents the content feature vector of node $v_i$;
S402. setting a learnable matrix $W \in \mathbb{R}^{m \times f}$ and initializing its elements with random numbers, where f is the dimension of the node embedding representation, set as a hyper-parameter, n = nd + nm is the number of nodes, and m = 2 × (nd + nm) is the feature dimension of the nodes; based on
Figure BDA0003784609010000055
and W, generating the feature-transformed vector
Figure BDA0003784609010000056
Figure BDA0003784609010000057
S403. setting a scaling constant $s \in \mathbb{R}$, which represents the norm of the propagated hidden features, and generating the normalized feature-transformed vector in the GNCN network of the regularized graph neural network model
Figure BDA0003784609010000058
Figure BDA0003784609010000059
S404. solving the L2 regularization formula g(·):
Figure BDA00037846090100000510
S405. encoding the microorganism-drug association network and the microorganism-drug multi-modal attribute graph with a GNCN encoder:
Figure BDA0003784609010000061
where $A \in \mathbb{R}^{nd \times nm}$ is the adjacency matrix of the association network in step S1: if a known association exists between nodes i and j in the association network, the element $A_{ij}$ is set to 1, and otherwise to 0;
Figure BDA0003784609010000062
where $I_N$ is the identity matrix of order N, and
Figure BDA0003784609010000063
is the degree matrix of
Figure BDA0003784609010000064
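The formulas of S402 to S405 are images that are not reproduced in this text, so the following is only a sketch under stated assumptions: the feature transform is a matrix product with W, the normalization rescales every transformed row to L2 norm s, and the propagation uses the symmetrically normalized adjacency matrix with self-loops, $\tilde{D}^{-1/2}(A + I_N)\tilde{D}^{-1/2}$, on a square node-by-node adjacency matrix. All function names and the toy inputs are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a square adjacency matrix A."""
    A_tilde = A + np.eye(A.shape[0])        # add self-loops
    degrees = A_tilde.sum(axis=1)           # diagonal of the degree matrix D~
    d_inv_sqrt = 1.0 / np.sqrt(degrees)     # degrees >= 1 because of self-loops
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def l2_normalize_rows(H, scale):
    """Rescale every non-zero row of H to have L2 norm equal to `scale`."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                 # leave all-zero rows unchanged
    return scale * H / norms

def gncn_encode(X, A, W, scale):
    """Feature transform, row normalization, then one propagation step."""
    H = X @ W                               # feature-transformed vectors
    H_norm = l2_normalize_rows(H, scale)    # normalized feature vectors
    return normalized_adjacency(A) @ H_norm

# Toy graph with 4 nodes on a path, 8 input features, embedding size f = 3.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.random((4, 8))
W = rng.normal(size=(8, 3))                 # randomly initialized learnable matrix
Z = gncn_encode(X, A, W, scale=1.8)
```

In a trained model W would be updated by gradient descent; here it is only initialized, which is enough to show the shapes and the order of operations.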
Furthermore, in step S5, the specific steps of inputting Net1 and Net2, together with the microorganism-drug multi-modal attribute graph, into the graph neural network model to obtain the embedded representations Z1 and Z2, and of inputting Z1 and Z2 into the graph neural network for training to obtain the trained graph neural network, are as follows:
S501. propagating the normalized vector through the GNCN network to generate the node embedding vector
Figure BDA0003784609010000065
Figure BDA0003784609010000066
wherein
Figure BDA0003784609010000067
is the unit vector of node i in the matrix,
Figure BDA0003784609010000068
is the unit vector of node j in the matrix, $deg_i$ is the degree of node i, and $deg_j$ is the degree of node j;
S502. from the node embedding vectors
Figure BDA0003784609010000069
generating a node embedding matrix, i.e. the latent variable $Z \in \mathbb{R}^{n \times f}$ of the GNCN encoder, and obtaining Z1 corresponding to Net1 and Z2 corresponding to Net2:
$Z_i = \mathrm{GNCN}(X, A, s)$;
S503. defining a loss function, which is the binary cross-entropy between the multi-modal attribute graph and the reconstructed graph produced by the graph neural network during training:
Figure BDA00037846090100000610
where L is the loss function, N is the total number of nodes, y is the value of an element of the adjacency matrix A and takes the value 0 or 1, and
Figure BDA00037846090100000611
represents the value of the corresponding element in the reconstructed adjacency matrix
Figure BDA00037846090100000612
and lies between 0 and 1;
S504. inputting Z1 and Z2 into the DNN classifier of the graph neural network model, setting the number of training epochs to k2, using stochastic gradient descent during training, and stopping training when the loss function converges, to obtain the trained graph neural network.
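The loss formula of S503 is an image that is not reproduced here. A minimal sketch of the standard binary cross-entropy between the entries of an adjacency matrix and the corresponding reconstructed scores is given below; averaging over the compared entries and clipping the scores away from 0 and 1 are assumptions, and the function name and toy values are hypothetical.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy between 0/1 targets and scores in [0, 1]."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    # Clip so that log(0) never occurs for scores of exactly 0 or 1.
    y_pred = np.clip(np.asarray(y_pred, dtype=float).ravel(), eps, 1 - eps)
    per_entry = y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)
    return float(-per_entry.mean())

# Toy adjacency entries and reconstructed scores (hypothetical data).
A_true = np.array([[1, 0],
                   [0, 1]])
A_uninformed = np.full((2, 2), 0.5)        # every pair scored 0.5
A_confident = np.array([[0.9, 0.1],
                        [0.2, 0.8]])

loss_uninformed = binary_cross_entropy(A_true, A_uninformed)   # equals ln 2
loss_confident = binary_cross_entropy(A_true, A_confident)
```

The uninformed reconstruction yields a loss of ln 2, and the loss falls as the reconstructed scores move toward the true entries, which is the quantity monitored for convergence in S504.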
Furthermore, in step S5, after the graph neural network model has been trained, it is validated by the following steps:
D1. introducing a k-fold cross-validation framework: all known microorganism-drug associations in the existing microorganism-drug association database are randomly divided into k1 groups; in turn, each of the k1 groups, together with a randomly sampled subset of unknown association pairs of the same size, is used as the test set, and the remaining known association pairs are used as the training set;
D2. inputting the test set into the trained graph neural network model to obtain a classification result;
D3. if the classification result is positive, predicting that the microorganism is associated with the drug, and if the classification result is negative, predicting that the microorganism is not associated with the drug;
D4. obtaining the AUC value of the trained graph neural network model from the classification results, and verifying the accuracy of the graph neural network model from the AUC value.
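The data split of step D1 can be sketched as below, under one reading of the text: each fold of known pairs is tested together with an equally sized random sample of unknown pairs, and the known pairs of the other folds form the training set. The generator name `k_fold_splits`, the seed and the toy matrix are hypothetical.

```python
import numpy as np

def k_fold_splits(A, k, seed=0):
    """Yield (train_pos, test_pos, test_neg) index arrays for each of k folds.

    A is a 0/1 association matrix; every returned array has shape (count, 2)
    and holds (row, column) indices. Requires k >= 2.
    """
    rng = np.random.default_rng(seed)
    positives = np.argwhere(A == 1)                 # known associations
    negatives = np.argwhere(A == 0)                 # unknown pairs
    positives = positives[rng.permutation(len(positives))]
    folds = np.array_split(positives, k)
    for i, test_pos in enumerate(folds):
        # Sample as many unknown pairs as there are known pairs in this fold.
        chosen = rng.choice(len(negatives), size=len(test_pos), replace=False)
        test_neg = negatives[chosen]
        train_pos = np.vstack([f for j, f in enumerate(folds) if j != i])
        yield train_pos, test_pos, test_neg

# Toy association matrix with 6 known pairs out of 20 (hypothetical data).
A = np.zeros((4, 5), dtype=int)
A[[0, 0, 1, 2, 3, 3], [0, 2, 1, 4, 0, 3]] = 1
splits = list(k_fold_splits(A, k=3))
```

With 6 known pairs and k = 3, every fold tests 2 known and 2 sampled unknown pairs and trains on the other 4 known pairs.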
Further, in step D4, the specific steps of obtaining the AUC value of the trained graph neural network model from the classification results are as follows:
E1. inputting the training set into the model to obtain the current model's reconstructed graph of the training set, and recording the score of each edge between nodes in the reconstructed graph as an association probability, which takes a value between 0 and 1;
E2. using each association probability in turn as the classification threshold: samples whose association probability is larger than the threshold are regarded as positive samples, and samples whose association probability is smaller than the threshold are regarded as negative samples;
E3. obtaining the ground-truth label of each edge in the training set from the microorganism-drug associations in the training set; the label is 0 or 1, where 0 means the edge does not exist (no association, an actual negative sample) and 1 means the edge exists (an association, an actual positive sample);
E4. computing the true positive rate and the false positive rate at each classification threshold:
Figure BDA0003784609010000071
Figure BDA0003784609010000072
where TPRate is the true positive rate and FPRate is the false positive rate; TP (true positives) is the number of actual positive samples predicted as positive, FN (false negatives) is the number of actual positive samples predicted as negative, FP (false positives) is the number of actual negative samples predicted as positive, and TN (true negatives) is the number of actual negative samples predicted as negative;
E5. plotting the ROC curve with FPRate on the horizontal axis and TPRate on the vertical axis, and calculating the area under the ROC curve by the infinitesimal element method; this area is the AUC value.
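Steps E2 to E5 can be sketched as follows, assuming the usual definitions TPRate = TP / (TP + FN) and FPRate = FP / (FP + TN) and approximating the area with trapezoids between consecutive ROC points. The function name `roc_auc` and the toy scores are hypothetical; a sample whose score equals the threshold is counted as positive here, which is a choice the text leaves open.

```python
import numpy as np

def roc_auc(labels, scores):
    """Return (fpr, tpr, auc) using every distinct score as a threshold."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    n_pos = np.sum(labels == 1)                # TP + FN
    n_neg = np.sum(labels == 0)                # FP + TN
    fpr, tpr = [0.0], [0.0]                    # start of the curve at (0, 0)
    for threshold in np.unique(scores)[::-1]:  # from highest to lowest score
        predicted_pos = scores >= threshold
        tp = np.sum(predicted_pos & (labels == 1))
        fp = np.sum(predicted_pos & (labels == 0))
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    # Sum the trapezoid areas between consecutive points of the curve.
    auc = sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
              for i in range(1, len(fpr)))
    return np.array(fpr), np.array(tpr), float(auc)

# Toy ground-truth labels and association probabilities (hypothetical data).
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
fpr, tpr, auc = roc_auc(labels, scores)        # auc is 5/6 for this toy input
```

For the toy input, five of the six positive-negative pairs are ranked correctly, which matches the area of 5/6 obtained from the curve.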
The invention has the following beneficial effects:
The invention constructs a microorganism-drug association network Net1 and an interaction network Net2 from a microorganism-drug association database; establishes a graph neural network model with regularization; inputs Net1 and Net2, together with the microorganism-drug multi-modal attribute graph, into the model to obtain embedded representations Z1 and Z2; and inputs Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network. In this way, interpretable node features of microorganisms and drugs are constructed, and the sparsity of existing microorganism-drug association data sets is taken into account.
Drawings
FIG. 1 is a schematic flow diagram of the method for predicting a microorganism-drug association effect of the present invention.
FIG. 2 is a schematic flow diagram of constructing the multi-modal attribute graph in the method for predicting a microorganism-drug association effect.
FIG. 3 is a schematic flow diagram of predicting the association probability in the method for predicting a microorganism-drug association effect of the present invention.
Detailed Description
The invention is described in detail below with reference to the drawings and the detailed description.
Example 1
As shown in FIG. 1, a method for predicting a microorganism-drug association effect includes the following steps:
S1. constructing a microorganism-drug association network from a microorganism-drug association database, the association network being called Net1;
S2. retrieving microorganism-microorganism interactions from the microorganism database in the microorganism-drug association database, and retrieving drug-drug interactions from the drug database in the microorganism-drug association database; constructing an interaction network from the microorganism-microorganism interactions and the drug-drug interactions, the interaction network being called Net2;
S3. constructing a topological attribute network of the drugs from the drug database, and constructing microbial gene sequences from the microorganism database; constructing a microorganism-drug multi-modal attribute graph from the comprehensive similarity attribute and the drug network topology attribute of the drugs in the drug database, and the functional similarity attribute and the genome sequence attribute of the microorganisms in the microorganism database;
S4. establishing a graph neural network model with regularization from Net1, Net2 and the microorganism-drug multi-modal attribute graph;
S5. inputting Net1 and Net2, together with the microorganism-drug multi-modal attribute graph, into the graph neural network model to obtain embedded representations Z1 and Z2; inputting Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network;
S6. acquiring a data set to be predicted, and predicting the association effect of the microorganisms and drugs in that data set with the trained graph neural network.
The invention constructs a microorganism-drug association network Net1 and an interaction network Net2 from a microorganism-drug association database; establishes a graph neural network model with regularization; inputs Net1 and Net2, together with the microorganism-drug multi-modal attribute graph, into the model to obtain embedded representations Z1 and Z2; and inputs Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network. In this way, interpretable node features of microorganisms and drugs are constructed, and the sparsity of existing microorganism-drug association data sets is taken into account.
Example 2
Specifically, as shown in FIG. 2, in a specific embodiment, in step S3, the specific steps of constructing the topological attribute network of the drugs from the drug database, constructing the microbial gene sequences from the microorganism database, and constructing the microorganism-drug multi-modal attribute graph from the comprehensive similarity attribute and the drug network topology attribute of the drugs in the drug database and the functional similarity attribute and the genome sequence attribute of the microorganisms in the microorganism database are as follows:
S301. constructing a similarity feature matrix of the drugs from the drug similarity attributes in the drug database, and constructing a topological attribute network of the drugs from the drug database, thereby obtaining a second attribute feature matrix of the drugs;
S302. constructing a similarity feature matrix of the microorganisms from the functional similarity attributes of the microorganisms in the microorganism database, and constructing microbial gene sequences from the microorganism database, thereby obtaining a second attribute feature matrix of the microorganisms;
S303. constructing a microorganism-drug similarity feature network from the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms;
S304. constructing a microorganism-drug second attribute feature network from the second attribute feature matrix of the drugs and the second attribute feature matrix of the microorganisms;
S305. combining the microorganism-drug similarity feature network with the microorganism-drug second attribute feature network to obtain the microorganism-drug multi-modal attribute graph.
In a specific embodiment, in step S301, a similarity feature matrix of the drug is constructed according to the drug similarity attribute in the drug database, and a topological attribute network of the drug is constructed through the drug database, so as to obtain a second attribute feature matrix of the drug, which specifically includes:
A1. Calculating the similarity attribute of the drugs in the drug database using the SIMCOMP2 tool to obtain the molecular structure similarity matrix DS_struct(d_i, d_j) of the drugs;
A2. Representing the drug-drug interaction profile in Net2 by the matrix DIP, from which the normalized kernel bandwidth is obtained:
Figure BDA0003784609010000091
where μ denotes the normalized kernel bandwidth; μ' is the original bandwidth, set to 1; DIP(d_i) denotes the interaction profile of drug d_i with the other drugs; and nd denotes the number of drugs in Net1;
A3. Expressing the similarity feature matrix of the drugs as S_d(d_i, d_j):
Figure BDA0003784609010000101
A4. Constructing the topological attributes of the drug network in the drug database by the random walk with restart method: the random walk and restart are iterated on the drug network until convergence, thereby obtaining the probability distribution vector of each drug, from which the second attribute feature matrix F_d ∈ R^(nd×nd) of the drugs is constructed.
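Steps A2 and A3 can be illustrated with a short sketch. Because the bandwidth formula is only available as a figure, the code below assumes the commonly used Gaussian interaction profile kernel, in which the original bandwidth μ' is divided by the mean squared norm of the interaction profiles. The function name `gip_kernel` and the toy matrix are hypothetical, and the way this kernel is merged with DS_struct into S_d is not reproduced because the source formula is not legible here.

```python
import numpy as np

def gip_kernel(dip, mu_prime=1.0):
    """Gaussian interaction-profile kernel over the rows of `dip` (assumed form)."""
    dip = np.asarray(dip, dtype=float)
    sq_norms = np.sum(dip ** 2, axis=1)          # ||DIP(d_i)||^2 for every drug
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("all interaction profiles are empty")
    mu = mu_prime / mean_sq                      # normalized kernel bandwidth
    # Pairwise squared Euclidean distances between interaction profiles.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * dip @ dip.T
    sq_dist = np.maximum(sq_dist, 0.0)           # guard against rounding below zero
    return np.exp(-mu * sq_dist)

# Toy interaction profiles for three drugs (illustrative values only).
dip = np.array([[1, 0, 1],
                [1, 0, 1],
                [0, 1, 0]])
kernel = gip_kernel(dip)
```

Drugs with identical interaction profiles receive a kernel value of 1, and the value decays as the profiles diverge.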
In a specific embodiment, in step A4, the formula of the random walk with restart is:
Figure BDA0003784609010000102
wherein,
Figure BDA0003784609010000103
represents the probability that the i-th node of the drug network moves to the other nodes at time t+1; θ is the restart probability; T is the transition probability matrix; p_i^(0) ∈ R^(n×1) is the starting probability vector of the i-th node of the drug network; and p_i^(t) ∈ R^(n×1) represents the probability that the i-th node of the drug network moves to the other nodes at time t.
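The iteration of step A4 can be sketched as follows. The update rule used here, p^(t+1) = (1 − θ)·T·p^(t) + θ·p^(0), is the standard form of random walk with restart and is assumed to match the formula in the figure. The column normalization of T, the default θ = 0.5, the tolerance and the function name are all illustrative assumptions.

```python
import numpy as np

def random_walk_with_restart(adj, theta=0.5, tol=1e-8, max_iter=1000):
    """Return a matrix whose row i is the converged distribution for start node i."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0        # isolated nodes keep an all-zero column
    T = adj / col_sums                   # column-normalized transition matrix
    P0 = np.eye(n)                       # column i is the start vector p_i^(0)
    P = P0.copy()
    for _ in range(max_iter):
        P_next = (1.0 - theta) * (T @ P) + theta * P0
        converged = np.abs(P_next - P).max() < tol
        P = P_next
        if converged:
            break
    return P.T

# Toy drug network: a path 0 - 1 - 2 (illustrative only).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
F_d = random_walk_with_restart(adj)
```

All start nodes are iterated at once by stacking the start vectors into an identity matrix, so the result has the nd × nd shape stated for F_d.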
In an embodiment, in the step S302, the specific steps of constructing a similarity feature matrix of the microorganisms according to the functional similarity attributes of the microorganisms in the microorganism database, and constructing a microorganism gene sequence through the microorganism database, so as to obtain the second attribute feature matrix of the microorganisms are:
B1. Calculating the functional similarity attribute of the microorganisms in the microorganism database using the Kamneva tool to obtain the similarity feature matrix S_m ∈ R^(nm×nm) of the microorganisms, where nm denotes the number of microorganisms in Net1, and the similarity between microorganism m_i and microorganism m_j is denoted S_m(m_i, m_j);
B2. Encoding the original gene sequences of the microorganism data in the microorganism database to obtain the microorganism gene sequences;
B3. Padding all the encoded microorganism gene sequences with zeros so that all padded sequences have the same length;
B4. Analyzing all the padded microorganism gene sequences by principal component analysis to obtain a k-dimensional matrix, which is taken as the second attribute feature matrix F_m ∈ R^(nm×k) of the microorganisms.
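Steps B2 to B4 can be sketched as below. The integer code assigned to each base is a hypothetical choice, since the text does not state the encoding, and the principal component analysis is carried out with a plain singular value decomposition rather than any specific library routine.

```python
import numpy as np

BASE_CODE = {"A": 1, "C": 2, "G": 3, "T": 4}   # hypothetical encoding, 0 is reserved for padding

def encode_and_pad(sequences):
    """Encode each sequence as integers and right-pad with zeros to a common length."""
    max_len = max(len(seq) for seq in sequences)
    out = np.zeros((len(sequences), max_len))
    for row, seq in enumerate(sequences):
        out[row, :len(seq)] = [BASE_CODE.get(base, 0) for base in seq.upper()]
    return out

def pca_reduce(matrix, k):
    """Project the rows of `matrix` onto the first k principal components."""
    centered = matrix - matrix.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

seqs = ["ACGT", "AC", "GGTTA", "T"]            # toy sequences, illustrative only
padded = encode_and_pad(seqs)
F_m = pca_reduce(padded, k=2)
```

The result has one row per microorganism and k columns, matching the stated shape nm × k.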
In a specific embodiment, in step S303, a microorganism-drug similarity feature network is constructed according to the drug similarity feature matrix and the microorganism similarity feature matrix, and the specific steps are as follows:
C1. Constructing the microorganism-drug similarity feature network X_simility according to the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms:
Figure BDA0003784609010000111
C2. Constructing the microorganism-drug second attribute feature network X_secondary according to the second attribute feature matrix of the drugs and the second attribute feature matrix of the microorganisms:
Figure BDA0003784609010000112
C3. Combining the microorganism-drug similarity feature network and the microorganism-drug second attribute feature network to obtain the microorganism-drug multi-modal attribute map X:
X = [X_simility, X_secondary].
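The assembly in steps C1 to C3 can be sketched as follows. The block layouts of X_simility and X_secondary are only available as figures, so this sketch assumes a block-diagonal arrangement with the drug block first, which is consistent with the feature dimension m = 2 × (nd + nm) given in step S402 when k = nm. The helper names are hypothetical.

```python
import numpy as np

def block_diag2(top_left, bottom_right):
    """Place two matrices on the diagonal of a zero matrix (assumed layout)."""
    r1, c1 = top_left.shape
    r2, c2 = bottom_right.shape
    out = np.zeros((r1 + r2, c1 + c2))
    out[:r1, :c1] = top_left
    out[r1:, c1:] = bottom_right
    return out

def build_attribute_map(S_d, S_m, F_d, F_m):
    """Concatenate the similarity network and the second attribute network column-wise."""
    X_simility = block_diag2(S_d, S_m)
    X_secondary = block_diag2(F_d, F_m)
    return np.hstack([X_simility, X_secondary])

# Toy sizes: nd = 2 drugs, nm = 3 microorganisms, k = nm.
S_d, S_m = np.eye(2), np.eye(3)
F_d, F_m = np.ones((2, 2)), np.ones((3, 3))
X = build_attribute_map(S_d, S_m, F_d, F_m)
```

With these toy sizes X has nd + nm = 5 rows and 2 × (nd + nm) = 10 columns.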
In a specific embodiment, in step S4, a regularized graph neural network model is established according to Net1, Net2 and the microorganism-drug multi-modal attribute map, and the specific steps are as follows:
S401. Establishing the microorganism-drug feature matrix according to the microorganism-drug multi-modal attribute map, and constructing the microorganism-drug heterogeneous matrix, in which v_i denotes the microorganism or drug at any node; the heterogeneous matrix is expressed as follows:
Figure BDA0003784609010000113
wherein Y is a characteristic matrix of the microorganism-drug,
Figure BDA0003784609010000114
represents the content feature vector of node v_i;
S402. Setting a learnable matrix W ∈ R^(m×f) and assigning initial values to its elements using random numbers, where f is the dimension of the node embedding representation, set as a hyper-parameter, n = nd + nm is the number of nodes, and m = 2 × (nd + nm) is the feature dimension of the nodes; based on
Figure BDA0003784609010000115
and W, generating the feature-transformed vector
Figure BDA0003784609010000116
Figure BDA0003784609010000117
S403. Setting a scaling constant s ∈ R, which represents the norm of the propagated hidden features, and generating, in the GNCN network of the regularized graph neural network model, the normalized feature-transformed vector
Figure BDA0003784609010000118
Figure BDA0003784609010000119
S404. The formula g(·) of the L2 regularization is:
Figure BDA00037846090100001110
S405. Encoding the microorganism-drug association network and the microorganism-drug multi-modal attribute map using the GNCN encoder:
Figure BDA0003784609010000121
where A ∈ R^(nd×nm) is the adjacency matrix of the association network of step S1; if a known association exists between nodes i and j in the association network, the element A_ij of A is set to 1, and otherwise to 0;
Figure BDA0003784609010000122
where I_N is the identity matrix of order N, and
Figure BDA0003784609010000123
is the degree matrix of
Figure BDA0003784609010000124
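Steps S402 to S405 can be sketched as one encoder pass. The formulas themselves are only available as figures, so the code assumes the usual forms: the L2 normalization g rescales each transformed feature vector to norm s, and propagation uses the symmetrically normalized adjacency matrix with self-loops. Because A is stated as an nd × nm matrix, the sketch embeds it into a square (nd + nm) × (nd + nm) matrix with drugs ordered first. That embedding, the value s = 1.8, and all function names are assumptions.

```python
import numpy as np

def normalize_features(H, s):
    """Rescale every row of H to Euclidean norm s (all-zero rows stay zero)."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return s * H / norms

def normalized_adjacency(A_bip):
    """Symmetrically normalized adjacency with self-loops for a bipartite graph."""
    nd, nm = A_bip.shape
    n = nd + nm
    A = np.zeros((n, n))
    A[:nd, nd:] = A_bip                  # drug to microorganism edges
    A[nd:, :nd] = A_bip.T                # microorganism to drug edges
    A_tilde = A + np.eye(n)              # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

def gncn_encode(X, A_bip, W, s):
    """Feature transform, L2 normalization with scale s, then one propagation step."""
    H = normalize_features(X @ W, s)
    return normalized_adjacency(A_bip) @ H

rng = np.random.default_rng(0)
A_bip = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])      # nd = 2, nm = 3 (toy data)
X = rng.random((5, 10))                  # n = 5 nodes, m = 10 features
W = rng.normal(size=(10, 4))             # randomly initialized, f = 4
Z = gncn_encode(X, A_bip, W, s=1.8)
```

Normalizing before propagation keeps every node's hidden feature on a sphere of radius s, so the scale of the embeddings is controlled by a single constant.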
In one embodiment, as shown in fig. 3, in step S5, net1, net2 are input into the graph neural network model in combination with the multi-modal property map of the microbe-drug to obtain embedded representations Z1 and Z2; inputting the embedded expressions Z1 and Z2 into a neural network of the graph for training, and specifically obtaining the trained neural network of the graph comprises the following steps:
S501. Propagating the normalized vector through the GNCN network to generate the node embedding vector
Figure BDA0003784609010000125
Figure BDA0003784609010000126
where
Figure BDA0003784609010000127
is the unit vector of i in the matrix,
Figure BDA0003784609010000128
is the unit vector of j in the matrix, deg_i is the degree of node i, and deg_j is the degree of node j;
S502. Generating the node embedding matrix from the node embedding vectors
Figure BDA0003784609010000129
thereby producing the latent variable Z ∈ R^(n×f) of the GNCN encoder, and obtaining Z1 corresponding to Net1 and Z2 corresponding to Net2:
Z_i = GNCN(X, A, s);
S503. Defining a loss function, which is the binary cross entropy between the multi-modal attribute map and the reconstructed graph obtained by the graph neural network during training:
Figure BDA00037846090100001210
where L is the loss function, N is the total number of nodes, y is the value of an element of the adjacency matrix A and is equal to 0 or 1, and
Figure BDA00037846090100001211
is the value of the corresponding element of the reconstructed adjacency matrix
Figure BDA00037846090100001212
which lies between 0 and 1;
S504. Inputting Z1 and Z2 into the DNN classifier of the graph neural network model, setting the number of training epochs to k2, using stochastic gradient descent during training, and stopping training when the loss function converges, to obtain the trained graph neural network.
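The loss of step S503 can be sketched as follows, assuming the standard binary cross entropy averaged over all compared elements. The clipping constant `eps` is an implementation detail added here to avoid taking the logarithm of zero and is not part of the source.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.clip(np.asarray(y_pred, dtype=float).ravel(), eps, 1.0 - eps)
    losses = y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred)
    return float(-np.mean(losses))

# Toy adjacency entries and two candidate reconstructions (illustrative only).
y = [1, 0]
loss_uninformative = binary_cross_entropy(y, [0.5, 0.5])
loss_confident = binary_cross_entropy(y, [0.9, 0.1])
```

A reconstruction that assigns 0.5 to every element yields a loss of ln 2, and the loss falls as the predicted probabilities move toward the true labels.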
Example 3
In a specific embodiment, in the step S5, after the graph neural network model is trained, verification of the graph neural network model is further performed, where the verification specifically includes:
D1. Introducing a k-fold cross validation framework: under this framework, all known microorganism-drug association data in the existing microorganism-drug association database are randomly divided into k1 groups; for each of the k1 groups, a randomly sampled subset of unknown association pairs of the same size is selected as the test set, and the remaining known association pairs are used as the training set;
D2. inputting the test set into the trained graph neural network model to obtain a classification result;
D3. if the classification result is positive, predicting that the microorganism is associated with the medicine, and if the classification result is negative, predicting that the microorganism is not associated with the medicine;
D4. obtaining an AUC value of the trained graph neural network model according to the classification result; and verifying the accuracy of the graph neural network model according to the AUC value.
In a specific embodiment, in step D4, the AUC value of the trained graph neural network model is obtained according to the classification result, and the specific steps are as follows:
E1. Inputting the training set into the model to obtain the current model's reconstructed graph of the training set, and recording the scores of the edges between nodes in this reconstructed graph as association probabilities, each of which takes a value between 0 and 1;
E2. Using each association probability in turn as the classification threshold: samples whose association probability is larger than the threshold are regarded as positive samples, and samples whose association probability is smaller than the threshold are regarded as negative samples;
E3. Obtaining the true label of each edge in the training set according to the microorganism-drug association relationships in the training set, where the true label is 0 or 1; 0 indicates that the edge does not exist, i.e. there is no association and the sample is actually negative, and 1 indicates that the edge exists, i.e. there is an association and the sample is actually positive;
E4. Computing the true positive rate and the false positive rate under each classification threshold:
TPRate = TP / (TP + FN)
FPRate = FP / (FP + TN)
where TPRate is the true positive rate and FPRate is the false positive rate; TP is the number of true positives, i.e. actual positive samples predicted as positive; FN is the number of false negatives, i.e. actual positive samples predicted as negative; FP is the number of false positives, i.e. actual negative samples predicted as positive; and TN is the number of true negatives, i.e. actual negative samples predicted as negative;
E5. Drawing the ROC curve with FPRate as the horizontal axis and TPRate as the vertical axis, and calculating the area under the ROC curve by the infinitesimal (numerical integration) method; this area is the AUC value.
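Steps E2 to E5 can be sketched in plain Python. Each distinct score is used as a threshold, the true positive rate and false positive rate are computed at that threshold, and the area is accumulated with the trapezoidal rule as one concrete form of the numerical integration mentioned in E5. The sketch treats a score equal to the threshold as positive, which is an assumption, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (FPRate, TPRate) points, one per distinct threshold, starting at (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both positive and negative samples are required")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

labels = [1, 0, 1, 0]                 # toy true labels
scores = [0.9, 0.8, 0.7, 0.1]         # toy association probabilities
auc = auc_trapezoid(roc_points(labels, scores))
```

For the toy data above, three of the four positive-negative pairs are ranked correctly, which corresponds to an AUC of 0.75.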
In this example, in order to verify the method for predicting the microorganism-drug association effect of the present invention, the method and 5 existing methods are run on the MDAD dataset with default parameter settings, and the AUC value is used as the performance evaluation index; the larger the AUC value, the higher the accuracy of the method.
In this example, 5-fold cross validation and 10-fold cross validation were performed on all methods including the present invention, and the experimentally validated drug combinations were randomly divided into 5 or 10 subsets of the same size, each subset in turn being used as a test set, with the remainder being used to train the model. In order to eliminate random sampling deviation, the process is repeated for 10 times, a final AUC score is calculated according to the average value of AUC values in 10 repeated verifications, and the final AUC score is used as a performance index to evaluate the accuracy of each method.
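The repeated cross validation protocol described above can be sketched as follows. The `evaluate` callback stands in for training the model on the training indices and computing the AUC on the test indices. It is a hypothetical placeholder, as are the function names and the use of the repetition index as the random seed.

```python
import random

def k_fold_indices(n_samples, k, seed):
    """Shuffle sample indices and split them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def repeated_cross_validation(n_samples, k, repeats, evaluate):
    """Average the score returned by `evaluate(train, test)` over all folds and repeats."""
    scores = []
    for repetition in range(repeats):
        folds = k_fold_indices(n_samples, k, seed=repetition)
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            scores.append(evaluate(train, test))
    return sum(scores) / len(scores)

# Dummy evaluation that only reports the test fraction, to show the call pattern.
mean_score = repeated_cross_validation(
    n_samples=10, k=5, repeats=10,
    evaluate=lambda train, test: len(test) / (len(train) + len(test)),
)
```

Using a different seed in every repetition changes the random partition, which is what allows averaging over repetitions to reduce sampling bias.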
The validation results are shown in the table below, in which the method of the present invention is denoted G2 gnamda. In the 5-fold cross validation, the final AUC score of the present invention is the highest of all methods, and it is also the highest of all methods in the 10-fold cross validation. On this benchmark, therefore, the accuracy of the method is superior to that of the existing methods:
Figure BDA0003784609010000141
It should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the present invention clearly, and are not intended to limit its embodiments. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the claims of the present invention.

Claims (10)

1. A method of predicting a microorganism-drug association effect, comprising: the method comprises the following steps:
S1. Constructing a microorganism-drug association network through a microorganism-drug association database, the association network being referred to as Net1;
S2. Retrieving the microorganism-microorganism interactions through the microorganism database in the microorganism-drug association database, and retrieving the drug-drug interactions through the drug database in the microorganism-drug association database; constructing an interaction network according to the microorganism-microorganism interactions and the drug-drug interactions, the interaction network being referred to as Net2;
S3. Constructing the topological attribute network of the drugs through the drug database, and constructing the microorganism gene sequences through the microorganism database; constructing the microorganism-drug multi-modal attribute map according to the comprehensive similarity attribute and the drug network topology attribute of the drugs in the drug database, and the functional similarity attribute and the genome sequence attribute of the microorganisms in the microorganism database;
S4. Establishing a regularized graph neural network model according to Net1, Net2 and the microorganism-drug multi-modal attribute map;
S5. Inputting Net1 and Net2 into the graph neural network model together with the microorganism-drug multi-modal attribute map to obtain the embedded representations Z1 and Z2; inputting the embedded representations Z1 and Z2 into the graph neural network for training to obtain the trained graph neural network;
S6. Predicting the microorganism-drug association effect in the microorganism-drug data set using the trained graph neural network.
2. The method of predicting a microbe-drug association as recited in claim 1, wherein: in the step S3, a topological attribute network of the medicine is constructed through the medicine database, and a microbial gene sequence is constructed through the microbial database; the specific steps of constructing the microorganism-drug multi-modal attribute map according to the comprehensive similarity attribute and drug network topology attribute of the drugs in the drug database and the functional similarity attribute and genome sequence attribute of the microorganisms in the microorganism database are as follows:
S301. Constructing the similarity feature matrix of the drugs according to the drug similarity attributes in the drug database, and constructing the topological attribute network of the drugs through the drug database, thereby obtaining the second attribute feature matrix of the drugs;
S302. Constructing the similarity feature matrix of the microorganisms according to the functional similarity attributes of the microorganisms in the microorganism database, and constructing the microorganism gene sequences through the microorganism database, thereby obtaining the second attribute feature matrix of the microorganisms;
S303. Constructing the microorganism-drug similarity feature network according to the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms;
S304. Constructing the microorganism-drug second attribute feature network according to the second attribute feature matrix of the drugs and the second attribute feature matrix of the microorganisms;
S305. Combining the microorganism-drug similarity feature network with the microorganism-drug second attribute feature network to obtain the microorganism-drug multi-modal attribute map.
3. The method of predicting a microbe-drug association effect of claim 2, wherein: in step S301, a similarity feature matrix of the drug is constructed according to the drug similarity attributes in the drug database, and a topology attribute network of the drug is constructed through the drug database, so as to obtain a second attribute feature matrix of the drug, which specifically includes:
A1. Calculating the similarity attribute of the drugs in the drug database using the SIMCOMP2 tool to obtain the molecular structure similarity matrix DS_struct(d_i, d_j) of the drugs;
A2. Representing the drug-drug interaction profile in Net2 by the matrix DIP, from which the normalized kernel bandwidth is obtained:
Figure FDA0003784609000000021
where μ denotes the normalized kernel bandwidth; μ' is the original bandwidth, set to 1; DIP(d_i) denotes the interaction profile of drug d_i with the other drugs; and nd denotes the number of drugs in Net1;
A3. Expressing the similarity feature matrix of the drugs as S_d(d_i, d_j):
Figure FDA0003784609000000022
A4. Constructing the topological attributes of the drug network in the drug database by the random walk with restart method: the random walk and restart are iterated on the drug network until convergence, thereby obtaining the probability distribution vector of each drug, from which the second attribute feature matrix F_d ∈ R^(nd×nd) of the drugs is constructed.
4. The method of predicting a microbe-drug association as recited in claim 3, wherein: in step A4, the formula of the random walk with restart is:
Figure FDA0003784609000000023
wherein,
Figure FDA0003784609000000024
represents the probability that the i-th node of the drug network moves to the other nodes at time t+1; θ is the restart probability; T is the transition probability matrix; p_i^(0) ∈ R^(n×1) is the starting probability vector of the i-th node of the drug network; and p_i^(t) ∈ R^(n×1) represents the probability that the i-th node of the drug network moves to the other nodes at time t.
5. The method of predicting a microbe-drug association effect of claim 2, wherein: in the step S302, the specific steps of constructing the similarity feature matrix of the microorganism according to the functional similarity attributes of the microorganisms in the microorganism database, and constructing the microorganism gene sequence through the microorganism database, thereby obtaining the second attribute feature matrix of the microorganism, are:
B1. Calculating the functional similarity attribute of the microorganisms in the microorganism database using the Kamneva tool to obtain the similarity feature matrix S_m ∈ R^(nm×nm) of the microorganisms, where nm denotes the number of microorganisms in Net1, and the similarity between microorganism m_i and microorganism m_j is denoted S_m(m_i, m_j);
B2. Encoding the original gene sequences of the microorganism data in the microorganism database to obtain the microorganism gene sequences;
B3. Padding all the encoded microorganism gene sequences with zeros so that all padded sequences have the same length;
B4. Analyzing all the padded microorganism gene sequences by principal component analysis to obtain a k-dimensional matrix, which is taken as the second attribute feature matrix F_m ∈ R^(nm×k) of the microorganisms.
6. The method of predicting a microbe-drug association as recited in claim 5, wherein: in the step S303, a microorganism-drug similarity feature network is constructed according to the drug similarity feature matrix and the microorganism similarity feature matrix, and the specific steps are as follows:
C1. Constructing the microorganism-drug similarity feature network X_simility according to the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms:
Figure FDA0003784609000000031
C2. Constructing the microorganism-drug second attribute feature network X_secondary according to the second attribute feature matrix of the drugs and the second attribute feature matrix of the microorganisms:
Figure FDA0003784609000000032
C3. Combining the microorganism-drug similarity feature network with the microorganism-drug second attribute feature network to obtain the microorganism-drug multi-modal attribute map X:
X = [X_simility, X_secondary].
7. The method of predicting a microbe-drug association effect of claim 6, wherein: in the step S4, a regularized graph neural network model is established according to Net1, Net2 and the microorganism-drug multi-modal attribute map, and the specific steps are as follows:
S401. Establishing the microorganism-drug feature matrix according to the microorganism-drug multi-modal attribute map, and constructing the microorganism-drug heterogeneous matrix, in which v_i denotes the microorganism or drug at any node; the heterogeneous matrix is expressed as follows:
Figure FDA0003784609000000041
wherein Y is a characteristic matrix of the microorganism-drug,
Figure FDA0003784609000000042
represents the content feature vector of node v_i;
S402. Setting a learnable matrix W ∈ R^(m×f) and assigning initial values to its elements using random numbers, where f is the dimension of the node embedding representation, set as a hyper-parameter, n = nd + nm is the number of nodes, and m = 2 × (nd + nm) is the feature dimension of the nodes; based on
Figure FDA0003784609000000043
and W, generating the feature-transformed vector
Figure FDA0003784609000000044
Figure FDA0003784609000000045
S403. Setting a scaling constant s ∈ R, which represents the norm of the propagated hidden features, and generating, in the GNCN network of the regularized graph neural network model, the normalized feature-transformed vector
Figure FDA0003784609000000046
Figure FDA0003784609000000047
S404. The formula g(·) of the L2 regularization is:
Figure FDA0003784609000000048
S405. Encoding the microorganism-drug association network and the microorganism-drug multi-modal attribute map using the GNCN encoder:
Figure FDA0003784609000000049
where A ∈ R^(nd×nm) is the adjacency matrix of the association network of step S1; if a known association exists between nodes i and j in the association network, the element A_ij of A is set to 1, and otherwise to 0;
Figure FDA00037846090000000410
where I_N is the identity matrix of order N, and
Figure FDA00037846090000000411
is the degree matrix of
Figure FDA00037846090000000412
8. The method of predicting a microbe-drug association effect of claim 7, wherein: in the step S5, net1 and Net2 are combined with a multi-modal attribute diagram of the microorganism-drug to be input into a graph neural network model to obtain embedded expressions Z1 and Z2; inputting the embedded expressions Z1 and Z2 into a neural network of the graph for training, and specifically obtaining the trained neural network of the graph comprises the following steps:
S501. Propagating the normalized vector through the GNCN network to generate the node embedding vector
Figure FDA0003784609000000051
Figure FDA0003784609000000052
where
Figure FDA0003784609000000053
is the unit vector of i in the matrix,
Figure FDA0003784609000000054
is the unit vector of j in the matrix, deg_i is the degree of node i, and deg_j is the degree of node j;
S502. Generating the node embedding matrix from the node embedding vectors
Figure FDA0003784609000000055
thereby producing the latent variable Z ∈ R^(n×f) of the GNCN encoder, and obtaining Z1 corresponding to Net1 and Z2 corresponding to Net2:
Z = GNCN(X, A, s);
S503. Defining a loss function, which is the binary cross entropy between the multi-modal attribute map and the reconstructed graph obtained by the graph neural network during training:
Figure FDA0003784609000000056
where L is the loss function, N is the total number of nodes, y is the value of an element of the adjacency matrix A and is equal to 0 or 1, and
Figure FDA0003784609000000057
is the value of the corresponding element of the reconstructed adjacency matrix
Figure FDA0003784609000000058
which lies between 0 and 1;
S504. Inputting Z1 and Z2 into the DNN classifier of the graph neural network model, setting the number of training epochs to k2, using stochastic gradient descent during training, and stopping training when the loss function converges, to obtain the trained graph neural network.
9. The method of predicting a microbe-drug association effect of claim 8, wherein: in the step S5, after the neural network model of the graph is trained, the accuracy of the neural network model of the graph is verified, and the verification specifically includes the steps of:
D1. Introducing a k-fold cross validation framework: under this framework, all known microorganism-drug association data in the existing microorganism-drug association database are randomly divided into k1 groups; for each of the k1 groups, a randomly sampled subset of unknown association pairs of the same size is selected as the test set, and the remaining known association pairs are used as the training set;
D2. inputting the test set into the trained graph neural network model to obtain a classification result;
D3. if the classification result is positive, predicting that the microorganism is associated with the medicine, and if the classification result is negative, predicting that the microorganism is not associated with the medicine;
D4. obtaining an AUC value of the trained graph neural network model according to the classification result; and verifying the accuracy of the graph neural network model according to the AUC value.
10. The method of predicting a microbe-drug association effect of claim 9, wherein: in the step D4, the AUC value of the trained graph neural network model is obtained according to the classification result, and the specific steps are as follows:
E1. Inputting the training set into the model to obtain the current model's reconstructed graph of the training set, and recording the scores of the edges between nodes in this reconstructed graph as association probabilities, each of which takes a value between 0 and 1;
E2. Using each association probability in turn as the classification threshold: samples whose association probability is larger than the threshold are regarded as positive samples, and samples whose association probability is smaller than the threshold are regarded as negative samples;
E3. Obtaining the true label of each edge in the training set according to the microorganism-drug association relationships in the training set, where the true label is 0 or 1; 0 indicates that the edge does not exist, i.e. there is no association and the sample is actually negative, and 1 indicates that the edge exists, i.e. there is an association and the sample is actually positive;
E4. Computing the true positive rate and the false positive rate under each classification threshold:
TPRate = TP / (TP + FN)
FPRate = FP / (FP + TN)
where TPRate is the true positive rate and FPRate is the false positive rate; TP is the number of true positives, i.e. actual positive samples predicted as positive; FN is the number of false negatives, i.e. actual positive samples predicted as negative; FP is the number of false positives, i.e. actual negative samples predicted as positive; and TN is the number of true negatives, i.e. actual negative samples predicted as negative;
E5. Drawing the ROC curve with FPRate as the horizontal axis and TPRate as the vertical axis, and calculating the area under the ROC curve by the infinitesimal (numerical integration) method; this area is the AUC value.
CN202210938454.8A 2022-08-05 2022-08-05 Method and system for predicting microorganism-drug association effect Pending CN115472305A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210938454.8A CN115472305A (en) 2022-08-05 2022-08-05 Method and system for predicting microorganism-drug association effect

Publications (1)

Publication Number Publication Date
CN115472305A (en) 2022-12-13

Family

ID=84366630

Country Status (1)

Country Link
CN (1) CN115472305A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117095741A (en) * 2023-10-19 2023-11-21 华东交通大学 Graph self-attention-based microorganism-drug association prediction method
CN117095741B (en) * 2023-10-19 2024-01-30 华东交通大学 Graph self-attention-based microorganism-drug association prediction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination