CN115472305A

CN115472305A - Method and system for predicting microorganism-drug association effect

Info

Publication number: CN115472305A
Application number: CN202210938454.8A
Authority: CN
Inventors: 黄浩楠; 刘冬宁; 蔡阅霖
Original assignee: Guangdong University of Technology
Current assignee: Guangdong University of Technology
Priority date: 2022-08-05
Filing date: 2022-08-05
Publication date: 2022-12-13

Abstract

The invention discloses a method for predicting a microorganism-drug association effect, comprising the following steps: S1, constructing a microorganism-drug association network Net1 from a microorganism-drug association database; S2, constructing an interaction network Net2 from the microorganism-drug association database; S3, constructing a multi-modal attribute graph of the microorganisms and the drugs from the comprehensive similarity of the drugs, the topology of the drug network, the functional similarity of the microorganisms, and their genome sequences; S4, establishing a graph neural network model with regularization; S5, obtaining embedded representations Z1 and Z2, and inputting Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network; and S6, acquiring a data set to be predicted and predicting the associations between the microorganisms and the drugs in that data set with the trained graph neural network. The invention addresses the inability of the prior art to construct interpretable node features for microorganisms and drugs, and also takes into account the sparsity of existing microorganism-drug association data sets.

Description

Method and system for predicting microorganism-drug association effect

Technical Field

The invention relates to the technical field of bioinformatics, in particular to a method for predicting a microorganism-drug association effect.

Background

In recent years, medical research has focused on the relationship between microbial community imbalance and drug efficacy and toxicity. However, the complex mechanisms by which microbial communities interact with drugs in the human body are still not comprehensively understood. New drug development currently faces two major challenges. On the one hand, few new classes of antibiotics are being discovered: most work focuses on optimizing or combining known compounds, target species are difficult to culture under laboratory conditions, and most candidate drugs fail during experiments. On the other hand, the number of drug-resistant bacteria is increasing at an alarming rate. A growing number of studies show that microorganisms and drugs interact closely, and some microorganism-drug associations have been confirmed by culture experiments, but these are not enough to elucidate the complex interaction mechanisms between human microorganisms and drugs. There is therefore an urgent need for an efficient method to systematically explore possible associations between microorganisms and drugs.

Two types of computational methods currently exist for predicting the relationship of microorganisms to drugs.

The first category relies primarily on similarity measures, for example the HMDAKATZ method, which uses the KATZ measure. Such measures are too simple to reflect similarity adequately, which leads to inaccurate identification of associations.

The second category uses graph learning, which exploits the rich semantic information in graph-structured data and achieves better predictive power than the similarity-measure methods. Two approaches to learning graph features are currently common: meta-paths and graph convolutional networks (GCN).

The meta-path approach predicts mainly from the edge information of microorganism-drug associations. It combines metapath2vec with neural-network-based recommendation to learn low-dimensional embedded representations of microorganisms and drugs. This improves the predictive ability of the model, but it relies too heavily on edge information: when a new drug or a new microorganism is introduced, no edge information exists for it, so prediction fails.

Compared with meta-paths, the GCN approach captures node information as well as edge information, and GCN-based prediction of microorganism-drug associations has therefore attracted considerable interest. Long et al. first applied a GCN encoder to microorganism-drug association prediction in the GCNMDA method and introduced a conditional random field into the GCN hidden layer. There is also a node-level GCN attention method, EGATMDA, which learns node (i.e., microorganism and drug) embeddings that effectively preserve the target neighbors of the graph and only the relevant information. However, existing methods fail to construct node features that contain biological information.

In summary, existing methods for predicting microorganism-drug associations cannot construct rich, interpretable node features for microorganisms and drugs. Providing a prediction method that can construct such features is therefore a technical problem that urgently needs to be solved in this field.

Disclosure of Invention

The invention provides a method for predicting a microorganism-drug association effect. It aims to solve the problem that the prior art cannot construct interpretable node features for microorganisms and drugs, and it also takes into account the sparsity of existing microorganism-drug association data sets.

To achieve this purpose, the technical solution is as follows:

A method for predicting a microorganism-drug association effect, comprising the following steps:

S1, constructing a microorganism-drug association network, referred to as Net1, from a microorganism-drug association database;

S2, retrieving microorganism-microorganism interactions from the microorganism database of the microorganism-drug association database, and retrieving drug-drug interactions from the drug database of the microorganism-drug association database; constructing an interaction network, referred to as Net2, from the microorganism-microorganism interactions and the drug-drug interactions;

S3, constructing a topological attribute network of the drugs from the drug database, and constructing microbial gene sequences from the microorganism database; constructing a microorganism-drug multi-modal attribute graph from the comprehensive similarity attributes and drug network topology attributes of the drugs in the drug database and the functional similarity attributes and genome sequence attributes of the microorganisms in the microorganism database;

S4, establishing a graph neural network model with regularization based on Net1, Net2, and the microorganism-drug multi-modal attribute graph;

S5, inputting Net1 and Net2, together with the microorganism-drug multi-modal attribute graph, into the graph neural network model to obtain embedded representations Z1 and Z2; inputting Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network;

S6, acquiring a data set to be predicted, and predicting the associations between the microorganisms and the drugs in that data set with the trained graph neural network.

The invention constructs the microorganism-drug association network Net1 and the interaction network Net2 from a microorganism-drug association database; establishes a graph neural network model with regularization; inputs Net1 and Net2, together with the microorganism-drug multi-modal attribute graph, into the model to obtain embedded representations Z1 and Z2; and inputs Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network. Interpretable node features of microorganisms and drugs are thereby constructed, and the sparsity of existing microorganism-drug association data sets is taken into account.

Preferably, in step S3, the microorganism-drug multi-modal attribute graph is constructed through the following specific steps:

S301, constructing a similarity feature matrix of the drugs from the drug similarity attributes in the drug database, and constructing a topological attribute network of the drugs from the drug database to obtain a second attribute feature matrix of the drugs;

S302, constructing a similarity feature matrix of the microorganisms from the functional similarity attributes of the microorganisms in the microorganism database, and constructing microbial gene sequences from the microorganism database to obtain a second attribute feature matrix of the microorganisms;

S303, constructing a microorganism-drug similarity feature network from the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms;

S304, constructing a microorganism-drug second attribute feature network from the second attribute feature matrix of the drugs and the second attribute feature matrix of the microorganisms;

S305, combining the microorganism-drug similarity feature network with the microorganism-drug second attribute feature network to obtain the microorganism-drug multi-modal attribute graph.

Further, in step S301, the similarity feature matrix and the second attribute feature matrix of the drugs are obtained through the following specific steps:

A1. calculating the similarity attributes of the drugs in the drug database with the SIMCOMP2 tool to obtain the molecular structure similarity matrix of the drugs, $DS^{struct}(d_i, d_j)$;

A2. representing the drug-drug interaction profiles in Net2 by a matrix $DIP$ and computing the normalized kernel bandwidth:

where $\mu$ denotes the normalized kernel bandwidth, $\mu'$ is the original bandwidth, set to 1, $DIP(d_i)$ denotes the interaction profile of drug $d_i$ with the other drugs, and $nd$ represents the number of drugs in Net1;

A3. expressing the similarity feature matrix of the drugs as $S_d(d_i, d_j)$:

A4. constructing the drug network topology attributes in the drug database by random walk with restart: random walks with restart are performed on the drug network until convergence, thereby obtaining a probability distribution vector for each drug, from which the second attribute feature matrix of the drugs $F_d \in R^{nd \times nd}$ is constructed.
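
As a non-limiting illustration of step A2, the sketch below assumes the form commonly used for Gaussian interaction profile kernels: the bandwidth is $\mu = \mu' / (\frac{1}{nd}\sum_i \|DIP(d_i)\|^2)$ and the kernel value is $\exp(-\mu \|DIP(d_i) - DIP(d_j)\|^2)$. Neither equation is reproduced in the text above, so both forms, the function names `normalized_bandwidth` and `gip_kernel`, and the toy matrix are assumptions.

```python
import numpy as np

def normalized_bandwidth(dip, mu_prime=1.0):
    """Assumed form: mu = mu' / (mean squared norm of the interaction profiles)."""
    mean_sq_norm = np.mean(np.sum(dip ** 2, axis=1))
    return mu_prime / mean_sq_norm

def gip_kernel(dip, mu_prime=1.0):
    """Gaussian interaction profile kernel between all pairs of rows of dip."""
    mu = normalized_bandwidth(dip, mu_prime)
    sq_norms = np.sum(dip ** 2, axis=1)
    # Squared Euclidean distance between every pair of profiles.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * dip @ dip.T
    return np.exp(-mu * np.maximum(sq_dist, 0.0))

# Toy drug-drug interaction profile matrix DIP for 3 drugs (hypothetical data).
DIP = np.array([[0.0, 1.0, 1.0],
                [1.0, 0.0, 0.0],
                [1.0, 0.0, 0.0]])
K = gip_kernel(DIP)
```

Drugs with identical interaction profiles (rows 1 and 2 here) receive a kernel value of 1, and the value decays as the profiles diverge.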

Further, in step A4, the formula for the random walk with restart is:

wherein $p_i^{(t+1)}$ represents the probability that the $i$-th node of the drug network moves to other nodes at time $t+1$, $\theta$ is the restart probability, $T$ is the transition probability matrix, $p_i^{(0)} \in R^{n \times 1}$ represents the starting probability vector of the $i$-th node of the drug network, and $p_i^{(t)} \in R^{n \times 1}$ represents the probability that the $i$-th node of the drug network moves to other nodes at time $t$.
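
As a non-limiting illustration, the sketch below iterates the usual random-walk-with-restart update $p_i^{(t+1)} = (1-\theta)\,T\,p_i^{(t)} + \theta\,p_i^{(0)}$ until convergence. This update rule is an assumption based on the symbols defined above, since the formula itself is not reproduced in the text; the column-normalized transition matrix, the function name, and the toy network are likewise hypothetical.

```python
import numpy as np

def random_walk_with_restart(adj, theta=0.5, tol=1e-8, max_iter=1000):
    """Return a matrix whose i-th row is the converged distribution for start node i."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0      # isolated nodes: avoid division by zero
    T = adj / col_sums                 # column-stochastic transition matrix
    P0 = np.eye(n)                     # column i is the start vector p_i^(0)
    P = P0.copy()
    for _ in range(max_iter):
        P_next = (1.0 - theta) * T @ P + theta * P0
        converged = np.abs(P_next - P).max() < tol
        P = P_next
        if converged:
            break
    return P.T                         # row i is the distribution for drug i

# Toy drug network with 3 drugs (hypothetical data).
adj = [[0, 1, 1],
       [1, 0, 0],
       [1, 0, 0]]
F_d = random_walk_with_restart(adj, theta=0.5)
```

Stacking the converged vectors row by row gives an $nd \times nd$ matrix, which matches the stated shape of $F_d$.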

Further, in step S302, the similarity feature matrix and the second attribute feature matrix of the microorganisms are obtained through the following specific steps:

B1. calculating the functional similarity attributes of the microorganisms in the microorganism database with the Kamneva tool to obtain the similarity feature matrix of the microorganisms $S_m \in R^{nm \times nm}$, wherein $nm$ represents the number of microorganisms in Net1; the similarity between microorganism $m_i$ and microorganism $m_j$ is denoted $S_m(m_i, m_j)$;

B2. encoding the original gene sequences of the microorganisms in the microorganism database to obtain encoded microbial gene sequences;

B3. padding all encoded microbial gene sequences with zeros so that all padded sequences have the same length;

B4. analyzing all padded microbial gene sequences by principal component analysis to obtain a $k$-dimensional matrix, which serves as the second attribute feature matrix of the microorganisms $F_m \in R^{nm \times k}$.
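
As a non-limiting illustration of steps B2 to B4, the sketch below integer-encodes each sequence, zero-pads to a common length, and reduces the result to $k$ dimensions by principal component analysis. The text does not state the encoding scheme, so the mapping `BASE_CODE` is a hypothetical choice, as are the function names and the toy sequences.

```python
import numpy as np

# Hypothetical nucleotide encoding; the source does not state the scheme used.
BASE_CODE = {"A": 1, "C": 2, "G": 3, "T": 4}

def encode_and_pad(sequences):
    """Integer-encode each sequence (B2) and zero-pad to a common length (B3)."""
    max_len = max(len(s) for s in sequences)
    out = np.zeros((len(sequences), max_len))
    for row, seq in enumerate(sequences):
        out[row, :len(seq)] = [BASE_CODE.get(b, 0) for b in seq.upper()]
    return out

def pca(matrix, k):
    """Project rows onto the top-k principal components (B4) via SVD."""
    centered = matrix - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# Toy sequences for 4 microorganisms (hypothetical data).
seqs = ["ACGT", "ACG", "TTGACA", "GG"]
padded = encode_and_pad(seqs)
F_m = pca(padded, k=2)
```

The result has one row per microorganism and $k$ columns, matching $F_m \in R^{nm \times k}$.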

Further, in step S303, the microorganism-drug similarity feature network is constructed from the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms through the following specific steps:

C1. constructing the microorganism-drug similarity feature network $X_{simility}$ from the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms:

C2. constructing the microorganism-drug second attribute feature network $X_{secondary}$ from the second attribute feature matrix of the drugs and the second attribute feature matrix of the microorganisms:

C3. combining the microorganism-drug similarity feature network and the microorganism-drug second attribute feature network to obtain the microorganism-drug multi-modal attribute graph $X$:

$X = [X_{simility}, X_{secondary}]$.
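
As a non-limiting illustration of steps C1 to C3, the sketch below assumes that each feature network places the drug matrix and the microorganism matrix on the diagonal of a block matrix and that $X$ is their column-wise concatenation. The block-diagonal layout is an assumption, because the equations for C1 and C2 are not reproduced in the text; the helper `block_diag2` and the random inputs are hypothetical.

```python
import numpy as np

def block_diag2(top_left, bottom_right):
    """Place two matrices on the diagonal of a zero matrix (assumed layout)."""
    r1, c1 = top_left.shape
    r2, c2 = bottom_right.shape
    out = np.zeros((r1 + r2, c1 + c2))
    out[:r1, :c1] = top_left
    out[r1:, c1:] = bottom_right
    return out

nd, nm, k = 3, 2, 2                      # toy sizes; k = nm is chosen here
rng = np.random.default_rng(0)
S_d, F_d = rng.random((nd, nd)), rng.random((nd, nd))
S_m, F_m = rng.random((nm, nm)), rng.random((nm, k))

X_simility = block_diag2(S_d, S_m)       # (nd + nm) x (nd + nm)
X_secondary = block_diag2(F_d, F_m)      # (nd + nm) x (nd + k)
X = np.hstack([X_simility, X_secondary])
```

Under this layout the width of $X$ is $2nd + nm + k$, which equals the feature dimension $2 \times (nd + nm)$ stated for the nodes only when $k = nm$.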

Furthermore, in step S4, the graph neural network model with regularization is established based on Net1, Net2, and the microorganism-drug multi-modal attribute graph through the following specific steps:

S401. establishing a microorganism-drug feature matrix from the microorganism-drug multi-modal attribute graph and constructing a microorganism-drug heterogeneous matrix, in which $v_i$ denotes any node (a microorganism or a drug); the heterogeneous matrix is represented as follows:

wherein $Y$ is the microorganism-drug feature matrix, which holds the content feature vector of each node $v_i$;

S402. setting a learnable matrix $W \in R^{m \times f}$ and initializing its elements with random numbers, where $f$ is the node embedding dimension set as a hyper-parameter, $n = nd + nm$ is the number of nodes, and $m = 2 \times (nd + nm)$ is the feature dimension of the nodes; generating a feature-transformed vector for each node from its content feature vector and $W$;

S403. setting a scaling constant $s \in R$, which represents the norm of the propagated hidden features, and generating normalized feature-transformed vectors in the GNCN network of the regularized graph neural network model;

S404. defining the L2 regularization function $g(\cdot)$:

S405. encoding the microorganism-drug association network and the microorganism-drug multi-modal attribute graph with a GNCN encoder:

wherein $A \in R^{nd \times nm}$ is the adjacency matrix of the association network of step S1: if a known association exists between nodes $i$ and $j$ in the association network, the element $A_{ij}$ is set to 1, and otherwise it is set to 0;

wherein $I_N$ is the identity matrix of order $N$, and $\tilde{D}$ is the degree matrix of $\tilde{A}$.
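
As a non-limiting illustration of steps S402 to S405, the sketch below scales each transformed feature vector to L2 norm $s$ and then propagates it over a symmetrically normalized adjacency matrix with self-loops. The equations are not reproduced in the text, so the GCN-style normalization with $\tilde{A} = A + I_N$, the embedding of the $nd \times nm$ association matrix into a square heterogeneous adjacency matrix, and the names `heterogeneous_adjacency` and `gncn_layer` are all assumptions.

```python
import numpy as np

def heterogeneous_adjacency(assoc):
    """Embed an nd x nm association matrix into a symmetric N x N adjacency."""
    nd, nm = assoc.shape
    top = np.hstack([np.zeros((nd, nd)), assoc])
    bottom = np.hstack([assoc.T, np.zeros((nm, nm))])
    return np.vstack([top, bottom])

def gncn_layer(X, A, W, s):
    """Scaled L2 normalization of XW followed by normalized propagation."""
    H = X @ W                                        # feature transform (S402)
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    H = s * H / np.maximum(norms, 1e-12)             # each row now has norm s
    A_tilde = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))  # degrees are at least 1
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return A_hat @ H                                 # node embeddings Z

rng = np.random.default_rng(0)
assoc = np.array([[1, 0], [0, 1], [1, 1]], dtype=float)  # 3 drugs x 2 microbes
A = heterogeneous_adjacency(assoc)
X = rng.random((5, 10))              # toy multi-modal attribute matrix
W = rng.normal(size=(10, 4))         # learnable matrix, randomly initialized
Z = gncn_layer(X, A, W, s=1.8)
```

Fixing the norm of every hidden vector to $s$ before propagation keeps the scale of the embeddings independent of the scale of the raw features, which is one way to read the role of the scaling constant in S403.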

Furthermore, in step S5, obtaining the embedded representations Z1 and Z2 and training the graph neural network specifically comprise the following steps:

S501. propagating the normalized vectors through the GNCN network to generate the node embedding vectors, wherein the propagated vectors are the unit vectors of nodes $i$ and $j$ in the matrix, $deg_i$ is the degree of node $i$, and $deg_j$ is the degree of node $j$;

S502. generating a node embedding matrix from the node embedding vectors, which gives the latent variable $Z \in R^{n \times f}$ of the GNCN encoder, and obtaining Z1 corresponding to Net1 and Z2 corresponding to Net2:

$Z_i = GNCN(X, A, s)$;

S503. defining a loss function, which is the binary cross-entropy between the multi-modal attribute graph and the reconstructed graph produced by the graph neural network during training:

wherein $L$ is the loss function, $N$ is the total number of nodes, $y$ is the value of an element of the adjacency matrix $A$ (0 or 1), and the predicted value is the corresponding element of the reconstructed adjacency matrix, which lies between 0 and 1;

S504. inputting Z1 and Z2 into the DNN classifier of the graph neural network model, setting the number of training epochs to k2, using stochastic gradient descent during training, and stopping training when the loss function converges, to obtain the trained graph neural network.
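
As a non-limiting illustration of the loss in step S503, the sketch below computes a mean binary cross-entropy between a 0/1 adjacency matrix and a reconstructed matrix with entries between 0 and 1. The loss formula is not reproduced in the text, so the averaging over all entries is an assumption; the inner-product decoder `sigmoid(Z @ Z.T)` used to produce the reconstruction is also an assumption, chosen only to obtain values in the stated range.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over all entries."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))

rng = np.random.default_rng(0)
Z = rng.normal(size=(4, 3))                    # toy node embeddings
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)      # toy adjacency matrix
A_rec = sigmoid(Z @ Z.T)                       # assumed inner-product decoder
loss = binary_cross_entropy(A, A_rec)
```

The loss approaches zero as the reconstructed entries approach the true 0/1 values, which is the convergence criterion that step S504 monitors.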

Furthermore, in step S5, after the graph neural network model is trained, it is validated through the following specific steps:

D1. introducing a k-fold cross-validation framework, under which all known microorganism-drug associations in the existing microorganism-drug association database are randomly divided into k1 groups; each of the k1 groups in turn, together with a randomly sampled subset of unknown association pairs of the same size, is used as the test set, and the remaining known association pairs are used as the training set;

D2. inputting the test set into the trained graph neural network model to obtain a classification result;

D3. if the classification result is positive, predicting that the microorganism is associated with the drug; if the classification result is negative, predicting that the microorganism is not associated with the drug;

D4. obtaining the AUC value of the trained graph neural network model from the classification results, and verifying the accuracy of the graph neural network model according to the AUC value.
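
As a non-limiting illustration of step D1, the sketch below splits the known associations into k1 folds and, for each fold, samples an equal number of unknown pairs to complete the test set. The function name `kfold_with_negatives`, the seed, and the toy association matrix are hypothetical.

```python
import numpy as np

def kfold_with_negatives(assoc, k1=5, seed=0):
    """Yield (train_pos, test_pos, test_neg) index arrays for each fold."""
    rng = np.random.default_rng(seed)
    pos = np.argwhere(assoc == 1)            # known association pairs
    neg = np.argwhere(assoc == 0)            # unknown pairs
    pos = pos[rng.permutation(len(pos))]     # shuffle before splitting
    folds = np.array_split(pos, k1)
    for i in range(k1):
        test_pos = folds[i]
        train_pos = np.vstack([folds[j] for j in range(k1) if j != i])
        # Sample as many unknown pairs as there are held-out known pairs.
        pick = rng.choice(len(neg), size=len(test_pos), replace=False)
        yield train_pos, test_pos, neg[pick]

# Toy 6 x 5 association matrix with 10 known associations (hypothetical data).
assoc = np.zeros((6, 5), dtype=int)
assoc[np.arange(6), np.arange(6) % 5] = 1
assoc[0, 1:] = 1
splits = list(kfold_with_negatives(assoc, k1=5))
```

Each element of `splits` holds row and column indices into the association matrix, so the held-out pairs can be masked out of the training graph before the model is fitted.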

Further, in step D4, the AUC value of the trained graph neural network model is obtained from the classification results through the following specific steps:

E1. inputting the training set into the model to obtain the current model's reconstructed graph of the training set, and recording the score of each edge between nodes in the reconstructed graph as an association probability, which takes a value between 0 and 1;

E2. taking each association probability in turn as the classification threshold: samples whose association probability is greater than the threshold are regarded as positive samples, and samples whose association probability is smaller than the threshold are regarded as negative samples;

E3. obtaining the ground-truth label of each edge in the training set from the microorganism-drug associations in the training set; the label is 0 or 1, where 0 indicates that the edge does not exist (no association, an actual negative sample) and 1 indicates that the edge exists (an association, an actual positive sample);

E4. computing the true positive rate and the false positive rate at each classification threshold:

$TPRate = \frac{TP}{TP + FN}, \quad FPRate = \frac{FP}{FP + TN}$

wherein TPRate is the true positive rate and FPRate is the false positive rate; TP (true positives) is the number of actual positive samples predicted as positive; FN (false negatives) is the number of actual positive samples predicted as negative; FP (false positives) is the number of actual negative samples predicted as positive; and TN (true negatives) is the number of actual negative samples predicted as negative;

E5. plotting the ROC curve with FPRate on the horizontal axis and TPRate on the vertical axis, and calculating the area under the ROC curve by numerical integration (the infinitesimal-element method); this area is the AUC value.
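
As a non-limiting illustration of steps E2 to E5, the sketch below takes each distinct score in turn as the threshold, computes the true positive rate TP / (TP + FN) and the false positive rate FP / (FP + TN), and integrates the resulting ROC curve with the trapezoidal rule. Treating a score equal to the threshold as positive is an assumption, as are the function name `roc_auc` and the toy data.

```python
import numpy as np

def roc_auc(labels, scores):
    """ROC points from thresholding at each score, and AUC by the trapezoidal rule."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.sort(np.unique(scores))[::-1]   # highest threshold first
    tpr, fpr = [0.0], [0.0]
    for th in thresholds:
        pred = scores >= th
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        tn = np.sum(~pred & (labels == 0))
        tpr.append(tp / (tp + fn))
        fpr.append(fp / (fp + tn))
    fpr, tpr = np.array(fpr), np.array(tpr)
    # Sum the areas of the trapezoids between consecutive ROC points.
    auc = float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc

# Toy labels and association probabilities (hypothetical data).
fpr, tpr, auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

For this toy input the ROC points are (0, 0), (0, 0.5), (0.5, 0.5), (0.5, 1), and (1, 1), giving an AUC of 0.75.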

The invention has the following beneficial effects:

Drawings

FIG. 1 is a schematic flow diagram of a method of predicting a microorganism-drug association of the present invention.

FIG. 2 is a schematic flow chart of constructing the multi-modal attribute graph in the method for predicting a microorganism-drug association effect of the present invention.

FIG. 3 is a schematic flow chart of predicting the association probability in the method for predicting a microorganism-drug association effect of the present invention.

Detailed Description

The invention is described in detail below with reference to the drawings and the detailed description.

Example 1

As shown in FIG. 1, a method for predicting a microorganism-drug association effect includes the following steps:

S3, constructing a topological attribute network of the drugs from the drug database, and constructing microbial gene sequences from the microorganism database; constructing a microorganism-drug multi-modal attribute graph from the comprehensive similarity attributes and drug network topology attributes of the drugs in the drug database and the functional similarity attributes and genome sequence attributes of the microorganisms in the microorganism database;

The invention constructs the microorganism-drug association network Net1 and the interaction network Net2 from a microorganism-drug association database; establishes a graph neural network model with regularization; inputs Net1 and Net2, together with the microorganism-drug multi-modal attribute graph, into the model to obtain embedded representations Z1 and Z2; and inputs Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network. Interpretable node features of microorganisms and drugs are thereby constructed, and the sparsity of existing microorganism-drug association data sets is taken into account.

Example 2

Specifically, as shown in FIG. 2, in a specific embodiment, the microorganism-drug multi-modal attribute graph of step S3 is constructed through the following specific steps:

S303, constructing a microorganism-drug similarity feature network from the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms;

In a specific embodiment, in step S301, the similarity feature matrix and the second attribute feature matrix of the drugs are obtained through the following specific steps:

A1. calculating the similarity attributes of the drugs in the drug database with the SIMCOMP2 tool to obtain the molecular structure similarity matrix of the drugs, $DS^{struct}(d_i, d_j)$;

A2. representing the drug-drug interaction profiles in Net2 by a matrix $DIP$ and computing the normalized kernel bandwidth:

A4. constructing the drug network topology attributes in the drug database by random walk with restart: random walks with restart are performed on the drug network until convergence, thereby obtaining a probability distribution vector for each drug, from which the second attribute feature matrix of the drugs $F_d \in R^{nd \times nd}$ is constructed.

In a specific embodiment, in step A4, the formula for the random walk with restart is:

wherein $p_i^{(t+1)}$ represents the probability that the $i$-th node of the drug network moves to other nodes at time $t+1$, $\theta$ is the restart probability, $T$ is the transition probability matrix, $p_i^{(0)} \in R^{n \times 1}$ represents the starting probability vector of the $i$-th node of the drug network, and $p_i^{(t)} \in R^{n \times 1}$ represents the probability that the $i$-th node of the drug network moves to other nodes at time $t$.

In a specific embodiment, in step S302, the similarity feature matrix and the second attribute feature matrix of the microorganisms are obtained through the following specific steps:

B1. calculating the functional similarity attributes of the microorganisms in the microorganism database with the Kamneva tool to obtain the similarity feature matrix of the microorganisms $S_m \in R^{nm \times nm}$, wherein $nm$ represents the number of microorganisms in Net1; the similarity between microorganism $m_i$ and microorganism $m_j$ is denoted $S_m(m_i, m_j)$;

In a specific embodiment, in step S303, the microorganism-drug similarity feature network is constructed from the similarity feature matrix of the drugs and the similarity feature matrix of the microorganisms through the following specific steps:

$X = [X_{simility}, X_{secondary}]$.

In a specific embodiment, in step S4, the graph neural network model with regularization is established based on Net1, Net2, and the microorganism-drug multi-modal attribute graph through the following specific steps:

S401. establishing a microorganism-drug feature matrix from the microorganism-drug multi-modal attribute graph and constructing a microorganism-drug heterogeneous matrix, in which $v_i$ denotes any node (a microorganism or a drug); the heterogeneous matrix is represented as follows:

wherein $Y$ is the microorganism-drug feature matrix, which holds the content feature vector of each node $v_i$;

S402. setting a learnable matrix $W \in R^{m \times f}$ and initializing its elements with random numbers, where $f$ is the node embedding dimension set as a hyper-parameter, $n = nd + nm$ is the number of nodes, and $m = 2 \times (nd + nm)$ is the feature dimension of the nodes; generating a feature-transformed vector for each node from its content feature vector and $W$;

S404. defining the L2 regularization function $g(\cdot)$:

wherein $I_N$ is the identity matrix of order $N$, and $\tilde{D}$ is the degree matrix of $\tilde{A}$.

In one embodiment, as shown in FIG. 3, in step S5, obtaining the embedded representations Z1 and Z2 and training the graph neural network specifically comprise the following steps:

Wherein,

is the unit vector of i in the matrix,

S502. generating a node embedding matrix from the node embedding vectors:

$Z_i = GNCN(X, A, s)$;

the value of the corresponding element of the reconstructed adjacency matrix lies between 0 and 1;

S504. inputting Z1 and Z2 into the DNN classifier of the graph neural network model, setting the number of training epochs to k2, using stochastic gradient descent during training, and stopping training when the loss function converges, to obtain the trained graph neural network.

Example 3

In a specific embodiment, in step S5, after the graph neural network model is trained, it is further validated, and the validation specifically includes:

D4. obtaining the AUC value of the trained graph neural network model from the classification results, and verifying the accuracy of the graph neural network model according to the AUC value.

In a specific embodiment, in step D4, the AUC value of the trained graph neural network model is obtained from the classification results through the following specific steps:

E5. plotting the ROC curve with FPRate on the horizontal axis and TPRate on the vertical axis, and calculating the area under the ROC curve by numerical integration (the infinitesimal-element method); this area is the AUC value.

In this example, to verify the method for predicting a microorganism-drug association effect of the present invention, the method and 5 existing methods were run on the MDAD data set with default parameter settings. The AUC value was used as the performance evaluation index: the greater the AUC value, the higher the accuracy of the method.

In this example, 5-fold and 10-fold cross-validation were performed on all methods, including that of the present invention. The experimentally validated drug combinations were randomly divided into 5 or 10 subsets of the same size; each subset in turn was used as the test set, and the remainder was used to train the model. To eliminate random sampling bias, the process was repeated 10 times, and the final AUC score, computed as the average of the AUC values over the 10 repetitions, was used as the performance index for evaluating the accuracy of each method.

The validation results are shown in the table below, in which the method of the present invention is denoted G2 gnamda. In the 5-fold cross-validation, the final AUC score of the present invention was the highest of all methods, and the same held in the 10-fold cross-validation. The accuracy of the method is therefore superior to that of the existing methods:

It should be understood that the above embodiments are merely examples given to illustrate the present invention clearly and are not intended to limit its implementation. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims

1. A method for predicting a microorganism-drug association effect, characterized by comprising the following steps:

S5, inputting Net1 and Net2, together with the microorganism-drug multi-modal attribute graph, into the graph neural network model to obtain embedded representations Z1 and Z2; inputting Z1 and Z2 into the graph neural network for training to obtain a trained graph neural network;

S6, predicting the microorganism-drug association effect in the microorganism-drug data set with the trained graph neural network.

2. The method for predicting a microorganism-drug association effect according to claim 1, wherein, in step S3, the microorganism-drug multi-modal attribute graph is constructed through the following specific steps:

S302, constructing a similarity feature matrix of the microorganisms from the functional similarity attributes of the microorganisms in the microorganism database, and constructing microbial gene sequences from the microorganism database to obtain a second attribute feature matrix of the microorganisms;

3. The method of predicting a microbe-drug association effect of claim 2, wherein: in step S301, a similarity feature matrix of the drug is constructed according to the drug similarity attributes in the drug database, and a topology attribute network of the drug is constructed through the drug database, so as to obtain a second attribute feature matrix of the drug, which specifically includes:

where $\mu$ represents the normalized kernel bandwidth and $\mu'$ is the original bandwidth, set to 1; $DIP(d_i)$ denotes the interaction profile of drug $d_i$ with other drugs; nd represents the number of drugs in Net1;
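The bandwidth normalization described here can be illustrated with a short sketch. It assumes the commonly used Gaussian interaction profile kernel, $\exp(-\mu \lVert DIP(d_i) - DIP(d_j) \rVert^2)$ with $\mu = \mu' / (\frac{1}{nd}\sum_k \lVert DIP(d_k) \rVert^2)$; this form, the function name, and the example profiles are assumptions for illustration and are not taken from the claim text.

```python
import math


def gaussian_profile_similarity(profiles, mu_prime=1.0):
    """Pairwise Gaussian kernel similarity between interaction profiles.

    profiles: list of equal-length rows, one interaction profile per drug.
    mu_prime: original bandwidth (the text sets it to 1).
    The kernel form is an assumed, commonly used definition.
    """
    nd = len(profiles)
    # Mean squared norm of the profiles, used to normalize the bandwidth.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / nd
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are zero")
    mu = mu_prime / mean_sq_norm

    sim = [[0.0] * nd for _ in range(nd)]
    for i in range(nd):
        for j in range(nd):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-mu * sq_dist)
    return sim


# Hypothetical 0/1 profiles for three drugs.
example = gaussian_profile_similarity([[1, 0, 1], [1, 0, 1], [0, 1, 0]])
```

Drugs with identical profiles receive similarity 1, and the similarity decays as the profiles diverge.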

4. The method of predicting a microbe-drug association effect of claim 3, wherein: in step A4, the formula of the random walk with restart is:

wherein,

5. The method of predicting a microbe-drug association effect of claim 2, wherein: in the step S302, the specific steps of constructing the similarity feature matrix of the microorganism according to the functional similarity attributes of the microorganisms in the microorganism database, and constructing the microorganism gene sequence through the microorganism database, thereby obtaining the second attribute feature matrix of the microorganism, are:

B1. Calculating the functional similarity attribute of the microorganisms in the microorganism database by using the Kamneva tool to obtain a similarity feature matrix of the microorganisms $S_m \in \mathbb{R}^{nm \times nm}$, wherein nm represents the number of microorganisms in Net1; the similarity between microorganism $m_i$ and microorganism $m_j$ is represented as $S_m(m_i, m_j)$;

6. The method of predicting a microbe-drug association as recited in claim 5, wherein: in the step S303, a microorganism-drug similarity feature network is constructed according to the drug similarity feature matrix and the microorganism similarity feature matrix, and the specific steps are as follows:

C2. Constructing a microorganism-drug second attribute feature network $X_{secondary}$ according to the drug second attribute feature matrix and the microorganism second attribute feature matrix:

C3. Combining the microorganism-drug similarity characteristic network with the microorganism-drug second attribute characteristic network to obtain a microorganism-drug multimodal attribute diagram X:

$X = [X_{simility}, X_{secondary}]$.
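A minimal sketch of step C3, assuming the bracket notation means row-wise (per-node) concatenation of the two feature matrices, so that each node keeps its similarity features followed by its second attribute features. The function name and the example values are hypothetical.

```python
def concat_features(x_simility, x_secondary):
    """Concatenate two per-node feature matrices column-wise.

    Both inputs are lists of rows, one row per node, in the same node order.
    """
    if len(x_simility) != len(x_secondary):
        raise ValueError("both matrices must have one row per node")
    return [list(a) + list(b) for a, b in zip(x_simility, x_secondary)]


# Two nodes, two similarity features and one second attribute feature each.
x = concat_features([[0.9, 0.1], [0.1, 0.9]], [[0.5], [0.2]])
```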

7. The method of predicting a microbe-drug association effect of claim 6, wherein: in the step S4, a graph neural network model introducing regularization is established according to Net1, Net2 and the microorganism-drug multi-modal attribute graph, and the specific steps are as follows:

wherein Y is a characteristic matrix of the microorganism-drug,

representing the content feature vector of the node $v_i$;

And W generating a feature transformed vector

S404, solving the formula $g(\cdot)$ of the L2 regularization:

wherein $A \in \mathbb{R}^{nd \times nm}$ is the adjacency matrix of the association network in the step S1; the element $A_{ij}$ of A is set to 1 if a known association exists between nodes i and j in the association network, and is otherwise set to 0;

wherein $I_N$ is the identity matrix of order N,

is composed of

The degree matrix of (c).
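The quantities named in this step (a 0/1 adjacency matrix A, an identity matrix $I_N$ and a degree matrix) are the usual inputs of a graph convolution normalization. A minimal sketch, assuming the common form $\tilde{D}^{-1/2}(A' + I_N)\tilde{D}^{-1/2}$, where $A'$ is the square adjacency matrix of order N = nd + nm obtained by placing A and its transpose in the off-diagonal blocks. This form and both function names are assumptions for illustration, not taken from the claim.

```python
import math


def association_adjacency(nd, nm, known_pairs):
    """Build the nd x nm matrix A: A[i][j] = 1 for a known association, else 0."""
    a = [[0] * nm for _ in range(nd)]
    for i, j in known_pairs:
        a[i][j] = 1
    return a


def normalized_with_self_loops(a):
    """Symmetrically normalize the bipartite graph after adding self-loops."""
    nd, nm = len(a), len(a[0])
    n = nd + nm
    full = [[0.0] * n for _ in range(n)]
    # Place A and its transpose in the off-diagonal blocks.
    for i in range(nd):
        for j in range(nm):
            full[i][nd + j] = full[nd + j][i] = float(a[i][j])
    # Add the identity matrix (self-loops).
    for k in range(n):
        full[k][k] += 1.0
    # Degrees of the self-loop-augmented graph; each is at least 1.
    deg = [sum(row) for row in full]
    return [[full[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


a = association_adjacency(2, 2, [(0, 0), (1, 1)])
a_norm = normalized_with_self_loops(a)
```

Because every node gets a self-loop, no degree is zero and the normalization is always defined.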

8. The method of predicting a microbe-drug association effect of claim 7, wherein: in the step S5, Net1 and Net2, combined with the microorganism-drug multi-modal attribute graph, are input into the graph neural network model to obtain embedded representations Z1 and Z2; the embedded representations Z1 and Z2 are input into the graph neural network for training, and the specific steps of obtaining the trained graph neural network are as follows:

Wherein,

is the unit vector of i in the matrix,

S502. Embedding vectors according to nodes

$Z = \mathrm{GNCN}(X, A, s)$;

adjacency matrix representing reconstruction

The value of the corresponding element is between 0 and 1;
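One common way to obtain a reconstructed adjacency matrix whose elements lie between 0 and 1 is an inner-product decoder followed by a logistic sigmoid. The sketch below assumes that decoder, applied between drug embeddings and microorganism embeddings; the claim text does not state which decoder is used, and the function names and example embeddings are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function with values in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def reconstruct_adjacency(z_drug, z_microbe):
    """Score every drug-microorganism pair from its embedding inner product."""
    return [[sigmoid(sum(a * b for a, b in zip(zd, zm))) for zm in z_microbe]
            for zd in z_drug]


# Hypothetical 2-dimensional embeddings for two drugs and three microorganisms.
a_hat = reconstruct_adjacency([[1.0, 0.0], [0.0, -2.0]],
                              [[1.0, 1.0], [0.0, 0.0], [3.0, 1.0]])
```

Each entry can then be read as an association probability for the corresponding pair.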

9. The method of predicting a microbe-drug association effect of claim 8, wherein: in the step S5, after the graph neural network model is trained, the accuracy of the graph neural network model is verified, and the specific verification steps are as follows:

10. The method of predicting a microbe-drug association effect of claim 9, wherein: in the step D4, the AUC value of the trained graph neural network model is obtained according to the classification result, and the specific steps are as follows:

E2. Taking the association probability as a classification threshold: when another association probability is larger than the classification threshold, the corresponding sample is regarded as a positive sample, and when it is smaller than the classification threshold, the corresponding sample is regarded as a negative sample;

E3. Obtaining the ground-truth label of each edge in the training set according to the association relationships between the microorganisms and the drugs in the training set, wherein the label is 0 or 1: 0 indicates that the edge does not exist, that is, no association exists and the sample is an actual negative sample; 1 indicates that the edge exists, that is, an association exists and the sample is an actual positive sample;
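The thresholding in E2 and the 0/1 labels in E3 can be combined into an AUC value as sketched below. The sketch sweeps each distinct association probability as a threshold, records the resulting true-positive and false-positive rates, and integrates the ROC curve with the trapezoid rule. The function name is hypothetical, and counting a sample whose probability equals the threshold as positive is an assumption made here.

```python
def roc_auc(labels, probs):
    """Area under the ROC curve from 0/1 labels and association probabilities."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    points = [(0.0, 0.0)]  # (false-positive rate, true-positive rate)
    for thr in sorted(set(probs), reverse=True):
        # Samples at or above the threshold are predicted positive.
        tp = sum(1 for y, p in zip(labels, probs) if p >= thr and y == 1)
        fp = sum(1 for y, p in zip(labels, probs) if p >= thr and y == 0)
        points.append((fp / neg, tp / pos))

    # Trapezoid rule over consecutive ROC points.
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0
    return auc


auc_mixed = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
auc_perfect = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
```

A model that ranks every positive sample above every negative sample reaches an AUC of 1, while random ranking gives about 0.5.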