CN115618745A - Biological network interaction construction method - Google Patents

Info

Publication number
CN115618745A
Authority
CN
China
Prior art keywords
source node
node
nodes
graph
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211462889.6A
Other languages
Chinese (zh)
Other versions
CN115618745B (en)
Inventor
赵玉凤
庞华鑫
张小平
周佩
刘佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute Of Information On Traditional Chinese Medicine Cacms
Original Assignee
Institute Of Information On Traditional Chinese Medicine Cacms
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute Of Information On Traditional Chinese Medicine Cacms
Priority to CN202211462889.6A
Publication of CN115618745A
Application granted
Publication of CN115618745B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02 Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the technical field of deep learning for medical big data, and particularly relates to a biological network interaction construction method, which comprises: establishing an undirected connection network graph G; performing preliminary processing on the undirected connection network graph G by aggregating the information of neighbor nodes at different levels to a source node to generate a wide-area source node characterization vector; constructing a specificity subgraph for each source node from the original network graph G and learning a specificity characterization vector of the source node; and predicting the interaction relationship existing between any two nodes based on the wide-area characterization vector and the specificity characterization vector. From partial information, the method can accurately predict existing interaction relationships between nodes and can also predict unknown interaction relationships, thereby providing direction for subsequent biomedical research. In addition, the method can clearly show which neighbor node information the deep model relies on to infer these interaction relationships, and it offers high reliability and robustness.

Description

Biological network interaction construction method
Technical Field
The invention belongs to the technical field of deep learning for medical big data, and particularly relates to a biological network interaction construction method.
Background
Biological systems are complex networks of molecular entities (e.g., genes, proteins, and other biomolecules) linked together by interactions. The interactions between different molecular entities can be represented as an interaction network, with the molecular entities as nodes and their interactions as edges. Network representation of biological systems provides a conceptual and intuitive framework for studying and understanding the direct and indirect interactions between molecular entities. Discovering interactions between new nodes with deep learning methods, based on the network representation of the nodes, helps advance the understanding of biological systems and further clarifies the nature and rules of life activities. For example, discovering a new protein-protein interaction can help determine whether two proteins act synergistically, and discovering a high-probability link between a gene and a disease can indicate whether abnormal expression of the gene may induce the disease. It is therefore significant to design an interaction relationship mining model based on biological networks.
In traditional Chinese medicine (TCM), medicines for treating diseases are usually given in the form of a prescription; a prescription contains multiple medicinal materials, and each medicinal material contains multiple active ingredients. According to the TCM concept of diagnosis and treatment, these compounds form a whole and act on a disease together, regulating the body from multiple aspects and treating the disease through multiple pathways. This treatment pattern can naturally be described as a network structure in which multiple components are associated with multiple classes of disease, with many-to-many attributes. Exploring such problems and uncovering the interactions within them helps elucidate the molecular-level mechanisms by which TCM treats diseases. In addition, TCM has the concept of 'treating the same disease with different methods and treating different diseases with the same method'; clarifying the scientific connotation and principles behind this concept likewise requires the support and application of biological network mining technology.
Because the interactions become complex and diverse once factors such as diseases, symptoms, genes and medicines are fused, it remains complex and difficult to process the undirected connection network graph obtained after this fusion, construct a biological network with differentiated characterization forms, and then predict existing or unknown interaction relationships. The invention therefore provides a biological network interaction construction method.
Disclosure of Invention
In order to solve the above technical problems in the prior art, the present invention provides a biological network interaction construction method.
In order to achieve the purpose, the technical scheme of the invention is as follows:
A biological network interaction construction method comprises the following steps:
S1, establishing an undirected connection network graph G, wherein the nodes V in the graph represent syndrome information, the edges E between nodes represent biological interaction relationships, and the node characterization vectors X represent any combination of the biological description, chemical structure and information coding of the nodes;
S2, performing preliminary processing on the undirected connection network graph G with a multi-level graph aggregation module: aggregating the information of neighbor nodes at different levels to the source node to generate a wide-area source node characterization vector $Z_g$;
S3, constructing a specificity subgraph for each source node from the original network graph G with a subgraph selection module, using the wide-area source node characterization vector;
S4, learning a specificity characterization vector $Z_s$ of the source node based on the specificity subgraph;
S5, predicting the interaction relationship existing between any two nodes based on the wide-area source node characterization vector $Z_g$ and the specificity characterization vector $Z_s$ of the source node.
Further, the syndrome information comprises diseases, symptoms, genes, medicines and biological targets; biological interactions include any biological relationship of disease-disease, disease-symptom, disease-gene, disease-drug, gene-target, etc.
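As a minimal illustrative sketch of step S1 (not part of the claimed method), the undirected graph can be held as a symmetric 0-1 adjacency matrix built from typed entities and their known interactions. The entity names and the helper `build_adjacency` below are hypothetical and chosen only for illustration.

```python
# Hypothetical toy entities and interactions; the names are illustrative only.
entities = ["disease:D1", "symptom:S1", "gene:G1", "drug:R1", "target:T1"]
interactions = [
    ("disease:D1", "symptom:S1"),
    ("disease:D1", "gene:G1"),
    ("disease:D1", "drug:R1"),
    ("gene:G1", "target:T1"),
]


def build_adjacency(nodes, edges):
    """Return (index, A), where A is a symmetric 0-1 adjacency matrix."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    A = [[0] * n for _ in range(n)]
    for a, b in edges:
        i, j = index[a], index[b]
        A[i][j] = 1
        A[j][i] = 1  # undirected graph: mirror every edge
    return index, A


index, A = build_adjacency(entities, interactions)
```

Each interaction appears twice in `A`, once per direction, which is what makes the graph undirected.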
Furthermore, the multi-level graph aggregation module aggregates the information of neighbor nodes at different levels to the source node according to the adjacency matrix A and the dimension conversion matrices of different orders, generating a wide-area node characterization vector.
Furthermore, the multi-level graph aggregation module first transforms the initial information features in the network graph G, applying a fully connected layer to map the initial information into the same low-dimensional shared subspace, as follows:
$$H = XW + b$$
where W denotes the initial feature mapping parameter matrix, b denotes the bias coefficient, and H denotes the embedded representation obtained after mapping.
Then, a high-order graph convolution encoder is applied to aggregate node information of different orders in the biological network graph G, as follows:
$$H_k^{(l+1)} = \hat{A}^{k} H^{(l)} W_k^{(l)}, \quad k = 0, 1, \dots, K$$
where $\hat{A}^{k}$ denotes the k-th order transition probability matrix of the nodes, and $W_k^{(l)}$ is the learnable parameter matrix of order k in layer l.
The embedded representations of the different orders are concatenated to obtain the characterization vector $Z_g$ of the nodes.
Further, the k-th order transition probability matrix $\hat{A}^{k}$ is generated as follows: the adjacency matrix A constructed from the edges is first Laplacian-normalized, and the result is then raised to the k-th power.
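The multi-order aggregation described above can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the patented implementation: the symmetric normalization with self-loops, the absence of an activation function, and the helper names (`matmul`, `normalize_with_self_loops`, `multi_order_aggregate`) are all illustrative choices.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def normalize_with_self_loops(A):
    """Assumed normalization: D^{-1/2} (A + I) D^{-1/2}."""
    n = len(A)
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in A_tilde]
    return [[A_tilde[i][j] / (degree[i] * degree[j]) ** 0.5 for j in range(n)]
            for i in range(n)]


def multi_order_aggregate(A, H, weights):
    """Concatenate A_hat^k @ H @ W_k for k = 0 .. len(weights) - 1."""
    n = len(A)
    A_hat = normalize_with_self_loops(A)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # A_hat^0 = I
    blocks = []
    for W_k in weights:
        blocks.append(matmul(matmul(power, H), W_k))
        power = matmul(power, A_hat)  # advance to the next order
    # Row-wise concatenation of the per-order blocks.
    return [[value for block in blocks for value in block[i]] for i in range(n)]


# Toy path graph 0 - 1 - 2 with 2-dimensional features; all values are illustrative.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
Z = multi_order_aggregate(A, H, [identity, identity, identity])  # orders 0, 1, 2
```

With identity weight matrices, the first two columns of `Z` reproduce `H` (order 0), and each further pair of columns mixes in neighbors one hop farther away.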
Furthermore, based on the embedded representations produced by the multi-level graph aggregation module, the information weight that each edge at different levels around a source node contributes to learning the representation of that source node is calculated. Let the source node be u. The P-layer neighbor node set $N_P(u)$ of the source node u and the set of edges $E_P(u)$ existing between these nodes are retrieved from the biological network graph G. For each edge in the set, the characterization vectors $z_i$ and $z_j$ learned by its two end nodes through the multi-level graph aggregation module are concatenated with the characterization vector $z_u$ of the source node u to calculate the importance of each edge in $E_P(u)$ to the source node, specifically:
$$w_{ij}^{u} = \mathrm{MLP}_{\theta}\big([z_i \,\Vert\, z_j \,\Vert\, z_u]\big)$$
where $w_{ij}^{u}$ denotes the weight value of edge (i, j) for the source node u, and $\mathrm{MLP}_{\theta}$ denotes a multilayer perceptron module with parameters $\theta$.
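A small sketch of this edge-scoring step, under stated assumptions: the P-layer neighborhood is collected by breadth-first search, and the multilayer perceptron is replaced by an arbitrary callable (`sum` is used below purely as a stand-in). The function names and toy data are hypothetical.

```python
from collections import deque


def p_hop_subgraph(adj, u, P):
    """Return the nodes within P hops of u and the undirected edges among them."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        node = queue.popleft()
        if dist[node] == P:
            continue  # do not expand beyond P hops
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    nodes = set(dist)
    edges = {(min(i, j), max(i, j)) for i in nodes for j in adj[i] if j in nodes}
    return nodes, edges


def score_edges(edges, Z, u, mlp):
    """Score each edge (i, j) from the concatenation [z_i, z_j, z_u]."""
    return {(i, j): mlp(Z[i] + Z[j] + Z[u]) for (i, j) in edges}


# Toy chain 0 - 1 - 2 - 3 with 1-dimensional characterization vectors.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
Z = {0: [1.0], 1: [2.0], 2: [3.0], 3: [4.0]}
nodes, edges = p_hop_subgraph(adj, u=0, P=2)
weights = score_edges(edges, Z, u=0, mlp=sum)  # `sum` stands in for a trained MLP
```

In a trained model the callable would be a learned network; only the shape of its input, the concatenation of the three vectors, is taken from the text.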
Further, the weight value $w_{ij}^{u}$ of each edge in the P-layer neighbor set of the source node u is discretized, as follows:
$$m_{ij}^{u} = \sigma\!\left(\frac{\log \epsilon - \log(1 - \epsilon) + w_{ij}^{u}}{\tau}\right)$$
where $\sigma$ denotes the sigmoid function, which maps the calculated value into the interval $[0, 1]$; $\epsilon$ is a random value drawn from a uniform distribution on (0, 1); and $\tau$ is the temperature coefficient.
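One common construction that uses exactly the ingredients named here (a sigmoid, uniform random noise and a temperature) is the binary concrete, or Gumbel-sigmoid, relaxation. The sketch below assumes that form; the function name `relaxed_gate`, the default temperature and the test weights are illustrative assumptions rather than values from the patent.

```python
import math
import random


def stable_sigmoid(x):
    """Sigmoid that avoids overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relaxed_gate(weight, tau=0.5, rng=random):
    """Sample a value in [0, 1] that approaches a hard 0/1 gate as tau shrinks."""
    eps = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # uniform noise, kept off 0 and 1
    logistic_noise = math.log(eps) - math.log(1.0 - eps)
    return stable_sigmoid((logistic_noise + weight) / tau)


rng = random.Random(0)
kept = [relaxed_gate(6.0, rng=rng) for _ in range(1000)]      # strongly positive weight
dropped = [relaxed_gate(-6.0, rng=rng) for _ in range(1000)]  # strongly negative weight
```

Edges with large positive weights produce gates near 1 and edges with large negative weights produce gates near 0, while the sampling step stays differentiable with respect to the weight.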
Further, a simple bilinear layer is defined to learn the characterization of potential edges:
$$e_{ij} = h_i^{\top} W_e h_j + b$$
where $h_i$ and $h_j$ denote the characterization vectors of node i and node j, $W_e$ denotes a learnable fusion matrix, b is a bias coefficient, and $e_{ij}$ denotes the edge characterization obtained from node i and node j through the bilinear layer.
The obtained edge characterization is input into a 2-layer fully connected network to predict the probability $p_{ij}$ that an edge exists between the two nodes, as follows:
$$p_{ij} = \mathrm{sigmoid}\big(\mathrm{FC}_2(\mathrm{ELU}(\mathrm{FC}_1(e_{ij})))\big)$$
where FC denotes a fully connected layer, and sigmoid and ELU denote nonlinear activation functions.
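The prediction head can be sketched as a bilinear layer followed by two fully connected layers, with ELU between them and a sigmoid at the output. The layer ordering, the parameter names (`W_e`, `b_e`, `W1`, `b1`, `W2`, `b2`) and the toy values are assumptions made for illustration.

```python
import math


def bilinear(h_i, h_j, W, b):
    """e[k] = h_i^T W[k] h_j + b[k] for each output channel k."""
    return [
        sum(h_i[p] * W_k[p][q] * h_j[q]
            for p in range(len(h_i)) for q in range(len(h_j))) + b_k
        for W_k, b_k in zip(W, b)
    ]


def linear(x, W, b):
    """Fully connected layer: one output per row of W."""
    return [sum(w * v for w, v in zip(row, x)) + b_r for row, b_r in zip(W, b)]


def elu(x):
    return [v if v > 0 else math.exp(v) - 1.0 for v in x]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def link_probability(h_i, h_j, params):
    """sigmoid(FC2(ELU(FC1(bilinear(h_i, h_j)))))."""
    e = bilinear(h_i, h_j, params["W_e"], params["b_e"])
    hidden = elu(linear(e, params["W1"], params["b1"]))
    (logit,) = linear(hidden, params["W2"], params["b2"])
    return sigmoid(logit)


# Hand-picked toy parameters (2 bilinear channels, 2 hidden units, 1 output).
params = {
    "W_e": [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]],
    "b_e": [0.0, 0.0],
    "W1": [[1.0, 0.0], [0.0, 1.0]],
    "b1": [0.0, 0.0],
    "W2": [[2.0, -1.0]],
    "b2": [0.0],
}
p_same = link_probability([1.0, 0.0], [1.0, 0.0], params)
p_diff = link_probability([1.0, 0.0], [0.0, 1.0], params)
```

With these hand-picked parameters the first bilinear channel measures agreement between the two vectors, so the identical pair receives the higher link probability.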
Further, the loss on the edge probability $p_{ij}$ is computed with a binary cross-entropy loss function, specifically:
$$\mathcal{L} = -\sum_{i,j}\Big[A_{ij}\log p_{ij} + (1 - A_{ij})\log(1 - p_{ij})\Big]$$
where A is the 0-1 adjacency matrix.
After the prediction error loss value is obtained, the parameters of the model are updated by applying the back-propagation algorithm with the model learning rate, reducing the prediction error.
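For reference, a binary cross-entropy loss over a batch of predicted edge probabilities can be computed as below. Averaging over the batch (rather than summing) and clipping the probabilities are assumptions of this sketch.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


labels = [1, 0, 1, 0]
good_loss = binary_cross_entropy([0.9, 0.1, 0.8, 0.2], labels)
bad_loss = binary_cross_entropy([0.1, 0.9, 0.2, 0.8], labels)
chance_loss = binary_cross_entropy([0.5, 0.5], [1, 0])
```

Predictions that agree with the labels give a lower loss than predictions that contradict them, and a constant prediction of 0.5 gives a loss of log 2.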
Compared with the prior art, the invention has the following beneficial effects:
The invention first establishes an undirected connection network, then aggregates neighbor node information at different levels to the source node to generate a wide-area source node characterization vector $Z_g$, and then constructs a specificity subgraph of the source node to obtain a specificity characterization vector $Z_s$. Based on the wide-area source node characterization vector $Z_g$ and the specificity characterization vector $Z_s$ of the source node, the interaction relationship between any two nodes is predicted, so that both known and unknown relationships among elements such as diseases, medicines, genes and symptoms can be predicted, providing a reliable data basis and research direction for subsequent biomedical research. In addition, because a subgraph is constructed for each node, it can be clearly shown which neighbor node information the deep model relies on to infer these interaction relationships, and the method has high reliability and robustness.
Drawings
FIG. 1 is a general block diagram of an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a multi-level graph aggregation module according to an embodiment of the invention.
Fig. 3 is a schematic processing diagram of a sub-graph selection module according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of selecting different levels of subgraphs for sample nodes on a DTI data set according to an embodiment of the invention.
Detailed Description
The technical solutions of the present invention are described in detail below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention; all other embodiments obtained by those skilled in the art without inventive effort fall within the protection scope of the present invention.
An embodiment of the invention is shown in FIG. 1. First, the collected medical record data and public biological information data are preprocessed: important information such as the main disease types and the medicines used is extracted from each medical record, and multiple entity graphs are constructed, thereby generating an undirected disease-medicine network graph. Then, all disease and drug types appearing in the medical record data are counted and labeled one by one to ensure that each disease and drug is uniquely identified. Similarly, based on the selected drugs and diseases, matching descriptive features or chemical component features are retrieved from public biological databases such as TCMSP, DrugBank and OMIM as the initialization features of these entities.
After data preprocessing is completed, the original disease-drug network graph G and the entity initialization feature matrix X of the network graph are obtained. Then, to verify the effectiveness and robustness of the model and to avoid overfitting, which would reduce generalization performance, the data are divided: 70% of the edges sampled from the network graph G serve as the training set G-train, 10% as the validation set G-val, and 20% as the final test set G-test of the model. The edges present in the graph are all regarded as positive samples, and an equal number of negative samples must be selected from the node pairs without edges to guide model training and testing. In addition, the initial features of the entities participate in optimization and are updated during model training, yielding low-dimensional embedded characterization vectors.
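The 70/10/20 edge split and the equal-sized negative sampling can be sketched as follows. The function names, the fixed random seeds and the toy ring graph are hypothetical; a real pipeline would typically also sample negatives separately for each split.

```python
import random


def split_edges(edges, seed=0, train=0.7, val=0.1):
    """Shuffle the positive edges and split them into train/validation/test."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def sample_negatives(num_nodes, positive_edges, count, seed=0):
    """Draw `count` distinct node pairs that are not connected by a known edge."""
    rng = random.Random(seed)
    existing = {(min(i, j), max(i, j)) for i, j in positive_edges}
    available = num_nodes * (num_nodes - 1) // 2 - len(existing)
    if count > available:
        raise ValueError("not enough unconnected node pairs to sample from")
    negatives = set()
    while len(negatives) < count:
        i, j = rng.sample(range(num_nodes), 2)
        pair = (min(i, j), max(i, j))
        if pair not in existing:
            negatives.add(pair)
    return sorted(negatives)


# Toy ring graph with 20 nodes and 20 edges.
edges = [(i, (i + 1) % 20) for i in range(20)]
train_pos, val_pos, test_pos = split_edges(edges)
negatives = sample_negatives(20, edges, count=len(edges))
```

Fixing the seed keeps the split reproducible across repeated runs.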
After the training set, validation set and test set are divided, the procedure for learning the characterization vectors of node entities and predicting the possibility of interaction between paired nodes begins. As shown in FIG. 1, the initial feature matrix X of the entity nodes and the graph topology information G are first input into a multi-level graph aggregation module (MOGA) to generate the wide-area characterization vectors $Z_g$. In the multi-level graph aggregation module, with reference to FIG. 2, the specific operation flow is as follows:
(1) Apply Laplacian normalization to the adjacency matrix A, converting it into a normalized adjacency matrix with self-loops;
(2) First compute the 0-order feature aggregation representation by multiplying the 0-th power of the normalized adjacency matrix, the initial feature X and the parameterized weight matrix;
(3) Compute the 1st-order, 2nd-order and 3rd-order aggregation representations in turn in the manner of (2), up to the K-th order aggregation representation;
(4) Concatenate the aggregation representations of the different orders to obtain the wide-area characterization vector.
After the wide-area characterization vector of each node is obtained, the node characterization has aggregated preliminary K-level neighbor node information, meaning that it has a receptive field covering its K-level neighbors. This helps the subsequent subgraph selection module select the neighbor set most highly correlated with the node. Next, the wide-area characterization vectors and the original graph A are input into a subgraph selection module (SGSM) to generate a specificity subgraph sub-G for each node, as shown in FIG. 3:
(1) Taking a source node u as the center, sample from the original graph A, with the sampling range set to the subgraph, comprising edges and nodes, within P transfer steps of the source node u;
(2) Construct edge representations: combine the node characterization vectors at the two ends of an edge as the edge representation, then concatenate the edge representation with the source node characterization to form a joint embedding, such as i-j-u;
(3) Input the joint embedding into a predefined fully connected layer to generate the weight value of edge i-j;
(4) Discretize the weights, eliminating irrelevant edges and reducing the subgraph scale;
(5) Retain the high-probability edges to form the specificity subgraph, and generate the subgraph adjacency matrix Sub-A.
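Steps (4) and (5) can be illustrated with a small sketch that keeps only the edges whose discretized gate value is high and writes them into a subgraph adjacency matrix. The threshold of 0.5, the function name and the gate values are assumptions made for illustration.

```python
def build_subgraph_adjacency(num_nodes, edge_gates, threshold=0.5):
    """Keep edges whose gate value reaches the threshold and return Sub-A."""
    sub_A = [[0] * num_nodes for _ in range(num_nodes)]
    for (i, j), gate in edge_gates.items():
        if gate >= threshold:
            sub_A[i][j] = 1
            sub_A[j][i] = 1  # the subgraph stays undirected
    return sub_A


# Hypothetical gate values for the edges around one source node.
edge_gates = {(0, 1): 0.97, (1, 2): 0.08, (0, 3): 0.64}
sub_A = build_subgraph_adjacency(4, edge_gates)
```

Here the edge (1, 2) is pruned as irrelevant, while (0, 1) and (0, 3) remain in the specificity subgraph.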
The subgraph and the characterization of the source node u are input into a new predefined multi-level graph aggregation module (MOGA), from which the source node u obtains its specificity subgraph characterization. The subgraph selection process is repeated to generate a specificity subgraph structure and a subgraph vector characterization for each node in the graph, and the node-specific subgraph characterizations are integrated into a characterization matrix, so that it is clear which nodes in the graph contribute information to the characterization of each node.
Finally, the wide-area characterization vectors and the specificity characterizations are concatenated to obtain the comprehensive characterization of each node, which contains rich global node information and reinforces the neighbor information most highly correlated with the node. To predict whether an interaction link exists between two nodes u and v, their characterizations are input into an interaction prediction module based on a bilinear layer. In this module, the bilinear layer first generates a characterization vector of the potential edge, and a multilayer perceptron classifier then predicts the probability that the edge exists, i.e., the link probability. The multilayer perceptron comprises two linear mapping layers and an activation function. In the model parameter training and optimization stage, a binary cross-entropy loss function measures the difference between the model prediction and the ground truth, and the parameters of the model are then updated by combining gradient descent with back-propagation. In the validation and testing stage, the model parameters are fixed and the model predicts the possibility that an edge exists based on the initial feature vectors of the input entity nodes.
We used four different datasets to verify the performance of the model: a drug-target protein interaction network (DTI dataset) from the BioSNAP database, a drug-drug interaction network (DDI dataset) from the BioSNAP database, a protein-protein interaction network (PPI dataset) from the human proteome database (The Human Protein), and a gene-disease interaction network (GDI dataset) from the DisGeNET database. The four datasets are described below:
(1) The DTI network contains 5018 drugs and 2325 target proteins, with 15139 drug-target interactions between these entities. A schematic diagram of the different-level subgraphs selected for sample nodes on the DTI dataset is shown in FIG. 4.
(2) The DDI network contains 1514 drugs; 48514 drug-drug interaction relationships among these drugs were extracted from drug labels and the biomedical literature.
(3) The PPI network contains 5604 proteins and 23322 interactions, generated by multiple orthogonal high-throughput yeast two-hybrid screens.
(4) The GDI network consists of 81746 interactions between 9413 genes and 10370 diseases, drawn from GWAS studies, animal models and the scientific literature.
To measure the classification performance of all models, two metrics common in machine learning are adopted: the area under the ROC curve (AUROC) and the area under the precision-recall curve (AUPRC). To avoid the negative influence of class imbalance while computing the feature interaction weight spectrum for each class, the same number of negative samples as positive samples is sampled for training and testing. The biological network interaction construction method is implemented on the PyTorch platform, version 1.14.0. The main constant hyper-parameters are set by a gradient search method. Specifically, the number of hidden layer units of the global characterization is 16, and the number of neurons of the interaction characterization equals the number of features in the sample. In addition, the classifier, a multilayer perceptron (MLP), has two hidden layers with 64 and 32 units, and the activation function is a linear rectification function (Sigmoid). The optimizer is Adam, with a dynamic learning rate, to minimize the loss. The number of model training iterations is 50, and to prevent overfitting, training stops automatically when the validation loss has not decreased for 10 epochs. Finally, because mini-batching, model parameter initialization and similar behaviors are random, all experiments are repeated 5 times and the reported results are the average of the 5 runs. The experimental results are shown in Table 1 and Table 2.
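The two metrics can be illustrated with a dependency-free sketch. AUROC is computed here as the fraction of positive-negative pairs ranked correctly, and AUPRC is approximated by average precision; both function names and the toy scores are illustrative, and a library implementation would normally be used in practice.

```python
def auroc(scores, labels):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive sample."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy predictions: two positive and two negative node pairs.
scores = [0.9, 0.8, 0.7, 0.4]
labels = [1, 0, 1, 0]
roc_value = auroc(scores, labels)
ap_value = average_precision(scores, labels)
```

In this toy case three of the four positive-negative pairs are ordered correctly, and the two positives sit at ranks 1 and 3.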
Table 1. AUROC results on the four datasets
[Table image not available in the source text]
Table 2. AUPRC results on the four datasets
[Table image not available in the source text]
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
Although the present invention has been described in detail with reference to examples, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the scope of the present invention.

Claims (8)

1. A biological network interaction construction method, characterized by comprising the following steps:
S1, establishing an undirected connection network graph G, wherein the nodes V in the graph represent syndrome information, the edges E between nodes represent biological interaction relationships, and the node characterization vectors X represent any combination of the biological description, chemical structure and information coding of the nodes;
S2, performing preliminary processing on the undirected connection network graph G with a multi-level graph aggregation module: aggregating the information of neighbor nodes at different levels to the source node to generate a wide-area source node characterization vector $Z_g$;
S3, constructing a specificity subgraph for each source node from the original network graph G with a subgraph selection module, using the wide-area source node characterization vector;
S4, learning a specificity characterization vector $Z_s$ of the source node based on the specificity subgraph;
S5, predicting the interaction relationship existing between any two nodes based on the wide-area source node characterization vector $Z_g$ and the specificity characterization vector $Z_s$ of the source node.
2. The biological network interaction construction method as claimed in claim 1, wherein the multi-level graph aggregation module aggregates the information of neighbor nodes at different levels to the source node according to the adjacency matrix A and the dimension conversion matrices of different orders to generate a wide-area node characterization vector.
3. The biological network interaction construction method as claimed in claim 2, wherein the multi-level graph aggregation module first transforms the initial information features in the network graph G, applying a fully connected layer to map the initial information into the same low-dimensional shared subspace, as follows:
$$H = XW + b$$
where W denotes the initial feature mapping parameter matrix, b denotes the bias coefficient, and H denotes the embedded representation obtained after mapping;
then, a high-order graph convolution encoder is applied to aggregate node information of different orders in the biological network graph G, as follows:
$$H_k^{(l+1)} = \hat{A}^{k} H^{(l)} W_k^{(l)}, \quad k = 0, 1, \dots, K$$
where $\hat{A}^{k}$ denotes the k-th order transition probability matrix of the nodes, and $W_k^{(l)}$ is the learnable parameter matrix of order k in layer l;
the embedded representations of the different orders are concatenated to obtain the characterization vector $Z_g$ of the nodes.
4. The biological network interaction construction method according to claim 3, wherein the k-th order transition probability matrix $\hat{A}^{k}$ is generated as follows: the adjacency matrix A constructed from the edges is first Laplacian-normalized, and the result is then raised to the k-th power.
5. The biological network interaction construction method according to claim 4, wherein, based on the embedded representations produced by the multi-level graph aggregation module, the information weight that each edge at different levels around a source node contributes to learning the representation of that source node is calculated; the source node is denoted u, and the P-layer neighbor node set $N_P(u)$ of the source node u and the set of edges $E_P(u)$ existing between these nodes are retrieved from the biological network graph G; for each edge in the set, the characterization vectors $z_i$ and $z_j$ learned by its two end nodes through the multi-level graph aggregation module are concatenated with the characterization vector $z_u$ of the source node u to calculate the importance of each edge in $E_P(u)$ to the source node, specifically:
$$w_{ij}^{u} = \mathrm{MLP}_{\theta}\big([z_i \,\Vert\, z_j \,\Vert\, z_u]\big)$$
where $w_{ij}^{u}$ denotes the weight value of edge (i, j) for the source node u, and $\mathrm{MLP}_{\theta}$ denotes a multilayer perceptron module with parameters $\theta$.
6. The biological network interaction construction method according to claim 5, wherein, using the weight value of each edge in the P-layer neighbor set of the source node u
Figure 373114DEST_PATH_IMAGE016
the weight values are discretized, specifically as follows:
Figure 434611DEST_PATH_IMAGE017
wherein
Figure 453382DEST_PATH_IMAGE018
denotes the sigmoid function, which maps the calculated value into the
Figure 865909DEST_PATH_IMAGE019
interval,
Figure 612017DEST_PATH_IMAGE020
is a random value obeying the uniform distribution on (0,1), and
Figure 528020DEST_PATH_IMAGE021
is the temperature coefficient.
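The ingredients named in claim 6 (a sigmoid, uniform noise on (0,1), and a temperature coefficient) match the widely used binary concrete relaxation. The sketch below assumes that form, sigmoid((log eps - log(1 - eps) + w) / tau); the patent's own equation image is not recoverable here, so treat the formula and the name `relaxed_edge_mask` as assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relaxed_edge_mask(w, tau, rng):
    """Approximately discretize edge weights w at temperature tau."""
    w = np.asarray(w, dtype=float)
    # Uniform noise, kept away from 0 and 1 so the logarithms stay finite.
    eps = rng.uniform(1e-6, 1.0 - 1e-6, size=w.shape)
    logistic_noise = np.log(eps) - np.log1p(-eps)
    # A lower tau pushes the outputs closer to 0 or 1.
    return sigmoid((logistic_noise + w) / tau)

rng = np.random.default_rng(0)
w = np.array([-4.0, 0.0, 4.0])
mask = relaxed_edge_mask(w, tau=0.5, rng=rng)
print(mask)
```

The relaxation keeps the mask differentiable with respect to the weights, which is the usual reason for preferring it over hard thresholding during training.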
7. The biological network interaction construction method according to claim 6, wherein a simple bilinear layer is defined to learn the representation of potential edges:
Figure 452114DEST_PATH_IMAGE022
wherein
Figure 86358DEST_PATH_IMAGE023
denotes a learnable fusion matrix, b is a bias coefficient, and
Figure 386889DEST_PATH_IMAGE024
denotes the representation of the edge obtained from node i and node j through the bilinear layer;
the obtained edge representation is input into a 2-layer fully connected network to predict the probability that an edge exists between the two nodes
Figure 891820DEST_PATH_IMAGE025
specifically:
Figure 501661DEST_PATH_IMAGE026
where FC denotes a fully connected layer, and sigmoid and ELU denote nonlinear activation functions.
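The bilinear edge representation and the 2-layer prediction head of claim 7 can be sketched as below. The tensor-shaped fusion parameter, the ordering sigmoid(FC(ELU(FC(e)))), and all dimensions are assumptions chosen to be consistent with the claim text; the names `edge_probability`, `fc1`, and `fc2` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elu(x, alpha=1.0):
    # exp is applied only to the non-positive part to avoid overflow.
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def edge_probability(z_i, z_j, W, b, fc1, fc2):
    """Predict the probability that an edge exists between nodes i and j."""
    # Bilinear layer: e_k = z_i^T W_k z_j + b_k for each output channel k.
    e_ij = np.einsum("d,kde,e->k", z_i, W, z_j) + b
    W1, b1 = fc1
    W2, b2 = fc2
    hidden = elu(e_ij @ W1 + b1)                  # first fully connected layer
    return float(sigmoid(hidden @ W2 + b2)[0])    # second layer, then sigmoid

rng = np.random.default_rng(0)
d, k, h = 4, 3, 5
z_i, z_j = rng.normal(size=d), rng.normal(size=d)
W, b = rng.normal(size=(k, d, d)), np.zeros(k)
fc1 = (rng.normal(size=(k, h)), np.zeros(h))
fc2 = (rng.normal(size=(h, 1)), np.zeros(1))
p = edge_probability(z_i, z_j, W, b, fc1, fc2)
print(p)
```

The final sigmoid keeps the output in the unit interval, so it can be read as an edge probability and compared against a 0-1 adjacency label.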
8. The biological network interaction construction method according to claim 7, wherein, for the edge probability parameter
Figure 623201DEST_PATH_IMAGE025
the loss is obtained by applying a binary cross-entropy loss function, specifically:
Figure 461844DEST_PATH_IMAGE027
wherein
Figure 352440DEST_PATH_IMAGE028
is the 0-1 adjacency matrix;
after the prediction error loss value is obtained, the parameters of the model are updated by applying the back-propagation algorithm together with the model's learning rate parameter, thereby reducing the prediction error.
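The training step of claim 8 can be illustrated with a binary cross-entropy loss against a 0-1 adjacency matrix, followed by one gradient update. As a simplifying assumption, the logits themselves stand in for the model parameters, so the gradient is written in closed form rather than obtained by back-propagating through the full network; the averaging over all entries and the learning rate value are also assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(p, a, eps=1e-12):
    """Mean binary cross-entropy between probabilities p and 0-1 labels a."""
    p = np.clip(p, eps, 1.0 - eps)  # guard the logarithms
    return float(-np.mean(a * np.log(p) + (1.0 - a) * np.log(1.0 - p)))

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])           # 0-1 adjacency matrix (labels)
logits = np.zeros_like(A)            # stand-in for the model parameters
loss_before = bce_loss(sigmoid(logits), A)

# Gradient of the mean BCE with respect to the logits: (p - a) / N.
grad = (sigmoid(logits) - A) / A.size
lr = 1.0                             # learning rate (assumed value)
logits = logits - lr * grad
loss_after = bce_loss(sigmoid(logits), A)
print(loss_before, loss_after)
```

Starting from all-zero logits, every predicted probability is 0.5, so the initial loss equals log 2; the update moves each logit toward its label and lowers the loss.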
CN202211462889.6A 2022-11-21 2022-11-21 Biological network interaction construction method Active CN115618745B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211462889.6A CN115618745B (en) 2022-11-21 2022-11-21 Biological network interaction construction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211462889.6A CN115618745B (en) 2022-11-21 2022-11-21 Biological network interaction construction method

Publications (2)

Publication Number Publication Date
CN115618745A (en) 2023-01-17
CN115618745B (en) 2023-03-21

Family

ID=84879493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211462889.6A Active CN115618745B (en) 2022-11-21 2022-11-21 Biological network interaction construction method

Country Status (1)

Country Link
CN (1) CN115618745B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20160010272A (en) * 2014-07-16 2016-01-27 한국과학기술원 Device for selecting disease regulating ubiquitin ligases and method for selecting disease regulating ubiquitin ligases using the same
CN106250368A (en) * 2016-07-27 2016-12-21 中国中医科学院中医药信息研究所 A kind of method and apparatus for checking prescription similarity
CN109411023A (en) * 2018-09-30 2019-03-01 华中农业大学 Interactive relation method for digging between a kind of gene based on Bayesian Network Inference
CN112364295A (en) * 2020-11-13 2021-02-12 中国科学院数学与系统科学研究院 Method and device for determining importance of network node, electronic equipment and medium
CN112802545A (en) * 2021-01-28 2021-05-14 哈尔滨医科大学 Cardiovascular disease patient DNA methylation data processing platform and method
CN115249538A (en) * 2021-12-20 2022-10-28 云南师范大学 Construction method of lncRNA-disease association prediction model based on heterogeneous graph generative adversarial network

Also Published As

Publication number Publication date
CN115618745B (en) 2023-03-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant