CN112820347B

CN112820347B - Disease gene prediction method based on multiple protein network pulse dynamics process

Info

Publication number: CN112820347B
Application number: CN202110141656.5A
Authority: CN
Inventors: 李敏; 项炬
Original assignee: Central South University
Current assignee: Central South University
Priority date: 2021-02-02
Filing date: 2021-02-02
Publication date: 2023-09-22
Anticipated expiration: 2041-02-02
Also published as: CN112820347A

Abstract

The invention discloses a disease gene prediction method based on a multiple protein network pulse dynamics process, which mainly comprises the following steps: 1. constructing a standardized multiple protein network; 2. constructing a pulse dynamics model of the multiple protein network; 3. extracting the pulse dynamics characteristics of the multiple protein network; 4. fusing the pulse dynamics characteristics of the multiple protein network nodes and predicting disease genes by ranking. The method fuses multiple protein networks more effectively and mines the characteristics hidden in them, which improves the ability to identify disease genes; its computational cost is small, and it is suitable for software-based analysis of large-scale bioinformatics data.

Description

Disease gene prediction method based on multiple protein network pulse dynamics process

Technical Field

The invention belongs to the field of bioinformatics analysis, and relates to a disease gene prediction method based on a multiple protein network pulse dynamics process.

Background

The identification of disease-related genes is of great importance for the study of disease. Traditional methods such as linkage analysis help to identify disease-related genes, but often cannot accurately locate the disease-causing genes. Because biological experiments are costly and time-consuming, efficient computational methods for predicting and screening disease-related genes from a large number of candidate genes have become critical.

Genes associated with the same or similar diseases are functionally related and tend to cluster in biological networks such as protein-protein interaction (PPI) networks. Network-based algorithms are therefore very popular in disease gene prediction and related fields, and network propagation, one of the most widely applied strategies, has become a leading method for genetic association research. Traditional network propagation is useful, but it tends to focus on the steady-state solution of the dynamics and may therefore lose useful information hidden in the dynamic process. It is thus necessary to directly mine the information hidden in the dynamic process that helps reveal disease-gene associations.

Ignoring the coexistence of different types of interactions or associations in a networked system, for example by aggregating them into a single network, can change the topological properties of the overall system and significantly affect its modeling and predictive capability. Different types of biological networks, such as metabolic enzyme-coupled interactions and signal transduction, often differ in meaning and reliability, so making full use of them to predict disease-related genes effectively remains a challenge. Efficient use of multi-source biomolecular networks will help to enhance disease gene prediction methods.

Based on this, it is highly desirable to design a disease gene prediction method that can effectively utilize the cross-linking effects of different types of network layers in a networked system.

Disclosure of Invention

(I) technical problem to be solved

Based on the above, the invention discloses a disease gene prediction method based on a multiple protein network pulse dynamics process. The method improves the ability of disease gene prediction to reveal hidden information related to disease genes; because the analysis is based on the pulse dynamics process, it can effectively exploit the cross-linking influence of multiple protein networks, which improves prediction accuracy, and it is suitable for large-scale software analysis of biological data.

(II) technical scheme

The invention discloses a disease gene prediction method based on a multiple protein network pulse dynamics process, which comprises the following steps:

step 1: after biological data preprocessing, the nodes corresponding to the same protein in a plurality of protein networks of different types are connected to construct a multiple protein network, realizing multi-network fusion; the edge weights of the multiple protein network are standardized using the average degree of the network nodes; protein identifiers are uniformly mapped to standard gene symbols;

step 2: applying periodic pulse signals to seed nodes of each network layer of the multiple protein network in the step 1 to excite the pulse dynamics process of the multiple protein network, calculating the pulse response curve of the multiple protein network nodes, and mining the hidden characteristics of the network nodes;

step 3: acquiring the association strength between the network node and the seed node by calculating the dynamic characteristics of the multiple protein network nodes on the pulse signals;

step 4: based on the dynamic characteristics in the step 3, obtaining a comprehensive protein score by calculating the reciprocal of the geometric mean of the ranking values of the nodes corresponding to the same protein in each network layer of the multiple protein network; disease genes are screened by ranking the proteins in descending order of comprehensive score.

Further, the step 1 specifically includes the following steps:

(1) Biological data preprocessing: acquiring known disease gene-related data, disease phenotype-related annotation data, and human phenotype ontology data; acquiring a protein physical interaction network; constructing a protein function association network; uniformly mapping protein identifiers to standard gene symbols;

(2) Multiple protein network construction: a multiple protein network model interconnects M network layers of N nodes each, so as to integrate multiple types of protein association networks. Specifically, given M network layers, each containing N nodes, the nodes corresponding to the same protein in the M different types of protein networks are connected, with an inter-layer connection weight of 1/M. To facilitate matrix operations, let $A^{(\alpha)} \in \mathbb{R}^{N \times N}$ denote the adjacency matrix of network layer $\alpha$; the multiple protein network is then represented by a supra-adjacency matrix

$$\mathcal{A} = \mathcal{A}^{\mathrm{intra}} + \mathcal{A}^{\mathrm{inter}},$$

where the intra-layer supra-adjacency matrix, corresponding to the independent network layers, is defined as

$$\mathcal{A}^{\mathrm{intra}} = \bigoplus_{\alpha=1}^{M} A^{(\alpha)},$$

and the inter-layer supra-adjacency matrix is defined as

$$\mathcal{A}^{\mathrm{inter}} = A^{L} \otimes I,$$

where $A^{L} \in \mathbb{R}^{M \times M}$ is the inter-layer link matrix, whose nodes are the network layers and whose edge weights are the link strengths between network layers, set to 1/M; $\otimes$ denotes the Kronecker product; and $I \in \mathbb{R}^{N \times N}$ is the identity matrix;

(3) Normalization of the multiple protein network: the weights of all edges of the multiple protein network are divided by the average degree of the network nodes. The average node degree is

$$\langle k \rangle = \frac{1}{NM} \sum_{\alpha,\beta=1}^{M} \sum_{i,j=1}^{N} \left[ \delta(\alpha,\beta)\, A^{(\alpha)}_{ij} + A^{L}_{\alpha\beta}\, I_{ij} \right],$$

and the normalized network is recorded in a fourth-order tensor $C$, where

$$C^{\alpha\beta}_{ij} = \frac{1}{\langle k \rangle} \left[ \delta(\alpha,\beta)\, A^{(\alpha)}_{ij} + A^{L}_{\alpha\beta}\, I_{ij} \right],$$

$I \in \mathbb{R}^{N \times N}$ is the identity matrix, and $\delta(\alpha,\beta)$ is the Kronecker delta function: $\delta(\alpha,\beta) = 1$ when $\alpha = \beta$, and 0 otherwise.
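The construction and normalization in items (2) and (3) above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the inter-layer link matrix is assumed to have zero diagonal and off-diagonal weight 1/M, and the "average degree" is assumed to be the total edge weight of the supra-adjacency matrix (inter-layer links included) divided by the number of nodes NM.

```python
def build_supra_adjacency(layers):
    """Return the NM x NM supra-adjacency matrix of a multiplex network.

    layers: list of M adjacency matrices (N x N nested lists) defined over
    the same N proteins in the same order.
    """
    m = len(layers)
    n = len(layers[0])
    size = n * m
    supra = [[0.0] * size for _ in range(size)]
    for a, layer in enumerate(layers):
        for i in range(n):
            # Intra-layer block: copy A^(a) onto diagonal block (a, a).
            for j in range(n):
                supra[a * n + i][a * n + j] = float(layer[i][j])
            # Inter-layer links: connect the replicas of protein i in the
            # other layers with weight 1/M (assumed zero self-coupling).
            for b in range(m):
                if b != a:
                    supra[a * n + i][b * n + i] = 1.0 / m
    return supra


def normalize_by_average_degree(supra):
    """Divide every edge weight by the average weighted node degree."""
    size = len(supra)
    avg_degree = sum(sum(row) for row in supra) / size
    normalized = [[w / avg_degree for w in row] for row in supra]
    return normalized, avg_degree


# Toy example: N = 3 proteins, M = 2 layers.
layer_1 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # path 0-1-2
layer_2 = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]   # single edge 0-2
supra = build_supra_adjacency([layer_1, layer_2])
normalized, avg_degree = normalize_by_average_degree(supra)
print(avg_degree)  # 1.5
```

After normalization the average weighted degree of the network is 1, which puts networks of different density on a comparable scale before the dynamics are run.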

Further, the step 2 specifically includes: when the pulse dynamics process is excited on the multiple protein network, the pulse dynamics equation on the normalized multiple protein network is defined as follows:

$$\frac{d x_i^{\alpha}(t)}{dt} = f\!\left(x_i^{\alpha}(t)\right) + \sum_{j=1}^{N} C^{\alpha\alpha}_{ij} \left( x_j^{\alpha}(t) - x_i^{\alpha}(t) \right) + \sum_{\beta=1}^{M} C^{\alpha\beta}_{ii} \left( x_i^{\beta}(t) - x_i^{\alpha}(t) \right) + b_i^{\alpha} u_t,$$

wherein $x_i^{\alpha}(t)$ is the state of node $i$ ($i = 1$ to $N$, $N$ is the total number of nodes) in network layer $\alpha$ ($\alpha = 1$ to $M$, $M$ is the total number of network layers) at time $t$; $f(\cdot)$ is a continuously differentiable function describing the self-evolution of a node when it is not influenced by other nodes, defined as $f(x) = -\theta x$, where $\theta > 0$ is a self-evolution weight parameter; $C^{\alpha\beta}_{ij}$ is the diffusion coefficient between node $i$ of network layer $\alpha$ and node $j$ of network layer $\beta$, namely the connection weight between the nodes after network normalization, and $C$ is the fourth-order tensor defined above; if node $i$ of network layer $\alpha$ is a control node to which the periodic pulse signal is applied, i.e. a known disease gene, $b_i^{\alpha} = 1$, otherwise $b_i^{\alpha} = 0$; $u_t = \sum_{\sigma} \delta(t - t_{\sigma})$ is a periodic activation function, where $t_{\sigma}$ is the time of the $\sigma$-th pulse and $\delta(t - t_{\sigma})$ is a Dirac delta function, i.e. a unit impulse applied at $t = t_{\sigma}$ and zero otherwise.

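Two new fourth-order tensors are defined from the fourth-order tensor $C$ to represent the Laplacian matrices of the intra-layer sub-network and the inter-layer sub-network of the multiple network, respectively:

$$L^{\mathrm{intra},\,\alpha\beta}_{ij} = \delta(\alpha,\beta) \left[ \delta(i,j) \sum_{k=1}^{N} C^{\alpha\alpha}_{ik} - C^{\alpha\alpha}_{ij} \right], \qquad L^{\mathrm{inter},\,\alpha\beta}_{ij} = \delta(i,j) \left[ \delta(\alpha,\beta) \sum_{\gamma=1}^{M} C^{\alpha\gamma}_{ii} - C^{\alpha\beta}_{ii} \right],$$

wherein $\delta(\alpha,\beta)$ is the Kronecker delta function: $\delta(\alpha,\beta) = 1$ when $\alpha = \beta$, and 0 otherwise. Flattening the two tensors gives the intra-layer and inter-layer supra-Laplacian matrices of the multiple network, $\mathcal{L}^{\mathrm{intra}}$ and $\mathcal{L}^{\mathrm{inter}}$, both of size $NM \times NM$.

Using these supra-Laplacian matrices, the multiple network pulse dynamics equation is expressed in matrix form as

$$\frac{dX(t)}{dt} = -\left( \theta I + \mathcal{L} \right) X(t) + B\, u_t, \qquad \mathcal{L} = \mathcal{L}^{\mathrm{intra}} + \mathcal{L}^{\mathrm{inter}},$$

wherein $X(t)$ is the state vector, $\mathcal{L}$ is the supra-Laplacian matrix of the multiple network, $B$ is a vector indicating the control nodes, and $u_t$ is the aforementioned periodic activation function. From this matrix equation, the characteristic time of the dynamics is $\tau = 1/\lambda_m$, wherein $\lambda_m$ is the minimum eigenvalue of the matrix $\theta I + \mathcal{L}$, $I$ is an identity matrix, and $\theta > 0$. The pulse period is set to at least 5 times the characteristic time $\tau$.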
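The characteristic time and pulse period described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the network is taken to be undirected (symmetric), $\lambda_m$ is read as the smallest eigenvalue of $\theta I + \mathcal{L}$, and the eigenvalue is found by shifted power iteration only to keep the example free of external libraries.

```python
import math


def supra_laplacian(c):
    """Laplacian L = D - C of a normalized supra-adjacency matrix C."""
    size = len(c)
    lap = [[-c[i][j] for j in range(size)] for i in range(size)]
    for i in range(size):
        lap[i][i] += sum(c[i])  # weighted degree on the diagonal
    return lap


def min_eigenvalue(sym, iters=5000):
    """Smallest eigenvalue of a symmetric matrix via shifted power iteration."""
    size = len(sym)
    # Gershgorin bound: no eigenvalue exceeds the largest absolute row sum.
    shift = max(sum(abs(v) for v in row) for row in sym)
    vec = [float(i + 1) for i in range(size)]
    for _ in range(iters):
        # Multiply by (shift*I - sym); its dominant eigenvector is the
        # eigenvector of `sym` belonging to the smallest eigenvalue.
        new = [shift * vec[i] - sum(sym[i][j] * vec[j] for j in range(size))
               for i in range(size)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    # Rayleigh quotient of the converged unit vector.
    av = [sum(sym[i][j] * vec[j] for j in range(size)) for i in range(size)]
    return sum(vec[i] * av[i] for i in range(size))


def characteristic_time(c, theta):
    """tau = 1 / lambda_m for the matrix theta*I + L."""
    lap = supra_laplacian(c)
    size = len(lap)
    system = [[lap[i][j] + (theta if i == j else 0.0) for j in range(size)]
              for i in range(size)]
    return 1.0 / min_eigenvalue(system)


# Toy normalized network: 2 layers x 2 proteins (supra-node = layer*2 + protein).
c = [[0.0, 0.5, 0.25, 0.0],
     [0.5, 0.0, 0.0, 0.25],
     [0.25, 0.0, 0.0, 0.5],
     [0.0, 0.25, 0.5, 0.0]]
tau = characteristic_time(c, theta=0.5)
pulse_period = 5.0 * tau  # "at least 5 times the characteristic time"
```

For an undirected network with non-negative weights the Laplacian always has a zero eigenvalue, so under these assumptions $\lambda_m = \theta$ and $\tau = 1/\theta$; the numerical routine is only needed if the network is not symmetric or the definition of $\lambda_m$ differs.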

Further, the step 3 specifically includes: to extract the pulse dynamics characteristics of the multiple protein network, the known disease-related genes serve as pulse excitation points that excite the pulse dynamics process in the multiple protein network according to the multiple protein network pulse dynamics model, and the impulse response curves of the network nodes are calculated from the multiple protein network pulse dynamics equation; the dynamic characteristic (S) of a network node in response to the pulse signal during the multiple protein network pulse dynamics process is defined as $S_i^{\alpha} = \max_{t} x_i^{\alpha}(t)$, i.e. the maximum value of the node's impulse response; the magnitude of the dynamic characteristic of each network node is calculated according to this definition and describes the association strength between the node and the control nodes.
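A minimal simulation can make the feature in step 3 concrete. The sketch below assumes the linear dynamics $dx_k/dt = -\theta x_k + \sum_j C_{kj}(x_j - x_k)$ between pulses and treats each Dirac pulse as a unit jump in the state of every control node. The explicit Euler integrator, the function name, and all parameter values are illustrative choices, not part of the source.

```python
def impulse_response_features(c, seeds, theta, period, n_pulses=3, dt=0.01):
    """Return S_k = max_t x_k(t) for every supra-node k.

    c: normalized supra-adjacency matrix (symmetric nested list).
    seeds: set of supra-node indices of the control nodes (known disease genes).
    """
    size = len(c)
    degree = [sum(row) for row in c]
    b = [1.0 if k in seeds else 0.0 for k in range(size)]
    x = [0.0] * size
    peak = [0.0] * size
    steps_per_period = int(round(period / dt))
    for _ in range(n_pulses):
        # Pulse: each control node receives a unit jump
        # (the time integral of the Dirac delta).
        x = [x[k] + b[k] for k in range(size)]
        peak = [max(peak[k], x[k]) for k in range(size)]
        for _ in range(steps_per_period):
            # Explicit Euler step of dx/dt = -theta*x + sum_j C_kj (x_j - x_k).
            flow = [sum(c[k][j] * x[j] for j in range(size)) - degree[k] * x[k]
                    for k in range(size)]
            x = [x[k] + dt * (-theta * x[k] + flow[k]) for k in range(size)]
            peak = [max(peak[k], x[k]) for k in range(size)]
    return peak


# Toy normalized network: 2 layers x 2 proteins (supra-node = layer*2 + protein).
c = [[0.0, 0.5, 0.25, 0.0],
     [0.5, 0.0, 0.0, 0.25],
     [0.25, 0.0, 0.0, 0.5],
     [0.0, 0.25, 0.5, 0.0]]
# Supra-node 0 is the only seed; theta = 0.5 gives tau = 2 and period = 5*tau.
features = impulse_response_features(c, seeds={0}, theta=0.5, period=10.0)
```

In this toy network the seed has the largest feature, its strongly coupled neighbour (weight 0.5) comes next, the weakly coupled neighbour (weight 0.25) follows, and the node two hops away has the smallest value, which is the sense in which $S$ measures association strength with the control nodes.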

Further, the step 4 specifically includes: in a multiple protein network comprising M network layers of N nodes, each protein has M corresponding replica nodes, i.e. M pulse dynamics feature magnitudes $S_i^{\alpha}$ ($\alpha = 1$ to $M$). In each network layer, the nodes are sorted in descending order of their dynamic characteristic $S_i^{\alpha}$ to obtain the rank $R_i^{\alpha}$ of each node. The reciprocal of the geometric mean of the ranking values of the nodes corresponding to the same protein in the M network layers is then calculated to obtain the comprehensive score of the protein:

$$\mathrm{Score}_i = \frac{1}{\left( \prod_{\alpha=1}^{M} R_i^{\alpha} \right)^{1/M}}.$$

Finally, the proteins are ranked in descending order of comprehensive score; proteins ranked nearer the top are more likely to correspond to disease-related candidate genes, so that disease genes are identified or predicted and effective guidance is provided for biological experimental research on disease genes.
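The rank fusion in step 4 can be sketched as follows. The function name and input layout are hypothetical, and ties within a layer are broken by node index, which the source does not specify.

```python
import math


def fuse_layer_scores(features):
    """Reciprocal geometric mean of per-layer ranks.

    features[a][i] is the pulse dynamics feature of protein i in layer a.
    """
    m = len(features)
    n = len(features[0])
    log_rank_sum = [0.0] * n
    for layer in features:
        # Rank 1 is the largest feature value in this layer (descending order);
        # ties keep their index order because Python's sort is stable.
        order = sorted(range(n), key=lambda i: layer[i], reverse=True)
        for rank, i in enumerate(order, start=1):
            log_rank_sum[i] += math.log(rank)
    # exp(-mean(log R)) equals 1 / (geometric mean of the M ranks).
    return [math.exp(-s / m) for s in log_rank_sum]


# Two layers, three proteins.
features = [[0.9, 0.2, 0.5],
            [0.4, 0.1, 0.3]]
scores = fuse_layer_scores(features)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Here protein 0 is ranked first in both layers (score 1), protein 2 second in both (score 1/2), and protein 1 third in both (score 1/3), so the final ranking is 0, 2, 1. Working in log space avoids overflow of the rank product when M and N are large.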

Further, the acquiring protein physical interaction network in the step (1) specifically includes one or more of a regulatory network, a metabolic network, a signaling network, a protein complex network, a protein kinase network, a high-throughput binary interaction network, and a literature-validated protein interaction network.

Further, the construction of the protein function association network in the step (1) specifically includes a gene co-expression network and/or a gene semantic association network based on disease gene association.

In another aspect, the invention also discloses a disease gene prediction system based on multiple protein network pulse dynamics process, comprising:

at least one processor; and at least one memory communicatively coupled to the processor, wherein:

the memory stores program instructions executable by the processor to invoke the program instructions to perform the disease gene prediction method based on the multiple protein network pulse dynamics process as described in any one of the above.

In a further aspect, the invention also discloses a non-transitory computer readable storage medium storing computer instructions that cause the computer to perform the disease gene prediction method based on multiple protein network pulse dynamics process as described in any one of the above.

(III) beneficial effects

The technical scheme of the invention has the advantages that the method can more effectively fuse a plurality of types of protein networks, and the information hidden in the multiple protein network structure is mined through the pulse dynamics process of the multiple protein networks, so that the disease related genes can be more effectively identified. The experimental result on the real data set shows that compared with a plurality of existing methods, the prediction method provided by the invention has stronger and more accurate prediction capability, has small calculated amount, and is suitable for realizing analysis processing of batch biological information data through software calculation.

Drawings

The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are illustrative and should not be construed as limiting the invention in any way, in which:

FIG. 1 is a flowchart of a disease gene prediction method NIDM of the present invention;

FIG. 2 is a graph showing the percentage improvement of the performance of the disease gene prediction method NIDM according to the invention in different data sets when a leave-one-out verification strategy is adopted;

FIG. 3 is a graph of percentage improvement of performance of a disease gene prediction method NIDM of the present invention in different data sets using a five-fold cross-validation strategy;

FIG. 4 is a graph comparing the performance index of the disease gene prediction method NIDM of the present invention with the performance index of the existing RWRMP, RWRMG, DRS, endeavour, RWR and KS methods when a leave-one-out verification strategy is adopted;

FIG. 5 is a graph comparing performance metrics of the disease gene prediction method NIDM of the present invention with the existing RWRMP, RWRMG, DRS, endeavour, RWR and KS methods when a five-fold cross-validation strategy is employed.

Detailed Description

The technical problems and advantages of the technical solution of the present invention will be described in detail with reference to the accompanying drawings and examples, and it should be noted that the described examples are only intended to facilitate understanding of the present invention and are not intended to limit the present invention in any way.

As shown in FIG. 1, the invention provides a disease gene prediction method based on multiple protein network pulse dynamics process, which comprises the following steps:

Step 1: Construction of a standardized multiple protein network

After biological data preprocessing, the nodes corresponding to the same protein in a plurality of different types of protein networks are connected to construct a multiple protein network, so that multi-network fusion is realized; the edge weights of the multiple protein network are standardized by computing the average degree of the network nodes.

The step 1 specifically comprises the following steps:

(1) Biological data preprocessing: acquiring known disease gene-related data, disease phenotype-related annotation data, and human phenotype ontology data; acquiring a protein physical interaction network (e.g., regulatory network, metabolic network, signaling network, protein complex network, protein kinase network, high-throughput binary interaction network, and literature-validated protein interaction network); constructing a protein function association network (such as a gene co-expression network and a gene semantic association network based on disease gene association); uniformly mapping protein identifiers to standard gene symbols;

the inter-layer supra-adjacency matrix is defined as

$$\mathcal{A}^{\mathrm{inter}} = A^{L} \otimes I,$$

where $A^{L} \in \mathbb{R}^{M \times M}$ is the inter-layer link matrix, whose nodes are the network layers and whose edge weights are the link strengths between network layers, i.e. 1/M; $\otimes$ denotes the Kronecker product; and $I \in \mathbb{R}^{N \times N}$ is the identity matrix;

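(3) Normalization of the multiple protein network: the weights of all edges of the multiple protein network are divided by the average degree of the network nodes. With $A^{(\alpha)} \in \mathbb{R}^{N \times N}$ the adjacency matrix of network layer $\alpha$, the average node degree is

$$\langle k \rangle = \frac{1}{NM} \sum_{\alpha,\beta=1}^{M} \sum_{i,j=1}^{N} \left[ \delta(\alpha,\beta)\, A^{(\alpha)}_{ij} + A^{L}_{\alpha\beta}\, I_{ij} \right],$$

and the normalized network is recorded in a fourth-order tensor $C$, where

$$C^{\alpha\beta}_{ij} = \frac{1}{\langle k \rangle} \left[ \delta(\alpha,\beta)\, A^{(\alpha)}_{ij} + A^{L}_{\alpha\beta}\, I_{ij} \right],$$

$I \in \mathbb{R}^{N \times N}$ is the identity matrix, and $\delta(\alpha,\beta)$ is the Kronecker delta function: $\delta(\alpha,\beta) = 1$ when $\alpha = \beta$, and 0 otherwise.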

Step 2: construction of multiple protein network pulse dynamics model

Applying periodic pulse signals to seed nodes of each network layer of the multiple protein network in the step 1, exciting the pulse dynamics process of the multiple protein network, calculating the pulse response curve of the multiple protein network nodes, and mining the hidden characteristics of the network nodes;

the step 2 specifically comprises the following steps: when the pulse dynamics process is excited on the multiple protein network, the pulse dynamics equation on the normalized multiple protein network is defined as follows:

$$\frac{d x_i^{\alpha}(t)}{dt} = f\!\left(x_i^{\alpha}(t)\right) + \sum_{j=1}^{N} C^{\alpha\alpha}_{ij} \left( x_j^{\alpha}(t) - x_i^{\alpha}(t) \right) + \sum_{\beta=1}^{M} C^{\alpha\beta}_{ii} \left( x_i^{\beta}(t) - x_i^{\alpha}(t) \right) + b_i^{\alpha} u_t,$$

wherein $x_i^{\alpha}(t)$ is the state of node $i$ ($i = 1$ to $N$, $N$ is the total number of nodes) in network layer $\alpha$ ($\alpha = 1$ to $M$, $M$ is the total number of network layers) at time $t$; $f(\cdot)$ is a continuously differentiable function describing the self-evolution of a node when it is not influenced by other nodes, defined as $f(x) = -\theta x$, where $\theta > 0$ is a self-evolution weight parameter; $C^{\alpha\beta}_{ij}$ is the diffusion coefficient between node $i$ of network layer $\alpha$ and node $j$ of network layer $\beta$, namely the connection weight between the nodes after network normalization, and $C$ is the fourth-order tensor, which determines how the pulse signal diffuses between the nodes of each network layer; if node $i$ of network layer $\alpha$ is a control node to which the periodic pulse signal is applied, i.e. a known disease gene, $b_i^{\alpha} = 1$, otherwise $b_i^{\alpha} = 0$; $u_t = \sum_{\sigma} \delta(t - t_{\sigma})$ is a periodic activation function, where $t_{\sigma}$ is the time of the $\sigma$-th pulse and $\delta(t - t_{\sigma})$ is a Dirac delta function, i.e. a unit impulse applied at $t = t_{\sigma}$ and zero otherwise; the four terms of the pulse dynamics equation describe, respectively, the self-evolution of the node, the influence of intra-layer interactions, the influence of inter-layer interactions in the multiple protein network, and the influence of the periodic pulse signal;
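A short derivation may help explain why the disclosure sets the pulse period to at least 5 times the characteristic time $\tau$. It assumes the matrix form $dX/dt = -(\theta I + \mathcal{L})X + B u_t$ stated in the disclosure, an undirected network (so that $\theta I + \mathcal{L}$ is symmetric), and that $\lambda_m = 1/\tau$ is the smallest eigenvalue of $\theta I + \mathcal{L}$; the symbol $T$ for the pulse period is introduced here for illustration.

```latex
% Between two pulses (t_\sigma < t < t_{\sigma+1}) the input term vanishes:
\frac{dX}{dt} = -(\theta I + \mathcal{L})\,X
\;\Longrightarrow\;
X(t) = e^{-(\theta I + \mathcal{L})(t - t_\sigma)}\, X(t_\sigma^{+}).

% Every mode decays at least as fast as the slowest one, \lambda_m = 1/\tau:
\lVert X(t) \rVert_2 \;\le\; e^{-(t - t_\sigma)/\tau}\, \lVert X(t_\sigma^{+}) \rVert_2 .

% With a pulse period T = 5\tau, the residual just before the next pulse is
e^{-T/\tau} = e^{-5} \approx 0.0067 .
```

Under these assumptions less than 1% of the response to one pulse survives until the next, so successive pulses barely overlap and the maximum of each response curve is essentially the response to a single pulse.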

Step 3: extraction of multiple protein network pulse dynamics features

Obtaining the association strength between the network node and the seed node by calculating the dynamic characteristics (S) of the multiple protein network node to the pulse signals;

the step 3 specifically comprises the following steps: to extract the pulse dynamics characteristics of the multiple protein network, the known disease-related genes serve as pulse excitation points that excite the pulse dynamics process in the multiple protein network according to the multiple protein network pulse dynamics model, and the impulse response curves of the network nodes are calculated from the multiple protein network pulse dynamics equation; the dynamic characteristic (S) of a network node in response to the pulse signal during the multiple protein network pulse dynamics process is defined as $S_i^{\alpha} = \max_{t} x_i^{\alpha}(t)$, i.e. the maximum value of the node's impulse response; the magnitude of the dynamic characteristic of each network node is calculated according to this definition and describes the association strength between the node and the control nodes;

Step 4: Fusion of pulse dynamics characteristics of multiple protein network nodes to predict disease genes

Based on the dynamic characteristics in the step 3, obtaining a comprehensive protein score by calculating the reciprocal of the geometric mean of the node ranking values of the corresponding proteins in each network layer of the multiple protein network; disease genes are screened by calculating a descending order of protein composite scores.

The step 4 specifically includes: in a multiple protein network comprising M network layers, each protein has M corresponding replica nodes, that is to say M pulse dynamics feature magnitudes $S_i^{\alpha}$ ($\alpha = 1$ to $M$). In each network layer, the nodes are sorted in descending order of their dynamic characteristic $S_i^{\alpha}$ to obtain the rank $R_i^{\alpha}$ of each node. The reciprocal of the geometric mean of the ranking values of the nodes corresponding to the same protein in the M network layers is then calculated to obtain the comprehensive score of the protein:

$$\mathrm{Score}_i = \frac{1}{\left( \prod_{\alpha=1}^{M} R_i^{\alpha} \right)^{1/M}}.$$

Finally, the proteins are ranked in descending order of comprehensive score; proteins ranked nearer the top are more likely to correspond to disease-related candidate genes, so that disease genes are identified or predicted and effective guidance is provided for biological experimental research on disease genes.
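As a worked illustration of the fusion rule, write $R_i^{\alpha}$ for the rank of protein $i$ in layer $\alpha$ and $\mathrm{Score}_i$ for the reciprocal of the geometric mean of its ranks. The rank values below are made up for the example and do not come from the source.

```latex
% Protein a: ranks (1, 4, 2) in M = 3 layers
\mathrm{Score}_a = \frac{1}{(1 \cdot 4 \cdot 2)^{1/3}} = \frac{1}{8^{1/3}} = \frac{1}{2}

% Protein b: ranks (3, 3, 3) in the same layers
\mathrm{Score}_b = \frac{1}{(3 \cdot 3 \cdot 3)^{1/3}} = \frac{1}{3}

% Arithmetic means of the ranks, for comparison:
\tfrac{1}{3}(1 + 4 + 2) = \tfrac{7}{3}, \qquad \tfrac{1}{3}(3 + 3 + 3) = 3
```

Protein a is placed ahead of protein b. The geometric mean (2 versus 3) separates the two proteins more strongly than the arithmetic mean would (7/3 versus 3), because a top rank in even one layer pulls the product down sharply.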

To demonstrate the advantages of the invention, in another embodiment the validity of the prediction method is further verified experimentally: known disease-gene association data are used as the test platform, and leave-one-out cross-validation and five-fold cross-validation are used to comprehensively evaluate the performance of the method;

(1) Biological data tested: disease gene data are from the OMIM database (https://omim.org/); protein physical interaction data are from published literature data (https://science.sciencemag.org/content/suppl/2015/02/18/347.6224.1257601.DC1); gene expression data are from GTEx; disease phenotype data and phenotype ontology data are from the HPO database;

(2) Evaluation strategy: for leave-one-out validation, one known disease-gene association at a time is held out as the positive test set, and the others serve as the training set; for five-fold cross-validation, the known disease gene set of each disease is randomly split into 5 parts, each of which is used in turn as the positive test set while the others serve as the training set, and the splitting process is repeated multiple times; for the control set, for each gene of the positive test set, the 99 genes closest to it on the same chromosome that do not belong to the training set are selected;

(3) Evaluation index: AUROC and AUPRC are used as the evaluation indexes of prediction performance; AUROC, also known as AUC, is the area under the receiver operating characteristic (ROC) curve, a performance curve with the true positive rate (also known as recall or sensitivity) as the ordinate and the false positive rate as the abscissa, and has been widely used to measure the global performance of prediction algorithms; AUPRC is the area under the precision-recall curve (PRC), which has precision as the ordinate and recall as the abscissa;

(4) Evaluation results

As can be seen from fig. 2 and 3, when multiple types of physical interaction networks are used, the multiple protein network pulse dynamics method is superior to the aggregated-network method; the same holds when multiple types of physical interaction networks and the gene co-expression network are used; relative to the multiple protein network pulse dynamics method using only multiple types of physical interaction networks, adding the gene co-expression network enhances predictive ability; and relative to the method using multiple types of physical interaction networks and the gene co-expression network, adding the gene semantic similarity network further improves predictive ability;

as can be seen from fig. 4, in the leave-one-out experiment, in the multiple protein networks of the multiple types of physical interactions ((a) and (e) in fig. 4), the multiple protein networks of the physical interaction network combined gene co-expression network ((b) and (f) in fig. 4), the multiple protein networks of the physical interaction network combined gene co-expression network and the gene semantic similarity network ((c) and (g) in fig. 4), both AUROC values and AUPRC values of the multiple protein network pulse dynamics method (method abbreviated as NIDM) are superior to other methods;

as can be seen from fig. 5, in the five-fold cross-validation experiment, in the multiple protein networks of the multiple types of physical interactions ((a) and (e) in fig. 5), the multiple protein networks of the physical interaction network combined gene co-expression network ((b) and (f) in fig. 5), the multiple protein networks of the physical interaction network combined gene co-expression network and the gene semantic similarity network ((c) and (g) in fig. 5), both the AUROC value and the AUPRC value of the multiple protein network pulse dynamics method NIDM are also superior to those of the other methods;

therefore, the prediction method NIDM of the invention can more effectively fuse multiple protein networks, and can more effectively extract the hidden information in the network through the pulse dynamics process of the multiple protein networks, thereby more effectively identifying the disease genes.
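The evaluation protocol described in items (2) and (3) above can be sketched as follows. The data layout (`positions` as a gene-to-(chromosome, coordinate) dictionary) and all function names are hypothetical. AUROC is computed through its pairwise-comparison (Mann-Whitney) form, and AUPRC is approximated by average precision, which is one common estimator and not necessarily the one used in the experiments; ties in score are not treated specially by this estimator.

```python
def select_controls(test_gene, positions, training_genes, k=99):
    """Pick the k genes nearest to test_gene on the same chromosome,
    excluding the test gene itself and every training gene.

    positions: hypothetical dict gene -> (chromosome, coordinate).
    """
    chrom, pos = positions[test_gene]
    candidates = [g for g, (c, _) in positions.items()
                  if c == chrom and g != test_gene and g not in training_genes]
    candidates.sort(key=lambda g: abs(positions[g][1] - pos))
    return candidates[:k]


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative
    (ties count one half); equal to the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def auprc(scores, labels):
    """Average precision: mean of the precision at each positive's rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy data: gene G0 is the held-out positive, G2 is in the training set.
positions = {"G0": ("1", 100), "G1": ("1", 120), "G2": ("1", 90),
             "G3": ("2", 101), "G4": ("1", 500)}
controls = select_controls("G0", positions, training_genes={"G2"}, k=2)

example_scores = [0.9, 0.8, 0.7, 0.6]
example_labels = [1, 0, 1, 0]
roc_value = auroc(example_scores, example_labels)   # 0.75
pr_value = auprc(example_scores, example_labels)    # (1/1 + 2/3) / 2
```

In the toy ranking three of the four positive-negative pairs are ordered correctly, giving an AUROC of 0.75; the two positives sit at ranks 1 and 3, giving an average precision of about 0.83.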

It should be further noted that the above-mentioned prediction method of the present invention may be implemented as a software program or a computer instruction in a non-transitory computer readable storage medium or in a control system with a memory and a processor, and the calculation program thereof is simple and fast. The functional units in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in hardware plus software functional units. The integrated units implemented in the form of software functional units described above may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to perform part of the steps of the methods according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.

The last explanation is: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A disease gene prediction method based on multiple protein network pulse dynamics process is characterized by comprising the following steps:

step 1: after biological data preprocessing, the nodes corresponding to the same protein in a plurality of protein networks of different types are connected to construct a multiple protein network, realizing multi-network fusion, and the edge weights of the multiple protein network are standardized by calculating the average degree of the network nodes;

2. The disease gene prediction method based on multiple protein network pulse dynamics process according to claim 1, wherein the step 1 specifically comprises the following steps:

the inter-layer supra-adjacency matrix is defined as

$$\mathcal{A}^{\mathrm{inter}} = A^{L} \otimes I,$$

wherein $A^{L} \in \mathbb{R}^{M \times M}$ is the inter-layer link matrix, whose nodes are the network layers and whose edge weights are the link strengths between network layers, set to 1/M; $\otimes$ denotes the Kronecker product; and $I \in \mathbb{R}^{N \times N}$ is the identity matrix;

3. The disease gene prediction method based on multiple protein network pulse dynamics process according to claim 1, wherein the step 2 specifically comprises: when the pulse dynamics process is excited on the multiple protein network, the pulse dynamics equation on the normalized multiple protein network is defined as follows:

$$\frac{d x_i^{\alpha}(t)}{dt} = f\!\left(x_i^{\alpha}(t)\right) + \sum_{j=1}^{N} C^{\alpha\alpha}_{ij} \left( x_j^{\alpha}(t) - x_i^{\alpha}(t) \right) + \sum_{\beta=1}^{M} C^{\alpha\beta}_{ii} \left( x_i^{\beta}(t) - x_i^{\alpha}(t) \right) + b_i^{\alpha} u_t,$$

wherein $x_i^{\alpha}(t)$ is the state of node $i$ in network layer $\alpha$ at time $t$, $\alpha = 1$ to $M$, $M$ is the total number of network layers, $i = 1$ to $N$, and $N$ is the total number of nodes; $f(\cdot)$ is a continuously differentiable function describing the self-evolution of a node when it is not influenced by other nodes, defined as $f(x) = -\theta x$, where $\theta > 0$ is a self-evolution weight parameter; $C^{\alpha\beta}_{ij}$ is the diffusion coefficient between node $i$ of network layer $\alpha$ and node $j$ of network layer $\beta$, namely the connection weight between the nodes after network normalization, and $C$ is the fourth-order tensor; if node $i$ of network layer $\alpha$ is a control node to which the periodic pulse signal is applied, i.e. a known disease gene, $b_i^{\alpha} = 1$, otherwise $b_i^{\alpha} = 0$; $u_t = \sum_{\sigma} \delta(t - t_{\sigma})$ is a periodic activation function, where $t_{\sigma}$ is the time of the $\sigma$-th pulse and $\delta(t - t_{\sigma})$ is a Dirac delta function, i.e. a unit impulse applied at $t = t_{\sigma}$ and zero otherwise;

wherein $X(t)$ is the state vector, $\mathcal{L}$ is the supra-Laplacian matrix of the multiple network, $B$ is a vector indicating the control nodes, and $u_t$ is the aforementioned periodic activation function; from the matrix equation, the characteristic time of the dynamics is $\tau = 1/\lambda_m$, wherein $\lambda_m$ is the minimum eigenvalue of the matrix $\theta I + \mathcal{L}$, $I$ is an identity matrix, and $\theta > 0$; the pulse period is set to at least 5 times the characteristic time $\tau$.

4. The method for predicting disease genes based on the pulse dynamics process of multiple protein networks according to claim 3, wherein the step 3 specifically comprises: to extract the pulse dynamics characteristics of the multiple protein network, the known disease-related genes serve as pulse excitation points that excite the pulse dynamics process in the multiple protein network according to the multiple protein network pulse dynamics model, and the impulse response curves of the network nodes are calculated from the multiple protein network pulse dynamics equation; the dynamic characteristic (S) of a network node in response to the pulse signal during the multiple protein network pulse dynamics process is defined as $S_i^{\alpha} = \max_{t} x_i^{\alpha}(t)$, i.e. the maximum value of the node's impulse response; the magnitude of the dynamic characteristic of each network node is calculated according to this definition and describes the association strength between the node and the control nodes.

5. The method for predicting disease genes based on the pulse dynamics process of multiple protein networks according to claim 4, wherein the step 4 specifically comprises: in a multiple protein network comprising M network layers of N nodes, each protein has M corresponding replica nodes, i.e. M pulse dynamics feature magnitudes $S_i^{\alpha}$ ($\alpha = 1$ to $M$); in each network layer, the nodes are sorted in descending order of their dynamic characteristic $S_i^{\alpha}$ to obtain the rank $R_i^{\alpha}$ of each node; the reciprocal of the geometric mean of the ranking values of the nodes corresponding to the same protein in the M network layers is then calculated to obtain the comprehensive score of the protein:

$$\mathrm{Score}_i = \frac{1}{\left( \prod_{\alpha=1}^{M} R_i^{\alpha} \right)^{1/M}};$$

finally, the proteins are ranked in descending order of comprehensive score; proteins ranked nearer the top are more likely to correspond to disease-related candidate genes, so that disease genes are identified or predicted and effective guidance is provided for biological experimental research on disease genes.

6. The method of claim 2, wherein the acquiring protein physical interaction network in step (1) comprises one or more of a regulatory network, a metabolic network, a signaling network, a protein complex network, a protein kinase network, a high throughput binary interaction network, and a literature-validated protein interaction network.

7. The disease gene prediction method based on multiple protein network pulse dynamics process according to claim 2, wherein the constructing protein function association network in the step (1) specifically includes a gene co-expression network and/or a gene semantic association network based on disease gene association.

8. A disease gene prediction system based on multiple protein network pulse dynamics process, comprising:

the memory stores program instructions executable by the processor, the processor invoking the program instructions capable of performing the multiple protein network pulse dynamics-based disease gene prediction method of any one of claims 1 to 7.

9. A non-transitory computer readable storage medium storing computer instructions that cause the computer to perform the disease gene prediction method based on multiple protein network pulse dynamics process according to any one of claims 1 to 7.