WO2017185834A1 - Method for identifying key modules or key nodes in a biomolecular network - Google Patents

Method for identifying key modules or key nodes in a biomolecular network

Info

Publication number
WO2017185834A1
WO2017185834A1 PCT/CN2017/071209 CN2017071209W WO2017185834A1 WO 2017185834 A1 WO2017185834 A1 WO 2017185834A1 CN 2017071209 W CN2017071209 W CN 2017071209W WO 2017185834 A1 WO2017185834 A1 WO 2017185834A1
Authority
WO
WIPO (PCT)
Prior art keywords
network
module
nodes
key
importance
Prior art date
Application number
PCT/CN2017/071209
Other languages
English (en)
French (fr)
Inventor
王忠
张莹莹
王永炎
Original Assignee
王忠
张莹莹
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 王忠, 张莹莹
Priority to US16/096,276 (US20190139621A1)
Publication of WO2017185834A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00: ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • G16B10/00: ICT specially adapted for evolutionary bioinformatics, e.g. phylogenetic tree construction or analysis
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20: Supervised data analysis
    • G16B45/00: ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks

Definitions

  • The invention belongs to the field of bioinformatics.
  • Specifically, the present invention relates to a method for identifying key modules or key nodes in complex molecular networks of disease and drug intervention, such as protein interaction networks, gene expression regulatory networks, and drug metabolism and drug target networks.
  • Evidence shows that biomolecular networks have a modular organization.
  • Identifying the module structure helps to reduce the dimensionality of complex networks and to simplify them. It is a key factor in understanding biological systems and offers an opportunity to systematically reveal the complex mechanisms of drug action.
  • Network data models indicate that, for complex diseases, inhibiting multiple targets is more effective than inhibiting a single target. The treatment of complex diseases therefore requires a modular design that affects multiple targets, which in turn requires module identification in the relevant systems and networks.
  • Although a biomolecular network contains multiple modules, they are not of equal status; some are primary and others secondary.
  • In earlier research, the inventors proposed that biomolecular networks contain key modules, that is, modules that occupy a key and dominant position in both structure and function: structurally they have strong centrality and dominance, functionally they are strongly integrative, and they can trigger one or more specific effects.
  • The present invention provides a new method capable of identifying key modules or key nodes in a biomolecular network. Based on multiple measures of node importance and on the topology of the network or module, the method identifies key modules, or key nodes, from the biomolecular network.
  • As used herein, the term "biomolecular network" refers to a network that exists in different organizational forms in a biological system and consists of nodes representing various biomolecules and edges representing the interactions between them.
  • Common biomolecular networks include gene transcriptional regulatory networks, biological metabolic networks, epigenetic networks, phenotypic networks, signaling networks, and protein interaction networks.
  • As used herein, the term "key module" refers to a module that occupies a key and dominant position in both the structure and the function of the network: it is structurally central and functionally strongly integrative, deleting it affects network structure and information transfer more than the average effect of deleting the other modules in the network, and it can trigger one or more specific effects.
  • As used herein, the term "key node" refers to a node that occupies a key position in both the structure and the function of the network, for example one with an important influence on information transfer in the network or on the interactions between nodes. Deleting this node affects network structure and information transfer more than the average effect of deleting the other nodes in the network. Because of the important role such a node plays in the whole biomolecular network, for example a protein interaction network, it can be regarded as having a pharmacological driving role, and is therefore also called a "pharmacological driver" herein.
  • the invention provides a method of identifying key modules in a biomolecular network, the method comprising the steps of:
  • (1) Perform module identification on the biomolecular network, using a node count of ≥ 3 as the criterion; an exemplary module identification method is MCODE.
  • (2) For the modules identified in step (1), construct a weighted or unweighted module interaction network based on the component correlation between modules. To construct the weighted module interaction network, the edges that actually exist between modules in the original network are restored to establish the interactions between modules, and the number of such edges is used as the weight of the relationship between the modules: the more edges connect the nodes of two modules, the greater the weight of the interaction edge between those two modules.
  • The unweighted module interaction network considers only whether edges exist between modules, and treats all edges between any two modules as a single edge.
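The construction in step (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the applicants' implementation: the function name `build_module_network`, the toy edge list, and the assumption that modules do not overlap are all hypothetical, and the module identification itself (for example by MCODE) is taken as given.

```python
from collections import Counter


def build_module_network(edges, modules, weighted=True):
    """Collapse a node-level network into a module interaction network.

    edges   : iterable of (u, v) node pairs from the original network
    modules : dict mapping a module id to the set of its member nodes
              (assumed here to be non-overlapping)
    Returns a dict mapping frozenset({module_a, module_b}) to an edge weight.
    """
    module_of = {node: mid for mid, members in modules.items() for node in members}

    # Treat the network as undirected and count each node pair only once.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}

    counts = Counter()
    for edge in unique_edges:
        u, v = tuple(edge)
        mu, mv = module_of.get(u), module_of.get(v)
        # Keep only edges whose two endpoints lie in two different modules.
        if mu is not None and mv is not None and mu != mv:
            counts[frozenset((mu, mv))] += 1

    if weighted:
        return dict(counts)              # weight = number of restored edges
    return {pair: 1 for pair in counts}  # any number of edges counts as one


# Hypothetical toy data: two triangles joined by two edges, plus a stray node G.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"),
             ("D", "E"), ("E", "F"), ("D", "F"),
             ("C", "D"), ("B", "E"), ("F", "G")]
toy_modules = {"M1": {"A", "B", "C"}, "M2": {"D", "E", "F"}}

weighted_net = build_module_network(toy_edges, toy_modules, weighted=True)
unweighted_net = build_module_network(toy_edges, toy_modules, weighted=False)
print(weighted_net)    # the M1-M2 edge carries weight 2
print(unweighted_net)  # the same edge carries weight 1
```

In the toy data, the edges C-D and B-E connect the two modules, so the weighted network gives the M1-M2 edge a weight of 2 while the unweighted network records it once. The edge F-G is ignored because G belongs to no module.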
  • (3) For the weighted or unweighted module interaction network constructed in step (2), identify key modules using at least three methods of measuring node importance. Scores are calculated with each method and ranked in descending order, and a key module satisfies the following conditions: (A) it ranks first under at least one of the methods; and (B) it ranks in the top three under all of the methods.
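The selection rule in conditions (A) and (B) is simple enough to state as code. The sketch below is illustrative only: the function names and the scores are hypothetical, and the handling of tied scores (tied items share the better rank) is an assumption, since the text does not say how ties are resolved.

```python
def rank_descending(scores):
    """Map each item to its rank, 1 being the highest score.

    Tied scores share the better rank (an assumption made for this sketch).
    """
    ordered = sorted(scores.values(), reverse=True)
    return {item: ordered.index(value) + 1 for item, value in scores.items()}


def select_key_items(scores_by_method, top_k=3):
    """Return items ranked first by at least one method and top_k by all.

    scores_by_method : dict mapping a method name to {item: score}
    """
    if len(scores_by_method) < 3:
        raise ValueError("the rule calls for at least three importance measures")

    ranks = {method: rank_descending(scores)
             for method, scores in scores_by_method.items()}
    # Only items scored by every method can satisfy condition (B).
    items = set.intersection(*(set(scores) for scores in scores_by_method.values()))

    key_items = []
    for item in sorted(items):
        item_ranks = [ranks[method][item] for method in ranks]
        first_somewhere = min(item_ranks) == 1       # condition (A)
        top_k_everywhere = max(item_ranks) <= top_k  # condition (B)
        if first_somewhere and top_k_everywhere:
            key_items.append(item)
    return key_items


# Made-up scores for four modules under three measures.
example_scores = {
    "node_strength": {"M1": 12, "M2": 9, "M3": 7, "M4": 3},
    "betweenness":   {"M1": 0.40, "M2": 0.55, "M3": 0.10, "M4": 0.05},
    "pagerank":      {"M1": 0.35, "M2": 0.30, "M3": 0.20, "M4": 0.15},
}
print(select_key_items(example_scores))  # ['M1', 'M2']
```

With these made-up numbers both M1 and M2 qualify, which shows that the rule can return more than one key module. The same function applies unchanged to node scores when identifying key nodes.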
  • Preferably, the biomolecular network includes a gene expression regulatory network, a protein interaction network, a biological metabolic network, an epigenetic network, a phenotypic network, a signaling network, and the like, preferably a protein interaction network; more preferably, the biomolecular network is a disease-related protein interaction network; further preferably, the biomolecular network is a disease-related protein interaction network before or after drug intervention. Because the biomolecular network of the present invention can be a disease-related network and can further be combined with drug intervention, the identification method of the present invention provides a scientific basis for disease mechanism research and the corresponding drug development by revealing the key parts of the data network.
  • In terms of complexity, the biomolecular network is a protein interaction network; preferably the network consists of ≥ 100 nodes (proteins) and ≥ 200 edges (interactions between proteins), more preferably ≥ 200 nodes and ≥ 500 edges, and especially ≥ 2000 nodes and ≥ 6000 edges.
  • The method of measuring node importance in step (3) of the method of the present invention may be selected from: degree centrality, closeness centrality, eigenvector centrality, betweenness centrality, subgraph centrality, hub centrality, control centrality, node strength, stress centrality, PageRank, and the spectrum of the adjacency matrix. See Table 1 below for details.
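Two of the listed measures, degree centrality and node strength (weighted degree), can be computed directly from an edge list. The following is a minimal sketch under assumptions: the function name and the edge weights are hypothetical, the network is treated as undirected without self-loops, and degree centrality is normalized by n - 1, which is the common convention rather than something the text specifies.

```python
def degree_and_strength(weighted_edges):
    """Compute degree centrality and node strength for an undirected network.

    weighted_edges : dict mapping a node pair (u, v) to a positive weight,
                     each undirected edge listed once, no self-loops
    Returns (degree_centrality, strength) as two dicts keyed by node.
    Nodes without any edge do not appear in the result.
    """
    degree, strength = {}, {}
    for (u, v), weight in weighted_edges.items():
        for node in (u, v):
            degree[node] = degree.get(node, 0) + 1                # number of neighbours
            strength[node] = strength.get(node, 0) + weight       # sum of edge weights

    n = len(degree)
    # Conventional normalization: divide by the maximum possible degree, n - 1.
    degree_centrality = {node: (d / (n - 1) if n > 1 else 0.0)
                         for node, d in degree.items()}
    return degree_centrality, strength


# Hypothetical weighted module interaction network.
module_edges = {("M1", "M2"): 2, ("M1", "M3"): 5, ("M2", "M3"): 1, ("M3", "M4"): 1}
centrality, strength = degree_and_strength(module_edges)
print(centrality["M3"])                # 1.0, M3 touches every other module
print(strength["M1"], strength["M3"])  # 7 7
```

The example also shows why the weighted and unweighted variants can disagree: M1 and M3 have equal strength, but M3 has the higher degree.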
  • In another aspect, the present invention provides a method of identifying key nodes in a network or module, the method comprising the following step: identifying the key nodes in the network or module using at least three methods of measuring node importance. Scores are calculated with each method and ranked in descending order, and a key node satisfies the following conditions: (A) it ranks first under at least one of the methods; and (B) it ranks in the top three under all of the methods.
  • the key nodes can be identified from any network, such as a biomolecular network, or from the modules obtained in steps (1) and (2) of the above method.
  • Therefore, when key nodes are identified from a network, the network is preferably a biomolecular network, including a gene expression regulatory network, a protein interaction network, a biological metabolic network, an epigenetic network, a phenotypic network, a signaling network, and the like, preferably a protein interaction network; more preferably, the biomolecular network is a disease-related protein interaction network; further preferably, the biomolecular network is a disease-related protein interaction network before or after drug intervention.
  • In terms of complexity, the biomolecular network is a protein interaction network; preferably the network consists of ≥ 100 nodes (proteins) and ≥ 200 edges (interactions between proteins), more preferably ≥ 200 nodes and ≥ 500 edges, and especially ≥ 2000 nodes and ≥ 6000 edges.
  • When key nodes are identified from a module, the module is a module identified from a biomolecular network using a node count of ≥ 3 as the criterion;
  • or the module is a key module obtained by the following steps:
  • (2) For the modules identified in step (1), construct a weighted or unweighted module interaction network based on the component correlation between modules. To construct the weighted module interaction network, the edges that actually exist between modules in the original network are restored to establish the interactions between modules, and the number of such edges is used as the weight of the relationship between the modules: the more edges connect the nodes of two modules, the greater the weight of the interaction edge between them. The unweighted module interaction network considers only whether edges exist between modules, and treats all edges between any two modules as a single edge.
  • (3) For the weighted or unweighted module interaction network constructed in step (2), identify key modules using at least three methods of measuring node importance. Scores are calculated with each method and ranked in descending order, and a key module satisfies the following conditions: (A) it ranks first under at least one of the methods; and (B) it ranks in the top three under all of the methods.
  • The method of identifying key nodes includes the following steps:
  • (2) For the modules identified in step (1), construct a weighted or unweighted module interaction network based on the component correlation between modules. To construct the weighted module interaction network, the edges that actually exist between modules in the original network are restored to establish the interactions between modules, and the number of such edges is used as the weight of the relationship between the modules: the more edges connect the nodes of two modules, the greater the weight of the interaction edge between them. The unweighted module interaction network considers only whether edges exist between modules, and treats all edges between any two modules as a single edge.
  • (3) For the weighted or unweighted module interaction network constructed in step (2), identify key modules using at least three methods of measuring node importance. Scores are calculated with each method and ranked in descending order, and a key module satisfies the following conditions: (A) it ranks first under at least one of the methods; and (B) it ranks in the top three under all of the methods;
  • (4) For the key modules identified in step (3), identify the key nodes within them using at least three methods of measuring node importance. Scores are calculated with each method and ranked in descending order, and a key node satisfies the following conditions: (A) it ranks first under at least one of the methods; and (B) it ranks in the top three under all of the methods.
  • The methods of measuring node importance in steps (3) and (4) are selected from: degree centrality, closeness centrality, eigenvector centrality, betweenness centrality, subgraph centrality, hub centrality, control centrality, node strength, stress centrality, PageRank, and the spectrum of the adjacency matrix. See also Table 1.
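Betweenness centrality, one of the measures used in the examples, counts how many shortest paths between other node pairs pass through a node. The sketch below uses Brandes' algorithm for unweighted, undirected graphs; it is a standard textbook technique chosen for illustration, not something the text prescribes, and the function name and toy graph are hypothetical. Scores are left unnormalized.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness centrality via Brandes' algorithm.

    adj : dict mapping each node to the set of its neighbours
          (undirected, unweighted; every node must appear as a key)
    """
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search from `source`, counting shortest paths.
        order = []                          # nodes in order of discovery
        preds = {v: [] for v in adj}        # predecessors on shortest paths
        sigma = dict.fromkeys(adj, 0)       # number of shortest paths from source
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Accumulate path dependencies from the farthest nodes back to source.
        delta = dict.fromkeys(adj, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]

    # Each unordered pair was counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}


# Toy path graph A - B - C - D.
path_graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(betweenness_centrality(path_graph))
# {'A': 0.0, 'B': 2.0, 'C': 2.0, 'D': 0.0}
```

On the path graph, B lies on the shortest paths A-C and A-D, and C lies on A-D and B-D, so each scores 2 while the end nodes score 0.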
  • The identification method provided by the present invention is shown in Fig. 1.
  • The biomolecular network is a protein interaction network of cerebral ischemia, in particular a protein interaction network derived from the cerebral ischemia model group, the baicalin group, or the geniposide group.
  • The protein interaction network of the cerebral ischemia model group consists of 3750 nodes and 9162 edges.
  • The protein interaction network of the baicalin group consists of 2813 nodes and 6217 edges.
  • The protein interaction network of the geniposide group consists of 3416 nodes and 7581 edges (see Zhang Yingying, Identification of the main modules of the protein network of the cerebral ischemia model under Qingkailing multi-component intervention [D], China Academy of Chinese Medical Sciences, 2014).
  • The methods for identifying key modules and key nodes from the above protein interaction networks are as follows:
  • (2) For the modules identified in step (1), construct a weighted module interaction network or an unweighted module interaction network based on the component correlation between modules;
  • (3) For the weighted module interaction network constructed in step (2), select several methods of measuring node importance, for example node strength, betweenness centrality, and PageRank, to identify key modules. Scores are calculated with each method and ranked in descending order, and a key module satisfies the following conditions: (A) it ranks first under at least one method; and (B) it ranks in the top three under all methods; or
  • (3') For the unweighted module interaction network constructed in step (2), select several methods of measuring node importance, for example degree centrality, betweenness centrality, and PageRank, to identify key modules. Scores are calculated with each method and ranked in descending order, and a key module satisfies the following conditions: (A) it ranks first under at least one method; and (B) it ranks in the top three under all methods;
  • (4) For the key modules identified in step (3), select three methods of measuring node importance, for example degree centrality, betweenness centrality, and PageRank, or stress centrality, node strength, and eigenvector centrality, to identify the key nodes (pharmacological drivers) in each key module. Scores are calculated with each method and ranked in descending order, and a key node satisfies the following conditions: (A) it ranks first under at least one method; and (B) it ranks in the top three under all methods.
  • Within a network, modules differ in status; within a module, nodes also differ in status.
  • Important nodes play different roles in a network or module. Hub nodes can be divided into party hubs and date hubs, and removing these two kinds of node affects the network topology in completely different ways.
  • the identification of key modules or nodes is very important to understand the mechanism of action of the entire network. For example, when an infectious disease spreads, finding important nodes can better understand the dynamic process of the spread of the infectious disease, and thus it is possible to use effective methods to prevent serious consequences. Therefore, for complex biomolecular networks, especially diseases and drug-related networks, studying the importance of their modules or nodes is important for controlling or reversing disease development and drug design.
  • For the complex network or module structures of disease and drug intervention, the present invention proposes the innovative concepts of the key module and the key node (pharmacological driver), and provides a method for identifying key modules and key nodes from a biomolecular network. In the identification process, several measures of node importance are combined with the topological characteristics of the network and its modules to identify key modules and pharmacological drivers quantitatively.
  • The method of the present invention pays full attention to network topology. Key modules are determined by multi-angle quantitative analysis of the modules themselves, the module network as a whole, and the relationships between modules; pharmacological drivers are determined by comprehensive multi-method quantitative analysis of local, global, and iterative importance.
  • The study of key modules and pharmacological drivers lays a foundation for exploring the main pharmacological mechanisms behind synergistic effects in drug combinations, and provides a scientific basis for further analysis of relationships between drugs, for guiding drug combination, and for guiding disease treatment and drug development.
  • Figure 1 is a flow chart of the key module and pharmacological driver identification method of the present invention.
  • Figure 2 shows the protein interaction networks of the cerebral ischemia model group and of each group after intervention with the effective components of Qingkailing. The circular structures in Fig. 2a-1, 2b-1, and 2c-1 are the protein interaction networks of the cerebral ischemia model group and of the Qingkailing component baicalin and geniposide groups, respectively. Fig. 2a-2, 2b-2, and 2c-2 show the node degree distributions of the corresponding networks, where the abscissa is the node degree and the ordinate is the number of nodes. The figures show that each network is a scale-free network.
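A degree distribution of the kind plotted in Fig. 2a-2, 2b-2, and 2c-2 is a count of nodes per degree value. The following sketch shows how such counts could be tabulated from an edge list; the function name and the toy edges are hypothetical, and a five-edge example says nothing about whether a real network is scale-free.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {degree: number of nodes with that degree}, sorted by degree.

    edges : iterable of (u, v) pairs, each undirected edge listed once
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Count how many nodes share each degree value.
    return dict(sorted(Counter(degree.values()).items()))


# Hypothetical toy network: one hub with four neighbours and one extra edge.
toy_network = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
print(degree_distribution(toy_network))  # {1: 2, 2: 2, 4: 1}
```

Plotting the keys on the abscissa and the values on the ordinate reproduces the axes described for the figure. For a scale-free network the counts typically fall off roughly as a power law of the degree, which is usually checked on log-log axes.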
  • Figure 3 shows the module identification results for the protein interaction networks of the cerebral ischemia model group and of each group after intervention with the effective components of Qingkailing.
  • Figure 4 shows the weighted module interaction networks of the protein interaction networks of the cerebral ischemia model group and of each group after Qingkailing intervention. Nodes represent modules, edges represent the interactions between modules, and the thickness of an edge represents the number of connecting edges between the modules: the thicker the edge, the more connecting edges.
  • Figure 5 shows the key modules in the protein interaction networks of the cerebral ischemia model group and of each group after Qingkailing intervention; Figures 5a, 5b, and 5c show the key modules of the cerebral ischemia model group, the Qingkailing component baicalin group, and the geniposide group, respectively.
  • Figure 6 shows the change in the network characteristic path length before and after deleting each module from the protein interaction networks of the cerebral ischemia model group and the Qingkailing component groups. The abscissa is the module number and the ordinate is the network characteristic path length after deleting that module; "0" denotes the characteristic path length of the network when no module is deleted.
  • Figure 7 shows the pharmacological driver verification results for the key module M JA-1 of the geniposide group. The figure shows the effect of deleting each node of the module on the module's characteristic path length: the abscissa is the name of the deleted node ("None" means that no node is deleted) and the ordinate is the corresponding characteristic path length of the module. Deleting Il6 has the greatest effect on the module, consistent with the result identified by the method of the present invention.
  • the object of the present invention is to identify key modules and pharmacological drivers from biomolecular networks of disease and drug interventions, thereby providing a basis for guiding disease treatment and drug development.
  • the following examples demonstrate the effectiveness and feasibility of the method of the invention. These embodiments are non-limiting and the methods of the present invention may also be applied to other types of networks.
  • The protein interaction network of the cerebral ischemia model group (Vehicle group) (Fig. 2a-1) consists of 3750 nodes and 9162 edges;
  • The protein interaction network of the baicalin group (BA group) (Fig. 2b-1) is the protein interaction network after the drug baicalin intervenes in the cerebral ischemia model, and consists of 2813 nodes and 6217 edges;
  • The protein interaction network of the geniposide group (JA group) (Fig. 2c-1) is the protein interaction network after the drug geniposide intervenes in the cerebral ischemia model, and consists of 3416 nodes and 7581 edges.
  • Step 1: Perform module identification on each network using the MCODE method.
  • The results are shown in Figure 3.
  • The modules identified from the protein interaction network of the cerebral ischemia model group are shown in Figure 3a, those from the baicalin group in Figure 3b, and those from the geniposide group in Figure 3c.
  • Step 2: Based on the component correlation between modules, construct a weighted module interaction network.
  • The results are shown in Figure 4.
  • The weighted module interaction network of the cerebral ischemia model group is shown in Figure 4a, that of the baicalin group in Figure 4b, and that of the geniposide group in Figure 4c.
  • Step 3: From the various methods of measuring node importance, select three (node strength, betweenness centrality, and PageRank) to identify key modules. A module is determined to be a key module if, when scores are calculated with each method and ranked in descending order, it (A) ranks first under one method and (B) ranks in the top three under the other methods.
  • In each group, the module ranked first by each method ranks in the top three by the other methods; the specific results are shown in Tables 2-4 below.
  • The first column of each table gives the rank of the calculated score under each method.
  • In each row, the number before the parentheses is the module number and the value in parentheses is the module's score under that method. Except for node strength, the results of the other two methods are rounded to three decimal places.
  • The results of each method are listed in descending order of score, and the last column gives the identified key modules.
  • Each module was deleted in turn from the network of each group, and the change in the network characteristic path length before and after deletion was observed. The results are shown in Figure 6.
  • The changes show that deleting a key module has the greatest impact on the network characteristic path length, and that this impact is greater than the average change in characteristic path length after deleting each of the modules, which is consistent with the identification results for each group above.
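The verification described above can be sketched as follows: compute the characteristic path length (mean shortest-path length), delete one module at a time, and compare. This is a minimal sketch under assumptions that the text does not settle: the function names and the toy network are hypothetical, path lengths are averaged over connected node pairs only (so that a deletion which splits the network still yields a finite value), and impact is measured as the absolute change.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected node pairs."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:                       # breadth-first search
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1             # reachable nodes other than source
    return total / pairs if pairs else 0.0


def remove_nodes(adj, removed):
    """Return a copy of the network without the given nodes."""
    removed = set(removed)
    return {v: {w for w in nbrs if w not in removed}
            for v, nbrs in adj.items() if v not in removed}


def deletion_impact(adj, modules):
    """Absolute change in characteristic path length per deleted module."""
    baseline = characteristic_path_length(adj)
    return {mid: abs(characteristic_path_length(remove_nodes(adj, members)) - baseline)
            for mid, members in modules.items()}


# Hypothetical toy network: three triangles chained by two bridging edges.
chain_edges = [("a", "b"), ("b", "c"), ("a", "c"),
               ("d", "e"), ("e", "f"), ("d", "f"),
               ("g", "h"), ("h", "i"), ("g", "i"),
               ("c", "d"), ("f", "g")]
chain_modules = {"M1": {"a", "b", "c"}, "M2": {"d", "e", "f"}, "M3": {"g", "h", "i"}}

network = build_adjacency(chain_edges)
impact = deletion_impact(network, chain_modules)
mean_impact = sum(impact.values()) / len(impact)
print(max(impact, key=impact.get))   # M2, the middle triangle
print(impact["M2"] > mean_impact)    # True
```

In the toy network the middle module M2 is the only one whose deletion disconnects the rest, and it shows the largest change, above the mean over all modules. The same functions can be applied inside a single module by deleting one node at a time.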
  • For the three protein interaction networks of Example 1 (Vehicle group, BA group, JA group), an unweighted module interaction network was constructed in step 2 to identify key modules and pharmacological drivers.
  • The process is as follows.
  • Step 1: Perform module identification on each network using the MCODE method.
  • Step 2: Based on the component correlation between modules, construct an unweighted module interaction network.
  • Step 3: From the various methods of measuring node importance, select three (degree centrality, betweenness centrality, and PageRank) to identify key modules. A module is determined to be a key module if, when scores are calculated with each method and ranked in descending order, it (A) ranks first under one method and (B) ranks in the top three under the other methods.
  • In each group, the module ranked first by each method ranks in the top three by the other methods; the specific results are shown in Tables 5-7 below.
  • The first column of each table gives the rank of the calculated score under each method.
  • In each row, the number before the parentheses is the module number and the value in parentheses is the module's score under that method. Except for the node strength, the results of the other two methods are rounded to three decimal places.
  • The results of each method are listed in descending order of score, and the last column gives the identified key modules.
  • For the key module M JA-1 of the geniposide group protein interaction network identified in Example 1, three centrality measures (degree centrality, betweenness centrality, and PageRank) were used to identify the key nodes in this key module, that is, the pharmacological drivers. A node is determined to be a key node if, when scores are calculated with each method and ranked in descending order, it (A) ranks first under one method and (B) ranks in the top three under the other methods.
  • Table 8: Key node identification results in the key module M JA-1 of the geniposide group
  • Each node was removed from the module in turn, and the change in the module's characteristic path length before and after deletion was observed. The results are shown in Figure 7.
  • Il6 is a biomarker of cerebral ischemia, and Il6 is located upstream of the Graft-versus-host disease and Hematopoietic cell lineage signaling pathways, which is of great significance for diseases and their treatment.
  • The present invention uses three methods of measuring important nodes to identify key modules in protein interaction networks; in the disease model group and in each drug intervention group, the module ranked first by each method ranks in the top three by the other methods. Six methods of measuring important nodes are used to identify the pharmacological drivers, and the results of this multi-method, multi-angle evaluation of important nodes are consistent. These results show that the ideas and methods for identifying key modules and pharmacological drivers are effective and feasible.
  • Although the present invention is exemplified only for the protein interaction networks of the cerebral ischemia model group and the drug intervention groups, using specific importance measures, other types of networks and other measures of node importance can also be used to effectively uncover the key modules in a complex network, or the pharmacological drivers in the modules of a network after drug intervention.
  • The identification of pharmacological drivers is not limited to key modules: drivers can be divided into primary drivers and secondary drivers, which exist in key modules and/or non-key modules. The method can also reveal the interactions among the modules and nodes in the network, providing a scientific basis for guiding disease treatment and drug development.

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Physiology (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Bioethics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

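A method for identifying key modules and key nodes in complex biomolecular networks, such as protein interaction networks, gene expression regulatory networks, biological metabolic networks, epigenetic networks, phenotypic networks, and signaling networks, by combining multiple methods on the basis of network topology. The main steps are as follows: starting from a biomolecular network, optionally after dividing the network into modules, key modules and key nodes of the network or of each module are identified quantitatively and comprehensively from multiple angles, using multiple metrics based on the network topology.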

Description

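Method for identifying key modules or key nodes in a biomolecular network
Technical Field
The present invention belongs to the field of bioinformatics. Specifically, the present invention relates to a method for identifying key modules or key nodes in complex molecular networks of disease and drug intervention, such as protein interaction networks, gene expression regulatory networks, and drug metabolism and drug target networks.
Background Art
With the arrival of the omics era and the development of high-throughput technologies, massive biological datasets have been generated, and with them the now common biomolecular networks, such as protein interaction networks, gene expression regulatory networks, and drug metabolism networks. Effective analysis of these data networks can reveal mechanisms such as gene expression regulation, protein interaction, and metabolite interaction, which can then be applied to disease mechanism research and treatment, drug development, and related fields.
Evidence shows that biomolecular networks have a modular organization. Identifying the module structure helps to reduce the dimensionality of complex networks and to simplify them; it is a key factor in understanding biological systems and offers an opportunity to systematically reveal the complex mechanisms of drug action. Moreover, network data models indicate that, for complex diseases, inhibiting multiple targets is more effective than inhibiting a single target. The treatment of complex diseases therefore requires a modular design that affects multiple targets, which in turn requires module identification in the relevant systems and networks.
However, although a biomolecular network contains multiple modules, they are not of equal status; some are primary and others secondary. In earlier research, the inventors proposed that biomolecular networks contain key modules, that is, modules that occupy a key and dominant position in both structure and function: structurally they have strong centrality and dominance, functionally they are strongly integrative, and they can trigger one or more specific effects.
Grasping the principal contradiction in a complex matter is the key to solving the problem. Researchers have already begun work on identifying key modules, but existing studies often neglect the topology of the module network and the influence of inter-module relationships on key modules, and they lack quantitative analysis of key modules or key nodes.
Therefore, the field still needs new methods capable of identifying key modules or key nodes in complex biomolecular networks.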
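Summary of the Invention
In view of the above technical problem, the present invention provides a new method capable of identifying key modules or key nodes in a biomolecular network. Based on multiple measures of node importance and on the topology of the network or module, the method identifies key modules, or key nodes, from the biomolecular network.
Definitions:
As used herein, the term "biomolecular network" refers to a network that exists in a biological system in various organizational forms and that consists of nodes representing biomolecules and edges representing the interactions between them. Common biomolecular networks include gene transcriptional regulatory networks, metabolic networks, epigenetic networks, phenotypic networks, signal transduction networks, protein-protein interaction networks and the like.
As used herein, the term "key module" refers to a module that occupies a key and dominant position in the network in both structure and function: structurally it is central, functionally it is strongly integrative, deleting it affects the network structure and information transfer more than the average effect of deleting the other modules in the network, and it can trigger one or more specific effects.
As used herein, the term "key node" refers to a node that occupies a key position in the network in both structure and function, for example a node that strongly influences information transfer in the network and the interactions between nodes; deleting it affects the network structure and information transfer more than the average effect of deleting the other nodes in the network. Because of the important role such a node plays in the whole biomolecular network, for example a protein-protein interaction network, it can be regarded as having a driving role in pharmacological terms, and is therefore also referred to herein as a "pharmacological driver".
The specific technical solutions of the present invention are as follows: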
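In one aspect, the present invention provides a method for identifying key modules in a biomolecular network, the method comprising the following steps:
(1) performing module identification on the biomolecular network, using a node count of ≥3 as the criterion; an exemplary module identification method is MCODE.
(2) for the modules identified in step (1), constructing a weighted or unweighted module interaction network based on the component correlation between modules. In the weighted module interaction network, the edges that actually exist between modules in the original network are restored to form the interactions between modules, and the number of such edges is taken as the weight of the relationship between two modules: the more edges connect the nodes of two modules, the greater the weight of the interaction edge between them. The unweighted module interaction network considers only whether edges exist between two modules, and treats all edges between two modules as a single edge.
(3) for the weighted or unweighted module interaction network constructed in step (2), identifying key modules using at least three methods for measuring node importance, wherein a key module satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one of the methods; and (B) it ranks among the top three in all of the methods.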
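The construction in step (2) can be sketched as follows. This is a minimal illustration rather than the inventors' implementation: the function name `build_module_network`, the dictionary-based data layout and the toy data are assumptions made for this example, and modules are taken as already identified (for instance by MCODE).

```python
from collections import defaultdict


def build_module_network(edges, modules, weighted=True, min_size=3):
    """Build a module interaction network (hypothetical helper).

    edges   -- iterable of (u, v) node pairs from the original network
    modules -- dict mapping a module id to the set of its nodes
    Returns a dict mapping frozenset({module_a, module_b}) to a weight.
    """
    # Keep only modules that meet the size criterion (>= 3 nodes).
    kept = {m: set(nodes) for m, nodes in modules.items() if len(nodes) >= min_size}

    # Map every node to the modules containing it (modules may overlap).
    membership = defaultdict(set)
    for module_id, nodes in kept.items():
        for node in nodes:
            membership[node].add(module_id)

    weights = defaultdict(int)
    seen = set()
    for u, v in edges:
        edge = frozenset((u, v))
        if u == v or edge in seen:
            continue  # ignore self-loops and duplicate edges
        seen.add(edge)
        # Each original edge counts once per pair of distinct modules it links.
        pairs = {
            frozenset((mu, mv))
            for mu in membership.get(u, ())
            for mv in membership.get(v, ())
            if mu != mv
        }
        for pair in pairs:
            weights[pair] += 1

    if weighted:
        return dict(weights)
    # Unweighted variant: any number of linking edges is treated as one edge.
    return {pair: 1 for pair in weights}


# Toy data (assumed): two triangles joined by two edges, plus a 2-node group.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"),
         ("d", "f"), ("c", "d"), ("a", "e"), ("f", "x"), ("x", "y")]
modules = {"M1": {"a", "b", "c"}, "M2": {"d", "e", "f"}, "M3": {"x", "y"}}
weighted_net = build_module_network(edges, modules)
unweighted_net = build_module_network(edges, modules, weighted=False)
```

In this toy data M3 is discarded because it has fewer than three nodes, and M1 and M2 are linked by two original edges, so their interaction edge has weight 2 in the weighted network and weight 1 in the unweighted one.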
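Preferably, the biomolecular network includes a gene expression regulatory network, a protein-protein interaction network, a metabolic network, an epigenetic network, a phenotypic network, a signal transduction network and the like, and is preferably a protein-protein interaction network; more preferably, the biomolecular network is a disease-related protein-protein interaction network; still more preferably, the biomolecular network is a disease-related protein-protein interaction network before or after drug intervention. Because the biomolecular network of the present invention may be a disease-related network, optionally combined with drug intervention, the identification method of the present invention reveals the key parts of the data network and thereby provides a scientific basis for research on disease mechanisms and for the corresponding drug development.
In terms of complexity, where the biomolecular network is a protein-protein interaction network, the network preferably consists of ≥100 nodes (proteins) and ≥200 edges (interactions between proteins), more preferably of ≥200 nodes (proteins) and ≥500 edges (interactions between proteins), and in particular of ≥2000 nodes (proteins) and ≥6000 edges (interactions between proteins).
The method for measuring node importance in step (3) of the method of the present invention may be selected from: degree centrality, closeness centrality, eigenvector centrality, betweenness centrality, subgraph centrality, hub centrality, control centrality, node strength, stress centrality, PageRank and the spectrum of the adjacency matrix. See Table 1 below for details.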
Table 1: Methods for measuring node importance
Degree centrality
Closeness centrality
Eigenvector centrality
Betweenness centrality
Subgraph centrality
Hub centrality
Control centrality
Node strength (weighted degree)
Stress centrality
PageRank
Spectrum of the adjacency matrix
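As an illustration of three of the measures in Table 1, the sketch below computes degree centrality (as a raw neighbour count), node strength and PageRank on an undirected weighted graph. It is a simplified example under stated assumptions: the function names, the damping factor of 0.85, the fixed iteration count and the toy edge list are choices made here and are not specified in the source; isolated nodes are not represented.

```python
from collections import defaultdict


def adjacency(weighted_edges):
    """Build {node: {neighbour: weight}} from (u, v, weight) triples."""
    adj = defaultdict(dict)
    for u, v, w in weighted_edges:
        adj[u][v] = w
        adj[v][u] = w
    return adj


def degree(adj):
    """Degree: number of neighbours of each node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def strength(adj):
    """Node strength (weighted degree): sum of incident edge weights."""
    return {node: sum(nbrs.values()) for node, nbrs in adj.items()}


def pagerank(adj, damping=0.85, iterations=100):
    """PageRank by power iteration; each undirected edge acts in both directions."""
    nodes = list(adj)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    out_weight = strength(adj)
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for u in nodes:
            for v, w in adj[u].items():
                # u passes its rank to neighbours in proportion to edge weight.
                new_rank[v] += damping * rank[u] * w / out_weight[u]
        rank = new_rank
    return rank


# Toy weighted network (assumed data, not taken from the examples).
adj = adjacency([("hub", "a", 3), ("hub", "b", 1), ("hub", "c", 2), ("a", "b", 1)])
scores = {"degree": degree(adj), "strength": strength(adj), "pagerank": pagerank(adj)}
```

Each entry of `scores` maps a node to its value under one measure, which is the form of input needed when the measures are later compared by rank.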
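In another aspect, the present invention provides a method for identifying key nodes in a network or module, the method comprising the following step: identifying the key nodes in the network or module using at least three methods for measuring node importance, wherein a key node satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one of the methods; and (B) it ranks among the top three in all of the methods.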
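Conditions (A) and (B) can be expressed compactly in code. The sketch below is illustrative only: the function names and the scores are hypothetical, and the handling of ties (tied items share the better rank) is an assumption, since the source does not say how ties are resolved.

```python
def rank_positions(scores):
    """Return 1-based ranks for {item: score}, highest score first.

    Tied items share the better rank; this tie rule is an assumption.
    """
    ordered = sorted(scores.values(), reverse=True)
    return {item: ordered.index(score) + 1 for item, score in scores.items()}


def select_key_items(score_tables, top_k=3):
    """Apply conditions (A) and (B) to {method: {item: score}}."""
    if len(score_tables) < 3:
        raise ValueError("at least three importance measures are required")
    ranks = {method: rank_positions(table) for method, table in score_tables.items()}
    # Only items scored by every method can be compared.
    items = set.intersection(*(set(table) for table in score_tables.values()))
    key_items = []
    for item in items:
        item_ranks = [ranks[method][item] for method in ranks]
        first_somewhere = min(item_ranks) == 1          # condition (A)
        top_k_everywhere = max(item_ranks) <= top_k     # condition (B)
        if first_somewhere and top_k_everywhere:
            key_items.append(item)
    return sorted(key_items)


# Hypothetical scores for four modules under three measures.
score_tables = {
    "node_strength": {"M1": 12, "M2": 9, "M3": 7, "M4": 2},
    "betweenness": {"M1": 0.40, "M2": 0.10, "M3": 0.05, "M4": 0.55},
    "pagerank": {"M1": 0.35, "M2": 0.30, "M3": 0.20, "M4": 0.15},
}
key_modules = select_key_items(score_tables)  # ["M1"]
```

In this made-up data M4 ranks first for betweenness but fourth for the other two measures, so it fails condition (B); only M1 satisfies both conditions. The same function applies unchanged to nodes within a module.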
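Key nodes can be identified from any network, for example a biomolecular network, or from the modules obtained in steps (1) and (2) of the method described above.
Accordingly, when key nodes are identified from a network, the network is preferably a biomolecular network, including a gene expression regulatory network, a protein-protein interaction network, a metabolic network, an epigenetic network, a phenotypic network, a signal transduction network and the like, and is preferably a protein-protein interaction network; more preferably, the biomolecular network is a disease-related protein-protein interaction network; still more preferably, the biomolecular network is a disease-related protein-protein interaction network before or after drug intervention.
In terms of complexity, where the biomolecular network is a protein-protein interaction network, the network preferably consists of ≥100 nodes (proteins) and ≥200 edges (interactions between proteins), more preferably of ≥200 nodes (proteins) and ≥500 edges (interactions between proteins), and in particular of ≥2000 nodes (proteins) and ≥6000 edges (interactions between proteins).
When key nodes are identified from a module, the module is preferably a module obtained by performing module identification on a biomolecular network, using a node count of ≥3 as the criterion;
alternatively, the module is a key module obtained by the following steps:
(1) performing module identification on the biomolecular network, using a node count of ≥3 as the criterion;
(2) for the modules identified in step (1), constructing a weighted or unweighted module interaction network based on the component correlation between modules. In the weighted module interaction network, the edges that actually exist between modules in the original network are restored to form the interactions between modules, and the number of such edges is taken as the weight of the relationship between two modules: the more edges connect the nodes of two modules, the greater the weight of the interaction edge between them. The unweighted module interaction network considers only whether edges exist between two modules, and treats all edges between two modules as a single edge.
(3) for the weighted or unweighted module interaction network constructed in step (2), identifying key modules using at least three methods for measuring node importance, wherein a key module satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one of the methods; and (B) it ranks among the top three in all of the methods.
Further, the method for identifying key nodes comprises the following steps:
(1) performing module identification on the biomolecular network, using a node count of ≥3 as the criterion;
(2) for the modules identified in step (1), constructing a weighted or unweighted module interaction network based on the component correlation between modules. In the weighted module interaction network, the edges that actually exist between modules in the original network are restored to form the interactions between modules, and the number of such edges is taken as the weight of the relationship between two modules: the more edges connect the nodes of two modules, the greater the weight of the interaction edge between them. The unweighted module interaction network considers only whether edges exist between two modules, and treats all edges between two modules as a single edge.
(3) for the weighted or unweighted module interaction network constructed in step (2), identifying key modules using at least three methods for measuring node importance, wherein a key module satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one of the methods; and (B) it ranks among the top three in all of the methods;
(4) for the key modules identified in step (3), identifying the key nodes in the key modules using at least three methods for measuring node importance, wherein a key node satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one of the methods; and (B) it ranks among the top three in all of the methods.
The methods for measuring node importance in steps (3) and (4) are selected from: degree centrality, closeness centrality, eigenvector centrality, betweenness centrality, subgraph centrality, hub centrality, control centrality, node strength, stress centrality, PageRank and the spectrum of the adjacency matrix. See also Table 1.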
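The identification method provided by the present invention is shown in Figure 1.
According to a specific embodiment of the present invention, the biomolecular network is a protein-protein interaction network of cerebral ischemia, in particular the protein-protein interaction networks derived from the cerebral ischemia model group, the baicalin group and the jasminoidin group. The protein-protein interaction network of the cerebral ischemia model group consists of 3750 nodes and 9162 edges, that of the baicalin group of 2813 nodes and 6217 edges, and that of the jasminoidin group of 3416 nodes and 7581 edges (see Zhang Yingying, Identification and comparison of the main modules of protein networks in a cerebral ischemia model under multi-component Qingkailing intervention [D], China Academy of Chinese Medical Sciences, 2014).
By way of example, key modules and key nodes are identified from each of the above protein-protein interaction networks as follows:
(1) performing module identification by the MCODE method, using a node count of ≥3 as the criterion;
(2) for the modules identified in step (1), constructing a weighted module interaction network or an unweighted module interaction network based on the component correlation between modules;
(3) for the weighted module interaction network constructed in step (2), identifying key modules using, for example, the three measures node strength, betweenness centrality and PageRank chosen from the various methods for measuring node importance, wherein a key module satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one method; and (B) it ranks among the top three in all methods; or
for the unweighted module interaction network constructed in step (2), identifying key modules using, for example, the three measures degree centrality, betweenness centrality and PageRank chosen from the various methods for measuring node importance, wherein a key module satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one method; and (B) it ranks among the top three in all methods;
(4) for the key modules identified in step (3), identifying the key nodes (pharmacological drivers) in the key modules using, for example, the three measures degree centrality, betweenness centrality and PageRank, or the three measures stress centrality, node strength (weighted degree) and eigenvector centrality, chosen from the various methods for measuring node importance, wherein a key node satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one method; and (B) it ranks among the top three in all methods.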
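Betweenness centrality is named in both step (3) and step (4) above. One common way to compute it is Brandes' algorithm, sketched below for an undirected, unweighted graph. The source does not state which algorithm or software was used, so this is an illustrative choice; the function name and the toy graph are assumptions, and the values are not normalised.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised betweenness for {node: set_of_neighbours} (undirected).

    Every neighbour must itself appear as a key of adj.
    """
    score = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        preds = {v: [] for v in adj}
        n_paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate dependencies, starting from the farthest nodes.
        dependency = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair was counted from both of its endpoints.
    return {v: value / 2.0 for v, value in score.items()}


# Toy path graph a - b - c - d (assumed data).
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
path_betweenness = betweenness(path)
```

On the four-node path, the two inner nodes each lie on two shortest paths between other node pairs and the end nodes lie on none, so the inner nodes score 2.0 and the end nodes 0.0.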
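Both complex networks and modules, which are sub-networks, consist of nodes and the relationships (edges) between nodes. Within a network, modules differ in standing; within a module, nodes also differ in standing. Indeed, important nodes serve different functions in a network or module: hub nodes can be divided into party hubs and date hubs, and removing these two kinds of node has completely different effects on network topology. Identifying key modules or nodes is very important for understanding the mechanism of action of the whole network. For example, when an infectious disease is spreading, finding the important nodes gives a better understanding of the dynamics of transmission, which may make it possible to take effective measures to prevent serious consequences. For complex biomolecular networks, and especially for disease-related and drug-related networks, studying the importance of modules or nodes is therefore of great significance for controlling or reversing the progression of disease and for drug design.
In this regard, for the complex network or module structures of diseases and drug interventions, the present invention puts forward the innovative concept of identifying key modules and key nodes, that is, pharmacological drivers. For this concept it provides a specific method for identifying key modules and key nodes from a biomolecular network, in which multiple measures of node importance are combined with the topological characteristics of the network and its modules to identify key modules and pharmacological drivers quantitatively. In particular, because structure determines function, the method of the present invention gives due weight to network topology throughout, whether for the molecular network and the module network or for module partitioning and the identification of key modules and drivers. Key modules are determined by quantitative analysis from several perspectives, including the modules themselves, the module network as a whole and the relationships between modules; pharmacological drivers are determined by comprehensive quantitative analysis with multiple methods covering local, global and iterative importance. The study of key modules and pharmacological drivers lays a foundation for exploring the main pharmacological mechanisms by which drug combinations produce synergistic effects, for further analysing the relationships between drugs and for guiding combination therapy and disease treatment, and provides a scientific basis for guiding disease treatment and drug development.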
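Brief Description of the Drawings
Embodiments of the present invention are described in detail below with reference to the accompanying drawings, in which:
Figure 1 shows a flow chart of the method of the present invention for identifying key modules and pharmacological drivers.
Figure 2 shows the protein-protein interaction networks of the cerebral ischemia model group and of each group after intervention with the active components of Qingkailing. The circular structures in Figures 2a-1, 2b-1 and 2c-1 are the protein-protein interaction networks of the cerebral ischemia model and of the Qingkailing components baicalin and jasminoidin, respectively; Figures 2a-2, 2b-2 and 2c-2 show the degree distributions of the nodes in the corresponding networks, where the horizontal axis is node degree and the vertical axis is the number of nodes. The figures show that every network is a scale-free network.
Figure 3 shows the module identification results for the protein-protein interaction networks of the cerebral ischemia model group and of each group after intervention with the active components of Qingkailing.
Figure 4 shows the weighted module interaction networks of the protein-protein interaction networks of the cerebral ischemia model group and of each group after intervention with the components of Qingkailing. Nodes represent modules and edges represent interactions between modules; edge thickness indicates the number of edges connecting two modules, a thicker edge indicating more connecting edges.
Figure 5 shows the key modules in the protein-protein interaction networks of the cerebral ischemia model group and of each group after intervention with the components of Qingkailing. Figures 5a, 5b and 5c show the key modules of the cerebral ischemia model group, the baicalin group and the jasminoidin group, respectively.
Figure 6 shows the change in the characteristic path length of the network before and after each module is deleted from the protein-protein interaction networks of the cerebral ischemia model and of the groups after intervention with the components of Qingkailing. The horizontal axis is the module number and the vertical axis is the characteristic path length of the network after that module is deleted; "0" denotes the characteristic path length of the network when no module is deleted.
Figure 7 shows the validation result for the pharmacological driver of key module MJA-1 of the jasminoidin group. The figure shows the effect on the characteristic path length of the module of deleting each node in turn; the horizontal axis gives the name of the deleted node ("None" meaning that no node was deleted), and the vertical axis gives the characteristic path length of the module after that node is deleted. Deleting Il6 has the greatest effect on the module, which is consistent with the result identified by the method of the present invention.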
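The degree distributions described for Figure 2 (node degree on the horizontal axis, number of nodes on the vertical axis) can be tabulated from an edge list as sketched below. The function name and the toy data are assumptions for illustration. The sketch only counts nodes per degree; it does not test whether the distribution follows a power law, which is what a scale-free network would show as many low-degree nodes and a few high-degree nodes.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {degree: number_of_nodes} for an undirected edge list."""
    node_degree = Counter()
    seen = set()
    for u, v in edges:
        edge = frozenset((u, v))
        if u == v or edge in seen:
            continue  # ignore self-loops and duplicate edges
        seen.add(edge)
        node_degree[u] += 1
        node_degree[v] += 1
    # Count how many nodes have each degree, ordered by degree.
    return dict(sorted(Counter(node_degree.values()).items()))


# Toy network (assumed data): one hub with four neighbours and one extra edge.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
distribution = degree_distribution(toy_edges)  # {1: 2, 2: 2, 4: 1}
```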
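Best Mode for Carrying Out the Invention
The present invention is described below with reference to specific examples. Those skilled in the art will understand that these examples serve only to illustrate the present invention and do not limit its scope in any way.
Unless otherwise specified, the experimental methods in the following examples are conventional methods. Unless otherwise specified, the medicinal raw materials, reagents and other materials used in the following examples are commercially available products.
The object of the present invention is to identify key modules and pharmacological drivers from biomolecular networks of diseases and drug interventions, thereby providing a basis for guiding disease treatment and drug development. The following examples demonstrate the effectiveness and feasibility of the method of the present invention. The examples are non-limiting, and the method of the present invention can also be applied to other types of network.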
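Example 1: Identification and validation of key modules
The protein-protein interaction networks of a mouse model of cerebral ischemia treated with the individual components of Qingkailing were used (Figure 2; see Zhang Yingying, Identification and comparison of the main modules of protein networks in a cerebral ischemia model under multi-component Qingkailing intervention [D], China Academy of Chinese Medical Sciences, 2014):
the protein-protein interaction network of the cerebral ischemia model group (Vehicle group) (Figure 2a-1), consisting of 3750 nodes and 9162 edges;
the protein-protein interaction network of the baicalin group (BA group) (Figure 2b-1), which is the protein-protein interaction network after intervention in the cerebral ischemia model with the drug baicalin, consisting of 2813 nodes and 6217 edges;
the protein-protein interaction network of the jasminoidin group (JA group) (Figure 2c-1), which is the protein-protein interaction network after intervention in the cerebral ischemia model with the drug jasminoidin, consisting of 3416 nodes and 7581 edges.
Key modules and pharmacological drivers were identified as follows. Step 1: module identification was performed on each network by the MCODE method. The results are shown in Figure 3: the modules identified from the protein-protein interaction network of the cerebral ischemia model group are shown in Figure 3a, those of the baicalin group in Figure 3b, and those of the jasminoidin group in Figure 3c.
Step 2: weighted module interaction networks were constructed based on the component correlation between modules. The results are shown in Figure 4: the weighted module interaction network of the cerebral ischemia model group is shown in Figure 4a, that of the baicalin group in Figure 4b, and that of the jasminoidin group in Figure 4c.
Step 3: the three measures node strength, betweenness centrality and PageRank were chosen from the various methods for measuring node importance to identify key modules. A module was determined to be a key module only if, when the values calculated by each method were sorted in descending order, (A) it ranked first in one method and (B) it ranked among the top three in the other methods.
It was found that, in every group, the module ranked first by each method was within the top three of the other methods; see Tables 2 to 4 below. In each table, the first column gives the rank of the calculated value under each method; in each row, the number before the parentheses is the module number and the value in parentheses is that module's score under the method concerned. Except for node strength, the results of the other two methods are rounded to three decimal places. The results for each method are listed in descending order of score, and the last column marks the key module identified.
Table 2: Key module identification results in the weighted module network of the model group
Figure PCTCN2017071209-appb-000001
Table 3: Key module identification results in the weighted module network of the baicalin group
Figure PCTCN2017071209-appb-000002
Figure PCTCN2017071209-appb-000003
Table 4: Key module identification results in the weighted module network of the jasminoidin group
Figure PCTCN2017071209-appb-000004
The key modules identified in each group are shown in Figures 5a, 5b and 5c.
To verify the importance to the network of the key modules identified from the cerebral ischemia model group, the baicalin group and the jasminoidin group, each module was deleted in turn from the network of its group and the change in the characteristic path length of the network before and after deletion was observed. The results are shown in Figure 6.
As Figure 6 shows, comparing the characteristic path length of the network before and after each module is deleted, deleting the key module of each group has the greatest effect on the characteristic path length of the network, and this effect exceeds the average change in characteristic path length caused by deleting the individual modules, which is consistent with the identification results above for each group.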
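The deletion check described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce Figure 6: the function names and the toy graph are invented for the example, the characteristic path length is taken as the mean shortest-path length over reachable node pairs (unreachable pairs are ignored), and the graph is undirected and unweighted.

```python
from collections import deque


def characteristic_path_length(adj):
    """Mean shortest-path length over reachable pairs of {node: set_of_neighbours}."""
    total = 0
    pairs = 0
    for source in adj:
        # Breadth-first search gives shortest distances from the source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def remove_nodes(adj, removed):
    """Return a copy of the graph without the given nodes and their edges."""
    removed = set(removed)
    return {v: {w for w in nbrs if w not in removed}
            for v, nbrs in adj.items() if v not in removed}


def deletion_effects(adj, groups):
    """Change in characteristic path length after deleting each node group."""
    baseline = characteristic_path_length(adj)
    return {name: characteristic_path_length(remove_nodes(adj, nodes)) - baseline
            for name, nodes in groups.items()}


# Toy wheel graph (assumed data): hub "h" joined to a six-node ring.
ring = ["a", "b", "c", "d", "e", "f"]
wheel = {"h": set(ring)}
for i, node in enumerate(ring):
    wheel[node] = {"h", ring[i - 1], ring[(i + 1) % len(ring)]}

effects = deletion_effects(wheel, {"hub": {"h"}, "rim_a": {"a"}})
most_disruptive = max(effects, key=lambda name: abs(effects[name]))  # "hub"
```

Each group passed to `deletion_effects` may be a whole module or a single node, so the same check covers both module-level and node-level validation. In the toy wheel, deleting the hub lengthens paths far more than deleting a ring node.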
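Example 2: Identification and validation of key modules
For the three protein-protein interaction networks of Example 1 (Vehicle group, BA group and JA group), unweighted module interaction networks were constructed in step 2, and key modules and pharmacological drivers were identified.
The procedure was as follows.
Step 1: module identification was performed on each network by the MCODE method.
Step 2: unweighted module interaction networks were constructed based on the component correlation between modules.
Step 3: the three measures degree centrality, betweenness centrality and PageRank were chosen from the various methods for measuring node importance to identify key modules. A module was determined to be a key module only if, when the values calculated by each method were sorted in descending order, (A) it ranked first in one method and (B) it ranked among the top three in the other methods.
It was found that, in every group, the module ranked first by each method was within the top three of the other methods; see Tables 5 to 7 below. In each table, the first column gives the rank of the calculated value under each method; in each row, the number before the parentheses is the module number and the value in parentheses is that module's score under the method concerned. Except for degree centrality, the results of the other two methods are rounded to three decimal places. The results for each method are listed in descending order of score, and the last column marks the key module identified.
Table 5: Key module identification results in the unweighted module network of the model group
Figure PCTCN2017071209-appb-000005
Figure PCTCN2017071209-appb-000006
Table 6: Key module identification results in the unweighted module network of the baicalin group
Figure PCTCN2017071209-appb-000007
Table 7: Key module identification results in the unweighted module network of the jasminoidin group
Figure PCTCN2017071209-appb-000008
Figure PCTCN2017071209-appb-000009
The results show that the key modules identified for each group are the same as those identified in Example 1.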
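Example 3: Identification and validation of key nodes (pharmacological drivers)
For the key module MJA-1 of the jasminoidin group protein-protein interaction network identified in Example 1, the three measures degree centrality, betweenness centrality and PageRank were used to identify the key node, that is, the pharmacological driver, in this key module. A node was determined to be a key node only if, when the values calculated by each method were sorted in descending order, (A) it ranked first in one method and (B) it ranked among the top three in the other methods.
For key module MJA-1 of the jasminoidin group protein-protein interaction network, the top-ranked result of all three methods was the same, namely Il6; see Table 8.
Table 8: Key node identification results in key module MJA-1 of the jasminoidin group
Figure PCTCN2017071209-appb-000010
Figure PCTCN2017071209-appb-000011
A different combination of three methods was also used to screen for key nodes. The top-ranked result of these three methods was likewise Il6 (see Table 9), indicating that the same result is obtained whichever three methods are used.
Table 9: Key node identification results in key module MJA-1 of the jasminoidin group
Figure PCTCN2017071209-appb-000012
Figure PCTCN2017071209-appb-000013
To verify the importance of the identified key node to the network module, each node of the module was deleted from the module in turn and the change in the characteristic path length of the module before and after deletion was observed. The results are shown in Figure 7.
As the figure shows, comparing the characteristic path length of the module before and after each node is deleted, deleting the key node Il6 has the greatest effect on the characteristic path length of the module, and this effect exceeds the average change in characteristic path length caused by deleting the individual nodes, which is consistent with the identification result above for this module.
A review of the literature shows that Il6 is a biomarker of cerebral ischemia and that Il6 lies upstream in the Graft-versus-host disease and Hematopoietic cell lineage signalling pathways, which makes it important for the disease and its treatment.
As the above examples show, the present invention used three measures of node importance to identify key modules in protein-protein interaction networks, and in the disease model group and each drug intervention group the module ranked first by each method was within the top three of the other methods. Six measures of node importance were used to identify the pharmacological driver, and the evaluation of important nodes by multiple methods and from multiple perspectives gave consistent results. These results show that the approach and method for identifying key modules and pharmacological drivers are effective and feasible.
Although the present invention has been exemplified only with the protein-protein interaction networks of the cerebral ischemia model group and the drug intervention groups, and with particular importance measures, key modules in complex networks, or pharmacological drivers in the modules of networks after drug intervention, can still be mined effectively for other types of network and with other measures of node importance. In particular, the identification of pharmacological drivers is not limited to key modules: drivers can be divided into primary drivers and secondary drivers and may be present in key modules and/or non-key modules, and they can likewise reveal the interactions between the modules and nodes of the network and provide a scientific basis for guiding disease treatment and drug development.
The above description of specific embodiments does not limit the present invention. Those skilled in the art may make various changes or modifications in accordance with the present invention, and all of these fall within the scope of the appended claims provided that they do not depart from the spirit of the invention.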

Claims (10)

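  1. A method for identifying key modules in a biomolecular network, the method comprising the following steps:
    (1) performing module identification on the biomolecular network, using a node count of ≥3 as the criterion;
    (2) for the modules identified in step (1), constructing a weighted or unweighted module interaction network based on the component correlation between modules;
    (3) for the weighted or unweighted module interaction network constructed in step (2), identifying key modules using at least three methods for measuring node importance, wherein a key module satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one of the methods; and (B) it ranks among the top three in all of the methods.
  2. The method according to claim 1, characterized in that the biomolecular network includes a gene expression regulatory network, a protein-protein interaction network, a metabolic network, an epigenetic network, a phenotypic network or a signal transduction network, and is preferably a protein-protein interaction network;
    preferably, the biomolecular network is a disease-related protein-protein interaction network;
    preferably, the biomolecular network is a disease-related protein-protein interaction network before or after drug intervention.
  3. The method according to claim 1 or 2, characterized in that the protein-protein interaction network consists of ≥100 nodes and ≥200 edges;
    preferably, the protein-protein interaction network consists of ≥200 nodes and ≥500 edges;
    more preferably, the protein-protein interaction network consists of ≥2000 nodes and ≥6000 edges.
  4. The method according to any one of claims 1 to 3, characterized in that the method for measuring node importance in step (3) is selected from: degree centrality, closeness centrality, eigenvector centrality, betweenness centrality, subgraph centrality, hub centrality, control centrality, node strength, stress centrality, PageRank and the spectrum of the adjacency matrix.
  5. A method for identifying key nodes in a network or module, the method comprising the following step: identifying the key nodes in the network or module using at least three methods for measuring node importance, wherein a key node satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one of the methods; and (B) it ranks among the top three in all of the methods.
  6. The method according to claim 5, characterized in that the network is a biomolecular network, including a gene expression regulatory network, a protein-protein interaction network, a metabolic network, an epigenetic network, a phenotypic network or a signal transduction network, and is preferably a protein-protein interaction network;
    preferably, the biomolecular network is a disease-related protein-protein interaction network;
    preferably, the biomolecular network is a disease-related protein-protein interaction network before or after drug intervention.
  7. The method according to claim 5 or 6, characterized in that the protein-protein interaction network consists of ≥100 nodes and ≥200 edges;
    preferably, the protein-protein interaction network consists of ≥200 nodes and ≥500 edges;
    more preferably, the protein-protein interaction network consists of ≥2000 nodes and ≥6000 edges.
  8. The method according to any one of claims 5 to 7, characterized in that the module is a module obtained by performing module identification on a biomolecular network, using a node count of ≥3 as the criterion;
    alternatively, the module is a key module obtained by the following steps:
    (1) performing module identification on the biomolecular network, using a node count of ≥3 as the criterion;
    (2) for the modules identified in step (1), constructing a weighted or unweighted module interaction network based on the component correlation between modules;
    (3) for the weighted or unweighted module interaction network constructed in step (2), identifying key modules using at least three methods for measuring node importance, wherein a key module satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one of the methods; and (B) it ranks among the top three in all of the methods.
  9. The method according to any one of claims 5 to 8, characterized in that the method comprises the following steps:
    (1) performing module identification on the biomolecular network, using a node count of ≥3 as the criterion;
    (2) for the modules identified in step (1), constructing a weighted or unweighted module interaction network based on the component correlation between modules;
    (3) for the weighted or unweighted module interaction network constructed in step (2), identifying key modules using at least three methods for measuring node importance, wherein a key module satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one of the methods; and (B) it ranks among the top three in all of the methods;
    (4) for the key modules identified in step (3), identifying the key nodes in the key modules using at least three methods for measuring node importance, wherein a key node satisfies the following conditions when the values calculated by each method are sorted in descending order: (A) it ranks first in at least one of the methods; and (B) it ranks among the top three in all of the methods.
  10. The method according to any one of claims 5 to 9, characterized in that the methods for measuring node importance in steps (3) and (4) are selected from: degree centrality, closeness centrality, eigenvector centrality, betweenness centrality, subgraph centrality, hub centrality, control centrality, node strength, stress centrality, PageRank and the spectrum of the adjacency matrix.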
PCT/CN2017/071209 2016-04-27 2017-01-16 Method for identifying key module or key node in biomolecular network WO2017185834A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/096,276 US20190139621A1 (en) 2016-04-27 2017-01-16 Method for identifying key module or key node in biomolecular network

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610269466.0 2016-04-27
CN201610269466.0A CN105956413B (zh) 2016-04-27 2016-04-27 Method for identifying key module or key node in biomolecular network

Publications (1)

Publication Number Publication Date
WO2017185834A1 true WO2017185834A1 (zh) 2017-11-02

Family

ID=56916857

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/071209 WO2017185834A1 (zh) 2016-04-27 2017-01-16 Method for identifying key module or key node in biomolecular network

Country Status (3)

Country Link
US (1) US20190139621A1 (zh)
CN (1) CN105956413B (zh)
WO (1) WO2017185834A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
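CN110504004A (zh) * 2019-06-28 2019-11-26 Xi'an University of Technology Method for identifying genes based on structural controllability of complex networks
CN112071362A (zh) * 2020-08-03 2020-12-11 Xi'an University of Technology Method for detecting protein complexes by fusing global and local topological structures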

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
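CN105956413B (zh) * 2016-04-27 2019-08-06 Wang Zhong Method for identifying key module or key node in biomolecular network
CN106709231B (zh) * 2016-10-19 2019-03-26 Wang Zhong Method for evaluating the influence of drugs on inter-module relationships in a biomolecular network
CN108804871B (zh) * 2017-05-02 2021-06-25 Central South University Key protein identification method based on maximum neighbor subnet
CN107292126B (zh) * 2017-05-04 2019-12-24 Zhejiang University Method for quantitatively evaluating the integrative regulatory effect of traditional Chinese medicine on the "imbalance" network caused by complex diseases
CN107423555B (zh) * 2017-06-09 2020-06-30 Wang Zhong Method for exploring new indications of drugs
WO2020150228A1 (en) * 2019-01-15 2020-07-23 Youngblood Ip Holdings, Llc Health data exchange platform
CN109920473B (zh) * 2019-04-02 2021-02-12 Institute of Urban Environment, Chinese Academy of Sciences General method for weight analysis of metabolomics markers
CN110706740B (zh) * 2019-09-29 2022-03-22 Changsha University of Science and Technology Method, apparatus and device for protein function prediction based on module decomposition
CN110957002B (zh) * 2019-12-17 2023-04-28 University of Electronic Science and Technology of China Method for predicting drug-target interactions based on collaborative matrix factorization
CN111370060A (zh) * 2020-03-21 2020-07-03 Guangxi University System and method for identifying co-localized, co-expressed complexes in protein interaction networks
CN111667881B (zh) * 2020-06-04 2023-06-06 Dalian Minzu University Protein function prediction method based on multi-network topology
CN111784206B (zh) * 2020-07-29 2021-03-19 Nanchang Hangkong University Method for evaluating key nodes of a social network using the LeaderRank algorithm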

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070136001A1 (en) * 2005-12-08 2007-06-14 Electronics And Telecommunications Research Institute Method and apparatus for detecting bio-complexes using rule-based templates
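CN105117617A (zh) * 2015-08-26 2015-12-02 Dalian Maritime University Method for screening environmentally sensitive biomolecules
CN105279397A (zh) * 2015-10-26 2016-01-27 East China Jiaotong University Method for identifying key proteins in a protein interaction network
CN105956413A (zh) * 2016-04-27 2016-09-21 Wang Zhong Method for identifying key module or key node in biomolecular network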

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
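KR101252649B1 (ko) * 2011-02-01 2013-04-09 Chungbuk National University Industry-Academic Cooperation Foundation Method for discovering cancer-related protein domains
CN102841985B (zh) * 2012-08-09 2015-04-08 Central South University Key protein identification method based on domain features
CN103559426A (zh) * 2013-11-06 2014-02-05 Beijing University of Technology Protein functional module mining method for multi-view data fusion
CN103778349B (zh) * 2014-01-29 2017-02-15 思博奥科生物信息科技(北京)有限公司 Method for biomolecular network analysis based on functional modules
CN105160206A (zh) * 2015-10-08 2015-12-16 Academy of Mathematics and Systems Science, Chinese Academy of Sciences Method and system for predicting protein interaction targets of drugs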

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG, YINGYING: "Identification and comparison of the main modules of protein networks in a cerebral ischemia model under multi-component Qingkailing intervention", CHINA DOCTORAL DISSERTATIONS FULL-TEXT DATABASE (MEDICAL CARE SCIENCE AND TECHNOLOGY), 15 July 2015 (2015-07-15), ISSN: 1674-022X *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
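CN110504004A (zh) * 2019-06-28 2019-11-26 Xi'an University of Technology Method for identifying genes based on structural controllability of complex networks
CN110504004B (zh) * 2019-06-28 2022-02-22 Xi'an University of Technology Method for identifying genes based on structural controllability of complex networks
CN112071362A (zh) * 2020-08-03 2020-12-11 Xi'an University of Technology Method for detecting protein complexes by fusing global and local topological structures
CN112071362B (zh) * 2020-08-03 2024-04-09 Xi'an University of Technology Method for detecting protein complexes by fusing global and local topological structures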

Also Published As

Publication number Publication date
CN105956413B (zh) 2019-08-06
CN105956413A (zh) 2016-09-21
US20190139621A1 (en) 2019-05-09

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17788496

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17788496

Country of ref document: EP

Kind code of ref document: A1