CN106909805A - Method for reconstructing a species phylogenetic tree based on alignment of multiple metabolic pathways - Google Patents

Method for reconstructing a species phylogenetic tree based on alignment of multiple metabolic pathways

Info

Publication number
CN106909805A
Authority
CN
China
Prior art keywords
node
metabolic pathway
similarity
species
conjunction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710116712.3A
Other languages
Chinese (zh)
Other versions
CN106909805B (en)
Inventor
黄毅然
钟诚
林海翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi University
Original Assignee
Guangxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi University filed Critical Guangxi University
Priority to CN201710116712.3A priority Critical patent/CN106909805B/en
Publication of CN106909805A publication Critical patent/CN106909805A/en
Application granted granted Critical
Publication of CN106909805B publication Critical patent/CN106909805B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B10/00 ICT specially adapted for evolutionary bioinformatics, e.g. phylogenetic tree construction or analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Physiology (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Animal Behavior & Ethology (AREA)
  • Molecular Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for reconstructing a species phylogenetic tree based on the alignment of multiple metabolic pathways. A merged graph of the metabolic pathways is built through global alignment of the pathways. Clustering the nodes of the merged graph then yields a mapping between the functional modules of the individual pathways, and this mapping is used to analyze the relationships between the pathways and to build a phylogenetic tree of the species. The beneficial effect of the invention is that the method simplifies metabolic pathway alignment: the researcher needs only a few simple operations to obtain a phylogenetic tree of the species quickly and accurately.

Description

Method for reconstructing a species phylogenetic tree based on alignment of multiple metabolic pathways
Technical field
The invention relates to a method for generating a species phylogenetic tree, specifically a method that generates the tree from the alignment of multiple metabolic pathways.
Background
Phylogenetic analysis is an important field of systems biology research. Current methods that build phylogenetic trees from metabolic data mainly analyze the relationships between metabolic pathways through mappings between pathway nodes, and then use those relationships for the phylogenetic analysis of species. However, the information carried by node-to-node mappings is limited, and with node mappings alone it is difficult to explore the correlations between metabolic pathways in depth.
Summary of the invention
The object of the invention is to provide a method for building a phylogenetic tree of species based on the alignment of multiple metabolic pathways.
The technical solution by which the invention solves the above technical problem is as follows:
The method for reconstructing a species phylogenetic tree based on the alignment of multiple metabolic pathways comprises the following steps:
1) Building the merged graph of multiple metabolic pathways:
1.1) Computing node similarity:
A metabolic pathway P is represented by a directed graph $G_p = (V_p, E_p)$, where $V_p$ is the vertex set of $G_p$ and $E_p$ is its set of directed edges. Vertices $u_i$ and $u_j$ of $G_p$ represent reactions $r_i$ and $r_j$ in P. If an output compound of $r_i$ is an input compound of $r_j$, there is a directed edge from $u_i$ to $u_j$ (that is, from $r_i$ to $r_j$); if $r_i$ and $r_j$ are both reversible, there is also a directed edge from $u_j$ to $u_i$.
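The edge rule above can be illustrated with a minimal Python sketch. The function name `build_reaction_graph`, the tuple layout `(inputs, outputs, reversible)`, and the toy reactions are assumptions made for illustration; the patent does not prescribe a data format.

```python
def build_reaction_graph(reactions):
    """Return the directed edge set of a reaction graph.

    reactions: dict mapping a reaction name to (inputs, outputs, reversible).
    This layout is a hypothetical choice for the sketch.
    """
    edges = set()
    for ri, (_, out_i, rev_i) in reactions.items():
        for rj, (in_j, _, rev_j) in reactions.items():
            if ri == rj:
                continue
            # An output compound of r_i is an input compound of r_j.
            if set(out_i) & set(in_j):
                edges.add((ri, rj))
                # Both reactions reversible: add the reverse edge as well.
                if rev_i and rev_j:
                    edges.add((rj, ri))
    return edges


# Toy pathway A -> B -> C -> D; only r1 and r2 are reversible.
reactions = {
    "r1": (["A"], ["B"], True),
    "r2": (["B"], ["C"], True),
    "r3": (["C"], ["D"], False),
}
edges = build_reaction_graph(reactions)
```

In this toy pathway the reverse edge appears only between `r1` and `r2`, because `r3` is not reversible.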
Let k be a positive integer. For any node u of $G_p$, the k-neighborhood $N_k(u)$ is the set of nodes of $V_p$ whose shortest distance from u is exactly k; u itself does not belong to $N_k(u)$. The shortest distance from u to x is the number of edges on a shortest path from u to x. The k-neighborhood $N_k(v)$ of any node v of $G_p'$ is defined in the same way.
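A k-neighborhood of this kind can be computed with a breadth-first search. The sketch below assumes the graph is given as an adjacency dictionary and that distances follow edge direction; both are assumptions, since the text does not say whether direction is respected. The name `k_neighborhood` is hypothetical.

```python
from collections import deque


def k_neighborhood(adj, u, k):
    """Return the set of nodes at shortest-path distance exactly k from u."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == k:
            continue  # no need to explore beyond distance k
        for y in adj.get(x, ()):
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return {x for x, d in dist.items() if d == k}


# Toy graph: a -> b, a -> c, b -> d, c -> d, d -> e
adj = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
n1 = k_neighborhood(adj, "a", 1)
n2 = k_neighborhood(adj, "a", 2)
n3 = k_neighborhood(adj, "a", 3)
```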
For nodes $u \in V_p$ and $v \in V_p'$, the k-neighbor subgraph of u, written $G_u^k$, is the subgraph of $G_p$ induced by $N_k(u) \cup \{u\}$, and the k-neighbor subgraph of v, written $G_v^k$, is the subgraph of $G_p'$ induced by $N_k(v) \cup \{v\}$. Let d(u) and d(v) be the degrees of u and v in $G_p$ and $G_p'$ respectively. The degrees $d(x_i)$ of the nodes $x_i \in N_k(u)$, arranged in non-increasing order, form the degree sequence of the k-neighbors of u; the degrees $d(y_i)$ of the nodes $y_i \in N_k(v)$ form the degree sequence of the k-neighbors of v in the same way. The topological similarity T(u, v) of nodes u and v is defined as:
$$T(u,v) = \frac{\min\{|V(G_u^k)|, |V(G_v^k)|\} + \left[\sum_{k=0}^{K_{\max}} \min\left\{\sum_{i=1,\, x_i \in N_k(u)}^{|N_k(u)|} d(x_i),\ \sum_{i=1,\, y_i \in N_k(v)}^{|N_k(v)|} d(y_i)\right\}\right] / 2}{\max\{|V(G_u^k)|, |V(G_v^k)|\} + \max\{|E(G_u^k)|, |E(G_v^k)|\}} \qquad (1)$$
The biochemical similarity between nodes u and v is defined as $Bsim(u,v) = \alpha \times Esim(u_e, v_e) + \beta \times Csim(u_i, v_i) + \gamma \times Csim(u_o, v_o)$, where $u_e$ and $v_e$ are the enzymes that catalyze reactions u and v, and $Esim(u_e, v_e)$ is the similarity between these enzymes, computed from the overlap ratio of their EC numbers. $Csim(u_i, v_i)$ is the average similarity of the input compounds of u and v, and $Csim(u_o, v_o)$ is the average similarity of their output compounds. The coefficients α, β and γ weight the three terms in Bsim(u, v). Combining the topological similarity and the biochemical similarity gives the node similarity S(u, v) between u and v:
$$S(u,v) = \sigma \times T(u,v) + (1-\sigma) \times Bsim(u,v) \qquad (2)$$
where σ is a coefficient that weights the two terms in S(u, v).
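The weighted combinations in Bsim and formula (2) can be sketched as follows. The EC overlap measure in `ec_similarity` (fraction of leading EC levels that agree, assuming four-level EC numbers) is only one plausible reading of "overlap ratio of EC numbers" and is an assumption; the function names and all numeric values are illustrative.

```python
def ec_similarity(ec_u, ec_v):
    """Fraction of leading EC levels shared by two four-level EC numbers.

    Assumed definition for this sketch; the patent does not spell it out.
    """
    shared = 0
    for x, y in zip(ec_u.split("."), ec_v.split(".")):
        if x != y:
            break
        shared += 1
    return shared / 4


def bsim(esim, csim_in, csim_out, alpha, beta, gamma):
    """Biochemical similarity: weighted sum of enzyme and compound similarities."""
    return alpha * esim + beta * csim_in + gamma * csim_out


def node_similarity(t, b, sigma):
    """Formula (2): S = sigma * T + (1 - sigma) * Bsim."""
    return sigma * t + (1 - sigma) * b


e = ec_similarity("1.1.1.1", "1.1.2.3")       # first two levels agree
b = bsim(e, 1.0, 0.0, alpha=0.5, beta=0.25, gamma=0.25)
s = node_similarity(1.0, b, sigma=0.5)
```

With α + β + γ = 1 and all component similarities in [0, 1], Bsim and S also stay in [0, 1].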
1.2) Finding the node mapping from node similarity:
The node set of $G_p$ forms one side of a weighted bipartite graph $G_b$ and the node set of $G_p'$ forms the other side; the homologous similarity between a node of $G_p$ and a node of $G_p'$ is the weight of the edge joining them. Maximum-weight bipartite matching then finds, for each node u of $G_p$, its unique mapped node v in $G_p'$, giving the one-to-one mapping (u, v) with $u \in V(G_p)$, $v \in V(G_p')$.
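To make the matching step concrete, here is a small Python sketch that finds a maximum-weight one-to-one assignment by exhaustive search. It is a brute-force stand-in chosen for clarity, not the algorithm named by the patent; for realistic pathway sizes a polynomial-time method such as the Hungarian algorithm would be the usual choice. The function name and the weights are hypothetical.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Exhaustively find the assignment of rows to distinct columns
    with the largest total weight.

    weights[i][j] is the similarity between node i of G_p and node j of G_p'.
    Requires len(rows) <= len(columns). Exponential time: toy inputs only.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    if n_rows > n_cols:
        raise ValueError("expected at most as many rows as columns")
    best_score, best_cols = float("-inf"), None
    for cols in permutations(range(n_cols), n_rows):
        score = sum(weights[i][j] for i, j in enumerate(cols))
        if score > best_score:
            best_score, best_cols = score, cols
    return list(enumerate(best_cols)), best_score


weights = [
    [0.9, 0.1, 0.3],
    [0.8, 0.7, 0.2],
]
pairs, score = max_weight_matching(weights)
```

Here node 0 is mapped to column 0 and node 1 to column 1, even though node 1 on its own would prefer column 0, because the total weight is what is maximized.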
1.3) Building the merged graph of two metabolic pathways:
The one-to-one mappings (u, v) obtained in step 1.2) define the merged nodes $V_m = \{(u,v) \mid u \in V(G_p), v \in V(G_p')\}$, and the graph formed by these merged nodes is defined as the merged graph $G_M$.
Let the vertex set of the merged graph $G_M$ of $G_p$ and $G_p'$ be $V(G_M) = \{V_{m1}, V_{m2}, \ldots, V_{mi}, \ldots, V_{mn}\}$, $i \in \{1, 2, \ldots, n\}$, $n = \max\{|V(G_p)|, |V(G_p')|\}$; $V(G_M)$ is also called the merged node set of $G_p$ and $G_p'$. The homologous similarity between merged nodes is computed as:
$$S(u,v) = \alpha \times Esim(u_e, v_e) + \beta \times Csim(u_{ic}, v_{ic}) + \gamma \times Csim(u_{oc}, v_{oc}) \qquad (3)$$
Computing the homologous similarity between every pair of merged nodes of $G_M$ with formula (3) gives the merged-node homologous similarity matrix M of $G_M$. M is a $|V(G_p)| \times |V(G_p')|$ matrix, and each element $M[V_{mi}, V_{mj}] \in [0,1]$ is the homologous similarity between merged nodes $V_{mi} \in V(G_M)$ and $V_{mj} \in V(G_M)$.
1.4) Building the merged graph of multiple metabolic pathways and the corresponding homologous similarity matrix:
Let the common metabolic pathways of t species be $G_1(V_1,E_1), G_2(V_2,E_2), \ldots, G_t(V_t,E_t)$; these pathways form the set $G = \{G_1(V_1,E_1), G_2(V_2,E_2), \ldots, G_t(V_t,E_t)\}$.
The merged graph of the common metabolic pathways of these species is built as follows:
1.4.1) First select from G the metabolic pathway $G_{max}$ with the most nodes, $|V(G_{max})| = n$, then build a merged graph $G_{Mi}$ between $G_{max}$ and each metabolic pathway $G_i \in G$. The vertex set of $G_{Mi}$ is $V(G_{Mi}) = \{V_{m1i}, V_{m2i}, \ldots, V_{mni}\}$, $i \in \{1, \ldots, t\}$. Each merged graph $G_{Mi}$ yields one merged-node homologous similarity matrix $M_i$.
1.4.2) The merged graphs obtained in step 1.4.1) are merged to give the merged graph $G_{MK}$ of the common metabolic pathways of the t species, where the vertex set of $G_{MK}$ is [formula missing] and the merged-node homologous similarity matrix of $G_{MK}$ is [formula missing].
2) Building conserved functional modules:
Each merged node of the merged graph obtained in step 1.4) is treated as a data point, and the merged-node homologous similarity matrix serves as the similarity matrix between data points. The merged nodes are then clustered; each cluster is a set of merged nodes of the merged graph assigned to one class, and such a set is denoted $U_M$. For each metabolic pathway, after the clustering in each alignment, the set of all nodes of the pathway that belong to the same $U_M$ forms one conserved functional module of that pathway.
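The text does not name a clustering algorithm, so the following Python sketch swaps in a simple one purely for illustration: merged nodes are grouped into connected components of the graph whose edges join pairs with similarity at or above a threshold (single-linkage style, via union-find). The threshold, the function name, and the assumption of a square symmetric similarity matrix are all hypothetical.

```python
def cluster_by_threshold(sim, threshold):
    """Group indices into clusters: i and j share a cluster when they are
    linked by a chain of pairs with similarity >= threshold."""
    n = len(sim)
    parent = list(range(n))

    def find(i):
        # Find the cluster representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy similarity matrix for four merged nodes.
sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
clusters = cluster_by_threshold(sim, threshold=0.5)
```

Each resulting index list plays the role of one set $U_M$; projecting it onto a single pathway would give that pathway's conserved functional module.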
3) Computing species similarity:
Let the common metabolic pathways in the t species be $G_1(V_1,E_1), G_2(V_2,E_2), \ldots, G_t(V_t,E_t)$. In step 2), the conserved functional modules found in these t metabolic pathways are $M = \{M_1, M_2, \ldots, M_r\}$, of which the module with the most nodes is $M_{max}$. For any two metabolic pathways $G_i(V_i,E_i)$ and $G_j(V_j,E_j)$, let their conserved functional modules with the most nodes be $M_{imax}$ and $M_{jmax}$, with vertex sets $V_{imax}$ and $V_{jmax}$ and edge sets $E_{imax}$ and $E_{jmax}$. Let the LCCS of $M_{imax}$ and $M_{jmax}$ within $M_{imax}$ be $M_{iLCCS}$, with vertex set $V_{iLCCS}$ and edge set $E_{iLCCS}$, and let the LCCS of $M_{imax}$ and $M_{jmax}$ within $M_{jmax}$ be $M_{jLCCS}$, with vertex set $V_{jLCCS}$ and edge set $E_{jLCCS}$. The similarity score between $G_i(V_i,E_i)$ and $G_j(V_j,E_j)$ is then:
$$SimScore(G_i, G_j) = \frac{\min\{|E_{iLCCS}|, |E_{jLCCS}|\}}{\max\{|E_i|, |E_j|\}} \qquad (4)$$
Let the t species be $O_1, O_2, \ldots, O_t$, and let the p common metabolic pathways of $O_1$ be $G_{11}, G_{12}, \ldots, G_{1p}$, those of $O_2$ be $G_{21}, G_{22}, \ldots, G_{2p}$, and so on up to $G_{t1}, G_{t2}, \ldots, G_{tp}$ for $O_t$. The similarity between any two species $O_i$ and $O_j$ is then:
$$SimScore(O_i, O_j) = \frac{\sum_{s=1}^{p} SimScore(G_{is}, G_{js})}{p} \qquad (5)$$
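Formulas (4) and (5), as given in the claims, reduce to a ratio of edge counts and an average. The sketch below takes the edge counts as plain integers and does not compute the LCCS itself; the function names and the numbers are illustrative assumptions.

```python
def pathway_sim_score(e_i_lccs, e_j_lccs, e_i, e_j):
    """Formula (4): min(|E_iLCCS|, |E_jLCCS|) / max(|E_i|, |E_j|)."""
    return min(e_i_lccs, e_j_lccs) / max(e_i, e_j)


def species_sim_score(pathway_scores):
    """Formula (5): mean of the p pathway-level similarity scores."""
    return sum(pathway_scores) / len(pathway_scores)


# Hypothetical edge counts for one pathway pair, then two pathway scores.
score_one = pathway_sim_score(e_i_lccs=3, e_j_lccs=4, e_i=6, e_j=8)
species_score = species_sim_score([score_one, 0.625])
```

Because the numerator counts edges of a subgraph and the denominator is the larger pathway's full edge count, each pathway score lies in [0, 1], and so does the species average.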
4) Building the species phylogenetic tree:
The specific steps are as follows:
4.1) Compute the similarity between every pair of the t species with formula (5) to obtain a t × t similarity matrix BSim. BSim is a symmetric matrix whose diagonal entries are 1, and BSim[i, j] ∈ [0, 1] is the similarity between species i and species j.
4.2) Let D be the distance matrix of the t species, where D[i, j] ∈ [0, 1] is the distance between species i and species j and D[i, j] = 1 − BSim[i, j]. A phylogenetic tree based on the distance matrix D is then built with the PHYLIP software.
4.3) The phylogenetic tree is displayed with the TreeView software.
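Step 4.2) can be sketched in Python as converting BSim to D and writing D in PHYLIP's square distance-matrix layout (taxon count on the first line, then one row per taxon with the name padded to 10 characters). The species names, the similarity values, and the choice of four decimal places are assumptions; which PHYLIP distance program is then run on the file is not stated in the text.

```python
def distance_matrix(bsim):
    """D[i][j] = 1 - BSim[i][j]."""
    return [[1.0 - s for s in row] for row in bsim]


def to_phylip(names, dist):
    """Format a square distance matrix in PHYLIP layout."""
    lines = [f"{len(names):5d}"]
    for name, row in zip(names, dist):
        # PHYLIP names occupy the first 10 characters of each row.
        lines.append(f"{name[:10]:<10}" + "".join(f"{d:8.4f}" for d in row))
    return "\n".join(lines) + "\n"


names = ["sp_A", "sp_B", "sp_C"]        # hypothetical species labels
bsim_matrix = [
    [1.0, 0.75, 0.5],
    [0.75, 1.0, 0.25],
    [0.5, 0.25, 1.0],
]
dist = distance_matrix(bsim_matrix)
text = to_phylip(names, dist)
```

The resulting string can be saved to a file and passed to a distance-based tree program; the tree file it produces is what a viewer such as TreeView would display.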
The beneficial effects of the invention are as follows. With this method, the researcher needs only a few simple operations to obtain a phylogenetic tree of the species quickly and accurately. The method turns the global alignment of multiple metabolic pathways from several species into the construction of a merged graph of those pathways, which simplifies pathway alignment. It finds the functional modules of each metabolic pathway by clustering the nodes of the merged graph and builds the mapping between functional modules; the discovered modules and the mapping between them help reveal more of the biochemical characteristics that the pathways share. Finally, the method uses the mapping between functional modules to build a species distance matrix and builds the phylogenetic tree from that matrix, so that the evolutionary relationships between species can be analyzed with the tree.

Claims (1)

1. A method for reconstructing a species phylogenetic tree based on the alignment of multiple metabolic pathways, comprising the following steps:
1) Building the merged graph:
1.1) Computing node similarity:
A metabolic pathway P is represented by a directed graph $G_p = (V_p, E_p)$, where $V_p$ is the vertex set of $G_p$ and $E_p$ is its set of directed edges; vertices $u_i$ and $u_j$ of $G_p$ represent reactions $r_i$ and $r_j$ in P; if an output compound of $r_i$ is an input compound of $r_j$, there is a directed edge from $u_i$ to $u_j$ (that is, from $r_i$ to $r_j$); if $r_i$ and $r_j$ are both reversible, there is also a directed edge from $u_j$ to $u_i$;
k is a positive integer; for any node u of $G_p$, the k-neighborhood $N_k(u)$ is the set of nodes of $V_p$ whose shortest distance from u is exactly k, u itself not belonging to $N_k(u)$; the shortest distance from u to x is the number of edges on a shortest path from u to x; the k-neighborhood $N_k(v)$ of any node v of $G_p'$ is defined in the same way;
For nodes $u \in V_p$ and $v \in V_p'$, the k-neighbor subgraph of u, written $G_u^k$, is the subgraph of $G_p$ induced by $N_k(u) \cup \{u\}$, and the k-neighbor subgraph of v, written $G_v^k$, is the subgraph of $G_p'$ induced by $N_k(v) \cup \{v\}$; d(u) and d(v) are the degrees of u and v in $G_p$ and $G_p'$ respectively; the degrees $d(x_i)$ of the nodes $x_i \in N_k(u)$, arranged in non-increasing order, form the degree sequence of the k-neighbors of u, and the degrees $d(y_i)$ of the nodes $y_i \in N_k(v)$ form the degree sequence of the k-neighbors of v in the same way; the topological similarity T(u, v) of nodes u and v is defined as:
$$T(u,v) = \frac{\min\{|V(G_u^k)|, |V(G_v^k)|\} + \left[\sum_{k=0}^{K_{\max}} \min\left\{\sum_{i=1,\, x_i \in N_k(u)}^{|N_k(u)|} d(x_i),\ \sum_{i=1,\, y_i \in N_k(v)}^{|N_k(v)|} d(y_i)\right\}\right] / 2}{\max\{|V(G_u^k)|, |V(G_v^k)|\} + \max\{|E(G_u^k)|, |E(G_v^k)|\}} \qquad (1)$$
The biochemical similarity between nodes u and v is defined as $Bsim(u,v) = \alpha \times Esim(u_e, v_e) + \beta \times Csim(u_i, v_i) + \gamma \times Csim(u_o, v_o)$, where $u_e$ and $v_e$ are the enzymes that catalyze reactions u and v, $Esim(u_e, v_e)$ is the similarity between these enzymes, computed from the overlap ratio of their EC numbers, $Csim(u_i, v_i)$ is the average similarity of the input compounds of u and v, $Csim(u_o, v_o)$ is the average similarity of their output compounds, and the coefficients α, β and γ weight the three terms in Bsim(u, v); combining the topological similarity and the biochemical similarity gives the node similarity S(u, v) between u and v:
$$S(u,v) = \sigma \times T(u,v) + (1-\sigma) \times Bsim(u,v) \qquad (2)$$
where σ is a coefficient that weights the two terms in S(u, v);
1.2) Finding the node mapping from node similarity:
The node set of $G_p$ forms one side of a weighted bipartite graph $G_b$ and the node set of $G_p'$ forms the other side; the homologous similarity between a node of $G_p$ and a node of $G_p'$ is the weight of the edge joining them; maximum-weight bipartite matching finds, for each node u of $G_p$, its unique mapped node v in $G_p'$, giving the one-to-one mapping (u, v) with $u \in V(G_p)$, $v \in V(G_p')$;
1.3) Building the merged graph of two metabolic pathways:
The one-to-one mappings (u, v) obtained in step 1.2) define the merged nodes $V_m = \{(u,v) \mid u \in V(G_p), v \in V(G_p')\}$, and the graph formed by these merged nodes is defined as the merged graph $G_M$;
Let the vertex set of the merged graph $G_M$ of $G_p$ and $G_p'$ be $V(G_M) = \{V_{m1}, V_{m2}, \ldots, V_{mi}, \ldots, V_{mn}\}$, $i \in \{1, 2, \ldots, n\}$, $n = \max\{|V(G_p)|, |V(G_p')|\}$; $V(G_M)$ is also called the merged node set of $G_p$ and $G_p'$; the homologous similarity between merged nodes is computed as:
$$S(u,v) = \alpha \times Esim(u_e, v_e) + \beta \times Csim(u_{ic}, v_{ic}) + \gamma \times Csim(u_{oc}, v_{oc}) \qquad (3)$$
Computing the homologous similarity between every pair of merged nodes of $G_M$ with formula (3) gives the merged-node homologous similarity matrix M of $G_M$; M is a $|V(G_p)| \times |V(G_p')|$ matrix, and each element $M[V_{mi}, V_{mj}] \in [0,1]$ is the homologous similarity between merged nodes $V_{mi} \in V(G_M)$ and $V_{mj} \in V(G_M)$;
1.4) Building the merged graph of multiple metabolic pathways and the corresponding homologous similarity matrix:
Let the common metabolic pathways of t species be $G_1(V_1,E_1), G_2(V_2,E_2), \ldots, G_t(V_t,E_t)$; these pathways form the set $G = \{G_1(V_1,E_1), G_2(V_2,E_2), \ldots, G_t(V_t,E_t)\}$;
The merged graph of the common metabolic pathways of these species is built as follows:
1.4.1) First select from G the metabolic pathway $G_{max}$ with the most nodes, $|V(G_{max})| = n$, then build a merged graph $G_{Mi}$ between $G_{max}$ and each metabolic pathway $G_i \in G$; the vertex set of $G_{Mi}$ is $V(G_{Mi}) = \{V_{m1i}, V_{m2i}, \ldots, V_{mni}\}$, $i \in \{1, \ldots, t\}$; each merged graph $G_{Mi}$ yields one merged-node homologous similarity matrix $M_i$;
1.4.2) The merged graphs obtained in step 1.4.1) are merged to give the merged graph $G_{MK}$ of the common metabolic pathways of the t species, where the vertex set of $G_{MK}$ is [formula missing] and the merged-node homologous similarity matrix of $G_{MK}$ is [formula missing];
2) Building conserved functional modules:
Each merged node of the merged graph obtained in step 1.4) is treated as a data point, and the merged-node homologous similarity matrix serves as the similarity matrix between data points; the merged nodes are clustered, each cluster being a set of merged nodes of the merged graph assigned to one class, and such a set is denoted $U_M$; for each metabolic pathway, after the clustering in each alignment, the set of all nodes of the pathway that belong to the same $U_M$ forms one conserved functional module of that pathway;
3) Computing species similarity:
Let the common metabolic pathways in the t species be $G_1(V_1,E_1), G_2(V_2,E_2), \ldots, G_t(V_t,E_t)$; in step 2), the conserved functional modules found in these t metabolic pathways are $M = \{M_1, M_2, \ldots, M_r\}$, of which the module with the most nodes is $M_{max}$; for any two metabolic pathways $G_i(V_i,E_i)$ and $G_j(V_j,E_j)$, let their conserved functional modules with the most nodes be $M_{imax}$ and $M_{jmax}$, with vertex sets $V_{imax}$ and $V_{jmax}$ and edge sets $E_{imax}$ and $E_{jmax}$; let the LCCS of $M_{imax}$ and $M_{jmax}$ within $M_{imax}$ be $M_{iLCCS}$, with vertex set $V_{iLCCS}$ and edge set $E_{iLCCS}$, and let the LCCS of $M_{imax}$ and $M_{jmax}$ within $M_{jmax}$ be $M_{jLCCS}$, with vertex set $V_{jLCCS}$ and edge set $E_{jLCCS}$; the similarity score between $G_i(V_i,E_i)$ and $G_j(V_j,E_j)$ is then:
$$SimScore(G_i, G_j) = \frac{\min\{|E_{iLCCS}|, |E_{jLCCS}|\}}{\max\{|E_i|, |E_j|\}} \qquad (4)$$
Let the t species be $O_1, O_2, \ldots, O_t$, and let the p common metabolic pathways of $O_1$ be $G_{11}, G_{12}, \ldots, G_{1p}$, those of $O_2$ be $G_{21}, G_{22}, \ldots, G_{2p}$, and so on up to $G_{t1}, G_{t2}, \ldots, G_{tp}$ for $O_t$; the similarity between any two species $O_i$ and $O_j$ is then:
$$SimScore(O_i, O_j) = \frac{\sum_{s=1}^{p} SimScore(G_{is}, G_{js})}{p} \qquad (5)$$
4) Building the species phylogenetic tree:
The specific steps are as follows:
4.1) Compute the similarity between every pair of the t species with formula (5) to obtain a t × t similarity matrix BSim; BSim is a symmetric matrix whose diagonal entries are 1, and BSim[i, j] ∈ [0, 1] is the similarity between species i and species j;
4.2) Let D be the distance matrix of the t species, where D[i, j] ∈ [0, 1] is the distance between species i and species j and D[i, j] = 1 − BSim[i, j]; a phylogenetic tree based on the distance matrix D is then built with the PHYLIP software;
4.3) The phylogenetic tree is displayed with the TreeView software.
CN201710116712.3A 2017-03-01 2017-03-01 Method for reconstructing a species phylogenetic tree based on alignment of multiple metabolic pathways Active CN106909805B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710116712.3A CN106909805B (en) 2017-03-01 2017-03-01 Method for reconstructing a species phylogenetic tree based on alignment of multiple metabolic pathways

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710116712.3A CN106909805B (en) 2017-03-01 2017-03-01 Method for reconstructing a species phylogenetic tree based on alignment of multiple metabolic pathways

Publications (2)

Publication Number Publication Date
CN106909805A (en) 2017-06-30
CN106909805B (en) 2019-04-02

Family

ID=59208467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710116712.3A Active CN106909805B (en) 2017-03-01 2017-03-01 Method for reconstructing a species phylogenetic tree based on alignment of multiple metabolic pathways

Country Status (1)

Country Link
CN (1) CN106909805B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108846262A (en) * 2018-05-31 2018-11-20 广西大学 The method that RNA secondary structure distance based on DFT calculates phylogenetic tree construction
CN109326328A (en) * 2018-11-02 2019-02-12 西北大学 A kind of extinct plants and animal pedigree evolution analysis method based on pedigree cluster
CN110135450A (en) * 2019-03-26 2019-08-16 中电莱斯信息系统有限公司 A kind of hotspot path analysis method based on Density Clustering
WO2022127687A1 (en) * 2020-12-18 2022-06-23 深圳先进技术研究院 Metabolic pathway prediction method, system, terminal device and readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020123847A1 (en) * 2000-12-20 2002-09-05 Manor Askenazi Method for analyzing biological elements
CN103218776A (en) * 2013-03-07 2013-07-24 Tianjin University Non-local depth image super-resolution rebuilding method based on minimum spanning tree (MST)
CN103984718A (en) * 2014-05-09 2014-08-13 State Grid Corporation of China Search algorithm of all spanning trees of directed graph and undirected graph

Also Published As

Publication number Publication date
CN106909805B (en) 2019-04-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant