CN106909805A - Method for reconstructing a species phylogenetic tree based on comparison of multiple metabolic pathways - Google Patents
- Publication number: CN106909805A
- Application number: CN201710116712.3A
- Authority: CN (China)
- Prior art keywords: node, metabolic pathway, similarity, species, union graph
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G: PHYSICS
- G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B10/00: ICT specially adapted for evolutionary bioinformatics, e.g. phylogenetic tree construction or analysis
- G16B5/00: ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
Abstract
The invention discloses a method for reconstructing a species phylogenetic tree based on the comparison of multiple metabolic pathways. A union graph of the metabolic pathways is built through global alignment of the pathways; clustering the nodes of the union graph then establishes a mapping between the functional modules of the individual pathways; and this mapping between functional modules is used to further analyze the relationships between metabolic pathways and to build the phylogenetic tree of the species. The beneficial effect of the invention is that the method simplifies the comparison of metabolic pathways: a researcher only needs to perform simple operations to generate a phylogenetic tree of the species quickly and accurately.
Description
Technical field
The invention relates to a method for generating a species phylogenetic tree, specifically a method that generates the tree from the comparison of multiple metabolic pathways.
Background art
Phylogenetic analysis is an important field of systems biology research. Existing methods that build phylogenetic trees from metabolic data mainly analyze the relationships between metabolic pathways through mappings between pathway nodes, and use those relationships for the phylogenetic analysis of species. However, the mapping information between nodes is limited, and node mappings alone make it difficult to uncover the deeper relationships between metabolic pathways.
Summary of the invention
The object of the invention is to provide a method for building a phylogenetic tree of species based on the comparison of multiple metabolic pathways.
The technical solution by which the invention solves the above technical problem is as follows:
A method for reconstructing a species phylogenetic tree based on the comparison of multiple metabolic pathways, with the following steps:
1) Building the union graph of multiple metabolic pathways:
1.1) Computing node similarity:
A metabolic pathway P is represented by a directed graph G_p = (V_p, E_p), where V_p is the vertex set of G_p and E_p is its set of directed edges. Vertices u_i and u_j of G_p represent reactions r_i and r_j of P. If an output compound of r_i is an input compound of r_j, there is a directed edge from u_i to u_j; if r_i and r_j are both reversible, there is also a directed edge from u_j to u_i.
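The graph construction rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input layout (a dictionary mapping a reaction id to its input compounds, output compounds and a reversibility flag) and the function name `build_reaction_graph` are hypothetical and do not come from the patent.

```python
def build_reaction_graph(reactions):
    """Build the directed reaction graph as an adjacency dictionary.

    reactions: dict mapping reaction id -> (inputs, outputs, reversible),
    where inputs and outputs are sets of compound names (hypothetical layout).
    """
    edges = {rid: set() for rid in reactions}
    for ri, (_, outs_i, rev_i) in reactions.items():
        for rj, (ins_j, _, rev_j) in reactions.items():
            if ri == rj:
                continue
            # An output compound of r_i is an input compound of r_j.
            if outs_i & ins_j:
                edges[ri].add(rj)
                # Both reactions reversible: add the reverse edge as well.
                if rev_i and rev_j:
                    edges[rj].add(ri)
    return edges


# Toy pathway: A -> B -> C <-> D, with r2 and r3 reversible.
toy = {
    "r1": ({"A"}, {"B"}, False),
    "r2": ({"B"}, {"C"}, True),
    "r3": ({"C"}, {"D"}, True),
}
graph = build_reaction_graph(toy)
```

In the toy pathway, r1 feeds r2 but is not reversible, so only the forward edge exists; r2 and r3 are both reversible, so they are linked in both directions.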
Let k be a positive integer. For any node u of G_p, the k-neighborhood N_k(u) is the set of nodes of V_p such that u does not belong to N_k(u) and, for every node x ∈ N_k(u), the shortest distance from u to x is k, where the shortest distance is the number of edges on a shortest path from u to x. The k-neighborhood N_k(v) of any node v of G_p' is defined in the same way.
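A breadth-first search is one straightforward way to compute such a k-neighborhood. The sketch below assumes the graph is stored as an adjacency dictionary and that distances follow edge direction; the function name `k_neighborhood` is a hypothetical choice for this illustration.

```python
from collections import deque


def k_neighborhood(adj, u, k):
    """Return the set of nodes whose shortest distance from u is exactly k.

    adj: dict mapping node -> iterable of successor nodes (directed edges).
    k: positive integer, so u itself is never part of the result.
    """
    dist = {u: 0}
    queue = deque([u])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # no need to explore beyond distance k
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return {x for x, d in dist.items() if d == k}


adj = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
n1 = k_neighborhood(adj, "a", 1)
n2 = k_neighborhood(adj, "a", 2)
```

Because breadth-first search visits nodes in order of increasing distance, the first distance recorded for a node is its shortest distance from u.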
For a node u ∈ V_p and a node v ∈ V_p', the k-neighbor subgraph of u is the subgraph of G_p induced by N_k(u) ∪ {u}, and the k-neighbor subgraph of v is the subgraph of G_p' induced by N_k(v) ∪ {v} (the symbols denoting these two subgraphs are not preserved in this text). Let d(u) and d(v) be the degrees of u in G_p and of v in G_p'. The degree sequence of the k-neighbors of u lists the degrees of the nodes of N_k(u) in non-increasing order, and the degree sequence of the k-neighbors of v lists the degrees of the nodes of N_k(v) in non-increasing order (the symbols for the two sequences are likewise not preserved). The topological similarity T(u, v) of nodes u and v is defined by formula (1), which is not preserved in this text.
The biochemical similarity between node u and node v is defined as:
Bsim(u, v) = α × ESim(u_e, v_e) + β × Csim(u_i, v_i) + γ × Csim(u_o, v_o)
where u_e and v_e are the enzymes that catalyze reactions u and v, and ESim(u_e, v_e) is the similarity between them, computed from the overlap ratio of the enzymes' EC numbers. Csim(u_i, v_i) is the average similarity of the input compounds of u and v, and Csim(u_o, v_o) is the average similarity of their output compounds. α, β and γ are proportionality coefficients that set the weight of each term in Bsim(u, v). Combining the topological similarity with the biochemical similarity gives the node similarity S(u, v) between u and v:
S(u, v) = σ × T(u, v) + (1 − σ) × Bsim(u, v)    (2)
where σ is a proportionality coefficient that sets the weight of each term in S(u, v).
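The weighted combinations in Bsim(u, v) and formula (2) can be illustrated as follows. This is a sketch under assumptions: the text only says that ESim comes from the overlap of EC numbers, so the rule used here (the fraction of the four EC levels that agree, read from the left) is an assumed interpretation, and the function names are hypothetical. The topological similarity T and the compound similarities Csim are taken as already computed inputs.

```python
def ec_similarity(ec_u, ec_v):
    """Assumed ESim: fraction of the four EC levels that match from the left."""
    shared = 0
    for x, y in zip(ec_u.split("."), ec_v.split(".")):
        if x != y:
            break
        shared += 1
    return shared / 4


def biochemical_similarity(esim, csim_in, csim_out, alpha, beta, gamma):
    """Bsim(u, v) = alpha*ESim + beta*Csim(inputs) + gamma*Csim(outputs)."""
    return alpha * esim + beta * csim_in + gamma * csim_out


def node_similarity(t, bsim, sigma):
    """Formula (2): S(u, v) = sigma*T(u, v) + (1 - sigma)*Bsim(u, v)."""
    return sigma * t + (1 - sigma) * bsim


esim = ec_similarity("1.1.1.1", "1.1.1.2")
bsim = biochemical_similarity(esim, 1.0, 0.5, alpha=0.5, beta=0.25, gamma=0.25)
s = node_similarity(0.25, bsim, sigma=0.5)
```

Choosing α + β + γ = 1 and σ in [0, 1], as in this example, keeps both scores in [0, 1] whenever the individual similarities are; the text does not state this constraint explicitly, so it is an assumption of the example.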
1.2) Finding the mapping between nodes from the node similarity:
Take the node set of G_p as one side of a weighted bipartite graph G_b and the node set of G_p' as the other side, and use the homologous similarity between a node of G_p and a node of G_p' as the weight of the edge joining them. Maximum-weight bipartite matching then finds, for each node u of G_p, its unique mapped node v in G_p', giving a one-to-one mapping (u, v) with u ∈ V(G_p) and v ∈ V(G_p').
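To make the matching step concrete, the sketch below finds a maximum-weight one-to-one mapping by exhaustive search. This is deliberately not an efficient algorithm: it is a stand-in that is only practical for very small pathways, and a real implementation would more likely use the Hungarian algorithm or a similar assignment solver. The function name and the matrix layout are assumptions for this illustration.

```python
from itertools import permutations


def max_weight_matching(sim):
    """Exhaustively find the one-to-one mapping with the largest total weight.

    sim[i][j] is the similarity between node i of G_p and node j of G_p'.
    Assumes len(sim) <= len(sim[0]); only viable for tiny inputs.
    """
    rows, cols = len(sim), len(sim[0])
    best_score, best = float("-inf"), None
    for perm in permutations(range(cols), rows):
        score = sum(sim[i][perm[i]] for i in range(rows))
        if score > best_score:
            best_score, best = score, perm
    return list(enumerate(best)), best_score


sim = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
]
pairs, total = max_weight_matching(sim)
```

The example shows why a global matching is used rather than a greedy choice: picking the single best edge first (0.90) would force a poor partner for the second node, while the optimal mapping pairs node 0 with node 1 and node 1 with node 0.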
1.3) Building the union graph of two metabolic pathways:
Each one-to-one mapping (u, v) obtained in step 1.2) is defined as a merged node V_m = {(u, v) | u ∈ V(G_p), v ∈ V(G_p')}, and the graph formed by these merged nodes is defined as the union graph G_M.
Let the vertex set of the union graph G_M of G_p and G_p' be V(G_M) = {V_m1, V_m2, …, V_mi, …, V_mn}, i ∈ {1, 2, …, n}, n = max{|V(G_p)|, |V(G_p')|}; V(G_M) is also called the merged node set of G_p and G_p'. The homologous similarity between merged nodes is computed as:
S(u, v) = α × ESim(u_e, v_e) + β × Csim(u_ic, v_ic) + γ × Csim(u_oc, v_oc)    (3)
Applying formula (3) to every pair of merged nodes of G_M gives the merged-node homologous similarity matrix M of G_M. M is a |V(G_p)| × |V(G_p')| matrix, and each element M[V_mi, V_mj] ∈ [0, 1] is the homologous similarity between merged nodes V_mi ∈ V(G_M) and V_mj ∈ V(G_M).
1.4) Building the union graph of multiple metabolic pathways and its homologous similarity matrix:
Let the common metabolic pathways of t species be G_1(V_1, E_1), G_2(V_2, E_2), …, G_t(V_t, E_t); these pathways form the set G = {G_1(V_1, E_1), G_2(V_2, E_2), …, G_t(V_t, E_t)}.
The union graph of the common metabolic pathways of these species is built as follows:
1.4.1) First select from G the metabolic pathway G_max with the most nodes, |V(G_max)| = n. Then build a union graph G_Mi between G_max and each metabolic pathway G_i ∈ G; the vertex set of G_Mi is V(G_Mi) = {V_m1i, V_m2i, …, V_mni}, i ∈ {1, …, t}. Each union graph G_Mi yields a merged-node homologous similarity matrix M_i.
1.4.2) Merge the union graphs obtained in step 1.4.1) to obtain the union graph G_MK of the common metabolic pathways of the t species. The expressions for the vertex set of G_MK and for its merged-node homologous similarity matrix are not preserved in this text.
2) Establishing the conserved functional modules:
Treat each merged node of the union graph obtained in step 1.4) as a data point, use the merged-node homologous similarity matrix as the similarity matrix between data points, and cluster the merged nodes. The clustering result is a collection of sets of merged nodes, each set containing the merged nodes of the union graph assigned to the same class; such a set is called U_M. For each metabolic pathway, after the clustering of each comparison, the set of all nodes of the pathway that belong to the same U_M is one conserved functional module of that pathway.
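The text does not name a clustering algorithm, so the following is only one possible reading: a simple threshold rule in which two merged nodes fall into the same class whenever they are connected by a chain of pairs whose similarity reaches a cutoff (single-linkage grouping via union-find). The cutoff value, the function name and the index-based representation of merged nodes are all assumptions of this sketch.

```python
def cluster_by_threshold(sim, threshold):
    """Group data points linked by similarities >= threshold.

    sim: symmetric similarity matrix over merged nodes (list of lists).
    Returns a sorted list of clusters, each a sorted list of indices.
    This threshold rule is an assumed stand-in, not the patent's method.
    """
    n = len(sim)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
clusters = cluster_by_threshold(sim, 0.5)
```

Each returned cluster plays the role of one U_M; mapping the merged nodes of a cluster back to the original nodes of a given pathway would then give that pathway's conserved functional module.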
3) Computing species similarity:
Let the common metabolic pathway in the t species be G_1(V_1, E_1), G_2(V_2, E_2), …, G_t(V_t, E_t). Let the conserved functional modules found in these t metabolic pathways in step 2) be M = {M_1, M_2, …, M_r}, and let M_max be the conserved functional module with the largest number of nodes. For any two metabolic pathways G_i(V_i, E_i) and G_j(V_j, E_j), let their largest conserved functional modules be M_imax and M_jmax, with vertex sets V_imax and V_jmax and edge sets E_imax and E_jmax. Let M_iLCCS be the LCCS of M_imax and M_jmax within M_imax, with vertex set V_iLCCS and edge set E_iLCCS, and let M_jLCCS be the LCCS of M_imax and M_jmax within M_jmax, with vertex set V_jLCCS and edge set E_jLCCS. The similarity score between metabolic pathways G_i(V_i, E_i) and G_j(V_j, E_j) is then given by formula (4), which is not preserved in this text.
Let the t species be O_1, O_2, …, O_t, where the p common metabolic pathways of O_1 are G_11, G_12, …, G_1p, those of O_2 are G_21, G_22, …, G_2p, …, and those of O_t are G_t1, G_t2, …, G_tp. The similarity between any two species O_i and O_j is then given by formula (5), which is not preserved in this text.
4) Building the species phylogenetic tree:
The steps are as follows:
4.1) Use formula (5) to compute the similarity between every pair of the t species, giving a t × t similarity matrix BSim. BSim is a symmetric matrix whose diagonal elements are 1, and BSim[i, j] ∈ [0, 1] is the similarity between species i and species j.
4.2) Let D be the distance matrix of the t species, where D[i, j] ∈ [0, 1] is the distance between species i and species j and D[i, j] = 1 − BSim[i, j]. A phylogenetic tree based on the distance matrix D is then built with the software PHYLIP.
4.3) The phylogenetic tree is displayed with the software TreeView.
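Step 4.2) can be illustrated with a short sketch that converts a similarity matrix into the distance matrix D = 1 − BSim and writes it in the square distance-matrix layout that PHYLIP's distance-based programs read (a first line with the number of taxa, then one row per taxon with the name padded to ten characters). The species names, the similarity values and both function names are hypothetical; the text does not say which PHYLIP program is used.

```python
def similarity_to_distance(bsim):
    """D[i][j] = 1 - BSim[i][j] for every pair of species."""
    return [[1.0 - s for s in row] for row in bsim]


def to_phylip(names, dist):
    """Format a square distance matrix in PHYLIP's plain-text layout."""
    lines = [f"{len(names):5d}"]
    for name, row in zip(names, dist):
        # Names are truncated or padded to exactly 10 characters.
        label = name[:10].ljust(10)
        lines.append(label + " ".join(f"{d:.4f}" for d in row))
    return "\n".join(lines) + "\n"


bsim = [
    [1.0, 0.75],
    [0.75, 1.0],
]
dist = similarity_to_distance(bsim)
text = to_phylip(["sp_A", "sp_B"], dist)
```

Because BSim is symmetric with a diagonal of 1, the resulting D is symmetric with a zero diagonal, which is what distance-based tree-building programs expect as input.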
The beneficial effects of the invention are as follows. With this method, a researcher only needs to perform simple operations to generate a phylogenetic tree of the species quickly and accurately. The method turns the global alignment of many metabolic pathways from several species into the construction of a union graph of those pathways, which simplifies the comparison of metabolic pathways. The method finds the functional modules of each metabolic pathway by clustering the nodes of the union graph and establishes a mapping between the functional modules; the discovery of functional modules and the mapping between them help reveal more of the biochemical characteristics shared by metabolic pathways. The method uses the mapping between functional modules to build a species distance matrix and builds the phylogenetic tree from that matrix, so that the evolutionary relationships between species can be analyzed with the phylogenetic tree.
Specific embodiment
The specific embodiment carries out steps 1) to 4) exactly as set out in the preceding section: building the union graph of the metabolic pathways (steps 1.1 to 1.4), establishing the conserved functional modules (step 2), computing the species similarity (step 3), and building the species phylogenetic tree (steps 4.1 to 4.3).
Claims (1)
1. A method for reconstructing a species phylogenetic tree based on the comparison of multiple metabolic pathways, comprising the following steps:
1) building the union graph:
1.1) computing node similarity:
a metabolic pathway P is represented by a directed graph G_p = (V_p, E_p), where V_p is the vertex set of G_p and E_p is its set of directed edges; vertices u_i and u_j of G_p represent reactions r_i and r_j of P; if an output compound of r_i is an input compound of r_j, there is a directed edge from u_i to u_j, and if r_i and r_j are both reversible, there is also a directed edge from u_j to u_i;
k is a positive integer; for any node u of G_p, the k-neighborhood N_k(u) is the set of nodes of V_p such that u does not belong to N_k(u) and, for every node x ∈ N_k(u), the shortest distance from u to x is k, where the shortest distance is the number of edges on a shortest path from u to x; the k-neighborhood N_k(v) of any node v of G_p' is defined in the same way;
for a node u ∈ V_p and a node v ∈ V_p', the k-neighbor subgraph of u is the subgraph of G_p induced by N_k(u) ∪ {u}, and the k-neighbor subgraph of v is the subgraph of G_p' induced by N_k(v) ∪ {v} (the symbols denoting these subgraphs are not preserved in this text); d(u) and d(v) are the degrees of u in G_p and of v in G_p'; the degree sequence of the k-neighbors of u lists the degrees of the nodes of N_k(u) in non-increasing order, and the degree sequence of the k-neighbors of v lists the degrees of the nodes of N_k(v) in non-increasing order (the symbols for these sequences are not preserved in this text); the topological similarity T(u, v) of nodes u and v is defined by formula (1), which is not preserved in this text;
the biochemical similarity between node u and node v is defined as Bsim(u, v) = α × ESim(u_e, v_e) + β × Csim(u_i, v_i) + γ × Csim(u_o, v_o), where u_e and v_e are the enzymes that catalyze reactions u and v, ESim(u_e, v_e) is the similarity between enzyme u_e and enzyme v_e, computed from the overlap ratio of the enzymes' EC numbers, Csim(u_i, v_i) is the average similarity of the input compounds of u and v, Csim(u_o, v_o) is the average similarity of the output compounds of u and v, and α, β and γ are proportionality coefficients that set the weight of each term in Bsim(u, v); combining the topological similarity with the biochemical similarity gives the node similarity S(u, v) between u and v:
S(u, v) = σ × T(u, v) + (1 − σ) × Bsim(u, v)    (2)
where σ is a proportionality coefficient that sets the weight of each term in S(u, v);
1.2) finding the mapping between nodes from the node similarity:
the node set of G_p is taken as one side of a weighted bipartite graph G_b and the node set of G_p' as the other side, with the homologous similarity between a node of G_p and a node of G_p' as the weight of the edge joining them; maximum-weight bipartite matching finds, for each node u of G_p, its unique mapped node v in G_p', giving a one-to-one mapping (u, v) with u ∈ V(G_p) and v ∈ V(G_p');
1.3) building the union graph of two metabolic pathways:
each one-to-one mapping (u, v) obtained in step 1.2) is defined as a merged node V_m = {(u, v) | u ∈ V(G_p), v ∈ V(G_p')}, and the graph formed by these merged nodes is defined as the union graph G_M;
the vertex set of the union graph G_M of G_p and G_p' is V(G_M) = {V_m1, V_m2, …, V_mi, …, V_mn}, i ∈ {1, 2, …, n}, n = max{|V(G_p)|, |V(G_p')|}, and V(G_M) is also called the merged node set of G_p and G_p'; the homologous similarity between merged nodes is computed as:
S(u, v) = α × ESim(u_e, v_e) + β × Csim(u_ic, v_ic) + γ × Csim(u_oc, v_oc)    (3)
applying formula (3) to every pair of merged nodes of G_M gives the merged-node homologous similarity matrix M of G_M, where M is a |V(G_p)| × |V(G_p')| matrix and each element M[V_mi, V_mj] ∈ [0, 1] is the homologous similarity between merged nodes V_mi ∈ V(G_M) and V_mj ∈ V(G_M);
1.4) building the union graph of multiple metabolic pathways and its homologous similarity matrix:
the common metabolic pathways of t species are G_1(V_1, E_1), G_2(V_2, E_2), …, G_t(V_t, E_t), and these pathways form the set G = {G_1(V_1, E_1), G_2(V_2, E_2), …, G_t(V_t, E_t)};
the union graph of the common metabolic pathways of these species is built as follows:
1.4.1) first select from G the metabolic pathway G_max with the most nodes, |V(G_max)| = n, then build a union graph G_Mi between G_max and each metabolic pathway G_i ∈ G, the vertex set of G_Mi being V(G_Mi) = {V_m1i, V_m2i, …, V_mni}, i ∈ {1, …, t}; each union graph G_Mi yields a merged-node homologous similarity matrix M_i;
1.4.2) merge the union graphs obtained in step 1.4.1) to obtain the union graph G_MK of the common metabolic pathways of the t species (the expressions for the vertex set of G_MK and for its merged-node homologous similarity matrix are not preserved in this text);
2) establishing the conserved functional modules:
each merged node of the union graph obtained in step 1.4) is treated as a data point, the merged-node homologous similarity matrix is used as the similarity matrix between data points, and the merged nodes are clustered; the clustering result is a collection of sets of merged nodes, each set containing the merged nodes of the union graph assigned to the same class, and such a set is called U_M; for each metabolic pathway, after the clustering of each comparison, the set of all nodes of the pathway that belong to the same U_M is one conserved functional module of that pathway;
3) computing species similarity:
the common metabolic pathway in the t species is G_1(V_1, E_1), G_2(V_2, E_2), …, G_t(V_t, E_t); the conserved functional modules found in these t metabolic pathways in step 2) are M = {M_1, M_2, …, M_r}, and M_max is the conserved functional module with the largest number of nodes; for any two metabolic pathways G_i(V_i, E_i) and G_j(V_j, E_j), their largest conserved functional modules are M_imax and M_jmax, with vertex sets V_imax and V_jmax and edge sets E_imax and E_jmax; M_iLCCS is the LCCS of M_imax and M_jmax within M_imax, with vertex set V_iLCCS and edge set E_iLCCS, and M_jLCCS is the LCCS of M_imax and M_jmax within M_jmax, with vertex set V_jLCCS and edge set E_jLCCS; the similarity score between metabolic pathways G_i(V_i, E_i) and G_j(V_j, E_j) is given by formula (4), which is not preserved in this text;
the t species are O_1, O_2, …, O_t, where the p common metabolic pathways of O_1 are G_11, G_12, …, G_1p, those of O_2 are G_21, G_22, …, G_2p, …, and those of O_t are G_t1, G_t2, …, G_tp; the similarity between any two species O_i and O_j is given by formula (5), which is not preserved in this text;
4) building the species phylogenetic tree, with the following steps:
4.1) use formula (5) to compute the similarity between every pair of the t species, giving a t × t similarity matrix BSim, where BSim is a symmetric matrix whose diagonal elements are 1 and BSim[i, j] ∈ [0, 1] is the similarity between species i and species j;
4.2) let D be the distance matrix of the t species, where D[i, j] ∈ [0, 1] is the distance between species i and species j and D[i, j] = 1 − BSim[i, j]; a phylogenetic tree based on the distance matrix D is then built with the software PHYLIP;
4.3) the phylogenetic tree is displayed with the software TreeView.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710116712.3A CN106909805B (en) | 2017-03-01 | 2017-03-01 | Method for reconstructing a species phylogenetic tree based on comparison of multiple metabolic pathways |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710116712.3A CN106909805B (en) | 2017-03-01 | 2017-03-01 | Method for reconstructing a species phylogenetic tree based on comparison of multiple metabolic pathways |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106909805A (en) | 2017-06-30 |
CN106909805B (en) | 2019-04-02 |
Family
ID=59208467
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710116712.3A Active CN106909805B (en) | Method for reconstructing a species phylogenetic tree based on comparison of multiple metabolic pathways | 2017-03-01 | 2017-03-01 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106909805B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108846262A (en) * | 2018-05-31 | 2018-11-20 | 广西大学 | The method that RNA secondary structure distance based on DFT calculates phylogenetic tree construction |
CN109326328A (en) * | 2018-11-02 | 2019-02-12 | 西北大学 | A kind of extinct plants and animal pedigree evolution analysis method based on pedigree cluster |
CN110135450A (en) * | 2019-03-26 | 2019-08-16 | 中电莱斯信息系统有限公司 | A kind of hotspot path analysis method based on Density Clustering |
WO2022127687A1 (en) * | 2020-12-18 | 2022-06-23 | 深圳先进技术研究院 | Metabolic pathway prediction method, system, terminal device and readable storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020123847A1 (en) * | 2000-12-20 | 2002-09-05 | Manor Askenazi | Method for analyzing biological elements |
CN103218776A (en) * | 2013-03-07 | 2013-07-24 | 天津大学 | Non-local depth image super-resolution rebuilding method based on minimum spanning tree (MST) |
CN103984718A (en) * | 2014-05-09 | 2014-08-13 | 国家电网公司 | Search algorithm of all spanning trees of directed graph and undirected graph |
Also Published As
Publication number | Publication date |
---|---|
CN106909805B (en) | 2019-04-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||