CN107180164A - Template-based multi-domain protein structure assembly method
- Publication number: CN107180164A
- Application number: CN201710256156.XA
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
Abstract
A template-based multi-domain protein structure assembly method. First, based on the structure of each domain, the protein structure alignment tool TM-align is used to align each domain in turn, and the best template is selected from a multi-domain protein database. Then, a rotation matrix and translation vector are obtained by the Kabsch method, each domain is superimposed onto the template, and translation and rotation operations are applied to each domain so that the distance between adjacent domains equals the minimum allowed distance. Next, the assembled structure is adjusted by random translations and rotations, and its quality is measured by the clash factor between domains, the number of interacting atoms, and the displacement of the assembled structure relative to the template. During assembly, adjacent domains are assembled in sequence and each assembled structure is then fixed; after all structures have been assembled, the final assembly result is output. The invention provides a template-based multi-domain protein structure assembly method with higher prediction accuracy.
Description
Technical field
The present invention relates to the fields of bioinformatics, intelligent optimization, and computer applications, and in particular to a template-based multi-domain protein structure assembly method.
Background art
Large proteins usually consist of several independently folding domains, and determining the structures of multi-domain proteins can strongly advance biological research. A domain typically has a compact three-dimensional structure and a specific biological function, and the same domain may be combined in different ways. In addition, the single-domain three-dimensional structures of many multi-domain proteins have already been determined by X-ray diffraction, nuclear magnetic resonance, or computational prediction. Obtaining the structure of a multi-domain protein from the structures of its single domains is therefore an important step, and a necessary one for determining the full-length protein structure and understanding its biological function.
At present there are two general classes of methods for predicting the structure of a multi-domain protein from its single domains. The first class fixes the structures of the single domains and then docks them together. The second class assembles the whole multi-domain structure by enumerating conformations of the linker between domains. The first class treats the task as a protein-protein docking problem, so some docking methods can be used for multi-domain structure assembly. The second class, by contrast, treats the task as ab initio structure prediction for the relatively short amino acid sequence between domains; because only the conformation of the linker between domains changes, its sampling space is small. However, both classes lack template guidance, so the assembly orientation of the domains cannot be determined during assembly, which leads to low prediction accuracy.
Existing multi-domain protein structure assembly methods are therefore deficient in prediction accuracy and need to be improved.
Summary of the invention
To overcome the low prediction accuracy of existing multi-domain protein structure assembly methods, the present invention provides a template-based multi-domain protein structure assembly method with higher prediction accuracy.
The technical solution adopted by the present invention to solve this technical problem is as follows:
A template-based multi-domain protein structure assembly method, comprising the following steps:
1) Input the three-dimensional structure of each single domain and the sequence information of the corresponding multi-domain protein;
2) Set the maximum number of iterations $I_{max}$, the clash distance threshold $d_{clash}$, the contact distance threshold $d_{contact}$, and the contact-atom-count constant $n_0$;
3) For each multi-domain protein in the PDB (Protein Data Bank), perform the following operations to determine the assembly template:
3.1) Use the protein structure alignment tool TM-align to find the best alignment position of the first domain, and record its template alignment score TM-score$_1$;
3.2) Starting from the last residue position aligned to the first domain, use TM-align to find the best alignment position of the second domain, and record TM-score$_2$;
3.3) Repeat step 3.2) to find the best alignment positions of the remaining domains in order, and record TM-score$_3$, TM-score$_4$, …, TM-score$_N$, where N is the total number of domains;
3.4) Compute the score of the template from the per-domain alignment scores and domain lengths, where score$_i$ denotes the score of the t-th template, TM-score$_i$ denotes the alignment score of the i-th domain, and $L_i$ is the sequence length of the i-th domain;
3.5) After the score of every template has been computed by steps 3.1) to 3.4), select the protein with the highest score as the template;
4) Superimpose each domain onto the template as follows:
4.1) Match the Cα atoms of the query protein one-to-one with the Cα atoms of the template, then obtain the rotation matrix $(u_{st})$ and the translation vector $(t_1, t_2, t_3)$ by the Kabsch method, where $u_{st}$ (s = 1,2,3; t = 1,2,3) is the t-th element of row s of the rotation matrix and $t_s$ is the s-th component of the translation vector;
4.2) Apply the rotation and translation to each Cα atom $(x^{n}_{m1}, x^{n}_{m2}, x^{n}_{m3})$ of the query protein:
$$
\begin{cases}
x^{n}_{m1} = t_1 + u_{11} x^{n}_{m1} + u_{12} x^{n}_{m2} + u_{13} x^{n}_{m3} \\
x^{n}_{m2} = t_2 + u_{21} x^{n}_{m1} + u_{22} x^{n}_{m2} + u_{23} x^{n}_{m3} \\
x^{n}_{m3} = t_3 + u_{31} x^{n}_{m1} + u_{32} x^{n}_{m2} + u_{33} x^{n}_{m3}
\end{cases}
$$
where $x^{n}_{ms}$ denotes the s-th coordinate of the m-th Cα atom of the n-th domain;
5) Fix the position of the n-th domain and translate the (n+1)-th domain according to the following formula, so that the root-mean-square deviation (RMSD) between their connection points is 3.8 Å:
$$x^{n}_{ms} = x^{n}_{ms} + \left(x^{n}_{l_n s} - x^{n+1}_{1s}\right)\left(1 - 3.8/d_{n,n+1}\right)$$
where $l_n$ is the length of the n-th domain, $x^{n}_{l_n s}$ is the s-th coordinate of the last Cα atom of the n-th domain, $x^{n+1}_{1s}$ is the s-th coordinate of the first Cα atom of the (n+1)-th domain, and $d_{n,n+1}$ is the Euclidean distance between the last Cα atom of the n-th domain and the first Cα atom of the (n+1)-th domain;
6) Compute the root-mean-square deviation $E_{RMSD}$ of the Cα atoms between the current protein and the template;
7) Compute the Euclidean distances between the Cα atoms of the n-th domain and the Cα atoms of the (n+1)-th domain, count the number $n_{clash}$ of distances smaller than $d_{clash}$, record the corresponding distances, and compute the inter-domain clash score $E_{clash}$;
8) Count the number $n_{contact}$ of distances from step 7) that are smaller than $d_{contact}$, and compute the contact score $E_{contact}$;
9) Compute the energy of the current protein $E = w_1 E_{RMSD} + w_2 E_{clash} + w_3 E_{contact}$, where $w_1, w_2, w_3$ are the respective weights;
10) Determine the lowest-energy assembled structure by iterating the following operations:
10.1) Determine the rotation axis $(X_1, X_2, X_3)$: $X_3 = \theta$, where $\theta = 1 - 2\,\mathrm{rand}[0,1]$, $\varphi = 2\pi\,\mathrm{rand}[0,1]$, and rand[0,1] is a random number between 0 and 1;
10.2) Randomly generate the rotation angle $\gamma = 2\,\mathrm{rand}[0,1] - 1$ and the assembly translation vector $(T_1, T_2, T_3)$, where $T_s = 0.3\,(2\,\mathrm{rand}[0,1] - 1)$, s = 1,2,3;
10.3) Determine the assembly rotation matrix:
$$
\begin{cases}
U_{11} = X_1^2 (1-\alpha) + \alpha \\
U_{12} = X_1 X_2 (1-\alpha) - \beta X_3 \\
U_{13} = X_1 X_3 (1-\alpha) - \beta X_2 \\
U_{21} = X_1 X_2 (1-\alpha) - \beta X_3 \\
U_{22} = X_2^2 (1-\alpha) + \alpha \\
U_{23} = X_2 X_3 (1-\alpha) - \beta X_1 \\
U_{31} = X_1 X_3 (1-\alpha) - \beta X_2 \\
U_{32} = X_3 X_2 (1-\alpha) - \beta X_3 \\
U_{33} = X_3^2 (1-\alpha) + \alpha
\end{cases}
$$
where $\alpha = \cos\gamma$, $\beta = \sin\gamma$, and $U_{st}$ (s = 1,2,3; t = 1,2,3) is the t-th element of row s of the assembly rotation matrix;
10.4) Apply the rotation and translation to each Cα atom of the (n+1)-th domain, where $x^{n+1}_{1s}$ denotes the s-th coordinate of the first Cα atom of the (n+1)-th domain and $x^{n+1}_{ms}$ denotes the s-th coordinate of the m-th Cα atom of the (n+1)-th domain, s = 1,2,3;
10.5) Compute the energy of the current assembled structure by steps 6) to 9); if the energy decreases, accept the current assembled structure;
11) Repeat step 10) $I_{max}$ times; the structure from the last iteration is the assembled structure of the n-th and (n+1)-th domains;
12) After the (n+1)-th domain has been assembled, fix the structure of the first n+1 domains and assemble the (n+2)-th domain by steps 5) to 11); after all N domains have been assembled, output the final assembled structure.
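The superposition in step 4) can be illustrated with a minimal NumPy sketch of the Kabsch method. It assumes the Cα atoms have already been paired one-to-one and are supplied as two N×3 arrays; the function names `kabsch` and `superpose` are hypothetical and do not come from the patent.

```python
import numpy as np

def kabsch(query, template):
    """Return rotation U and translation t that best map query onto template.

    Both inputs are (N, 3) arrays of paired C-alpha coordinates (assumed input).
    """
    q_center = query.mean(axis=0)
    t_center = template.mean(axis=0)
    # 3x3 covariance matrix of the centered point sets
    H = (query - q_center).T @ (template - t_center)
    V, _, Wt = np.linalg.svd(H)
    # Correct for a possible reflection so that det(U) = +1
    d = np.sign(np.linalg.det(Wt.T @ V.T))
    U = Wt.T @ np.diag([1.0, 1.0, d]) @ V.T
    t = t_center - U @ q_center
    return U, t

def superpose(coords, U, t):
    """Apply x' = t + U x to every row of coords (the transform of step 4.2)."""
    return coords @ U.T + t
```

Calling `superpose(query, *kabsch(query, template))` returns the query coordinates in the template frame.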
The technical concept of the present invention is as follows. First, based on the structure of each domain, the protein structure alignment tool TM-align is used to align each domain in turn, and the best template is selected from a multi-domain protein database. Then, a rotation matrix and translation vector are obtained by the Kabsch method, each domain is superimposed onto the template, and translation and rotation operations are applied to each domain so that the distance between adjacent domains equals the minimum allowed distance. Next, the assembled structure is adjusted by random translations and rotations, and its quality is measured by the clash factor between domains, the number of interacting atoms, and the displacement of the assembled structure relative to the template. During assembly, adjacent domains are assembled in sequence and each assembled structure is then fixed; after all structures have been assembled, the final assembly result is output.
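The translation that brings two adjacent domains to the minimum allowed distance can be sketched in plain Python as follows. The sketch assumes each domain is a list of (x, y, z) Cα coordinates and uses the factor 3.8 from the translation formula in the claims; the function name `close_linker_gap` is hypothetical.

```python
import math

def close_linker_gap(prev_domain, next_domain, target=3.8):
    """Rigidly translate next_domain so that its first C-alpha lies `target`
    away from the last C-alpha of prev_domain, along the line joining them."""
    last = prev_domain[-1]
    first = next_domain[0]
    d = math.dist(last, first)
    if d == 0.0:
        # Direction is undefined when the two atoms coincide; leave unchanged
        return [tuple(atom) for atom in next_domain]
    scale = 1.0 - target / d
    shift = tuple((l - f) * scale for l, f in zip(last, first))
    # The same shift is applied to every atom, so the domain stays rigid
    return [tuple(c + s for c, s in zip(atom, shift)) for atom in next_domain]
```

Because every atom receives the same shift, the internal geometry of the moved domain is unchanged and only the gap between the connection points is adjusted.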
The beneficial effects of the present invention are as follows. The template guides the assembly and supplies the orientation information of the assembled structure, and the quality of the assembled structure is measured by the clash factor between domains, the interaction factor, and the deviation of the assembled structure from the template. The assembly is thereby guided, and the prediction accuracy for the whole protein is improved.
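As a rough illustration of how the three quality terms fit together, the sketch below counts inter-domain Cα pairs under a distance threshold and forms the weighted sum E = w1·E_RMSD + w2·E_clash + w3·E_contact. The functional forms of E_clash and E_contact are not recoverable from this text, so they are passed in as plain numbers; the function names are hypothetical, and the default thresholds and weights are the values quoted in the 3nd1A embodiment.

```python
import math

D_CLASH = 3.75             # clash distance threshold from the embodiment
D_CONTACT = 8.0            # contact distance threshold from the embodiment
WEIGHTS = (1.0, 1.0, 0.35) # w1, w2, w3 from the embodiment

def count_close_pairs(domain_a, domain_b, threshold):
    """Return (count, distances) for all inter-domain C-alpha pairs whose
    Euclidean distance is below `threshold`."""
    close = [
        d
        for a in domain_a
        for b in domain_b
        if (d := math.dist(a, b)) < threshold
    ]
    return len(close), close

def total_energy(e_rmsd, e_clash, e_contact, weights=WEIGHTS):
    """Weighted sum of the three terms; the terms themselves are inputs here."""
    w1, w2, w3 = weights
    return w1 * e_rmsd + w2 * e_clash + w3 * e_contact
```

`count_close_pairs(a, b, D_CLASH)` gives n_clash together with the recorded distances, and the same call with `D_CONTACT` gives n_contact.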
Brief description of the drawings
Fig. 1 is a flow chart of the template-based multi-domain protein structure assembly method.
Fig. 2 shows the result of assembling the multi-domain protein 3nd1A with the template-based multi-domain protein structure assembly method.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawings.
Referring to Figs. 1 and 2, the template-based multi-domain protein structure assembly method comprises steps 1) to 12) exactly as listed above.
This embodiment takes the multi-domain protein 3nd1A, with a sequence length of 244, as an example. The template-based multi-domain protein structure assembly method comprises the following steps:
1) Input the three-dimensional structure of each single domain and the sequence information of the corresponding multi-domain protein;
2) Set the maximum number of iterations $I_{max} = 30000$, the clash distance threshold $d_{clash} = 3.75$, the contact distance threshold $d_{contact} = 8$, and the contact-atom-count constant $n_0 = 87$;
3) For each multi-domain protein in the PDB (Protein Data Bank), perform the following operations to determine the assembly template:
3.1) Use the protein structure alignment tool TM-align to find the best alignment position of the first domain, and record its template alignment score TM-score$_1$;
3.2) Starting from the last residue position aligned to the first domain, use TM-align to find the best alignment position of the second domain, and record TM-score$_2$;
3.3) Repeat step 3.2) to find the best alignment positions of the remaining domains in order, and record TM-score$_3$, TM-score$_4$, …, TM-score$_N$, where N is the total number of domains;
3.4) Compute the score of the template from the per-domain alignment scores and domain lengths, where score$_i$ denotes the score of the t-th template, TM-score$_i$ denotes the alignment score of the i-th domain, and $L_i$ is the sequence length of the i-th domain;
3.5) After the score of every template has been computed by steps 3.1) to 3.4), select the protein with the highest score as the template;
4) Superimpose each domain onto the template as follows:
4.1) Match the Cα atoms of the query protein one-to-one with the Cα atoms of the template, then obtain the rotation matrix $(u_{st})$ and the translation vector $(t_1, t_2, t_3)$ by the Kabsch method, where $u_{st}$ (s = 1,2,3; t = 1,2,3) is the t-th element of row s of the rotation matrix and $t_s$ is the s-th component of the translation vector;
4.2) Apply the rotation and translation to each Cα atom $(x^{n}_{m1}, x^{n}_{m2}, x^{n}_{m3})$ of the query protein:
$$
\begin{cases}
x^{n}_{m1} = t_1 + u_{11} x^{n}_{m1} + u_{12} x^{n}_{m2} + u_{13} x^{n}_{m3} \\
x^{n}_{m2} = t_2 + u_{21} x^{n}_{m1} + u_{22} x^{n}_{m2} + u_{23} x^{n}_{m3} \\
x^{n}_{m3} = t_3 + u_{31} x^{n}_{m1} + u_{32} x^{n}_{m2} + u_{33} x^{n}_{m3}
\end{cases}
$$
where $x^{n}_{ms}$ denotes the s-th coordinate of the m-th Cα atom of the n-th domain;
5) Fix the position of the n-th domain and translate the (n+1)-th domain according to the following formula, so that the root-mean-square deviation (RMSD) between their connection points is 3.8 Å:
$$x^{n}_{ms} = x^{n}_{ms} + \left(x^{n}_{l_n s} - x^{n+1}_{1s}\right)\left(1 - 3.8/d_{n,n+1}\right)$$
where $l_n$ is the length of the n-th domain, $x^{n}_{l_n s}$ is the s-th coordinate of the last Cα atom of the n-th domain, $x^{n+1}_{1s}$ is the s-th coordinate of the first Cα atom of the (n+1)-th domain, and $d_{n,n+1}$ is the Euclidean distance between the last Cα atom of the n-th domain and the first Cα atom of the (n+1)-th domain;
6) Compute the root-mean-square deviation $E_{RMSD}$ of the Cα atoms between the current protein and the template;
7) Compute the Euclidean distances between the Cα atoms of the n-th domain and the Cα atoms of the (n+1)-th domain, count the number $n_{clash}$ of distances smaller than $d_{clash}$, record the corresponding distances, and compute the inter-domain clash score $E_{clash}$;
8) Count the number $n_{contact}$ of distances from step 7) that are smaller than $d_{contact}$, and compute the contact score $E_{contact}$;
9) Compute the energy of the current protein $E = w_1 E_{RMSD} + w_2 E_{clash} + w_3 E_{contact}$, where the weights are $w_1 = w_2 = 1$ and $w_3 = 0.35$;
10) Determine the lowest-energy assembled structure by iterating the following operations:
10.1) Determine the rotation axis $(X_1, X_2, X_3)$: $X_3 = \theta$, where $\theta = 1 - 2\,\mathrm{rand}[0,1]$, $\varphi = 2\pi\,\mathrm{rand}[0,1]$, and rand[0,1] is a random number between 0 and 1;
10.2) Randomly generate the rotation angle $\gamma = 2\,\mathrm{rand}[0,1] - 1$ and the assembly translation vector $(T_1, T_2, T_3)$, where $T_s = 0.3\,(2\,\mathrm{rand}[0,1] - 1)$, s = 1,2,3;
10.3) Determine the assembly rotation matrix:
$$
\begin{cases}
U_{11} = X_1^2 (1-\alpha) + \alpha \\
U_{12} = X_1 X_2 (1-\alpha) - \beta X_3 \\
U_{13} = X_1 X_3 (1-\alpha) - \beta X_2 \\
U_{21} = X_1 X_2 (1-\alpha) - \beta X_3 \\
U_{22} = X_2^2 (1-\alpha) + \alpha \\
U_{23} = X_2 X_3 (1-\alpha) - \beta X_1 \\
U_{31} = X_1 X_3 (1-\alpha) - \beta X_2 \\
U_{32} = X_3 X_2 (1-\alpha) - \beta X_3 \\
U_{33} = X_3^2 (1-\alpha) + \alpha
\end{cases}
$$
where $\alpha = \cos\gamma$, $\beta = \sin\gamma$, and $U_{st}$ (s = 1,2,3; t = 1,2,3) is the t-th element of row s of the assembly rotation matrix;
10.4) Apply the rotation and translation to each Cα atom of the (n+1)-th domain, where $x^{n+1}_{1s}$ denotes the s-th coordinate of the first Cα atom of the (n+1)-th domain and $x^{n+1}_{ms}$ denotes the s-th coordinate of the m-th Cα atom of the (n+1)-th domain, s = 1,2,3;
10.5) Compute the energy of the current assembled structure by steps 6) to 9); if the energy decreases, accept the current assembled structure;
11) Repeat step 10) $I_{max}$ times; the structure from the last iteration is the assembled structure of the n-th and (n+1)-th domains;
12) After the (n+1)-th domain has been assembled, fix the structure of the first n+1 domains and assemble the (n+2)-th domain by steps 5) to 11); after all N domains have been assembled, output the final assembled structure.
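The iteration in steps 10) and 11) can be sketched as a greedy random search. Several details below are assumptions rather than the patent's own formulas: the axis components X1 and X2 are missing from this text, so the usual uniform-sphere expressions are used; the rotation matrix is the standard Rodrigues axis-angle form, whose off-diagonal signs differ in places from the extracted formula; the angle is taken to be in radians; and the rotation pivots on the first Cα atom of the moving domain. All function names are hypothetical.

```python
import math
import random

def random_axis(rng):
    """Random unit vector: X3 = theta, X1 and X2 from phi (assumed form)."""
    theta = 1.0 - 2.0 * rng.random()
    phi = 2.0 * math.pi * rng.random()
    r = math.sqrt(1.0 - theta * theta)
    return (r * math.cos(phi), r * math.sin(phi), theta)

def rotation_matrix(axis, gamma):
    """Standard Rodrigues rotation about a unit axis by angle gamma."""
    x, y, z = axis
    a, b = math.cos(gamma), math.sin(gamma)
    c = 1.0 - a
    return (
        (x * x * c + a, x * y * c - b * z, x * z * c + b * y),
        (x * y * c + b * z, y * y * c + a, y * z * c - b * x),
        (x * z * c - b * y, y * z * c + b * x, z * z * c + a),
    )

def perturb(domain, rng):
    """One random rigid-body move: rotate about the first atom, then shift."""
    U = rotation_matrix(random_axis(rng), 2.0 * rng.random() - 1.0)
    shift = [0.3 * (2.0 * rng.random() - 1.0) for _ in range(3)]
    pivot = domain[0]
    moved = []
    for atom in domain:
        rel = [atom[k] - pivot[k] for k in range(3)]
        moved.append(tuple(
            shift[s] + pivot[s] + sum(U[s][k] * rel[k] for k in range(3))
            for s in range(3)
        ))
    return moved

def greedy_search(domain, energy, iterations, seed=0):
    """Keep a trial move only when it lowers the energy (step 10.5)."""
    rng = random.Random(seed)
    best, best_e = list(domain), energy(domain)
    for _ in range(iterations):
        trial = perturb(best, rng)
        trial_e = energy(trial)
        if trial_e < best_e:
            best, best_e = trial, trial_e
    return best, best_e
```

In the embodiment `iterations` would be 30000 and `energy` would be the weighted sum of step 9); any callable that maps a coordinate list to a number works for experimentation.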
Taking the two-domain protein 3nd1A, with a sequence length of 244, as an example, the above method produced a near-native conformation of the protein; the root-mean-square deviation is [value missing in source] and the TM-score is 0.997. The predicted structure is shown in Fig. 2.
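For reference, the root-mean-square deviation quoted above is conventionally computed over paired Cα atoms after the two structures have been superimposed. A minimal sketch, assuming both structures are already in the same frame and given as equal-length lists of (x, y, z) tuples (the function name `rmsd` is illustrative):

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))
```

The same function can serve as the E_RMSD term when the second argument is the template rather than the native structure.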
The above describes the optimization results obtained by the present invention using the 3nd1A protein as an example, and does not limit the scope of implementation of the invention. Various modifications and improvements made without departing from the scope of the basic content of the invention shall not be excluded from its scope of protection.
Claims (1)
1. A template-based multi-domain protein structure assembly method, characterized in that the multi-domain protein structure assembly comprises the following steps:
1) Input the three-dimensional structure of each single domain and the sequence information of the corresponding multi-domain protein;
2) Set the maximum number of iterations $I_{max}$, the clash distance threshold $d_{clash}$, the contact distance threshold $d_{contact}$, and the contact-atom-count constant $n_0$;
3) For each multi-domain protein in the PDB, perform the following operations to determine the assembly template:
3.1) Use the protein structure alignment tool TM-align to find the best alignment position of the first domain, and record its template alignment score TM-score$_1$;
3.2) Starting from the last residue position aligned to the first domain, use TM-align to find the best alignment position of the second domain, and record TM-score$_2$;
3.3) Repeat step 3.2) to find the best alignment positions of the remaining domains in order, and record TM-score$_3$, TM-score$_4$, …, TM-score$_N$, where N is the total number of domains;
3.4) Compute the score of the template from the per-domain alignment scores and domain lengths, where score$_i$ denotes the score of the t-th template, TM-score$_i$ denotes the alignment score of the i-th domain, and $L_i$ is the sequence length of the i-th domain;
3.5) After the score of every template has been computed by steps 3.1) to 3.4), select the protein with the highest score as the template;
4) Superimpose each domain onto the template as follows:
4.1) Match the Cα atoms of the query protein one-to-one with the Cα atoms of the template, then obtain the rotation matrix $(u_{st})$ and the translation vector $(t_1, t_2, t_3)$ by the Kabsch method, where $u_{st}$ (s = 1,2,3; t = 1,2,3) is the t-th element of row s of the rotation matrix and $t_s$ is the s-th component of the translation vector;
4.2) Apply the following rotation and translation to each Cα atom $(x^{n}_{m1}, x^{n}_{m2}, x^{n}_{m3})$ of the query protein:
$$
\begin{cases}
x^{n}_{m1} = t_1 + u_{11} x^{n}_{m1} + u_{12} x^{n}_{m2} + u_{13} x^{n}_{m3} \\
x^{n}_{m2} = t_2 + u_{21} x^{n}_{m1} + u_{22} x^{n}_{m2} + u_{23} x^{n}_{m3} \\
x^{n}_{m3} = t_3 + u_{31} x^{n}_{m1} + u_{32} x^{n}_{m2} + u_{33} x^{n}_{m3}
\end{cases}
$$
where $x^{n}_{ms}$ denotes the s-th coordinate of the m-th Cα atom of the n-th domain;
5) Fix the position of the n-th domain and translate the (n+1)-th domain according to the following formula, so that the root-mean-square deviation (RMSD) between their connection points is 3.8 Å:
$$x^{n}_{ms} = x^{n}_{ms} + \left(x^{n}_{l_n s} - x^{n+1}_{1s}\right)\left(1 - 3.8/d_{n,n+1}\right)$$
where $l_n$ is the length of the n-th domain, $x^{n}_{l_n s}$ is the s-th coordinate of the last Cα atom of the n-th domain, $x^{n+1}_{1s}$ is the s-th coordinate of the first Cα atom of the (n+1)-th domain, and $d_{n,n+1}$ is the Euclidean distance between the last Cα atom of the n-th domain and the first Cα atom of the (n+1)-th domain;
6) Compute the root-mean-square deviation $E_{RMSD}$ of the Cα atoms between the current protein and the template;
7) Compute the pairwise Euclidean distances between the Cα atoms of the n-th domain and the Cα atoms of the (n+1)-th domain, count the number $n_{clash}$ of distances smaller than $d_{clash}$, record the corresponding distances, and compute the inter-domain clash score $E_{clash}$;
8) Count the number $n_{contact}$ of distances from step 7) that are smaller than $d_{contact}$, and compute the contact score $E_{contact}$;
9) Compute the energy of the current protein $E = w_1 E_{RMSD} + w_2 E_{clash} + w_3 E_{contact}$, where $w_1, w_2, w_3$ are the respective weights;
10) Determine the lowest-energy assembled structure by iterating the following operations:
10.1) Determine the rotation axis $(X_1, X_2, X_3)$: $X_3 = \theta$, where $\theta = 1 - 2\,\mathrm{rand}[0,1]$, $\varphi = 2\pi\,\mathrm{rand}[0,1]$, and rand[0,1] is a random number between 0 and 1;
10.2) Randomly generate the rotation angle $\gamma = 2\,\mathrm{rand}[0,1] - 1$ and the assembly translation vector $(T_1, T_2, T_3)$, where $T_s = 0.3\,(2\,\mathrm{rand}[0,1] - 1)$, s = 1,2,3;
10.3) Determine the assembly rotation matrix:
$$
\begin{cases}
U_{11} = X_1^2 (1-\alpha) + \alpha \\
U_{12} = X_1 X_2 (1-\alpha) - \beta X_3 \\
U_{13} = X_1 X_3 (1-\alpha) - \beta X_2 \\
U_{21} = X_1 X_2 (1-\alpha) - \beta X_3 \\
U_{22} = X_2^2 (1-\alpha) + \alpha \\
U_{23} = X_2 X_3 (1-\alpha) - \beta X_1 \\
U_{31} = X_1 X_3 (1-\alpha) - \beta X_2 \\
U_{32} = X_3 X_2 (1-\alpha) - \beta X_3 \\
U_{33} = X_3^2 (1-\alpha) + \alpha
\end{cases}
$$
where $\alpha = \cos\gamma$, $\beta = \sin\gamma$, and $U_{st}$ (s = 1,2,3; t = 1,2,3) is the t-th element of row s of the assembly rotation matrix;
10.4) Apply the following rotation and translation to each Cα atom of the (n+1)-th domain:
$$
\left\{
\begin{aligned}
x_{m1}^{n+1} &= T_1 + x_{11}^{n+1} + (x_{m1}^{n+1} - x_{11}^{n+1})\,U_{11} + (x_{m2}^{n+1} - x_{11}^{n+1})\,U_{12} + (x_{m3}^{n+1} - x_{11}^{n+1})\,U_{13} \\
x_{m2}^{n+1} &= T_2 + x_{12}^{n+1} + (x_{m1}^{n+1} - x_{12}^{n+1})\,U_{21} + (x_{m2}^{n+1} - x_{12}^{n+1})\,U_{22} + (x_{m3}^{n+1} - x_{12}^{n+1})\,U_{23} \\
x_{m3}^{n+1} &= T_3 + x_{13}^{n+1} + (x_{m1}^{n+1} - x_{13}^{n+1})\,U_{31} + (x_{m2}^{n+1} - x_{13}^{n+1})\,U_{32} + (x_{m3}^{n+1} - x_{13}^{n+1})\,U_{33}
\end{aligned}
\right.
$$
where $x_{1s}^{n+1}$ denotes the s-th coordinate of the first Cα atom of the (n+1)-th domain protein, s = 1, 2, 3, and $x_{ms}^{n+1}$ denotes the s-th coordinate of the m-th Cα atom of the (n+1)-th domain protein, s = 1, 2, 3;
10.5) Compute the energy of the current assembled structure according to steps 6)-9); if the energy decreases, accept the current assembled structure;
11) Repeat step 10) I_max times; the structure obtained last is the assembled structure of the n-th domain protein and the (n+1)-th domain protein;
12) After the (n+1)-th domain protein has been assembled, fix the structure of the first n+1 domain proteins and assemble the (n+2)-th domain protein according to steps 5)-11); after all N domain proteins have been assembled, output the final assembled structure.
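The rigid-body move of step 10.4) and the accept-if-lower loop of steps 10.5)-11) can be illustrated with a minimal Python sketch. It rests on several assumptions: the rotation matrix is the textbook axis-angle (Rodrigues) form, whose off-diagonal terms may differ from the matrix as printed in the claim; the rotation subtracts the first Cα atom component-wise, which is the conventional reading of step 10.4) rather than a literal transcription; and the energy of steps 6)-9) is not reproduced, so `toy_energy` is only a placeholder. All function and parameter names (`rotation_matrix`, `transform`, `assemble`, `toy_energy`, `max_shift`, `max_angle`, `seed`) are hypothetical and do not come from the patent.

```python
import math
import random


def rotation_matrix(axis, gamma):
    """Textbook axis-angle rotation matrix for a rotation by gamma about `axis`."""
    norm = math.sqrt(sum(v * v for v in axis))
    x1, x2, x3 = (v / norm for v in axis)
    a, b = math.cos(gamma), math.sin(gamma)  # alpha and beta in the claim
    c = 1.0 - a
    return [
        [x1 * x1 * c + a, x1 * x2 * c - b * x3, x1 * x3 * c + b * x2],
        [x2 * x1 * c + b * x3, x2 * x2 * c + a, x2 * x3 * c - b * x1],
        [x3 * x1 * c - b * x2, x3 * x2 * c + b * x1, x3 * x3 * c + a],
    ]


def transform(coords, rot, shift):
    """Rotate every C-alpha atom about the first atom, then translate by `shift`."""
    pivot = coords[0]
    moved = []
    for atom in coords:
        d = [atom[t] - pivot[t] for t in range(3)]
        moved.append(tuple(
            shift[s] + pivot[s] + sum(rot[s][t] * d[t] for t in range(3))
            for s in range(3)
        ))
    return moved


def assemble(coords, energy, i_max, max_shift=1.0, max_angle=math.pi / 18, seed=0):
    """Greedy random search: keep a trial move only if the energy decreases."""
    rng = random.Random(seed)
    current, current_e = list(coords), energy(coords)
    for _ in range(i_max):
        axis = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        if sum(v * v for v in axis) < 1e-12:
            continue  # degenerate axis, skip this trial
        gamma = rng.uniform(-max_angle, max_angle)
        shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
        trial = transform(current, rotation_matrix(axis, gamma), shift)
        trial_e = energy(trial)
        if trial_e < current_e:
            current, current_e = trial, trial_e
    return current, current_e


def toy_energy(coords, target=(5.0, 0.0, 0.0)):
    """Placeholder score (assumption): squared distance of the first atom from `target`."""
    return sum((coords[0][k] - target[k]) ** 2 for k in range(3))
```

Calling `assemble(start_coords, toy_energy, 200)` returns the last accepted coordinates and their score. Because each move is a rigid rotation plus translation, the distances between the Cα atoms of the moved domain are preserved.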
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710256156.XA CN107180164B (en) | 2017-04-19 | 2017-04-19 | Template-based multi-domain protein structure assembly method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107180164A (en) | 2017-09-19 |
CN107180164B (en) | 2020-02-21 |
Family
ID=59831420
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710256156.XA Active CN107180164B (en) | 2017-04-19 | 2017-04-19 | Template-based multi-domain protein structure assembly method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107180164B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108062457A (en) * | 2018-01-15 | 2018-05-22 | 浙江工业大学 | Protein structure prediction method for structure feature vector auxiliary selection |
CN108763870A (en) * | 2018-05-09 | 2018-11-06 | 浙江工业大学 | Construction method of multi-domain protein Linker |
CN110164506A (en) * | 2019-04-19 | 2019-08-23 | 浙江工业大学 | Multi-domain protein structure assembly method based on inter-domain residue contacts |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080003631A1 (en) * | 2006-06-28 | 2008-01-03 | Weis Robert M | Methods and materials for in vitro analysis and/or use of membrane-associated proteins, portions thereof or variants thereof |
CN103566363A (en) * | 2013-09-23 | 2014-02-12 | 中国人民解放军第三军医大学第一附属医院 | Preparation method of contraceptive microneedle |
CN105354441A (en) * | 2015-10-23 | 2016-02-24 | 上海交通大学 | Vegetable protein interaction network construction method |
CN105808972A (en) * | 2016-03-11 | 2016-07-27 | 浙江工业大学 | Method for predicting protein structure from local to global on basis of knowledge spectrum |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108062457A (en) * | 2018-01-15 | 2018-05-22 | 浙江工业大学 | Protein structure prediction method for structure feature vector auxiliary selection |
CN108062457B (en) * | 2018-01-15 | 2021-06-18 | 浙江工业大学 | Protein structure prediction method for structure feature vector auxiliary selection |
CN108763870A (en) * | 2018-05-09 | 2018-11-06 | 浙江工业大学 | Construction method of multi-domain protein Linker |
CN108763870B (en) * | 2018-05-09 | 2021-08-03 | 浙江工业大学 | Construction method of multi-domain protein Linker |
CN110164506A (en) * | 2019-04-19 | 2019-08-23 | 浙江工业大学 | Multi-domain protein structure assembly method based on inter-domain residue contacts |
Also Published As
Publication number | Publication date |
---|---|
CN107180164B (en) | 2020-02-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US10621140B2 (en) | Systems and methods for improving the performance of a quantum processor via reduced readouts | |
CN106778059B (en) | Group protein structure prediction method based on Rosetta local enhancement | |
Mel'nik et al. | A new list of OB associations in our galaxy | |
CN107180164A (en) | Template-based multi-domain protein structure assembly method | |
CN106951736B (en) | Protein secondary structure prediction method based on multiple evolution matrices | |
CN106096328B (en) | Double-layer differential evolution protein structure prediction method based on a local Lipschitz function supporting surface | |
Zheng et al. | Automated protein fold determination using a minimal NMR constraint strategy | |
CN114503203A (en) | Protein structure prediction from amino acid sequences using self-attention neural networks | |
Zhang et al. | Similarity metric method for binary basic blocks of cross-instruction set architecture | |
CN103605711A (en) | Construction method and device, classification method and device of support vector machine | |
CN108681697A (en) | Feature selection method and device | |
CN106055920A (en) | Method for predicting protein structure based on phased multi-strategy copy exchange | |
Zhang et al. | CP-NAS: Child-parent neural architecture search for 1-bit CNNs | |
CN109360599A (en) | Protein structure prediction method based on a residue contact information crossover strategy | |
Buyukkurt et al. | Compiler generated systolic arrays for wavefront algorithm acceleration on FPGAs | |
CN109086565A (en) | Protein structure prediction method based on contact constraints between residues | |
Biswas et al. | Improved efficiency in cryo-EM secondary structure topology determination from inaccurate data | |
CN109033753A (en) | Group protein structure prediction method based on secondary structure fragment assembly | |
CN109346128A (en) | Protein structure prediction method based on a residue information dynamic selection strategy | |
Cvitaš et al. | Quantum dynamics in water clusters | |
CN108763860A (en) | Group protein conformation space optimization method based on Loop information sampling | |
Górecki et al. | Deep coalescence reconciliation with unrooted gene trees: Linear time algorithms | |
Xu et al. | A computational method for NMR-constrained protein threading | |
CN100428254C (en) | Cross reaction antigen computer-aided screening method | |
CN108920894A (en) | Protein conformation space optimization method based on brief abstract convex estimation | |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |