CN107180164A - Template-based multi-domain protein structure assembly method - Google Patents

Template-based multi-domain protein structure assembly method

Publication number
CN107180164A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710256156.XA
Other languages
Chinese (zh)
Other versions
CN107180164B (en)
Inventor
张贵军
周晓根
郝小虎
王柳静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Application filed by Zhejiang University of Technology ZJUT
Priority to CN201710256156.XA
Publication of CN107180164A
Application granted
Publication of CN107180164B
Legal status: Active

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment

Landscapes

  • Spectroscopy & Molecular Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

A template-based multi-domain protein structure assembly method. First, based on the structure of each domain protein, the protein structure alignment tool TM-align is used to align each domain protein in turn, and the best template is selected from a multi-domain protein database. Next, a rotation matrix and a translation vector are obtained with the Kabsch method, each domain protein is superposed onto the template, and a translation and rotation are applied to each domain protein so that the distance between adjacent domains equals the minimum allowable distance. The assembled structure is then adjusted by random translations and rotations, and its quality is measured by the clash factor between domain proteins, the number of interacting atoms, and the displacement of the assembled structure relative to the template. During assembly, adjacent domain proteins are assembled one after another and each assembled structure is then fixed; after all domains have been assembled, the final assembly is output. The invention provides a template-based multi-domain protein structure assembly method with higher prediction accuracy.

Description

Template-based multi-domain protein structure assembly method
Technical field
The present invention relates to the fields of bioinformatics, intelligent optimization, and computer applications, and in particular to a template-based multi-domain protein structure assembly method.
Background art
Large proteins are usually composed of multiple independently folding domain proteins, and determining the structure of multi-domain proteins can strongly advance biological research. Domain proteins usually have a compact three-dimensional structure and a specific biological function, and the same domain protein may be combined in different ways. In addition, the three-dimensional structures of the individual domains of many multi-domain proteins have been determined by X-ray diffraction, nuclear magnetic resonance, or computational prediction. Obtaining the structure of a multi-domain protein from the structures of its single-domain proteins is therefore an important step, and a necessary link in determining the full-length protein structure and understanding its biological function.
At present there are two general classes of methods for predicting the structure of a multi-domain protein from single-domain proteins. The first class fixes the structures of the single-domain proteins and then docks and assembles them. The second class assembles the whole multi-domain protein by enumerating conformations of the linker between domain proteins. The first class treats the task as a protein-protein docking problem, so some docking algorithms can be used for multi-domain protein structure assembly. The second class, in contrast, treats it as an ab initio structure prediction problem for the relatively short amino acid sequence between domain proteins; because only the conformation of the linker is changed, the sampling space is very small. However, both classes lack template guidance, so the assembly orientation of the domain proteins cannot be determined during assembly, which leads to low prediction accuracy.
Existing multi-domain protein structure assembly methods are therefore deficient in prediction accuracy and need to be improved.
Summary of the invention
To overcome the insufficient prediction accuracy of existing multi-domain protein structure assembly methods, the present invention provides a template-based multi-domain protein structure assembly method with higher prediction accuracy.
The technical solution adopted by the present invention to solve this technical problem is as follows:
A template-based multi-domain protein structure assembly method, comprising the following steps:
1) input the three-dimensional structure of each single-domain protein and the sequence information of the corresponding multi-domain protein;
2) set the maximum number of iterations $I_{max}$, the clash distance threshold $d_{clash}$, the interaction distance threshold $d_{contact}$, and the interacting-atom count constant $n_0$;
3) perform the following operations for each multi-domain protein in the PDB (Protein Data Bank) to determine the assembly template:
3.1) use the protein structure alignment tool TM-align to find the best alignment position of the first domain protein, and record its template alignment score $TM\text{-}score_1$;
3.2) starting from the last residue position to which the first domain protein was aligned, use TM-align to find the best alignment position of the second domain protein, and record $TM\text{-}score_2$;
3.3) repeat step 3.2) to find the best alignment positions of the remaining domain proteins in turn, and record $TM\text{-}score_3, TM\text{-}score_4, \ldots, TM\text{-}score_N$, where $N$ is the total number of domain proteins;
3.4) compute the score $score_t$ of the template, where $score_t$ denotes the score of the $t$-th template, $TM\text{-}score_i$ the alignment score of the $i$-th domain protein, and $L_i$ the sequence length of the $i$-th domain protein;
3.5) after the score of every template has been computed by steps 3.1) to 3.4), select the protein with the highest score as the template;
4) superpose each domain protein onto the template as follows:
4.1) pair the Cα atoms of the query protein one by one with the aligned Cα atoms of the template, then obtain the rotation matrix $(u_{st})$ and the translation vector $(t_1, t_2, t_3)$ by the Kabsch method, where $u_{st}$ ($s = 1,2,3$; $t = 1,2,3$) is the element in row $s$ and column $t$ of the rotation matrix and $t_s$ is the $s$-th component of the translation vector;
4.2) apply the rotation and translation to each Cα atom $(x^{n}_{m1}, x^{n}_{m2}, x^{n}_{m3})$ of the query protein:
$$
\begin{cases}
x^{n}_{m1} = t_1 + u_{11}x^{n}_{m1} + u_{12}x^{n}_{m2} + u_{13}x^{n}_{m3} \\
x^{n}_{m2} = t_2 + u_{21}x^{n}_{m1} + u_{22}x^{n}_{m2} + u_{23}x^{n}_{m3} \\
x^{n}_{m3} = t_3 + u_{31}x^{n}_{m1} + u_{32}x^{n}_{m2} + u_{33}x^{n}_{m3}
\end{cases}
$$
where $x^{n}_{ms}$ is the $s$-th coordinate of the $m$-th Cα atom of the $n$-th domain protein;
5) fix the position of the $n$-th domain protein and translate the $(n+1)$-th domain protein according to the following formula, so that the distance between their connection points becomes 3.8 Å:
$$
x^{n+1}_{ms} = x^{n+1}_{ms} + \left(x^{n}_{l_n s} - x^{n+1}_{1s}\right)\left(1 - 3.8/d_{n,n+1}\right)
$$
where $l_n$ is the length of the $n$-th domain protein, $x^{n}_{l_n s}$ is the $s$-th coordinate of the last Cα atom of the $n$-th domain protein, $x^{n+1}_{1s}$ is the $s$-th coordinate of the first Cα atom of the $(n+1)$-th domain protein, and $d_{n,n+1}$ is the Euclidean distance between the last Cα atom of the $n$-th domain protein and the first Cα atom of the $(n+1)$-th domain protein;
6) compute the root-mean-square deviation $E_{RMSD}$ of the Cα atoms between the current protein and the template;
7) compute the Euclidean distances between the Cα atoms of the $n$-th domain protein and the Cα atoms of the $(n+1)$-th domain protein, count the number $n_{clash}$ of distances below $d_{clash}$, record the corresponding distances, and compute the inter-domain clash score $E_{clash}$;
8) count the number $n_{contact}$ of distances from step 7) that are below $d_{contact}$, and compute the interaction score $E_{contact}$;
9) compute the energy of the current protein, $E = w_1E_{RMSD} + w_2E_{clash} + w_3E_{contact}$, where $w_1, w_2, w_3$ are the respective weights;
10) determine the lowest-energy assembled structure by iterating the following operations:
10.1) determine the rotation axis $(X_1, X_2, X_3)$, where $X_3 = \theta$, $\theta = 1 - 2\,\mathrm{rand}[0,1]$, $\phi = 2\pi\,\mathrm{rand}[0,1]$, and $\mathrm{rand}[0,1]$ is a random real number between 0 and 1;
10.2) randomly generate the rotation angle $\gamma = 2\,\mathrm{rand}[0,1] - 1$ and the assembly translation vector $(T_1, T_2, T_3)$, where $T_s = 0.3\,(2\,\mathrm{rand}[0,1] - 1)$, $s = 1,2,3$;
10.3) determine the assembly rotation matrix:
$$
\begin{cases}
U_{11} = X_1^2(1-\alpha) + \alpha \\
U_{12} = X_1X_2(1-\alpha) - \beta X_3 \\
U_{13} = X_1X_3(1-\alpha) - \beta X_2 \\
U_{21} = X_1X_2(1-\alpha) - \beta X_3 \\
U_{22} = X_2^2(1-\alpha) + \alpha \\
U_{23} = X_2X_3(1-\alpha) - \beta X_1 \\
U_{31} = X_1X_3(1-\alpha) - \beta X_2 \\
U_{32} = X_3X_2(1-\alpha) - \beta X_3 \\
U_{33} = X_3^2(1-\alpha) + \alpha
\end{cases}
$$
where $\alpha = \cos\gamma$, $\beta = \sin\gamma$, and $U_{st}$ ($s = 1,2,3$; $t = 1,2,3$) is the element in row $s$ and column $t$ of the assembly rotation matrix;
10.4) apply the rotation, taken about the first Cα atom of the $(n+1)$-th domain protein, and the translation to each Cα atom of the $(n+1)$-th domain protein, where $x^{n+1}_{1s}$ is the $s$-th coordinate of the first Cα atom of the $(n+1)$-th domain protein and $x^{n+1}_{ms}$ is the $s$-th coordinate of its $m$-th Cα atom, $s = 1,2,3$;
10.5) compute the energy of the current assembled structure by steps 6) to 9); if the energy decreases, accept the current assembled structure;
11) repeat step 10) $I_{max}$ times; the final structure is the assembled structure of the $n$-th and $(n+1)$-th domain proteins;
12) after the $(n+1)$-th domain protein has been assembled, fix the structure of the first $n+1$ domain proteins and assemble the $(n+2)$-th domain protein by steps 5) to 11); after all $N$ domain proteins have been assembled, output the final assembled structure.
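The superposition in step 4) can be sketched as follows. This is a minimal illustration, assuming the paired Cα coordinates are already available as NumPy arrays of shape (M, 3); the function names `kabsch` and `superpose` are hypothetical and are not taken from the patent.

```python
import numpy as np

def kabsch(query: np.ndarray, template: np.ndarray):
    """Return (U, t) minimising sum ||U @ q_i + t - p_i||^2 over paired points."""
    q_center = query.mean(axis=0)
    p_center = template.mean(axis=0)
    # 3x3 covariance matrix between the centred point sets
    H = (query - q_center).T @ (template - p_center)
    V, _, Wt = np.linalg.svd(H)
    # Flip the last axis if needed so that U is a proper rotation (det = +1)
    d = -1.0 if np.linalg.det(Wt.T @ V.T) < 0 else 1.0
    U = Wt.T @ np.diag([1.0, 1.0, d]) @ V.T
    t = p_center - U @ q_center
    return U, t

def superpose(coords: np.ndarray, U: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Apply x' = t + U x to every row of coords, as in step 4.2)."""
    return coords @ U.T + t
```

Here `U` plays the role of the rotation matrix $(u_{st})$ and `t` that of the translation vector $(t_1, t_2, t_3)$; which residues are paired is decided by the TM-align alignment and is outside the scope of this sketch.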
The technical concept of the present invention is as follows. First, based on the structure of each domain protein, the protein structure alignment tool TM-align is used to align each domain protein in turn, and the best template is selected from a multi-domain protein database. Next, a rotation matrix and a translation vector are obtained with the Kabsch method, each domain protein is superposed onto the template, and a translation and rotation are applied to each domain protein so that the distance between adjacent domains equals the minimum allowable distance. The assembled structure is then adjusted by random translations and rotations, and its quality is measured by the clash factor between domain proteins, the number of interacting atoms, and the displacement of the assembled structure relative to the template. During assembly, adjacent domain proteins are assembled one after another and each assembled structure is then fixed; after all domains have been assembled, the final assembly is output.
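The random adjustment stage (steps 10 and 11) can be sketched as a greedy search over rigid-body moves. This is an illustration under stated assumptions: the axis components $X_1 = \sqrt{1-\theta^2}\cos\phi$ and $X_2 = \sqrt{1-\theta^2}\sin\phi$ are assumed (the usual way to sample a uniform direction), the rotation uses the standard axis-angle (Rodrigues) matrix, in which transposed off-diagonal entries carry opposite sine signs, and the function names and the `energy` callback are hypothetical.

```python
import numpy as np

def random_axis(rng):
    """Random unit vector (X1, X2, X3); the form of X1 and X2 is an assumption."""
    x3 = 1.0 - 2.0 * rng.random()            # theta in step 10.1
    phi = 2.0 * np.pi * rng.random()
    r = np.sqrt(1.0 - x3 * x3)
    return np.array([r * np.cos(phi), r * np.sin(phi), x3])

def rotation_matrix(axis, gamma):
    """Standard axis-angle (Rodrigues) rotation matrix for a unit axis."""
    x, y, z = axis
    a, b = np.cos(gamma), np.sin(gamma)
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return a * np.eye(3) + (1.0 - a) * np.outer(axis, axis) + b * K

def perturb(coords, rng, max_shift=0.3):
    """Rotate about the first C-alpha atom, then translate (steps 10.2 to 10.4)."""
    gamma = 2.0 * rng.random() - 1.0          # rotation angle in [-1, 1)
    T = max_shift * (2.0 * rng.random(3) - 1.0)
    U = rotation_matrix(random_axis(rng), gamma)
    pivot = coords[0]
    return (coords - pivot) @ U.T + pivot + T

def assemble(coords, energy, n_iter, seed=0):
    """Keep a trial move only if it lowers the energy (steps 10.5 and 11)."""
    rng = np.random.default_rng(seed)
    best, best_e = coords, energy(coords)
    for _ in range(n_iter):
        trial = perturb(best, rng)
        e = energy(trial)
        if e < best_e:
            best, best_e = trial, e
    return best, best_e
```

In use, `coords` would hold the Cα coordinates of the $(n+1)$-th domain and `energy` would evaluate $E$ against the fixed domains and the template; because only improving moves are kept, the search is a simple descent rather than a Metropolis scheme.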
The beneficial effects of the present invention are as follows: assembly is guided by a template, which provides orientation information for the assembled structure, and the quality of the assembled structure is measured by the clash factor between domain proteins, the interaction factor, and the deviation of the assembled structure from the template. This guides the assembly and thereby improves the prediction accuracy for the whole protein.
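The counting behind the clash and interaction factors (steps 7 and 8) and the weighted sum of step 9 can be sketched as below. The closed forms of $E_{clash}$ and $E_{contact}$ are not reproduced here, so the sketch only counts atom pairs and takes the three energy terms as inputs; the function names are hypothetical, and the default weights are the values $w_1 = w_2 = 1$, $w_3 = 0.35$ given in the embodiment.

```python
import numpy as np

def pair_distances(domain_a: np.ndarray, domain_b: np.ndarray) -> np.ndarray:
    """All C-alpha to C-alpha distances between two domains, shape (len_a, len_b)."""
    diff = domain_a[:, None, :] - domain_b[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def count_below(dist: np.ndarray, threshold: float) -> int:
    """Number of atom pairs closer than the threshold (n_clash or n_contact)."""
    return int((dist < threshold).sum())

def total_energy(e_rmsd, e_clash, e_contact, w=(1.0, 1.0, 0.35)):
    """E = w1*E_RMSD + w2*E_clash + w3*E_contact."""
    return w[0] * e_rmsd + w[1] * e_clash + w[2] * e_contact
```

Calling `count_below(dist, d_clash)` and `count_below(dist, d_contact)` on the same distance matrix gives $n_{clash}$ and $n_{contact}$ without recomputing distances.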
Brief description of the drawings
Fig. 1 is a flow chart of the template-based multi-domain protein structure assembly method.
Fig. 2 shows the result of assembling the multi-domain protein 3nd1A with the template-based multi-domain protein structure assembly method.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawings.
Referring to Fig. 1 and Fig. 2, the template-based multi-domain protein structure assembly method comprises steps 1) to 12) exactly as set out in the summary of the invention above.
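The translation in step 5) can be illustrated with a short sketch. It assumes Cα coordinates stored as NumPy arrays of shape (length, 3) and applies the shift to every atom of the $(n+1)$-th domain; the function name `close_junction` is hypothetical.

```python
import numpy as np

def close_junction(prev_domain: np.ndarray, next_domain: np.ndarray,
                   target: float = 3.8) -> np.ndarray:
    """Translate next_domain so its first C-alpha lies `target` from the last C-alpha of prev_domain."""
    last = prev_domain[-1]                    # last C-alpha of domain n
    first = next_domain[0]                    # first C-alpha of domain n+1
    d = np.linalg.norm(last - first)          # d_{n,n+1}; must be non-zero
    shift = (last - first) * (1.0 - target / d)
    return next_domain + shift                # rigid translation, no rotation
```

Because the same shift is added to every atom, the internal geometry of the moved domain is unchanged and only the junction distance is altered.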
This embodiment takes the multi-domain protein 3nd1A, whose sequence length is 244, as an example. Steps 1) to 12) of the template-based multi-domain protein structure assembly method are carried out as set out above, with the following parameter values:
2) the maximum number of iterations is $I_{max} = 30000$, the clash distance threshold is $d_{clash} = 3.75$, the interaction distance threshold is $d_{contact} = 8$, and the interacting-atom count constant is $n_0 = 87$;
9) the weights in the energy $E = w_1E_{RMSD} + w_2E_{clash} + w_3E_{contact}$ are $w_1 = w_2 = 1$ and $w_3 = 0.35$.
Taking the multi-domain protein 3nd1A, which has a sequence length of 244 and contains two domains, as an example, the above method yields a near-native conformation of the protein with a root-mean-square deviation of [value missing in the source] and a TM-score of 0.997; the predicted structure is shown in Fig. 2.
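For reference, the root-mean-square deviation used both as the $E_{RMSD}$ term and as an accuracy measure can be computed as in the following sketch. It assumes the two coordinate sets are already superposed and list the same Cα atoms in the same order; the function name `ca_rmsd` is hypothetical.

```python
import numpy as np

def ca_rmsd(model: np.ndarray, reference: np.ndarray) -> float:
    """RMSD over paired C-alpha atoms; both arrays have shape (M, 3)."""
    diff = model - reference
    # Mean of squared per-atom distances, then square root
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))
```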
The above describes the results obtained by the present invention using the protein 3nd1A as an example and does not limit the scope of the invention. Modifications and improvements made without departing from the substance of the invention are not excluded from its scope of protection.

Claims (1)

1. A template-based multi-domain protein structure assembly method, characterized in that the method comprises the following steps:
1) input the three-dimensional structure of each single-domain protein and the sequence information of the corresponding multi-domain protein;
2) set the maximum number of iterations $I_{max}$, the clash distance threshold $d_{clash}$, the interaction distance threshold $d_{contact}$, and the interacting-atom count constant $n_0$;
3) perform the following operations for each multi-domain protein in the PDB to determine the assembly template:
3.1) use the protein structure alignment tool TM-align to find the best alignment position of the first domain protein, and record its template alignment score $TM\text{-}score_1$;
3.2) starting from the last residue position to which the first domain protein was aligned, use TM-align to find the best alignment position of the second domain protein, and record $TM\text{-}score_2$;
3.3) repeat step 3.2) to find the best alignment positions of the remaining domain proteins in turn, and record $TM\text{-}score_3, TM\text{-}score_4, \ldots, TM\text{-}score_N$, where $N$ is the total number of domain proteins;
3.4) compute the score $score_t$ of the template, where $score_t$ denotes the score of the $t$-th template, $TM\text{-}score_i$ the alignment score of the $i$-th domain protein, and $L_i$ the sequence length of the $i$-th domain protein;
3.5) after the score of every template has been computed by steps 3.1) to 3.4), select the protein with the highest score as the template;
4) superpose each domain protein onto the template as follows:
4.1) pair the Cα atoms of the query protein one by one with the aligned Cα atoms of the template, then obtain the rotation matrix $(u_{st})$ and the translation vector $(t_1, t_2, t_3)$ by the Kabsch method, where $u_{st}$ ($s = 1,2,3$; $t = 1,2,3$) is the element in row $s$ and column $t$ of the rotation matrix and $t_s$ is the $s$-th component of the translation vector;
4.2) apply the rotation and translation to each Cα atom $(x^{n}_{m1}, x^{n}_{m2}, x^{n}_{m3})$ of the query protein:
$$
\begin{cases}
x^{n}_{m1} = t_1 + u_{11}x^{n}_{m1} + u_{12}x^{n}_{m2} + u_{13}x^{n}_{m3} \\
x^{n}_{m2} = t_2 + u_{21}x^{n}_{m1} + u_{22}x^{n}_{m2} + u_{23}x^{n}_{m3} \\
x^{n}_{m3} = t_3 + u_{31}x^{n}_{m1} + u_{32}x^{n}_{m2} + u_{33}x^{n}_{m3}
\end{cases}
$$
where $x^{n}_{ms}$ is the $s$-th coordinate of the $m$-th Cα atom of the $n$-th domain protein;
5) fix the position of the $n$-th domain protein and translate the $(n+1)$-th domain protein according to the following formula, so that the distance between their connection points becomes 3.8 Å:
$$
x^{n+1}_{ms} = x^{n+1}_{ms} + \left(x^{n}_{l_n s} - x^{n+1}_{1s}\right)\left(1 - 3.8/d_{n,n+1}\right)
$$
where $l_n$ is the length of the $n$-th domain protein, $x^{n}_{l_n s}$ is the $s$-th coordinate of the last Cα atom of the $n$-th domain protein, $x^{n+1}_{1s}$ is the $s$-th coordinate of the first Cα atom of the $(n+1)$-th domain protein, and $d_{n,n+1}$ is the Euclidean distance between the last Cα atom of the $n$-th domain protein and the first Cα atom of the $(n+1)$-th domain protein;
6) compute the root-mean-square deviation $E_{RMSD}$ of the Cα atoms between the current protein and the template;
7) compute the pairwise Euclidean distances between the Cα atoms of the $n$-th domain protein and the Cα atoms of the $(n+1)$-th domain protein, count the number $n_{clash}$ of distances below $d_{clash}$, record the corresponding distances, and compute the inter-domain clash score $E_{clash}$;
8) count the number $n_{contact}$ of distances from step 7) that are below $d_{contact}$, and compute the interaction score $E_{contact}$;
9) compute the energy of the current protein, $E = w_1E_{RMSD} + w_2E_{clash} + w_3E_{contact}$, where $w_1, w_2, w_3$ are the respective weights;
10) determine the lowest-energy assembled structure by iterating the following operations:
10.1) determine the rotation axis $(X_1, X_2, X_3)$, where $X_3 = \theta$, $\theta = 1 - 2\,\mathrm{rand}[0,1]$, $\phi = 2\pi\,\mathrm{rand}[0,1]$, and $\mathrm{rand}[0,1]$ is a random real number between 0 and 1;
10.2) randomly generate the rotation angle $\gamma = 2\,\mathrm{rand}[0,1] - 1$ and the assembly translation vector $(T_1, T_2, T_3)$, where $T_s = 0.3\,(2\,\mathrm{rand}[0,1] - 1)$, $s = 1,2,3$;
10.3) determine the assembly rotation matrix:
$$
\begin{cases}
U_{11} = X_1^2(1-\alpha) + \alpha \\
U_{12} = X_1X_2(1-\alpha) - \beta X_3 \\
U_{13} = X_1X_3(1-\alpha) - \beta X_2 \\
U_{21} = X_1X_2(1-\alpha) - \beta X_3 \\
U_{22} = X_2^2(1-\alpha) + \alpha \\
U_{23} = X_2X_3(1-\alpha) - \beta X_1 \\
U_{31} = X_1X_3(1-\alpha) - \beta X_2 \\
U_{32} = X_3X_2(1-\alpha) - \beta X_3 \\
U_{33} = X_3^2(1-\alpha) + \alpha
\end{cases}
$$
where α = cos γ, β = sin γ, and $U_{st}$ denotes the element in row s, column t of the assembly rotation matrix, s = 1, 2, 3, t = 1, 2, 3;
10.4) Apply the rotation and translation to each Cα atom of the (n+1)-th domain protein:
$$
\begin{cases}
x_{m1}^{n+1} = T_1 + x_{11}^{n+1} + (x_{m1}^{n+1} - x_{11}^{n+1}) U_{11} + (x_{m2}^{n+1} - x_{11}^{n+1}) U_{12} + (x_{m3}^{n+1} - x_{11}^{n+1}) U_{13} \\
x_{m2}^{n+1} = T_2 + x_{12}^{n+1} + (x_{m1}^{n+1} - x_{12}^{n+1}) U_{21} + (x_{m2}^{n+1} - x_{12}^{n+1}) U_{22} + (x_{m3}^{n+1} - x_{12}^{n+1}) U_{23} \\
x_{m3}^{n+1} = T_3 + x_{13}^{n+1} + (x_{m1}^{n+1} - x_{13}^{n+1}) U_{31} + (x_{m2}^{n+1} - x_{13}^{n+1}) U_{32} + (x_{m3}^{n+1} - x_{13}^{n+1}) U_{33}
\end{cases}
$$
where $x_{1s}^{n+1}$ denotes the s-th coordinate of the first Cα atom of the (n+1)-th domain protein, s = 1, 2, 3, and $x_{ms}^{n+1}$ denotes the s-th coordinate of the m-th Cα atom of the (n+1)-th domain protein, s = 1, 2, 3;
10.5) Calculate the energy of the current assembled structure according to steps 6)-9); if the energy decreases, accept the current assembled structure;
11) Repeat step 10) $I_{max}$ times; the structure obtained last is the assembled structure of the n-th domain protein and the (n+1)-th domain protein;
12) After the (n+1)-th domain protein has been assembled, fix the structure of the first n+1 domain proteins and assemble the (n+2)-th domain protein according to steps 5)-11); once all N domain proteins have been assembled, output the final assembled structure.
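Steps 10.3) to 12) can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the random sampling of the rotation axis, angle γ and translation T, and the `toy_energy` function are all hypothetical stand-ins, since the sampling steps and the energy terms of steps 6)-9) are defined elsewhere. The rotation matrix uses the standard axis-angle (Rodrigues) form, whose off-diagonal signs differ in places from the formulas as printed in the text, and each coordinate offset is taken relative to the matching coordinate of the first Cα atom.

```python
import math
import random

def rotation_matrix(axis, gamma):
    """Standard axis-angle rotation matrix for a unit axis (X1, X2, X3)."""
    x1, x2, x3 = axis
    a, b = math.cos(gamma), math.sin(gamma)  # alpha and beta in the text
    c = 1.0 - a
    return [
        [x1 * x1 * c + a,      x1 * x2 * c - b * x3, x1 * x3 * c + b * x2],
        [x1 * x2 * c + b * x3, x2 * x2 * c + a,      x2 * x3 * c - b * x1],
        [x1 * x3 * c - b * x2, x2 * x3 * c + b * x1, x3 * x3 * c + a],
    ]

def transform_domain(coords, U, T):
    """Rotate each C-alpha atom about the first atom, then translate by T."""
    pivot = coords[0]
    moved = []
    for atom in coords:
        d = [atom[k] - pivot[k] for k in range(3)]
        moved.append(tuple(
            T[s] + pivot[s] + sum(U[s][t] * d[t] for t in range(3))
            for s in range(3)
        ))
    return moved

def random_unit_vector(rng):
    """Hypothetical axis sampler: normalised Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]

def toy_energy(fixed, mobile, target=10.0):
    """Hypothetical stand-in for the energy of steps 6)-9)."""
    cf = [sum(a[k] for a in fixed) / len(fixed) for k in range(3)]
    cm = [sum(a[k] for a in mobile) / len(mobile) for k in range(3)]
    return (math.dist(cf, cm) - target) ** 2

def assemble_pair(fixed, mobile, energy, i_max, rng, max_shift=1.0):
    """Steps 10)-11): keep a trial placement only if the energy decreases."""
    best = list(mobile)
    best_e = energy(fixed, best)
    for _ in range(i_max):
        U = rotation_matrix(random_unit_vector(rng), rng.uniform(0.0, 2.0 * math.pi))
        T = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
        trial = transform_domain(best, U, T)
        trial_e = energy(fixed, trial)
        if trial_e < best_e:  # step 10.5: accept only on energy decrease
            best, best_e = trial, trial_e
    return best, best_e

def assemble_all(domains, energy, i_max, rng):
    """Step 12): fix the assembled domains and add the next one in turn."""
    assembled = list(domains[0])
    for mobile in domains[1:]:
        placed, _ = assemble_pair(assembled, mobile, energy, i_max, rng)
        assembled.extend(placed)
    return assembled
```

Calling `assemble_all(domains, toy_energy, i_max, random.Random(0))` with a list of per-domain Cα coordinate lists returns one combined coordinate list. Because every move is a rigid transformation, the internal geometry of each domain is unchanged; only its placement relative to the already fixed domains varies.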
CN201710256156.XA 2017-04-19 2017-04-19 Template-based multi-domain protein structure assembly method Active CN107180164B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710256156.XA CN107180164B (en) 2017-04-19 2017-04-19 Template-based multi-domain protein structure assembly method

Publications (2)

Publication Number Publication Date
CN107180164A (en) 2017-09-19
CN107180164B (en) 2020-02-21

Family

ID=59831420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710256156.XA Active CN107180164B (en) 2017-04-19 2017-04-19 Template-based multi-domain protein structure assembly method

Country Status (1)

Country Link
CN (1) CN107180164B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062457A (en) * 2018-01-15 2018-05-22 浙江工业大学 Protein structure prediction method for structure feature vector auxiliary selection
CN108763870A (en) * 2018-05-09 2018-11-06 浙江工业大学 Construction method of multi-domain protein Linker
CN110164506A (en) * 2019-04-19 2019-08-23 浙江工业大学 Multi-domain protein structure assembly method based on inter-domain residue contacts

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080003631A1 (en) * 2006-06-28 2008-01-03 Weis Robert M Methods and materials for in vitro analysis and/or use of membrane-associated proteins, portions thereof or variants thereof
CN103566363A (en) * 2013-09-23 2014-02-12 中国人民解放军第三军医大学第一附属医院 Preparation method of contraceptive microneedle
CN105354441A (en) * 2015-10-23 2016-02-24 上海交通大学 Vegetable protein interaction network construction method
CN105808972A (en) * 2016-03-11 2016-07-27 浙江工业大学 Method for predicting protein structure from local to global on basis of knowledge spectrum

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062457A (en) * 2018-01-15 2018-05-22 浙江工业大学 Protein structure prediction method for structure feature vector auxiliary selection
CN108062457B (en) * 2018-01-15 2021-06-18 浙江工业大学 Protein structure prediction method for structure feature vector auxiliary selection
CN108763870A (en) * 2018-05-09 2018-11-06 浙江工业大学 Construction method of multi-domain protein Linker
CN108763870B (en) * 2018-05-09 2021-08-03 浙江工业大学 Construction method of multi-domain protein Linker
CN110164506A (en) * 2019-04-19 2019-08-23 浙江工业大学 Multi-domain protein structure assembly method based on inter-domain residue contacts

Also Published As

Publication number Publication date
CN107180164B (en) 2020-02-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant