CN109448786B - Method for predicting protein structure by lower bound estimation dynamic strategy - Google Patents
- Publication number: CN109448786B
- Application number: CN201810994693.9A
- Authority: CN (China)
- Prior art keywords
- conformation
- population
- randomly selecting
- segment
- trial
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
A method for predicting protein structure by a lower-bound estimation dynamic strategy. Under the framework of an evolutionary algorithm, two strategy pools are first established, each containing three different mutation strategies, and the pool whose strategies are applied is chosen according to the generation number. Next, a mutated conformation is selected according to a lower-bound estimation function. Finally, the conformation is accepted or rejected using the Rosetta energy function score3 and the Monte Carlo Boltzmann acceptance criterion. The invention provides a lower-bound estimation dynamic strategy protein structure prediction method with higher sampling efficiency and prediction accuracy.
Description
Technical Field
The invention relates to the fields of bioinformatics and computer applications, and in particular to a method for predicting protein structure by a lower-bound estimation dynamic strategy.
Background
Protein molecules play a crucial role in the biochemical reactions of living cells. Their structural models and biologically active states are of great importance for understanding and curing various diseases. A protein can perform its characteristic biological function only by folding into a specific three-dimensional structure. To understand the function of a protein, it is therefore necessary to obtain its three-dimensional structure.
Protein tertiary structure prediction is an important task in bioinformatics. The most challenging problem in protein conformation optimization is searching the complex energy surface of the protein energy model: the finer the model, the more detailed the knowledge it can provide, and the more computing resources it requires.
The rapid development of computer hardware and software provides a solid platform for protein tertiary structure prediction. Advances and breakthroughs in de novo protein structure prediction have in turn drawn in researchers from computer science and evolutionary computation. De novo prediction works directly from a physics-based or knowledge-based protein energy model and uses an optimization algorithm to search the conformational space for the global minimum-energy conformation. The conformational space optimization method is currently one of the most critical factors limiting the accuracy of de novo protein structure prediction, and many optimization methods have been applied to the problem. Applying an optimization algorithm to the de novo sampling process requires addressing three problems first: (1) the complexity of the energy model; (2) the high dimensionality of the energy model; and (3) the inaccuracy of the energy model. To date, no force field is accurate enough to guide the target sequence to fold in the correct direction, so the mathematically optimal solution does not necessarily correspond to the native structure of the target protein; moreover, model inaccuracy also makes it impossible to assess the performance of an optimization algorithm objectively.
The differential evolution (DE) algorithm has been applied successfully to protein structure prediction owing to its simple structure, ease of implementation, robustness, and fast convergence. However, as the amino acid sequence grows longer, the degrees of freedom of the protein molecular system increase, and obtaining the global optimum of a large-scale protein conformational space by sampling with a traditional population-based algorithm becomes challenging. In addition, a coarse-grained model reduces the conformational search space but also loses information about the interaction forces, which directly affects prediction accuracy.
Conventional protein structure prediction methods therefore fall short in sampling efficiency and prediction accuracy and need to be improved.
Disclosure of Invention
To overcome the low sampling efficiency, poor population diversity, and low prediction accuracy of conventional protein structure prediction methods, the invention introduces a dynamic mutation strategy to guide conformational space optimization under the framework of a basic differential evolution algorithm, and provides a lower-bound estimation dynamic strategy protein structure prediction method with high sampling efficiency and high prediction accuracy.
The technical scheme adopted by the invention to solve the technical problem is as follows:
A method for predicting protein structure by a lower-bound estimation dynamic strategy, comprising the following steps:
1) Provide the sequence information of the target protein;
2) Obtain the fragment library files from the ROBETTA server (http://www.robetta.org/) according to the target protein sequence, comprising a 3-fragment library file and a 9-fragment library file;
3) Set the parameters: population size NP, maximum number of generations G, crossover factor CR, temperature factor β, and slope control factor M; initialize the generation counter g = 0;
4) Population initialization: generate NP initial conformations C_i, i = {1, 2, …, NP}, by random fragment assembly;
5) For each conformation C_i, i = {1, 2, …, NP}, combine the three-dimensional coordinates of every Cα atom into the position coordinate vector of the conformation, whose j-th element is the j-th dimension of the spatial position coordinate of the i-th conformation, j = 1, 2, …, 3·len, where len is the length of the protein sequence;
6) For each individual C_i in the population, perform the following operations:
6.1) Set C_i as the target individual. If g = 0 or g is even, perform steps 6.2) to 6.4); otherwise perform steps 6.5) to 6.7). Either branch generates C_trial1, C_trial2, and C_trial3;
6.2) Randomly select three mutually different individuals C_a, C_b, and C_c from the population; randomly select one 3-fragment from each of C_a and C_b, at different positions, and use them to replace the fragments at the corresponding positions of C_c, generating the mutated conformation C_trial1;
6.3) Randomly select four mutually different individuals C_a, C_b, C_c, and C_d from the population; randomly select one 3-fragment from each of C_a, C_b, and C_c, at different positions, and use them to replace the fragments at the corresponding positions of C_d, generating the mutated conformation C_trial2;
6.4) Randomly select two mutually different individuals C_a and C_b from the population; randomly select one 3-fragment from each of C_a and C_b, at different positions, and use them to replace the fragments at the corresponding positions of the target individual, generating the mutated conformation C_trial3;
6.5) Randomly select from the population a conformation C_SL whose energy is lower than that of the target individual; if the target individual is the lowest-energy conformation in the population, select C_SL at random from the whole population. Then randomly select two mutually different conformations C_a and C_b from the whole population; randomly select one 3-fragment from each of C_a and C_b, at different positions, and use them to replace the fragments at the corresponding positions of C_SL, generating the mutated conformation C_trial1;
6.6) Randomly select from the population a conformation C_SL whose energy is lower than that of the target individual; if the target individual is the lowest-energy conformation in the population, select C_SL at random from the whole population. Then randomly select one conformation C_a from the whole population; randomly select one 3-fragment from each of C_a and C_SL, at different positions, and use them to replace the fragments at the corresponding positions of the target individual, generating the mutated conformation C_trial2;
6.7) Randomly select from the population a conformation C_SL whose energy is lower than that of the target individual; if the target individual is the lowest-energy conformation in the population, select C_SL at random from the whole population. Then randomly select two mutually different conformations C_a and C_b from the whole population; randomly select one 3-fragment from each of C_SL and C_b, at different positions, and use them to replace the fragments at the corresponding positions of C_a, generating the mutated conformation C_trial3;
6.8) Find the individuals C_near1, C_near2, and C_near3 in the population that are closest to C_trial1, C_trial2, and C_trial3, respectively, and combine the three-dimensional coordinates of every Cα atom of each of these conformations into its position coordinate vector, giving the position coordinates of C_trial1, C_trial2, C_trial3 and of C_near1, C_near2, C_near3;
6.9) If g = 0, compute the energies score3(C_trial1), score3(C_trial2), and score3(C_trial3) with the Rosetta score3 energy function, select the conformation with the lowest energy as C_trial, and record its spatial position coordinates; compute the Euclidean distance between C_trial and every conformation in the population, find the nearest conformation C_near, and record its spatial position coordinates;
6.10) If the energy of C_trial is lower than that of the target individual, C_trial replaces the target individual; otherwise C_trial is accepted with a Boltzmann probability according to the Monte Carlo criterion;
6.11) If g > 0, compute the lower bound estimates UE_trial1, UE_trial2, and UE_trial3 of C_trial1, C_trial2, and C_trial3 by equation (1); select the conformation with the smallest lower bound estimate as C_trial, denote its lower bound estimate as UE_trial, and record its spatial position coordinates; compute the Euclidean distance between C_trial and every conformation in the population, find the nearest conformation C_near, and record its spatial position coordinates;
6.12) If the lower bound estimate UE_trial is greater than the energy of the target individual, C_trial is rejected; otherwise compute the energy score3(C_trial). If score3(C_trial) is lower than the energy of the target individual, C_trial replaces the target individual; otherwise C_trial is accepted with a Boltzmann probability according to the Monte Carlo criterion;
7) Set g = g + 1 and repeat steps 6) to 7) until g > G;
8) Output the conformation with the lowest energy as the final result.
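The two strategy pools of steps 6.2) to 6.7) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: each conformation is represented as a plain list of per-residue entries, a 3-fragment is taken to be three consecutive residues, and "replacing the fragment at the corresponding position" is read as copying the donor's residues into a copy of the receiver at the same indices. The function names are hypothetical, and any additional distinctness constraints between the selected individuals and the target are not enforced here.

```python
import random

FRAG_LEN = 3  # assumption: a "3-fragment" is three consecutive residues


def exchange_fragments(donors, receiver, rng):
    """Copy one 3-fragment from each donor into a copy of the receiver.

    Every donor uses a different start position, and its fragment overwrites
    the receiver's residues at that same position.
    """
    trial = list(receiver)
    starts = rng.sample(range(len(receiver) - FRAG_LEN + 1), len(donors))
    for donor, start in zip(donors, starts):
        trial[start:start + FRAG_LEN] = donor[start:start + FRAG_LEN]
    return trial


def pool_one(population, target_idx, rng):
    """Steps 6.2) to 6.4): donors and receivers are drawn at random."""
    a, b, c = rng.sample(population, 3)
    trial1 = exchange_fragments([a, b], c, rng)
    a, b, c, d = rng.sample(population, 4)
    trial2 = exchange_fragments([a, b, c], d, rng)
    a, b = rng.sample(population, 2)
    trial3 = exchange_fragments([a, b], population[target_idx], rng)
    return trial1, trial2, trial3


def pick_lower_energy(population, energies, target_idx, rng):
    """Pick C_SL: a random conformation with lower energy than the target,
    or any random conformation if the target is already the best."""
    better = [i for i, e in enumerate(energies) if e < energies[target_idx]]
    idx = rng.choice(better) if better else rng.randrange(len(population))
    return population[idx]


def pool_two(population, energies, target_idx, rng):
    """Steps 6.5) to 6.7): a lower-energy conformation C_SL guides mutation."""
    target = population[target_idx]
    sl = pick_lower_energy(population, energies, target_idx, rng)
    a, b = rng.sample(population, 2)
    trial1 = exchange_fragments([a, b], sl, rng)
    sl = pick_lower_energy(population, energies, target_idx, rng)
    a = rng.choice(population)
    trial2 = exchange_fragments([a, sl], target, rng)
    sl = pick_lower_energy(population, energies, target_idx, rng)
    a, b = rng.sample(population, 2)
    trial3 = exchange_fragments([sl, b], a, rng)
    return trial1, trial2, trial3


def generate_trials(population, energies, target_idx, g, rng):
    """Step 6.1): pool one when g is 0 or even, pool two otherwise."""
    if g % 2 == 0:
        return pool_one(population, target_idx, rng)
    return pool_two(population, energies, target_idx, rng)
```

Working on copies keeps the population unchanged until a trial is actually accepted, which matches the separation between the mutation steps and the selection steps.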
The technical concept of the invention is as follows: under the framework of an evolutionary algorithm, two strategy pools are first established, each containing three different mutation strategies, and the pool whose strategies are applied is chosen according to the generation number; next, a mutated conformation is selected according to a lower-bound estimation function; finally, the conformation is accepted or rejected using the Rosetta energy function score3 and the Monte Carlo Boltzmann acceptance criterion.
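The selection stage can be illustrated as follows. The sketch rests on assumptions that should be read as such: the lower-bound formula is not reproduced in this text, so a Lipschitz-style estimate, energy of the nearest conformation minus the slope M times the Euclidean distance, is used as a stand-in and may differ from equation (1) of the patent. The rejection test (estimate above the target energy) and the acceptance probability exp(-β·ΔE) are likewise a plausible reading, not a confirmed one. All function names are hypothetical.

```python
import math
import random


def lower_bound_estimate(x_trial, x_near, energy_near, slope):
    """Assumed Lipschitz-style bound: E(near) - slope * ||x_trial - x_near||."""
    return energy_near - slope * math.dist(x_trial, x_near)


def metropolis_accept(e_trial, e_target, beta, rng):
    """Accept a lower energy outright, a higher one with Boltzmann probability."""
    if e_trial < e_target:
        return True
    return rng.random() < math.exp(-beta * (e_trial - e_target))


def try_replace(x_trial, x_near, e_near, e_target, slope, beta, energy_fn, rng):
    """Screen the trial with the cheap bound before paying for energy_fn."""
    estimate = lower_bound_estimate(x_trial, x_near, e_near, slope)
    if estimate > e_target:
        # Even the optimistic estimate is worse than the target: reject
        # without evaluating the expensive energy function.
        return False
    return metropolis_accept(energy_fn(x_trial), e_target, beta, rng)
```

The design point is that `energy_fn` (a stand-in for score3) is only called for trials that pass the screen, which is how a lower-bound estimate can reduce the number of energy evaluations.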
The beneficial effects of the invention are as follows: selecting the mutation strategies from different strategy pools according to the generation number of the population improves population diversity and also addresses the low sampling efficiency of traditional evolutionary algorithms; using the lower-bound estimation function to assist conformation selection improves selection efficiency, mitigates the prediction error caused by an inaccurate energy function, and thereby improves prediction accuracy.
Drawings
FIG. 1 shows the distribution of conformations obtained when the lower-bound estimation dynamic strategy protein structure prediction method samples protein 1GB1.
FIG. 2 is a schematic diagram of the conformation updates when the lower-bound estimation dynamic strategy protein structure prediction method samples protein 1GB1.
FIG. 3 shows the three-dimensional structure of protein 1GB1 predicted by the lower-bound estimation dynamic strategy protein structure prediction method.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to FIG. 1 to FIG. 3, a method for predicting protein structure by a lower-bound estimation dynamic strategy comprises steps 1) to 8) exactly as set out under Disclosure of Invention above.
Taking the α/β protein 1GB1, with a sequence length of 56, as an example, the method is applied by carrying out steps 1) to 8) as described above, with the parameters in step 3) set as follows: population size NP = 100, maximum number of generations G = 1000, crossover factor CR = 0.5, temperature factor β = 2, slope control factor M = 10000, and generation counter g = 0.
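The example settings for 1GB1 can be gathered in a small configuration object, together with the pool schedule they imply. This is a sketch only: the class and field names are hypothetical, and the schedule assumes that the loop runs for g = 0, 1, …, G inclusive, which is one reading of "until g > G" in step 7).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSettings:
    """Example settings from step 3) of the 1GB1 run (names are hypothetical)."""
    population_size: int = 100             # NP
    max_generations: int = 1000            # G
    crossover_factor: float = 0.5          # CR
    temperature_factor: float = 2.0        # beta
    slope_control_factor: float = 10000.0  # M


def pool_for_generation(g: int) -> int:
    """Return 1 for steps 6.2)-6.4) (g is 0 or even), 2 for steps 6.5)-6.7)."""
    return 1 if g % 2 == 0 else 2


def pool_schedule(settings: RunSettings) -> dict:
    """Count how often each strategy pool is used for g = 0, 1, ..., G."""
    counts = {1: 0, 2: 0}
    for g in range(settings.max_generations + 1):
        counts[pool_for_generation(g)] += 1
    return counts
```

With the example values, the two pools alternate every generation, so each pool drives roughly half of the run.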
Taking the α/β protein 1GB1, with a sequence length of 56, as an example, the method was used to obtain near-native conformations of the protein. The result was evaluated by the mean and the minimum root-mean-square deviation (RMSD) between the structures obtained after running 1000 generations and the native-state structure. The predicted three-dimensional structure is shown in FIG. 3.
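For readers who want to reproduce this kind of evaluation, the sketch below computes the RMSD between two sets of Cα coordinates after optimal superposition with the Kabsch algorithm, using NumPy. The text does not say how the RMSD was computed, so the superposition step is an assumption, and the function name is hypothetical.

```python
import numpy as np


def kabsch_rmsd(p, q):
    """RMSD between two (n, 3) coordinate arrays after optimal superposition."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Remove the translation by centring both point sets.
    p = p - p.mean(axis=0)
    q = q - q.mean(axis=0)
    # The optimal rotation follows from the SVD of the covariance matrix.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    diff = p @ rotation - q
    return float(np.sqrt((diff ** 2).sum() / len(p)))
```

Given a list of predicted structures, the mean and minimum values are then `np.mean(values)` and `min(values)` over `values = [kabsch_rmsd(s, native) for s in structures]`.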
The foregoing describes one embodiment of the invention. The invention is not limited to the embodiment described above and may be practiced with various modifications that do not depart from its essential spirit.
Claims (1)
1. A method for predicting a protein structure by a lower bound estimation dynamic strategy is characterized by comprising the following steps: the method comprises the following steps:
1) sequence information for a given protein of interest;
2) obtaining fragment library files from a ROBETTA server according to a target protein sequence, wherein the fragment library files comprise 3 fragment library files and 9 fragment library files;
3) setting parameters: the method comprises the following steps of (1) setting a population size NP, a maximum iteration algebra G of an algorithm, a temperature factor beta and a slope control factor M to be an iteration algebra G equal to 0;
4) population initialization: random fragment assembly to generate NP initial conformations Ci,i={1,2,…,NP};
5) Each conformation CiAre combined into position coordinates of the conformation A j-th dimension element representing a spatial position coordinate of an i-th conformation, j being 1, 2.., 3len, len being a length of a protein sequence;
6) for each individual in the population CiThe following operations are carried out:
6.1) mixing CiSet as a target individualIf g is 0 or even, then steps 6.2) to 6.4) are performed, otherwise steps 6.5) to 6.7) are performed, generating Ctrial1、Ctrial2、Ctrial3;
6.2) randomly selecting three individuals C different from each other in the populationa1、Cb1And Cc1,Respectively from Ca1、Cb1Randomly selecting a 3-segment with different positions, and respectively replacing the 3-segment with Cc1Fragments of the corresponding positions generate a mutated conformation Ctrial1;
6.3) randomly selecting four mutually different individuals C in the populationa2、Cb2、Cc2And Cd2,Respectively from Ca2、Cb2、Cc2Randomly selecting a 3-segment with different positions, and respectively replacing the 3-segment with Cd2Fragments of the corresponding positions generate a mutated conformation Ctrial2;
6.4) randomly selecting two mutually different individuals C in the populationa3、Cb3,Respectively from Ca3、Cb3In the method, a 3-segment with different positions is randomly selected and respectively replaced toFragments of the corresponding positions generate a mutated conformation Ctrial3;
6.5) randomly selecting an energy ratio from the populationLow conformation CSLIf, ifA conformation C is randomly selected from the whole population as the lowest energy conformation in the populationSLThen randomly selecting two mutually unequal conformations C from the whole populationa4And Cb4And is andrespectively from Ca4、Cb4Randomly selecting a 3-segment with different positions, and respectively replacing the 3-segment with CSLFragments of the corresponding positions generate a mutated conformation Ctrial1;
6.6) randomly selecting an energy ratio from the populationLow conformation CSLIf, ifA conformation C is randomly selected from the whole population as the lowest energy conformation in the populationSLThen randomly selecting a conformation C from the whole populationa5And is andrespectively from Ca5、CSLIn the method, a 3-segment with different positions is randomly selected and respectively replaced toFragments of the corresponding positions generate a mutated conformation Ctrial2;
6.7) randomly selecting an energy ratio from the populationLow conformation CSLIf, ifA conformation C is randomly selected from the whole population as the lowest energy conformation in the populationSLThen randomly selecting two mutually unequal conformations C from the whole populationa6And Cb6And is andrespectively from CSL、Cb6Randomly selecting a 3-segment with different positions, and respectively replacing the 3-segment with Ca6 Fragments of the corresponding positions generate a mutated conformation Ctrial3;
6.8) finding the distance C from the populationtrial1、Ctrial2、Ctrial3Recent individual Cnear1、Cnear2、Cnear3Respectively combining the three-dimensional coordinates of each carbon alpha atom of the corresponding conformation into the position coordinates of the conformation, then Ctrial1、Ctrial2、Ctrial3And Cnear1、Cnear2、Cnear3Respectively are
6.9) if g is 0, C is calculated using Rosetta score3 energy function respectivelytrial1、Ctrial2、Ctrial3Energy score3 (C)trial1)、score3(Ctrial2) And score3 (C)trial3) And selecting the conformation with the smallest energy as CtrialAnd recording the spatial position coordinates thereof asCalculating CtrialThe Euclidean distance from each conformation in the population is found, and the conformation C closest to the Euclidean distance is foundnearAnd recording the space position coordinates thereof as
6.10) ifThen C istrialReplacement ofOtherwise according to probabilityReceiving a constellation using Monte Carlo criteria;
6.11) if g>0, calculating C by equation (1) respectivelytrial1、Ctrial2、Ctrial3Lower bound estimation UEtrial1、UEtrial2、UEtrial3;
The conformation with the smallest lower bound estimate was selected as CtrialThe corresponding lower bound estimate is denoted as UEtrialAnd recording the spatial position coordinates thereof asCalculating CtrialThe Euclidean distance from each conformation in the population is found, and the conformation C closest to the Euclidean distance is foundnearAnd recording the space position coordinates thereof as
6.12) ifThen C istrialIs rejected, otherwise C is calculatedtrialEnergy value score of (C) 3 (C)trial) If, ifThen C istrialReplacement ofOtherwise according to probabilityReceiving a constellation using Monte Carlo criteria;
7) g = g + 1; iteratively executing steps 6) to 7) until g > G;
8) outputting the conformation with the lowest energy as the final result.
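The building blocks of steps 6.7) to 6.11) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a conformation is represented as a plain list of per-residue values, the function names (`swap_fragments`, `nearest_index`, `select_trial`, `metropolis_accept`) are hypothetical, the energy and lower-bound functions are passed in as callables instead of calling Rosetta score3 or equation (1), and the acceptance rule is the standard Metropolis form with an assumed temperature factor `kT`, which may differ from the probability used in the claims.

```python
import math
import random


def swap_fragments(base, donors, frag_len=3, rng=random):
    """Copy `base` and overwrite one fragment per donor (cf. step 6.7).

    Each donor contributes the fragment at its own start position; the start
    positions are distinct, but overlapping windows are not excluded here.
    """
    trial = list(base)
    starts = rng.sample(range(len(base) - frag_len + 1), len(donors))
    for donor, s in zip(donors, starts):
        trial[s:s + frag_len] = donor[s:s + frag_len]
    return trial


def nearest_index(trial_coords, population_coords):
    """Index of the population member closest to the trial (cf. step 6.8).

    Coordinates are flattened C-alpha positions: [x1, y1, z1, x2, y2, z2, ...].
    """
    return min(range(len(population_coords)),
               key=lambda i: math.dist(trial_coords, population_coords[i]))


def select_trial(trials, g, energy, lower_bound):
    """Pick the trial with the lowest energy when g == 0 (step 6.9),
    otherwise the one with the smallest lower-bound estimate (step 6.11)."""
    return min(trials, key=energy if g == 0 else lower_bound)


def metropolis_accept(e_trial, e_ref, kT=2.0, rng=random):
    """Standard Metropolis test (assumed form): always accept an improvement,
    otherwise accept with probability exp(-(e_trial - e_ref) / kT)."""
    if e_trial < e_ref:
        return True
    return rng.random() < math.exp(-(e_trial - e_ref) / kT)
```

Selecting by the lower-bound estimate when g > 0 means the full energy function only has to be evaluated for the single chosen trial conformation, which is where the method saves score3 evaluations.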
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810994693.9A CN109448786B (en) | 2018-08-29 | 2018-08-29 | Method for predicting protein structure by lower bound estimation dynamic strategy |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109448786A (en) | 2019-03-08 |
CN109448786B (en) | 2021-04-06 |
Family
ID=65530202
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810994693.9A Active CN109448786B (en) | 2018-08-29 | 2018-08-29 | Method for predicting protein structure by lower bound estimation dynamic strategy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109448786B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111161791B (en) * | 2019-11-28 | 2021-06-18 | Zhejiang University of Technology | Experimental data-assisted adaptive strategy protein structure prediction method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103413067A (en) * | 2013-07-30 | 2013-11-27 | Zhejiang University of Technology | Abstract convex lower-bound estimation based protein structure prediction method |
CN104732115A (en) * | 2014-11-25 | 2015-06-24 | Zhejiang University of Technology | Protein conformation optimization method based on simple space abstract convexity lower bound estimation |
CN105224987A (en) * | 2015-09-22 | 2016-01-06 | Zhejiang University of Technology | Variable-strategy population global optimization method based on dynamic Lipschitz lower bound estimation |
CN106096328A (en) * | 2016-04-26 | 2016-11-09 | Zhejiang University of Technology | Two-layer differential evolution protein structure prediction method based on local Lipschitz function supporting surfaces |
CN106503484A (en) * | 2016-09-23 | 2017-03-15 | Zhejiang University of Technology | Multistage differential evolution protein structure prediction method based on abstract convex estimation |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004065363A2 (en) * | 2003-01-21 | 2004-08-05 | The Trustees Of The University Of Pennsylvania | Computational design of a water-soluble analog of a protein, such as phospholamban and potassium channel kcsa |
Non-Patent Citations (2)
Title |
---|
A Novel Method Using Abstract Convex Underestimation in Ab-Initio Protein Structure Prediction for Guiding Search in Conformational Feature Space; Xiao-Hu Hao et al.; IEEE/ACM Transactions on Computational Biology and Bioinformatics; 2016-09-30; entire document * |
A Differential Evolution Algorithm Based on Local Lipschitz Lower-Bound Estimation Supporting Surfaces; Zhou Xiaogen et al.; Chinese Journal of Computers; 2016-12-31; Vol. 39, No. 12; pp. 2631-2651 * |
Also Published As
Publication number | Publication date |
---|---|
CN109448786A (en) | 2019-03-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cruz et al. | RNA-Puzzles: a CASP-like evaluation of RNA three-dimensional structure prediction | |
Bowman et al. | Using generalized ensemble simulations and Markov state models to identify conformational states | |
CN108846256B (en) | Group protein structure prediction method based on residue contact information | |
CN110148437B (en) | Residue contact auxiliary strategy self-adaptive protein structure prediction method | |
Lee et al. | Exascale computing: A new dawn for computational biology | |
Chen et al. | Overcoming free-energy barriers with a seamless combination of a biasing force and a collective variable-independent boost potential | |
CN109872770B (en) | Variable strategy protein structure prediction method combined with displacement degree evaluation | |
CN109448786B (en) | Method for predicting protein structure by lower bound estimation dynamic strategy | |
CN109360601B (en) | Multi-modal protein structure prediction method based on displacement strategy | |
CN111180004B (en) | Multi-contact information sub-population strategy protein structure prediction method | |
CN109346126B (en) | Adaptive protein structure prediction method of lower bound estimation strategy | |
CN109509510B (en) | Protein structure prediction method based on multi-population ensemble variation strategy | |
CN109360597B (en) | Group protein structure prediction method based on global and local strategy cooperation | |
Roshan | Multiple sequence alignment using Probcons and Probalign | |
CN109461471B (en) | Adaptive protein structure prediction method based on championship mechanism | |
CN109243526B (en) | Protein structure prediction method based on specific fragment crossing | |
CN111161791B (en) | Experimental data-assisted adaptive strategy protein structure prediction method | |
Liu et al. | GraphCPLMQA: Assessing protein model quality based on deep graph coupled networks using protein language model | |
CN110197700B (en) | Protein ATP docking method based on differential evolution | |
CN109326319B (en) | Protein conformation space optimization method based on secondary structure knowledge | |
CN109411013B (en) | Group protein structure prediction method based on individual specific variation strategy | |
Jahanshahi et al. | A coarse-graining approach for modeling nonlinear mechanical behavior of FCC nano-crystals | |
CN112085246B (en) | Protein structure prediction method based on residue pair distance constraint | |
CN111815036B (en) | Protein structure prediction method based on multi-residue contact map cooperative constraint | |
CN109063413B (en) | Method for optimizing space of protein conformation by population hill climbing iteration |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |