CN106096326B

CN106096326B - A differential evolution protein structure prediction method based on a centroid mutation strategy

Info

Publication number: CN106096326B
Application number: CN201610390675.0A
Authority: CN
Inventors: 张贵军; 周晓根; 俞旭锋; 郝小虎; 王柳静; 徐东伟
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2016-06-02
Filing date: 2016-06-02
Publication date: 2018-09-07
Anticipated expiration: 2036-06-02
Also published as: CN106096326A

Abstract

A differential evolution protein structure prediction method based on a centroid mutation strategy. First, the conformations are sorted in ascending order of energy, and the average energy error between each conformation and the lowest-energy conformation is calculated. Then, a subset of the lower-energy conformations is selected to compute a centroid conformation. Finally, the search state reached by the algorithm is judged from the average energy error, and a different centroid mutation strategy is applied accordingly to generate the test conformation: if the average energy error is greater than the set threshold, the DE/rand-to-centroid/1 strategy is used, which generates the test conformation by extracting fragments from the centroid conformation to replace the corresponding fragments in a randomly selected conformation; otherwise, the DE/centroid/2 strategy is used, which generates the test conformation by extracting fragments from randomly selected conformations to replace the corresponding fragments in the centroid conformation. This improves the search efficiency and prediction accuracy of the algorithm.

Description

A Differential Evolution Protein Structure Prediction Method Based on Centroid Mutation Strategy

Technical field

The present invention relates to the fields of bioinformatics, intelligent optimization, and computer applications, and in particular to a differential evolution protein structure prediction method based on a centroid mutation strategy.

Background

In 1953, J. Watson and F. Crick published the double-helix model of DNA in the British journal Nature, marking the true birth of molecular biology. Five years later, F. Crick proposed the "central dogma" of molecular biology, which revealed the general rule by which genetic information is transmitted. As a key part of the dogma, the deciphering of the triplet genetic code from DNA to protein amino acid sequence (the "first code") was fully completed as early as 1965; however, the folding code from amino acid sequence to spatial structure (the "second code") has not yet been cracked. With the completion of human genome sequencing in 2003, the number of known protein amino acid sequences has increased sharply, and the theoretical study of the protein folding code has become a key problem urgently awaiting solution in protein engineering.

Structural genomics uses experimental methods to determine the three-dimensional structures of proteins. X-ray crystallography is by far the most effective method for studying protein structure, and its precision is unmatched by any other method; its main drawbacks are that protein crystals are difficult to grow and that crystal structure determination takes a long time. Multidimensional nuclear magnetic resonance (NMR) can directly determine the conformation of a protein in solution, but because it requires large amounts of high-purity sample, it can currently be applied only to small proteins. Overall, experimental determination of protein structure is extremely time-consuming, costly, and labor-intensive.

Ab initio (de novo) prediction is regarded as the Holy Grail of protein structure prediction. In view of its biological significance and the complexity of the problem, in 2005 the journal Science listed it as one of the 100 most challenging problems awaiting solution by the scientific community. Ab initio prediction must address two factors: (1) the protein structure energy function; and (2) the conformational space search method. The first is essentially a molecular mechanics problem, whose main purpose is to compute the energy value of each protein structure. The second is essentially a global optimization problem: a suitable optimization method is chosen to search the conformational space quickly and obtain the conformation corresponding to the global minimum energy. Protein conformational space optimization belongs to a class of very difficult NP-hard problems. Population-based evolutionary algorithms are an important approach to protein conformation optimization; they mainly include differential evolution (DE), genetic algorithms (GA), and particle swarm optimization (PSO). These algorithms are simple in structure, easy to implement, and robust, so they are often used to search for the global minimum-energy conformation in ab initio prediction. However, population-based optimization algorithms are stochastic, and the existing literature on protein conformation optimization mainly studies how to jump from one local minimum to another, without providing a mechanism that effectively uses the information produced during population evolution to guide the search, which leads to low algorithm efficiency. In addition, under the influence of selection pressure and genetic drift during random sampling, all individuals in the population inevitably converge to some absorbing state. For optimization problems such as protein conformation, this absorbing state is not necessarily the global optimum, which affects prediction accuracy.

Therefore, existing population-based protein structure prediction methods have shortcomings in search efficiency and prediction accuracy and need to be improved.

Summary of the invention

To overcome the shortcomings of existing protein structure prediction methods in search efficiency and prediction accuracy, the present invention extracts information from lower-energy conformations to design a centroid mutation strategy and, building on fragment assembly, proposes a differential evolution protein structure prediction method based on a centroid mutation strategy that has high search efficiency and high prediction accuracy.

The technical solution adopted by the present invention to solve this technical problem is:

A differential evolution protein structure prediction method based on a centroid mutation strategy, the optimization method comprising the following steps:

1) Select a protein force field model, i.e., the energy function E(X);

2) Provide the input sequence information;

3) Initialization: set the population size NP, the crossover factor CR, and the maximum number of iterations; generate from the input sequence the initial conformation population {C¹, C², …, C^NP}, with Cⁱ = (x_i,1, x_i,2, …, x_i,N), and initialize the iteration count G = 0, where N is the number of dimensions and x_i,N is the N-th dimension element of the i-th conformation Cⁱ;

4) Compute the energy function value E(Cⁱ), i = 1, 2, …, NP, of each conformation in the current population, and sort the conformations in ascending order of energy;

5) Find the lowest-energy conformation C_best in the current population, and calculate the average energy error δ between the energies of the other conformations and the energy E(C_best) of C_best; if the iteration count G = 0, let δ_max = δ;

6) For each conformation Cⁱ in the population, i ∈ {1, 2, 3, …, NP}, let C_target = Cⁱ, where C_target denotes the target conformation, and use the information in the lower-energy conformations of the current population to generate the mutant conformation C_mutant as follows:

6.1) Select the CT top-ranked conformations C^m = (x_m,1, x_m,2, …, x_m,N), m = 1, 2, …, CT, where CT = rand(NP/3, NP/2), rand(NP/3, NP/2) denotes a random integer between NP/3 and NP/2, and x_m,N is the N-th dimension element of the m-th selected conformation;

6.2) Compute the centroid conformation C_centroid = (x_centroid,1, x_centroid,2, …, x_centroid,N) of the CT selected conformations, where the j-th dimension element x_centroid,j, j = 1, 2, …, N, is computed from the j-th dimension elements x_m,j of the selected conformations;

6.3) Let L be the sequence length, and randomly generate four integers randint1, randint2, randint3, and randint4 between 1 and L, where randint1 ≠ randint2 and randint3 ≠ randint4; let a = min(randint1, randint2), b = max(randint1, randint2), c = min(randint3, randint4), and d = max(randint3, randint4), where min and max return the smaller and the larger of two numbers, respectively;

6.4) If δ > 0.5δ_max, mutate using the DE/rand-to-centroid/1 strategy: randomly select two different conformations C^rand1 and C^rand2 from the current population, where rand1 ≠ rand2 ∈ [1, NP]; replace the dihedral angles of C^rand1 at positions a to b with the dihedral angles of the amino acids in the fragment at positions a to b of the centroid conformation C_centroid, and replace the dihedral angles of C^rand1 at positions c to d with those of the fragment at positions c to d of C^rand2; then perform fragment assembly on the resulting C^rand1 to obtain the mutant conformation C_mutant;

6.5) If δ ≤ 0.5δ_max, mutate using the DE/centroid/2 strategy: randomly select two different conformations C^rand1 and C^rand2 from the current population, where rand1 ≠ rand2 ∈ [1, NP]; replace the dihedral angles of the centroid conformation C_centroid at positions a to b with the dihedral angles of the amino acids in the fragment at positions a to b of C^rand1, and replace the dihedral angles of C_centroid at positions c to d with those of the fragment at positions c to d of C^rand2; then perform fragment assembly on the resulting C_centroid to obtain the mutant conformation C_mutant;

7) Perform crossover on the mutant conformation C_mutant to generate the test conformation C_trial:

7.1) Randomly generate a decimal rand3 between 0 and 1;

7.2) If rand3 ≤ CR, randomly generate an integer rand4 between 1 and L and replace fragment rand4 of the target conformation C_target with the corresponding fragment of the mutant conformation C_mutant to generate the test conformation C_trial; if rand3 > CR, C_trial is set equal to the mutant conformation C_mutant;

8) Compute the energy E(C_trial) of the test conformation C_trial; if E(C_trial) - E(C_target) < 0, the test conformation is better than the target conformation, and C_trial replaces C_target;

9) Check whether the termination condition is met; if so, output the result and exit; otherwise return to step 4).
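Steps 4) to 6.2) above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: conformations are plain lists of N numbers; δ is assumed to be the mean absolute energy difference between each other conformation and C_best; the centroid is assumed to be the per-dimension arithmetic mean of the CT selected conformations, which ignores the periodicity of dihedral angles. All function names are hypothetical.

```python
import random

def rank_population(population, energy):
    """Step 4: sort conformations by ascending energy; also return the energies."""
    scored = sorted(((energy(c), c) for c in population), key=lambda t: t[0])
    return [c for _, c in scored], [e for e, _ in scored]

def average_energy_error(energies):
    """Step 5 (assumed form): mean |E(C_i) - E(C_best)| over the other conformations.

    Expects energies sorted in ascending order and at least two conformations.
    """
    e_best = energies[0]
    others = energies[1:]
    return sum(abs(e - e_best) for e in others) / len(others)

def centroid_conformation(ranked, rng=random):
    """Steps 6.1-6.2: average the CT top-ranked conformations dimension by dimension."""
    n_pop = len(ranked)
    # CT is a random integer between NP/3 and NP/2 (kept at 1 or more for tiny populations).
    ct = max(1, rng.randint(n_pop // 3, n_pop // 2))
    selected = ranked[:ct]
    n_dim = len(selected[0])
    # Assumption: arithmetic mean of each dimension over the selected conformations.
    return [sum(c[j] for c in selected) / ct for j in range(n_dim)]
```

The value returned by `average_energy_error` at iteration G = 0 would be stored as δ_max and compared against later values of δ to choose the mutation strategy.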

Further, in step 9), after steps 6) to 8) have been performed for every conformation in the population, the iteration count is updated as G = G + 1, and the termination condition is that G reaches the maximum number of iterations preset in step 3).

The technical idea of the present invention is as follows. First, the conformations are sorted in ascending order of energy, and the average energy error between each conformation and the lowest-energy conformation is calculated. Then, a subset of the lower-energy conformations is selected to compute the centroid conformation. Finally, the search state reached by the algorithm is judged from the average energy error, and a different centroid mutation strategy is applied accordingly to generate the test conformation: if the average energy error is greater than the set threshold, the DE/rand-to-centroid/1 strategy is used, which extracts fragments from the centroid conformation to replace the corresponding fragments in a randomly selected conformation; otherwise, the DE/centroid/2 strategy is used, which extracts fragments from randomly selected conformations to replace the corresponding fragments in the centroid conformation. This improves the search efficiency and prediction accuracy of the algorithm.
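The switch between the two mutation strategies can be sketched in Python as below. This is a hedged illustration, not the inventors' code: each conformation is assumed to be a list of L per-residue entries (standing in for dihedral angles), positions are 1-based and inclusive as in the text, and the subsequent fragment assembly step is not modelled. The helper names `random_segment`, `copy_segment`, and `centroid_mutation` are hypothetical.

```python
import random

def random_segment(length, rng):
    """Two distinct positions in 1..L, returned as (low, high)."""
    p, q = rng.sample(range(1, length + 1), 2)
    return min(p, q), max(p, q)

def copy_segment(dest, src, lo, hi):
    """Return a copy of dest whose positions lo..hi (1-based, inclusive) come from src."""
    out = list(dest)
    out[lo - 1:hi] = src[lo - 1:hi]
    return out

def centroid_mutation(population, centroid, delta, delta_max, rng=random):
    """Pick the strategy from the average energy error and build the mutant."""
    length = len(centroid)
    a, b = random_segment(length, rng)
    c, d = random_segment(length, rng)
    r1, r2 = rng.sample(range(len(population)), 2)  # rand1 != rand2
    if delta > 0.5 * delta_max:
        # DE/rand-to-centroid/1: C^rand1 is the base; the centroid donates a..b.
        base, first_donor = population[r1], centroid
    else:
        # DE/centroid/2: the centroid is the base; C^rand1 donates a..b.
        base, first_donor = centroid, population[r1]
    mutant = copy_segment(base, first_donor, a, b)
    # In both strategies C^rand2 donates the fragment at positions c..d.
    return copy_segment(mutant, population[r2], c, d)
```

When the population is still spread out in energy (large δ), the base is a random conformation and the centroid only contributes a fragment; once δ has dropped to half its initial value or less, the centroid itself becomes the base.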

The beneficial effects of the present invention are as follows. First, the centroid conformation is computed from the lower-energy conformations, and the centroid mutation strategy generates the test conformation from the evolutionary information carried by the centroid conformation, which improves prediction accuracy. Second, the search state reached by the algorithm is judged from the average energy error, and a centroid mutation strategy suited to that state is used to generate the test conformation, which improves the search efficiency of the algorithm.

Description of the drawings

Fig. 1 is a flowchart of the protein structure prediction method of the present invention.

Fig. 2 is a schematic diagram of the conformation updates when the prediction method of the present invention is applied to protein 4ICB.

Fig. 3 is the conformation distribution obtained when the prediction method of the present invention is applied to protein 4ICB.

Fig. 4 is the three-dimensional structure of protein 4ICB predicted by the prediction method of the present invention.

Detailed description

The present invention is further described below with reference to the accompanying drawings.

Referring to Fig. 1 and Fig. 4, a differential evolution protein structure prediction method based on a centroid mutation strategy comprises the following steps:

2) Provide the input sequence information;

5) Record the lowest-energy conformation C_best in the current population, and calculate the average energy error δ between the energies of the other conformations and the energy E(C_best) of C_best; if the iteration count G = 0, let δ_max = δ;

7) To increase the diversity of the population, perform crossover on the mutant conformation C_mutant to generate the test conformation C_trial;

In step 9), after steps 6) to 8) have been performed for every conformation in the population, the iteration count is updated as G = G + 1, and the termination condition is that G reaches the maximum number of iterations preset in step 3).
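The crossover of step 7) and the greedy selection of step 8) can be sketched in Python as follows. This is a sketch under assumptions rather than the patented code: the text does not define the length of "fragment rand4", so it is taken here to be a window of `frag_len` residues starting at position rand4 (clipped at the end of the chain), and `frag_len=3` is an arbitrary illustrative default. The function names are hypothetical.

```python
import random

def crossover(target, mutant, cr, frag_len=3, rng=random):
    """Step 7: build the test conformation from the target and the mutant."""
    if rng.random() <= cr:                    # rand3 <= CR
        start = rng.randint(1, len(target))   # rand4, 1-based position in 1..L
        trial = list(target)
        # Copy the assumed fragment window from the mutant into the target.
        trial[start - 1:start - 1 + frag_len] = mutant[start - 1:start - 1 + frag_len]
        return trial
    return list(mutant)                       # rand3 > CR: trial equals the mutant

def select(target, trial, energy):
    """Step 8: the trial replaces the target only if its energy is strictly lower."""
    return trial if energy(trial) - energy(target) < 0 else target
```

Because selection only ever accepts a strictly lower energy, the best energy in the population cannot increase from one iteration to the next.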

This embodiment takes the α-folded protein 4ICB, with a sequence length of 76, as an example. A differential evolution protein structure prediction method based on a centroid mutation strategy comprises the following steps:

1) Select the Rosetta score3 force field model as the energy function E(X);

2) Input the sequence information of protein 4ICB;

3) Initialization: set the population size NP = 50, the crossover factor CR = 0.5, and the maximum number of iterations to 10000; generate from the input sequence the initial conformation population {C¹, C², …, C^NP}, with Cⁱ = (x_i,1, x_i,2, …, x_i,N), and initialize the iteration count G = 0, where N is the number of dimensions and x_i,N is the N-th dimension element of the i-th conformation Cⁱ;

6.3) Set the sequence length L = 76, and randomly generate four integers randint1, randint2, randint3, and randint4 between 1 and L, where randint1 ≠ randint2 and randint3 ≠ randint4; let a = min(randint1, randint2), b = max(randint1, randint2), c = min(randint3, randint4), and d = max(randint3, randint4), where min and max return the smaller and the larger of two numbers, respectively;

9) After steps 6) to 8) have been performed for every conformation in the population, update the iteration count as G = G + 1; if G reaches the maximum number of iterations, 10000, output the result and exit; otherwise return to step 4).
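To show how the embodiment's parameters (NP = 50, CR = 0.5, 10000 iterations, L = 76) fit into the overall loop, here is a compact Python sketch. It rests on several assumptions that are not in the source: a toy quadratic energy stands in for Rosetta score3, each residue is reduced to a single angle, fragment assembly is omitted, δ is taken as the mean energy gap to C_best, the centroid is an arithmetic mean, and the crossover fragment length `FRAG_LEN` is arbitrary. All names are hypothetical.

```python
import random

NP, CR, MAX_ITER, L = 50, 0.5, 10000, 76   # values stated in the embodiment
FRAG_LEN = 3                               # assumed crossover fragment length

def toy_energy(conf):
    """Placeholder for Rosetta score3 (assumption): sum of squared angles."""
    return sum(x * x for x in conf)

def swap_in(dest, src, lo, hi):
    """Copy of dest with positions lo..hi (1-based, inclusive) taken from src."""
    out = list(dest)
    out[lo - 1:hi] = src[lo - 1:hi]
    return out

def run(energy=toy_energy, n_pop=NP, cr=CR, max_iter=MAX_ITER, length=L, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-180.0, 180.0) for _ in range(length)] for _ in range(n_pop)]
    delta_max = 0.0
    for g in range(max_iter):
        ranked = sorted(pop, key=energy)                       # step 4
        energies = [energy(conf) for conf in ranked]
        delta = sum(e - energies[0] for e in energies[1:]) / (n_pop - 1)  # step 5
        if g == 0:
            delta_max = delta
        for i in range(n_pop):                                 # step 6
            target = pop[i]
            ct = max(1, rng.randint(n_pop // 3, n_pop // 2))   # step 6.1
            centroid = [sum(conf[j] for conf in ranked[:ct]) / ct
                        for j in range(length)]                # step 6.2
            a, b = sorted(rng.sample(range(1, length + 1), 2))  # step 6.3
            c, d = sorted(rng.sample(range(1, length + 1), 2))
            r1, r2 = rng.sample(range(n_pop), 2)
            if delta > 0.5 * delta_max:                        # step 6.4
                mutant = swap_in(swap_in(pop[r1], centroid, a, b), pop[r2], c, d)
            else:                                              # step 6.5
                mutant = swap_in(swap_in(centroid, pop[r1], a, b), pop[r2], c, d)
            if rng.random() <= cr:                             # step 7
                pos = rng.randint(1, length)
                trial = swap_in(target, mutant, pos, pos + FRAG_LEN - 1)
            else:
                trial = mutant
            if energy(trial) - energy(target) < 0:             # step 8
                pop[i] = trial
    return min(pop, key=energy)                                # step 9: best conformation
```

With the default values the pure-Python loop is slow, so a call such as `run(n_pop=12, max_iter=30, length=10)` is a more practical way to try the sketch.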

Taking the α-folded protein 4ICB with a sequence length of 76 as an example, the above method yields a near-native conformation of the protein. The minimum root-mean-square deviation is [value not preserved in this text] and the average root-mean-square deviation is [value not preserved in this text]; the predicted three-dimensional structure is shown in Fig. 4.
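The root-mean-square deviation used above to compare a predicted conformation with the native structure can be computed as in the following Python sketch. It assumes that both structures are given as equal-length lists of (x, y, z) atom coordinates and that they have already been optimally superimposed; the superposition step and the choice of atoms are not specified in the source and are not modelled here. The function name `rmsd` is hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-aligned coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Under this reading, the minimum and average values would be the smallest and the mean of `rmsd` over the final conformations.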

The above describes the good optimization results obtained in one embodiment of the present invention. The invention is clearly not limited to this embodiment: it can be applied in various fields of practical engineering, and it can be implemented with various modifications that do not depart from its basic spirit or go beyond its substance.

Claims

1. A differential evolution protein structure prediction method based on a centroid mutation strategy, characterized in that the protein structure prediction method comprises the following steps:

1) select a protein force field model, i.e., the energy function E(X);

2) provide the input sequence information;

3) initialization: set the population size NP, the crossover factor CR, and the maximum number of iterations; generate from the input sequence the initial conformation population {C¹, C², …, C^NP}, with Cⁱ = (x_i,1, x_i,2, …, x_i,N), and initialize the iteration count G = 0, where N is the number of dimensions and x_i,N is the N-th dimension element of the i-th conformation Cⁱ;

4) compute the energy function value E(Cⁱ), i = 1, 2, …, NP, of each conformation in the current population, and sort the conformations in ascending order of energy;

5) find the lowest-energy conformation C_best in the current population, and calculate the average energy error δ between the energies of the other conformations and the energy E(C_best) of C_best; if the iteration count G = 0, let δ_max = δ;

6) for each conformation Cⁱ in the population, i ∈ {1, 2, 3, …, NP}, let C_target = Cⁱ, where C_target denotes the target conformation, and use the information in the lower-energy conformations of the current population to generate the mutant conformation C_mutant as follows:

6.1) select the CT top-ranked, lower-energy conformations C^m = (x_m,1, x_m,2, …, x_m,N), m = 1, 2, …, CT, in the current population, where CT = rand(NP/3, NP/2), rand(NP/3, NP/2) denotes a random integer between NP/3 and NP/2, and x_m,N is the N-th dimension element of the m-th selected conformation;

6.2) compute the centroid conformation C_centroid = (x_centroid,1, x_centroid,2, …, x_centroid,N) of the CT selected conformations, where the j-th dimension element x_centroid,j, j = 1, 2, …, N, is computed from the j-th dimension elements x_m,j of the selected conformations;

6.3) let L be the sequence length, and randomly generate four integers randint1, randint2, randint3, and randint4 between 1 and L, where randint1 ≠ randint2 and randint3 ≠ randint4; let a = min(randint1, randint2), b = max(randint1, randint2), c = min(randint3, randint4), and d = max(randint3, randint4), where min and max return the smaller and the larger of two numbers, respectively;

6.4) if δ > 0.5δ_max, mutate using the DE/rand-to-centroid/1 strategy: randomly select two different conformations C^rand1 and C^rand2 from the current population, where rand1 ≠ rand2 ∈ [1, NP]; replace the dihedral angles of C^rand1 at positions a to b with the dihedral angles of the amino acids in the fragment at positions a to b of the centroid conformation C_centroid, and replace the dihedral angles of C^rand1 at positions c to d with those of the fragment at positions c to d of C^rand2; then perform fragment assembly on the resulting C^rand1 to obtain the mutant conformation C_mutant;

6.5) if δ ≤ 0.5δ_max, mutate using the DE/centroid/2 strategy: randomly select two different conformations C^rand1 and C^rand2 from the current population, where rand1 ≠ rand2 ∈ [1, NP]; replace the dihedral angles of the centroid conformation C_centroid at positions a to b with the dihedral angles of the amino acids in the fragment at positions a to b of C^rand1, and replace the dihedral angles of C_centroid at positions c to d with those of the fragment at positions c to d of C^rand2; then perform fragment assembly on the resulting C_centroid to obtain the mutant conformation C_mutant;

7) perform crossover on the mutant conformation C_mutant to generate the test conformation C_trial:

7.1) randomly generate a decimal rand3 between 0 and 1;

7.2) if rand3 ≤ CR, randomly generate an integer rand4 between 1 and L and replace fragment rand4 of the target conformation C_target with the corresponding fragment of the mutant conformation C_mutant to generate the test conformation C_trial; if rand3 > CR, C_trial is set equal to the mutant conformation C_mutant;

8) compute the energy E(C_trial) of the test conformation C_trial; if E(C_trial) - E(C_target) < 0, the test conformation is better than the target conformation, and C_trial replaces C_target;

9) check whether the termination condition is met; if so, output the result and exit; otherwise return to step 4).

2. The two-layer differential evolution protein structure prediction method based on a centroid mutation strategy according to claim 1, characterized in that: in step 9), after steps 6) to 8) have been performed for every conformation in the population, the iteration count is updated as G = G + 1, and the termination condition is that G reaches the maximum number of iterations preset in step 3).