CN111951885A - Protein structure prediction method based on local bias - Google Patents

Publication number
CN111951885A
Authority
CN
China
Prior art keywords
fragment
individual
secondary structure
individuals
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010803348.XA
Other languages
Chinese (zh)
Other versions
CN111951885B (en)
Inventor
彭绍亮
陈健
王小奇
陈东
李肯立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University
Priority to CN202010803348.XA
Publication of CN111951885A
Application granted
Publication of CN111951885B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/20 Protein or domain folding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Medical Informatics (AREA)
  • Biotechnology (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Bioethics (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Public Health (AREA)
  • Chemical & Material Sciences (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the fields of bioinformatics, intelligent optimization and computer applications, and discloses a protein structure prediction method based on local bias. The method comprises the following steps: calculating the hydrophobic scale difference between the mutation window fragment of the target individual and each fragment in the fragment library, together with the secondary structure score of each fragment in the fragment library; scoring and ranking the fragments in the fragment library; selecting one of the best fragments for fragment assembly and deciding through a Monte Carlo mechanism whether it is accepted, thereby determining the variant individual; calculating the secondary structure scores of the crossover fragments of the variant individual and of a randomly selected individual to determine the crossover individual; and using a random number to decide whether the target individual and the crossover individual are compared by energy value or by secondary structure score, thereby selecting the next-generation target individual. The invention avoids the shortcomings of traditional conformational space optimization methods, such as low sampling efficiency and low prediction accuracy, and provides an improved structure model scoring method based on the hydrophobic properties of amino acids and their local structural features.

Description

Protein structure prediction method based on local bias
Technical field:
The invention relates to the fields of bioinformatics, intelligent optimization and computer applications, and in particular to a protein structure prediction method based on local bias.
Background art:
Protein tertiary structure prediction is one of the major research problems in structural biology. Proteins are long chains built from 20 different amino acid residues that fold into unique three-dimensional structures under specific conditions and thereby perform their biological functions. Predicting protein structure by computational means has become the mainstream approach in this field. De novo prediction aims to predict the three-dimensional structure of a protein directly from its one-dimensional amino acid sequence, but the complexity and high dimensionality of the conformational search space are its main bottleneck.
Protein folding is a very complex process, and the hydrophobic effect of amino acids is one of the main factors influencing it, so taking the hydrophilicity and hydrophobicity of amino acids into account can help improve the sampling efficiency of de novo prediction. The fundamental determinant of a protein's structure is its one-dimensional amino acid sequence, which coils and folds into a protein molecule with a definite spatial structure; jointly considering the primary structure (the one-dimensional amino acid sequence) and secondary structure information therefore helps to further improve the efficiency and accuracy of structure prediction.
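As an illustration of what a per-residue hydrophobicity value can look like, the following minimal sketch maps a one-letter amino acid sequence to hydropathy values. It assumes the Kyte-Doolittle hydropathy index; the patent does not name the hydrophobic scale it uses, and the function names `hydrophobicity_profile` and `window_mean` are hypothetical.

```python
# Kyte-Doolittle hydropathy index. ASSUMPTION: the patent does not state
# which hydrophobic scale is used; this one is chosen only for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(sequence):
    """Return the hydropathy value of every residue in a one-letter sequence."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]


def window_mean(profile, start, size):
    """Mean hydropathy over profile[start:start + size] (0-based start)."""
    window = profile[start:start + size]
    if len(window) != size:
        raise ValueError("window extends past the end of the sequence")
    return sum(window) / size
```

A profile like this gives one number per residue, which is the kind of quantity a window fragment and a fragment-library fragment can be compared on.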
However, existing conformational space optimization methods fall short in prediction accuracy and sampling efficiency; building a model that combines the factors above can therefore improve on the existing methods.
Summary of the invention:
To overcome the low sampling efficiency and low prediction accuracy of existing protein conformation optimization methods, the invention provides a protein structure prediction method based on local bias with high sampling efficiency and high prediction accuracy.
The technical solution adopted by the invention to solve the technical problem is as follows:
A protein structure prediction method based on local bias, the method comprising the following steps:
1) providing the input sequence information;
2) predicting the secondary structure information of the target protein using the PSIPRED platform, and constructing a fragment library using the ROSETTA platform;
3) initializing parameters: setting the population size $Ps$, the iteration counter $G$, the maximum number of generations $G_{\max}$, the search trajectory length $N$ of the initial population, the crossover fragment length $c$, the fragment length $l$, the mutation counter $T$, and the maximum count value $T_{\max}$;
4) initializing the population: starting $Ps$ Monte Carlo trajectories and searching each trajectory $N$ times to generate $Ps$ initial individuals;
5) for each target individual $x_i$, $i \in \{1, 2, \ldots, Ps\}$, performing the following operations:
5.1) performing the mutation operation on individual $x_i$:
5.1.1) randomly generating an integer $d' \in [1, l-m]$ and determining the fragment insertion window $[d', d'+m]$ of individual $x_i$, where $m$ is the window size;
5.1.2) calculating the hydrophobic scale difference between the window fragment and each fragment-library fragment according to the formula [formula image BDA0002628187930000021], where [symbol image BDA0002628187930000022] is the hydrophobicity value of the $i$-th residue in the window fragment of individual $x_i$, and [symbol image BDA0002628187930000023] is the hydrophobicity value of the $i$-th residue of the fragment in the fragment library;
5.1.3) determining, from the predicted secondary structure, the secondary structure of the target protein in the corresponding window region, $S_{pre} = \{sec_k \mid d' \le k \le d'+m\}$, where $sec_k \in \{H, E, L\}$ is the predicted secondary structure type of the $k$-th residue in the window region of the target protein, and H, E and L denote α-helix, β-sheet and loop region, respectively;
5.1.4) calculating the secondary structure score of each fragment in the fragment library, one by one, according to the formula [formula image BDA0002628187930000024], where [formula image BDA0002628187930000025], and [symbol image BDA0002628187930000026] denotes the secondary structure type of the $k$-th residue of fragment $h$ in the fragment library;
5.1.5) calculating the score of every fragment in the fragment library according to the formula [formula image BDA0002628187930000027] and ranking the fragments from high to low, where $w_1$ and $w_2$ are the weights of the hydrophobic scale difference and the secondary structure score, respectively, $\Delta R_h$ is the hydrophobic scale difference between the $h$-th fragment in the fragment library and the target window, and [symbol image BDA0002628187930000028] is the secondary structure score of the $h$-th fragment in the fragment library;
5.1.6) randomly selecting one fragment from the $n$ highest-scoring fragments and assembling it into individual $x_i$; deciding through the Monte Carlo mechanism whether the fragment insertion is accepted; if it is accepted, obtaining the variant individual $x'_i$ and going to step 5.2); otherwise going to step 5.1.7);
5.1.7) updating the counter $T$; if $T < T_{\max}$, returning to step 5.1.6); otherwise performing the fragment assembly directly to generate the variant individual $x'_i$ and resetting $T = 0$;
5.2) randomly selecting an individual $x_j$, $j \in \{1, 2, \ldots, Ps\}$ with $j \ne i$, and performing the following crossover operation:
5.2.1) generating a random integer $d' \in [1, l-c]$ and determining the crossover region $[d', d'+c]$;
5.2.2) determining, from the predicted secondary structure, the secondary structure of the target protein in the crossover region, $S_{pre} = \{sec'_k \mid d' \le k \le d'+c\}$, where $sec'_k \in \{H, E, L\}$ is the predicted secondary structure type of the $k$-th residue in the crossover region of the target protein;
5.2.3) using DSSP to determine the secondary structure of individual $x'_i$ and hence the secondary structure sequence of its crossover region, [formula image BDA0002628187930000031], where [symbol image BDA0002628187930000032] is the secondary structure type of the $k$-th residue of $x'_i$;
5.2.4) calculating the score [symbol image BDA0002628187930000035] of the crossover fragment of individual $x'_i$ according to the formula [formula image BDA0002628187930000033], where [formula image BDA0002628187930000034], and [symbol image BDA0002628187930000036] denotes the secondary structure type of the $k$-th residue of $x'_i$;
5.2.5) using DSSP to determine the secondary structure sequence of the crossover region of individual $x_j$: [formula image BDA0002628187930000037] [formula image BDA0002628187930000038];
5.2.6) calculating the secondary structure score [symbol image BDA00026281879300000311] of the crossover fragment of individual $x_j$ according to the formula [formula image BDA0002628187930000039], where [formula image BDA00026281879300000310];
5.2.7) comparing [symbol image BDA00026281879300000312] with [symbol image BDA00026281879300000313]; if [condition image BDA00026281879300000314], setting $x''_i = x'_i$ and going to step 5.3); otherwise going to step 5.2.8);
5.2.8) replacing the corresponding fragment of individual $x'_i$ with the crossover fragment of individual $x_j$ to generate the crossover individual $x''_i$;
5.3) performing the following selection operation on the target individual and the crossover individual:
5.3.1) generating a random value $rn \in [0, 1]$; if $rn > 0.5$, going to step 5.3.2); otherwise going to step 5.3.3);
5.3.2) calculating the energies $E_i$ and $E''_i$ of the target individual $x_i$ and the crossover individual $x''_i$; if $E''_i < E_i$, $x''_i$ replaces $x_i$ as the next-generation target individual; otherwise no replacement is made and $x_i$ is kept as the next-generation target individual; then going to step 6);
5.3.3) calculating the secondary structure scores $S_T$ and $S''_T$ of the target individual $x_i$ and the crossover individual $x''_i$ according to the formulas [formula image BDA00026281879300000315], where [formula image BDA00026281879300000316], and [formula image BDA00026281879300000317] [formula image BDA00026281879300000318], where [formula image BDA00026281879300000319]; if $S''_T > S_T$, $x''_i$ replaces $x_i$ as the next-generation target individual; otherwise $x_i$ is kept as the next-generation target individual; then going to step 6);
6) after step 5) has been executed for every individual in the population, setting the iteration count $G = G + 1$ and checking whether $G > G_{\max}$; if $G > G_{\max}$, stopping the iteration and exiting; otherwise returning to step 5).
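The fragment ranking of steps 5.1.2) to 5.1.6) can be sketched as follows. The formula images are not reproduced in this text, so the concrete forms below are assumptions chosen only for illustration: the hydrophobic scale difference is taken as a sum of absolute per-residue differences, the secondary structure score as the number of matching H/E/L labels, and the combined score as `w2 * S - w1 * dR`. All function names and the dictionary layout of a fragment are hypothetical.

```python
import random


def hydrophobic_difference(window_values, fragment_values):
    # ASSUMED form: sum of absolute per-residue hydrophobicity differences.
    return sum(abs(a - b) for a, b in zip(window_values, fragment_values))


def secondary_structure_score(predicted, observed):
    # ASSUMED form: count of positions whose H/E/L labels agree.
    return sum(1 for p, o in zip(predicted, observed) if p == o)


def rank_fragments(window_hydro, window_ss, fragments, w1=1.0, w2=1.0):
    """Sort fragments from highest to lowest combined score.

    fragments: list of dicts with keys 'hydro' (list of float) and 'ss' (str).
    ASSUMED combination: reward label agreement, penalise hydrophobic mismatch.
    """
    def score(fragment):
        return (w2 * secondary_structure_score(window_ss, fragment["ss"])
                - w1 * hydrophobic_difference(window_hydro, fragment["hydro"]))

    return sorted(fragments, key=score, reverse=True)


def pick_from_top(ranked, n, rng=random):
    """Randomly pick one of the n best-ranked fragments (step 5.1.6)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return rng.choice(ranked[:n])
```

Choosing randomly among the top `n` fragments rather than always taking the single best keeps some diversity in the proposals while still biasing them toward locally compatible fragments.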
The technical concept of the invention is as follows: within the basic framework of an evolutionary algorithm, mutation and crossover based on the amino acid hydrophobic scale and secondary structure similarity are applied to each target individual; a Monte Carlo mechanism and an energy function guide the population update, and promising conformations are selected to enter the next-generation population.
The beneficial effects of the invention are as follows: on the one hand, the conformational space sampling strategy is designed using the hydrophobic properties of amino acids and secondary structure knowledge, which improves search efficiency; on the other hand, the Monte Carlo mechanism and the energy function jointly guide the population update, which greatly improves prediction accuracy.
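The text does not spell out the acceptance rule behind the "Monte Carlo mechanism". A common choice is the Metropolis criterion, and the sketch below assumes it: a move that lowers the energy is always accepted, and a move that raises it is accepted with probability exp(-ΔE/kT). The retry loop mirrors the counter logic of step 5.1.7), where the fragment is assembled directly once the counter reaches its maximum. The names `metropolis_accept`, `mutate_with_retries`, `propose` and `energy` are hypothetical.

```python
import math
import random


def metropolis_accept(energy_old, energy_new, kT=1.0, rng=random):
    """ASSUMED acceptance rule (Metropolis criterion), not confirmed by the text."""
    delta = energy_new - energy_old
    if delta <= 0:
        return True  # downhill or neutral moves are always accepted
    return rng.random() < math.exp(-delta / kT)


def mutate_with_retries(individual, propose, energy, t_max, kT=1.0, rng=random):
    """Try up to t_max fragment insertions; if all are rejected, assemble directly.

    propose(individual) returns a new candidate with one fragment inserted;
    energy(individual) returns its energy. Both are caller-supplied stand-ins.
    """
    energy_old = energy(individual)
    for _ in range(t_max):
        candidate = propose(individual)
        if metropolis_accept(energy_old, energy(candidate), kT, rng):
            return candidate
    # Counter exhausted: accept the next insertion unconditionally.
    return propose(individual)
```

Forcing acceptance after `t_max` rejections guarantees that every target individual yields a variant individual, so the crossover step always has something to work on.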
Description of the drawings:
FIG. 1 is a flow chart of the protein structure prediction method based on local bias;
FIG. 2 is a schematic diagram of the conformational update when the structure of protein 1ail is predicted with the protein structure prediction method based on local bias;
FIG. 3 is the three-dimensional structure obtained when the structure of protein 1ail is predicted with the protein structure prediction method based on local bias.
Detailed description of the embodiments:
The invention is described in further detail below with reference to the accompanying drawings and a specific embodiment.
Modifications may be made to the embodiment as appropriate in light of the teachings of this disclosure.
Referring to FIG. 1 to FIG. 3, a protein structure prediction method based on local bias comprises the following steps:
1) providing the input sequence information;
2) predicting the secondary structure information of the target protein using the PSIPRED platform, and constructing a fragment library using the ROSETTA platform;
3) initializing parameters: setting the population size $Ps = 100$, the iteration counter $G = 0$, the maximum number of generations $G_{\max} = 200$, the search trajectory length of the initial population $N = 2500$, the crossover fragment length $c = 6$, the fragment length $l = 6$, the mutation counter $T = 0$, and the maximum count value $T_{\max} = 150$;
4) initializing the population: starting $Ps$ Monte Carlo trajectories and searching each trajectory $N$ times to generate $Ps$ initial individuals;
5) for each target individual $x_i$, $i \in \{1, 2, \ldots, Ps\}$, performing the following operations:
5.1) performing the mutation operation on individual $x_i$:
5.1.1) randomly generating an integer $d' \in [1, l-m]$ and determining the fragment insertion window $[d', d'+m]$ of individual $x_i$, where $m$ is the window size;
5.1.2) calculating the hydrophobic scale difference between the window fragment and each fragment-library fragment according to the formula [formula image BDA0002628187930000051], where [symbol image BDA0002628187930000052] is the hydrophobicity value of the $i$-th residue in the window fragment of individual $x_i$, and [symbol image BDA0002628187930000053] is the hydrophobicity value of the $i$-th residue of the fragment in the fragment library;
5.1.3) determining, from the predicted secondary structure, the secondary structure of the target protein in the corresponding window region, $S_{pre} = \{sec_k \mid d' \le k \le d'+m\}$, where $sec_k \in \{H, E, L\}$ is the predicted secondary structure type of the $k$-th residue in the window region of the target protein, and H, E and L denote α-helix, β-sheet and loop region, respectively;
5.1.4) calculating the secondary structure score of each fragment in the fragment library, one by one, according to the formula [formula image BDA0002628187930000054], where [formula image BDA0002628187930000055], and [symbol image BDA0002628187930000056] denotes the secondary structure type of the $k$-th residue of fragment $h$ in the fragment library;
5.1.5) calculating the score of every fragment in the fragment library according to the formula [formula image BDA0002628187930000057] and ranking the fragments from high to low, where $w_1$ and $w_2$ are the weights of the hydrophobic scale difference and the secondary structure score, respectively, $\Delta R_h$ is the hydrophobic scale difference between the $h$-th fragment in the fragment library and the target window, and [symbol image BDA0002628187930000058] is the secondary structure score of the $h$-th fragment in the fragment library;
5.1.6) randomly selecting one fragment from the $n$ highest-scoring fragments and assembling it into individual $x_i$; deciding through the Monte Carlo mechanism whether the fragment insertion is accepted; if it is accepted, obtaining the variant individual $x'_i$ and going to step 5.2); otherwise going to step 5.1.7);
5.1.7) updating the counter $T$; if $T < T_{\max}$, returning to step 5.1.6); otherwise performing the fragment assembly directly to generate the variant individual $x'_i$ and resetting $T = 0$;
5.2) randomly selecting an individual $x_j$, $j \in \{1, 2, \ldots, Ps\}$ with $j \ne i$, and performing the following crossover operation:
5.2.1) generating a random integer $d' \in [1, l-c]$ and determining the crossover region $[d', d'+c]$;
5.2.2) determining, from the predicted secondary structure, the secondary structure of the target protein in the crossover region, $S_{pre} = \{sec'_k \mid d' \le k \le d'+c\}$, where $sec'_k \in \{H, E, L\}$ is the predicted secondary structure type of the $k$-th residue in the crossover region of the target protein;
5.2.3) using DSSP to determine the secondary structure of individual $x'_i$ and hence the secondary structure sequence of its crossover region, [formula image BDA0002628187930000059], where [symbol image BDA00026281879300000510] is the secondary structure type of the $k$-th residue of $x'_i$;
5.2.4) calculating the score [symbol image BDA0002628187930000063] of the crossover fragment of individual $x'_i$ according to the formula [formula image BDA0002628187930000061], where [formula image BDA0002628187930000062], and [symbol image BDA0002628187930000064] denotes the secondary structure type of the $k$-th residue of $x'_i$;
5.2.5) using DSSP to determine the secondary structure sequence of the crossover region of individual $x_j$: [formula image BDA0002628187930000065] [formula image BDA0002628187930000066];
5.2.6) calculating the secondary structure score [symbol image BDA0002628187930000069] of the crossover fragment of individual $x_j$ according to the formula [formula image BDA0002628187930000067], where [formula image BDA0002628187930000068];
5.2.7) comparing [symbol image BDA00026281879300000610] with [symbol image BDA00026281879300000611]; if [condition image BDA00026281879300000612], setting $x''_i = x'_i$ and going to step 5.3); otherwise going to step 5.2.8);
5.2.8) replacing the corresponding fragment of individual $x'_i$ with the crossover fragment of individual $x_j$ to generate the crossover individual $x''_i$;
5.3) performing the following selection operation on the target individual and the crossover individual:
5.3.1) generating a random value $rn \in [0, 1]$; if $rn > 0.5$, going to step 5.3.2); otherwise going to step 5.3.3);
5.3.2) calculating the energies $E_i$ and $E''_i$ of the target individual $x_i$ and the crossover individual $x''_i$; if $E''_i < E_i$, $x''_i$ replaces $x_i$ as the next-generation target individual; otherwise no replacement is made and $x_i$ is kept as the next-generation target individual; then going to step 6);
5.3.3) calculating the secondary structure scores $S_T$ and $S''_T$ of the target individual $x_i$ and the crossover individual $x''_i$ according to the formulas [formula image BDA00026281879300000613], where [formula image BDA00026281879300000614], and [formula image BDA00026281879300000615] [formula image BDA00026281879300000616], where [formula image BDA00026281879300000617]; if $S''_T > S_T$, $x''_i$ replaces $x_i$ as the next-generation target individual; otherwise $x_i$ is kept as the next-generation target individual; then going to step 6);
6) after step 5) has been executed for every individual in the population, setting the iteration count $G = G + 1$ and checking whether $G > G_{\max}$; if $G > G_{\max}$, stopping the iteration and exiting; otherwise returning to step 5).
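The crossover of step 5.2) and the selection of step 5.3) can be sketched together as follows. Several details are assumptions made only for illustration: conformations are represented as lists of per-residue (phi, psi) tuples, the secondary structure labels of each individual are passed in as H/E/L strings (in the method they come from DSSP), the score is the number of labels agreeing with the prediction, the variant individual is kept when its score is at least that of the partner, and the region is half-open and 0-based rather than the closed interval used in the steps. All names are hypothetical.

```python
import random


def ss_match_score(predicted, observed):
    # ASSUMED score: number of positions whose H/E/L labels agree.
    return sum(1 for p, o in zip(predicted, observed) if p == o)


def crossover(variant, partner, variant_ss, partner_ss, predicted_ss, start, length):
    """Return the crossover individual built from the variant individual.

    variant, partner: lists of per-residue (phi, psi) tuples (assumed encoding).
    *_ss: strings of H/E/L labels, one per residue.
    The region is [start, start + length), 0-based.
    """
    region = slice(start, start + length)
    score_variant = ss_match_score(predicted_ss[region], variant_ss[region])
    score_partner = ss_match_score(predicted_ss[region], partner_ss[region])
    child = list(variant)
    if score_variant < score_partner:  # ASSUMED comparison direction
        child[region] = partner[region]  # copy the partner's crossover fragment
    return child


def select(target, trial, energy, ss_score, rng=random):
    """Pick the next-generation individual by energy or by secondary structure score."""
    if rng.random() > 0.5:
        return trial if energy(trial) < energy(target) else target
    return trial if ss_score(trial) > ss_score(target) else target
```

Switching at random between the energy criterion and the secondary structure criterion means that neither signal alone decides which conformations survive.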
Taking the α protein 1ail, with a sequence length of 70, as an example, the above method was used to obtain a near-native conformation of the protein. Among the 200 individuals of the final population, the minimum root-mean-square deviation is [value image BDA00026281879300000618] and the mean root-mean-square deviation is [value image BDA00026281879300000619]. The predicted structure is shown in FIG. 3.
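For reference, the minimum and mean root-mean-square deviation of a population against a reference structure can be computed as in the minimal sketch below. It assumes that each conformation is a list of (x, y, z) coordinates, for example of the Cα atoms, and that the structures have already been superimposed; no optimal alignment is performed. The function names are hypothetical, and the sketch does not reproduce the values reported for 1ail.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    ASSUMPTION: the two structures are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def population_rmsd_stats(population, reference):
    """Return (minimum RMSD, mean RMSD) of a population against a reference."""
    values = [rmsd(individual, reference) for individual in population]
    return min(values), sum(values) / len(values)
```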
The above describes the prediction results of the invention using protein 1ail as an example and is not intended to limit the scope of the invention; various modifications and improvements can be made without departing from the scope of the invention.

Claims (1)

1. A method for predicting protein structure based on local bias, comprising the steps of:
1) providing the input sequence information;
2) predicting the secondary structure information of the target protein using the PSIPRED platform, and constructing a fragment library using the ROSETTA platform;
3) initializing parameters: setting the population size $Ps$, the iteration counter $G$, the maximum number of generations $G_{\max}$, the search trajectory length $N$ of the initial population, the crossover fragment length $c$, the fragment length $l$, the mutation counter $T$, and the maximum count value $T_{\max}$;
4) initializing the population: starting $Ps$ Monte Carlo trajectories and searching each trajectory $N$ times to generate $Ps$ initial individuals;
5) for each target individual $x_i$, $i \in \{1, 2, \ldots, Ps\}$, performing the following operations:
5.1) performing the mutation operation on individual $x_i$:
5.1.1) randomly generating an integer $d' \in [1, l-m]$ and determining the fragment insertion window $[d', d'+m]$ of individual $x_i$, where $m$ is the window size;
5.1.2) calculating the hydrophobic scale difference between the window fragment and each fragment-library fragment according to the formula [formula image FDA0002628187920000011], where [symbol image FDA0002628187920000012] is the hydrophobicity value of the $i$-th residue in the window fragment of individual $x_i$, and [symbol image FDA0002628187920000013] is the hydrophobicity value of the $i$-th residue of the fragment in the fragment library;
5.1.3) determining, from the predicted secondary structure, the secondary structure of the target protein in the corresponding window region, $S_{pre} = \{sec_k \mid d' \le k \le d'+m\}$, where $sec_k \in \{H, E, L\}$ is the predicted secondary structure type of the $k$-th residue in the window region of the target protein, and H, E and L denote α-helix, β-sheet and loop region, respectively;
5.1.4) calculating the secondary structure score of each fragment in the fragment library, one by one, according to the formula [formula image FDA0002628187920000014], where [formula image FDA0002628187920000015], and [symbol image FDA0002628187920000016] denotes the secondary structure type of the $k$-th residue of fragment $h$ in the fragment library;
5.1.5) calculating the score of every fragment in the fragment library according to the formula [formula image FDA0002628187920000017] and ranking the fragments from high to low, where $w_1$ and $w_2$ are the weights of the hydrophobic scale difference and the secondary structure score, respectively, $\Delta R_h$ is the hydrophobic scale difference between the $h$-th fragment in the fragment library and the target window, and [symbol image FDA0002628187920000018] is the secondary structure score of the $h$-th fragment in the fragment library;
5.1.6) randomly selecting one fragment from the $n$ highest-scoring fragments and assembling it into individual $x_i$; deciding through the Monte Carlo mechanism whether the fragment insertion is accepted; if it is accepted, obtaining the variant individual $x'_i$ and going to step 5.2); otherwise going to step 5.1.7);
5.1.7) updating the counter $T$; if $T < T_{\max}$, returning to step 5.1.6); otherwise performing the fragment assembly directly to generate the variant individual $x'_i$ and resetting $T = 0$;
5.2) randomly selecting an individual $x_j$, $j \in \{1, 2, \ldots, Ps\}$ with $j \ne i$, and performing the following crossover operation:
5.2.1) generating a random integer $d' \in [1, l-c]$ and determining the crossover region $[d', d'+c]$;
5.2.2) determining, from the predicted secondary structure, the secondary structure of the target protein in the crossover region, $S_{pre} = \{sec'_k \mid d' \le k \le d'+c\}$, where $sec'_k \in \{H, E, L\}$ is the predicted secondary structure type of the $k$-th residue in the crossover region of the target protein;
5.2.3) using DSSP to determine the secondary structure of individual $x'_i$ and hence the secondary structure sequence of its crossover region, [formula image FDA0002628187920000021], where [symbol image FDA0002628187920000022] is the secondary structure type of the $k$-th residue of $x'_i$;
5.2.4) calculating the score [symbol image FDA0002628187920000025] of the crossover fragment of individual $x'_i$ according to the formula [formula image FDA0002628187920000023], where [formula image FDA0002628187920000024], and [symbol image FDA0002628187920000026] denotes the secondary structure type of the $k$-th residue of $x'_i$;
5.2.5) using DSSP to determine the secondary structure sequence of the crossover region of individual $x_j$: [formula image FDA0002628187920000027] [formula image FDA0002628187920000028];
5.2.6) calculating the secondary structure score [symbol image FDA00026281879200000211] of the crossover fragment of individual $x_j$ according to the formula [formula image FDA0002628187920000029], where [formula image FDA00026281879200000210];
5.2.7) comparing [symbol image FDA00026281879200000212] with [symbol image FDA00026281879200000213]; if [condition image FDA00026281879200000214], setting $x''_i = x'_i$ and going to step 5.3); otherwise going to step 5.2.8);
5.2.8) replacing the corresponding fragment of individual $x'_i$ with the crossover fragment of individual $x_j$ to generate the crossover individual $x''_i$;
5.3) performing the following selection operation on the target individual and the crossover individual:
5.3.1) generating a random value $rn \in [0, 1]$; if $rn > 0.5$, going to step 5.3.2); otherwise going to step 5.3.3);
5.3.2) calculating the energies $E_i$ and $E''_i$ of the target individual $x_i$ and the crossover individual $x''_i$; if $E''_i < E_i$, $x''_i$ replaces $x_i$ as the next-generation target individual; otherwise no replacement is made and $x_i$ is kept as the next-generation target individual; then going to step 6);
5.3.3) calculating the secondary structure scores $S_T$ and $S''_T$ of the target individual $x_i$ and the crossover individual $x''_i$ according to the formulas [formula image FDA00026281879200000215], where [formula image FDA00026281879200000216], and [formula image FDA00026281879200000217] [formula image FDA0002628187920000031], where [formula image FDA0002628187920000032]; if $S''_T > S_T$, $x''_i$ replaces $x_i$ as the next-generation target individual; otherwise $x_i$ is kept as the next-generation target individual; then going to step 6);
6) after step 5) has been executed for every individual in the population, setting the iteration count $G = G + 1$ and checking whether $G > G_{\max}$; if $G > G_{\max}$, stopping the iteration and exiting; otherwise returning to step 5).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010803348.XA CN111951885B (en) 2020-08-11 2020-08-11 Protein structure prediction method based on local bias

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010803348.XA CN111951885B (en) 2020-08-11 2020-08-11 Protein structure prediction method based on local bias

Publications (2)

Publication Number Publication Date
CN111951885A true CN111951885A (en) 2020-11-17
CN111951885B CN111951885B (en) 2022-05-03

Family

ID=73331694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010803348.XA Active CN111951885B (en) 2020-08-11 2020-08-11 Protein structure prediction method based on local bias

Country Status (1)

Country Link
CN (1) CN111951885B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112967751A (en) * 2021-03-21 2021-06-15 Hunan University Protein conformation space optimization method based on evolution search

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017011779A1 (en) * 2015-07-16 2017-01-19 Dnastar, Inc. Protein structure prediction system
CN108334746A (en) * 2018-01-15 2018-07-27 Zhejiang University of Technology Protein structure prediction method based on secondary structure similarity
CN109033753A (en) * 2018-06-07 2018-12-18 Zhejiang University of Technology Population-based protein structure prediction method based on secondary structure fragment assembly
CN109086566A (en) * 2018-07-12 2018-12-25 Zhejiang University of Technology Population-based protein structure prediction method based on fragment resampling
CN109101785A (en) * 2018-07-12 2018-12-28 Zhejiang University of Technology Protein structure prediction method based on a secondary structure similarity selection strategy
CN109300505A (en) * 2018-08-29 2019-02-01 Zhejiang University of Technology Protein structure prediction method based on biased sampling
US20200234788A1 (en) * 2019-01-04 2020-07-23 President And Fellows Of Harvard College Protein structures from amino-acid sequences using neural networks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
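Jad Abbass et al.: "Enhancing fragment-based protein structure prediction by customising fragment cardinality according to local secondary structure", 《METHODOLOGY ARTICLE》 *
Bao Chen et al.: "Protein secondary structure prediction based on multi-scale convolution and recurrent neural networks", 《Genomics and Applied Biology》 *
Wang Xiaoqi et al.: "Protein structure prediction method assisted by distance and hydrophobic models", 《Journal of Chinese Computer Systems》 *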

Also Published As

Publication number Publication date
CN111951885B (en) 2022-05-03

Similar Documents

Publication Publication Date Title
CN107609342B (en) Protein conformation search method based on secondary structure space distance constraint
CN107633157B (en) Protein conformation space optimization method based on distribution estimation and copy exchange strategy
CN108334746B (en) Protein structure prediction method based on secondary structure similarity
CN106055920B (en) Protein structure prediction method based on staged multi-strategy replica exchange
CN109360599B (en) Protein structure prediction method based on residue contact information cross strategy
CN108062457B (en) Protein structure prediction method for structure feature vector auxiliary selection
CN110010194A (en) RNA secondary structure prediction method
CN109086566B (en) Group protein structure prediction method based on fragment resampling
KR20030043908A (en) Method for Determining Three-Dimensional Protein Structure from Primary Protein Sequence
CN111951885B (en) Protein structure prediction method based on local bias
Di Francesco et al. Fold recognition using predicted secondary structure sequences and hidden Markov models of protein folds
CN109215733B (en) Protein structure prediction method based on residue contact information auxiliary evaluation
CN109378034B (en) Protein prediction method based on distance distribution estimation
Bernard et al. State-of-the-RNArt: benchmarking current methods for RNA 3D structure prediction
CN109346128B (en) Protein structure prediction method based on residue information dynamic selection strategy
CN109360597B (en) Group protein structure prediction method based on global and local strategy cooperation
CN109300505B (en) Protein structure prediction method based on biased sampling
CN109033753B (en) Group protein structure prediction method based on secondary structure fragment assembly
CN108595910B (en) Group protein conformation space optimization method based on diversity index
CN109326319B (en) Protein conformation space optimization method based on secondary structure knowledge
CN109147867B (en) Group protein structure prediction method based on dynamic segment length
CN109411013B (en) Group protein structure prediction method based on individual specific variation strategy
CN109243526B (en) Protein structure prediction method based on specific fragment crossing
CN109326318B (en) Group protein structure prediction method based on Loop region Gaussian disturbance
CN109063413B (en) Method for optimizing space of protein conformation by population hill climbing iteration

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant