CN111951885B - Protein structure prediction method based on local bias - Google Patents

Info

Publication number
CN111951885B
Authority
CN
China
Prior art keywords
fragment
individual
secondary structure
individuals
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010803348.XA
Other languages
Chinese (zh)
Other versions
CN111951885A (en)
Inventor
彭绍亮
陈健
王小奇
陈东
李肯立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University
Priority to CN202010803348.XA
Publication of CN111951885A
Application granted
Publication of CN111951885B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G16B15/20 Protein or domain folding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Medical Informatics (AREA)
  • Biotechnology (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Bioethics (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Public Health (AREA)
  • Chemical & Material Sciences (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the fields of bioinformatics, intelligent optimization and computer applications, and discloses a protein structure prediction method based on local bias. The method comprises the following steps: calculating the hydrophobic scale difference between the mutation-window fragment of the target individual and each fragment in the fragment library, together with the secondary structure score of each fragment in the library; scoring and ranking the fragments in the library; selecting one of the best fragments for fragment assembly and deciding through a Monte Carlo mechanism whether the insertion is accepted, thereby determining the mutant individual; calculating the secondary structure scores of the crossover fragments of the mutant individual and a randomly chosen individual to determine the crossover individual; and using a random number to decide whether the target individual and the crossover individual are compared by energy value or by secondary structure score, so as to select the next-generation target individual. The invention addresses the shortcomings of traditional conformational space optimization methods, namely low sampling efficiency and low prediction accuracy, and provides an improved structure-model scoring approach based on the hydrophobic properties of amino acids and their local structural features.

Description

Protein structure prediction method based on local bias
Technical field:
The invention relates to the fields of bioinformatics, intelligent optimization and computer applications, and in particular to a protein structure prediction method based on local bias.
Background art:
Protein tertiary structure prediction is one of the major research problems in structural biology. Proteins are long chains built from 20 different amino acid residues that fold into unique three-dimensional structures under specific conditions and thereby perform their biological functions. Predicting protein structure computationally has become a mainstream approach in this field. De novo prediction is one of the methods for accurately predicting the three-dimensional structure of a protein from its one-dimensional amino acid sequence, but the complexity and high dimensionality of the conformational search space remain its most important bottleneck.
The folding process of a protein is very complicated. Among the factors that influence it, the hydrophobic interaction of amino acids plays one of the main roles, so taking the hydrophilicity and hydrophobicity of amino acids into account can help improve the sampling efficiency of de novo prediction. The basic determinant of a protein's structure is its one-dimensional amino acid sequence, which coils and folds to form a protein molecule with a specific spatial structure; jointly considering the primary structure (the one-dimensional amino acid sequence) and secondary structure information therefore helps to further improve the efficiency and accuracy of structure prediction.
However, existing conformational space optimization methods fall short in prediction accuracy and sampling efficiency; building a model that combines the above factors can therefore improve on the existing methods.
Summary of the invention:
To overcome the low sampling efficiency and low prediction accuracy of existing protein conformation optimization methods, the invention provides a local-bias-based protein structure prediction method with high sampling efficiency and high prediction accuracy.
The technical solution adopted by the invention to solve this technical problem is as follows:
A method for predicting protein structure based on local bias, comprising the following steps:
1) Given the input sequence information;
2) Predict the secondary structure of the target protein using the PSIPRED platform, and construct a fragment library using the ROSETTA platform;
3) Initialize parameters: set the population size Ps, the iteration counter G, the maximum number of generations G_max, the initial population search trajectory length N, the crossover fragment length c, the fragment length l, the mutation counter T, and the maximum count value T_max;
4) Initialize the population: start Ps Monte Carlo trajectories and search each trajectory N times to generate Ps initial individuals;
5) For each target individual x_i, i ∈ {1, 2, ..., Ps}, proceed as follows:
5.1) Perform the mutation operation on individual x_i:
5.1.1) Randomly generate an integer d′ ∈ [1, l − m], and thereby determine the fragment insertion window [d′, d′ + m] of individual x_i, where m is the window size;
5.1.2) According to the formula
Figure BDA0002628187930000021
calculate the hydrophobic scale difference between the window fragment and a fragment-library fragment, where
Figure BDA0002628187930000022
is the hydrophobicity value of the i-th residue in the window fragment of individual x_i, and
Figure BDA0002628187930000023
is the hydrophobicity value of the i-th residue of the fragment in the library;
5.1.3) From the predicted secondary structure, determine the secondary structure of the target protein in the corresponding window region, S_pre = {sec_k | d′ ≤ k ≤ d′ + m}, where sec_k ∈ {H, E, L} is the predicted secondary structure type of the k-th residue in the target protein window region, and H, E and L denote α-helix, β-sheet and loop region, respectively;
5.1.4) According to the formula
Figure BDA0002628187930000024
where
Figure BDA0002628187930000025
calculate the secondary structure score of each fragment in the fragment library one by one, where
Figure BDA0002628187930000026
denotes the secondary structure type of the k-th residue of fragment h in the fragment library;
5.1.5) According to the formula
Figure BDA0002628187930000027
calculate the score of every fragment in the fragment library and sort the fragments from high to low, where w_1 and w_2 are the weights of the hydrophobic scale difference and the secondary structure score, respectively, ΔR_h denotes the hydrophobic scale difference between the h-th fragment in the fragment library and the target window, and
Figure BDA0002628187930000028
denotes the secondary structure score of the h-th fragment in the fragment library;
5.1.6) Randomly select one fragment from the n highest-scoring fragments and assemble it into individual x_i; decide by the Monte Carlo mechanism whether the fragment insertion is accepted; if it is accepted, the mutant individual x′_i is obtained and the method proceeds to step 5.2); otherwise it proceeds to step 5.1.7);
5.1.7) Update the counter T; if T < T_max, return to step 5.1.6); otherwise perform the fragment assembly directly to generate the mutant individual x′_i and reset T = 0;
5.2) Randomly select an individual x_j, j ∈ {1, 2, ..., Ps} with j ≠ i, and perform the following crossover operation:
5.2.1) Generate a random integer d′ ∈ [1, l − c] and determine the crossover region [d′, d′ + c];
5.2.2) From the predicted secondary structure, determine the secondary structure of the target protein in the crossover region, S′_pre = {sec′_k | d′ ≤ k ≤ d′ + c}, where sec′_k ∈ {H, E, L} is the predicted secondary structure type of the k-th residue in the target protein crossover region;
5.2.3) Use DSSP to determine the secondary structure of individual x′_i, and thereby the secondary structure sequence corresponding to the crossover region
Figure BDA0002628187930000031
where
Figure BDA0002628187930000032
is the secondary structure type of the k-th residue in x′_i;
5.2.4) According to the formula
Figure BDA0002628187930000033
where
Figure BDA0002628187930000034
calculate the secondary structure score of the crossover fragment in individual x′_i,
Figure BDA0002628187930000035
where
Figure BDA0002628187930000036
denotes the secondary structure type of the k-th residue in x′_i;
5.2.5) Use DSSP to determine the secondary structure sequence corresponding to the crossover region in individual x_j
Figure BDA0002628187930000037
Figure BDA0002628187930000038
5.2.6) According to the formula
Figure BDA0002628187930000039
where
Figure BDA00026281879300000310
calculate the secondary structure score of the crossover fragment in individual x_j,
Figure BDA00026281879300000311
5.2.7) Compare the magnitudes of
Figure BDA00026281879300000312
and
Figure BDA00026281879300000313
If
Figure BDA00026281879300000314
holds, set x″_i = x′_i and proceed to step 5.3); otherwise, perform step 5.2.8);
5.2.8) Replace the corresponding fragment in individual x′_i with the crossover fragment of individual x_j to generate the crossover individual x″_i;
5.3) Perform the following selection operation on the target individual and the crossover individual:
5.3.1) Generate a random value rn ∈ [0, 1]; if rn > 0.5, proceed to step 5.3.2); otherwise proceed to step 5.3.3);
5.3.2) Calculate the energies E_i and E″_i of the target individual x_i and the crossover individual x″_i, respectively; if E″_i < E_i, x″_i replaces x_i as the next-generation target individual; otherwise no replacement is performed and x_i is kept as the next-generation target individual; then proceed to step 6);
5.3.3) According to the formula
Figure BDA00026281879300000315
where
Figure BDA00026281879300000316
and
Figure BDA00026281879300000317
Figure BDA00026281879300000318
where
Figure BDA00026281879300000319
calculate the secondary structure scores S_T and S″_T of the target individual x_i and the crossover individual x″_i, respectively; if S″_T > S_T, x″_i replaces x_i as the next-generation target individual; otherwise x_i is kept as the next-generation target individual; then proceed to step 6);
6) After step 5) has been executed for every individual in the population, increment the iteration counter, G = G + 1, and check whether G > G_max; if G > G_max, stop the iteration and exit; otherwise return to step 5).
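The fragment ranking of steps 5.1.2) to 5.1.5) can be illustrated with a minimal Python sketch. The exact formulas are given only as figure images in this text, so the forms used here are assumptions: the hydrophobic scale difference is taken as a sum of absolute per-residue differences, the secondary structure score as the number of matching H/E/L labels, and the combined score as w2 * S - w1 * ΔR so that sorting from high to low favours fragments with a small hydrophobic difference. The Kyte-Doolittle hydropathy scale and all function names are likewise illustrative choices, not taken from the patent.

```python
import random

# Kyte-Doolittle hydropathy values (an assumed scale; the patent does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_difference(window_seq, fragment_seq):
    """Sum of absolute per-residue hydropathy differences (assumed form of ΔR)."""
    if len(window_seq) != len(fragment_seq):
        raise ValueError("window and fragment must have the same length")
    return sum(abs(KD[a] - KD[b]) for a, b in zip(window_seq, fragment_seq))


def secondary_structure_score(predicted_ss, fragment_ss):
    """Number of positions whose H/E/L label matches the prediction (assumed form)."""
    return sum(1 for p, f in zip(predicted_ss, fragment_ss) if p == f)


def rank_fragments(window_seq, predicted_ss, library, w1=1.0, w2=1.0):
    """Score every (sequence, ss) fragment and sort from highest to lowest score."""
    scored = []
    for frag_seq, frag_ss in library:
        delta_r = hydrophobic_difference(window_seq, frag_seq)
        s_h = secondary_structure_score(predicted_ss, frag_ss)
        # Assumed combination: reward structure matches, penalise hydrophobic mismatch.
        scored.append((w2 * s_h - w1 * delta_r, (frag_seq, frag_ss)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


def pick_from_top(ranked, n, rng=random):
    """Randomly choose one fragment among the n best-ranked ones (step 5.1.6)."""
    return rng.choice(ranked[:n])[1]


library = [("AAA", "HHH"), ("RRR", "LLL"), ("AIA", "HHE")]
ranked = rank_fragments("AAA", "HHH", library)
```

In this toy library the fragment identical to the window ranks first and the strongly hydrophilic, loop-labelled fragment ranks last; the weights w1 and w2 would need tuning in any real use.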
The technical concept of the invention is as follows: within the basic framework of an evolutionary algorithm, mutation and crossover based on the amino acid hydrophobic scale and on secondary structure similarity are performed on each target individual; the population update is guided by a Monte Carlo mechanism and an energy function, and promising conformations are selected to enter the next-generation population.
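The crossover and selection operations can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented implementation: an individual is modelled as a dictionary holding a secondary structure string (standing in for a DSSP assignment) and a list of per-residue torsion values; the secondary structure score is assumed to be a count of labels matching the prediction; the mutant is assumed to be kept when its score is at least as high as the partner's; and positions are 0-based. All names are hypothetical.

```python
import random


def ss_score(predicted_ss, observed_ss):
    """Count positions where the observed H/E/L label equals the predicted one."""
    return sum(1 for p, o in zip(predicted_ss, observed_ss) if p == o)


def crossover(mutant, partner, predicted_ss, start, length):
    """Steps 5.2.2) to 5.2.8): keep the mutant if its crossover region already
    matches the prediction at least as well as the partner's region; otherwise
    copy the partner's region into a new individual."""
    end = start + length
    region_pred = predicted_ss[start:end]
    s_mutant = ss_score(region_pred, mutant["ss"][start:end])
    s_partner = ss_score(region_pred, partner["ss"][start:end])
    if s_mutant >= s_partner:
        return mutant
    # In a real pipeline the labels would be recomputed (e.g. by DSSP) after the
    # torsion angles are copied; copying the labels is a stand-in here.
    return {
        "ss": mutant["ss"][:start] + partner["ss"][start:end] + mutant["ss"][end:],
        "torsions": mutant["torsions"][:start]
        + partner["torsions"][start:end]
        + mutant["torsions"][end:],
    }


def select_next(target, trial, energy_fn, predicted_ss, rng=random):
    """Step 5.3): a random value decides whether energy or secondary structure
    score is the selection criterion."""
    if rng.random() > 0.5:
        return trial if energy_fn(trial) < energy_fn(target) else target
    better = ss_score(predicted_ss, trial["ss"]) > ss_score(predicted_ss, target["ss"])
    return trial if better else target


predicted = "HHHHLL"
mutant = {"ss": "HLLHLL", "torsions": [1, 1, 1, 1, 1, 1]}
partner = {"ss": "HHHHEE", "torsions": [2, 2, 2, 2, 2, 2]}
child = crossover(mutant, partner, predicted, start=0, length=4)
```

Here the partner's first four residues match the predicted helix better than the mutant's, so the child inherits that region while keeping the mutant's remaining residues.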
The beneficial effects of the invention are as follows: on the one hand, a conformational space sampling strategy is designed using the hydrophobic properties of amino acids and secondary structure knowledge, which improves search efficiency; on the other hand, the Monte Carlo mechanism and the energy function jointly guide the population update, which greatly improves prediction accuracy.
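The acceptance test with a retry counter (steps 5.1.6) and 5.1.7)) might look like the sketch below. The text only says "Monte Carlo mechanism"; the Metropolis criterion, the temperature-like parameter kt, and the choice to force the last attempted fragment once T reaches T_max are assumptions made for illustration. The callables insert_fn and energy_fn are hypothetical placeholders for fragment assembly and an energy function.

```python
import math
import random


def metropolis_accept(e_old, e_new, kt=2.0, rng=random):
    """Assumed Metropolis rule: always accept downhill moves, accept uphill
    moves with probability exp(-(e_new - e_old) / kt)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kt)


def mutate_with_retries(individual, candidates, insert_fn, energy_fn,
                        t_max, kt=2.0, rng=random):
    """Try fragments drawn from the top-ranked candidates until one is accepted
    or the counter reaches t_max, in which case the last insertion is kept.
    Returns (mutant, accepted_by_monte_carlo)."""
    e_old = energy_fn(individual)
    t = 0
    while True:
        fragment = rng.choice(candidates)
        trial = insert_fn(individual, fragment)
        if metropolis_accept(e_old, energy_fn(trial), kt, rng):
            return trial, True
        t += 1  # step 5.1.7): update the counter after a rejection
        if t >= t_max:
            # Forced assembly; the caller starts the next individual with T = 0.
            return trial, False
```

As a toy usage, `mutate_with_retries(0.0, [-1.0], lambda ind, f: ind + f, lambda x: x, t_max=150)` treats a number as both the individual and its energy, and the downhill move is accepted on the first attempt.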
Description of the drawings:
FIG. 1 is a flow chart of a method for predicting protein structure based on local bias;
FIG. 2 is a schematic diagram of the conformational update when the structure prediction of protein 1ail is performed based on a locally biased protein structure prediction method;
FIG. 3 is a three-dimensional structure diagram obtained by performing structure prediction of protein 1ail based on a locally biased protein structure prediction method.
Detailed description of the embodiments:
The invention is described in further detail below with reference to the accompanying drawings and specific embodiments:
modifications to the embodiments as appropriate to the teachings of the disclosure
Referring to FIGS. 1 to 3, a method for predicting protein structure based on local bias comprises the following steps:
1) Given the input sequence information;
2) Predict the secondary structure of the target protein using the PSIPRED platform, and construct a fragment library using the ROSETTA platform;
3) Initialize parameters: set the population size Ps = 100, the iteration counter G = 0, the maximum number of generations G_max = 200, the initial population search trajectory length N = 2500, the crossover fragment length c = 6, the fragment length l = 6, the mutation counter T = 0, and the maximum count value T_max = 150;
4) Initialize the population: start Ps Monte Carlo trajectories and search each trajectory N times to generate Ps initial individuals;
5) For each target individual x_i, i ∈ {1, 2, ..., Ps}, proceed as follows:
5.1) Perform the mutation operation on individual x_i:
5.1.1) Randomly generate an integer d′ ∈ [1, l − m], and thereby determine the fragment insertion window [d′, d′ + m] of individual x_i, where m is the window size;
5.1.2) According to the formula
Figure BDA0002628187930000051
calculate the hydrophobic scale difference between the window fragment and a fragment-library fragment, where
Figure BDA0002628187930000052
is the hydrophobicity value of the i-th residue in the window fragment of individual x_i, and
Figure BDA0002628187930000053
is the hydrophobicity value of the i-th residue of the fragment in the library;
5.1.3) From the predicted secondary structure, determine the secondary structure of the target protein in the corresponding window region, S_pre = {sec_k | d′ ≤ k ≤ d′ + m}, where sec_k ∈ {H, E, L} is the predicted secondary structure type of the k-th residue in the target protein window region, and H, E and L denote α-helix, β-sheet and loop region, respectively;
5.1.4) According to the formula
Figure BDA0002628187930000054
where
Figure BDA0002628187930000055
calculate the secondary structure score of each fragment in the fragment library one by one, where
Figure BDA0002628187930000056
denotes the secondary structure type of the k-th residue of fragment h in the fragment library;
5.1.5) According to the formula
Figure BDA0002628187930000057
calculate the score of every fragment in the fragment library and sort the fragments from high to low, where w_1 and w_2 are the weights of the hydrophobic scale difference and the secondary structure score, respectively, ΔR_h denotes the hydrophobic scale difference between the h-th fragment in the fragment library and the target window, and
Figure BDA0002628187930000058
denotes the secondary structure score of the h-th fragment in the fragment library;
5.1.6) Randomly select one fragment from the n highest-scoring fragments and assemble it into individual x_i; decide by the Monte Carlo mechanism whether the fragment insertion is accepted; if it is accepted, the mutant individual x′_i is obtained and the method proceeds to step 5.2); otherwise it proceeds to step 5.1.7);
5.1.7) Update the counter T; if T < T_max, return to step 5.1.6); otherwise perform the fragment assembly directly to generate the mutant individual x′_i and reset T = 0;
5.2) Randomly select an individual x_j, j ∈ {1, 2, ..., Ps} with j ≠ i, and perform the following crossover operation:
5.2.1) Generate a random integer d′ ∈ [1, l − c] and determine the crossover region [d′, d′ + c];
5.2.2) From the predicted secondary structure, determine the secondary structure of the target protein in the crossover region, S′_pre = {sec′_k | d′ ≤ k ≤ d′ + c}, where sec′_k ∈ {H, E, L} is the predicted secondary structure type of the k-th residue in the target protein crossover region;
5.2.3) Use DSSP to determine the secondary structure of individual x′_i, and thereby the secondary structure sequence corresponding to the crossover region
Figure BDA0002628187930000059
where
Figure BDA00026281879300000510
is the secondary structure type of the k-th residue in x′_i;
5.2.4) According to the formula
Figure BDA0002628187930000061
where
Figure BDA0002628187930000062
calculate the secondary structure score of the crossover fragment in individual x′_i,
Figure BDA0002628187930000063
where
Figure BDA0002628187930000064
denotes the secondary structure type of the k-th residue in x′_i;
5.2.5) Use DSSP to determine the secondary structure sequence corresponding to the crossover region in individual x_j
Figure BDA0002628187930000065
Figure BDA0002628187930000066
5.2.6) According to the formula
Figure BDA0002628187930000067
where
Figure BDA0002628187930000068
calculate the secondary structure score of the crossover fragment in individual x_j,
Figure BDA0002628187930000069
5.2.7) Compare the magnitudes of
Figure BDA00026281879300000610
and
Figure BDA00026281879300000611
If
Figure BDA00026281879300000612
holds, set x″_i = x′_i and proceed to step 5.3); otherwise, perform step 5.2.8);
5.2.8) Replace the corresponding fragment in individual x′_i with the crossover fragment of individual x_j to generate the crossover individual x″_i;
5.3) Perform the following selection operation on the target individual and the crossover individual:
5.3.1) Generate a random value rn ∈ [0, 1]; if rn > 0.5, proceed to step 5.3.2); otherwise proceed to step 5.3.3);
5.3.2) Calculate the energies E_i and E″_i of the target individual x_i and the crossover individual x″_i, respectively; if E″_i < E_i, x″_i replaces x_i as the next-generation target individual; otherwise no replacement is performed and x_i is kept as the next-generation target individual; then proceed to step 6);
5.3.3) According to the formula
Figure BDA00026281879300000613
where
Figure BDA00026281879300000614
and
Figure BDA00026281879300000615
Figure BDA00026281879300000616
where
Figure BDA00026281879300000617
calculate the secondary structure scores S_T and S″_T of the target individual x_i and the crossover individual x″_i, respectively; if S″_T > S_T, x″_i replaces x_i as the next-generation target individual; otherwise x_i is kept as the next-generation target individual; then proceed to step 6);
6) After step 5) has been executed for every individual in the population, increment the iteration counter, G = G + 1, and check whether G > G_max; if G > G_max, stop the iteration and exit; otherwise return to step 5).
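The parameter values of this embodiment and the termination test of step 6) can be captured in a short Python sketch. The field names and the evolve_individual callable are hypothetical; the callable stands in for the whole of step 5) (mutation, crossover and selection for one individual). Updating all individuals from the previous generation's population in one pass is also an assumption, since the text does not say whether updates are applied synchronously.

```python
from dataclasses import dataclass


@dataclass
class Params:
    """Parameter values quoted in step 3) of the embodiment."""
    population_size: int = 100      # Ps
    max_generations: int = 200      # G_max
    trajectory_length: int = 2500   # N
    crossover_length: int = 6       # c
    fragment_length: int = 6        # l
    max_mutation_tries: int = 150   # T_max


def run(params, population, evolve_individual):
    """Apply step 5) to every individual, then increment G and stop once
    G > G_max. Returns the final population and the number of passes made."""
    g = 0
    passes = 0
    while True:
        population = [evolve_individual(i, population)
                      for i in range(len(population))]
        passes += 1
        g += 1
        if g > params.max_generations:
            return population, passes


# Toy run: each "individual" is a counter incremented once per pass.
final_population, passes = run(Params(max_generations=3), [0, 0],
                               lambda i, pop: pop[i] + 1)
```

Read literally, a counter that starts at G = 0 and is tested with G > G_max after each increment performs G_max + 1 passes over the population, which the toy run reproduces (four passes for G_max = 3).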
Taking the α protein 1ail, with a sequence length of 70, as an example, the method described above yields a near-native conformation of the protein. Among the 200 individuals of the final population, the minimum root-mean-square deviation is
Figure BDA00026281879300000618
and the mean root-mean-square deviation is
Figure BDA00026281879300000619
The predicted structure is shown in FIG. 3.
The above describes the prediction results obtained by the invention with protein 1ail as an example and is not intended to limit the scope of the invention; various modifications and improvements may be made without departing from that scope.

Claims (1)

1. A method for predicting protein structure based on local bias, comprising the steps of:
1) Given the input sequence information;
2) Predict the secondary structure of the target protein using the PSIPRED platform, and construct a fragment library using the ROSETTA platform;
3) Initialize parameters: set the population size Ps, the iteration counter G, the maximum number of generations G_max, the initial population search trajectory length N, the crossover fragment length c, the fragment length l, the mutation counter T, and the maximum count value T_max;
4) Initialize the population: start Ps Monte Carlo trajectories and search each trajectory N times to generate Ps initial individuals;
5) For each target individual x_i, i ∈ {1, 2, ..., Ps}, proceed as follows:
5.1) Perform the mutation operation on individual x_i:
5.1.1) Randomly generate an integer d′ ∈ [1, l − m], and thereby determine the fragment insertion window [d′, d′ + m] of individual x_i, where m is the window size;
5.1.2) According to the formula
Figure FDA0003544582090000011
calculate the hydrophobic scale difference between the window fragment and a fragment-library fragment, where
Figure FDA0003544582090000012
is the hydrophobicity value of the i-th residue in the window fragment of individual x_i, and
Figure FDA0003544582090000013
is the hydrophobicity value of the i-th residue of the fragment in the library;
5.1.3) From the predicted secondary structure, determine the secondary structure of the target protein in the corresponding window region, S_pre = {sec_k | d′ ≤ k ≤ d′ + m}, where sec_k ∈ {H, E, L} is the predicted secondary structure type of the k-th residue in the target protein window region, and H, E and L denote α-helix, β-sheet and loop region, respectively;
5.1.4) According to the formula
Figure FDA0003544582090000014
where
Figure FDA0003544582090000015
calculate the secondary structure score of each fragment in the fragment library one by one, where
Figure FDA0003544582090000016
denotes the secondary structure type of the k-th residue of fragment h in the fragment library;
5.1.5) According to the formula
Figure FDA0003544582090000017
calculate the score of every fragment in the fragment library and sort the fragments from high to low, where w_1 and w_2 are the weights of the hydrophobic scale difference and the secondary structure score, respectively, ΔR_h denotes the hydrophobic scale difference between the h-th fragment in the fragment library and the target window, and
Figure FDA0003544582090000018
denotes the secondary structure score of the h-th fragment in the fragment library;
5.1.6) Randomly select one fragment from the n highest-scoring fragments and assemble it into individual x_i; decide by the Monte Carlo mechanism whether the fragment insertion is accepted; if it is accepted, the mutant individual x′_i is obtained and the method proceeds to step 5.2); otherwise it proceeds to step 5.1.7);
5.1.7) Update the counter T; if T < T_max, return to step 5.1.6); otherwise perform the fragment assembly directly to generate the mutant individual x′_i and reset T = 0;
5.2) Randomly select an individual x_j, j ∈ {1, 2, ..., Ps} with j ≠ i, and perform the following crossover operation:
5.2.1) Generate a random integer d′ ∈ [1, l − c] and determine the crossover region [d′, d′ + c];
5.2.2) From the predicted secondary structure, determine the secondary structure of the target protein in the crossover region, S′_pre = {sec′_k | d′ ≤ k ≤ d′ + c}, where sec′_k ∈ {H, E, L} is the predicted secondary structure type of the k-th residue in the target protein crossover region;
5.2.3) Use DSSP to determine the secondary structure of individual x′_i, and thereby the secondary structure sequence corresponding to the crossover region
Figure FDA0003544582090000021
where
Figure FDA0003544582090000022
is the secondary structure type of the k-th residue in x′_i;
5.2.4) According to the formula
Figure FDA0003544582090000023
where
Figure FDA0003544582090000024
calculate the secondary structure score of the crossover fragment in individual x′_i,
Figure FDA0003544582090000025
where
Figure FDA0003544582090000026
denotes the secondary structure type of the k-th residue in x′_i;
5.2.5) Use DSSP to determine the secondary structure sequence corresponding to the crossover region in individual x_j
Figure FDA0003544582090000027
Figure FDA0003544582090000028
5.2.6) According to the formula
Figure FDA0003544582090000029
where
Figure FDA00035445820900000210
calculate the secondary structure score of the crossover fragment in individual x_j,
Figure FDA00035445820900000211
5.2.7) Compare the magnitudes of
Figure FDA00035445820900000212
and
Figure FDA00035445820900000213
If
Figure FDA00035445820900000214
holds, set x″_i = x′_i and proceed to step 5.3); otherwise, perform step 5.2.8);
5.2.8) Replace the corresponding fragment in individual x′_i with the crossover fragment of individual x_j to generate the crossover individual x″_i;
5.3) Perform the following selection operation on the target individual and the crossover individual:
5.3.1) Generate a random value rn ∈ [0, 1]; if rn > 0.5, proceed to step 5.3.2); otherwise proceed to step 5.3.3);
5.3.2) Calculate the energies E_i and E″_i of the target individual x_i and the crossover individual x″_i, respectively; if E″_i < E_i, x″_i replaces x_i as the next-generation target individual; otherwise no replacement is performed and x_i is kept as the next-generation target individual; then proceed to step 6);
5.3.3) According to the formula
Figure FDA00035445820900000215
where
Figure FDA00035445820900000216
and
Figure FDA00035445820900000217
Figure FDA0003544582090000031
where
Figure FDA0003544582090000032
calculate the secondary structure scores S_T and S″_T of the target individual x_i and the crossover individual x″_i, respectively; if S″_T > S_T, x″_i replaces x_i as the next-generation target individual; otherwise x_i is kept as the next-generation target individual; then proceed to step 6);
6) After step 5) has been executed for every individual in the population, increment the iteration counter, G = G + 1, and check whether G > G_max; if G > G_max, stop the iteration and exit; otherwise return to step 5).
CN202010803348.XA 2020-08-11 2020-08-11 Protein structure prediction method based on local bias Active CN111951885B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010803348.XA CN111951885B (en) 2020-08-11 2020-08-11 Protein structure prediction method based on local bias

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010803348.XA CN111951885B (en) 2020-08-11 2020-08-11 Protein structure prediction method based on local bias

Publications (2)

Publication Number Publication Date
CN111951885A CN111951885A (en) 2020-11-17
CN111951885B CN111951885B (en) 2022-05-03

Family

ID=73331694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010803348.XA Active CN111951885B (en) 2020-08-11 2020-08-11 Protein structure prediction method based on local bias

Country Status (1)

Country Link
CN (1) CN111951885B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112967751A (en) * 2021-03-21 2021-06-15 湖南大学 Protein conformation space optimization method based on evolution search

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017011779A1 (en) * 2015-07-16 2017-01-19 Dnastar, Inc. Protein structure prediction system
CN108334746B (en) * 2018-01-15 2021-06-18 浙江工业大学 Protein structure prediction method based on secondary structure similarity
CN109033753B (en) * 2018-06-07 2021-06-18 浙江工业大学 Group protein structure prediction method based on secondary structure fragment assembly
CN109086566B (en) * 2018-07-12 2021-06-18 浙江工业大学 Group protein structure prediction method based on fragment resampling
CN109101785B (en) * 2018-07-12 2021-06-18 浙江工业大学 Protein structure prediction method based on secondary structure similarity selection strategy
CN109300505B (en) * 2018-08-29 2021-05-18 浙江工业大学 Protein structure prediction method based on biased sampling
US11581060B2 (en) * 2019-01-04 2023-02-14 President And Fellows Of Harvard College Protein structures from amino-acid sequences using neural networks

Also Published As

Publication number Publication date
CN111951885A (en) 2020-11-17

Similar Documents

Publication Publication Date Title
CN107609342B (en) Protein conformation search method based on secondary structure space distance constraint
CN108334746B (en) Protein structure prediction method based on secondary structure similarity
CN107633157B (en) Protein conformation space optimization method based on distribution estimation and copy exchange strategy
CN110010194A (en) A kind of prediction technique of RNA secondary structure
CN109360599B (en) Protein structure prediction method based on residue contact information cross strategy
CA3024017C (en) Neural network architectures for scoring and visualizing biological sequence variations using molecular phenotype, and systems and methods therefor
CN111951885B (en) Protein structure prediction method based on local bias
CN109086566B (en) Group protein structure prediction method based on fragment resampling
Zhao et al. Identifying N6-methyladenosine sites using extreme gradient boosting system optimized by particle swarm optimizer
Di Francesco et al. Fold recognition using predicted secondary structure sequences and hidden Markov models of protein folds
CN109215733B (en) Protein structure prediction method based on residue contact information auxiliary evaluation
CN109378034B (en) Protein prediction method based on distance distribution estimation
CN109033753B (en) Group protein structure prediction method based on secondary structure fragment assembly
CN109346128B (en) Protein structure prediction method based on residue information dynamic selection strategy
CN109300505B (en) Protein structure prediction method based on biased sampling
CN109360597B (en) Group protein structure prediction method based on global and local strategy cooperation
CN109326319B (en) Protein conformation space optimization method based on secondary structure knowledge
CN109360598B (en) Protein structure prediction method based on two-stage sampling
CN108595910B (en) Group protein conformation space optimization method based on diversity index
CN109147867B (en) Group protein structure prediction method based on dynamic segment length
CN109411013B (en) Group protein structure prediction method based on individual specific variation strategy
CN109243526B (en) Protein structure prediction method based on specific fragment crossing
CN109390035B (en) Protein conformation space optimization method based on local structure comparison
CN112967751A (en) Protein conformation space optimization method based on evolution search
CN110556161B (en) Protein structure prediction method based on conformational diversity sampling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant