CN111180004A - Multi-contact information sub-population strategy protein structure prediction method - Google Patents
- Publication number
- CN111180004A (application CN201911197621.2A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
Abstract
A multi-contact-information sub-population strategy protein structure prediction method. Under the framework of an evolutionary algorithm, the population is first initialized using fragment assembly. The population is then divided into several sub-populations, and each individual in the sub-populations undergoes mutation and crossover to generate a new conformation. In the selection stage, new conformations are first selected using the Rosetta score3 energy function; the spatial constraint score S_con(C) is then used to further screen the new low-energy conformations, while a Monte Carlo probabilistic acceptance criterion preserves conformational diversity during selection. By using sub-populations together with contact information predicted by several contact servers to assist structure prediction, the method mitigates the inaccuracy of the energy function and improves population diversity. The invention provides a multi-contact-information sub-population strategy protein structure prediction method with good diversity and high prediction accuracy.
Description
Technical Field
The invention relates to the fields of bioinformatics and computer applications, and in particular to a multi-contact-information sub-population strategy protein structure prediction method.
Background
Protein structure prediction is a major research topic in structural bioinformatics. At the global protein structure prediction competition CASP13, held in Cancún, Mexico, in December 2018, AlphaFold, developed by Google's DeepMind team, ranked first overall. The most innovative aspect of AlphaFold is that it predicts the spatial distance relationships of the protein structure with a machine learning method and uses these distance constraints as an energy function to guide protein folding, which greatly improves prediction accuracy. This work also shows that the deep integration of computer technology, information technology and the life sciences can effectively drive and accelerate scientific discovery. However, de novo prediction methods still face a number of difficulties and challenges.
First, because energy models are inaccurate, the accuracy of inter-residue contact information has become one of the key factors limiting the accuracy of de novo protein structure prediction. Although inter-residue contact prediction has reached an unprecedented level, the accuracy of the predicted contacts remains limited, and the quality of the contacts predicted by different contact prediction servers is uneven, so contact prediction accuracy and structure prediction accuracy do not correspond well.
Second, the inherent complexity of protein conformational space optimization makes it a very challenging research topic in de novo protein structure prediction. To find the unique native structure in a huge sampling space by computer, an efficient conformational space optimization algorithm must be designed that turns the search for the native structure into a tractable computational problem. The differential evolution (DE) algorithm has a simple structure, is easy to implement, is robust and converges quickly, and is widely used in protein conformational space optimization. However, as the amino acid sequence grows longer, the number of degrees of freedom of the protein molecular system also grows, and obtaining the global optimum of a large-scale protein conformational space with a traditional population-based algorithm while maintaining population diversity becomes challenging.
Therefore, existing protein structure prediction methods are deficient in diversity and prediction accuracy and need to be improved.
Disclosure of Invention
To overcome the poor sampling diversity and low prediction accuracy of existing protein structure prediction methods, the invention first collects the contact information predicted by several contact prediction servers and constructs high-confidence contact sets from it. At the same time, using the concept of sub-populations, a different spatial constraint model is applied to each sub-population to assist the Rosetta score3 energy function in guiding conformation selection. The invention provides a multi-contact-information sub-population strategy protein structure prediction method with good diversity and high prediction accuracy.
The technical scheme adopted by the invention for solving the technical problems is as follows:
A multi-contact-information sub-population strategy protein structure prediction method, comprising the following steps:
1) input the sequence information of the target protein;
2) obtain the fragment library files from the ROBETTA server (http://www.robetta.org/) according to the target protein sequence, the fragment library files comprising a 3-residue fragment library file and a 9-residue fragment library file;
3) according to the target protein sequence, predict three contact maps using the RaptorX server (raptorx.uchicago.edu/ContactMap), the ResTriplet server (zhanglab.ccmb.med.umich.edu/ResTriplet) and the DNCON2 server (sysbio.rnet.missouri.edu/DNCON2) respectively; from each contact map, select the L/5 contacts with the highest confidence to form the high-confidence contact sets contcf1, contcf2 and contcf3 respectively, where L is the length of the target protein sequence;
4) construct a high-confidence contact set contcf4 from the contacts in contcf1, contcf2 and contcf3, according to the following rules:
4.1) for the contacts in contcf1, contcf2 and contcf3, if a residue pair is not repeated across the sets, add the contact to contcf4;
4.2) for the contacts in contcf1, contcf2 and contcf3, if a residue pair is repeated, first average the confidences of the repeated contacts, then add the contact with the averaged confidence to contcf4;
4.3) sort the contacts in contcf4 by confidence in descending order, and count the number Num of contacts in contcf4;
5) set the parameters: population size NP, maximum number of generations G, crossover factor CR and temperature factor β; set the generation counter g = 0;
6) population initialization: generate NP initial conformations C_i, i = 1, 2, …, NP, by random fragment assembly, and divide the NP initial conformations equally into 4 sub-populations;
7) for each individual C_i in the population, perform the following operations:
7.1) set C_i as the target individual C_target; randomly select two different individuals C_a and C_b from the population, with C_target ≠ C_a ≠ C_b; from C_a and C_b respectively, randomly select one 9-residue fragment, the two fragments being at different positions, and use them to replace the fragments at the corresponding positions of C_target, generating the mutant conformation C_mutant;
7.2) perform one fragment assembly on C_mutant to generate a new conformation C_mutant′;
7.3) generate a random number pCR ∈ (0,1); if pCR < CR, randomly select a 3-residue fragment from C_target and use it to replace the fragment at the corresponding position of C_mutant′, generating the trial conformation C_trial; otherwise take C_mutant′ directly as C_trial;
7.4) compute the energies score3(C_target) and score3(C_trial) of C_target and C_trial using the Rosetta score3 energy function;
7.5) if score3(C_trial) > score3(C_target), retain C_target;
7.6) if score3(C_trial) < score3(C_target), compute the spatial constraint scores S_con(C) of C_trial and C_target according to equation (1), where S_con(C) is defined as follows: [equation (1) is missing from the source text]
where m and n are the m-th and n-th residues corresponding to the k-th contact in the high-confidence contact set, K is the number of contacts in the high-confidence contact set, d_{m,n} is the Euclidean distance between the m-th and n-th residues in conformation C, and U_{m,n} is the confidence of the contact for residue pair (m, n) in the high-confidence contact set; the high-confidence contact set is chosen according to the sub-population to which the conformation belongs, namely contcf1, contcf2, contcf3 or contcf4 for the four sub-populations respectively;
7.7) if S_con(C_trial) < S_con(C_target), C_trial replaces C_target in the population;
7.8) if S_con(C_trial) > S_con(C_target), C_trial replaces C_target in the population with probability P_accept, and if the replacement is unsuccessful C_target is retained, where P_accept is defined as follows: [the formula for P_accept is missing from the source text]
8) g = g + 1; iterate steps 5) to 8) until g > G;
9) output the conformation with the lowest Rosetta score3 energy as the final result.
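The construction of the combined contact set in step 4) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the dictionary representation, the function name `merge_contact_sets`, and the decision to treat (m, n) and (n, m) as the same residue pair are all assumptions.

```python
from collections import defaultdict

def merge_contact_sets(*contact_sets):
    """Merge per-server contact sets into one list sorted by confidence.

    Each input is a dict mapping a residue pair (m, n) to a confidence.
    Pairs that occur in only one set are copied unchanged (step 4.1);
    pairs that occur in several sets get the mean confidence (step 4.2).
    """
    seen = defaultdict(list)
    for contact_set in contact_sets:
        for (m, n), confidence in contact_set.items():
            # Assumption: residue pairs are unordered, so normalise them.
            pair = (min(m, n), max(m, n))
            seen[pair].append(confidence)
    merged = {pair: sum(vals) / len(vals) for pair, vals in seen.items()}
    # Step 4.3: sort by confidence, largest first.
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)

# Toy inputs standing in for contcf1, contcf2 and contcf3.
contcf1 = {(1, 10): 0.9, (2, 20): 0.6}
contcf2 = {(10, 1): 0.7, (3, 30): 0.5}
contcf3 = {(1, 10): 0.8}
contcf4 = merge_contact_sets(contcf1, contcf2, contcf3)
num = len(contcf4)  # the count called Num in step 4.3
```

Averaging only over the sets in which a pair actually appears is one reading of step 4.2); the source does not say whether absent predictions should count as zero.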
The technical concept of the invention is as follows. Under the framework of an evolutionary algorithm, the population is first initialized using fragment assembly. The population is then divided into several sub-populations, and each individual in the sub-populations undergoes mutation and crossover to generate a new conformation. In the selection stage, new conformations are first selected using the Rosetta score3 energy function; S_con(C) is then used to further screen the new low-energy conformations, while a Monte Carlo probabilistic acceptance criterion preserves conformational diversity during selection. By using sub-populations together with contact information predicted by several contact servers to assist structure prediction, the method mitigates the inaccuracy of the energy function and improves population diversity. The invention provides a multi-contact-information sub-population strategy protein structure prediction method with good diversity and high prediction accuracy.
The beneficial effects of the invention are as follows: a different spatial constraint score is constructed for each sub-population to assist the Rosetta score3 energy function in selecting conformations, which mitigates the prediction error caused by the inaccuracy of the energy function and improves prediction accuracy.
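The two-stage selection described above can be sketched as below. The energy and constraint functions are passed in as callables because neither Rosetta's score3 nor the S_con formula is reproduced here. The acceptance probability exp(-Δ/β) is an assumed Metropolis-style form, since the source does not give a legible definition of P_accept; the function name `select` is likewise hypothetical.

```python
import math
import random

def select(target, trial, score3, s_con, beta, rng=random):
    """Return the conformation that stays in the population.

    score3 and s_con are callables returning a float (lower is better).
    """
    # Stage 1: a trial with higher score3 energy is rejected outright.
    if score3(trial) > score3(target):
        return target
    # Stage 2: compare spatial-constraint scores.
    delta = s_con(trial) - s_con(target)
    if delta < 0:
        return trial
    # Assumed Metropolis form; the source's own formula is not available.
    # With delta == 0 this probability is 1, so the trial is accepted.
    p_accept = math.exp(-delta / beta)
    return trial if rng.random() < p_accept else target
```

The source leaves the tie cases (equal score3 or equal S_con) unspecified; this sketch lets ties pass to the next test rather than rejecting them.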
Drawings
FIG. 1 is the distribution of conformations obtained when sampling protein 4UEX with the multi-contact-information sub-population strategy protein structure prediction method.
FIG. 2 is a schematic diagram of the conformation updates for protein 4UEX under the multi-contact-information sub-population strategy protein structure prediction method.
FIG. 3 is the three-dimensional structure of protein 4UEX predicted by the multi-contact-information sub-population strategy protein structure prediction method.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to FIG. 1 to FIG. 3, a multi-contact-information sub-population strategy protein structure prediction method comprises the following steps:
1) input the sequence information of the target protein;
2) obtain the fragment library files from the ROBETTA server (http://www.robetta.org/) according to the target protein sequence, the fragment library files comprising a 3-residue fragment library file and a 9-residue fragment library file;
3) according to the target protein sequence, predict three contact maps using the RaptorX server (raptorx.uchicago.edu/ContactMap), the ResTriplet server (zhanglab.ccmb.med.umich.edu/ResTriplet) and the DNCON2 server (sysbio.rnet.missouri.edu/DNCON2) respectively; from each contact map, select the L/5 contacts with the highest confidence to form the high-confidence contact sets contcf1, contcf2 and contcf3 respectively, where L is the length of the target protein sequence;
4) construct a high-confidence contact set contcf4 from the contacts in contcf1, contcf2 and contcf3, according to the following rules:
4.1) for the contacts in contcf1, contcf2 and contcf3, if a residue pair is not repeated across the sets, add the contact to contcf4;
4.2) for the contacts in contcf1, contcf2 and contcf3, if a residue pair is repeated, first average the confidences of the repeated contacts, then add the contact with the averaged confidence to contcf4;
4.3) sort the contacts in contcf4 by confidence in descending order, and count the number Num of contacts in contcf4;
5) set the parameters: population size NP, maximum number of generations G, crossover factor CR and temperature factor β; set the generation counter g = 0;
6) population initialization: generate NP initial conformations C_i, i = 1, 2, …, NP, by random fragment assembly, and divide the NP initial conformations equally into 4 sub-populations;
7) for each individual C_i in the population, perform the following operations:
7.1) set C_i as the target individual C_target; randomly select two different individuals C_a and C_b from the population, with C_target ≠ C_a ≠ C_b; from C_a and C_b respectively, randomly select one 9-residue fragment, the two fragments being at different positions, and use them to replace the fragments at the corresponding positions of C_target, generating the mutant conformation C_mutant;
7.2) perform one fragment assembly on C_mutant to generate a new conformation C_mutant′;
7.3) generate a random number pCR ∈ (0,1); if pCR < CR, randomly select a 3-residue fragment from C_target and use it to replace the fragment at the corresponding position of C_mutant′, generating the trial conformation C_trial; otherwise take C_mutant′ directly as C_trial;
7.4) compute the energies score3(C_target) and score3(C_trial) of C_target and C_trial using the Rosetta score3 energy function;
7.5) if score3(C_trial) > score3(C_target), retain C_target;
7.6) if score3(C_trial) < score3(C_target), compute the spatial constraint scores S_con(C) of C_trial and C_target according to equation (1), where S_con(C) is defined as follows: [equation (1) is missing from the source text]
where m and n are the m-th and n-th residues corresponding to the k-th contact in the high-confidence contact set, K is the number of contacts in the high-confidence contact set, d_{m,n} is the Euclidean distance between the m-th and n-th residues in conformation C, and U_{m,n} is the confidence of the contact for residue pair (m, n) in the high-confidence contact set; the high-confidence contact set is chosen according to the sub-population to which the conformation belongs, namely contcf1, contcf2, contcf3 or contcf4 for the four sub-populations respectively;
7.7) if S_con(C_trial) < S_con(C_target), C_trial replaces C_target in the population;
7.8) if S_con(C_trial) > S_con(C_target), C_trial replaces C_target in the population with probability P_accept, and if the replacement is unsuccessful C_target is retained, where P_accept is defined as follows: [the formula for P_accept is missing from the source text]
8) g = g + 1; iterate steps 5) to 8) until g > G;
9) output the conformation with the lowest Rosetta score3 energy as the final result.
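The mutation and crossover operators of steps 7.1) and 7.3) can be illustrated on a toy representation. Here a conformation is modelled as a plain list with one entry per residue, which is a hypothetical simplification of a real backbone description. The fragment assembly of step 7.2) depends on the fragment library and is omitted. The function names are illustrative only, and "different positions" is read as different start positions, so the two 9-residue windows may overlap.

```python
import random

def copy_window(dest, src, start, length):
    """Return a copy of dest whose residues [start, start+length) come from src."""
    out = list(dest)
    out[start:start + length] = src[start:start + length]
    return out

def mutate(target, donor_a, donor_b, rng):
    """Step 7.1: graft one 9-residue window from each donor onto the target."""
    last_start = len(target) - 9
    # Two distinct start positions, one per donor.
    pos_a, pos_b = rng.sample(range(last_start + 1), 2)
    mutant = copy_window(target, donor_a, pos_a, 9)
    return copy_window(mutant, donor_b, pos_b, 9)

def crossover(target, mutant, cr, rng):
    """Step 7.3: with probability CR, copy a 3-residue window from the target."""
    if rng.random() < cr:
        pos = rng.randrange(len(target) - 3 + 1)
        return copy_window(mutant, target, pos, 3)
    return list(mutant)

rng = random.Random(0)
c_target = ["t"] * 20
c_a = ["a"] * 20
c_b = ["b"] * 20
c_mutant = mutate(c_target, c_a, c_b, rng)
c_trial = crossover(c_target, c_mutant, cr=0.5, rng=rng)
```

Copying windows at the same residue positions keeps every conformation the same length as the target sequence, which is what "corresponding position" in the steps implies.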
Taking the α protein 4UEX, with a sequence length of 82, as an example, the multi-contact-information sub-population strategy protein structure prediction method comprises the following steps:
1) input the sequence information of the target protein;
2) obtain the fragment library files from the ROBETTA server (http://www.robetta.org/) according to the target protein sequence, the fragment library files comprising a 3-residue fragment library file and a 9-residue fragment library file;
3) according to the target protein sequence, predict three contact maps using the RaptorX server (raptorx.uchicago.edu/ContactMap), the ResTriplet server (zhanglab.ccmb.med.umich.edu/ResTriplet) and the DNCON2 server (sysbio.rnet.missouri.edu/DNCON2) respectively; from each contact map, select the L/5 contacts with the highest confidence to form the high-confidence contact sets contcf1, contcf2 and contcf3 respectively, where L is the length of the target protein sequence;
4) construct a high-confidence contact set contcf4 from the contacts in contcf1, contcf2 and contcf3, according to the following rules:
4.1) for the contacts in contcf1, contcf2 and contcf3, if a residue pair is not repeated across the sets, add the contact to contcf4;
4.2) for the contacts in contcf1, contcf2 and contcf3, if a residue pair is repeated, first average the confidences of the repeated contacts, then add the contact with the averaged confidence to contcf4;
4.3) sort the contacts in contcf4 by confidence in descending order, and count the number Num of contacts in contcf4;
5) set the parameters: population size NP = 200, maximum number of generations G = 6000, crossover factor CR = 0.5 and temperature factor β = 4; set the generation counter g = 0;
6) population initialization: generate NP initial conformations C_i, i = 1, 2, …, NP, by random fragment assembly, and divide the NP initial conformations equally into 4 sub-populations;
7) for each individual C_i in the population, perform the following operations:
7.1) set C_i as the target individual C_target; randomly select two different individuals C_a and C_b from the population, with C_target ≠ C_a ≠ C_b; from C_a and C_b respectively, randomly select one 9-residue fragment, the two fragments being at different positions, and use them to replace the fragments at the corresponding positions of C_target, generating the mutant conformation C_mutant;
7.2) perform one fragment assembly on C_mutant to generate a new conformation C_mutant′;
7.3) generate a random number pCR ∈ (0,1); if pCR < CR, randomly select a 3-residue fragment from C_target and use it to replace the fragment at the corresponding position of C_mutant′, generating the trial conformation C_trial; otherwise take C_mutant′ directly as C_trial;
7.4) compute the energies score3(C_target) and score3(C_trial) of C_target and C_trial using the Rosetta score3 energy function;
7.5) if score3(C_trial) > score3(C_target), retain C_target;
7.6) if score3(C_trial) < score3(C_target), compute the spatial constraint scores S_con(C) of C_trial and C_target according to equation (1), where S_con(C) is defined as follows: [equation (1) is missing from the source text]
where m and n are the m-th and n-th residues corresponding to the k-th contact in the high-confidence contact set, K is the number of contacts in the high-confidence contact set, d_{m,n} is the Euclidean distance between the m-th and n-th residues in conformation C, and U_{m,n} is the confidence of the contact for residue pair (m, n) in the high-confidence contact set; the high-confidence contact set is chosen according to the sub-population to which the conformation belongs, namely contcf1, contcf2, contcf3 or contcf4 for the four sub-populations respectively;
7.7) if S_con(C_trial) < S_con(C_target), C_trial replaces C_target in the population;
7.8) if S_con(C_trial) > S_con(C_target), C_trial replaces C_target in the population with probability P_accept, and if the replacement is unsuccessful C_target is retained, where P_accept is defined as follows: [the formula for P_accept is missing from the source text]
8) g = g + 1; iterate steps 5) to 8) until g > G;
9) output the conformation with the lowest Rosetta score3 energy as the final result.
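The bookkeeping for this example can be written out as a short sketch using the parameter values from step 5). Two points are assumptions rather than statements from the source: L/5 is rounded down to a whole number of contacts, and the k-th sub-population is paired with the k-th contact set. The helper names are hypothetical.

```python
# Parameter values from step 5) of the 4UEX example.
NP, G, CR, BETA = 200, 6000, 0.5, 4
L = 82  # sequence length of 4UEX
N_SUB = 4

def split_subpopulations(population, n_sub=N_SUB):
    """Divide the population into n_sub equally sized, contiguous groups."""
    size = len(population) // n_sub
    return [population[i * size:(i + 1) * size] for i in range(n_sub)]

def contact_set_for(index, pop_size=NP, n_sub=N_SUB):
    """Return the 1-based contact-set number for the individual at a 0-based index.

    Assumes sub-population k uses contact set k.
    """
    return index // (pop_size // n_sub) + 1

population = list(range(NP))    # placeholder individuals
subpops = split_subpopulations(population)
contacts_per_server = L // 5    # assumed rounding down of L/5
```

With NP = 200 each sub-population holds 50 conformations, and with L = 82 the assumed rounding keeps 16 contacts per server before the sets are merged.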
Taking the α protein 4UEX, with a sequence length of 82, as an example, a near-native conformation of the protein was obtained using the above method. The average root-mean-square deviation between the structures obtained after running 6000 generations and the native structure is [value missing from the source text], and the minimum root-mean-square deviation is [value missing from the source text]. The predicted three-dimensional structure is shown in FIG. 3.
The foregoing describes one embodiment of the invention. The invention is clearly not limited to this embodiment and may be practiced with various modifications that do not depart from its essential spirit.
Claims (1)
1. A multi-contact-information sub-population strategy protein structure prediction method, characterized in that the method comprises the following steps:
1) input the sequence information of the target protein;
2) obtain the fragment library files from the ROBETTA server according to the target protein sequence, the fragment library files comprising a 3-residue fragment library file and a 9-residue fragment library file;
3) according to the target protein sequence, predict three contact maps using the RaptorX server, the ResTriplet server and the DNCON2 server respectively; from each contact map, select the L/5 contacts with the highest confidence to form the high-confidence contact sets contcf1, contcf2 and contcf3 respectively, where L is the length of the target protein sequence;
4) construct a high-confidence contact set contcf4 from the contacts in contcf1, contcf2 and contcf3, according to the following rules:
4.1) for the contacts in contcf1, contcf2 and contcf3, if a residue pair is not repeated across the sets, add the contact to contcf4;
4.2) for the contacts in contcf1, contcf2 and contcf3, if a residue pair is repeated, first average the confidences of the repeated contacts, then add the contact with the averaged confidence to contcf4;
4.3) sort the contacts in contcf4 by confidence in descending order, and count the number Num of contacts in contcf4;
5) set the parameters: population size NP, maximum number of generations G, crossover factor CR and temperature factor β; set the generation counter g = 0;
6) population initialization: generate NP initial conformations C_i, i = 1, 2, …, NP, by random fragment assembly, and divide the NP initial conformations equally into 4 sub-populations;
7) for each individual C_i in the population, perform the following operations:
7.1) set C_i as the target individual C_target; randomly select two different individuals C_a and C_b from the population, with C_target ≠ C_a ≠ C_b; from C_a and C_b respectively, randomly select one 9-residue fragment, the two fragments being at different positions, and use them to replace the fragments at the corresponding positions of C_target, generating the mutant conformation C_mutant;
7.2) perform one fragment assembly on C_mutant to generate a new conformation C_mutant′;
7.3) generate a random number pCR ∈ (0,1); if pCR < CR, randomly select a 3-residue fragment from C_target and use it to replace the fragment at the corresponding position of C_mutant′, generating the trial conformation C_trial; otherwise take C_mutant′ directly as C_trial;
7.4) compute the energies score3(C_target) and score3(C_trial) of C_target and C_trial using the Rosetta score3 energy function;
7.5) if score3(C_trial) > score3(C_target), retain C_target;
7.6) if score3(C_trial) < score3(C_target), compute the spatial constraint scores S_con(C) of C_trial and C_target according to equation (1), where S_con(C) is defined as follows: [equation (1) is missing from the source text]
where m and n are the m-th and n-th residues corresponding to the k-th contact in the high-confidence contact set, K is the number of contacts in the high-confidence contact set, d_{m,n} is the Euclidean distance between the m-th and n-th residues in conformation C, and U_{m,n} is the confidence of the contact for residue pair (m, n) in the high-confidence contact set; the high-confidence contact set is chosen according to the sub-population to which the conformation belongs, namely contcf1, contcf2, contcf3 or contcf4 for the four sub-populations respectively;
7.7) if S_con(C_trial) < S_con(C_target), C_trial replaces C_target in the population;
7.8) if S_con(C_trial) > S_con(C_target), C_trial replaces C_target in the population with probability P_accept, and if the replacement is unsuccessful C_target is retained, where P_accept is defined as follows: [the formula for P_accept is missing from the source text]
8) g = g + 1; iterate steps 5) to 8) until g > G;
9) output the conformation with the lowest Rosetta score3 energy as the final result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911197621.2A CN111180004B (en) | 2019-11-29 | 2019-11-29 | Multi-contact information sub-population strategy protein structure prediction method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911197621.2A CN111180004B (en) | 2019-11-29 | 2019-11-29 | Multi-contact information sub-population strategy protein structure prediction method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111180004A (en) | 2020-05-19 |
CN111180004B (en) | 2021-08-03 |
Family
ID=70656268
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911197621.2A Active CN111180004B (en) | 2019-11-29 | 2019-11-29 | Multi-contact information sub-population strategy protein structure prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111180004B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112085244A (en) * | 2020-07-21 | 2020-12-15 | 浙江工业大学 | Residue contact map-based multi-objective optimization protein structure prediction method |
CN112908408A (en) * | 2021-03-03 | 2021-06-04 | 江苏海洋大学 | Protein structure prediction method based on evolutionary algorithm and archive updating |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108846256A (en) * | 2018-06-07 | 2018-11-20 | 浙江工业大学 | A kind of group's Advances in protein structure prediction based on contact residues information |
CN109215733A (en) * | 2018-08-30 | 2019-01-15 | 浙江工业大学 | A kind of Advances in protein structure prediction based on contact residues information auxiliary evaluation |
CN109360599A (en) * | 2018-08-28 | 2019-02-19 | 浙江工业大学 | A kind of Advances in protein structure prediction based on contact residues information Crossover Strategy |
CN110148437A (en) * | 2019-04-16 | 2019-08-20 | 浙江工业大学 | A kind of Advances in protein structure prediction that contact residues auxiliary strategy is adaptive |
Non-Patent Citations (2)
Title |
---|
GUI-JUN ZHANG et al.: "Secondary Structure and Contact Guided Differential Evolution for Protein Structure Prediction", IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS * |
MU GAO et al.: "DESTINI: A deep-learning approach to contact-driven protein structure prediction", SCIENTIFIC REPORTS * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112085244A (en) * | 2020-07-21 | 2020-12-15 | Zhejiang University of Technology | Residue contact map-based multi-objective optimization protein structure prediction method |
CN112908408A (en) * | 2021-03-03 | 2021-06-04 | Jiangsu Ocean University | Protein structure prediction method based on evolutionary algorithm and archive updating |
CN112908408B (en) * | 2021-03-03 | 2023-09-22 | Jiangsu Ocean University | Protein structure prediction method based on evolutionary algorithm and archiving update |
Also Published As
Publication number | Publication date |
---|---|
CN111180004B (en) | 2021-08-03 |
Similar Documents
Publication | Title |
---|---|
Su et al. | Improved protein structure prediction using a new multi‐scale network and homologous templates |
Nguyen et al. | Ultra-large alignments using phylogeny-aware profiles |
CN108846256B (en) | Group protein structure prediction method based on residue contact information |
CN107633157B (en) | Protein conformation space optimization method based on distribution estimation and copy exchange strategy |
CN111180004B (en) | Multi-contact information sub-population strategy protein structure prediction method |
CN110148437B (en) | Residue contact auxiliary strategy self-adaptive protein structure prediction method |
Mao et al. | AmoebaContact and GDFold as a pipeline for rapid de novo protein structure prediction |
CN105760710A (en) | Method for predicting protein structure on basis of two-stage differential evolution algorithm |
Browning et al. | Fast, accurate local ancestry inference with FLARE |
Simoncini et al. | Efficient sampling in fragment-based protein structure prediction using an estimation of distribution algorithm |
CN109872770B (en) | Variable strategy protein structure prediction method combined with displacement degree evaluation |
Gao et al. | High-performance deep learning toolbox for genome-scale prediction of protein structure and function |
Feng et al. | Artificial intelligence in bioinformatics: Automated methodology development for protein residue contact map prediction |
Baldi et al. | A machine learning strategy for protein analysis |
CN109509510B (en) | Protein structure prediction method based on multi-population ensemble variation strategy |
CN109346126B (en) | Adaptive protein structure prediction method of lower bound estimation strategy |
Hong et al. | fastmsa: Accelerating multiple sequence alignment with dense retrieval on protein language |
Mourad et al. | Designing pooling systems for noisy high-throughput protein-protein interaction experiments using Boolean compressed sensing |
CN109448786B (en) | Method for predicting protein structure by lower bound estimation dynamic strategy |
CN109461471B (en) | Adaptive protein structure prediction method based on championship mechanism |
CN109300505B (en) | Protein structure prediction method based on biased sampling |
CN110600076B (en) | Protein ATP docking method based on distance and angle information |
CN116092576A (en) | Protein structure optimization method and device |
CN110197700B (en) | Protein ATP docking method based on differential evolution |
CN111161791B (en) | Experimental data-assisted adaptive strategy protein structure prediction method |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20200519; Assignee: ZHEJIANG ORIENT GENE BIOTECH CO.,LTD.; Assignor: ZHEJIANG UNIVERSITY OF TECHNOLOGY; Contract record no.: X2023980053610; Denomination of invention: A Subpopulation Strategy Protein Structure Prediction Method Based on Multivariate Contact Information; Granted publication date: 20210803; License type: Common License; Record date: 20231222 |